GroupScan (Apache Drill Root POM 1.18.0 API)

All Superinterfaces:

FragmentLeaf, GraphValue<PhysicalOperator>, HasAffinity, Iterable<PhysicalOperator>, Leaf, PhysicalOperator, Scan

All Known Subinterfaces:

DbGroupScan, FileGroupScan, IndexGroupScan

All Known Implementing Classes:

AbstractDbGroupScan, AbstractFileGroupScan, AbstractGroupScan, AbstractGroupScanWithMetadata, AbstractParquetGroupScan, BinaryTableGroupScan, DirectGroupScan, DruidGroupScan, EasyGroupScan, HBaseGroupScan, HiveDrillNativeParquetScan, HiveScan, HttpGroupScan, InfoSchemaGroupScan, JdbcGroupScan, JsonTableGroupScan, KafkaGroupScan, KuduGroupScan, MapRDBGroupScan, MetadataDirectGroupScan, MockGroupScanPOP, MongoGroupScan, OpenTSDBGroupScan, ParquetGroupScan, RestrictedJsonTableGroupScan, SchemalessScan, SystemTableScan
```
public interface GroupScan
extends Scan, HasAffinity
```
A GroupScan operator represents all data which will be scanned by a given physical plan. It is the superset of all SubScans for the plan.

Field Summary

Fields
Modifier and Type	Field and Description
`static List<SchemaPath>`	`ALL_COLUMNS` Column list conventions for a GroupScan: 1) an empty column list denotes a skipAll query.

Method Summary

Modifier and Type	Method and Description
`void`	`applyAssignments(List<CoordinationProtos.DrillbitEndpoint> endpoints)`
`GroupScan`	`applyFilter(LogicalExpression filterExpr, UdfUtilities udfUtilities, FunctionImplementationRegistry functionImplementationRegistry, OptionManager optionManager)`
`GroupScan`	`applyLimit(int maxRecords)` Applies row-count-based pruning for a "LIMIT n" query.
`boolean`	`canPushdownProjects(List<SchemaPath> columns)` GroupScan should check the list of columns, and see if it could support all the columns in the list.
`GroupScan`	`clone(List<SchemaPath> columns)` Returns a clone of this GroupScan instance, except that the new GroupScan will use the provided list of columns.
`boolean`	`enforceWidth()` Deprecated. Use `getMinParallelizationWidth()` to determine whether this GroupScan spans more than one fragment.
`AnalyzeInfoProvider`	`getAnalyzeInfoProvider()` Returns `AnalyzeInfoProvider` instance which will be used when running ANALYZE statement.
`List<SchemaPath>`	`getColumns()` Returns a list of columns scanned by this group scan.
`long`	`getColumnValueCount(SchemaPath column)` Returns the number of non-null values in the specified column.
`String`	`getDigest()` Returns a signature of the `GroupScan` which should usually be composed of all its attributes which could describe it uniquely.
`Collection<org.apache.hadoop.fs.Path>`	`getFiles()` Returns a collection of file names associated with this GroupScan.
`LogicalExpression`	`getFilter()`
`int`	`getMaxParallelizationWidth()`
`TableMetadataProvider`	`getMetadataProvider()` Returns `TableMetadataProvider` instance which is used for providing metadata for current `GroupScan`.
`int`	`getMinParallelizationWidth()` At minimum, the GroupScan requires this many fragments to run.
`List<SchemaPath>`	`getPartitionColumns()` Returns a list of columns that can be used for partition pruning.
`ScanStats`	`getScanStats(PlannerSettings settings)`
`org.apache.hadoop.fs.Path`	`getSelectionRoot()` Returns path to the selection root.
`SubScan`	`getSpecificScan(int minorFragmentId)`
`TableMetadata`	`getTableMetadata()`
`boolean`	`hasFiles()` Return true if this GroupScan can return its selection as a list of file names (retrieved by getFiles()).
`boolean`	`isDistributed()`
`boolean`	`supportsFilterPushDown()` Checks whether this group scan supports filter push down.
`boolean`	`supportsLimitPushdown()` Whether or not this GroupScan supports limit pushdown
`boolean`	`supportsPartitionFilterPushdown()` Whether or not this GroupScan supports pushdown of partition filters (directories for filesystems)
`boolean`	`usedMetastore()` Returns `true` if current group scan uses metadata obtained from the Metastore.

Methods inherited from interface org.apache.drill.exec.physical.base.PhysicalOperator
accept, getCost, getInitialAllocation, getMaxAllocation, getNewWithChildren, getOperatorId, getOperatorType, getSVMode, getUserName, isBufferedOperator, isExecutable, setCost, setMaxAllocation, setOperatorId

Methods inherited from interface org.apache.drill.common.graph.GraphValue
accept

Methods inherited from interface java.lang.Iterable
forEach, iterator, spliterator

Methods inherited from interface org.apache.drill.exec.physical.base.HasAffinity
getDistributionAffinity, getOperatorAffinity

- Field Detail
  - ALL_COLUMNS
```
static final List<SchemaPath> ALL_COLUMNS
```
    Column list conventions for a GroupScan: 1) an empty column list denotes a skipAll query; 2) NULL is interpreted as ALL_COLUMNS. How a skipAll query is handled is up to each storage plugin, which may apply a different policy in its corresponding RecordReader.
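The column-list convention described for `ALL_COLUMNS` can be sketched as a small classifier. This is only an illustration of the two rules stated above, not Drill code: `ColumnSelection`, `Mode`, and `classify` are hypothetical names, and plain strings stand in for `SchemaPath`.

```java
import java.util.List;

// Hypothetical helper, not part of Drill; strings stand in for SchemaPath.
final class ColumnSelection {
    enum Mode { ALL_COLUMNS, SKIP_ALL, EXPLICIT }

    // Classifies a requested column list using the two stated rules:
    // null is read as "all columns", an empty list as a skipAll query.
    static Mode classify(List<String> columns) {
        if (columns == null) {
            return Mode.ALL_COLUMNS;
        }
        if (columns.isEmpty()) {
            return Mode.SKIP_ALL;
        }
        return Mode.EXPLICIT;
    }
}
```

What a reader actually does for `SKIP_ALL` is left to each storage plugin, as noted above, so the sketch stops at classification.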
- Method Detail
  - applyAssignments
```
void applyAssignments(List<CoordinationProtos.DrillbitEndpoint> endpoints)
               throws PhysicalOperatorSetupException
```
    Throws:
    
    PhysicalOperatorSetupException
  - getSpecificScan
```
SubScan getSpecificScan(int minorFragmentId)
                 throws ExecutionSetupException
```
    Throws:
    
    ExecutionSetupException
  - getMaxParallelizationWidth
```
int getMaxParallelizationWidth()
```
  - isDistributed
```
boolean isDistributed()
```
  - getMinParallelizationWidth
```
int getMinParallelizationWidth()
```
    At minimum, the GroupScan requires this many fragments to run. Currently, this is used in SimpleParallelizer.
    
    Returns:
    
    the minimum number of fragments that should run
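To show how the minimum and maximum widths might be used together, here is a minimal sketch that clamps a preferred width into the range a scan reports. It is an assumption about typical usage, not the logic of SimpleParallelizer: `WidthBounds`, `WidthChooser`, and `chooseWidth` are hypothetical names, and `WidthBounds` exposes only the two width methods of `GroupScan`.

```java
// Hypothetical stand-in exposing only the two width methods of GroupScan.
interface WidthBounds {
    int getMinParallelizationWidth();
    int getMaxParallelizationWidth();
}

// Hypothetical helper, not part of Drill.
final class WidthChooser {
    // Clamps the caller's preferred width into [min, max] as reported by the scan.
    static int chooseWidth(WidthBounds scan, int preferredWidth) {
        int min = scan.getMinParallelizationWidth();
        int max = scan.getMaxParallelizationWidth();
        if (min > max) {
            throw new IllegalStateException(
                "min width " + min + " exceeds max width " + max);
        }
        return Math.max(min, Math.min(preferredWidth, max));
    }
}
```

With bounds of 2 and 8, a preferred width of 1 is raised to 2, 5 is kept, and 100 is lowered to 8.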
  - enforceWidth
```
@Deprecated
boolean enforceWidth()
```
    Deprecated. Use getMinParallelizationWidth() to determine whether this GroupScan spans more than one fragment.
    
    Checks whether the GroupScan enforces the width to be the maximum parallelization width. Currently, this is used in ExcessiveExchangeIdentifier.
    
    Returns:
    
    true if the maximum width should be enforced
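The deprecation note suggests replacing `enforceWidth()` with a check on `getMinParallelizationWidth()`. One way to read that advice is sketched below; `MinWidthSource`, `FragmentSpan`, and `spansMultipleFragments` are hypothetical names, and the threshold of more than one fragment is taken from the wording of the note.

```java
// Hypothetical stand-in exposing only the method the deprecation note points to.
interface MinWidthSource {
    int getMinParallelizationWidth();
}

// Hypothetical helper, not part of Drill.
final class FragmentSpan {
    // True when the scan needs at least two fragments, i.e. spans more than one.
    static boolean spansMultipleFragments(MinWidthSource scan) {
        return scan.getMinParallelizationWidth() > 1;
    }
}
```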
  - getDigest
```
String getDigest()
```
    Returns a signature of the GroupScan which should usually be composed of all its attributes which could describe it uniquely.
  - getScanStats
```
ScanStats getScanStats(PlannerSettings settings)
```
  - clone
```
GroupScan clone(List<SchemaPath> columns)
```
    Returns a clone of this GroupScan instance, except that the new GroupScan will use the provided list of columns.
  - canPushdownProjects
```
boolean canPushdownProjects(List<SchemaPath> columns)
```
    GroupScan should check the list of columns, and see if it could support all the columns in the list.
  - getColumnValueCount
```
long getColumnValueCount(SchemaPath column)
```
    Returns the number of non-null values in the specified column. Throws an exception if the group scan does not have an exact column row count.
  - supportsPartitionFilterPushdown
```
boolean supportsPartitionFilterPushdown()
```
    Whether or not this GroupScan supports pushdown of partition filters (directories for filesystems)
  - getColumns
```
List<SchemaPath> getColumns()
```
    Returns a list of columns scanned by this group scan
  - getPartitionColumns
```
List<SchemaPath> getPartitionColumns()
```
    Returns a list of columns that can be used for partition pruning
  - supportsLimitPushdown
```
boolean supportsLimitPushdown()
```
    Whether or not this GroupScan supports limit pushdown
  - applyLimit
```
GroupScan applyLimit(int maxRecords)
```
    Applies row-count-based pruning for a "LIMIT n" query.
    
    Parameters:
    
    maxRecords - the number of rows requested from the group scan.
    
    Returns:
    
    a new instance of group scan if the prune is successful; null if row-based pruning is not supported or if the prune is not successful.
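Because `applyLimit` may return null, callers need a fallback to the original scan. The sketch below shows one defensive calling pattern under that contract; `LimitableScan`, `LimitPushdown`, and `tryApplyLimit` are hypothetical names, and the interface mirrors only the two limit-related methods of `GroupScan`.

```java
// Hypothetical stand-in for the two limit-related methods of GroupScan.
interface LimitableScan {
    boolean supportsLimitPushdown();

    // May return null when pruning is unsupported or unsuccessful.
    LimitableScan applyLimit(int maxRecords);
}

// Hypothetical helper, not part of Drill.
final class LimitPushdown {
    // Returns the pruned scan when pushdown is supported and succeeds,
    // otherwise returns the original scan unchanged.
    static LimitableScan tryApplyLimit(LimitableScan scan, int maxRecords) {
        if (!scan.supportsLimitPushdown()) {
            return scan;
        }
        LimitableScan pruned = scan.applyLimit(maxRecords);
        return pruned != null ? pruned : scan;
    }
}
```

Checking `supportsLimitPushdown()` first avoids calling `applyLimit` on scans that never prune, while the null check covers the case where a supporting scan still cannot prune for this particular limit.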
  - hasFiles
```
boolean hasFiles()
```
    Return true if this GroupScan can return its selection as a list of file names (retrieved by getFiles()).
  - getSelectionRoot
```
org.apache.hadoop.fs.Path getSelectionRoot()
```
    Returns path to the selection root. If this GroupScan cannot provide selection root, it returns null.
    
    Returns:
    
    path to the selection root
  - getFiles
```
Collection<org.apache.hadoop.fs.Path> getFiles()
```
    Returns a collection of file names associated with this GroupScan. This should be called after checking hasFiles(). If this GroupScan cannot provide file names, it returns null.
    
    Returns:
    
    collection of file paths
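The documented order of calls, `hasFiles()` first and then `getFiles()` with a null check, can be sketched as follows. `FileBackedScan`, `ScanFiles`, and `filesOrEmpty` are hypothetical names, and strings stand in for `org.apache.hadoop.fs.Path`; returning an empty collection instead of null is a choice made for this example, not something the interface requires.

```java
import java.util.Collection;
import java.util.Collections;

// Hypothetical stand-in; strings replace org.apache.hadoop.fs.Path.
interface FileBackedScan {
    boolean hasFiles();

    // May return null when file names cannot be provided.
    Collection<String> getFiles();
}

// Hypothetical helper, not part of Drill.
final class ScanFiles {
    // Returns the scan's files, or an empty collection when none are available.
    static Collection<String> filesOrEmpty(FileBackedScan scan) {
        if (!scan.hasFiles()) {
            return Collections.<String>emptyList();
        }
        Collection<String> files = scan.getFiles();
        return files != null ? files : Collections.<String>emptyList();
    }
}
```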
  - getFilter
```
LogicalExpression getFilter()
```
  - applyFilter
```
GroupScan applyFilter(LogicalExpression filterExpr,
                      UdfUtilities udfUtilities,
                      FunctionImplementationRegistry functionImplementationRegistry,
                      OptionManager optionManager)
```
  - getMetadataProvider
```
TableMetadataProvider getMetadataProvider()
```
    Returns TableMetadataProvider instance which is used for providing metadata for current GroupScan.
    
    Returns:
    
    TableMetadataProvider instance, the source of metadata
  - getTableMetadata
```
TableMetadata getTableMetadata()
```
  - usedMetastore
```
boolean usedMetastore()
```
    Returns true if current group scan uses metadata obtained from the Metastore.
    
    Returns:
    
    true if current group scan uses metadata obtained from the Metastore, false otherwise.
  - getAnalyzeInfoProvider
```
AnalyzeInfoProvider getAnalyzeInfoProvider()
```
    Returns AnalyzeInfoProvider instance which will be used when running ANALYZE statement.
    
    Returns:
    
    AnalyzeInfoProvider instance
  - supportsFilterPushDown
```
boolean supportsFilterPushDown()
```
    Checks whether this group scan supports filter push down.
    
    Returns:
    
    true if this group scan supports filter push down, false otherwise
