Class GroupByNode
- java.lang.Object
- org.apache.derby.impl.sql.compile.QueryTreeNode
- org.apache.derby.impl.sql.compile.ResultSetNode
- org.apache.derby.impl.sql.compile.FromTable
- org.apache.derby.impl.sql.compile.SingleChildResultSetNode
- org.apache.derby.impl.sql.compile.GroupByNode
- All Implemented Interfaces:
Optimizable, Visitable
class GroupByNode extends SingleChildResultSetNode
A GroupByNode represents a result set for a grouping operation on a select. Note that this includes a SELECT with aggregates and no grouping columns (in which case the select list is null). It has the same description as its input result set. For the most part, it simply delegates operations to its bottomPRSet, which is currently expected to be a ProjectRestrictResultSet generated for a SelectNode.
NOTE: A GroupByNode extends FromTable since it can exist in a FromList.
There is a lot of room for optimizations here:
- agg(distinct x) group by x => agg(x) group by x (for min and max)
- min()/max() use index scans if possible, no sort may be needed.
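The first rewrite is restricted to min and max because duplicates never change either result, whereas they do change a sum or a count. A minimal sketch in plain Java, outside Derby; the class and method names are hypothetical and exist only for this illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; these names are not Derby APIs.
class DistinctMinMaxSketch {
    // MIN over every value, duplicates included.
    static int min(List<Integer> values) {
        int best = values.get(0);
        for (int v : values) {
            best = Math.min(best, v);
        }
        return best;
    }

    // MIN(DISTINCT ...): drop duplicates first, then take the smallest value.
    static int minDistinct(List<Integer> values) {
        return new TreeSet<>(values).first();
    }

    // SUM over every value, duplicates included.
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // SUM(DISTINCT ...): each distinct value is counted once.
    static int sumDistinct(List<Integer> values) {
        int total = 0;
        for (int v : new TreeSet<>(values)) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> x = Arrays.asList(3, 1, 1, 2);
        System.out.println(min(x) == minDistinct(x)); // true
        System.out.println(sum(x) == sumDistinct(x)); // false
    }
}
```

Dropping DISTINCT is therefore safe for min (and, symmetrically, max) but not for aggregates such as sum.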
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description
private static class
GroupByNode.ExpressionSorter
Comparator class for GROUP BY expression substitution.
-
Nested classes/interfaces inherited from class org.apache.derby.impl.sql.compile.ResultSetNode
ResultSetNode.QueryExpressionClauses
-
-
Field Summary
Fields Modifier and Type Field Description
private boolean
addDistinctAggregate
private int
addDistinctAggregateColumnNum
private AggregatorInfoList
aggInfo
Information that is used at execution time to process aggregates.
private java.util.List<AggregateNode>
aggregates
The list of all aggregates in the query block that contains this group by.
(package private) GroupByList
groupingList
The GROUP BY list
private ValueNode
havingClause
private SubqueryList
havingSubquerys
private boolean
isInSortedOrder
(package private) FromTable
parent
The parent to the GroupByNode.
private boolean
singleInputRowOptimization
-
Fields inherited from class org.apache.derby.impl.sql.compile.SingleChildResultSetNode
childResult, hasTrulyTheBestAccessPath
-
Fields inherited from class org.apache.derby.impl.sql.compile.FromTable
ADD_PLAN, bestAccessPath, bestCostEstimate, bestSortAvoidancePath, correlationName, corrTableName, currentAccessPath, hashKeyColumns, initialCapacity, level, LOAD_PLAN, loadFactor, maxCapacity, origTableName, REMOVE_PLAN, tableNumber, tableProperties, trulyTheBestAccessPath, userSpecifiedJoinStrategy
-
Fields inherited from class org.apache.derby.impl.sql.compile.QueryTreeNode
AUTOINCREMENT_CREATE_MODIFY, AUTOINCREMENT_CYCLE, AUTOINCREMENT_INC_INDEX, AUTOINCREMENT_IS_AUTOINCREMENT_INDEX, AUTOINCREMENT_START_INDEX
-
-
Constructor Summary
Constructors Constructor Description
GroupByNode(ResultSetNode bottomPR, GroupByList groupingList, java.util.List<AggregateNode> aggregates, ValueNode havingClause, SubqueryList havingSubquerys, int nestingLevel, ContextManager cm)
Constructor for a GroupByNode.
-
Method Summary
Modifier and Type Method Description
private void
addAggregateColumns()
In the query rewrite involving aggregates, add the columns for aggregation.
private void
addAggregates()
Add the extra result columns required by the aggregates to the result list.
private void
addDistinctAggregatesToOrderBy()
Add any distinct aggregates to the order by list.
private void
addNewColumnsForAggregation()
Add a whole slew of columns needed for aggregation.
private void
addNewPRNode()
Add a new PR node for aggregation.
private java.util.ArrayList<SubstituteExpressionVisitor>
addUnAggColumns()
In the query rewrite for group by, add the columns on which we are doing the group by.
(package private) void
considerPostOptimizeOptimizations(boolean selectHasPredicates)
Consider any optimizations after the optimizer has chosen a plan.
CostEstimate
estimateCost(OptimizablePredicateList predList, ConglomerateDescriptor cd, CostEstimate outerCost, Optimizer optimizer, RowOrdering rowOrdering)
Estimate the cost of scanning this Optimizable using the given predicate list with the given conglomerate.
(package private) boolean
flattenableInFromSubquery(FromList fromList)
Evaluate whether or not the subquery in a FromSubquery is flattenable.
(package private) void
generate(ActivationClassBuilder acb, MethodBuilder mb)
Generate the sort result set operating over the source result set.
private void
genGroupedAggregateResultSet(ActivationClassBuilder acb, MethodBuilder mb)
Generate the code to evaluate grouped aggregates.
private void
genScalarAggregateResultSet(ActivationClassBuilder acb, MethodBuilder mb)
Generate the code to evaluate scalar aggregates.
private ResultColumn
getColumnReference(ResultColumn targetRC, DataDictionary dd)
Method for creating a new result column referencing the one passed in.
(package private) boolean
getIsInSortedOrder()
Get whether or not the source is in sorted order.
(package private) FromTable
getParent()
Return the parent node to this one, if there is one.
(package private) boolean
isOneRowResultSet()
Return whether or not the underlying ResultSet tree will return a single row, at most.
(package private) ResultColumnDescriptor[]
makeResultDescriptors()
(package private) ResultSetNode
optimize(DataDictionary dataDictionary, PredicateList predicates, double outerRows)
Optimize this GroupByNode.
CostEstimate
optimizeIt(Optimizer optimizer, OptimizablePredicateList predList, CostEstimate outerCost, RowOrdering rowOrdering)
Choose the best access path for this Optimizable.
(package private) void
printSubNodes(int depth)
Prints the sub-nodes of this object.
boolean
pushOptPredicate(OptimizablePredicate optimizablePredicate)
Push an OptimizablePredicate down, if this node accepts it.
java.lang.String
toString()
Convert this object to a String.
-
Methods inherited from class org.apache.derby.impl.sql.compile.SingleChildResultSetNode
acceptChildren, addNewPredicate, adjustForSortElimination, adjustForSortElimination, changeAccessPath, decrementLevel, ensurePredicateList, forUpdate, getChildResult, getFinalCostEstimate, getFromTableByName, getTrulyTheBestAccessPath, initAccessPaths, isNotExists, isOrderedOn, modifyAccessPaths, preprocess, pullOptPredicates, pushExpressions, referencesSessionSchema, referencesTarget, reflectionNeededForProjection, setChildResult, setLevel, subqueryReferencesTarget, updateBestPlanMap, updateTargetLockMode
-
Methods inherited from class org.apache.derby.impl.sql.compile.FromTable
assignCostEstimate, canBeOrdered, columnsAreUpdatable, considerSortAvoidancePath, convertAbsoluteToRelativeColumnPosition, cursorTargetTable, feasibleJoinStrategy, fillInReferencedTableMap, flatten, getBaseTableName, getBestAccessPath, getBestSortAvoidancePath, getCorrelationName, getCostEstimate, getCurrentAccessPath, getExposedName, getLevel, getMergeTableID, getName, getNumColumnsReturned, getOrigTableName, getProperties, getResultColumnsForList, getSchemaDescriptor, getSchemaDescriptor, getScratchCostEstimate, getTableDescriptor, getTableName, getTableNumber, getUserSpecifiedJoinStrategy, hashKeyColumns, hasLargeObjectColumns, hasTableNumber, initialCapacity, isBaseTable, isCoveringIndex, isFlattenableJoinNode, isJoinColumnForRightOuterJoin, isMaterializable, isOneRowScan, isTargetTable, legalJoinOrder, loadFactor, LOJ_reorderable, markUpdatableByCursor, maxCapacity, memoryUsageOK, modifyAccessPath, needsSpecialRCLBinding, nextAccessPath, optimizeSubqueries, rememberAsBest, rememberJoinStrategyAsBest, rememberSortAvoidancePath, resetJoinStrategies, setCostEstimateCost, setHashKeyColumns, setMergeTableID, setOrigTableName, setProperties, setTableNumber, startOptimizing, supportsMultipleInstantiations, tellRowOrderingAboutConstantColumns, transformOuterJoins, uniqueJoin, verifyProperties
-
Methods inherited from class org.apache.derby.impl.sql.compile.ResultSetNode
assignResultSetNumber, bindExpressions, bindExpressionsWithTables, bindNonVTITables, bindResultColumns, bindResultColumns, bindTargetExpressions, bindUntypedNullsToResultColumns, bindVTITables, columnTypesAndLengthsMatch, considerMaterialization, enhanceRCLForInsert, generateNormalizationResultSet, generateResultSet, genProjectRestrict, genProjectRestrict, genProjectRestrictForReordering, getAllResultColumns, getCandidateFinalCostEstimate, getCostEstimate, getCursorTargetTable, getFromList, getMatchingColumn, getNewCostEstimate, getOptimizer, getOptimizerImpl, getRCLForInsert, getReferencedTableMap, getResultColumns, getResultSetNumber, getScratchCostEstimate, isCursorTargetTable, isInsertSource, isPossibleDistinctScan, isStatementResultSet, isUpdatableCursor, LOJgetReferencedTables, makeResultDescription, markAsCursorTargetTable, markForDistinctScan, markStatementResultSet, modifyAccessPaths, notCursorTargetTable, notFlattenableJoin, numDistinctAggregates, parseDefault, performMaterialization, printQueryExpressionSuffixClauses, projectResultColumns, pushOffsetFetchFirst, pushOrderByList, pushQueryExpressionSuffix, rejectParameters, rejectXMLValues, renameGeneratedResultNames, replaceOrForbidDefaults, returnsAtMostOneRow, setCandidateFinalCostEstimate, setCostEstimate, setCursorTargetTable, setInsertSource, setOptimizer, setReferencedTableMap, setResultColumns, setResultSetNumber, setResultToBooleanTrueNode, setScratchCostEstimate, setTableConstructorTypes, verifySelectStarSubquery
-
Methods inherited from class org.apache.derby.impl.sql.compile.QueryTreeNode
accept, addTag, addUDTUsagePriv, addUDTUsagePriv, bindOffsetFetch, bindRowMultiSet, bindUserCatalogType, bindUserType, checkReliability, checkReliability, convertDefaultNode, copyTagsFrom, createTypeDependency, debugFlush, debugPrint, disablePrivilegeCollection, formatNodeString, generateAuthorizeCheck, getBeginOffset, getClassFactory, getCompilerContext, getContext, getContextManager, getDataDictionary, getDependencyManager, getEndOffset, getExecutionFactory, getGenericConstantActionFactory, getIntProperty, getLanguageConnectionContext, getLongProperty, getNullNode, getOffsetOrderedNodes, getOptimizerFactory, getOptimizerTracer, getParameterTypes, getSchemaDescriptor, getSchemaDescriptor, getStatementType, getTableDescriptor, getTypeCompiler, getUDTDesc, isAtomic, isPrivilegeCollectionRequired, isSessionSchema, isSessionSchema, makeConstantAction, makeTableName, makeTableName, nodeHeader, optimizerTracingIsOn, orReliability, parseSearchCondition, parseStatement, printLabel, resolveTableToSynonym, setBeginOffset, setEndOffset, setRefActionInfo, stackPrint, taggedWith, treePrint, treePrint, verifyClassExist
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface org.apache.derby.iapi.sql.compile.Optimizable
getDataDictionary, getOptimizerTracer, getReferencedTableMap, getResultSetNumber, optimizerTracingIsOn
-
Methods inherited from interface org.apache.derby.iapi.sql.compile.Visitable
accept, addTag, taggedWith
-
-
-
-
Field Detail
-
groupingList
GroupByList groupingList
The GROUP BY list
-
aggregates
private java.util.List<AggregateNode> aggregates
The list of all aggregates in the query block that contains this group by.
-
aggInfo
private AggregatorInfoList aggInfo
Information that is used at execution time to process aggregates.
-
parent
FromTable parent
The parent to the GroupByNode. If we need to generate a ProjectRestrict over the group by then this is set to that node. Otherwise it is null.
-
addDistinctAggregate
private boolean addDistinctAggregate
-
singleInputRowOptimization
private boolean singleInputRowOptimization
-
addDistinctAggregateColumnNum
private int addDistinctAggregateColumnNum
-
isInSortedOrder
private final boolean isInSortedOrder
-
havingClause
private ValueNode havingClause
-
havingSubquerys
private SubqueryList havingSubquerys
-
-
Constructor Detail
-
GroupByNode
GroupByNode(ResultSetNode bottomPR, GroupByList groupingList, java.util.List<AggregateNode> aggregates, ValueNode havingClause, SubqueryList havingSubquerys, int nestingLevel, ContextManager cm) throws StandardException
Constructor for a GroupByNode.
- Parameters:
bottomPR - The child FromTable
groupingList - The groupingList
aggregates - The list of aggregates from the query block. Since aggregation is done at the same time as grouping, we need them here.
havingClause - The having clause.
havingSubquerys - Subqueries in the having clause.
nestingLevel - NestingLevel of this group by node. This is used for error checking of group by queries with having clause.
cm - The context manager
- Throws:
StandardException - Thrown on error
-
-
Method Detail
-
getIsInSortedOrder
boolean getIsInSortedOrder()
Get whether or not the source is in sorted order.
- Returns:
Whether or not the source is in sorted order.
-
addAggregates
private void addAggregates() throws StandardException
Add the extra result columns required by the aggregates to the result list.
- Throws:
StandardException
-
addDistinctAggregatesToOrderBy
private void addDistinctAggregatesToOrderBy()
Add any distinct aggregates to the order by list. Asserts that there are 0 or more distincts.
-
addNewPRNode
private void addNewPRNode() throws StandardException
Add a new PR node for aggregation. Put the new PR under the sort.
- Throws:
StandardException
-
addUnAggColumns
private java.util.ArrayList<SubstituteExpressionVisitor> addUnAggColumns() throws StandardException
In the query rewrite for group by, add the columns on which we are doing the group by.
- Returns:
havingRefsToSubstitute visitors array. Return any havingRefsToSubstitute visitors since it is too early to apply them yet; we need the AggregateNodes unmodified until after we add the new columns for aggregation (DERBY-4071).
- Throws:
StandardException
- See Also:
addNewColumnsForAggregation()
-
addNewColumnsForAggregation
private void addNewColumnsForAggregation() throws StandardException
Add a whole slew of columns needed for aggregation. Basically, for each aggregate we add 3 columns: the aggregate input expression, the aggregator column, and a column where the aggregate result is stored. The input expression is taken directly from the aggregator node. The aggregator is the run time aggregator. We add it to the RC list as a new object coming into the sort node.
At the point this is invoked, we have the following tree:
-
PR - (PARENT): RCL is the original select list
|
PR - GROUP BY: RCL is empty
|
PR - FROM TABLE: RCL is empty
For each ColumnReference in PR RCL
- clone the ref
- create a new RC in the bottom RCL and set it to the col ref
- create a new RC in the GROUPBY RCL and set it to point to the bottom RC
- reset the top PR ref to point to the new GROUPBY RC
For each aggregate in aggregates
- create RC in FROM TABLE. Fill it with aggs Operator.
- create RC in FROM TABLE for agg result
- create RC in FROM TABLE for aggregator
- create RC in GROUPBY for agg input, set it to point to FROM TABLE RC
- create RC in GROUPBY for agg result
- create RC in GROUPBY for aggregator
- replace Agg with reference to RC for agg result
For a query like,
select c1, sum(c2), max(c3) from t1 group by c1;
the query tree ends up looking like this:
ProjectRestrictNode RCL -> (ptr to GBN(column[0]), ptr to GBN(column[1]), ptr to GBN(column[4]))
|
GroupByNode RCL->(C1, SUM(C2), <agg-input>, <aggregator>, MAX(C3), <agg-input>, <aggregator>)
|
ProjectRestrict RCL->(C1, C2, C3)
|
FromBaseTable
The RCL of the GroupByNode contains all the unagg (or grouping) columns followed by 3 RC's for each aggregate in this order: the final computed aggregate value, the aggregate input and the aggregator function.
The Aggregator function puts the results in the first of the 3 RC's and the PR resultset in turn picks up the value from there.
The notation (ptr to GBN(column[0])) basically means that it is a pointer to the 0th RC in the RCL of the GroupByNode.
The addition of these unagg and agg columns to the GroupByNode and to the PRN is performed in addUnAggColumns and addAggregateColumns.
Note that the addition of the GroupByNode is done after the query is optimized (in SelectNode#modifyAccessPaths), which means a fair amount of patching up is needed to account for generated group by columns.
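The column positions in the example follow directly from the layout described above: grouping columns first, then three RCs per aggregate with the result first. The following sketch models that layout in plain Java to show why the parent PR points at column[1] and column[4]; the class and method names are hypothetical and are not part of Derby:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the GroupByNode RCL layout; not Derby code.
class GroupByRclLayoutSketch {
    // Grouping columns come first, then three entries per aggregate.
    static List<String> layout(List<String> groupingColumns, List<String> aggregates) {
        List<String> rcl = new ArrayList<>(groupingColumns);
        for (String agg : aggregates) {
            rcl.add(agg);            // final computed aggregate value
            rcl.add("<agg-input>");  // aggregate input expression
            rcl.add("<aggregator>"); // run time aggregator
        }
        return rcl;
    }

    // Zero-based position of the result column of the given aggregate.
    static int resultIndex(int groupingColumnCount, int aggregateOrdinal) {
        return groupingColumnCount + 3 * aggregateOrdinal;
    }

    public static void main(String[] args) {
        List<String> rcl = layout(Arrays.asList("C1"), Arrays.asList("SUM(C2)", "MAX(C3)"));
        // [C1, SUM(C2), <agg-input>, <aggregator>, MAX(C3), <agg-input>, <aggregator>]
        System.out.println(rcl);
        // 1 4
        System.out.println(resultIndex(1, 0) + " " + resultIndex(1, 1));
    }
}
```

With one grouping column, SUM(C2) lands at index 1 and MAX(C3) at index 4, matching the pointers in the ProjectRestrictNode RCL.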
- Throws:
StandardException
-
addAggregateColumns
private void addAggregateColumns() throws StandardException
In the query rewrite involving aggregates, add the columns for aggregation.
- Throws:
StandardException
- See Also:
addNewColumnsForAggregation()
-
getParent
final FromTable getParent()
Return the parent node to this one, if there is one. It will return 'this' if there is no generated node above this one.
- Returns:
the parent node
-
optimizeIt
public CostEstimate optimizeIt(Optimizer optimizer, OptimizablePredicateList predList, CostEstimate outerCost, RowOrdering rowOrdering) throws StandardException
Description copied from interface: Optimizable
Choose the best access path for this Optimizable.
- Specified by:
optimizeIt in interface Optimizable
- Overrides:
optimizeIt in class FromTable
- Parameters:
optimizer - Optimizer to use.
predList - The predicate list to optimize against
outerCost - The CostEstimate for the outer tables in the join order, telling how many times this Optimizable will be scanned.
rowOrdering - The row ordering for all the tables in the join order, including this one.
- Returns:
The optimizer's estimated cost of the best access path.
- Throws:
StandardException - Thrown on error
- See Also:
Optimizable.optimizeIt(org.apache.derby.iapi.sql.compile.Optimizer, org.apache.derby.iapi.sql.compile.OptimizablePredicateList, org.apache.derby.iapi.sql.compile.CostEstimate, org.apache.derby.iapi.sql.compile.RowOrdering)
-
estimateCost
public CostEstimate estimateCost(OptimizablePredicateList predList, ConglomerateDescriptor cd, CostEstimate outerCost, Optimizer optimizer, RowOrdering rowOrdering) throws StandardException
Description copied from interface: Optimizable
Estimate the cost of scanning this Optimizable using the given predicate list with the given conglomerate. It is assumed that the predicate list has already been classified. This cost estimate is just for one scan, not for the life of the query.
- Specified by:
estimateCost in interface Optimizable
- Overrides:
estimateCost in class FromTable
- Parameters:
predList - The predicate list to optimize against
cd - The conglomerate descriptor to get the cost of
outerCost - The estimated cost of the part of the plan outer to this optimizable.
optimizer - The optimizer to use to help estimate the cost
rowOrdering - The row ordering for all the tables in the join order, including this one.
- Returns:
The estimated cost of doing the scan
- Throws:
StandardException - Thrown on error
- See Also:
Optimizable.estimateCost(org.apache.derby.iapi.sql.compile.OptimizablePredicateList, org.apache.derby.iapi.sql.dictionary.ConglomerateDescriptor, org.apache.derby.iapi.sql.compile.CostEstimate, org.apache.derby.iapi.sql.compile.Optimizer, org.apache.derby.iapi.sql.compile.RowOrdering)
-
pushOptPredicate
public boolean pushOptPredicate(OptimizablePredicate optimizablePredicate) throws StandardException
Description copied from interface: Optimizable
Push an OptimizablePredicate down, if this node accepts it.
- Specified by:
pushOptPredicate in interface Optimizable
- Overrides:
pushOptPredicate in class FromTable
- Parameters:
optimizablePredicate - OptimizablePredicate to push down.
- Returns:
Whether or not the predicate was pushed down.
- Throws:
StandardException - Thrown on error
- See Also:
Optimizable.pushOptPredicate(org.apache.derby.iapi.sql.compile.OptimizablePredicate)
-
toString
public java.lang.String toString()
Convert this object to a String. See comments in QueryTreeNode.java for how this should be done for tree printing.
-
printSubNodes
void printSubNodes(int depth)
Prints the sub-nodes of this object. See QueryTreeNode.java for how tree printing is supposed to work.
- Overrides:
printSubNodes in class SingleChildResultSetNode
- Parameters:
depth - The depth of this node in the tree
-
flattenableInFromSubquery
boolean flattenableInFromSubquery(FromList fromList)
Evaluate whether or not the subquery in a FromSubquery is flattenable. Currently, a FSqry is flattenable if all of the following are true:
- Subquery is a SelectNode.
- It contains no top level subqueries. (RESOLVE - we can relax this)
- It does not contain a group by or having clause.
- It does not contain aggregates.
- Overrides:
flattenableInFromSubquery in class SingleChildResultSetNode
- Parameters:
fromList - The outer from list
- Returns:
boolean Whether or not the FromSubquery is flattenable.
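The conditions listed for flattenableInFromSubquery amount to a conjunction of four checks. A minimal sketch of that conjunction follows; the class and its boolean fields are hypothetical stand-ins for properties of the subquery, and this is not how Derby represents or evaluates them:

```java
// Hypothetical summary of a FROM subquery; these fields are not Derby APIs.
class FlattenableCheckSketch {
    final boolean isSelectNode;
    final boolean hasTopLevelSubqueries;
    final boolean hasGroupByOrHaving;
    final boolean hasAggregates;

    FlattenableCheckSketch(boolean isSelectNode, boolean hasTopLevelSubqueries,
                           boolean hasGroupByOrHaving, boolean hasAggregates) {
        this.isSelectNode = isSelectNode;
        this.hasTopLevelSubqueries = hasTopLevelSubqueries;
        this.hasGroupByOrHaving = hasGroupByOrHaving;
        this.hasAggregates = hasAggregates;
    }

    // Flattenable only if every documented condition holds.
    boolean flattenable() {
        return isSelectNode
                && !hasTopLevelSubqueries
                && !hasGroupByOrHaving
                && !hasAggregates;
    }
}
```

Under this reading, any subquery that contains a group by, a having clause, or aggregates fails the check.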
-
optimize
ResultSetNode optimize(DataDictionary dataDictionary, PredicateList predicates, double outerRows) throws StandardException
Optimize this GroupByNode.
- Overrides:
optimize in class SingleChildResultSetNode
- Parameters:
dataDictionary - The DataDictionary to use for optimization
predicates - The PredicateList to optimize. This should be a join predicate.
outerRows - The number of outer joining rows
- Returns:
ResultSetNode The top of the optimized subtree
- Throws:
StandardException - Thrown on error
-
makeResultDescriptors
ResultColumnDescriptor[] makeResultDescriptors()
- Overrides:
makeResultDescriptors in class ResultSetNode
-
isOneRowResultSet
boolean isOneRowResultSet() throws StandardException
Return whether or not the underlying ResultSet tree will return a single row, at most. This is important for join nodes where we can save the extra next on the right side if we know that it will return at most 1 row.
- Overrides:
isOneRowResultSet in class SingleChildResultSetNode
- Returns:
Whether or not the underlying ResultSet tree will return a single row.
- Throws:
StandardException - Thrown on error
-
generate
void generate(ActivationClassBuilder acb, MethodBuilder mb) throws StandardException
Generate the sort result set operating over the source result set. Adds distinct aggregates to the sort if necessary.
- Overrides:
generate in class QueryTreeNode
- Parameters:
acb - The ActivationClassBuilder for the class being built
mb - The method for the generated code to go into
- Throws:
StandardException - Thrown on error
-
genScalarAggregateResultSet
private void genScalarAggregateResultSet(ActivationClassBuilder acb, MethodBuilder mb)
Generate the code to evaluate scalar aggregates.
-
genGroupedAggregateResultSet
private void genGroupedAggregateResultSet(ActivationClassBuilder acb, MethodBuilder mb) throws StandardException
Generate the code to evaluate grouped aggregates.
- Throws:
StandardException
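The scalar and grouped generators correspond to the two shapes of aggregate query: a scalar aggregate produces one result for the whole input, while a grouped aggregate produces one result per distinct grouping key. The sketch below shows the difference on in-memory rows; it is plain Java with hypothetical names and is unrelated to the code Derby actually generates:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; each row is {c1, c2}.
class AggregateShapesSketch {
    // Scalar aggregate, e.g. SELECT SUM(c2) FROM t1: one result for all rows.
    static int scalarSum(List<int[]> rows) {
        int total = 0;
        for (int[] row : rows) {
            total += row[1];
        }
        return total;
    }

    // Grouped aggregate, e.g. SELECT c1, SUM(c2) FROM t1 GROUP BY c1:
    // one running sum per distinct value of c1.
    static Map<Integer, Integer> groupedSum(List<int[]> rows) {
        Map<Integer, Integer> sums = new TreeMap<>();
        for (int[] row : rows) {
            sums.merge(row[0], row[1], Integer::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<int[]> rows = Arrays.asList(
                new int[]{1, 10}, new int[]{2, 5}, new int[]{1, 7});
        System.out.println(scalarSum(rows));  // 22
        System.out.println(groupedSum(rows)); // {1=17, 2=5}
    }
}
```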
-
getColumnReference
private ResultColumn getColumnReference(ResultColumn targetRC, DataDictionary dd) throws StandardException
Method for creating a new result column referencing the one passed in.
- Parameters:
targetRC - the source
dd -
- Returns:
the new result column
- Throws:
StandardException - on error
-
considerPostOptimizeOptimizations
void considerPostOptimizeOptimizations(boolean selectHasPredicates) throws StandardException
Consider any optimizations after the optimizer has chosen a plan. Optimizations include:
- min optimization for scalar aggregates
- max optimization for scalar aggregates
- Parameters:
selectHasPredicates - true if SELECT containing this vector/scalar aggregate has a restriction
- Throws:
StandardException - on error
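The page does not spell out what the min and max optimizations do. One plausible reading, in line with the class-level note that min()/max() can use index scans, is that an ordered index lets the engine read the smallest or largest key directly instead of scanning and aggregating every row. The sketch below illustrates only that idea, with a java.util.TreeSet standing in for an ordered index; the class and method names are hypothetical and this is an assumption, not Derby's implementation:

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical illustration; a TreeSet stands in for an ordered index.
class MinMaxIndexSketch {
    // MIN: the first key in index order, no scan of all rows needed.
    static int minViaIndex(TreeSet<Integer> index) {
        return index.first();
    }

    // MAX: the last key in index order.
    static int maxViaIndex(TreeSet<Integer> index) {
        return index.last();
    }

    public static void main(String[] args) {
        TreeSet<Integer> index = new TreeSet<>(Arrays.asList(5, 2, 9));
        System.out.println(minViaIndex(index)); // 2
        System.out.println(maxViaIndex(index)); // 9
    }
}
```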
-
-