As part of the OLAPOperations effort, we'd like to add support for ROLLUP multi-level grouping.
Consider a table of sales transaction data, which tracks information about particular sales:
create table sales_history (region varchar(10), state char(2), product varchar(10), date_of_sale timestamp, sales int);
Such a table has one row for each sales transaction, recording the region and state where the sale occurred, the product that was sold, the date on which it was sold, and the amount of the sale.
Now suppose that we'd like to analyze our sales data, to study the amount of sales that is occurring for different products, in different states and regions. Using the ROLLUP feature of SQL 2003, we could issue the query:
select region, state, product, sum(sales) total_sales from sales_history group by rollup (region, state, product)
Semantically, the above query is equivalent to
select region, state, product, sum(sales) total_sales from sales_history group by region, state, product union select region, state, null, sum(sales) total_sales from sales_history group by region, state union select region, null, null, sum(sales) total_sales from sales_history group by region union select null, null, null, sum(sales) total_sales from sales_history
The query might produce results that looked something like:
REGION STATE PRODUCT TOTAL_SALES ------ ----- ------- ----------- null null null 6200 EAST MA BOATS 100 EAST MA CARS 1500 EAST MA null 1600 EAST NY BOATS 150 EAST NY CARS 1000 EAST NY null 1150 EAST null null 2750 WEST CA BOATS 750 WEST CA CARS 500 WEST CA null 1250 WEST AZ BOATS 2000 WEST AZ CARS 200 WEST AZ null 2200 WEST null null 3450
We could actually compute the ROLLUP query by transforming it into the equivalent UNION query. However, we should seek to find an implementation in which the grouped aggregates can be computed using a minimum number of sorts of the data, and a minimum number of passes over the input data. This is discussed in more detail later on this page, but our goal should be that the expense of computing this query is roughly equivalent to the expense of computing the single sub-query
select region, state, product, sum(sales) total_sales from sales_history group by region, state, product
because the other aggregation levels can be computed at the same time.
In the SQL 2003 standard, the ROLLUP keyword impacts the following portions of the SQL grammar:
<grouping element> ::= <ordinary grouping set> | <rollup list> | <cube list> | <grouping sets specification> | <empty grouping set> <rollup list> ::= ROLLUP <left paren> <ordinary grouping set list> <right paren> <grouping set> ::= <ordinary grouping set> | <rollup list> | <cube list> | <grouping sets specification> | <empty grouping set>
The above grammar is just a tiny subset of the overall GROUP BY grammar, namely those parts that involve ROLLUP.
Current GROUP BY implementation
The current implementation of GROUP BY aggregation is (over-simplified, but hopefully accurate):
- Determine the rows to be grouped, by executing the indicated query
- For each row, augment the row with additional hidden columns which hold information about the aggregation to be performed (SUM, MAX, COUNT, etc.)
- Process the input row set, using either
- inline aggregation if the data is already in sorted order (or has to be sorted in order to determine the DISTINCT values), or
- duplicate-eliminating sort aggregation
- return the resulting grouped aggregates.
The grouped aggregate processing occurs in GroupedAggregateResultSet.java
Techniques for Implementing Grouped Aggregates
It seems that there are basically 3 ways to perform grouped aggregation:
- Inline Aggregation
- Duplicate-eliminating sort with sort-observer aggregation
- Duplicate-eliminating hash with hash-collision aggregation
There also appear to be numerous ways to combine these three variants, which we refer to as "hybrid" algorithms below.
If the records to be grouped are already in sorted order, perhaps due to a scan of an index, or a previous merge-sort-join operation, or a duplicate-eliminating distinct sort, then the grouping operation can be implemented by comparing each pair of records on the grouping attributes; each time a new value of the grouping attributes is encountered, the computation of the previous group can be finalized and emitted.
This algorithm is currently implemented by Derby for a single set of grouping attributes; it seems possible to extend this logic for a ROLLUP set of groups, as follows:
Duplicate-eliminating Sort with Sort Observer
An unsorted set of records can be grouped by feeding them into the sorter, configured for duplicate elimination, with a sort observer which aggregates each row's values into the retained row's columns during duplicate processing. Each such sort groups the records by a specific set of grouping attributes.
- Sort the rows by the grouping columns, eliminating duplicates during the sort
- Each time a duplicate row is eliminated, aggregate its information into the row which the sorter retains
- At the end of the sort, there is one remaining row for each distinct set of values of the grouping columns, and the augmented aggregation columns in that row hold the computed aggregate values.
Thus aggregate computation is a side-effect of sorting and duplicate elimination. Note that as a side effect of using the sorter for this purpose, the result set is emitted in ORDER BY the grouping attributes.
This algorithm is currently implemented by Derby for a single set of grouping attributes; it seems possible to extend this logic for implementing a ROLLUP set of groups, as follows:
- Use N sorts to group the records by N different sets of grouping attributes.
- The input rows are read only once, but then are added to each of the N sorts.
This should be substantially more efficient than the alternate query-rewriting algorithm of constructing N GROUP BY statements and then UNION'ing their results together, since in this implementation the input rows are materialized and read only once.
However, if multiple sorts are used to compute multiple grouping sets, whether ROLLUP or unrelated, the overall result set may not necessarily be emitted in ORDER BY the grouping attributes.
This algorithm also seems like it could be extended to implement other types of GROUPING SETs and CUBE aggregation, since each sort can independently group the records by a different set of unrelated grouping attributes.
Hash-based Grouped Aggregation
An unsorted set of records can also be grouped by feeding them into a hash table, whose keys are the grouping attributes, with a hash-collision observer which aggregates each row's values into the retained row's columns during duplicate key elimination. This is fundamentally similar to the sort-based technique, with the important difference that at the end of the processing, the rows will NOT be emitted in sorted order.
I am not aware of any code which implements a hash-based technique for grouped aggregates in Derby.
The inline-aggreation and sort-based techniques can be combined into an algorithm which computes the N ROLLUP groups using a single sort, by first sorting (and aggregating) the rows on their finest level of grouping detail.
All we have to do is to extend GroupedAggregateResultSet so that instead of computing a single result row at a time, it maintains N pending result rows, where N is the number of levels of ROLLUP in the query (N == 4 in the query above).
So, when GroupedAggregateResultSet is processing a row of data for (region=EAST,state=MA,product=BOATS), it should not only sum the sales for that level of aggregation, but should also at the same time sum the sales for the aggregation levels (region=EAST,state=MA), and (region=EAST), and ().
Then, when GroupedAggregateResultSet encounters the end of the (EAST,MA,BOATS) data, and starts to encounter (EAST,MA,CARS) data, it should finalize the result row for (EAST,MA,BOATS) and start working on computing the result row for (EAST,MA,CARS), but it should simply continue aggregating the result rows for (EAST,MA), (EAST), and ().
And when GroupedAggregateResultSet encounters the end of the (EAST,MA,CARS) data, and starts to encounter (EAST,NY,BOATS) data,it should finalize the result row for (EAST,MA,CARS), and it should also finalize the result row for (EAST,MA), and it should start working on computing the result row for (EAST,NY,BOATS), but it should continue aggregating the result rows for (EAST) and for ().