As part of query "preprocessing", which is the first of Derby's three phases of optimization, every result set node in the query tree figures out what its "referenced table map" is. In Derby terminology a "table map" is a bit map which, as a general rule, contains one bit for every Optimizable in the query that has a non-negative table number. An "Optimizable" is defined as any result set node that can occur in the FROM clause of a SELECT statement. So for a given result set node rsn in some query, rsn's "referenced table map" is a table map in which the bits corresponding to any Optimizables that rsn references have been set. Note that a "referenced table map" is also called a "referenced set" in some parts of the Derby code.

That said, each Optimizable is responsible for building its own "referenced table map". A FromBaseTable, for example, simply sets the bit that corresponds to itself. Likewise for a RowResultSetNode. Any class that extends SetOperatorNode, though,has a referenced map that is the "or" of its children's maps. And so on. At the time of writing the following result set nodes set their referenced maps as indicated here:

As an example, assume we have the following query:

select, ppl_info.fullname
  from happy_ppl_ids, ppl_info
  where =

In this query there are two Optimizables: the first is happy_ppl_ids, the second is ppl_info. So all "table maps" for this query will have two bits in them, one for each Optimizable. As part of query binding each Optimizable will get assigned a specific "table number". In the the example above happy_ppl_ids will end up with table number "0" and ppl_info will end up with table number "1". The referenced table map for the SelectNode will then have both bits set because it (the SelectNode) references both tables. The referenced map for happy_ppl_info will only have bit "0" set; the referenced map for ppl_info will only have bit "1" set.

In addition to result set nodes, there are other classes in Derby that contain "referenced table" information, as well. Of particular interest are the Predicate and the ColumnReference classes. The referencedSet field of a Predicate indicates which Optimizables are referenced by that Predicate. In the example above, the predicate ( = will reflect the tables referenced by the equality, and thus it will have both bits "0" and bits "1" set. As for ColumnReferences, a specific column reference does not have a "bit map" per se, but it does have a field to hold the table number of the Optimizable to which it (the ColumnReference) points. In the example query there are actually four column references:

  1. from the SELECT's result column list

  2. ppl_info.fullname from the SELECT's result column list

  3. from the left side of the predicate

  4. from the right side of the predicate

The third column reference in this list will have its tableNumber field set to "0" because it is pointing to happy_ppl_ids and we said above that happy_ppl_ids was designated as table "0". All other column references in this list will have table number "1" because they point to ppl_info.

For more on table numbers, see OptimizerTableNumbers.

Both Optimizables in the example above are base tables, but that is not a requirement. Nor is it a requirement that the Optimizables occur as part of a SELECT query; they may be created in other ways, as well. For example if we have the following query:

values 'hi', 'bye', 'salut', 'ciao'

Derby will actually create SEVEN Optimizables, as demonstrated by the following query tree ("RRSN" stands for RowResultSetNode, which is an instance of Optimizable representing a single row in the result set returned by a VALUES query):

                      /    \
                 UNION[4]   RRSN[5] ("ciao")
                /      \
          UNION[2]      RRSN[3] ("salut")
            /    \
  RRSN[0] ("hi")  RRSN[1] ("bye")

In this tree the bracketed digit for each Optimizable represents the "table number" assigned to that Optimizable during binding. The quoted strings in parentheses correspond to rows in the VALUES result set. For the above query, then, the "referenced table map" for each UNION and RowResultSetNode will have seven bits in it, with the bits set as follows (where "{ i }" means that the bit corresponding to the Optimizable whose table number is i has been set):

Generally speaking "referenced table maps" are useful for determining what dependencies exist between the different result set nodes, columns, and column references within a query. They also make it possible for the Derby optimizer to figure out when and where to push predicates down the query tree, as described in more detail on the PredicatePushdown page.

ReferencedTableMaps (last edited 2009-09-20 22:12:37 by localhost)