Provides a collection of tips for debugging Trafodion code.

Trafodion Process Debug Hooks

For most purposes, you can simply attach a gdb debugger to a running Trafodion process. Occasionally, though, you need to debug an issue pertaining to process initialization; that is, you need a way to attach gdb at the very beginning of a process. For a process started at a shell prompt, you can of course simply use gdb at the outset (e.g. "gdb sqlci"). But for processes that are started by other Trafodion processes or by the sqstart script, another technique is needed.

Trafodion processes provide a debug hook for this purpose. Very early in main, there is a call to msg_debug_hook. If the HOOK_ENABLE environment variable is set to a non-zero value, this function checks for the existence of a file in the current working directory; the name of the file is given by the second parameter to msg_debug_hook. If the file doesn't exist, the function sleeps for one second and checks again, looping until the file exists.

Each kind of Trafodion process uses a different file name for this purpose.

For Trafodion processes started by sqstart, the current working directory is $TRAF_HOME/sql/scripts. Environment variables for these processes are set in the $TRAF_HOME/etc/ms.env file.

There is a script, createhooks, in $TRAF_HOME/sql/scripts, that will create a complete set of hook files. To hook a specific process (say, tdm_arkcmp), simply delete its associated hook file (ark.hook, for tdm_arkcmp). Next, cause the process to be created (for tdm_arkcmp, you can start it from sqlci by doing "UPDATE STATISTICS LOG OFF;"). Then use gdb to attach to the process, use touch to create the appropriate file, and continue with your debugging.
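
A sketch of that sequence as shell commands (the grep pattern and the pid placeholder are illustrative; HOOK_ENABLE must already be set in $TRAF_HOME/etc/ms.env for monitor-started processes):

cd $TRAF_HOME/sql/scripts
createhooks                 # create the full set of hook files
rm ark.hook                 # delete tdm_arkcmp's hook file so new tdm_arkcmp processes wait
# now cause tdm_arkcmp to start, e.g. "UPDATE STATISTICS LOG OFF;" from sqlci
sqps | grep arkcmp          # find the Linux pid of the looping tdm_arkcmp
gdb -p <pid>                # attach and set your breakpoints
touch ark.hook              # back in the shell: let the process continue past the hook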

When you are done, don't forget to reset HOOK_ENABLE to zero in $TRAF_HOME/etc/ms.env. Also, if you wish, you can delete the hook files in $TRAF_HOME/sql/scripts by using the delhooks script there.

Using Loggers in Trafodion Java code

The Trafodion Java code uses log4j. To turn on logging for a class, edit the configuration file $TRAF_HOME/conf/log4j.hdfs.config. (Note that on a cluster, you'll need to do this on every node whose processes you want to track.) The resulting log messages are written to $TRAF_HOME/logs/trafodion.hdfs.log (again, on a cluster, there is one such log file per node).

Note that Trafodion Java modules have package name org.trafodion.sql. So, for example, if I want to turn on logging for the Trafodion HBaseClient class, I would add the following line to log4j.hdfs.config:

log4j.logger.org.trafodion.sql.HBaseClient=DEBUG

Note also that ALL of the logging for ALL processes on a given node goes to just one log file on that node, trafodion.hdfs.log. So, it is best to enable this debug setting for just a short while.
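
When you do, filtering the shared log on your class name keeps the noise down; a minimal sketch:

tail -f $TRAF_HOME/logs/trafodion.hdfs.log | grep HBaseClient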

Environment Variables and Trafodion Processes

Many processes in Trafodion are started by the Trafodion monitor component. Those processes get their environment settings from the monitor process, which in turn reads them from the file $TRAF_HOME/etc/ms.env. So, to set an environment variable in processes started by the monitor, edit ms.env and add your variable there. You'll have to do an sqstop + sqstart for the change to take effect.

Examples of processes started by the monitor are: mxosrvr, tdm_arkcmp, tdm_arkesp, tdm_udr.

Some processes, such as sqlci, are started directly from a shell prompt. Those processes of course inherit any environment variables in the shell environment.
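A sketch covering both cases (MY_DEBUG_VAR is a made-up variable name):

# Monitor-started processes (mxosrvr, tdm_arkcmp, tdm_arkesp, tdm_udr):
echo "MY_DEBUG_VAR=1" >> $TRAF_HOME/etc/ms.env
sqstop; sqstart               # restart so the new setting takes effect

# Shell-started processes such as sqlci:
export MY_DEBUG_VAR=1
sqlci
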

Catching SQL Errors at the Source

When debugging SQL compiler and executor issues, often one wishes to find the place in the code where the error is raised. This can sometimes be done by grepping, then using gdb to add debug stops.

Sometimes this is inconvenient. For example, the error might occur during execution of a parallel plan in some tdm_arkesp process, and it can be painful to put them all into debug to catch an error that occurs in only one of them. There is a hook that can help here, though.

By setting the environment variable ABORT_ON_ERROR to the number of the SQL error, any SQL process will abort when it encounters that error. So, you get a core file that you can then look at with gdb. For example, setting ABORT_ON_ERROR=8446 will cause any SQL process to core if it encounters a SQL error 8446.

If you prefer debugging the process live, you can instead use the environment variable LOOP_ON_ERROR, which causes any SQL process to loop indefinitely when it encounters that error. You can watch your terminal output for a message from such a process, then attach a gdb debugger to it.

As described in the section above, set these variables in $TRAF_HOME/etc/ms.env if the SQL process of interest is started by the monitor. If it is sqlci, it is enough to set the variable in your shell.
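
For example, to core sqlci on error 8446 (a sketch; the core file name depends on your system's core pattern):

export ABORT_ON_ERROR=8446    # sqlci inherits this from the shell
sqlci                         # reproduce the failure; sqlci aborts on error 8446
gdb sqlci <core file>         # then examine the resulting core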

Debugging an mxosrvr Process

For purposes of debugging compiler and executor code, using gdb on sqlci is the simplest debugging environment.

However, occasionally you may be debugging an issue that occurs only via ODBC/JDBC and cannot be reproduced via sqlci. For these issues, you may need to debug in the mxosrvr process. These are persistent server processes on the Trafodion cluster that service ODBC/JDBC connections.

Finding the Right mxosrvr Process

You can put an mxosrvr process into debug via gdb by starting gdb on the proper node and using the gdb attach command. To do this, though, you will need to know which mxosrvr process your client is connected to and its Linux pid. If you are using trafci (the type-4 JDBC interactive client to Trafodion), you can use the "show remoteprocess" command as in the following example:

SQL>show remoteprocess;
REMOTE PROCESS \venkatsentry-2.novalocal:1.$Z0112LJ

In the output above, the node hosting the mxosrvr process is venkatsentry-2.novalocal, and the Trafodion process name is $Z0112LJ.

If you now start a shell on that node, you can do an "sqps | grep Z0112LJ" command to see the Linux pid of the process.
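
Putting the two steps together (the pid shown is an illustrative assumption):

sqps | grep Z0112LJ         # locate the process; note its Linux pid (say, 12345)
gdb -p 12345                # attach to that mxosrvr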

If you are on a workstation using an instance created via "install_local_hadoop", there is another possibility. You can reconfigure DCS to have just one mxosrvr process. (Out of the box, install_local_hadoop configures four.) To do this, edit the conf/servers file as described here: http://trafodion.apache.org/docs/dcs_reference/index.html#_configuration_files. Then perform a dcsstop + a dcsstart to restart DCS.
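
Assuming the format described in the DCS reference (a host name followed by the number of mxosrvr processes on that host), a single-server conf/servers file for a workstation instance might read:

localhost 1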

Dealing with Timeouts

The mxosrvr process is designed to be persistent and relies on Zookeeper and the DCS Master process for this purpose. There is timeout logic to determine whether an mxosrvr process is still alive. If an mxosrvr is unresponsive for longer than the timeout, it may kill itself or be killed (if it still exists), and a new mxosrvr process is created. This can be a problem in debugging; slowly stepping through code in gdb can cause one or another timeout to be exceeded. To mitigate this, you can set the timeouts to higher values. For example, add the following to the conf/dcs-site.xml file on each node in the cluster:

   <property>
      <name>dcs.server.user.program.zookeeper.session.timeout</name>
      <value>3600</value>
   </property>
   <property>
      <name>zookeeper.session.timeout</name>
      <value>3600000</value>
   </property>

After changing conf/dcs-site.xml, you will need to stop and restart DCS (use the "dcsstop" and "dcsstart" scripts) in order for the change to take effect.

For more detailed information about mxosrvr configuration parameters, see the Trafodion Data Connectivity Services Reference Guide at http://trafodion.apache.org/docs/dcs_reference/index.html.

Turning off Repository Writes

If you are debugging a compiler or executor issue in an mxosrvr process, you may find that your breakpoints are hitting on writes to the Trafodion Repository tables. There is a separate thread in mxosrvr that periodically flushes out statistical data to the Repository using SQL DML statements. This can be annoying. You can turn off Repository writes by adding the following to conf/dcs-site.xml:

   <property>
      <name>dcs.server.user.program.statistics.enabled</name>
      <value>false</value>
   </property>

Debugging Mixed C++/Java Processes

Many Trafodion processes (such as sqlci and mxosrvr) have a C++ main and substantial amounts of Java code invoked via JNI.

You can debug the C++ parts using a debugger such as gdb. One gotcha is that JVM threads often raise signals such as SIGSEGV as part of their normal processing. (The HotSpot JVM, for example, is reputed to use SIGSEGV to trigger lazy compilation of Java methods.) Unfortunately, gdb catches these signals first, which can be quite annoying.

A way to work around this is to enter the following command into gdb:

handle SIGSEGV pass noprint nostop

Alternatively, place this command in your .gdbinit file.

Another issue is that threads come and go fairly frequently in Trafodion processes, and gdb prints a message every time this happens. If this is distracting, you can turn it off with this gdb command:

set print thread-events off

Again, you can place this command in your .gdbinit file.
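
For convenience, both of the gdb settings above can live together in your .gdbinit, e.g.:

# ~/.gdbinit
handle SIGSEGV pass noprint nostop   # let the JVM handle its own SIGSEGVs
set print thread-events off          # silence thread start/exit chatter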

Navigating Around a Cluster

Many developers do their work on workstations, in an HBase standalone environment. Sometimes, though, you may be doing development work on a cluster. Here are some tips for getting around on a cluster.

Finding out what nodes are part of a cluster

If you are logged onto one node as the trafodion user, there are several environment variables that hold the list of node names; $MY_NODES is one example. You can simply do “env | grep <node name>”, where <node name> is the node you’re logged onto, to find these environment variables.
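
For example, on a node named abc031 (the node name is illustrative):

env | grep abc031    # shows MY_NODES and the other variables listing the cluster's nodes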

Logs

Logs for SeaQuest processes are found in $TRAF_HOME/logs. You can get there using the “cdl” command from a trafodion logon. Note that there is one of these directories per node in the cluster. If you know which node your process of interest is running on, you can go directly to that node.

For many purposes, a single log file is kept per node. For example, all C++ SQL compiler and executor logic on a given node writes to the file trafodion.sql_0.log. Similarly, Java SQL compiler and executor logic writes to trafodion.sql.java.log.

UPDATE STATISTICS behaves a bit differently. Its logging is very chatty. So, in normal operation a log file is generated for each command, and deleted on completion if there were no severe errors in that command. If there was a severe error (e.g. 9200, internal error), then the log is kept. This behavior can be changed using UPDATE STATISTICS LOG OFF (in which case no log is generated) or UPDATE STATISTICS LOG ON (in which case a log is kept for the life of the process). Log files for UPDATE STATISTICS have file names of the form ustat.[node number].[Linux pid].[timestamp].txt.
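
To see which UPDATE STATISTICS logs survived on a node, a glob over that name pattern works; a sketch:

ls -lt $TRAF_HOME/logs/ustat.*.txt | head    # most recent surviving ustat logs first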

Scripts

Scripts can be found in the $TRAF_HOME/sql/scripts directory. You can get there using the “cds” command from a trafodion logon.

Figuring out why a node went down

In your testing, you might suddenly discover that the node you’re logged into went down. (I had this happen when doing a control-C in sqlci while a massive UPDATE STATISTICS command was running.) Look at the SeaQuest logs to discover why. For example, do “cdl” to get to the logs directory, then do “ls -g” to see which logs were most recently updated. In my case, the watchdog timer killed the node, and this was revealed in the watchdog timer log. One caveat: “ls -g” shows times in local time, while the messages in the log files themselves bear timestamps in UTC.
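
A sketch of that triage (adding -t sorts the most recently modified logs first):

cdl               # cd to the logs directory
ls -gt | head     # newest logs first; times are local, log contents are stamped in UTC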

Bringing a node back up

In your testing, you might find that the instance seems to go down. If you are logged on as the trafodion user, you can use the "trafcheck" script to discover what nodes in the cluster are up or down.

If just one node is down, you can use “sqshell” to bring the node back up. When bringing it back up, use the full DNS name of the node (e.g. “abc031.yourcompany.local”). Use the help command in “sqshell” to get details on the command to bring a node up.

HBase Status

HBase provides a GUI that gives information about its status. The URL to access this GUI is of the form http://abc031.yourcompany.local:60010/master-status. The port number 60010 is typical on Cloudera(TM) clusters; port number 16010 is used instead on Hortonworks(TM) clusters. From this tool you can obtain information about the number of region servers, region server state, the set of regions per table, and so on.

Vendors often provide additional management GUIs in their distributions. For example, Hortonworks distributions package Ambari; the URL for that is like http://abc031.yourcompany.local:8080. For Cloudera, it is Cloudera Manager, with a URL like http://abc031.yourcompany.local:7180/cmf/services/11/status. Both of these products typically require a login; get the user name and password from your cluster administrator.

Figuring out which pid is a RegionServer

On a workstation, the “jps” command shows you which process is the HMaster (which is also the RegionServer on standalone HBase installations). On a cluster, however, “jps” only shows information about java processes associated with your logon ID, which is usually “trafodion”. Unfortunately the HBase processes run under a different ID, typically “hbase”. Even so, you can still figure it out by using “ps -fu hbase”. You’ll need a wide terminal window to see it (or pipe the output to a text file), but a java process with “-Dproc_regionserver” in its command parameters identifies a region server.
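
A sketch that pulls out just the pid (the [D] in the pattern keeps grep from matching itself; the pid column position assumes standard ps -f output):

ps -fu hbase | grep '[D]proc_regionserver' | awk '{print $2}'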

Figuring out what a RegionServer is up to

If you see in “top” that a java process is very busy, and you identify that process as a RegionServer, you might want to know what the RegionServer is doing. The HBase status GUI can tell you. Use the URL http://abc031.yourcompany.local:60030/dump (substituting your node name; also the port number will be 16030 on a Hortonworks cluster). This gives you a map of all threads in the process, along with their stack traces.

Another URL that might be interesting is http://abc031.yourcompany.local:60030/jmx (port number 16030 on a Hortonworks cluster), which gives JMX statistics for the process.
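
Both URLs can also be fetched from a shell. The /jmx endpoint returns JSON, so pretty-printing helps; a sketch using the Cloudera-style port:

curl -s http://abc031.yourcompany.local:60030/jmx | python -m json.tool | less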

The SHOWPLAN Utility

SHOWPLAN is an extremely useful utility for Trafodion compiler and executor developers who are debugging issues with generated plans, particularly in the exp (expressions) component. It shows detailed information about the contents of TDBs (Task Description Blocks, that is, the relational operators in a plan), and the scalar expressions associated with each TDB.

For example, I may be interested to know where in the data flow the UPPER function is executed in the query "select upper(name) from t1;". The following example shows how to use SHOWPLAN, and its output.

>>showplan select upper(name) from t1;
MASTER Executor fragment
========================
Fragment ID: 0, Length: 11768
Contents of EX_ROOT [2]:
------------------------
For ComTdb :
Class Version = 1, Class Size = 552
InitialQueueSizeDown = 4, InitialQueueSizeUp = 4
queueResizeLimit = 9, queueResizeFactor = 4
queueSizeDown = 0, queueSizeUp = 0, numBuffers = 0, bufferSize = 0
estimatedRowUsed = 0.000000, estimatedRowsAccessed = 0.000000, expressionMode = 0
Flag = 29
For ComTdbRoot :
FirstNRows = -1, baseTablenamePosition = -1
queryType_ = 2, planVersion_ = 2600
rtFlags1_ = 2030a00
rtFlags2_ = 08
rtFlags3_ = 0
rtFlags4_ = 02000
rtFlags5_ = 00
queryType_ = 2
inputVarsSize_ = 0
explain_plan_size = 2336
# of Expressions = 4
Expression: inputExpr_ is NULL
Expression: outputExpr_
Expr Len: 416, Consts Len: 0
flags_ = 88
Clause #1: ex_function_clause
OperatorTypeEnum = ITM_UPPER(2265), NumOperands = 2
ex_clause::flags_ = 80
ex_function_clause::flags_ = 0
PCODE = supported
Operand #0 (result):
Datatype = REC_BYTE_F_ASCII(0), Length = 10, Null Flag = 0
Precision = 0, Scale = 1, Collation = 1, flags_ = 218
Tuple Data Format = SQLARK_EXPLODED_FORMAT
Atp = 0, AtpIndex = 1 (Temporary)
Offset = 0, NullIndOffset = -1, VClenIndOffset = -1
NullIndLength = 0, VClenIndLength = 0
ValueId = 5
Text = upper
Operand #1:
Datatype = REC_BYTE_F_ASCII(0), Length = 10, Null Flag = 0
Precision = 0, Scale = 1, Collation = 1, flags_ = 200
Tuple Data Format = SQLMX_ALIGNED_FORMAT
Atp = 0, AtpIndex = 2
Offset = 16, NullIndOffset = -1, VClenIndOffset = -1
RelOffset = 4, VoaOffset = -1, NullBitIdx = -1
NullIndLength = 0, VClenIndLength = 0
ValueId = 3
Text = TRAFODION.SCH.T1.NAME
Clause #2: ex_inout_clause
OperatorTypeEnum = ITM_NATYPE(2454), NumOperands = 1
ex_clause::flags_ = a0
ex_inout_clause::flags_ = 0
PCODE = not supported
Operand #0 (result):
Datatype = REC_BYTE_F_ASCII(0), Length = 10, Null Flag = 0
Precision = 0, Scale = 1, Collation = 1, flags_ = 218
Tuple Data Format = SQLARK_EXPLODED_FORMAT
Atp = 0, AtpIndex = 1 (Temporary)
Offset = 0, NullIndOffset = -1, VClenIndOffset = -1
NullIndLength = 0, VClenIndLength = 0
ValueId = 30
Text = CHAR(10) CHARACTER SET ISO88591
PCode:
PCode Expr Length: 96
[1]
GENFUNC_MATTR5_MATTR5_IBIN32S (288) 2 0 -1 10 0 4 16 -1 10 0 2265
OPDATA_MPTR32_IBIN32S (0) 2 0 10
CLAUSE_EVAL (108) -1599940160 32662 0
Expression: pkeyExpr_ is NULL
Expression: predExpr_ is NULL
Contents of EX_TRAF_SELECT [1]:
-------------------------------
For ComTdb :
Class Version = 1, Class Size = 600
InitialQueueSizeDown = 4, InitialQueueSizeUp = 4
queueResizeLimit = 9, queueResizeFactor = 4
queueSizeDown = 2048, queueSizeUp = 2048, numBuffers = 4, bufferSize = 1048648
estimatedRowUsed = 100.000000, estimatedRowsAccessed = 0.000000, expressionMode = 0
Flag = 9
For ComTdbHbaseAccess :
accessType_ = SELECT_
accessDetail_ = EX_TRAF_SELECT
tableName_ = TRAFODION.SCH.T1
asciiTI_ = 4, convertTI_ = 2, rowIdTI_ = 3, returnedTI_ = 2
rowIdAsciiTI_ = 5, updateTI_ = 0, mergeInsertTI_ = 0
returnedFetchedTI_ = 0, returnedUpdatedTI_ = 0, mergeInsertRowIdTI_ = 0
hbaseTimestampTI_ = 0, hbaseVersionTI_ = 0
asciiRowLen_ = 28, convertRowLen_ = 28, rowIdLen_ = 10, outputRowLen_ = 28
updateRowLen_ = 0, returnFetchedRowLen_ = 0, returnUpdateedRowLen_ = 0
mergeInsertRowLen_ = 0, keyLen_ = 0
Flag = 4004
server_ = , zkPort_ =
listOfFetchedColNames_(numEntries = 1):
Entry #1: #1:1
listOfScanRows_(numEntries = 1):
Entry #1:
beginRowId_(incl) =
0:
endRowId_(incl) =
0:
colTS_=-1
# of Expressions = 18
Expression: Convert Expr
Expr Len: 456, Consts Len: 8
flags_ = 88
Clause #1: ex_function_clause
OperatorTypeEnum = ITM_HEADER(2375), NumOperands = 1
ex_clause::flags_ = 80
ex_function_clause::flags_ = 0
PCODE = supported
Operand #0 (result):
Datatype = REC_BYTE_F_ASCII(0), Length = 12, Null Flag = 0
Precision = 0, Scale = 1, Collation = 1, flags_ = 208
Tuple Data Format = SQLMX_ALIGNED_FORMAT
Atp = 1, AtpIndex = 2
Offset = 0, NullIndOffset = -1, VClenIndOffset = -1
RelOffset = 0, VoaOffset = -1, NullBitIdx = -1
NullIndLength = 0, VClenIndLength = 0
ValueId = 0
Text = Hdr
Clause #2: ex_conv_clause
OperatorTypeEnum = ITM_CAST(2452), NumOperands = 2
ex_clause::flags_ = 80
ex_conv_clause::flags_ = 0
PCODE = supported
instruction: CONV_ASCII_F_F(136), instrArrayIndex_: 285
Operand #0 (result):
Datatype = REC_BYTE_F_ASCII(0), Length = 10, Null Flag = 0
Precision = 0, Scale = 1, Collation = 1, flags_ = 208
Tuple Data Format = SQLMX_ALIGNED_FORMAT
Atp = 1, AtpIndex = 2
Offset = 16, NullIndOffset = -1, VClenIndOffset = -1
RelOffset = 4, VoaOffset = -1, NullBitIdx = -1
NullIndLength = 0, VClenIndLength = 0
ValueId = 24
Text = cast
Operand #1:
Datatype = REC_BYTE_F_ASCII(0), Length = 10, Null Flag = 0
Precision = 0, Scale = 1, Collation = 1, flags_ = 208
Tuple Data Format = SQLMX_ALIGNED_FORMAT
Atp = 1, AtpIndex = 4
Offset = 16, NullIndOffset = -1, VClenIndOffset = -1
RelOffset = 4, VoaOffset = -1, NullBitIdx = -1
NullIndLength = 0, VClenIndLength = 0
ValueId = 23
Text = CHAR(10) CHARACTER SET ISO88591
Clause #3: ex_conv_clause
OperatorTypeEnum = ITM_CAST(2452), NumOperands = 2
ex_clause::flags_ = 86
ex_conv_clause::flags_ = 0
PCODE = supported
instruction: CONV_BIN32S_BIN32S(29), instrArrayIndex_: 76
Operand #0 (result):
Datatype = REC_BIN32_SIGNED(132), Length = 4, Null Flag = 1
Precision = 0, Scale = 0, Collation = 1, flags_ = 209
Tuple Data Format = SQLMX_ALIGNED_FORMAT
Atp = 1, AtpIndex = 2
Offset = 12, NullIndOffset = 8, VClenIndOffset = -1
RelOffset = 0, VoaOffset = -1, NullBitIdx = 0
NullIndLength = 0, VClenIndLength = 0
ValueId = 26
Text = cast
Operand #1:
Datatype = REC_BIN32_SIGNED(132), Length = 4, Null Flag = 1
Precision = 0, Scale = 0, Collation = 1, flags_ = 209
Tuple Data Format = SQLMX_ALIGNED_FORMAT
Atp = 1, AtpIndex = 4
Offset = 12, NullIndOffset = 8, VClenIndOffset = -1
RelOffset = 0, VoaOffset = -1, NullBitIdx = 0
NullIndLength = 0, VClenIndLength = 0
ValueId = 25
Text = INTEGER SIGNED
PCode:
PCode Expr Length: 160
[1]
HDR_MPTR32_IBIN32S_IBIN32S_IBIN32S_IBIN32S_IBIN32S (303) 4 0 12 32780 4 8 4
MOVE_MBIN8_MBIN8_IBIN32S (4) 4 16 5 16 10
NOT_NULL_BRANCH_MBIN32S_MBIN32S_IATTR3_IBIN32S (248) 4 8 5 8 134219270 0 0 13 (Tgt: 3)
[2]
MOVE_MBIN32S_IBIN32S (3) 4 12 0
RETURN (264)
[3] (Preds: 1 )
MOVE_MBIN32U_MBIN32U (202) 4 12 5 12
RETURN (264)
Expression: ScanExpr is NULL
Expression: RowIdExpr
Expr Len: 432, Consts Len: 8
flags_ = 88
Clause #1: ex_conv_clause
OperatorTypeEnum = ITM_CAST(2452), NumOperands = 2
ex_clause::flags_ = a0
ex_conv_clause::flags_ = 10
PCODE = not supported
instruction: CONV_ASCII_V_F(138), instrArrayIndex_: 359
Operand #0 (result):
Datatype = REC_BYTE_F_ASCII(0), Length = 10, Null Flag = 0
Precision = 0, Scale = 1, Collation = 1, flags_ = 208
Tuple Data Format = SQLARK_EXPLODED_FORMAT
Atp = 0, AtpIndex = 1 (Temporary)
Offset = 0, NullIndOffset = -1, VClenIndOffset = -1
NullIndLength = 0, VClenIndLength = 0
ValueId = 28
Text = cast
Operand #1:
Datatype = REC_BYTE_V_ASCII(64), Length = 8, Null Flag = 0
Precision = 0, Scale = 1, Collation = 1, flags_ = 208
Tuple Data Format = SQLARK_EXPLODED_FORMAT
Atp = 1, AtpIndex = 5
Offset = 2, NullIndOffset = -1, VClenIndOffset = 0
NullIndLength = 0, VClenIndLength = 2
ValueId = 27
Text = VARCHAR(8) CHARACTER SET ISO88591
Clause #2: ex_function_clause
OperatorTypeEnum = ITM_COMP_ENCODE(2114), NumOperands = 2
ex_clause::flags_ = 80
ex_function_clause::flags_ = 0
PCODE = supported
Operand #0 (result):
Datatype = REC_BYTE_F_ASCII(0), Length = 10, Null Flag = 0
Precision = 0, Scale = 1, Collation = 1, flags_ = 208
Tuple Data Format = SQLARK_EXPLODED_FORMAT
Atp = 1, AtpIndex = 3
Offset = 0, NullIndOffset = -1, VClenIndOffset = -1
NullIndLength = 0, VClenIndLength = 0
ValueId = 29
Text = comp_encode
Operand #1:
Datatype = REC_BYTE_F_ASCII(0), Length = 10, Null Flag = 0
Precision = 0, Scale = 1, Collation = 1, flags_ = 208
Tuple Data Format = SQLARK_EXPLODED_FORMAT
Atp = 0, AtpIndex = 1 (Temporary)
Offset = 0, NullIndOffset = -1, VClenIndOffset = -1
NullIndLength = 0, VClenIndLength = 0
ValueId = 28
Text = cast
PCode:
PCode Expr Length: 136
[1]
OPDATA_MPTR32_IBIN32S (0) 2 -1 5
OPDATA_MPTR32_IBIN32S (0) 5 0 6
OPDATA_MPTR32_IBIN32S (0) 2 0 10
OPDATA_MPTR32_IBIN32S (0) 5 2 11
CLAUSE_EVAL (108) -1599944080 32662 0
ENCODE_NXX (93) 4 0 2 0 10 0
RETURN (264)
Expression: UpdateExpr is NULL
Expression: MergeInsertExpr is NULL
Expression: LowKeyExpr is NULL
Expression: HighKeyExpr is NULL
Expression: ReturnFetchExpr is NULL
Expression: ReturnUpdateExpr is NULL
Expression: ReturnMergeInsertExpr is NULL
Expression: mergeUpdScanExpr is NULL
Expression: mergeInsertRowIdExpr is NULL
Expression: encodedKeyExpr is NULL
Expression: keyColValExpr is NULL
Expression: hbaseFilterExpr is NULL
Expression: preCondExpr is NULL
Expression: insConstraintExpr is NULL
Expression: updConstraintExpr is NULL

--- SQL operation complete.

If we log this example to a text file, we can search it to understand the plan's structure. Searching on "Contents of", we can find all the TDBs in the plan. In this example, there are two: the root node (EX_ROOT) and a scan node (EX_TRAF_SELECT). If we search for UPPER, we find it in the outputExpr_ of the root node. So, in this example, the plan fetches raw data from the scan node, and that data flows up to the root node, where the UPPER function is applied. Some other things to notice: In the scan node, the LowKeyExpr and the HighKeyExpr are both null (that is, there is no LowKeyExpr or HighKeyExpr), which means that the scan node will do a full table scan. Also, the ScanExpr is null, which means that no filtering will be applied at the scan node. (If instead we had done "select upper(name) from t1 where value = 4", then the predicate "value = 4" would appear as a ScanExpr in the scan node.)
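
For example, once the SHOWPLAN output is captured in a file (say showplan.txt; the file name is arbitrary), a few greps recover the structure quickly:

grep -n 'Contents of' showplan.txt   # list every TDB in the plan
grep -n 'ITM_UPPER' showplan.txt     # locate the UPPER function
grep -n 'is NULL' showplan.txt       # see which expressions are absent from each TDB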

When using SHOWPLAN, it is often convenient to also use EXPLAIN OPTIONS 'F' to have a high level view of the plan handy. In our example, we can do:

>>explain options 'f' select upper(name) from t1;
LC   RC   OP   OPERATOR             OPT      DESCRIPTION          CARD
---- ---- ---- -------------------- -------- -------------------- ---------
1    .    2    root                                               1.00E+002
.    .    1    trafodion_scan                T1                   1.00E+002
--- SQL operation complete.
>>

This shows quite clearly that the scan node is the only child of the root; that is, data from the scan node will flow up to the root node.
