This document captures the requirements for the Error Handling feature in Pig.
Robustness is an implicit requirement of software. Users rightfully expect software to report the errors that occur during its use clearly and understandably. Errors can come from multiple sources. Some of the common ones are:
- Environment issues: file not found, out of disk space, etc.
- Bugs in the software: null pointer exceptions, core dumps, out of bound access, etc.
- User/Programmer error: Syntax errors, divide by zero, incorrect use of casts, etc.
Users rely on error messages to identify the source of an error and to guide the corrective course of action. While most errors cannot be handled in the system, at the least they should be reported in a reliable and readable manner.
Using the approach mentioned in Mika, Pig can be divided into three components for the purpose of error handling. A schematic view of the system is illustrated via the diagram.
- The user interface: the Grunt shell, command line execution of a script, or Pig used via the Java APIs
- Pig itself
- The backend execution framework, i.e., Hadoop
Grunt is an interactive shell that allows users to submit Pig commands. The command line offers a mechanism for batch mode execution via scripts. The Java APIs provide a programmatic mechanism of accessing Pig. Irrespective of the mechanism, the control and data flow through Pig which in turn uses Hadoop as the execution framework. Errors could occur within each system and across systems.
Early error detection
Errors that occur in each system should be caught as early as possible. Pig relies on Hadoop for run time execution. Detecting and reporting errors early will improve turnaround time by avoiding invoking Hadoop until most errors are fixed. A few examples that demonstrate this behavior are:
- Syntax errors. E.g.: Missing ';'
- Semantic errors. E.g.: Mismatch in cogroup arity, type mismatch when trying to add a string to an integer
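As an illustration of early detection, the cogroup arity check can be sketched as a front-end validation that runs before any work is handed to Hadoop. This is a minimal sketch under stated assumptions: the class `CogroupArityCheck`, the exception `SemanticException`, and the method `validate` are hypothetical names introduced here, not part of Pig's API.

```java
import java.util.List;

// Hypothetical front-end check; all names here are illustrative, not Pig's API.
final class CogroupArityCheck {

    /** Raised on the client side, before any job is submitted to Hadoop. */
    static final class SemanticException extends Exception {
        SemanticException(String message) {
            super(message);
        }
    }

    /**
     * Verifies that every input of a cogroup uses the same number of key columns.
     * Each inner list holds the key columns of one cogroup input.
     */
    static void validate(List<List<String>> keysPerInput) throws SemanticException {
        if (keysPerInput.isEmpty()) {
            return; // nothing to compare
        }
        int expected = keysPerInput.get(0).size();
        for (int i = 1; i < keysPerInput.size(); i++) {
            int actual = keysPerInput.get(i).size();
            if (actual != expected) {
                throw new SemanticException(
                    "The arity of the group by columns do not match: input 0 has "
                    + expected + " key column(s), input " + i + " has " + actual);
            }
        }
    }
}
```

For a statement such as `c = cogroup a by (x, y), b by m;`, the key lists would be `[x, y]` and `[m]`, so the check fails on the client without starting a Hadoop job.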
Readable Error Messages
Error messages should be readable by users, not just by developers. Stack traces are a good debugging aid but do not mean much to the user. Readable and simple error messages will be presented on STDERR.
Error codes will be devised for all error messages. The error codes will be classified into categories to enable further action based on the type of error.
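One way to classify error codes into categories is to reserve a numeric range per category. The sketch below is only an assumption for illustration: the category names are derived from the error sources listed earlier, and the ranges, the `ErrorCategory` enum, and the `isRetryable` helper are hypothetical, since this document does not define them.

```java
// Hypothetical categories and code ranges; this document does not define either.
enum ErrorCategory {
    USER_ERROR(1000, 1999),    // syntax errors, divide by zero, incorrect casts
    SOFTWARE_BUG(2000, 2999),  // null pointer exceptions, out of bound access
    ENVIRONMENT(3000, 3999),   // file not found, out of disk space
    UNKNOWN(-1, -1);           // any code outside the ranges above

    private final int low;
    private final int high;

    ErrorCategory(int low, int high) {
        this.low = low;
        this.high = high;
    }

    /** Maps an error code to its category so callers can branch on the type of error. */
    static ErrorCategory fromCode(int code) {
        for (ErrorCategory category : values()) {
            if (category != UNKNOWN && code >= category.low && code <= category.high) {
                return category;
            }
        }
        return UNKNOWN;
    }

    /** In this sketch, only environment problems are treated as worth retrying. */
    boolean isRetryable() {
        return this == ENVIRONMENT;
    }
}
```

A caller could then decide on further action with `ErrorCategory.fromCode(code).isRetryable()` instead of parsing the message text.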
Logs with error message details
Exceptions will not be swallowed, enabling detection of the root cause of the issue. Detailed information such as the stack trace will be logged into client side logs. By default, the logs will be stored in the current working directory or the user's home directory; the location of the log file will be configurable. Users can send logs that contain the details of the error, such as the stack trace, to assist developers in resolving issues.
E.g.: For a semantic error like a mismatch in cogroup arity, the STDERR messages and the log messages will look like the following. Note that this example only illustrates a possible use case.
//STDERR
grunt> a = load 'input1' as (x, y, z);
grunt> b = load 'input2' as (m, n, o);
grunt> c = cogroup a by (x, y), b by m;
ERROR org.apache.pig.tools.grunt.GruntParser - [Error 42] The arity of the group by columns do not match.

//LOG
ERROR org.apache.pig.tools.grunt.GruntParser - [Error 42] The arity of the group by columns do not match.
Stack Trace:
    at org.apache.pig.PigServer.parseQuery(PigServer.java:298)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:263)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:439)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:249)
    at org.apache.pig.tools.grunt.GruntParser.parseContOnError(GruntParser.java:94)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:58)
    at org.apache.pig.Main.main(Main.java:282)
Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: The arity of the group by columns do not match.
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.parseCogroup(QueryParser.java:169)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.CogroupClause(QueryParser.java:1739)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:941)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:755)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:550)
    at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:60)
    at org.apache.pig.PigServer.parseQuery(PigServer.java:295)
    ... 6 more
- Users are responsible for purging error logs
- Users will be able to switch the detailed error messages on STDERR on or off.
- Since Pig depends on Hadoop for execution, Hadoop error messages will be reported by Pig. An error during execution due to a bug in Pig will be shown differently from that of an error in Hadoop itself. The error code and the error message will indicate that the error was due to Hadoop.
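The split between a short message on STDERR and a detailed entry in the client side log could look roughly like the following. This is a sketch under assumptions, not Pig's implementation: the `ErrorReporter` class, its constructor, and the `verbose` flag (standing in for the user switch for detailed messages on STDERR) are hypothetical names.

```java
import java.io.PrintStream;
import java.io.PrintWriter;
import java.io.Writer;

// Hypothetical reporter; the class name and the verbose flag are illustrative only.
final class ErrorReporter {
    private final PrintStream stderr;
    private final PrintWriter log;
    private final boolean verbose;

    ErrorReporter(PrintStream stderr, Writer log, boolean verbose) {
        this.stderr = stderr;
        this.log = new PrintWriter(log);
        this.verbose = verbose;
    }

    /** Prints a one-line summary for the user and the full stack trace for developers. */
    void report(int errorCode, Throwable error) {
        String summary = "ERROR [Error " + errorCode + "] " + error.getMessage();

        // STDERR gets the readable message; the stack trace only if the user asked for it.
        stderr.println(summary);
        if (verbose) {
            error.printStackTrace(stderr);
        }

        // The log always gets the full chain of causes, so nothing is swallowed.
        log.println(summary);
        error.printStackTrace(log);
        log.flush();
    }
}
```

Passing the original exception, with its cause attached, to `report` keeps the root cause visible in the log even when STDERR shows only the summary line.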
Warning message aggregation
With the introduction of types and NULLs into Pig, there are several scenarios where Pig warns the user about introduction of casts, divide by zero uses, etc. The warning messages are issued on each occurrence of the warning. While this message is useful, the increased frequency of the messages is annoying and distracts the user from possible error messages. In order to alleviate this problem, warning message aggregation will be supported to report the warning message and the number of occurrences of the warning message. The warning message and the frequency of each warning message will be presented on STDERR. The logs will also contain the same information. The location of the log file will be the same as that of the error log file.
According to the Pig types functional specification, divide by zero will result in NULL. Pig warns the user about these conversions.
E.g.: Warning: Divide by zero: 30 times
Similarly, Pig emits warning messages when implicit casts are introduced for type promotion (e.g., adding an integer and a double).
E.g.: Warning: Implicit cast to double: 12 times
- Users will be able to turn warning message aggregation on or off. Turning warning message aggregation off will result in one warning message per occurrence.
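Warning aggregation amounts to counting occurrences per distinct message and printing one summary line for each. The sketch below assumes a hypothetical `WarningAggregator` class with an `aggregate` flag standing in for the user switch; the output format follows the examples above.

```java
import java.io.PrintStream;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical aggregator; the class and method names are illustrative only.
final class WarningAggregator {
    // LinkedHashMap keeps warnings in the order they were first seen.
    private final Map<String, Integer> counts = new LinkedHashMap<>();
    private final PrintStream out;
    private final boolean aggregate;

    WarningAggregator(PrintStream out, boolean aggregate) {
        this.out = out;
        this.aggregate = aggregate;
    }

    /** Counts the warning when aggregation is on; otherwise prints it immediately. */
    void warn(String message) {
        if (aggregate) {
            counts.merge(message, 1, Integer::sum);
        } else {
            out.println("Warning: " + message);
        }
    }

    /** Prints one line per distinct warning with its frequency, then resets the counts. */
    void flush() {
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            out.println("Warning: " + entry.getKey() + ": " + entry.getValue() + " times");
        }
        counts.clear();
    }
}
```

Calling `warn("Divide by zero")` 30 times followed by `flush()` prints the single line `Warning: Divide by zero: 30 times`; with `aggregate` set to `false`, the same calls print 30 separate warning lines.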
Mika Raento, "What should Exceptions look like?" July 30, 2006, http://www.errorhandling.org/wordpress/
Bruce Eckel, "Thinking in Java", 3rd Edition Revision 4.0, November 20, 2002, http://www.faqs.org/docs/think_java/TIJ3_c.htm
Alan Gates, "Pig Types Functional Specification", May 19, 2008, http://wiki.apache.org/pig/PigTypesFunctionalSpec
The following comments have not been incorporated in the requirements but are listed for reference.
- Pig should indicate the expression that caused the warning and/or error message. E.g.: B = FILTER A BY (f1 > 10) and (f2 != 'asdf') and (f3 > 2L). If f1 requires a cast for the comparison, Pig should indicate that (f1 > 10) resulted in the warning.
- Pig should indicate the line number that resulted in the error, as general purpose programming languages like C++ and Java do.
- Pig should be more transparent. Retries in Hadoop should be communicated to the user in a programmatic fashion, to allow layers above Pig to take appropriate action. Pig should also log/share information about the processes, machines and logs that are spawned by Hadoop to enable users to take quick action.