Objective
The objective of this document is to provide development guidelines for contributors to the Pig project.
Error Handling
There are several types of errors in Pig:
User Errors. These include invalid syntax, working with non-existent data, referring to non-existent relationships, etc. The desired behavior is to show a meaningful error message to the user and abort the processing. (In the interactive shell, aborting the processing does not mean exiting the shell.)
Internal Errors. These are internal problems with Pig code that would be handled with asserts in languages like C/C++. An example would be an unchecked NULL pointer. The desired behavior is to notify the user of the failure and to log the stack trace to the client side log file.
Frontend Errors. These are errors that occur on the Pig client. Examples are a failure to connect to HOD or, once we add support for it, a failure to access the metadata repository. The proper behavior in this case is to retry a few times and then handle the error the same way as an Internal Error.
Backend Errors. These are errors that happen on the backend during program execution. An example would be a failure to read a DFS file. The desired behavior in this case is to propagate the error from the backend to the frontend and then handle it the same way as an Internal Error.
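The "retry a few times" behavior described for Frontend Errors could be sketched as below. This is only an illustration: `RetrySketch` and `withRetries` are hypothetical names, not existing Pig code, and the number of attempts is left to the caller because the text above does not fix one.

```java
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of Pig. */
final class RetrySketch {
    private RetrySketch() {}

    /**
     * Runs the action up to maxAttempts times and returns its first
     * successful result. If every attempt fails, the last failure is
     * rethrown wrapped in a RuntimeException, so the caller can handle
     * it the same way as an Internal Error.
     */
    static <T> T withRetries(int maxAttempts, Supplier<T> action) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw new RuntimeException(
                "giving up after " + maxAttempts + " attempts", last);
    }
}
```

A caller would wrap the client-side operation, for example `RetrySketch.withRetries(3, () -> connect())`, where `connect` stands for whatever frontend call may fail.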
It is helpful to be able to distinguish the different types of errors in our code. Here is a proposal for how to handle each of them:
For User Errors, throw a ParseException that contains a meaningful message. For batch and interactive processing, catch the exception in Grunt and log the exception message to stderr. In the embedded case, the developer can do the same. In debug mode, also log the exception stack trace to the client side log.
For Internal Errors, throw RuntimeException or one of its subclasses. Catch the exception in main and log it, including the stack trace, to the client side log. Write a failure message to stderr pointing to the log file.
For Frontend Errors, throw a newly created FrontendException that is a subclass of RuntimeException. Catch the exception in main and log it, including the stack trace, to the client side log. Write a failure message to stderr pointing to the log file. In the future, we might find it useful to further subdivide these errors, in which case we might subclass the FrontendException class.
For Backend Errors, there will need to be a backend-specific way to get the error from the backend to the frontend. Once this is done, log the error to the client side log file and throw ExecuteException. Catch the exception in main and log it, including the stack trace, to the client side log. Write a failure message to stderr pointing to the log file.
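The proposal above could look roughly like the following sketch. It uses stand-in types, not the real Pig classes: the nested `ParseException` only imitates the parser's exception, the superclass of `ExecuteException` is an assumption (the proposal does not name one), and `Job`, `runAndReport`, and the numeric return codes are hypothetical names and values chosen for illustration.

```java
import java.io.PrintStream;
import java.io.PrintWriter;

/** Sketch only: stand-in types, not the real Pig classes. */
final class ErrorHandlingSketch {
    private ErrorHandlingSketch() {}

    /** Stand-in for the parser's ParseException (User Errors). */
    static class ParseException extends Exception {
        ParseException(String message) { super(message); }
    }

    /** Frontend Errors: a RuntimeException subclass, as proposed. */
    static class FrontendException extends RuntimeException {
        FrontendException(String message) { super(message); }
    }

    /** Backend Errors: extending RuntimeException is an assumption. */
    static class ExecuteException extends RuntimeException {
        ExecuteException(String message) { super(message); }
    }

    /** Hypothetical unit of work, e.g. running one script. */
    interface Job {
        void run() throws ParseException;
    }

    /**
     * Runs the job and reports failures as proposed. The return codes
     * (0 success, 1 user error, 2 any other failure) are illustrative.
     */
    static int runAndReport(Job job, PrintStream err,
                            PrintWriter clientLog, String logPath) {
        try {
            job.run();
            return 0;
        } catch (ParseException e) {
            // User Error: show only the meaningful message.
            err.println(e.getMessage());
            return 1;
        } catch (RuntimeException e) {
            // Internal, Frontend, and Backend Errors: full stack trace
            // goes to the client side log, a pointer goes to stderr.
            e.printStackTrace(clientLog);
            clientLog.flush();
            err.println("Failure: " + e.getMessage()
                    + ". See " + logPath + " for details.");
            return 2;
        }
    }
}
```

Because FrontendException and ExecuteException are both RuntimeException subclasses in this sketch, a single catch clause covers all three non-user error types, which matches the identical handling the proposal describes for them.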