AS OF PIG 0.2 GROUP FUNCTIONS HAVE BEEN REMOVED FROM THE LANGUAGE. THE FOLLOWING APPLIES ONLY TO PIG 0.1.

Group Functions

A Group Function assigns each tuple to one or more groups. A single tuple can belong to several groups at once if desired.

To create your own Group Function, create a Java class that extends the following abstract class:

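import java.io.IOException;

import org.apache.pig.data.Datum;
import org.apache.pig.data.Tuple;

public abstract class GroupFunc {

    /**
     * Assigns a tuple to one or more groups.
     *
     * @param input the tuple to be processed.
     * @return the different groups that the specified tuple belongs to.
     * @throws IOException
     */
    public abstract Datum[] exec(Tuple input) throws IOException;

}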

Example

Our built-in GFAll() function puts all tuples into a single group labeled "all". The code is:

public class GFAll extends GroupFunc {

    // Every tuple maps to the same single group, labeled "all".
    @Override
    public Datum[] exec(Tuple input) {
        return new Datum[] { new DataAtom("all") };
    }
}
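
GFAll always returns exactly one group. To illustrate the multi-group case mentioned at the top of this page, here is a minimal sketch that runs without Pig. Everything in it is an assumption made for illustration: the nested GroupFunc is a simplified stand-in (not the Pig 0.1 class), a tuple is modeled as a List<Object>, group labels are plain strings, and ParityAndSize, tuple() and group() are hypothetical names. It shows a function that places a number in "even" or "odd" and additionally in "large" when it is at least 100, plus a small driver that collects tuples per group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MultiGroupSketch {

    // Simplified stand-in for GroupFunc (assumption): labels are Strings,
    // tuples are List<Object>, so the sketch compiles without Pig.
    abstract static class GroupFunc {
        abstract String[] exec(List<Object> input);
    }

    // Hypothetical group function: one parity group for every tuple,
    // plus the "large" group when the first field is >= 100.
    static class ParityAndSize extends GroupFunc {
        @Override
        String[] exec(List<Object> input) {
            int n = (Integer) input.get(0);
            String parity = (n % 2 == 0) ? "even" : "odd";
            return (n >= 100)
                    ? new String[] { parity, "large" }
                    : new String[] { parity };
        }
    }

    // Helper that builds a tuple from its fields.
    static List<Object> tuple(Object... fields) {
        return Arrays.asList(fields);
    }

    // Driver: adds each tuple to every group the function returns for it.
    static Map<String, List<List<Object>>> group(GroupFunc f, List<List<Object>> tuples) {
        Map<String, List<List<Object>>> groups = new LinkedHashMap<>();
        for (List<Object> t : tuples) {
            for (String label : f.exec(t)) {
                groups.computeIfAbsent(label, k -> new ArrayList<>()).add(t);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<List<Object>> tuples = Arrays.asList(tuple(3), tuple(8), tuple(100));
        // prints {odd=[[3]], even=[[8], [100]], large=[[100]]}
        System.out.println(group(new ParityAndSize(), tuples));
    }
}
```

The tuple (100) appears in both "even" and "large", which is the behavior the definition of a Group Function allows.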

Advanced Features

  • As in EvalFunction, a Group Function can define the schema of its output by overriding the outputSchema method.

  • As in EvalFunction, a Group Function can perform final cleanup by overriding the finish method.
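
The cleanup hook can be pictured with the following sketch, which again runs without Pig. It rests on assumptions: the nested GroupFunc is a simplified stand-in, the no-argument, no-result shape of finish() is a guess because this page does not give its signature, and CountingGroup and run() are hypothetical names. outputSchema is left out for the same reason.

```java
import java.util.Arrays;
import java.util.List;

final class FinishHookSketch {

    // Simplified stand-in for GroupFunc (assumption). The finish() shape
    // below (no arguments, no result, no-op by default) is assumed.
    abstract static class GroupFunc {
        abstract String[] exec(List<Object> input);

        void finish() { }
    }

    // Hypothetical function: groups by the first field, counts the tuples
    // it has seen, and records that cleanup ran.
    static class CountingGroup extends GroupFunc {
        int seen = 0;
        boolean finished = false;

        @Override
        String[] exec(List<Object> input) {
            seen++;
            return new String[] { String.valueOf(input.get(0)) };
        }

        @Override
        void finish() {
            // A real function might release resources here.
            finished = true;
        }
    }

    // Driver: calls exec once per tuple, then finish exactly once.
    static void run(GroupFunc f, List<List<Object>> tuples) {
        for (List<Object> t : tuples) {
            f.exec(t);
        }
        f.finish();
    }

    public static void main(String[] args) {
        CountingGroup f = new CountingGroup();
        run(f, Arrays.asList(Arrays.<Object>asList("a"), Arrays.<Object>asList("b")));
        // prints seen=2 finished=true
        System.out.println("seen=" + f.seen + " finished=" + f.finished);
    }
}
```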

GroupFunction (last edited 2009-11-24 23:19:23 by AlanGates)