Pig Wiki
Pig is a platform for analyzing large data sets. Pig's language, Pig Latin, lets you specify a sequence of data transformations such as merging data sets, filtering them, and applying functions to records or groups of records. Pig comes with many built-in functions but you can also create your own user-defined functions to do special-purpose processing.
Pig Latin programs run in a distributed fashion on a cluster (programs are compiled into Map/Reduce jobs and executed using Hadoop). For quick prototyping, Pig Latin programs can also run in "local mode" without a cluster (all processing takes place in a single local JVM).
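As a rough illustration of the "sequence of data transformations" described above, here is a minimal Pig Latin sketch. The input path, output path, relation names, and field names are all hypothetical, and the typed schema assumes a Pig release that supports declared types; treat it as one possible shape for a script, not a canonical example.

```pig
-- Hypothetical tab-separated input with one page visit per line.
visits = LOAD 'input/visits.txt' USING PigStorage('\t')
         AS (user:chararray, url:chararray, time:long);

-- Filter: keep only records that have a positive timestamp.
valid = FILTER visits BY time > 0L;

-- Group the remaining records by user.
by_user = GROUP valid BY user;

-- Apply a built-in function (COUNT) to each group of records.
counts = FOREACH by_user GENERATE group AS user, COUNT(valid) AS num_visits;

-- Write the result; the output location is an assumption.
STORE counts INTO 'output/visit_counts';
```

Saved as a hypothetical file visit_counts.pig, a script like this could be tried without a cluster via `pig -x local visit_counts.pig`, in which case the paths refer to the local file system. Run against a cluster, the same script would be compiled into Map/Reduce jobs.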
Do you Pig? At Yahoo! 30% of all Hadoop jobs are run with Pig. Come join us!
News
Why Pig Latin instead of SQL?
Pig Latin: A Not-So-Foreign Language ...
Pig Has Grown Up! On 10/22/08 Pig graduated from the Incubator and joined Hadoop as a subproject.
Pig is Getting Faster! 2-6 times faster, for many queries. We've created a set of benchmarks and run them against the Pig 0.1.0 release (modified to run on Hadoop 0.18) and against the current trunk (previously the types branch). Joins and order bys in particular show large performance gains. For complete details see PigMix.
Interested in Pig Guts? We are completely redesigning the Pig execution and optimization framework. For design details see PigOptimizationWishList and PigExecutionModel.
General Information
PigTalksPapers - Pig talks, papers, interviews
User Documentation
Check it out ... updates and new additions.
New to Pig? Getting Started ...
PigOverview - An overview of Pig's capabilities
Pig Quick Start - How to build and run Pig
Pig Tutorial - Tackle a real task with Pig, start to finish
Online Pig Training - Complete with video lectures, exercises, and a pre-configured virtual machine. Developed by Cloudera and Yahoo!
Pig Language
Pig Latin Reference Manual - Includes Pig Latin, built-in functions, and shell commands
Pig Functions
PiggyBank - User-defined functions (UDFs) contributed by Pig users!
UDF Manual - Write your own UDFs
Pig Latin Editors
PigPen - A plugin for Eclipse that provides syntax highlighting, graphical script construction, example result generation, and schema descriptions, and lets you run your Pig scripts locally and on a Hadoop cluster.
A TextMate bundle for Pig Latin - http://github.com/kevinweil/pig.tmbundle/tree/master
A Vim plugin for Pig Latin - http://www.vim.org/scripts/script.php?script_id=2186
More Pig
Pig Cookbook - Want Pig to fly? Tips and tricks on how to write efficient Pig scripts
Javadocs - Refer to the Javadocs for embedded Pig and UDFs
FAQ - The answer to your question may be here
Developer Documentation
How tos
Road map
Specification Proposals
Performance
PigPerformance (current performance numbers)