Is Pig the right platform for your scenario?
Pig is right for you if you:
- Need to analyze data that is small (kilobytes), tall (megabytes), grande (gigabytes), or venti (terabytes).
- Want to be able to create, modify and reuse your analysis logic easily.
- Process one data set at a time
- ... or need to combine multiple data sets.
- Do simple processing (e.g., count the number of images on the web)
- ... or complex processing (e.g., count the number of images that contain faces).
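
As a rough illustration of the batch style of work described in the list above (scan whole data sets, combine them, then aggregate), here is a minimal Python sketch. It is not Pig Latin and uses no Pig API. The sample records, the field names, and the `count_images_per_site` function are all hypothetical, chosen only to show the shape of a "combine multiple data sets and count" job.

```python
from collections import Counter

# Hypothetical data sets (assumptions for illustration only):
# image records are (image_url, page_id); page records are (page_id, site).
images = [
    ("a.png", 1),
    ("b.png", 1),
    ("c.png", 2),
]
pages = [
    (1, "example.org"),
    (2, "example.com"),
]


def count_images_per_site(images, pages):
    """Join images to pages on page_id, then count images per site."""
    site_by_page = dict(pages)        # index the smaller data set for the join
    counts = Counter()
    for _url, page_id in images:      # full scan over every image record
        site = site_by_page.get(page_id)
        if site is not None:          # inner join: skip images with no matching page
            counts[site] += 1
    return dict(counts)


result = count_images_per_site(images, pages)
print(result)
```

Note that the job reads every record once and produces an aggregate. That access pattern is what the list above has in mind, as opposed to fetching or updating a single record.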
Pig is not right for you if you:
- Need to retrieve individual records, or small ranges of records, from a very large data set (e.g., look up Joe Smith's customer profile).
- Have real-time data serving requirements (e.g., assemble a web page for Joe in under 100 ms).
- Need to be able to do random writes to specific data records.
- Don't like barnyard animals.