Is Pig the right platform for your scenario?

Pig is right for you if you:

  • Need to analyze data that is small (kilobytes), tall (megabytes), grande (gigabytes), or venti (terabytes).
  • Want to be able to create, modify and reuse your analysis logic easily.
  • Process one data set at a time.
  • ... or need to combine multiple data sets.
  • Do simple processing (e.g., count the number of images on the web).
  • ... or complex processing (e.g., count the number of images that contain faces).

Pig is not right for you if you:

  • Need to retrieve individual records, or small ranges of records, from a very large data set (e.g., look up Joe Smith's customer profile).
  • Have real-time data serving requirements (e.g., assemble a web page for Joe in under 100 ms).
  • Need to perform random writes to specific data records.
  • Don't like barnyard animals.

