...
- Salesforce.com
- We have multiple 20-node clusters in production, plus 10-node and 20-node development clusters
- Hadoop (native Java MapReduce) is used for Search and Recommendations
- We are using Apache Pig for log processing and Search, and to generate usage reports for several products and features at SFDC
- Pig makes it easy to develop custom UDFs. We developed our own library containing UDFs and loaders and are actively contributing back to the community
- The goal is to allow Hadoop/Pig to be used across Data Warehouse, Analytics and other teams, making it easier for folks outside engineering to use data
...