Here are some notes on a talk given October 5th, 2007 by Utkarsh Srivastava on PNUTS, a Platform for Nimble Universal Table Storage.

From the seminar description:

The PNUTS project is to build a data management service for providing back-end support to Yahoo!'s web applications. To obtain acceptable latency and throughput while operating at Yahoo!'s scale, PNUTS uses massive parallelism and distribution---data is partitioned and replicated over thousands of servers. At the same time, PNUTS provides clean abstractions for data access that hide all this system complexity from the application programmer... In contrast to traditional database solutions, PNUTS is a centrally hosted and managed data service. Such a shared service model frees applications from the burden of having to set up, maintain and scale their own data store, and also amortizes the operational cost across all of Yahoo!'s applications. ...

Utkarsh, who also works on PIG, started with an outline of the hurdles a startup faces building applications at web scale (millions of users, a terabyte or more of data). You'll need a replicated, persistent datastore, caching, messaging between all systems to manage coherency, etc. You'll then throw away your first implementation. "If you have funding left, THEN you can start to work on application logic." PNUTS wants to make it so that deploying a web-scale application requires nought but "...three guys, a weekend, and some PHP." (This latter quote is apparently from Mr.

For a folksy intro on how most of the big web apps can make do with just basic insert, update, and delete, see the PNUTS home page. Utkarsh had his own spin using Flickr and for illustration (not that these apps currently run on PNUTS).
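
To make the "basic insert, update, and delete" point a bit more concrete, here is a minimal sketch of a photo-sharing app getting by with only those operations on keyed records. It is not the PNUTS API: the `RecordTable` class, its method names, the `photo:1` key format, and the `get` helper used to read a record back are all assumptions made up for illustration, and the table is just an in-memory dictionary.

```python
class RecordTable:
    """Hypothetical in-memory stand-in for a hosted table (not the PNUTS API)."""

    def __init__(self):
        self._rows = {}  # primary key -> record (a plain dict)

    def insert(self, key, record):
        # Refuse to overwrite an existing record; that is what update is for.
        if key in self._rows:
            raise KeyError(f"duplicate key: {key!r}")
        self._rows[key] = dict(record)

    def update(self, key, changes):
        # Merge the changed fields into the record; KeyError if the key is absent.
        self._rows[key].update(changes)

    def delete(self, key):
        # Remove the record; KeyError if the key is absent.
        del self._rows[key]

    def get(self, key):
        # Assumed read helper, returning a copy so callers cannot mutate the row.
        return dict(self._rows[key])


# A photo-sharing app expressed with nothing but the three operations.
photos = RecordTable()
photos.insert("photo:1", {"owner": "alice", "title": "Sunset", "tags": ["beach"]})
photos.update("photo:1", {"title": "Sunset at the beach"})
```

Uploading a photo is an insert, retitling or retagging it is an update, and taking it down is a delete. The partitioning, replication, and caching mentioned in the talk are the parts a hosted service would hide behind an interface of roughly this shape.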

PNUTS Nuggets:

Hbase/PNUTS (last edited 2009-09-20 23:54:11 by localhost)