HDFS Futures

Below is a categorized list of future HDFS features, with descriptions.

Goal: HDFS for Production Use

  1. Reliable and Secure: The file system is solid enough for users to feel comfortable using it in "production"
    • Availability and integrity of HDFS are good enough
      • Availability of the NN and integrity of NN data
      • Availability of the file data and its integrity
    • Secure
      • Access control - in 0.16
      • Secure authentication - 0.19
  2. Good Enough Performance: HDFS should not limit the scaling of the Grid and the utilization of the nodes in the Grid
    • Handle a large number of files
    • Handle a large number of clients
    • Low latency of HDFS operations - this affects the utilization of the client nodes
    • High throughput of HDFS operations
  3. Rich Enough FS Features for applications
    • e.g. append
    • e.g. good performance for random IO
  4. Sufficient Operations and Management features to manage a large 4K cluster
    • Easy to configure, upgrade, etc.
    • BCP, snapshots, backups

Service Scaling

This means scaling the Name Service (aka Namenode) and the number of Datanodes that can be present in an HDFS system.

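For scaling the Name Service (Namenode), there are two main issues: the throughput and response time of the name service, and the size of the namespace (i.e. the number of files and directories). Improving one may improve the other.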

Summary of the various options for scaling the namespace and its performance (details below)

Scaling Name Service Throughput and Response Time

Scaling Namespace (i.e. number of files/dirs)

Since the Namenode stores block and name objects in memory, the size of the namespace (and hence the number of files) is limited by the amount of heap memory. Currently a 14GB heap (i.e. a 16GB machine) allows 60 million block and name objects. A file with 2 blocks accounts for 3 objects (one name object plus two block objects), so at 2 blocks per file one is limited to 20 million files. This is a significant restriction for large clusters. Besides adding more memory, several options are listed below.
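The arithmetic above can be sketched as a small calculation. This is only a rough illustration, not how the Namenode accounts for memory: it assumes every file costs one name object plus one object per block, and it ignores directories and any other heap overhead. The function names (max_files, avg_bytes_per_object) are hypothetical and exist only for this example.

```python
def max_files(object_capacity: int, blocks_per_file: int) -> int:
    """Estimate how many files fit, given the total number of name and
    block objects the heap can hold.

    Assumption: each file uses 1 name object + blocks_per_file block
    objects; directories and other overhead are ignored.
    """
    objects_per_file = 1 + blocks_per_file
    return object_capacity // objects_per_file


def avg_bytes_per_object(heap_bytes: int, object_capacity: int) -> float:
    """Average heap cost per object implied by a heap size and capacity."""
    return heap_bytes / object_capacity


if __name__ == "__main__":
    capacity = 60_000_000          # objects in a 14GB heap, per the text
    heap = 14 * 1024 ** 3          # 14GB expressed in bytes

    print(max_files(capacity, blocks_per_file=2))
    print(round(avg_bytes_per_object(heap, capacity)))
```

With the figures quoted in the text, the 14GB heap works out to roughly 250 bytes per object on average, which is why reducing the number of objects per file (fewer, larger blocks) raises the file limit.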

Partition/distribute Name node (will also help performance)

Name Service Availability (includes integrity of NN data, HA, etc.)

Integrity of NN Image and Journal

Faster Startup

Restart and Failover

Security: Authorization and ACLs

File Features

File IO Performance

Namespace Features

File Data Integrity (For NN see NN data integrity above)

Operations and Management Features

Hadoop Protocol RPC

RPC Timeouts, Connection handling, Queue handling, threading

Client-side recovery from NN restarts and failovers

Versioning

Multiple Language Support

Benchmarks and Performance Measurements

Diagnosability

Development Support

Intercluster Features

BCP support

HdfsFutures (last edited 2010-03-19 23:44:31 by SanjayRadia)