中文论坛:http://bbs.hadoopor.com
中文社区:http://www.hadoopor.com

Apache Hadoop is a Java software framework that supports data intensive distributed applications under a free license. It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google's MapReduce and Google File System (GFS) papers.

Hadoop is a top level Apache project, being built and used by a community of contributors from all over the world. Yahoo! has been the largest contributor to the project and uses Hadoop extensively in its web search and advertising businesses. IBM and Google have announced a major initiative to use Hadoop to support university courses in distributed computer programming.

Hadoop was created by Doug Cutting (now a Cloudera employee), who named it after his child's stuffed elephant. It was originally developed to support distribution for the Nutch search engine project.

Hadoop是基于shared-nothing架构的海量数据存储和计算的分布式系统,它由若干个成员组成,主要包括:HDFS、MapReduce、Hive、HBase、Pig和ZooKeeper,其中HDFS是Google的GFS开源实现,而ZooKeeper是Google的Chubby开源版本,而HBase是Google的BigTable开源版本。HDFS具体高容错性和强线性扩展特点。

Hadoop由Doug Cutting在2004年开始开发,2008年开始流行于中国,2009年在中国已经火红,包括中国移动、百度、网易、淘宝、腾讯、金山和华为等众多公司都在研究和使用它,另外还有中科院、暨南大学、浙江大学、曲阜师范等众多高校在研究它。

  • No labels