Chukwa is a Hadoop subproject devoted to large-scale log collection and analysis. Chukwa is built on top of the Hadoop distributed filesystem (HDFS) and MapReduce framework and inherits Hadoop’s scalability and robustness. Chukwa also includes a flexible and powerful toolkit for displaying, monitoring, and analyzing results, to make the best use of the collected data.
Documentation
FAQ - In progress...
Chukwa_Quick_Start - Getting Chukwa running on your cluster.
Chukwa_Console_Integration_Guide - A guide to integrating with the Chukwa UI console.
Sending_information_to_Chukwa - A tutorial that walks through sending a log file to Chukwa and shows how Chukwa parses records from the datasink file.
Chukwa_Configuration - A short description of what each configuration file in Chukwa's conf directory is used for.
Chukwa_Adaptors_List - A description of the prebuilt adaptors that you can turn on to collect data from the nodes in your cluster.
Chukwa_Startup_and_Shutdown_Scripts - A description of the scripts found in Chukwa's bin directory which are used to start and stop various parts of the Chukwa framework.
Chukwa Readme.txt (part of the distribution - WARNING: this text file may be out of date).
Chukwa_Test_Plan - A description of automated test cases and detailed test plans for release quality control.
Anomaly_Detection_Framework_with_Chukwa - A description of Anomaly Detection Framework design for Chukwa 0.2.
Presentations
ChukwaPoster.pdf - Chukwa Poster
chukwa_presentation.pdf - An overview of the Chukwa Monitoring System
chukwa_presentation_cca08.pdf - A talk about Chukwa presented by Berkeley graduate students at Cloud Computing and its Applications 2008 (http://cca08.org), October 2008.
Download
Chukwa is part of the Hadoop distribution. You can view the source in the Apache Hadoop SVN repository.
Papers
chukwa_cca08.pdf - Cloud Computing and its Applications (CCA) 2008
Links
JIRA HADOOP-3719 - The original Apache JIRA ticket for contributing Chukwa to Hadoop as a contrib project.
JIRA HADOOP-4709 - A batch update to the JIRA in Hadoop/src/contrib. After this update the Chukwa team will be fully embracing the Apache JIRA development model, as suggested in the comments on this JIRA.