Apache Tajo is a robust relational and distributed big data warehouse system for Apache Hadoop. Tajo is designed for low-latency, scalable ad-hoc queries, online aggregation, and ETL (extract, transform, load) on large data sets stored on HDFS (Hadoop Distributed File System) and other data sources. By supporting SQL standards and leveraging advanced database techniques, Tajo allows direct control of distributed execution and data flow across a variety of query evaluation strategies and optimization opportunities.

General Information

Developer Documentation

User Documentation

User documentation is located at http://tajo.apache.org/docs/current/index.html.

Others

Google Summer of Code 2013
