Apache Tajo (incubating)

Apache Tajo is a relational, distributed data warehouse system for Hadoop. Tajo is designed for low-latency, scalable ad-hoc queries, online aggregation, and ETL on large data sets by leveraging advanced database techniques. It supports SQL standards. Tajo uses HDFS as its primary storage layer and has its own query engine, which allows direct control of distributed execution and data flow. As a result, Tajo has a variety of query evaluation strategies and more optimization opportunities. In addition, Tajo will have a native columnar execution engine and its optimizer.

General Information

User Documentation

Developer Documentation

Others




FrontPage (last edited 2013-05-08 02:41:48 by HyunsikChoi)