Apache Kylin : Analytical Data Warehouse for Big Data

Welcome to Kylin Wiki.

On this page, I would like to list all new features and breaking changes. Some features have the status IN PROGRESS, which means the Kylin team is going to implement them and will follow the stated priority. If you have a suggestion or opinion about the current priorities, please let us know.


Release Plan

| Release version | Expected Date | Comment | Release Detail |
|---|---|---|---|
| 4.0.0-alpha | 2020-09 | Release core features, including the new build engine & query engine. | |
| 4.0.0-beta | 2020-12 ~ 2021-01 | Implement other important features. | |
| 4.0.0-gamma | 2021-04 | Bug fix & Promotion | TODO |
| 4.0.0 | 2021-07 | GA (Ready for production) | TODO |
| 4.1.0 | Far future | ... | |

Features

| Feature | Description | Comment | Status | Component | Priority | Arrival (Expected) |
|---|---|---|---|---|---|---|
| Kafka Source (NRT) | Ingest streaming data in batches. | In design phase; no conclusion yet on how to implement. | IN PROGRESS | SOURCE | P1 | 4.1.0 |
| Kafka Source (Real-time OLAP) | Ingest streaming data in a stream/micro-batch way. | In design phase; no conclusion yet on how to implement. | IN PROGRESS | SOURCE | P1 | 4.1.0 |
| JDBC Source (Original version) | Ingest data via the JDBC contract. | In design phase; no conclusion yet on how to implement. | IN PROGRESS | SOURCE | P2 | 4.1.0 |
| JDBC Source (Datasource SDK) | Ingest data via the JDBC contract. | In design phase; no conclusion yet on how to implement. | IN PROGRESS | SOURCE | P2 | 4.1.0 |
| MapReduce Build Engine | Build pre-calculated cuboid data with Hadoop MapReduce. | This feature may be useless. | DELETED | BUILD ENGINE | | |
| Spark Build Engine | Build pre-calculated cuboid data with Apache Spark. | New implementation provided. | READY | BUILD ENGINE | P0 | 4.0.0-ALPHA |
| Flink Build Engine | Build pre-calculated cuboid data with Apache Flink. | Supported in Kylin 3.1. | DELETED | BUILD ENGINE | | |
| HBase Storage | Use HBase to store pre-calculated cuboid data. | Discussion in mailing list. | DELETED | STORAGE ENGINE | | |
| Parquet Storage | Use Parquet to store pre-calculated cuboid data. | Discussion in mailing list. | READY | STORAGE ENGINE | P0 | 4.0.0-ALPHA |
| Distributed Query Engine / Sparder | Use Calcite & Catalyst (a.k.a. Spark SQL) to parse, analyze, and execute a SQL query. | New implementation provided. | READY | QUERY ENGINE | P0 | 4.0.0-ALPHA |
| Measure - Bitmap | Precise count distinct. | N/A | READY | MEASURE | P0 | 4.0.0-ALPHA |
| Measure - HLL | Approximate count distinct at low cost. | N/A | READY | MEASURE | P0 | 4.0.0-ALPHA |
| Measure - TopN | TopN measure. | N/A | IN PROGRESS | MEASURE | P0 | 4.0.0-BETA |
| Measure - Percentile | Percentile. | N/A | READY | MEASURE | P0 | 4.0.0-ALPHA |
| Query Cache | Cache query results in the query server's memory or in an external cache service. | N/A | READY | QUERY ENGINE | P0 | BEFORE 4.0 |
| HBase Metastore | Use HBase as metastore. | I guess it will be removed in the GA version. (xxyu) | DEPRECATED | METASTORE | P2 | BEFORE 4.0 |
| RDBMS Metastore | Use RDBMS as metastore. | Should be the first choice of metastore. | READY | METASTORE | P0 | BEFORE 4.0 |
| Cardinality Computation | Calculate the cardinality of fact tables and dimension tables. | Planning | IN PROGRESS | TOOL | P1 | 4.0.0-BETA |
| Storage Cleanup | Remove useless data from storage or metastore. | New implementation | READY | TOOL | P0 | 4.0.0-ALPHA |
| CSV Source | Build a cube from a user-side CSV file. | New implementation | READY | SOURCE | P1 | 4.0.0-ALPHA |
| SQL Standard | To be updated | In testing. | IN PROGRESS | QUERY ENGINE | P0 | 4.0.0-BETA |
| Global Dictionary (Hive) | Use Hive and MR to build the global dictionary. | New global dictionary will replace this feature. | DELETED | BUILD ENGINE | | |
| Global Dictionary (AppendTrieDictionary) | Trie dictionary. | New global dictionary will replace this feature. | DELETED | BUILD ENGINE | | |
| Global Dictionary (Spark Bucket Dictionary) | Use Apache Spark to build the global dictionary. | New implementation | READY | BUILD ENGINE | P0 | 4.0.0-ALPHA |
| Cube Planner | To be updated | In design phase; no conclusion yet on how to implement. | IN PROGRESS | ADVANCED | P0 | 4.0.0-BETA |
| System Cube and Dashboard | To be updated | Not well tested, planning. | IN PROGRESS | ADVANCED | P0 | 4.0.0-BETA |
| Read-Write Separation | The query engine and build engine use different Hadoop clusters. | New implementation provided. | READY | ADVANCED | P0 | 4.0.0-ALPHA |
| Pushdown Engine | To be updated | New pushdown engine will only support SparkSQL. | READY | QUERY ENGINE | P0 | 4.0.0-ALPHA |
| Shrunken Dictionary | To be updated | This feature may be useless. | DELETED | ADVANCED | | |
| UHC dictionary | To be updated | This feature may be useless. | DELETED | ADVANCED | | |
| Deploy on AWS EMR | Support deploying Kylin on EMR 5.x and EMR 6.x. Support Glue. | Planning | IN PROGRESS | ENV | P0 | 4.0.0-BETA |
| All-in-one container | Provide a quick-start container for learning purposes. | How to learn Kylin in Docker | READY | ENV | P0 | 4.0.0-ALPHA |
| Hadoop3 support | Going to support Hadoop3 + Hive2 in 2020-Q4 (AWS EMR 6.X; CDH 6.X, latest 6.3.2). Not sure when Hive 3 will be supported. | Planning | IN PROGRESS | ENV | P0 | 4.0.0-BETA |
| Hive3 support | | N/A | IN PROGRESS | ENV | P2 | FUTURE |
| Spark3 support | Support using Spark3 for build and query. | N/A | IN PROGRESS | ENV | P2 | 4.1.0 |
| Hybrid Model / Flexible cuboid build | Add or remove a dimension without purging the whole cube's data. | Planning | IN PROGRESS | BUILD ENGINE | P1 | 4.1.0 |
| Multi-level partition segment | Looks like Hive's multi-level partition design. | N/A | IN PROGRESS | BUILD ENGINE | P1 | 4.1.0 |
| Use Catalyst to replace Calcite | Make query analysis quicker and lighter. | N/A | IN PROGRESS | QUERY ENGINE | P1 | 4.1.0 |
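
The feature list includes a Bitmap measure for precise count distinct alongside several global dictionary implementations. As a rough illustration of how the two ideas fit together, here is a minimal Python sketch. It is not Kylin code: `GlobalDictionary`, `build_bitmap`, and `count_distinct` are hypothetical names, and a plain Python integer stands in for a real compressed bitmap.

```python
from typing import Dict, Iterable


class GlobalDictionary:
    """Hypothetical stand-in for a global dictionary: maps each distinct
    string to a stable, dense integer ID (0, 1, 2, ...)."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}

    def encode(self, value: str) -> int:
        # Assign the next free ID the first time a value is seen.
        return self._ids.setdefault(value, len(self._ids))


def build_bitmap(values: Iterable[str], dictionary: GlobalDictionary) -> int:
    """Return a bitmap (a Python int used as a bit set) with one bit set
    per distinct encoded value."""
    bitmap = 0
    for value in values:
        bitmap |= 1 << dictionary.encode(value)
    return bitmap


def count_distinct(bitmap: int) -> int:
    # The number of set bits equals the number of distinct values.
    return bin(bitmap).count("1")


dictionary = GlobalDictionary()
# Two illustrative "segments" of user names, encoded with the same dictionary.
day1 = build_bitmap(["alice", "bob", "alice"], dictionary)
day2 = build_bitmap(["bob", "carol"], dictionary)

# Because the IDs are shared across segments, the bitmaps can be merged
# with a bitwise OR and still yield an exact distinct count.
print(count_distinct(day1), count_distinct(day2), count_distinct(day1 | day2))
```

The point of the sketch is only that a shared dictionary keeps integer IDs consistent across segments, which is what makes merging bitmaps exact; an approximate structure such as HLL trades that exactness for lower cost.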


Deprecated~Development Plan Kylin 4.0

