Apache Kylin : Analytical Data Warehouse for Big Data



Release Plan

Release version | Expected Date | Comment | Release Detail
4.0.0-alpha | 2020-09 | Release core features, including the new build engine and query engine. | -
4.0.0-beta | 2020-12 ~ 2021-01 | Implement other important features. | -
4.0.0-gamma | 2021-04 | Bug fix & Promotion | TODO
4.0.0 | 2021-07 | GA (Ready for production) | TODO
4.1.0 | Far future | ... | -

Features

Feature | Description | Comment | Status | Component | Priority | Arrival (Expected)
Kafka Source (NRT) | Ingest streaming data in batch mode. | In design phase; no conclusion yet on how to implement. | In Progress | Source | P1 | 4.1.0


Kafka Source (Real-time OLAP) | Ingest streaming data in stream/micro-batch mode. | In design phase; no conclusion yet on how to implement. | In Progress | Source | P1 | 4.1.0

JDBC Source (Original version) | Ingest data via the JDBC contract. | In design phase; no conclusion yet on how to implement. | In Progress | Source | P2 | 4.1.0

JDBC Source (Datasource SDK) | Ingest data via the JDBC contract. | In design phase; no conclusion yet on how to implement. | In Progress | Source | P2 | 4.1.0

MapReduce Build Engine | Build pre-calculated cuboid data with Hadoop MapReduce. | This feature may be useless. | Deleted | Build Engine | - | -



Spark Build Engine | Build pre-calculated cuboid data with Apache Spark. | New implementation provided | Ready | Build Engine | P0 | 4.0.0-alpha

Flink Build Engine | Build pre-calculated cuboid data with Apache Flink. | Supported in Kylin 3.1 | Deleted | Build Engine | - | -



HBase Storage | Use HBase to store pre-calculated cuboid data. | Discussion in mailing list | Deleted | Storage Engine | - | -



Parquet Storage | Use Parquet to store pre-calculated cuboid data. | Discussion in mailing list | Ready | Storage Engine | P0 | 4.0.0-alpha

Distributed Query Engine / Sparder | Use Calcite and Catalyst (a.k.a. Spark SQL) to parse, analyze, and execute SQL queries. | New implementation provided | Ready | Query Engine | P0 | 4.0.0-alpha

Measure - Bitmap | Precise count distinct. | N/A | Ready | Measure | P0 | 4.0.0-alpha

Measure - HLL | Approximate count distinct at low cost. | N/A | Ready | Measure | P0 | 4.0.0-alpha

Measure - TopN | TopN measure. | N/A | In Progress | Measure | P0 | 4.0.0-beta

Measure - Percentile | Percentile. | N/A | Ready | Measure | P0 | 4.0.0-alpha

Query Cache | Cache query results in the query server's memory or in an external cache service. | N/A | Ready | Query Engine | P0 | Before 4.0

HBase Metastore | Use HBase as metastore. | Will probably be removed in the GA version. (xxyu) | Deprecated | Metastore | P2 | Before 4.0

RDBMS Metastore | Use RDBMS as metastore. | Should be the first choice of metastore. | Ready | Metastore | P0 | Before 4.0

Cardinality Computation | Calculate cardinality of fact table and dimension table. | Planning | In Progress | Tool | P1 | 4.0.0-beta

Storage Cleanup | Remove useless data from storage or metastore. | New implementation | Ready | Tool | P0 | 4.0.0-alpha

CSV Source | Build cube from user-side CSV file. | New implementation | Ready | Source | P1 | 4.0.0-alpha

SQL Standard | To be updated | In testing. | In Progress | Query Engine | P0 | 4.0.0-beta

Global Dictionary (Hive) | Use Hive and MR to build global dictionary. | New global dictionary will replace this feature. | Deleted | Build Engine | - | -



Global Dictionary (AppendTrieDictionary) | Trie dictionary. | New global dictionary will replace this feature. | Deleted | Build Engine | - | -



Global Dictionary (Spark Bucket Dictionary) | Use Apache Spark to build global dictionary. | New implementation | Ready | Build Engine | P0 | 4.0.0-alpha

Cube Planner | To be updated | In design phase; no conclusion yet on how to implement. | In Progress | Advanced | P0 | 4.0.0-beta

System Cube and Dashboard | To be updated | Not well tested, planning | In Progress | Advanced | P0 | 4.0.0-beta

Read-Write Separation | The query engine and build engine use different Hadoop clusters. | New implementation provided | Ready | Advanced | P0 | 4.0.0-alpha

Pushdown Engine | To be updated | New pushdown engine will only support SparkSQL. | Ready | Query Engine | P0 | 4.0.0-alpha

Shrunken Dictionary | To be updated | This feature may be useless. | Deleted | Advanced | - | -



UHC dictionary | To be updated | This feature may be useless. | Deleted | Advanced | - | -



Deploy on AWS EMR | Support deploying Kylin on EMR 5.x and EMR 6.x. Support Glue. | Planning | In Progress | Env | P0 | 4.0.0-beta

All-in-one container | Provide a quick-start container for learning purposes. | How to learn Kylin in Docker | Ready | Env | P0 | 4.0.0-alpha

Hadoop3 support | Going to support Hadoop3 + Hive2 in 2020-Q4; not sure when Hive 3 will be supported. Items: AWS EMR 6.X; CDH 6.X (latest 6.3.2). | Planning | In Progress | Env | P0 | 4.0.0-beta

Hive3 support | - | N/A | In Progress | Env | P2 | Future

Spark3 support | Support using Spark3 for build and query. | N/A | In Progress | Env | P2 | 4.1.0

Hybrid Model / Flexible cuboid build | Add or remove dimensions without purging the whole cube's data. | Planning | In Progress | Build Engine | P1 | 4.1.0

Multi-level partition segment | Similar to Hive's multi-level partition design. | N/A | In Progress | Build Engine | P1 | 4.1.0

Use Catalyst to replace Calcite | Make query analysis quicker and lighter. | N/A | In Progress | Query Engine | P1 | 4.1.0

