Apache Kylin : Analytical Data Warehouse for Big Data
...
Code Block | ||||
---|---|---|---|---|
| ||||
git clone https://github.com/apache/kylin.git -b kylin-on-parquet-v2
# Enter the cloned repository before building
cd kylin
# Compile
mvn clean install -DskipTests |
...
The environment on
...
dev machine
Install Maven
The latest Maven can be found at http://maven.apache.org/download.cgi. Create a symbolic link so that mvn can be run from anywhere.
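The symbolic link step could look roughly like the sketch below. The function name link_mvn and both directories are assumptions for illustration, not part of the Kylin instructions; use the directory where you actually unpacked Maven.

```shell
# Sketch only: both arguments are placeholders chosen by the caller.
#   $1  directory where the Maven archive was unpacked (hypothetical, e.g. /opt/apache-maven)
#   $2  directory that is already on PATH (hypothetical, e.g. /usr/local/bin)
link_mvn() {
  maven_home="$1"
  bin_dir="$2"

  # Refuse to create a dangling link if the launcher is missing.
  if [ ! -x "$maven_home/bin/mvn" ]; then
    echo "mvn launcher not found under $maven_home" >&2
    return 1
  fi

  # -s creates a symbolic link, -f replaces an older link with the same name.
  ln -sf "$maven_home/bin/mvn" "$bin_dir/mvn"
}

# Example call (may need root privileges for system directories):
#   link_mvn /opt/apache-maven /usr/local/bin && mvn -version
```

Wrapping the step in a function keeps the check and the link together, so a wrong path fails early instead of leaving a broken mvn on the PATH.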
...
Manually install the Spark binary in a local folder such as /usr/local/spark. Kylin supports the community version of Spark. You can download Spark 2.4.6 from the Apache Spark official website.
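After unpacking Spark, a quick sanity check such as the sketch below may help confirm that the folder is usable. The function name check_spark_home is hypothetical; /usr/local/spark is only the example path mentioned above.

```shell
# Sketch only: verifies that a directory looks like an unpacked Spark distribution.
#   $1  candidate Spark home; /usr/local/spark is the example path from the text
check_spark_home() {
  spark_home="$1"
  if [ -x "$spark_home/bin/spark-submit" ]; then
    echo "Spark launcher found in $spark_home"
  else
    echo "no bin/spark-submit under $spark_home" >&2
    return 1
  fi
}

# Example call:
#   check_spark_home /usr/local/spark && /usr/local/spark/bin/spark-submit --version
```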
How to Debug
There are two ways to debug the source code: debug with local metadata (recommended), or debug with a Hadoop sandbox.
Configuration
Debug with local metadata
...
The VM option "-Dspark.local=true" is for the query engine.
Debug with Hadoop sandbox
Local configuration must be modified to point to your Hadoop sandbox (or CLI) machine.
- In examples/test_case_data/sandbox/kylin.properties
  - Find sandbox and replace it with your Hadoop hosts (if you’re using the HDP sandbox, this can be skipped).
  - Find kylin.job.use-remote-cli and change it to “true” (in the code repository the default is false, which assumes running on the Hadoop CLI).
  - Find kylin.job.remote.cli.username and kylin.job.remote.cli.password, and fill in the user name and password used to log in to the Hadoop cluster for Hadoop command execution. If you’re using the HDP sandbox, the default username is root and the password is hadoop.
- In examples/test_case_data/sandbox
  - For each configuration XML file, find all occurrences of sandbox and sandbox.hortonworks.com and replace them with your Hadoop hosts (if you’re using the HDP sandbox, this can be skipped).
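The find-and-replace steps above can be sketched as a small shell function. This is an illustration under assumptions, not a script from the Kylin repository: the function name point_config_to_host and the host name in the example call are hypothetical, and the user name and password properties still have to be filled in by hand.

```shell
# Sketch only: rewrites the sandbox host names in place, keeping a .bak copy of each file.
#   $1      your Hadoop host name (placeholder)
#   $2...   configuration files to edit
point_config_to_host() {
  new_host="$1"
  shift
  for conf_file in "$@"; do
    # One pattern covers both "sandbox.hortonworks.com" and the bare "sandbox".
    # The second expression sets kylin.job.use-remote-cli to true where that key is present.
    sed -i.bak \
      -e "s/sandbox\(\.hortonworks\.com\)\{0,1\}/${new_host}/g" \
      -e "s/^kylin\.job\.use-remote-cli=.*/kylin.job.use-remote-cli=true/" \
      "$conf_file"
  done
}

# Example call from the repository root (host name is a placeholder):
#   point_config_to_host hadoop01.example.com examples/test_case_data/sandbox/kylin.properties
```

The pattern matches every occurrence of the word sandbox, including any that are not host names, so it is worth comparing each file against its .bak copy before starting the debugger.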
...
Code Block | ||
---|---|---|
| ||
cd webapp
npm install -g bower
bower --allow-root install |
If you encounter a network problem when running “bower install”, you may try:
...
Note: on Windows, after installing bower, you need to add the path of “bower.cmd” to the system environment variable ‘PATH’, and then run:
...
In the IDE, launch org.apache.kylin.rest.DebugTomcat. Please set the path of the “server” module as the “Working directory”, set “kylin-server” for “Use classpath of module”, and check the “Include dependencies with ‘Provided’ scope” option in IntelliJ IDEA 2018. If you’re using IntelliJ IDEA 2017 or older, you need to modify the “server/kylin-server.iml” file and replace all “PROVIDED” with “COMPILE”; otherwise a “java.lang.NoClassDefFoundError: org/apache/catalina/LifecycleListener” error may be thrown.
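For IntelliJ IDEA 2017 and older, the replacement in “server/kylin-server.iml” can be scripted instead of edited by hand. The sketch below is one possible way to do it; the function name provided_to_compile is hypothetical.

```shell
# Sketch only: switches every PROVIDED scope in an IntelliJ module file to COMPILE.
#   $1  path to the module file, e.g. server/kylin-server.iml
provided_to_compile() {
  iml_file="$1"
  # Keep the original as <file>.bak so the change is easy to undo.
  sed -i.bak 's/PROVIDED/COMPILE/g' "$iml_file"
}

# Example call from the repository root:
#   provided_to_compile server/kylin-server.iml
```

IntelliJ may regenerate the module file when the Maven project is re-imported, in which case the replacement has to be repeated.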
You may also need to tune the VM options:
Code Block |
---|
-Dhdp.version=2.4.0.0-169 -DSPARK_HOME=/usr/local/spark -Dkylin.hadoop.conf.dir=/workspace/kylin/examples/test_case_data/sandbox -Xms800m -Xmx800m -XX:PermSize=64M -XX:MaxNewSize=256m -XX:MaxPermSize=128m |
Also remember that if you debug in local mode, you should add a VM option for the query engine:
Code Block |
---|
-Dspark.local=true |
...
By default, the Kylin server listens on port 7070. If you want to use another port, specify it as a parameter when running DebugTomcat.
...