Abstract

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines. The project aims to relieve the CPU computational bottleneck in data loading and other scenarios by offloading JVM operators to native engines. With advancements in IO technologies, especially the widespread use of SSDs and NICs of 10GbE or higher bandwidth, CPU computation has gradually become the primary limiting factor for performance. However, optimizing CPU instructions on the JVM is relatively challenging compared with native languages such as C++, as the JVM provides fewer optimization capabilities. At this moment, Apache Spark is the first engine Gluten can plug into. Support for other engines such as Trino and Apache Flink is on the roadmap.

Proposal

The Gluten project uses the plugin mechanism of JVM-based SQL engines (currently Apache Spark) to intercept query plans and send them to native engines for execution, bypassing the original engine's less efficient execution path. The project supports multiple native engines as backends, including Velox, ClickHouse, and Apache Arrow. For operations that the native engines cannot handle, Gluten falls back to the SQL engine's normal execution path. In terms of thread models, Gluten uses JNI (Java Native Interface) library calls to invoke native code directly within the original engine's executor task threads, avoiding the introduction of complex thread models.

Background

Apache Spark is a stable, mature project that has been under development for many years. The project has proven to be one of the best frameworks for processing petabyte-scale datasets. However, the Spark community has had to address performance challenges that required various optimizations over time. A key optimization introduced in Spark 2.0 replaced the Volcano iterator model with whole-stage code generation to achieve a 2x speedup. Most of these optimizations work at the query plan level.

However, there is a need to address query performance more broadly. The industry understands the current performance bottleneck. This motivated Intel and Kyligence to initiate the Gluten project to unleash the power of Advanced Vector Extensions (AVX) technology using SIMD instructions within a vectorized SQL engine, which enables Apache Spark (as well as other engines in the future) to break through its row-based data processing and JVM limitations. 

You can find more information on Gluten at the existing open-source website:

https://oap-project.github.io/gluten/

Rationale

The Gluten project aims to bridge the gap between Spark SQL's scalability and native libraries' performance benefits. By reusing Spark's control flow and JVM code while offloading compute-intensive data processing to native code, we seek to significantly improve performance without requiring changes to existing SparkSQL jobs. This approach involves transforming Spark's physical plan into a Substrait plan and passing it to native libraries, enabling the seamless execution of SparkSQL jobs with enhanced performance.

Multiple Native Backend Support

There are numerous mature open-source native SQL engine products and libraries available in the market, including Velox, ClickHouse, and Apache Arrow, among others. Gluten has opted for Velox and ClickHouse as backend support but remains open to expanding its support to incorporate other esteemed open-source native SQL engines.

Meta has launched Velox, an open-source unified execution engine designed to enhance data management system efficiency and simplify development.

ClickHouse is an open-source column-oriented database management system designed for high-performance analytics and data warehousing, capable of handling massive amounts of data with lightning-fast query processing.

Plan Conversion

Gluten uses Substrait.io to build a unified query plan tree and connect it to an individual backend engine. Gluten converts Spark’s physical plan to a Substrait plan for each backend, then shares the Substrait plan over JNI to trigger the execution pipeline in the native library.
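The conversion step described above can be pictured with a small, dependency-free sketch. It is not Gluten's implementation: the `PlanNode` class, the dictionary layout, and `native_execute` are hypothetical names chosen for illustration, and JSON stands in for the protobuf encoding that Substrait actually uses.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """A toy physical-plan node (hypothetical; not Spark's SparkPlan)."""
    op: str
    args: dict = field(default_factory=dict)
    children: List["PlanNode"] = field(default_factory=list)


def to_unified_plan(node: PlanNode) -> dict:
    """Recursively map an engine-specific node to an engine-neutral relation."""
    return {
        "rel": node.op.lower(),
        "args": node.args,
        "inputs": [to_unified_plan(child) for child in node.children],
    }


def serialize(plan: dict) -> bytes:
    # Assumption: JSON keeps the sketch self-contained; real Substrait
    # plans are protobuf messages.
    return json.dumps(plan, sort_keys=True).encode("utf-8")


def native_execute(payload: bytes) -> List[str]:
    """Stand-in for the native side of the JNI boundary.

    It only decodes the plan and lists the operators leaf-first, which is
    the order in which a pipeline would be assembled.
    """
    def walk(rel: dict):
        for child in rel["inputs"]:
            yield from walk(child)
        yield rel["rel"]

    return list(walk(json.loads(payload.decode("utf-8"))))


scan = PlanNode("Scan", {"table": "lineitem"})
filt = PlanNode("Filter", {"condition": "l_quantity > 10"}, [scan])
proj = PlanNode("Project", {"columns": ["l_orderkey"]}, [filt])

payload = serialize(to_unified_plan(proj))
pipeline = native_execute(payload)
```

The point of the sketch is that only a serialized, engine-neutral plan crosses the boundary, so each backend needs to understand one plan format rather than Spark's internal classes.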

Memory Management

Gluten leverages Spark’s existing memory management system. It calls the Spark memory registration API for every native memory allocation/deallocation action. Spark manages the memory for each task thread. If the thread needs more memory than is available, it can call the spill interface for operators that support this capability. Spark’s memory management system protects against memory leaks and out-of-memory issues.
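The accounting scheme described above can be illustrated with a toy model. This is a sketch under stated assumptions, not Spark's or Gluten's code: `TaskMemoryPool`, `register`, `acquire`, and the spill callback signature are all hypothetical, and the "memory" is just integer bookkeeping.

```python
from typing import Callable, Dict, Optional


class TaskMemoryPool:
    """Toy per-task memory pool: every allocation is registered here first."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.used: Dict[str, int] = {}
        # Hypothetical spill callbacks: take bytes needed, return bytes freed.
        self.spillers: Dict[str, Callable[[int], int]] = {}

    def register(self, name: str, spill: Optional[Callable[[int], int]] = None):
        self.used[name] = 0
        if spill is not None:
            self.spillers[name] = spill

    def free_bytes(self) -> int:
        return self.capacity - sum(self.used.values())

    def acquire(self, name: str, size: int) -> None:
        shortfall = size - self.free_bytes()
        if shortfall > 0:
            # Ask spill-capable operators to release memory until it fits.
            for other, spill in self.spillers.items():
                if shortfall <= 0:
                    break
                released = spill(shortfall)
                self.used[other] -= released
                shortfall -= released
        if size > self.free_bytes():
            raise MemoryError(f"{name}: cannot acquire {size} bytes")
        self.used[name] += size

    def release(self, name: str, size: int) -> None:
        self.used[name] -= size


pool = TaskMemoryPool(capacity=100)
spilled = []


def spill_sort(needed: int) -> int:
    # A real operator would write buffered data to disk here.
    released = min(needed, pool.used["sort"])
    spilled.append(released)
    return released


pool.register("sort", spill=spill_sort)
pool.register("join")          # no spill support
pool.acquire("sort", 80)
pool.acquire("join", 50)       # 30 bytes short, so "sort" is asked to spill
```

In this run the join's request exceeds the free space by 30 units, the sort operator spills exactly that amount, and the pool ends up fully used without exceeding its capacity.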

Columnar Shuffle

Shuffle is a crucial factor in Spark performance. It involves multiple steps, including serialization/deserialization, network transmission, and disk I/O, and it needs careful design so that it does not become a bottleneck. Since the native engine stores data in a columnar structure, simply adopting Spark's row-based data model for shuffle would require a column-to-row conversion in the shuffle write phase and a row-to-column conversion in the shuffle read phase just to keep data flowing between stages. Both conversions come at a cost. Therefore, Gluten must provide a comprehensive Columnar Shuffle mechanism that bypasses these conversion overheads. The implementation of columnar shuffle can be broadly divided into two parts: shuffle data writing and shuffle data reading.

Gluten is also integrated with Apache Celeborn (incubating), a mature general-purpose Remote Shuffle Service that can effectively address the stability, performance, and elasticity issues present in the local shuffle of big data engines. The Apache Celeborn community and the Gluten community have been cooperating for some time and have successfully integrated Celeborn into Gluten. This integration allows Spark to better embrace the Cloud Native approach.
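To make the write and read halves concrete, here is a minimal sketch of partitioning a columnar batch without ever building row objects. It is only an illustration: the `Batch` layout (a dict of column lists), the function names, and the use of integer modulo in place of a real hash function are assumptions, not Gluten's shuffle format.

```python
from typing import Dict, List

Batch = Dict[str, list]  # column name -> column values (toy columnar batch)


def shuffle_write(batch: Batch, key: str, num_partitions: int) -> List[Batch]:
    """Split a batch into per-partition batches, column by column."""
    # Partition ids are computed once from the key column. Integer keys
    # and modulo stand in for a real hash partitioner.
    pids = [k % num_partitions for k in batch[key]]
    out = []
    for p in range(num_partitions):
        idx = [i for i, pid in enumerate(pids) if pid == p]
        # Gather each column by index; no per-row objects are created.
        out.append({name: [col[i] for i in idx] for name, col in batch.items()})
    return out


def shuffle_read(chunks: List[Batch]) -> Batch:
    """Concatenate the column chunks that several writers produced."""
    merged: Batch = {}
    for chunk in chunks:
        for name, col in chunk.items():
            merged.setdefault(name, []).extend(col)
    return merged


map_output_a = shuffle_write({"id": [1, 2, 3, 4, 5], "v": ["a", "b", "c", "d", "e"]}, "id", 2)
map_output_b = shuffle_write({"id": [7, 8], "v": ["g", "h"]}, "id", 2)

# A reducer responsible for partition 1 fetches that partition from every writer.
partition_1 = shuffle_read([map_output_a[1], map_output_b[1]])
```

Both sides operate on whole columns, so the data stays in columnar form from the writer through to the reader.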

Shim Layer

To integrate with Spark, Gluten incorporates a Shim Layer that manages the API differences across Spark releases, which makes it straightforward to extend support to multiple Spark versions. Presently, Gluten supports Spark 3.2 and 3.3, with support for further Spark versions in the pipeline.
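The shim idea can be sketched as a version-keyed registry behind a common interface. Everything below is hypothetical: the class names, the `file_scan_description` method, and the behavioural difference between the two versions are invented purely to show the dispatch pattern and do not reflect real Spark API differences or Gluten's shim classes.

```python
from abc import ABC, abstractmethod


class SparkShim(ABC):
    """Common interface the rest of the code programs against."""

    @abstractmethod
    def file_scan_description(self, path: str) -> str:
        ...


class Spark32Shim(SparkShim):
    def file_scan_description(self, path: str) -> str:
        # Invented behaviour for illustration only.
        return f"FileScan[{path}]"


class Spark33Shim(SparkShim):
    def file_scan_description(self, path: str) -> str:
        # Invented behaviour for illustration only.
        return f"BatchScan[{path}]"


_SHIMS = {"3.2": Spark32Shim, "3.3": Spark33Shim}


def load_shim(spark_version: str) -> SparkShim:
    """Pick the shim matching the major.minor part of the running version."""
    major_minor = ".".join(spark_version.split(".")[:2])
    try:
        return _SHIMS[major_minor]()
    except KeyError:
        raise ValueError(f"unsupported Spark version: {spark_version}") from None


shim = load_shim("3.3.1")
description = shim.file_scan_description("/data/t")
```

Supporting an additional Spark release in this pattern means adding one shim class and one registry entry, while callers keep using the same interface.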

Fallback Mechanism

Gluten utilizes the established Spark JVM engine to validate operator compatibility with the native library. In cases where the operator is not supported, Gluten seamlessly reverts to the pre-existing Spark-JVM-based operator. However, this fallback mechanism incurs a performance trade-off due to the necessitated columnar-to-row and row-to-column data conversions.
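The cost of a fallback is easiest to see on a small example. The sketch below walks a linear operator chain, marks each operator as native or JVM according to a hypothetical support set, and inserts a conversion wherever execution crosses the boundary. The operator names, the `NATIVE_SUPPORTED` set, and the output labels are assumptions made for illustration.

```python
from typing import Iterable, List, Set

# Hypothetical set of operators the native backend can run.
NATIVE_SUPPORTED: Set[str] = {"scan", "filter", "project", "aggregate"}


def plan_with_fallback(ops: Iterable[str],
                       supported: Set[str] = NATIVE_SUPPORTED) -> List[str]:
    """Tag each operator and insert conversions at native/JVM boundaries."""
    out: List[str] = []
    prev_native = None
    for op in ops:
        native = op in supported  # validation step
        if prev_native is not None and native != prev_native:
            # Leaving native code needs rows; re-entering needs columns.
            out.append("ColumnarToRow" if prev_native else "RowToColumnar")
        out.append(f"native:{op}" if native else f"jvm:{op}")
        prev_native = native
    return out


plan = plan_with_fallback(["scan", "filter", "python_udf", "aggregate"])
# plan == ["native:scan", "native:filter", "ColumnarToRow",
#          "jvm:python_udf", "RowToColumnar", "native:aggregate"]
```

A single unsupported operator in the middle of the chain introduces two conversions, which is the performance trade-off the paragraph above refers to.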

Spark Metrics Extension

Gluten greatly enhances Spark’s Metrics functionality by seamlessly integrating with it. While the default Spark metrics are tailored for Java row-based data processing, Project Gluten takes it a step further. We enrich this functionality with a specialized column-based API and introduce supplementary metrics. This augmentation not only optimizes the use of Gluten but also offers developers valuable tools for debugging these native libraries effectively.

Initial Goals

  • Unified Plan Transformation: Implement a robust mechanism to transform a SQL plan into a unified plan.
  • Seamless Native Integration: Develop a smooth integration of native libraries, optimizing the offloading of performance-critical data processing tasks for improved computational speed.
  • Efficient Communication Infrastructure: Define clear JNI interfaces to facilitate efficient communication between the big data framework and native libraries.
  • Expand the Community: Foster the growth and diversification of the Gluten community, empowering development teams and strengthening the project's foundation.

Current Status

In 2022, Intel and Kyligence initiated the development of Gluten, initially released as an open-source Spark plugin. During this period, there was a recognized need for a project capable of harnessing hardware capabilities and seamlessly integrating with native libraries to deliver superior performance, surpassing the limitations of the existing Java-based Spark SQL.

Meritocracy:

This proposal aims to build a diverse community for Gluten, following the Apache Software Foundation's approach. Since Gluten became open source, many companies have adopted it for their big data solutions. The code is managed by developers from over 10 companies, and we invite individual developers to play key roles too. We're committed to creating an environment that values meritocracy principles. 

Community:

The project has attracted strong interest from numerous companies and individuals, who have engaged in discussions about its roadmap, issues, and design over the past year. Gluten already has active contributors from various organizations. We believe that embracing the Apache Way will further enhance the growth of both the community and the project.

Users:

Gluten has been adopted in the on-premise data warehouses of companies including Baidu, BIGO, and Meituan, where it efficiently handles thousands of tasks daily. Additionally, Alibaba's E-MapReduce product on Alibaba Cloud incorporates Gluten as a key feature, serving numerous customers. A significant number of our users express a strong willingness to actively contribute to the project, fostering community growth and strength.

Core Developers:

Gluten boasts a diverse developer base spanning multiple organizations, including but not limited to Intel, Kyligence, BIGO, Alibaba, Meituan, Microsoft, Baidu, and Netease. Many of these developers hold key roles as PMC members, and actively contribute not only to Gluten but also to various other Apache projects.

Alignment:

Gluten is built on Apache Spark and incorporates several other Apache projects, including Hadoop and YARN. The codebase of Gluten is already licensed under Apache License Version 2.0. Moreover, our team includes core developers with significant experience contributing to diverse Apache projects. Leveraging these community connections, we prioritize development practices that emphasize community engagement, in line with the Apache Software Foundation's meritocratic approach.

Known Risks

Project Name

“Gluten” is Latin for glue. The main goal of the Gluten project is to “glue” JVM-based SQL engines such as Spark SQL to native libraries, so that we can benefit from both the high scalability of the Spark SQL framework and the high performance of native libraries.

Orphaned products

There is a certain level of risk associated with the potential abandonment of the Gluten project, particularly given its status as a young and relatively small community. It is imperative that we address and mitigate this risk promptly during the Apache Incubation phase. Numerous organizations rely on Gluten to construct vital big data pipelines, making it crucial to engage and encourage their involvement in nurturing the Gluten community, especially if it transitions into an Apache Software Foundation (ASF) project.

Inexperience with Open Source

Numerous Gluten contributors possess extensive experience in collaborating on open-source projects. Additionally, they actively contribute and serve as committers to various other Apache projects.

Length of Incubation:

We expect to enter incubation within two months and to graduate in 18-24 months.

Homogenous Developers

The present contributors are affiliated with diverse organizations such as Intel, Kyligence, and more. We remain dedicated to recruiting additional committers based on their significant contributions to the project. The Gluten project is inherently polyglot, supporting development in a diverse array of languages such as Scala, Java, C++, Python, and Shell Script. This versatility appeals to developers with a broad spectrum of language skills, encouraging their active contributions to the Gluten project.

Reliance on Salaried Developers

Salaried engineers from companies such as Intel and Kyligence have made valuable contributions to the Gluten project, dedicating both their salaried work hours and volunteer time. Their enthusiasm for the project is palpable, and we remain steadfast in our commitment to expanding our team, welcoming more members from various backgrounds, including non-salaried developers. Our goal is to foster a more diverse Gluten user and contributor base as we move forward.

Relationships with Other Apache Products

  • Apache Spark: Gluten's endorsement of Spark as its primary big data framework of choice stems from Spark's reputation as a potent, open-source distributed computing framework, integral to the core of big data analytics.
  • Apache Arrow: Gluten utilizes Apache Arrow as a data format to empower high-performance data interchange across diverse programming languages, frameworks, and backends.
  • Apache Celeborn (incubating): Gluten is closely integrated with Apache Celeborn for remote shuffle service support. The design goal of integrating Gluten with Celeborn is to simultaneously preserve the core designs of Gluten Columnar Shuffle and Celeborn Remote Shuffle, allowing the advantages of both to be combined.
  • Apache Uniffle (incubating): Uniffle, a project offering high performance remote shuffle service capabilities, represents another promising integration opportunity that Gluten is considering. Gluten will be supported in the Apache Uniffle v0.8 release.
  • Apache Flink: Apache Flink emerges as another promising big data framework that Gluten aims to incorporate as an intermediary layer, facilitating the seamless offloading of data processing to the native engine.

An Excessive Fascination with the Apache Brand

The main objective behind submitting Gluten to the ASF is to cultivate a robust and diverse community while fostering stability for sustainable development. Additionally, we aspire to promote the widespread adoption of Gluten by diverse organizations, encouraging their contributions without any apprehensions regarding ownership or licensing.

Documentation

Documentation is available for the following versions of Gluten:

  • main
  • branch-0.5.0
  • branch-1.0
  • branch-1.1 

Initial Source

Gluten Source Code (https://github.com/oap-project/gluten)

Initial Source and Intellectual Property Submission Plan

Upon Gluten's approval to join the Apache Incubator, Intel and Kyligence will submit a Software Grant Agreement (SGA) and a CCLA (Kyligence has already signed; Intel has agreed to sign once the project enters the incubator); our initial committers will promptly submit their iCLAs. The codebase is already licensed under the Apache License 2.0, ensuring compliance and seamless integration.

External Dependencies

The list is long, so it is given in Table 1 at the end of this page.

Cryptography

Gluten does not currently include any cryptography-related code.

Required Resources

Mailing lists:

Git Repositories:

Upon entering incubation, we want to move the existing repo to the Apache Software Foundation:

Issue Tracking:

  • We request the creation of an Apache-hosted JIRA.
  • Jira ID: GLUTEN

Initial Committers

  • Hongze Zhang (Github ID: zhztheplayer) <Hongze.Zhang at intel dot com>
  • Rui Mo (Github ID: rui-mo) <rui.mo at intel dot com>
  • Rong Ma (Github ID: marin-ma) <rong.ma at intel dot com>
  • Feilong He (Github ID: PHILO-HE) <Feilong.He at intel dot com>
  • Zhichao Zhang (Github ID: zzcclp) <zhangzc at apache dot org>
  • Jia Ke (Github ID: JkSelf) <ke.a.jia at intel dot com>
  • Yang Li (Github ID: taiyang-li) <liyang910910 at gmail dot com>
  • Yang Zhang (Github ID: Yohahaha) <yangchuan.zy at alibaba-inc dot com>
  • Yuan Zhou (Github ID: zhouyuan) <yuan.zhou at intel dot com>
  • Xiduo You (Github ID: ulysses-you) <ulyssesyou at apache dot org>
  • Jiabiao Liang (Github ID: lgbo-ustc) <lgbo.ustc at gmail dot com>
  • Chunwei Zuo (Github ID: zuochunwei) <zuochunwei at meituan dot com>
  • Chang Chen (Github ID: baibaichen) <chang.chen at kyligence dot io>
  • Shuai Li (Github ID: loneylee) <shuai.li at kyligence dot io>
  • Binwei Yang (Github ID: FelixYBW) <binwei.yang at intel dot com>
  • Hongbin Ma (Github ID: binmahone) <mahongbin at apache dot org>
  • Neng Liu (Github ID: liuneng1994) <neng.liu at kyligence dot io>
  • Zhen Li (Github ID: zhli1142015) <zhli at microsoft dot com>
  • Weiting Chen (Github ID: weiting-chen) <weiting.chen at intel dot com>
  • Jacky Lee (Github ID: jackylee-ch) <qcsd2011 at gmail dot com>
  • Zhibiao Zhang (Github ID: zhanglistar) <zhanglinuxstar at gmail dot com>
  • Kuo Zhao (Github ID: kecookier) <zhaokuo_game at 163 dot com>: Meituan team leader who helped adopt Gluten in Meituan's production environment as Gluten’s first real use case.
  • Keyong Zhou (Github ID: waitinfuture) <zky.zhoukeyong at alibaba-inc dot com>: Alibaba team leader and Apache Celeborn committer who helped integrate Gluten into Alibaba EMR and brought Celeborn support to Gluten.

Affiliations

  • Intel: Binwei Yang, Feilong He, Hongze Zhang, Jia Ke, Rong Ma, Rui Mo, Weiting Chen, Yuan Zhou
  • Kyligence: Chang Chen, Hongbin Ma, Neng Liu, Shuai Li, Zhichao Zhang
  • BIGO: Jiabiao Liang, Yang Li, Zhibiao Zhang
  • Alibaba: Yang Zhang, Keyong Zhou
  • Meituan: Chunwei Zuo, Kuo Zhao
  • Baidu: Jacky Lee
  • Netease: Xiduo You
  • Microsoft: Zhen Li

Sponsors

Champion

  • Shaofeng Shi (shaofengshi@apache.org)

Nominated Mentors

  • Yu Li (liyu@apache.org)
  • Wenli Zhang (ovilia@apache.org)
  • Kent Yao (yao@apache.org)
  • Shaofeng Shi (shaofengshi@apache.org)
  • Felix Cheung (felixcheung@apache.org)

Sponsoring Entity

We expect the Apache Incubator to sponsor this project.

Table 1: External dependencies

Apache 1.1

  • oro:oro:jar:2.0.8

Apache 2.0

237 dependencies in total, as listed below:

  • cglib:cglib-nodep:jar:2.1_3
  • com.clearspring.analytics:stream:jar:2.9.6
  • com.fasterxml.jackson.core:jackson-annotations:jar:2.13.5
  • com.fasterxml.jackson.core:jackson-core:jar:2.13.5
  • com.fasterxml.jackson.core:jackson-databind:jar:2.13.5
  • com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:jar:2.13.4:runtime
  • com.fasterxml.jackson.datatype:jackson-datatype-jdk8:jar:2.13.4:runtime
  • com.fasterxml.jackson.module:jackson-module-scala_2.12:jar:2.13.5
  • com.github.ben-manes.caffeine:caffeine:jar:2.9.3
  • com.github.joshelser:dropwizard-metrics-hadoop-metrics2-reporter:jar:0.1.2
  • com.google.code.findbugs:jsr305:jar:3.0.0
  • com.google.code.findbugs:jsr305:jar:3.0.0
  • com.google.code.findbugs:jsr305:jar:3.0.0:runtime
  • com.google.code.gson:gson:jar:2.2.4
  • com.google.code.gson:gson:jar:2.8.6
  • com.google.code.gson:gson:jar:2.8.9
  • com.google.crypto.tink:tink:jar:1.6.0
  • com.google.errorprone:error_prone_annotations:jar:2.10.0
  • com.google.guava:guava:jar:11.0.2
  • com.google.guava:guava:jar:26.0-jre
  • com.google.guava:guava:jar:32.0.1-android
  • com.google.j2objc:j2objc-annotations:jar:2.8
  • com.jolbox:bonecp:jar:0.8.0.RELEASE
  • com.madgag:animated-gif-lib:jar:1.4
  • com.ning:compress-lzf:jar:1.0.3
  • com.tdunning:json:jar:1.8
  • com.twitter:chill_2.12:jar:0.10.0
  • com.twitter:chill-java:jar:0.10.0
  • com.univocity:univocity-parsers:jar:2.9.1
  • com.zaxxer:HikariCP:jar:2.5.1
  • commons-beanutils:commons-beanutils:jar:1.7.0
  • commons-beanutils:commons-beanutils-core:jar:1.8.0
  • commons-cli:commons-cli:jar:1.2
  • commons-codec:commons-codec:jar:1.15
  • commons-codec:commons-codec:jar:1.4
  • commons-collections:commons-collections:jar:3.2.2
  • commons-configuration:commons-configuration:jar:1.6
  • commons-dbcp:commons-dbcp:jar:1.4
  • commons-digester:commons-digester:jar:1.8
  • commons-httpclient:commons-httpclient:jar:3.1
  • commons-io:commons-io:jar:2.11.0
  • commons-io:commons-io:jar:2.4
  • commons-io:commons-io:jar:2.8.0
  • commons-lang:commons-lang:jar:2.6
  • commons-logging:commons-logging:jar:1.1.3
  • commons-logging:commons-logging:jar:1.2
  • commons-net:commons-net:jar:3.1
  • commons-pool:commons-pool:jar:1.5.4
  • de.rototor.pdfbox:graphics2d:jar:0.27
  • io.airlift:aircompressor:jar:0.21
  • io.dropwizard.metrics:metrics-core:jar:4.2.0
  • io.dropwizard.metrics:metrics-graphite:jar:4.2.0
  • io.dropwizard.metrics:metrics-jmx:jar:4.2.0
  • io.dropwizard.metrics:metrics-json:jar:4.2.0
  • io.dropwizard.metrics:metrics-jvm:jar:4.2.0
  • io.jsonwebtoken:jjwt-api:jar:0.10.5
  • io.jsonwebtoken:jjwt-impl:jar:0.10.5
  • io.jsonwebtoken:jjwt-jackson:jar:0.10.5
  • io.netty:netty-all:jar:4.0.23.Final
  • io.netty:netty-all:jar:4.1.68.Final
  • io.substrait:core:jar:0.5.0
  • io.trino.tpcds:tpcds:jar:1.4
  • io.trino.tpch:tpch:jar:1.1
  • jakarta.validation:jakarta.validation-api:jar:2.0.2
  • javax.inject:javax.inject:jar:1
  • javax.jdo:jdo-api:jar:3.0.1
  • javax.xml.stream:stax-api:jar:1.0-2
  • joda-time:joda-time:jar:2.10.10
  • log4j:log4j:jar:1.2.17
  • net.bytebuddy:byte-buddy:jar:1.9.3
  • net.bytebuddy:byte-buddy-agent:jar:1.9.3
  • net.sf.opencsv:opencsv:jar:2.3
  • net.sourceforge.cssparser:cssparser:jar:0.9.16
  • net.sourceforge.htmlunit:htmlunit:jar:2.18
  • net.sourceforge.htmlunit:htmlunit-core-js:jar:2.17
  • net.sourceforge.nekohtml:nekohtml:jar:1.9.22
  • org.apache.avro:avro:jar:1.10.2
  • org.apache.avro:avro:jar:1.7.4
  • org.apache.avro:avro-ipc:jar:1.10.2
  • org.apache.avro:avro-mapred:jar:1.10.2
  • org.apache.commons:commons-compress:jar:1.20
  • org.apache.commons:commons-compress:jar:1.4.1
  • org.apache.commons:commons-compress:jar:1.9
  • org.apache.commons:commons-crypto:jar:1.1.0
  • org.apache.commons:commons-exec:jar:1.3
  • org.apache.commons:commons-lang3:jar:3.12.0
  • org.apache.commons:commons-math3:jar:3.1.1
  • org.apache.commons:commons-math3:jar:3.4.1
  • org.apache.commons:commons-text:jar:1.6
  • org.apache.curator:curator-client:jar:2.7.1
  • org.apache.curator:curator-framework:jar:2.7.1
  • org.apache.curator:curator-recipes:jar:2.7.1
  • org.apache.derby:derby:jar:10.14.2.0
  • org.apache.directory.api:api-asn1-api:jar:1.0.0-M20
  • org.apache.directory.api:api-util:jar:1.0.0-M20
  • org.apache.directory.server:apacheds-i18n:jar:2.0.0-M15
  • org.apache.directory.server:apacheds-kerberos-codec:jar:2.0.0-M15
  • org.apache.hadoop:hadoop-annotations:jar:2.7.4
  • org.apache.hadoop:hadoop-auth:jar:2.7.4
  • org.apache.hadoop:hadoop-client:jar:2.7.4
  • org.apache.hadoop:hadoop-common:jar:2.7.4
  • org.apache.hadoop:hadoop-hdfs:jar:2.7.4
  • org.apache.hadoop:hadoop-mapreduce-client-app:jar:2.7.4
  • org.apache.hadoop:hadoop-mapreduce-client-common:jar:2.7.4
  • org.apache.hadoop:hadoop-mapreduce-client-core:jar:2.7.4
  • org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:2.7.4
  • org.apache.hadoop:hadoop-mapreduce-client-shuffle:jar:2.7.4
  • org.apache.hadoop:hadoop-yarn-api:jar:2.7.4
  • org.apache.hadoop:hadoop-yarn-client:jar:2.7.4
  • org.apache.hadoop:hadoop-yarn-common:jar:2.7.4
  • org.apache.hadoop:hadoop-yarn-server-common:jar:2.7.4
  • org.apache.hive.shims:hive-shims-0.23:jar:2.3.9
  • org.apache.hive.shims:hive-shims-common:jar:2.3.9
  • org.apache.hive.shims:hive-shims-scheduler:jar:2.3.9
  • org.apache.hive:hive-common:jar:2.3.9
  • org.apache.hive:hive-exec:jar:core:2.3.9
  • org.apache.hive:hive-llap-client:jar:2.3.9
  • org.apache.hive:hive-llap-common:jar:2.3.9
  • org.apache.hive:hive-metastore:jar:2.3.9
  • org.apache.hive:hive-serde:jar:2.3.9
  • org.apache.hive:hive-shims:jar:2.3.9
  • org.apache.hive:hive-storage-api:jar:2.7.2
  • org.apache.hive:hive-vector-code-gen:jar:2.3.9
  • org.apache.htrace:htrace-core:jar:3.1.0-incubating
  • org.apache.httpcomponents:httpclient:jar:4.2.5
  • org.apache.httpcomponents:httpclient:jar:4.5.13
  • org.apache.httpcomponents:httpcore:jar:4.2.4
  • org.apache.httpcomponents:httpcore:jar:4.4.13
  • org.apache.httpcomponents:httpmime:jar:4.5
  • org.apache.ivy:ivy:jar:2.5.0
  • org.apache.orc:orc-core:jar:1.6.14
  • org.apache.orc:orc-mapreduce:jar:1.6.14
  • org.apache.orc:orc-shims:jar:1.6.14
  • org.apache.parquet:parquet-column:jar:1.12.2
  • org.apache.parquet:parquet-common:jar:1.12.2
  • org.apache.parquet:parquet-encoding:jar:1.12.2
  • org.apache.parquet:parquet-format-structures:jar:1.12.2
  • org.apache.parquet:parquet-hadoop:jar:1.12.2
  • org.apache.parquet:parquet-jackson:jar:1.12.2
  • org.apache.pdfbox:fontbox:jar:2.0.19
  • org.apache.pdfbox:pdfbox:jar:2.0.19
  • org.apache.spark:spark-catalyst_2.12:jar:3.2.2
  • org.apache.spark:spark-catalyst_2.12-jars:3.2.2
  • org.apache.spark:spark-core_2.12:jar:3.2.2
  • org.apache.spark:spark-core_2.12-jars:3.2.2
  • org.apache.spark:spark-hive_2.12:jar:3.2.2
  • org.apache.spark:spark-kvstore_2.12:jar:3.2.2
  • org.apache.spark:spark-launcher_2.12:jar:3.2.2
  • org.apache.spark:spark-network-common_2.12:jar:3.2.2
  • org.apache.spark:spark-network-shuffle_2.12:jar:3.2.2
  • org.apache.spark:spark-sketch_2.12:jar:3.2.2
  • org.apache.spark:spark-sql_2.12:jar:3.2.2
  • org.apache.spark:spark-sql_2.12-jars:3.2.2
  • org.apache.spark:spark-tags_2.12:jar:3.2.2
  • org.apache.spark:spark-unsafe_2.12:jar:3.2.2
  • org.apache.thrift:libfb303:jar:0.9.3
  • org.apache.thrift:libthrift:jar:0.12.0
  • org.apache.velocity:velocity:jar:1.5
  • org.apache.xbean:xbean-asm9-shaded:jar:4.20
  • org.apache.yetus:audience-annotations:jar:0.5.0
  • org.apache.zookeeper:zookeeper:jar:3.4.6
  • org.apache.zookeeper:zookeeper:jar:3.6.2
  • org.apache.zookeeper:zookeeper-jute:jar:3.6.2
  • org.codehaus.jackson:jackson-core-asl:jar:1.9.13
  • org.codehaus.jackson:jackson-jaxrs:jar:1.9.13
  • org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13
  • org.codehaus.jackson:jackson-xc:jar:1.9.13
  • org.datanucleus:datanucleus-api-jdo:jar:4.2.4
  • org.datanucleus:datanucleus-core:jar:4.1.17
  • org.datanucleus:datanucleus-rdbms:jar:4.1.19
  • org.datanucleus:javax.jdo:jar:3.2.0-m3
  • org.eclipse.jetty.websocket:websocket-api:jar:9.2.12.v20150709
  • org.eclipse.jetty.websocket:websocket-client:jar:9.2.12.v20150709
  • org.eclipse.jetty.websocket:websocket-common:jar:9.2.12.v20150709
  • org.eclipse.jetty:jetty-io:jar:9.2.12.v20150709
  • org.eclipse.jetty:jetty-util:jar:9.2.12.v20150709
  • org.glassfish.hk2.external:aopalliance-repackaged:jar:2.6.1
  • org.glassfish.hk2.external:jakarta.inject:jar:2.6.1
  • org.glassfish.hk2:hk2-api:jar:2.6.1
  • org.glassfish.hk2:hk2-locator:jar:2.6.1
  • org.glassfish.hk2:hk2-utils:jar:2.6.1
  • org.glassfish.hk2:osgi-resource-locator:jar:1.0.3
  • org.glassfish.jersey.inject:jersey-hk2:jar:2.34
  • org.jetbrains:annotations:jar:17.0.0
  • org.json4s:json4s-ast_2.12:jar:3.7.0-M11
  • org.json4s:json4s-core_2.12:jar:3.7.0-M11
  • org.json4s:json4s-jackson_2.12:jar:3.7.0-M11
  • org.json4s:json4s-scalap_2.12:jar:3.7.0-M11
  • org.knowm.xchart:xchart:jar:3.6.5
  • org.lz4:lz4-java:jar:1.7.1
  • org.mortbay.jetty:jetty-sslengine:jar:6.1.26
  • org.mortbay.jetty:jetty-util:jar:6.1.26
  • org.objenesis:objenesis:jar:2.5.1
  • org.objenesis:objenesis:jar:2.6
  • org.roaringbitmap:RoaringBitmap:jar:0.9.0
  • org.roaringbitmap:shims:jar:0.9.0
  • org.scalactic:scalactic_2.12:jar:3.2.3
  • org.scala-lang.modules:scala-parser-combinators_2.12:jar:1.1.2
  • org.scala-lang.modules:scala-xml_2.12:jar:1.2.0
  • org.scala-lang:scala-library:jar:2.12.12
  • org.scala-lang:scala-library:jar:2.12.15
  • org.scala-lang:scala-reflect:jar:2.12.12
  • org.scala-lang:scala-reflect:jar:2.12.15
  • org.scalatest:scalatest_2.12:jar:3.2.3
  • org.scalatest:scalatest-compatible:jar:3.2.3
  • org.scalatest:scalatest-core_2.12:jar:3.2.3
  • org.scalatest:scalatest-diagrams_2.12:jar:3.2.3
  • org.scalatest:scalatest-featurespec_2.12:jar:3.2.3
  • org.scalatest:scalatest-flatspec_2.12:jar:3.2.3
  • org.scalatest:scalatest-freespec_2.12:jar:3.2.3
  • org.scalatest:scalatest-funspec_2.12:jar:3.2.3
  • org.scalatest:scalatest-funsuite_2.12:jar:3.2.3
  • org.scalatest:scalatest-matchers-core_2.12:jar:3.2.3
  • org.scalatest:scalatest-mustmatchers_2.12:jar:3.2.3
  • org.scalatest:scalatest-propspec_2.12:jar:3.2.3
  • org.scalatest:scalatest-refspec_2.12:jar:3.2.3
  • org.scalatest:scalatest-shouldmatchers_2.12:jar:3.2.3
  • org.scalatest:scalatest-wordspec_2.12:jar:3.2.3
  • org.scalatestplus:scalatestplus-mockito_2.12:jar:1.0.0-M2
  • org.scalatestplus:scalatestplus-scalacheck_2.12:jar:3.1.0.0-RC2
  • org.seleniumhq.selenium:selenium-api:jar:2.52.0
  • org.seleniumhq.selenium:selenium-htmlunit-driver:jar:2.52.0
  • org.seleniumhq.selenium:selenium-remote-driver:jar:2.52.0
  • org.seleniumhq.selenium:selenium-support:jar:2.52.0
  • org.slf4j:jcl-over-slf4j:jar:1.7.30
  • org.slf4j:jul-to-slf4j:jar:1.7.30
  • org.slf4j:slf4j-api:jar:1.7.30
  • org.slf4j:slf4j-log4j12:jar:1.7.30
  • org.spark-project.spark:unused:jar:1.0.0
  • org.xerial.snappy:snappy-java:jar:1.0.4.1
  • org.xerial.snappy:snappy-java:jar:1.1.8.4
  • org.yaml:snakeyaml:jar:1.31:runtime
  • stax:stax-api:jar:1.0.1
  • xalan:serializer:jar:2.7.2
  • xalan:xalan:jar:2.7.2
  • xerces:xercesImpl:jar:2.9.1
  • xml-apis:xml-apis:jar:1.3.04

Apache 2.0 with dual license

With Apache-2.0, BSD-2-Clause, BSD-3-Clause, EDL-1.0, EPL-2.0, GPL-2.0-with-classpath-exception, MIT, Public-Domain, W3C:

  • org.glassfish.jersey.containers:jersey-container-servlet-core:jar:2.34
  • org.glassfish.jersey.core:jersey-client:jar:2.34
  • org.glassfish.jersey.core:jersey-common:jar:2.34
  • org.glassfish.jersey.core:jersey-server:jar:2.34
  • org.glassfish.jersey.containers:jersey-container-servlet:jar:2.34

With Apache-2.0 and GPL-2.0

  • org.rocksdb:rocksdbjni:jar:6.20.3
  • net.java.dev.jna:jna:jar:4.1.0
  • net.java.dev.jna:jna-platform:jar:4.1.0
  • org.javassist:javassist:jar:3.25.0-GA

BSD-2-Clause

  • com.github.luben:zstd-jni:jar:1.5.0-4
  • javolution:javolution:jar:5.5.1
  • jline:jline:jar:2.12
  • org.jodd:jodd-core:jar:3.5.2

BSD-3-Clause

  • com.esotericsoftware:kryo-shaded:jar:4.0.2
  • com.esotericsoftware:minlog:jar:1.3.0
  • com.google.protobuf:protobuf-java:jar:2.5.0
  • com.google.protobuf:protobuf-java:jar:3.23.4
  • com.thoughtworks.paranamer:paranamer:jar:2.3
  • com.thoughtworks.paranamer:paranamer:jar:2.8
  • io.glutenproject:protobuf-java:jar:3.23.4-0
  • io.glutenproject:protobuf-java-util:jar:3.23.4-0
  • net.sf.py4j:py4j:jar:0.10.9.5
  • org.abego.treelayout:org.abego.treelayout.core:jar:1.0.3
  • org.antlr:antlr4:jar:4.9.2
  • org.antlr:antlr4-runtime:jar:4.8
  • org.antlr:antlr4-runtime:jar:4.8
  • org.antlr:antlr-runtime:jar:3.5.2
  • org.antlr:antlr-runtime:jar:3.5.2
  • org.antlr:ST4:jar:4.0.4
  • org.antlr:ST4:jar:4.3
  • org.codehaus.janino:commons-compiler:jar:3.0.16
  • org.codehaus.janino:janino:jar:3.0.16
  • org.fusesource.leveldbjni:leveldbjni-all:jar:1.8
  • org.hamcrest:hamcrest-core:jar:1.3
  • org.scalacheck:scalacheck_2.12:jar:1.13.5
  • org.scala-sbt-interface:jar:1.0
  • org.threeten:threeten-extra:jar:1.5.0

CDDL-1.1

  • javax.activation:activation:jar:1.1
  • javax.activation:activation:jar:1.1.1
  • javax.servlet:servlet-api:jar:2.5
  • javax.transaction:jta:jar:1.1
  • javax.transaction:transaction-api:jar:1.1

CDDL-1.1 with dual license

With CDDL-1.1 and GPL-2.0:

  • com.sun.jersey:jersey-client:jar:1.9
  • com.sun.jersey:jersey-core:jar:1.9
  • javax.servlet.jsp:jsp-api:jar:2.1
  • javax.xml.bind:jaxb-api:jar:2.2.2
  • org.glassfish:javax.json:jar:1.0.4
  • javax.xml.bind:jaxb-api:jar:2.2.11

EPL-1.0

  • junit:junit:jar:4.13.1

EPL-2.0 with dual license

With EPL-2.0 and GPL-2.0-with-classpath-exception:

  • jakarta.annotation:jakarta.annotation-api:jar:1.3.5
  • jakarta.servlet:jakarta.servlet-api:jar:4.0.3
  • jakarta.ws.rs:jakarta.ws.rs-api:jar:2.1.6

ICU

  • com.ibm.icu:icu4j:jar:61.1

MIT

  • net.razorvine:pyrolite:jar:4.30
  • org.checkerframework:checker-qual:jar:3.19.0
  • org.codehaus.mojo:animal-sniffer-annotations:jar:1.14
  • org.kohsuke:github-api:jar:1.117
  • org.mockito:mockito-core:jar:2.23.4
  • xmlenc:xmlenc:jar:0.52

Public Domain

  • org.tukaani:xz:jar:1.0
  • org.tukaani:xz:jar:1.8

W3C

  • org.w3c.css:sac:jar:1.3

 
