Abstract

StreamPark is a streaming application development platform. Aimed at making it easy to build and manage streaming applications, StreamPark provides scaffolding for writing stream processing logic with Apache Flink and Apache Spark. StreamPark also provides a dashboard for controlling and monitoring streaming tasks. It was initially known as StreamX and was renamed to StreamPark in August 2022.

Proposal

StreamPark was created to make stream processing more straightforward and is positioned as a stream processing development scaffold and operation platform. StreamPark provides a set of APIs and connectors for rapidly developing Flink and Spark applications. It integrates compilation, publishing, deployment, and monitoring. With StreamPark, streaming applications can be deployed on YARN or Kubernetes easily.

To sum up, StreamPark standardizes configuration, development, testing, deployment, monitoring, and O&M experience.

We are actively running the StreamPark community and look forward to organizing more community events.

Based on community consensus, we expect to donate the StreamPark codebase to the Apache Software Foundation. We believe that introducing StreamPark to the ASF and following the Apache Way will continue to improve the quality of the project and the vitality of the community.

The community voted on submitting this proposal to the Incubator. See: https://github.com/streamxhub/streampark/issues/1345

Background

Apache Flink and Apache Spark are widely used stream computing engines. Drawing on extensive experience and best practices, we extracted the task deployment and runtime parameters into configuration files. As a result, an easy-to-use RuntimeContext with out-of-the-box connectors brings a more accessible and efficient task development experience. It reduces the learning cost and development barriers, so developers can focus on the business logic.

On the other hand, it can be challenging for enterprises to use Flink and Spark if there is no professional management platform for Flink and Spark tasks during the deployment phase. StreamPark provides a professional task management platform, including task development, scheduling, interactive query, deployment, operation, maintenance, etc.

In 2021, streamxhub open-sourced StreamPark. Huajie Wang (benjobs@apache.org) created streamxhub as a personal organization, and StreamPark started as a personal side project. Therefore, there is no need to sign an SGA.

Currently, StreamPark is running in the production environments of several users and has been widely recognized and appreciated by those users.

Rationale

StreamPark abstracts the environment and program parameters of task development and deployment in a convention-over-configuration manner to enable low-code development. It initializes a runtime environment and context and combines them with a series of connectors to simplify development. As a task management platform, StreamPark is a JVM-based streaming data management platform.
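
The convention-over-configuration idea described above can be sketched roughly as follows. This is only an illustrative sketch, not StreamPark's actual API or configuration format: the `DEFAULTS` keys and the `resolve_config` helper are hypothetical names introduced here. The point it shows is that a task only states the parameters that differ from the conventional defaults.

```python
from copy import deepcopy

# Hypothetical conventional defaults. These keys are illustrative only
# and are not taken from StreamPark's real configuration schema.
DEFAULTS = {
    "deployment": {"target": "yarn", "parallelism": 1},
    "checkpoint": {"enabled": True, "interval_ms": 60000},
}


def resolve_config(overrides, defaults=DEFAULTS):
    """Recursively overlay user overrides on top of the defaults.

    The defaults are copied first, so the shared DEFAULTS dict is
    never mutated.
    """
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested section: merge key by key instead of replacing it.
            merged[key] = resolve_config(value, merged[key])
        else:
            merged[key] = value
    return merged


# A task that only deviates from convention in its deployment target.
user_config = {"deployment": {"target": "kubernetes"}}
config = resolve_config(user_config)
print(config["deployment"])
print(config["checkpoint"])
```

Under these assumptions, the task definition stays short because everything it does not mention falls back to the convention.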

Current Status

Meritocracy:

The StreamPark project started in 2019 and was open-sourced on GitHub on April 12, 2021. The project now has contributors from dozens of companies.

We have been learning and practicing the "Apache Way" to run our project. Users and contributors are welcomed and respected. We use rewards to encourage them to participate in the community and to provide quality patches and support that move the project forward. We also encourage non-code contributions (documentation, events, community management, etc.). Those who provide high-quality contributions will be encouraged to become committers.

Users:

So far, we have accumulated a few users, and the cumulative download count exceeds 5,000. Representative users include Baidu, China Unicom, Ziroom, Yonghui Supermarket, InMobi, YTO Express, and others.

Community:

StreamPark has built an open-source community with 52 developers and released over ten versions in the past year.

Core Developers:

The core developers are all experienced open-source developers. They have operated the StreamPark community for over one year.

Alignment:

StreamPark works well with Flink, Spark, and many other Apache projects. The codebase of StreamPark is already under the Apache License 2.0. The community has been learning and practicing the Apache Way since its establishment, and our current core developers all have experience working on various Apache projects.

Known Risks

Project Name:

We have checked and believe the name StreamPark is suitable. A patent inquiry found no other projects using this name.

Orphaned Products:

A few users have already deployed StreamPark in production environments. The developers and community maintain a healthy development routine, and the risk of the project being abandoned is minimal. We are actively operating the community and will continue to increase its vitality to attract more contributors.

Inexperience with Open Source:

Many StreamPark contributors have prior experience working on open-source projects and are also active committers and contributors to other Apache projects.

Length of Incubation:

We expect to enter incubation in two months and graduate in about two years.

Homogenous Developers:

The contributors are from various organizations, including Baidu, China Unicom, Ziroom, etc. At this stage, we admit that the StreamPark community is not diverse enough. We need to pay more attention to creating a more diverse community by nominating committers based on their contributions to the project.

Reliance on Salaried Developers:

Most of the developers are paid by their employers to contribute to this project. Given the sense of ownership that some volunteer developers and committers have for the code, the project could continue even if no salaried developers contributed to it.

Relationships with Other Apache Products:

We have integrated with Apache Spark, Apache Flink, and Apache Commons. We plan to improve ecosystem integration with other Apache projects (mainly big data projects).

An Excessive Fascination with the Apache Brand:

We are interested in joining the ASF to increase our connections in the open-source world. Through extensive collaboration, it is possible to build a community of developers and committers that outlives the founder. Also, the Apache brand can help more organizations use StreamPark with confidence.

Documentation

StreamPark has an official website. The full documentation is currently available only in Chinese, and the English version is under rapid development.

Initial Source

The project consists of two repositories: the core modules and the official website. Each of them is available on GitHub.

Code: https://github.com/streamxhub/streampark

Document: https://github.com/streamxhub/streampark-website

Source and Intellectual Property Submission Plan

Once StreamPark is approved to join the Apache Incubator, the initial committers will submit iCLAs. The code is already licensed under the Apache License 2.0.

Since no organization owns StreamPark, there's no entity to sign the SGA. We will ask the top 20 contributors to sign iCLAs for IP clearance.

External Dependencies:

Backend:

  • Apache 2.0

    • com.baomidou:dynamic-datasource-spring-boot-starter
    • com.baomidou:mybatis-plus
    • com.baomidou:mybatis-plus-annotation
    • com.baomidou:mybatis-plus-boot-starter
    • com.baomidou:mybatis-plus-core
    • com.baomidou:mybatis-plus-extension
    • com.fasterxml:classmate
    • com.fasterxml.jackson.core:jackson-annotations
    • com.fasterxml.jackson.core:jackson-core
    • com.fasterxml.jackson.core:jackson-databind
    • com.fasterxml.jackson.datatype:jackson-datatype-jdk8
    • com.fasterxml.jackson.datatype:jackson-datatype-jsr310
    • com.fasterxml.jackson.module:jackson-module-parameter-names
    • com.fasterxml.jackson.module:jackson-module-scala_2.12
    • com.fasterxml.woodstox:woodstox-core
    • com.github.ben-manes.caffeine:caffeine
    • com.github.docker-java:docker-java-api
    • com.github.docker-java:docker-java-core
    • com.github.docker-java:docker-java-transport
    • com.github.docker-java:docker-java-transport-httpclient5
    • com.github.jsqlparser:jsqlparser
    • com.github.stephenc.jcip:jcip-annotations
    • com.github.xiaoymin:swagger-bootstrap-ui
    • com.google.code.findbugs:jsr305
    • com.google.code.gson:gson
    • com.google.errorprone:error_prone_annotations
    • com.google.guava:failureaccess
    • com.google.guava:guava
    • com.google.guava:listenablefuture
    • com.google.inject:guice
    • com.google.inject.extensions:guice-servlet
    • com.google.j2objc:j2objc-annotations
    • com.googlecode.javaewah:JavaEWAH
    • com.jamesmurty.utils:java-xmlbuilder
    • com.jayway.jsonpath:json-path
    • com.nimbusds:nimbus-jose-jwt
    • com.squareup.okhttp:okhttp
    • com.squareup.okio:okio
    • com.twitter:chill-java
    • com.twitter:chill_2.12
    • com.vaadin.external.google:android-json
    • com.zaxxer:HikariCP
    • commons-beanutils:commons-beanutils
    • commons-cli:commons-cli
    • commons-codec:commons-codec
    • commons-collections:commons-collections
    • commons-configuration:commons-configuration
    • commons-daemon:commons-daemon
    • commons-digester:commons-digester
    • commons-io:commons-io
    • commons-lang:commons-lang
    • commons-logging:commons-logging
    • commons-net:commons-net
    • io.netty:netty
    • io.netty:netty-all
    • io.netty:netty-buffer
    • io.netty:netty-codec
    • io.netty:netty-codec-dns
    • io.netty:netty-codec-haproxy
    • io.netty:netty-codec-http
    • io.netty:netty-codec-http2
    • io.netty:netty-codec-memcache
    • io.netty:netty-codec-mqtt
    • io.netty:netty-codec-redis
    • io.netty:netty-codec-smtp
    • io.netty:netty-codec-socks
    • io.netty:netty-codec-stomp
    • io.netty:netty-codec-xml
    • io.netty:netty-common
    • io.netty:netty-handler
    • io.netty:netty-handler-proxy
    • io.netty:netty-resolver
    • io.netty:netty-resolver-dns
    • io.netty:netty-resolver-dns-classes-macos
    • io.netty:netty-resolver-dns-native-macos
    • io.netty:netty-transport
    • io.netty:netty-transport-classes-epoll
    • io.netty:netty-transport-classes-kqueue
    • io.netty:netty-transport-native-epoll
    • io.netty:netty-transport-native-kqueue
    • io.netty:netty-transport-native-unix-common
    • io.netty:netty-transport-rxtx
    • io.netty:netty-transport-sctp
    • io.netty:netty-transport-udt
    • io.springfox:springfox-core
    • io.springfox:springfox-schema
    • io.springfox:springfox-spi
    • io.springfox:springfox-spring-web
    • io.springfox:springfox-swagger-common
    • io.springfox:springfox-swagger-ui
    • io.springfox:springfox-swagger2
    • io.swagger:swagger-annotations
    • io.swagger:swagger-models
    • io.undertow:undertow-core
    • io.undertow:undertow-servlet
    • io.undertow:undertow-websockets-jsr
    • jakarta.validation:jakarta.validation-api
    • javax.inject:javax.inject
    • log4j:log4j
    • net.bytebuddy:byte-buddy
    • net.bytebuddy:byte-buddy-agent
    • net.java.dev.jets3t:jets3t
    • net.java.dev.jna:jna
    • net.minidev:accessors-smart
    • net.minidev:json-smart
    • org.apache.avro:avro
    • org.apache.commons:commons-compress
    • org.apache.commons:commons-email
    • org.apache.commons:commons-lang3
    • org.apache.commons:commons-math3
    • org.apache.curator:curator-client
    • org.apache.curator:curator-framework
    • org.apache.curator:curator-recipes
    • org.apache.directory.api:api-asn1-api
    • org.apache.directory.api:api-util
    • org.apache.directory.server:apacheds-i18n
    • org.apache.directory.server:apacheds-kerberos-codec
    • org.apache.flink:flink-annotations
    • org.apache.flink:flink-clients_2.12
    • org.apache.flink:flink-core
    • org.apache.flink:flink-file-sink-common
    • org.apache.flink:flink-hadoop-fs
    • org.apache.flink:flink-java
    • org.apache.flink:flink-kubernetes_2.12
    • org.apache.flink:flink-metrics-core
    • org.apache.flink:flink-optimizer
    • org.apache.flink:flink-queryable-state-client-java
    • org.apache.flink:flink-rpc-akka-loader
    • org.apache.flink:flink-rpc-core
    • org.apache.flink:flink-runtime
    • org.apache.flink:flink-scala_2.12
    • org.apache.flink:flink-shaded-asm-7
    • org.apache.flink:flink-shaded-force-shading
    • org.apache.flink:flink-shaded-guava
    • org.apache.flink:flink-shaded-jackson
    • org.apache.flink:flink-shaded-netty
    • org.apache.flink:flink-shaded-zookeeper-3
    • org.apache.flink:flink-streaming-java_2.12
    • org.apache.hadoop:hadoop-annotations
    • org.apache.hadoop:hadoop-auth
    • org.apache.hadoop:hadoop-common
    • org.apache.hadoop:hadoop-hdfs
    • org.apache.hadoop:hadoop-hdfs-client
    • org.apache.hadoop:hadoop-mapreduce-client-core
    • org.apache.hadoop:hadoop-yarn-api
    • org.apache.hadoop:hadoop-yarn-client
    • org.apache.hadoop:hadoop-yarn-common
    • org.apache.htrace:htrace-core4
    • org.apache.httpcomponents:httpclient
    • org.apache.httpcomponents:httpcore
    • org.apache.httpcomponents.client5:httpclient5
    • org.apache.httpcomponents.client5:httpclient5-fluent
    • org.apache.httpcomponents.core5:httpcore5
    • org.apache.httpcomponents.core5:httpcore5-h2
    • org.apache.ivy:ivy
    • org.apache.maven:maven-aether-provider
    • org.apache.maven:maven-artifact
    • org.apache.maven:maven-builder-support
    • org.apache.maven:maven-core
    • org.apache.maven:maven-model
    • org.apache.maven:maven-model-builder
    • org.apache.maven:maven-plugin-api
    • org.apache.maven:maven-repository-metadata
    • org.apache.maven:maven-settings
    • org.apache.maven:maven-settings-builder
    • org.apache.maven.plugins:maven-shade-plugin
    • org.apache.maven.shared:maven-artifact-transfer
    • org.apache.maven.shared:maven-common-artifact-filters
    • org.apache.maven.shared:maven-dependency-tree
    • org.apache.maven.shared:maven-shared-utils
    • org.apache.shiro:shiro-cache
    • org.apache.shiro:shiro-config-core
    • org.apache.shiro:shiro-config-ogdl
    • org.apache.shiro:shiro-core
    • org.apache.shiro:shiro-crypto-cipher
    • org.apache.shiro:shiro-crypto-core
    • org.apache.shiro:shiro-crypto-hash
    • org.apache.shiro:shiro-event
    • org.apache.shiro:shiro-lang
    • org.apache.shiro:shiro-spring
    • org.apache.shiro:shiro-web
    • org.apache.tomcat.embed:tomcat-embed-el
    • org.apache.yetus:audience-annotations
    • org.apache.zookeeper:zookeeper
    • org.apiguardian:apiguardian-api
    • org.assertj:assertj-core
    • org.codehaus.jackson:jackson-core-asl
    • org.codehaus.jackson:jackson-jaxrs
    • org.codehaus.jackson:jackson-mapper-asl
    • org.codehaus.jackson:jackson-xc
    • org.codehaus.jettison:jettison
    • org.codehaus.plexus:plexus-classworlds
    • org.codehaus.plexus:plexus-component-annotations
    • org.codehaus.plexus:plexus-interpolation
    • org.codehaus.plexus:plexus-utils
    • org.freemarker:freemarker
    • org.hibernate.validator:hibernate-validator
    • org.javassist:javassist
    • org.jboss.logging:jboss-logging
    • org.jboss.threads:jboss-threads
    • org.jboss.xnio:xnio-api
    • org.jboss.xnio:xnio-nio
    • org.json4s:json4s-ast_2.12
    • org.json4s:json4s-core_2.12
    • org.json4s:json4s-jackson_2.12
    • org.json4s:json4s-scalap_2.12
    • org.lz4:lz4-java
    • org.mapstruct:mapstruct
    • org.mortbay.jetty:jetty
    • org.mortbay.jetty:jetty-sslengine
    • org.mortbay.jetty:jetty-util
    • org.mybatis:mybatis
    • org.mybatis:mybatis-spring
    • org.objenesis:objenesis
    • org.opentest4j:opentest4j
    • org.quartz-scheduler:quartz
    • org.scala-lang.modules:scala-xml_2.12
    • org.scalaj:scalaj-http_2.12
    • org.skyscreamer:jsonassert
    • org.slf4j:jcl-over-slf4j
    • org.slf4j:log4j-over-slf4j
    • org.sonatype.aether:aether-api
    • org.sonatype.aether:aether-impl
    • org.sonatype.aether:aether-spi
    • org.sonatype.aether:aether-util
    • org.sonatype.plexus:plexus-cipher
    • org.sonatype.plexus:plexus-sec-dispatcher
    • org.sonatype.sisu:sisu-guice
    • org.sonatype.sisu:sisu-inject-bean
    • org.sonatype.sisu:sisu-inject-plexus
    • org.springframework:spring-aop
    • org.springframework:spring-beans
    • org.springframework:spring-context
    • org.springframework:spring-context-support
    • org.springframework:spring-core
    • org.springframework:spring-expression
    • org.springframework:spring-jcl
    • org.springframework:spring-jdbc
    • org.springframework:spring-messaging
    • org.springframework:spring-test
    • org.springframework:spring-tx
    • org.springframework:spring-web
    • org.springframework:spring-webmvc
    • org.springframework:spring-websocket
    • org.springframework.boot:spring-boot
    • org.springframework.boot:spring-boot-autoconfigure
    • org.springframework.boot:spring-boot-starter
    • org.springframework.boot:spring-boot-starter-aop
    • org.springframework.boot:spring-boot-starter-cache
    • org.springframework.boot:spring-boot-starter-jdbc
    • org.springframework.boot:spring-boot-starter-json
    • org.springframework.boot:spring-boot-starter-quartz
    • org.springframework.boot:spring-boot-starter-test
    • org.springframework.boot:spring-boot-starter-undertow
    • org.springframework.boot:spring-boot-starter-validation
    • org.springframework.boot:spring-boot-starter-web
    • org.springframework.boot:spring-boot-starter-websocket
    • org.springframework.boot:spring-boot-test
    • org.springframework.boot:spring-boot-test-autoconfigure
    • org.springframework.plugin:spring-plugin-core
    • org.springframework.plugin:spring-plugin-metadata
    • org.vafer:jdependency
    • org.wildfly.client:wildfly-client-config
    • org.wildfly.common:wildfly-common
    • org.xerial.snappy:snappy-java
    • org.xmlunit:xmlunit-core
    • org.yaml:snakeyaml
    • p6spy:p6spy
    • xerces:xercesImpl
    • xml-apis:xml-apis
  • Public Domain

    • aopalliance:aopalliance
  • MIT

    • com.auth0:java-jwt
    • com.beachape:enumeratum-macros_2.12
    • com.beachape:enumeratum_2.12
    • org.checkerframework:checker-qual
    • org.mockito:mockito-core
    • org.mockito:mockito-junit-jupiter
    • org.projectlombok:lombok
    • org.slf4j:slf4j-api
  • New BSD

    • com.esotericsoftware.kryo:kryo
    • com.esotericsoftware.minlog:minlog
  • BSD

    • asm:asm
    • com.google.protobuf:protobuf-java
    • com.jcraft:jsch
    • com.thoughtworks.paranamer:paranamer
    • org.codehaus.woodstox:stax2-api
    • org.fusesource.leveldbjni:leveldbjni-all
    • org.hamcrest:hamcrest
    • org.hamcrest:hamcrest-core
    • org.scala-lang:scala-compiler
    • org.scala-lang:scala-library
    • org.scala-lang:scala-reflect
    • org.ow2.asm:asm
    • org.ow2.asm:asm-analysis
    • org.ow2.asm:asm-commons
    • org.ow2.asm:asm-tree
    • org.ow2.asm:asm-util
    • org.owasp.encoder:encoder
    • xmlenc:xmlenc
  • EDL

    • jakarta.activation:jakarta.activation-api
    • jakarta.xml.bind:jakarta.xml.bind-api
    • org.eclipse.jgit:org.eclipse.jgit
  • EPL 1.0

    • jakarta.websocket:jakarta.websocket-api
    • junit:junit
    • org.eclipse.aether:aether-api
    • org.eclipse.aether:aether-connector-basic
    • org.eclipse.aether:aether-impl
    • org.eclipse.aether:aether-spi
    • org.eclipse.aether:aether-transport-file
    • org.eclipse.aether:aether-transport-http
    • org.eclipse.aether:aether-util
  • EPL 2.0

    • jakarta.annotation:jakarta.annotation-api
    • jakarta.servlet:jakarta.servlet-api
    • org.aspectj:aspectjweaver
    • org.junit.jupiter:junit-jupiter
    • org.junit.jupiter:junit-jupiter-api
    • org.junit.jupiter:junit-jupiter-engine
    • org.junit.jupiter:junit-jupiter-params
    • org.junit.platform:junit-platform-commons
    • org.junit.platform:junit-platform-engine
  • LGPL

    • com.github.spotbugs:spotbugs-annotations
  • LGPL 2.1

    • org.javassist:javassist
    • org.codehaus.jackson:jackson-xc
    • org.codehaus.jackson:jackson-jaxrs
    • net.java.dev.jna:jna
    • com.github.jsqlparser:jsqlparser
  • CDDL 1.1,GPL 2

    • com.sun.jersey:jersey-client
    • com.sun.jersey:jersey-core
    • com.sun.jersey:jersey-json
    • com.sun.jersey:jersey-server
    • com.sun.jersey.contribs:jersey-guice
    • com.sun.mail:javax.mail
    • com.sun.xml.bind:jaxb-impl
    • javax.activation:javax.activation-api
    • jakarta.servlet:jakarta.servlet-api
    • javax.mail:mail
    • javax.xml.bind:jaxb-api
  • CDDL

    • javax.activation:activation
    • javax.servlet:servlet-api
    • javax.servlet.jsp:jsp-api
  • Eclipse Public License 2.0

    • jakarta.websocket:jakarta.websocket-api

Frontend:

  • Apache 2.0
    • stompjs
  • MIT
    • axios
    • moment
    • ant-design-vue
    • lodash
    • monaco-editor
    • sql-formatter
    • vue
    • vuex
    • vue-i18n
    • xterm
  • BSD
    • highlight.js

Cryptography:

N/A

Required Resources

Mailing lists:

Git Repositories:

From https://github.com/streamxhub/streampark

From https://github.com/streamxhub/streampark-website

Issue Tracking:

The community would like to continue using GitHub Issues.

Other Resources:

The community has already chosen GitHub Actions as its continuous integration tool.

Initial Committers

@benjobs has asked all StreamPark contributors (59 members) whether they want to act as initial committers. So far, the following seven, whose contributions are highly remarkable, have expressed interest.

Sponsors

Champion:

  • tison [tison@apache.org]

Nominated Mentors:

  • tison [tison@apache.org]
  • Willem Ning Jiang [ningjiang@apache.org]
  • Stephan Ewen [sewen@apache.org]
  • Thomas Weise [thw@apache.org]
  • Duo Zhang [zhangduo@apache.org]

Sponsoring Entity:

We expect the Apache Incubator to sponsor this project.
