DistributedCompute-OpenSource-List

Distributed System OpenSource List

DSL

  • Apache Beam #Project#: Apache Beam is a unified model for defining both batch and streaming data-parallel processing pipelines, as well as a set of language-specific SDKs for constructing pipelines and Runners for executing them on distributed processing backends, including Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow.

Message-Oriented Middleware

  • 2017-Sandglass #Project#: Sandglass is a distributed, horizontally scalable, persistent, time sorted message queue.
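  • 2018-PhxQueue #Project#: PhxQueue is a high-availability, high-throughput, highly reliable distributed queue open-sourced by WeChat and built on the Paxos protocol. It guarantees At-Least-Once Delivery and is widely used inside WeChat to support key services such as WeChat Pay and the Official Accounts Platform.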
  • 2018-Jocko #Project#: Kafka implemented in Golang with built-in coordination (No ZK dep, single binary install, Cloud Native)
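  • 2018-QMQ #Project#: QMQ is the message middleware widely used inside Qunar. Since its creation in 2012 it has been applied across all of Qunar's business scenarios, including order scenarios closely tied to transactions as well as high-throughput scenarios such as quote search.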
  • 2018-NATS #Project#: High-Performance server for NATS, the cloud native messaging system.
  • 2018-Waltz #Project#: Waltz is a quorum-based distributed write-ahead log for replicating transactions.
  • 2018-TubeMQ #Project#: TubeMQ focuses on high-performance storage and transmission of massive data in big data scenarios.
  • Disque #Project#: Disque is a distributed message broker.

Kafka

  • Kowl #Project#: Kowl (previously known as Kafka Owl) is a web application that helps you to explore messages in your Apache Kafka cluster and get better insights on what is actually happening in your Kafka cluster in the most comfortable way.

Processing Engine

  • hazelcast-jet #Project#: A general purpose distributed data processing engine, built on top of Hazelcast.
  • Flink #Project#: Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities.
  • Wallaroo #Project#: Wallaroo is a fast, elastic data processing engine that rapidly takes you from prototype to production by eliminating infrastructure complexity.
  • Bigslice #Project#: Bigslice is a system for fast, large-scale, serverless data processing using Go.

Edge Computing

  • OpenEdge #Project#: Extend cloud computing, data and service seamlessly to edge devices.
  • KubeEdge #Project#: KubeEdge is built upon Kubernetes and extends native containerized application orchestration and device management to hosts at the Edge.

Job Scheduler

  • 2017-Elastic Job #Project#: Elastic-Job is a distributed scheduled job solution. Elastic-Job is composed of two independent subprojects: Elastic-Job-Lite and Elastic-Job-Cloud.
  • TaskBotJS #Project#: The best JavaScript/TypeScript job processing framework on the planet.
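  • SIA #Project#: SIA is short for Simple is Awesome, our company's basic development platform. SIA-TASK (a microservice task scheduling platform) is one of its key products. SIA-TASK fits the current microservice architecture and is cross-platform, orchestratable, highly available, non-intrusive, consistent, asynchronous and parallel, dynamically scalable, and monitored in real time.
  • 2018-XXL-JOB #Project#: A distributed task scheduling framework.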
  • 2019-Mantis #Project#: A platform that makes it easy for developers to build realtime, cost-effective, operations-focused applications.
  • 2019-Ofelia #Project#: Ofelia is a modern, low-footprint job scheduler for Docker environments, built on Go. Ofelia aims to be a replacement for the old-fashioned cron.
  • 2020-ballista #Project#: Distributed compute platform implemented in Rust, and powered by Apache Arrow.

DAG Workflow

  • 2018-Conductor #Project#: Conductor is a Workflow Orchestration engine that runs in the cloud.
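  • 2019-Dolphin Scheduler #Project#: Dolphin Scheduler is a distributed, easily extensible visual DAG workflow scheduling system. It is dedicated to solving complex dependencies in data processing so that the scheduling system works out of the box.
  • 2020-PowerJob #Project#: A new-generation distributed task scheduling and computing framework. It supports scheduling strategies such as CRON, API, fixed rate, and fixed delay, and provides workflows to orchestrate tasks and resolve their dependencies. It is easy to use, powerful, and well documented, and everyone is welcome to adopt it.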