Please see Samza's new wiki.

Stream Processing

MillWheel: Fault-Tolerant Stream Processing at Internet Scale

Photon: Fault-tolerant and Scalable Joining of Continuous Data Streams

Scalable Distributed Stream Processing

Distributed Operation in the Borealis Stream Processing Engine

STREAM: The Stanford Data Stream Management System

Fault-tolerance and high availability in data stream management systems

Highly Available, Fault-Tolerant, Parallel Dataflows

MapReduce Online

Discretized Streams: Fault-Tolerant Streaming Computation at Scale

Towards a Streaming SQL Standard

Semantics and Implementation of Continuous Sliding Window Queries over Data Streams

S-Store: A Streaming NewSQL System for Big Velocity Applications

The 8 Requirements of Real-Time Stream Processing

Models and Issues in Data Stream Systems

S4: Distributed Stream Computing Platform

Muppet: MapReduce-Style Processing of Fast Data

Streaming Algorithms

HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm

HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm

Data Streams as Random Permutations: the Distinct Element Problem

Approximately Detecting Duplicates for Streaming Data using Stable Bloom Filters

An Improved Data Stream Summary: The Count-Min Sketch and its Applications

Fast Incremental Maintenance of Approximate Histograms

Effective Computation of Biased Quantiles over Data Streams

Dynamic Histograms: Capturing Evolving Data Sets

Incremental calculation of weighted mean and variance

A curated collection of papers on streaming algorithms

Distributed Systems

Blazes: Coordination Analysis for Distributed Programs

StreamProcessingPapers (last edited 2015-03-02 19:08:19 by ChrisRiccomini)