(This page is still under construction)

Streaming 2.0

Streaming protocol is re-designed from ground up in Apache Cassandra 2.0. Here is the overview of the protocol design and implementation.

Design goal

Highlight

Stream Plan

Unlike the previous version, which performs sending and receiving data separately from each other and from the operation, Streaming 2.0 groups the related stream sessions under the same "Stream Plan".

Stream Plan for repair, bootstrap, bulkload, etc.
 |- Stream session with Endpoint 1
      |- Stream receiving tasks
      |- Stream transfer tasks
 |
 |- Stream session with Endpoint 2
 .
 .
 .

File transfer and messages

Streaming message and file exchange are pipelined on the same, persistent tcp connection.

Stream event support

Finer grained event notification. With JMX notification support, even external client can listen on event.

API

Public APIs

Basic API usage is as follows:

   1 // Start building your streaming plan
   2 StreamPlan bulkloadPlan = new StreamPlan("Bulkload");
   3 // Add transfer files tasks for each destination
   4 for (InetAddress remote : remoteTargets)
   5     bulkloadPlan.transferFiles(remote, ranges, sstables);
   6 // Execute your plan
   7 StreamResultFuture result = bulkloadPlan.execute();
   8 try
   9 {
  10     // ... and wait for streaming completes
  11     result.get();
  12     // all streaming success!
  13 }
  14 catch (Exception e)
  15 {
  16     // some stream failed
  17 }

Alternatively, StreamResultFuture implements guava's ListenableFuture<StreamState>, So you can use FutureCallback<StreamState> to capture stream success and failure.

   1 Futures.addCallback(result, new FutureCallback<StreamState>()
   2 {
   3     public void onSuccess(StreamState result)
   4     {
   5         // Yes, we did it!
   6     }
   7 
   8     public void onFailure(Throwable t)
   9     {
  10         // O_o something goes wrong
  11     }
  12 });

You can add event listener to StreamResultFuture for stream events:

   1 StreamResultFuture result = bulkloadPlan.execute();
   2 result.addEventListener(new StreamEventHandler()
   3 {
   4     public void handleEvent(StreamEvent event)
   5     {
   6         // streaming completed
   7     }
   8 });

Internal APIs

Stream session

Stream session handles the streaming part of one of more SSTables to and from a specific remote node. Both this node and the remote one will create a similar symmetrical StreamSession. A streaming session has the following life-cycle:

1. Connections Initialization

2. Streaming preparation phase

3. Streaming phase

4. Completion phase

Events

StreamResultFuture emits StreamEvent at the following cases:

To listen to StreamEvent, implement StreamEventHandler and register handler to StreamResultFuture.

JMX support

JMX support is provided through StreamingManager MBean. You can get list of streaming states of all currently running stream plans. It also provides JMX Notification support so that you can subscribe to stream events above through JMX interface.

stats

Streaming2 (last edited 2013-11-15 01:03:04 by GehrigKunz)