cascading.cascade
Class Cascade

java.lang.Object
  extended by cascading.cascade.Cascade
All Implemented Interfaces:
Runnable

public class Cascade
extends Object
implements Runnable

A Cascade is an assembly of Flow instances that share or depend on equivalent Tap instances and are executed as a single group. The most common case is where one Flow instance depends on a Tap created by a second Flow instance. This dependency chain can continue as practical.

Note: Flow instances that have no shared dependencies will be executed in parallel.
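The dependency rule described above can be sketched with a small stand-alone model. This is not the Cascading API: FlowSketch and DependencySketch are hypothetical names, and taps are reduced to plain string identifiers. A flow is treated as depending on another flow when it reads a tap that the other flow writes. Flows that end up with no upstream entries and no dependents are the ones that could run in parallel.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a Flow: only the tap identifiers it reads and writes.
final class FlowSketch {
    final String name;
    final Set<String> sources;
    final Set<String> sinks;

    FlowSketch(String name, List<String> sources, List<String> sinks) {
        this.name = name;
        this.sources = new HashSet<>(sources);
        this.sinks = new HashSet<>(sinks);
    }
}

final class DependencySketch {
    // Maps each flow name to the names of the flows whose sinks it reads from.
    static Map<String, Set<String>> dependencies(List<FlowSketch> flows) {
        Map<String, Set<String>> deps = new LinkedHashMap<>();
        for (FlowSketch consumer : flows) {
            Set<String> upstream = new TreeSet<>();
            for (FlowSketch producer : flows) {
                // A shared tap between a producer's sinks and a consumer's sources is a dependency.
                if (producer != consumer
                        && !Collections.disjoint(producer.sinks, consumer.sources)) {
                    upstream.add(producer.name);
                }
            }
            deps.put(consumer.name, upstream);
        }
        return deps;
    }
}
```

In this model, a flow that reads the tap "clean" depends on whichever flow writes "clean", while a flow that touches only unrelated taps has an empty dependency set.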

Additionally, a Cascade allows for incremental builds of complex data processing workflows. If a given source Tap is newer than a subsequent sink Tap in the assembly, the connecting Flow(s) will be executed when the Cascade is executed. If all the targets (sinks) are up to date, the Cascade exits immediately and does nothing.

The concept of 'stale' is pluggable; see the FlowSkipStrategy class.
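As a rough illustration of the default notion of 'stale', the check can be modeled as a comparison of modification times. This is a minimal sketch, not the library's implementation: StalenessSketch is a hypothetical name, timestamps are plain long values, and treating a negative sink timestamp as "sink does not exist yet" is an assumption made only for this example.

```java
final class StalenessSketch {
    // Returns true when the sink should be rebuilt: it is missing (negative
    // timestamp, an assumption of this sketch) or at least one source is newer.
    static boolean isSinkStale(long sinkModified, long... sourceModified) {
        if (sinkModified < 0) {
            return true; // no sink yet, so there is nothing up to date
        }
        for (long source : sourceModified) {
            if (source > sinkModified) {
                return true; // a source changed after the sink was written
            }
        }
        return false; // every source is older than or as old as the sink
    }
}
```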

When a Cascade starts up, it first verifies which Flow instances have stale sinks. If the sinks are stale, the method Flow.deleteSinksIfNotAppend() is called. Before appends were supported (logically), the Cascade deleted all the sinks in a Flow.

One consequence of this is that if the Cascade fails, but does complete a Flow that appended data, re-running the Cascade (and the successful append Flow) will re-append that data to the Flow's sink. Some systems may be idempotent and have no side effects, but plan accordingly.
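The re-run hazard can be seen in a toy model where the sink is an in-memory list. AppendRerunSketch and both of its methods are hypothetical and unrelated to the Cascading API. The first method appends blindly, so a second run duplicates records. The second method is one possible idempotent variant, assuming records can be compared for equality.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class AppendRerunSketch {
    // Appends every record, so a re-run duplicates what an earlier run wrote.
    static void appendAll(List<String> sink, List<String> records) {
        sink.addAll(records);
    }

    // Appends only records not already in the sink, so a re-run changes nothing.
    static void appendMissing(List<String> sink, List<String> records) {
        Set<String> seen = new HashSet<>(sink);
        for (String record : records) {
            if (seen.add(record)) {
                sink.add(record);
            }
        }
    }
}
```

Whether a duplicate-tolerant or duplicate-rejecting sink is appropriate depends on the target system; the sketch only shows why the distinction matters after a partial failure.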

See Also:
Flow, FlowSkipStrategy

Nested Class Summary
protected  class Cascade.CascadeJob
          Class CascadeJob manages Flow execution in the current Cascade instance.
 
Method Summary
 void complete()
          Method complete begins the current Cascade process if method start() was not previously called.
 CascadeStats getCascadeStats()
          Method getCascadeStats returns the cascadeStats of this Cascade object.
 List<Flow> getFlows()
          Method getFlows returns the flows managed by this Cascade object.
 FlowSkipStrategy getFlowSkipStrategy()
          Method getFlowSkipStrategy returns the current FlowSkipStrategy used by this Cascade.
 String getName()
          Method getName returns the name of this Cascade object.
 void run()
          Method run implements the Runnable run method.
 FlowSkipStrategy setFlowSkipStrategy(FlowSkipStrategy flowSkipStrategy)
          Method setFlowSkipStrategy sets a new FlowSkipStrategy and returns the current strategy, if any.
 void start()
          Method start begins the current Cascade process.
 void stop()
           
 String toString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Method Detail

getName

public String getName()
Method getName returns the name of this Cascade object.

Returns:
the name (type String) of this Cascade object.

getCascadeStats

public CascadeStats getCascadeStats()
Method getCascadeStats returns the cascadeStats of this Cascade object.

Returns:
the cascadeStats (type CascadeStats) of this Cascade object.

getFlows

public List<Flow> getFlows()
Method getFlows returns the flows managed by this Cascade object. The returned Flow instances will be in topological order.

Returns:
the flows (type List<Flow>) of this Cascade object.
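For readers unfamiliar with the term, a topological order lists every Flow after all the Flows it depends on. The sketch below uses Kahn's algorithm on flow names. It is an illustration only and makes no claim about how Cascade computes its ordering. TopologicalOrderSketch is a hypothetical name, and the input map is assumed to contain every flow as a key.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TopologicalOrderSketch {
    // 'dependsOn' maps each flow name to the flows that must finish before it.
    static List<String> order(Map<String, Set<String>> dependsOn) {
        Map<String, Integer> remaining = new LinkedHashMap<>(); // unmet dependency counts
        Map<String, List<String>> dependents = new HashMap<>(); // reverse edges
        for (Map.Entry<String, Set<String>> e : dependsOn.entrySet()) {
            remaining.put(e.getKey(), e.getValue().size());
            for (String upstream : e.getValue()) {
                dependents.computeIfAbsent(upstream, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : remaining.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey()); // nothing upstream, can go first
            }
        }
        List<String> ordered = new ArrayList<>();
        while (!ready.isEmpty()) {
            String flow = ready.remove();
            ordered.add(flow);
            for (String next : dependents.getOrDefault(flow, Collections.emptyList())) {
                // One dependency of 'next' is now satisfied.
                if (remaining.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (ordered.size() != remaining.size()) {
            throw new IllegalStateException("cycle or unknown dependency among flows");
        }
        return ordered;
    }
}
```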

getFlowSkipStrategy

public FlowSkipStrategy getFlowSkipStrategy()
Method getFlowSkipStrategy returns the current FlowSkipStrategy used by this Cascade.

Returns:
FlowSkipStrategy

setFlowSkipStrategy

public FlowSkipStrategy setFlowSkipStrategy(FlowSkipStrategy flowSkipStrategy)
Method setFlowSkipStrategy sets a new FlowSkipStrategy and returns the current strategy, if any. If a strategy is given, it will be used as the strategy for all Flow instances managed by this Cascade instance. To revert to consulting the strategies associated with each Flow instance, reset this value to null, its default value.

FlowSkipStrategy instances define when a Flow instance should be skipped. The default strategy is FlowSkipIfSinkStale and is inherited from the Flow instance in question. An alternative strategy would be FlowSkipIfSinkExists.

A FlowSkipStrategy will not be consulted when executing a Flow directly through its start() method.

Parameters:
flowSkipStrategy - of type FlowSkipStrategy
Returns:
FlowSkipStrategy
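The override behavior described for setFlowSkipStrategy can be modeled as follows. This is a sketch with hypothetical stand-in types (SkipStrategySketch, StrategyResolutionSketch), not the Cascading classes. The two constants only approximate the documented strategies, under the assumption that a Flow is skipped when its sink exists and is not stale, or simply when its sink exists.

```java
// Hypothetical stand-in for a skip strategy; not the Cascading interface.
interface SkipStrategySketch {
    boolean skipFlow(boolean sinkExists, boolean sinkStale);
}

final class StrategyResolutionSketch {
    // Assumed semantics: skip when the sink exists and is up to date.
    static final SkipStrategySketch SKIP_IF_SINK_NOT_STALE =
            (exists, stale) -> exists && !stale;
    // Assumed semantics: skip whenever the sink exists, stale or not.
    static final SkipStrategySketch SKIP_IF_SINK_EXISTS =
            (exists, stale) -> exists;

    private SkipStrategySketch cascadeStrategy; // null means "defer to each flow"

    // Installs a cascade-level strategy and returns the previous one (may be null).
    SkipStrategySketch setStrategy(SkipStrategySketch strategy) {
        SkipStrategySketch previous = cascadeStrategy;
        cascadeStrategy = strategy;
        return previous;
    }

    // The cascade-level strategy wins when set; otherwise the flow's own is used.
    boolean shouldSkip(SkipStrategySketch flowStrategy, boolean sinkExists, boolean sinkStale) {
        SkipStrategySketch effective = cascadeStrategy != null ? cascadeStrategy : flowStrategy;
        return effective.skipFlow(sinkExists, sinkStale);
    }
}
```

Passing null to setStrategy in this model restores per-flow behavior, mirroring the revert-to-default step described above.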

start

public void start()
Method start begins the current Cascade process. It returns immediately. See method complete() to block until the Cascade completes.


complete

public void complete()
Method complete begins the current Cascade process if method start() was not previously called. This method blocks until the process completes.

Throws:
RuntimeException - wrapping any exception thrown internally.
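The relationship between start(), complete(), and run() can be sketched with a background thread and a latch. StartCompleteSketch is a hypothetical class written for illustration. It mirrors the documented contract (start returns immediately, complete starts the work if needed and blocks, failures surface as a RuntimeException) and assumes nothing about how Cascade is implemented internally.

```java
import java.util.concurrent.CountDownLatch;

final class StartCompleteSketch implements Runnable {
    private final Runnable work;
    private final CountDownLatch done = new CountDownLatch(1);
    private Thread thread;
    private volatile Throwable failure;

    StartCompleteSketch(Runnable work) {
        this.work = work;
    }

    // Launches the work on a background thread and returns immediately.
    // Repeated calls are ignored so the work runs at most once.
    synchronized void start() {
        if (thread != null) {
            return;
        }
        thread = new Thread(this, "cascade-sketch");
        thread.start();
    }

    @Override
    public void run() {
        try {
            work.run();
        } catch (Throwable t) {
            failure = t; // remembered so complete() can report it
        } finally {
            done.countDown(); // release anyone blocked in complete()
        }
    }

    // Starts the work if needed, blocks until it finishes, and rethrows any
    // failure wrapped in a RuntimeException.
    void complete() {
        start();
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        if (failure != null) {
            throw new RuntimeException(failure);
        }
    }
}
```

A caller that needs to do other work while the process runs would call start() and later complete(); a caller that only wants the result can call complete() alone.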

run

public void run()
Method run implements the Runnable run method.

Specified by:
run in interface Runnable

stop

public void stop()

toString

public String toString()
Overrides:
toString in class Object


Copyright © 2007-2009 Concurrent, Inc. All Rights Reserved.