cascading.pipe
Class GroupBy

java.lang.Object
  extended by cascading.pipe.Pipe
      extended by cascading.pipe.Group
          extended by cascading.pipe.GroupBy
All Implemented Interfaces:
FlowElement, Serializable

public class GroupBy
extends Group

The GroupBy pipe groups the Tuple stream by the given groupFields.

If more than one Pipe instance is provided on the constructor, all branches will be merged. All Pipe instances must output the same field names; otherwise the FlowConnector will fail to create a Flow instance. The Pipe instances are merged as if they were a single Tuple stream; they are not joined. See CoGroup for joining by common fields.
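
The difference between merging and joining can be illustrated with a small plain-Java model. This is only a sketch of the semantics described above and does not use the Cascading API; the class name MergeThenGroupSketch, its methods, and the (user, url) sample rows are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of "merge, then group"; not the Cascading API.
final class MergeThenGroupSketch {

  // Concatenates all branches into one stream; rows are NOT matched up across branches.
  static List<String[]> merge(List<List<String[]>> branches) {
    List<String[]> merged = new ArrayList<>();
    for (List<String[]> branch : branches) {
      merged.addAll(branch);
    }
    return merged;
  }

  // Groups rows by the value at keyIndex; the TreeMap keeps the group keys sorted.
  static Map<String, List<String[]>> groupBy(List<String[]> rows, int keyIndex) {
    Map<String, List<String[]>> groups = new TreeMap<>();
    for (String[] row : rows) {
      groups.computeIfAbsent(row[keyIndex], k -> new ArrayList<>()).add(row);
    }
    return groups;
  }

  public static void main(String[] args) {
    // Two hypothetical branches with the same field layout: (user, url).
    List<String[]> lhs = Arrays.asList(new String[]{"bob", "/a"}, new String[]{"amy", "/b"});
    List<String[]> rhs = Arrays.asList(new String[]{"bob", "/c"}, new String[]{"cal", "/d"});

    Map<String, List<String[]>> groups = groupBy(merge(Arrays.asList(lhs, rhs)), 0);
    for (Map.Entry<String, List<String[]>> group : groups.entrySet()) {
      System.out.println(group.getKey() + " -> " + group.getValue().size() + " tuple(s)");
    }
  }
}
```

In this model the "bob" group receives one row from each branch, but each row keeps its original two fields. A join would instead combine matching rows into wider rows.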

Typically an Every follows GroupBy to apply an Aggregator function to every grouping. An Each may also follow GroupBy to apply a Function or Filter to the resulting stream, but an Each cannot come immediately before an Every.
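
As a rough illustration of applying an aggregation to every grouping, the following plain-Java sketch calls a function once per group and emits one result per group. It is a model only: the class AggregateEveryGroupSketch and its every method are hypothetical and do not reproduce the Cascading Every or Aggregator interfaces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Plain-Java model of "aggregate every grouping"; not the Cascading API.
final class AggregateEveryGroupSketch {

  // Groups rows by the value at keyIndex.
  static Map<String, List<String[]>> groupBy(List<String[]> rows, int keyIndex) {
    Map<String, List<String[]>> groups = new TreeMap<>();
    for (String[] row : rows) {
      groups.computeIfAbsent(row[keyIndex], k -> new ArrayList<>()).add(row);
    }
    return groups;
  }

  // Calls the aggregator once per grouping and emits one result per group.
  static <R> Map<String, R> every(Map<String, List<String[]>> groups,
                                  Function<List<String[]>, R> aggregator) {
    Map<String, R> results = new TreeMap<>();
    for (Map.Entry<String, List<String[]>> group : groups.entrySet()) {
      results.put(group.getKey(), aggregator.apply(group.getValue()));
    }
    return results;
  }

  public static void main(String[] args) {
    List<String[]> rows = Arrays.asList(
        new String[]{"bob", "/a"}, new String[]{"amy", "/b"}, new String[]{"bob", "/c"});
    // A count aggregation: one count per group key.
    Map<String, Integer> counts = every(groupBy(rows, 0), List::size);
    System.out.println(counts); // prints {amy=1, bob=2}
  }
}
```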

Optionally, a stream can be further sorted by providing sortFields. This allows an Aggregator to receive values in the order of the sortFields.

Note that local sorting always happens on the groupFields; sortFields provide a secondary sort on the grouped values within the current grouping. sortFields is particularly useful if the Aggregators following the GroupBy need to see their arguments in order.
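
The relationship between the primary sort on the group key and the secondary sort within each grouping can be sketched in plain Java. This is a model under stated assumptions, not Cascading code: SecondarySortSketch, groupAndSort, and the (user, time) sample rows are hypothetical, and values are compared as strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of grouping with a secondary sort; not the Cascading API.
final class SecondarySortSketch {

  // Groups on rows[groupIndex], then orders the rows inside each group on rows[sortIndex].
  static Map<String, List<String[]>> groupAndSort(List<String[]> rows, int groupIndex, int sortIndex) {
    // Primary ordering: the TreeMap keeps group keys sorted.
    Map<String, List<String[]>> groups = new TreeMap<>();
    for (String[] row : rows) {
      groups.computeIfAbsent(row[groupIndex], k -> new ArrayList<>()).add(row);
    }
    // Secondary ordering: applies only within each group.
    Comparator<String[]> bySortField = (a, b) -> a[sortIndex].compareTo(b[sortIndex]);
    for (List<String[]> values : groups.values()) {
      values.sort(bySortField);
    }
    return groups;
  }

  public static void main(String[] args) {
    // Hypothetical (user, time) rows arriving out of order.
    List<String[]> rows = Arrays.asList(
        new String[]{"bob", "09:30"}, new String[]{"amy", "09:10"}, new String[]{"bob", "09:05"});
    for (Map.Entry<String, List<String[]>> group : groupAndSort(rows, 0, 1).entrySet()) {
      for (String[] row : group.getValue()) {
        System.out.println(group.getKey() + " " + row[1]);
      }
    }
  }
}
```

An aggregation that depends on order, such as "first event per user", can then read the first value of each group without sorting the values itself.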

Note that on MapReduce systems, distributed group sorting is not 'complete'. That is, groups are sorted as seen by each Reducer, but they are not sorted across Reducers. See the MapReduce algorithm for details.
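
Why per-Reducer sorting does not yield a total order can be seen in a small plain-Java simulation. It is a sketch only: PartitionedSortSketch is a hypothetical class, and the partitioning formula mirrors a simple hash partitioner as an assumption rather than whatever partitioning a given Flow actually uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Plain-Java simulation of hash partitioning followed by a per-reducer sort.
final class PartitionedSortSketch {

  // Assigns each group key to a reducer by hash; each reducer sees its own keys in sorted order.
  static List<SortedSet<String>> sortPerReducer(List<String> keys, int numReducers) {
    List<SortedSet<String>> reducers = new ArrayList<>();
    for (int i = 0; i < numReducers; i++) {
      reducers.add(new TreeSet<>());
    }
    for (String key : keys) {
      // Assumed partitioning rule: non-negative hash modulo the reducer count.
      int reducer = (key.hashCode() & Integer.MAX_VALUE) % numReducers;
      reducers.get(reducer).add(key);
    }
    return reducers;
  }

  public static void main(String[] args) {
    List<SortedSet<String>> reducers = sortPerReducer(Arrays.asList("d", "a", "c", "b"), 2);
    for (int i = 0; i < reducers.size(); i++) {
      System.out.println("reducer " + i + ": " + reducers.get(i));
    }
  }
}
```

With these four keys and two reducers, reducer 0 sees [b, d] and reducer 1 sees [a, c]. Each list is sorted, but concatenating the reducer outputs gives b, d, a, c, which is not globally sorted.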

See Also:
Serialized Form

Field Summary
 
Fields inherited from class cascading.pipe.Group
declaredFields, groupFieldsMap, sortFieldsMap
 
Fields inherited from class cascading.pipe.Pipe
previous
 
Constructor Summary
GroupBy(Pipe pipe)
          Creates a new GroupBy instance that will group on Fields.ALL fields.
GroupBy(Pipe[] pipes)
          Creates a new GroupBy instance that will first merge the given pipes, then group on Fields.FIRST.
GroupBy(Pipe[] pipes, Fields groupFields)
          Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names.
GroupBy(Pipe[] pipes, Fields groupFields, Fields sortFields)
          Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names and sort the grouped values on the given sortFields field names.
GroupBy(Pipe[] pipes, Fields groupFields, Fields sortFields, boolean reverseOrder)
          Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names and sort the grouped values on the given sortFields field names, reversing the sort order if reverseOrder is true.
GroupBy(Pipe pipe, Fields groupFields)
          Creates a new GroupBy instance that will group on the given groupFields field names.
GroupBy(Pipe pipe, Fields groupFields, Fields sortFields)
          Creates a new GroupBy instance that will group on the given groupFields field names and sort the grouped values on the given sortFields field names.
GroupBy(Pipe pipe, Fields groupFields, Fields sortFields, boolean reverseOrder)
          Creates a new GroupBy instance that will group on the given groupFields field names and sort the grouped values on the given sortFields field names, reversing the sort order if reverseOrder is true.
GroupBy(String groupName, Pipe[] pipes, Fields groupFields)
          Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names.
GroupBy(String groupName, Pipe[] pipes, Fields groupFields, Fields sortFields)
          Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names and sort the grouped values on the given sortFields field names.
GroupBy(String groupName, Pipe[] pipes, Fields groupFields, Fields sortFields, boolean reverseOrder)
          Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names and sort the grouped values on the given sortFields field names, reversing the sort order if reverseOrder is true.
GroupBy(String groupName, Pipe pipe, Fields groupFields)
          Creates a new GroupBy instance that will group on the given groupFields field names.
GroupBy(String groupName, Pipe pipe, Fields groupFields, Fields sortFields)
          Creates a new GroupBy instance that will group on the given groupFields field names and sort the grouped values on the given sortFields field names.
GroupBy(String groupName, Pipe pipe, Fields groupFields, Fields sortFields, boolean reverseOrder)
          Creates a new GroupBy instance that will group on the given groupFields field names and sort the grouped values on the given sortFields field names, reversing the sort order if reverseOrder is true.
 
Method Summary
 
Methods inherited from class cascading.pipe.Group
collectReduceGrouping, equals, getDeclaredFields, getGroupingSelectors, getName, getPrevious, getSortingSelectors, hashCode, isGroupBy, isSorted, isSortReversed, iterateReduceValues, outgoingScopeFor, printInternal, resolveFields, toString, unwrapGrouping
 
Methods inherited from class cascading.pipe.Pipe
getHeads, getTrace, pipes, print, resolveIncomingOperationFields
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

GroupBy

public GroupBy(Pipe pipe)
Creates a new GroupBy instance that will group on Fields.ALL fields.

Parameters:
pipe - of type Pipe

GroupBy

public GroupBy(Pipe pipe,
               Fields groupFields)
Creates a new GroupBy instance that will group on the given groupFields field names.

Parameters:
pipe - of type Pipe
groupFields - of type Fields

GroupBy

public GroupBy(String groupName,
               Pipe pipe,
               Fields groupFields)
Creates a new GroupBy instance that will group on the given groupFields field names.

Parameters:
groupName - of type String
pipe - of type Pipe
groupFields - of type Fields

GroupBy

public GroupBy(Pipe pipe,
               Fields groupFields,
               Fields sortFields)
Creates a new GroupBy instance that will group on the given groupFields field names and sort the grouped values on the given sortFields field names.

Parameters:
pipe - of type Pipe
groupFields - of type Fields
sortFields - of type Fields

GroupBy

public GroupBy(String groupName,
               Pipe pipe,
               Fields groupFields,
               Fields sortFields)
Creates a new GroupBy instance that will group on the given groupFields field names and sort the grouped values on the given sortFields field names.

Parameters:
groupName - of type String
pipe - of type Pipe
groupFields - of type Fields
sortFields - of type Fields

GroupBy

public GroupBy(Pipe pipe,
               Fields groupFields,
               Fields sortFields,
               boolean reverseOrder)
Creates a new GroupBy instance that will group on the given groupFields field names and sort the grouped values on the given sortFields field names, reversing the sort order if reverseOrder is true.

Parameters:
pipe - of type Pipe
groupFields - of type Fields
sortFields - of type Fields
reverseOrder - of type boolean

GroupBy

public GroupBy(String groupName,
               Pipe pipe,
               Fields groupFields,
               Fields sortFields,
               boolean reverseOrder)
Creates a new GroupBy instance that will group on the given groupFields field names and sort the grouped values on the given sortFields field names, reversing the sort order if reverseOrder is true.

Parameters:
groupName - of type String
pipe - of type Pipe
groupFields - of type Fields
sortFields - of type Fields
reverseOrder - of type boolean

GroupBy

public GroupBy(Pipe[] pipes)
Creates a new GroupBy instance that will first merge the given pipes, then group on Fields.FIRST.

This assumes that the first field in every stream is logically the same field, which should hold because merging requires all incoming streams to have the same fields in the same order.

Parameters:
pipes - of type Pipe[]

GroupBy

public GroupBy(Pipe[] pipes,
               Fields groupFields)
Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names.

Parameters:
pipes - of type Pipe[]
groupFields - of type Fields

GroupBy

public GroupBy(String groupName,
               Pipe[] pipes,
               Fields groupFields)
Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names.

Parameters:
groupName - of type String
pipes - of type Pipe[]
groupFields - of type Fields

GroupBy

public GroupBy(Pipe[] pipes,
               Fields groupFields,
               Fields sortFields)
Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names and sort the grouped values on the given sortFields field names.

Parameters:
pipes - of type Pipe[]
groupFields - of type Fields
sortFields - of type Fields

GroupBy

public GroupBy(String groupName,
               Pipe[] pipes,
               Fields groupFields,
               Fields sortFields)
Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names and sort the grouped values on the given sortFields field names.

Parameters:
groupName - of type String
pipes - of type Pipe[]
groupFields - of type Fields
sortFields - of type Fields

GroupBy

public GroupBy(Pipe[] pipes,
               Fields groupFields,
               Fields sortFields,
               boolean reverseOrder)
Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names and sort the grouped values on the given sortFields field names, reversing the sort order if reverseOrder is true.

Parameters:
pipes - of type Pipe[]
groupFields - of type Fields
sortFields - of type Fields
reverseOrder - of type boolean

GroupBy

public GroupBy(String groupName,
               Pipe[] pipes,
               Fields groupFields,
               Fields sortFields,
               boolean reverseOrder)
Creates a new GroupBy instance that will first merge the given pipes, then group on the given groupFields field names and sort the grouped values on the given sortFields field names, reversing the sort order if reverseOrder is true.

Parameters:
groupName - of type String
pipes - of type Pipe[]
groupFields - of type Fields
sortFields - of type Fields
reverseOrder - of type boolean


Copyright © 2007-2008 Concurrent, Inc. All Rights Reserved.