java.lang.Object
  cascading.tap.Tap
public abstract class Tap
A Tap represents the physical data source or sink in a connected Flow.
That is, a source Tap is the head end of a connected Pipe and Tuple stream, and
a sink Tap is the tail end. Different Tap types manage files on a local disk,
a distributed disk, remote storage such as Amazon S3, or an FTP server. A Tap abstracts
away the complexity of connecting to these kinds of data sources.
A Tap takes a Scheme instance, which identifies the type of resource (text file, binary file, etc.).
The Tap itself is responsible for how the resource is reached.
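The split between the two roles can be illustrated with a minimal, self-contained sketch. The types below (`RecordScheme`, `RecordTap`, `InMemoryTap`) are hypothetical stand-ins invented for this illustration and are not the Cascading classes; they only assume that the scheme interprets a record while the tap decides how records are reached.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in: interprets one raw record. Not cascading.scheme.Scheme.
interface RecordScheme {
    List<String> parse(String rawRecord);
}

// Hypothetical stand-in: knows where the records live and how to reach them.
abstract class RecordTap {
    private final RecordScheme scheme;

    protected RecordTap(RecordScheme scheme) {
        this.scheme = scheme;
    }

    // Reaching the resource is the tap's job.
    protected abstract List<String> readRawRecords();

    // Interpreting each record is delegated to the scheme.
    public List<List<String>> readAll() {
        List<List<String>> tuples = new ArrayList<>();
        for (String raw : readRawRecords()) {
            tuples.add(scheme.parse(raw));
        }
        return tuples;
    }
}

// An in-memory "resource" so the sketch runs without a file system.
class InMemoryTap extends RecordTap {
    private final List<String> records;

    InMemoryTap(RecordScheme scheme, String... records) {
        super(scheme);
        this.records = Arrays.asList(records);
    }

    @Override
    protected List<String> readRawRecords() {
        return records;
    }
}

class SchemeAndTapDemo {
    // A scheme that treats each record as comma-separated text.
    static final RecordScheme CSV = raw -> Arrays.asList(raw.split(","));

    public static void main(String[] args) {
        RecordTap tap = new InMemoryTap(CSV, "a,1", "b,2");
        System.out.println(tap.readAll()); // prints [[a, 1], [b, 2]]
    }
}
```

Swapping the scheme changes how records are read without touching how the resource is reached, and the reverse holds for swapping the tap.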
A Tap is not given an explicit name by design. This is so a given Tap instance can be
re-used in different Flows that may expect a source or sink by a different
logical name, but are the same physical resource. If a tap had a name other than its path, which would be
used for the tap identity? If the name, then two Tap instances with different names but the same path could
interfere with one another.
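The naming argument above can be sketched in a few lines. `PathTap` is a hypothetical stand-in, not the Cascading class, and basing `equals` and `hashCode` on the path alone is an assumption made for this illustration. Each flow is modeled as a plain map from its own logical name to a tap.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for a tap whose identity is its path. Not cascading.tap.Tap.
final class PathTap {
    private final String path;

    PathTap(String path) {
        this.path = path;
    }

    String getPath() {
        return path;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof PathTap && path.equals(((PathTap) other).path);
    }

    @Override
    public int hashCode() {
        return Objects.hash(path);
    }
}

class TapIdentityDemo {
    public static void main(String[] args) {
        // Each flow binds its own logical name to an unnamed tap.
        Map<String, PathTap> importFlowSources = new LinkedHashMap<>();
        importFlowSources.put("raw-events", new PathTap("/data/logs"));

        Map<String, PathTap> reportFlowSources = new LinkedHashMap<>();
        reportFlowSources.put("clickstream", new PathTap("/data/logs"));

        // Collecting taps by identity shows a single physical resource.
        Set<PathTap> resources = new HashSet<>();
        resources.addAll(importFlowSources.values());
        resources.addAll(reportFlowSources.values());
        System.out.println(resources.size()); // prints 1
    }
}
```

Because the logical names live in the flows rather than in the tap, two flows that refer to the same path under different names still resolve to one resource.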
| Constructor Summary | |
|---|---|
| protected | Tap() |
| protected | Tap(Scheme scheme) |
| Method Summary | |
|---|---|
| abstract boolean | containsFile(JobConf conf, String currentFile): Method containsFile indicates whether the tap contains a given file. |
| abstract boolean | deletePath(JobConf conf): Method deletePath deletes the resource represented by this instance. |
| boolean | equals(Object object) |
| abstract Path | getPath(): Method getPath returns the Hadoop path to the resource represented by this Tap instance. |
| abstract long | getPathModified(JobConf conf): Method getPathModified returns the date this resource was last modified. |
| Path | getQualifiedPath(JobConf conf): Method getQualifiedPath returns a FileSystem fully qualified Hadoop Path. |
| Scheme | getScheme(): Method getScheme returns the scheme of this Tap object. |
| Fields | getSinkFields(): Method getSinkFields returns the sinkFields of this Tap object. |
| Fields | getSourceFields(): Method getSourceFields returns the sourceFields of this Tap object. |
| int | hashCode() |
| boolean | isDeleteOnSinkInit(): Method isDeleteOnSinkInit indicates whether the resource represented by this instance should be deleted if it already exists when the tap is initialized. |
| boolean | isSink(): Method isSink returns true if this Tap instance can be used as a sink. |
| boolean | isSource(): Method isSource returns true if this Tap instance can be used as a source. |
| boolean | isUseTapCollector(): Method isUseTapCollector returns true if this instance's TapCollector should be used to sink values. |
| abstract boolean | makeDirs(JobConf conf): Method makeDirs makes all the directories this Tap instance represents. |
| TapIterator | openForRead(JobConf conf): Method openForRead opens the resource represented by this Tap instance. |
| TapCollector | openForWrite(JobConf conf): Method openForWrite opens the resource represented by this Tap instance. |
| Scope | outgoingScopeFor(Set<Scope> incomingScopes): Method outgoingScopeFor returns the Scope this FlowElement hands off to the next FlowElement. |
| abstract boolean | pathExists(JobConf conf): Method pathExists returns true if the path represented by this instance exists. |
| Fields | resolveFields(Scope scope): Method resolveFields returns the actual field names represented by the given Scope. |
| Fields | resolveIncomingOperationFields(Scope incomingScope): Method resolveIncomingOperationFields resolves the incoming scopes to the actual incoming operation field names. |
| protected void | setScheme(Scheme scheme) |
| void | setUseTapCollector(boolean useTapCollector): Method setUseTapCollector should be set to true if this instance's TapCollector should be used to sink values. |
| void | sink(Fields fields, Tuple tuple, OutputCollector outputCollector): Method sink emits the sink value(s) to the OutputCollector. |
| void | sinkInit(JobConf conf): Method sinkInit initializes this instance as a sink. |
| Tuple | source(Object key, Object value): Method source returns the source value as an instance of Tuple. |
| void | sourceInit(JobConf conf): Method sourceInit initializes this instance as a source. |
| static Tap[] | taps(Tap... taps): Convenience function to make an array of Tap instances. |
| Methods inherited from class java.lang.Object |
|---|
| clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait |
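The summary describes isDeleteOnSinkInit as indicating whether an existing resource should be deleted when the tap is initialized. One plausible reading of how that flag interacts with pathExists, deletePath, and sinkInit is sketched below. `InMemorySinkTap` is a hypothetical stand-in that replaces the file system with a set of strings and drops the JobConf parameter; the method bodies are assumptions, not Cascading's implementation.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory stand-in. Method names mirror the summary,
// but the bodies are assumptions made for illustration.
class InMemorySinkTap {
    // Simulates a file system as a set of existing paths.
    private final Set<String> fileSystem;
    private final String path;
    private final boolean deleteOnSinkInit;

    InMemorySinkTap(Set<String> fileSystem, String path, boolean deleteOnSinkInit) {
        this.fileSystem = fileSystem;
        this.path = path;
        this.deleteOnSinkInit = deleteOnSinkInit;
    }

    boolean pathExists() {
        return fileSystem.contains(path);
    }

    boolean deletePath() {
        return fileSystem.remove(path);
    }

    boolean isDeleteOnSinkInit() {
        return deleteOnSinkInit;
    }

    // Assumed behavior: clear stale output only when the flag asks for it.
    void sinkInit() {
        if (isDeleteOnSinkInit() && pathExists()) {
            deletePath();
        }
    }
}

class SinkInitDemo {
    public static void main(String[] args) {
        Set<String> fileSystem = new HashSet<>();
        fileSystem.add("/out/report");

        InMemorySinkTap keep = new InMemorySinkTap(fileSystem, "/out/report", false);
        keep.sinkInit();
        System.out.println(keep.pathExists()); // prints true

        InMemorySinkTap replace = new InMemorySinkTap(fileSystem, "/out/report", true);
        replace.sinkInit();
        System.out.println(replace.pathExists()); // prints false
    }
}
```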
| Constructor Detail |
|---|
protected Tap()
protected Tap(Scheme scheme)
| Method Detail |
|---|
public static Tap[] taps(Tap... taps)
taps - of type Tap
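Since taps is described only as a convenience function for making an array, a varargs helper of the same shape is sketched here. `MiniTap` is a hypothetical stand-in; whether the real method returns the varargs array itself or a copy is not stated on this page, so returning it directly is an assumption.

```java
// Hypothetical stand-in; only the shape of the varargs helper matters here.
class MiniTap {
    final String path;

    MiniTap(String path) {
        this.path = path;
    }

    // Varargs already arrive as an array, so this sketch returns them as-is.
    static MiniTap[] taps(MiniTap... taps) {
        return taps;
    }
}

class TapsDemo {
    public static void main(String[] args) {
        // Reads more lightly than: new MiniTap[] { new MiniTap(...), new MiniTap(...) }
        MiniTap[] sources = MiniTap.taps(new MiniTap("/in/a"), new MiniTap("/in/b"));
        System.out.println(sources.length); // prints 2
    }
}
```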
protected void setScheme(Scheme scheme)
public Scheme getScheme()
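public boolean isUseTapCollector()
Method isUseTapCollector returns true if this instance's TapCollector should be used to sink values.
public void setUseTapCollector(boolean useTapCollector)
Method setUseTapCollector should be set to true if this instance's TapCollector should be used to sink values.
useTapCollector - true if this instance's TapCollector should be used to sink values.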
public void sourceInit(JobConf conf)
throws IOException
conf - of type JobConf
IOException - on resource initialization failure.
public void sinkInit(JobConf conf)
throws IOException
conf - of type JobConf
IOException - on resource initialization failure.
public abstract Path getPath()
public abstract boolean containsFile(JobConf conf,
String currentFile)
conf - of type JobConf
currentFile - of type String
public Fields getSourceFields()
public Fields getSinkFields()
public TapIterator openForRead(JobConf conf)
throws IOException
conf - of type JobConf
IOException - when the resource cannot be opened
public TapCollector openForWrite(JobConf conf)
throws IOException
conf - of type JobConf
IOException - when
public Tuple source(Object key,
Object value)
Method source returns the source value as an instance of Tuple.
key - of type WritableComparable
value - of type Writable
public void sink(Fields fields,
Tuple tuple,
OutputCollector outputCollector)
throws IOException
fields - of type Fields
tuple - of type Tuple
outputCollector - of type OutputCollector
IOException - when the resource cannot be written to
public Scope outgoingScopeFor(Set<Scope> incomingScopes)
Method outgoingScopeFor returns the Scope this FlowElement hands off to the next FlowElement.
outgoingScopeFor in interface FlowElement
incomingScopes - of type Set
See also: FlowElement.outgoingScopeFor(Set)
public Fields resolveIncomingOperationFields(Scope incomingScope)
Method resolveIncomingOperationFields resolves the incoming scopes to the actual incoming operation field names.
resolveIncomingOperationFields in interface FlowElement
incomingScope - of type Scope
See also: FlowElement.resolveIncomingOperationFields(Scope)
public Fields resolveFields(Scope scope)
Method resolveFields returns the actual field names represented by the given Scope.
resolveFields in interface FlowElement
scope - of type Scope
See also: FlowElement.resolveFields(Scope)
public Path getQualifiedPath(JobConf conf)
throws IOException
conf - of type JobConf
IOException - when
public abstract boolean makeDirs(JobConf conf)
throws IOException
conf - of type JobConf
IOException - when there is an error making directories
public abstract boolean deletePath(JobConf conf)
throws IOException
conf - of type JobConf
IOException - when the resource cannot be deleted
public abstract boolean pathExists(JobConf conf)
throws IOException
conf - of type JobConf
IOException - when the status cannot be determined
public abstract long getPathModified(JobConf conf)
throws IOException
conf - of type JobConf
IOException - when the modified date cannot be determined
public boolean isDeleteOnSinkInit()
public boolean isSink()
public boolean isSource()
public boolean equals(Object object)
equals in class Object
public int hashCode()
hashCode in class Object