Cascading

The core data processing API for Java developers and Data Engineers who wish to build data-intensive applications and frameworks.

Cascading was created for developers who want to…

Quickly build robust, reliable, data-oriented applications in Java
Eliminate platform lock-in
Develop testable and reusable integrations, data processing code and algorithms
Leverage existing best practices, skill sets and tools
Install nothing; all dependencies are available through Maven
Create higher-order DSLs and languages in other JVM-based languages

About Cascading

Build Data Intensive Applications that are Scale-free

Developers can build and test their applications locally, then deploy them at scale in production.

Systems Integration

Easily build applications that integrate with your existing legacy systems.

Many community-supported projects allow your app to move data in and out of various sources (e.g. Elasticsearch, HBase, Cassandra, MongoDB, and more).

Application Portability

Write once, then run on different computation platforms. Applications written with Cascading are portable across any fabric that the Cascading ecosystem supports.

Cascading ships with support for Apache Hadoop, Apache Tez, and in-memory streaming.

Division of Logic

Cascading allows you to develop your business logic separately from your integration logic via the Pipes and Taps abstractions.
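
The separation described above can be illustrated with a minimal plain-Java sketch. This is not Cascading's API: the `Source` and `Sink` interfaces and the `run` method are hypothetical stand-ins for the role that Taps play (where data comes from and goes to), and the `Function` argument stands in for the business logic a Pipe assembly would carry. The point is only that the logic can be written and tested without knowing which systems sit at either end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class PipesAndTapsSketch {
    /** Hypothetical stand-in for a source tap: knows only where records come from. */
    interface Source<T> {
        List<T> read();
    }

    /** Hypothetical stand-in for a sink tap: knows only where records go. */
    interface Sink<T> {
        void write(List<T> records);
    }

    /**
     * Connects a source, a piece of business logic, and a sink.
     * The logic never sees the integration details, so either end
     * can be swapped (in-memory list, file, database) without changing it.
     */
    static <I, O> void run(Source<I> source, Function<I, O> logic, Sink<O> sink) {
        List<O> results = new ArrayList<>();
        for (I record : source.read()) {
            results.add(logic.apply(record));
        }
        sink.write(results);
    }

    public static void main(String[] args) {
        // Integration logic: an in-memory source and a console sink (assumed for illustration).
        Source<String> source = () -> List.of("alpha", "beta");
        Sink<String> sink = records -> System.out.println(records);

        // Business logic: independent of both ends.
        Function<String, String> logic = String::toUpperCase;

        run(source, logic, sink);
    }
}
```

In a local test, the source and sink can be simple in-memory collections; in production, the same logic would be wired to real systems. Cascading's own Pipe and Tap classes are considerably richer than this sketch.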
