Welcome to the Cascading Ecosystem

The Cascading Ecosystem is a collection of applications, languages, and APIs for developing data-intensive applications.

At the core of the ecosystem is Cascading, which consists of a Java API for defining complex data flows and integrating those flows with back-end systems, and a query planner that maps logical flows onto a computing platform and executes them.

Many extensions to Cascading are available, including integrations with popular systems, testing frameworks, and tools built on Cascading.

Sitting on top of the Cascading API are languages and tools that simplify the development of data-intensive applications. For Scala developers, see Scalding. For Clojure developers, see Cascalog. For SQL developers, see Lingual. Java developers can use the raw Cascading API directly or through Fluid, a fluent interface.

Sitting below the Cascading query planner are platform providers and rules for mapping data flows onto a given platform like Apache Hadoop, Apache Tez, Apache Flink, or simply locally in memory (suitable for many streaming applications).
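The separation between a logical flow and the platform that runs it can be illustrated with a small toy sketch in plain Java. This is not the Cascading API: the names FlowSketch, LogicalFlow, Platform, and LocalPlatform are hypothetical and exist only for illustration, and the "planner" here is reduced to running each step in order in memory. The point is only that the flow is defined once, without reference to where it will execute.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Toy illustration only; none of these types come from Cascading.
public class FlowSketch {

    /** A logical flow: an ordered list of steps with no knowledge of where it runs. */
    public static final class LogicalFlow {
        final List<Function<List<String>, List<String>>> steps = new ArrayList<>();

        public LogicalFlow then(Function<List<String>, List<String>> step) {
            steps.add(step);
            return this;
        }
    }

    /** Hypothetical platform provider: decides how a logical flow is executed. */
    public interface Platform {
        List<String> run(LogicalFlow flow, List<String> input);
    }

    /** In-memory platform: applies each step in order inside the current JVM. */
    public static final class LocalPlatform implements Platform {
        @Override
        public List<String> run(LogicalFlow flow, List<String> input) {
            List<String> data = input;
            for (Function<List<String>, List<String>> step : flow.steps) {
                data = step.apply(data);
            }
            return data;
        }
    }

    /** A word-count flow, defined once and independent of any platform. */
    public static LogicalFlow wordCount() {
        return new LogicalFlow()
            .then(FlowSketch::tokenize)
            .then(FlowSketch::count);
    }

    // Splits each line on whitespace and lower-cases the words.
    static List<String> tokenize(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    // Counts occurrences and emits tab-separated "word<TAB>count" rows, sorted by word.
    static List<String> count(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            rows.add(e.getKey() + "\t" + e.getValue());
        }
        return rows;
    }

    public static void main(String[] args) {
        Platform platform = new LocalPlatform();
        List<String> input = Arrays.asList("a b a", "b c");
        for (String row : platform.run(wordCount(), input)) {
            System.out.println(row);
        }
    }
}
```

In this sketch, supporting another platform would mean adding another Platform implementation while wordCount() stays unchanged. A real planner does considerably more work than this, so treat the example as a mental model rather than a description of Cascading internals.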

Learn more from the User Guide, the most recent Cascading and Scalding books, or the tutorials and example applications. To learn about Cascading internals, see this post on the 3.x query planner.

Recent News

Cascading 3.1 Release

Announcing Cascading 3.0 on Apache Flink

Cascading 3.0 Maintenance Release

Cascading 2.7 Maintenance Release
