Cascading 1.2 Now Available

We are happy to announce that Cascading 1.2 is now publicly available for download.

This release features many performance and usability enhancements while remaining backwards compatible with 1.0 and 1.1.

Specifically:

  • Performance optimizations during grouping (StreamComparator)
  • Composable map-side partial aggregations (AggregateBy)
  • Native Riffle support for non-Cascading (or nested iterative Cascading) processes (ProcessFlow and Riffle)
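To illustrate the idea behind map-side partial aggregation, here is a minimal conceptual sketch in plain Java. It does not use the Cascading API: the class name, method names, and the fixed-capacity cache with least-recently-used eviction are all assumptions made for illustration, not a description of how AggregateBy is implemented. The map side keeps a bounded set of running sums and emits a partial sum whenever a key is evicted, and the reduce side then combines those partials into final totals.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of Cascading.
public class PartialAggregationSketch {

    // Map side: hold at most `capacity` running sums in memory.
    // When the cache overflows, the least recently used key is
    // emitted as a partial sum and dropped from the cache.
    static List<Map.Entry<String, Long>> mapSidePartialSum(
            List<Map.Entry<String, Long>> input, int capacity) {
        List<Map.Entry<String, Long>> emitted = new ArrayList<>();
        // accessOrder = true makes iteration start at the least recently used entry
        LinkedHashMap<String, Long> cache = new LinkedHashMap<>(16, 0.75f, true);
        for (Map.Entry<String, Long> tuple : input) {
            cache.merge(tuple.getKey(), tuple.getValue(), Long::sum);
            if (cache.size() > capacity) {
                Iterator<Map.Entry<String, Long>> it = cache.entrySet().iterator();
                Map.Entry<String, Long> eldest = it.next();
                emitted.add(new AbstractMap.SimpleImmutableEntry<>(
                        eldest.getKey(), eldest.getValue()));
                it.remove();
            }
        }
        // Flush whatever is still cached at the end of the input.
        for (Map.Entry<String, Long> e : cache.entrySet()) {
            emitted.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        return emitted;
    }

    // Reduce side: combine the partial sums for each key into a final total.
    static Map<String, Long> reduceSideSum(List<Map.Entry<String, Long>> partials) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> p : partials) {
            totals.merge(p.getKey(), p.getValue(), Long::sum);
        }
        return totals;
    }

    static Map.Entry<String, Long> t(String key, long value) {
        return new AbstractMap.SimpleImmutableEntry<>(key, value);
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> input = Arrays.asList(
                t("a", 1), t("b", 2), t("a", 3), t("c", 4), t("b", 5), t("a", 6));
        List<Map.Entry<String, Long>> partials = mapSidePartialSum(input, 2);
        System.out.println(reduceSideSum(partials)); // prints {a=10, b=7, c=4}
    }
}
```

In this sketch the six input tuples become five partial sums before the reduce step. The saving grows with the cache capacity and with how often keys repeat close together in the input.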

For a detailed list of changes see: CHANGES.txt

We are also happy to announce that Cascading and its extensions now have their own Maven/Ivy Jar repository, Conjars. Conjars is a public repository: any developer wishing to publish Cascading libraries and extensions can register their public key and push artifacts. Conjars is a simple fork of the Clojars repo code.

Along with this release are a number of extensions created by the Cascading user community.

Among these extensions are:

  • Cascading.Avro - Cascading Scheme for the Apache Avro data serialization format.
  • Cascading.Memcached - Integration with Memcached, Membase, and ElasticSearch.
  • Bixo - a web mining toolkit
  • DBMigrate - a tool for migrating data to/from RDBMSs into Hadoop
  • Apache HBase, Amazon SimpleDB, and JDBC integration
  • JRuby- and Clojure-based scripting languages for Cascading
  • Cascalog - a robust interactive extensible query language

This release runs against Hadoop 0.19.x and 0.20.x, including Amazon Elastic MapReduce.