March 2010 Archives

Cascading-DBMigrate

Nathan at BackType has announced and released Cascading-DBMigrate.

In short, DBMigrate is a more flexible and reliable alternative to Sqoop for moving data to/from a relational data store.

Cascading.JDBC has been around for quite a while, but DBMigrate overcomes some of that library's limitations when dealing with MySQL servers and OFFSET/LIMIT queries (AsterData did not have the same limitations).

Riffle: Lightweight Workflow

Riffle has been announced on the Mahout mailing list.

Riffle is a lightweight Java library for executing collections of dependent processes as a single process. It is Apache licensed, so it can be included in projects that are not GPL-compatible.
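
To make "collections of dependent processes" concrete, here is a minimal, language-neutral sketch in Python. It is not Riffle's API (Riffle is a Java library driven by annotations); the names `run_in_dependency_order`, `processes`, and `depends_on` are hypothetical and only illustrate the general idea of running each process after the ones it depends on.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_in_dependency_order(processes, depends_on):
    """Run every process once, after all of its dependencies.

    processes:  dict mapping a name to a zero-argument callable (hypothetical shape)
    depends_on: dict mapping a name to the set of names it depends on
    Returns the order in which the processes were run.
    """
    # Include every process, even those with no declared dependencies.
    graph = {name: depends_on.get(name, set()) for name in processes}
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        processes[name]()
    return order


# Illustrative three-step chain: import -> clean -> report.
log = []
processes = {
    "report": lambda: log.append("report"),
    "clean": lambda: log.append("clean"),
    "import": lambda: log.append("import"),
}
depends_on = {"clean": {"import"}, "report": {"clean"}}

order = run_in_dependency_order(processes, depends_on)
```

In this sketch the whole collection is started with a single call, and the caller never has to order the steps by hand.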

The next major version of Cascading (1.2) will support the Riffle annotations so that projects like Mahout and Pig can participate in a Cascading Cascade execution.

Riffle can be found on its GitHub project page.

Cascading 1.1 RC1 Available

Cascading 1.1 RC1 is now available from the downloads page.

You can read about all the changes in the CHANGES.txt file.

Note that downloads are no longer served from Google Code; use the links on the downloads page instead.

Cascading at RazorFish and AWS

Check out the new Case Study published by Amazon on User Segmentation at RazorFish.

SimpleDB Support

Bixo Labs recently announced a new project for integrating Hadoop and Cascading with Amazon SimpleDB. Check it out on GitHub at cascading.simpledb.

This is in part a result of their Public Terabyte Dataset Project in AWS.

Cascading 1.1 User Guide Draft

In anticipation of the Cascading 1.1 release this month, we have published a draft of the 1.1 User Guide.

Please feel free to review it and email any comments or suggestions to the mailing list.

To download the most recent build of Cascading 1.1, please visit the download page at Concurrent. We plan to make a final 1.1 release candidate available on the community site this week.