Category: News

Lingual Public Access

Lingual is now available for download or build. See the Lingual page for details, or visit the Lingual project page.

Lingual’s Architecture

Julian Hyde discusses how Optiq and Cascading work together to become Lingual.

Cascading Lingual – True SQL for Cascading and Hadoop

Announcing Lingual, a new framework that executes ANSI SQL queries as Cascading applications on Apache Hadoop clusters. Read more about it on the Lingual project page, sign up for announcements on the mailing list, or read the press release.

Cascading 2.2 WIP and CoercibleTypes

Cascading 2.2 is starting to take shape for those interested in test-driving emerging features. Of note is “field type” support, which allows fields read from an input file to retain their type information through to where the data is written to a sink and stored in a file. This…

Cascading 2.1

We are happy to announce that Cascading 2.1 is now publicly available for download: http://www.cascading.org/downloads/

This release includes a number of new features. Specifically:

– Restartable Flows using Checkpointing
– Improved memory utilization and GC
– Refactored build system, source and javadoc jars now available…

Cascading for the Impatient, Part 6

In our fifth installment of this series we showed how to implement TF-IDF in a Cascading application. If you haven’t read that yet, it’s probably best to start there. Today’s post extends the TF-IDF app to show best practices for test-driven development (TDD) at scale. We’ll…

Cascading for the Impatient, Part 5

In our fourth installment of this series we showed how to use HashJoin on two pipes, to perform “stop words” filtering at scale in a Cascading 2.0 application. If you haven’t read that yet, it’s probably best to start there. Today’s lesson builds on that…

Cascading for the Impatient, Part 4

In our third installment of this series we showed how to write a custom Operation for a Cascading 2.0 application. If you haven’t read that yet, it’s probably best to start there. Today’s lesson takes that same Word Count app and expands on it to…

Cascading Software Development Kit

The Cascading SDK is now available for download. The SDK includes Cascading source and jars, and many of the Cascading-based tools like Load and Multitool. It also includes an Amazon Elastic MapReduce install script (bootstrap action) that will pre-install all included tools on the…

Cascading for the Impatient, Part 3

In our second installment of this series we showed how to implement Word Count as a Cascading 2.0 application. If you haven’t read that yet, it’s probably best to start there. Today’s lesson takes the same app and stretches it even more. We’ll show how…