News and Announcements
Cascading 0.8.2 & Cascading.groovy 0.4.2 Released
Cascading 0.8.2 and Cascading.groovy 0.4.2 are now available for download. For details on new features and bug fixes, see the CHANGES.txt file. This is a minor release of both packages and includes a fix for the previously reported application jar classloading issues.
Cascading 0.8.1 & Cascading.groovy 0.4.1 Released
Cascading 0.8.1 and Cascading.groovy 0.4.1 are now available for download. For details on bug fixes, see the CHANGES.txt file. This is a minor release of both packages.
Cascading.groovy 0.4.0 Released
We are pleased to announce that the 0.4.0 release of Cascading.groovy, our Groovy language interpreter extension, is available for download. In short, this release includes Cascading 0.8.0.
The only incompatible change of note is the removal of the ‘sort’ alias for ‘group’. Since distributed sorting is not native to Hadoop, the alias was a little misleading.
Cascading 0.8.0 Released
Version 0.8.0 of Cascading is now available for download. For details on new features and bug fixes, see the CHANGES.txt file. This is a major release consisting of many features and some incompatible API changes, so please read on.
This release includes a large number of changes and we won’t list them all here. A couple of them, though, deserve some additional explanation.
First off c.p.PipeAssembly was renamed to c.
RapLeaf on Cascading
One of the more vocal and involved companies using Cascading has been RapLeaf. Vocal due to the feedback and questions they present on the #cascading IRC channel. Involved due to the level of detail of their feedback, and the serious number of boundaries they are pushing with Cascading as a compute model.
They have hard problems to solve to support their business proposition, and we are glad Cascading has made it easier for them to achieve their goals.
Slides from the Hadoop User Group
A little late, but here are the slides on Cascading Chris presented at the July Hadoop User Group at Yahoo!.
Cascading.groovy 0.3.0 Released
We are pleased to announce that the 0.3.0 release of Cascading.groovy, our Groovy language interpreter extension, is available for download. In short, this release includes Cascading 0.7.0 and, by virtue of that, supports Hadoop 0.17.x.
Cascading 0.7.0 Released
Version 0.7.0 of Cascading is now available for download. For details on new features and bug fixes, see the CHANGES.txt file. This is a major release consisting of many features and some incompatible API changes, so please read on.
The changes can be broken down into new features, and incompatible API changes. Sorry, but this release could break your existing code. But it’s all for the good.
First we should mention the most important change in this release is API compatibility with Hadoop 0.
Cascading at the Hadoop User Group
Chris will be making a short presentation on Cascading at the Hadoop User Group meeting on July 22. Hope to see you there.
Cascading.groovy 0.2.0 Released
We are pleased to announce that the 0.2.0 release of Cascading.groovy, our Groovy language interpreter extension, is available for download. This release makes some minor additions to the base DSL syntax and adds support for two new Cascading features, stream assertions and traps, providing for highly fault tolerant scriptable data processing applications.
Note that a companion release, Cascading 0.6.1, is also available. It contains no significant changes.
Cascading 0.6.0 Released
Version 0.6.0 of Cascading is now available for download. For details on new features and bug fixes, see the CHANGES.txt file. For a quick summary, read on.
This release provides two major features: Stream Assertions and Trap Taps.
Stream Assertions are used in much the same way as the Java language assert statement.
As the developer assembles more complex assemblies, it makes sense to inline assertions on the data expected in the stream.
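As a rough analogy only, the idea of an inline assertion on a stream can be sketched in plain Java. This sketch does not use the Cascading API; the class name and the `assertEach` helper are hypothetical, and it simply checks every value as it passes through and fails fast when an expectation does not hold.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class InlineAssertionSketch {

    // Hypothetical helper: passes every value through unchanged,
    // failing fast on the first value that does not satisfy the check.
    static <T> List<T> assertEach(List<T> values, Predicate<T> check, String message) {
        List<T> passed = new ArrayList<>();
        for (T value : values) {
            if (!check.test(value)) {
                throw new IllegalStateException(message + ": " + value);
            }
            passed.add(value);
        }
        return passed;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("a\t1", "b\t2");
        // Assert that every line carries exactly two tab-separated fields.
        List<String> checked =
            assertEach(lines, line -> line.split("\t").length == 2, "expected two fields");
        System.out.println(checked);
    }
}
```

Like the Java assert statement, a check of this kind is most useful during development and testing, when a bad record should stop the run rather than flow silently downstream.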
Cascading.groovy 0.1.0 Released
We are pleased to announce the 0.1.0 release of Cascading.groovy, our Groovy language interpreter extension. With Cascading.groovy, Hadoop applications can be scripted by both advanced and casual Hadoop users without thinking in MapReduce. Read our Groovy Scripting Overview for more details.
We consider this a usable Alpha release, it being our first.
The underlying core, Cascading, is very stable and feature rich. But the Groovy builder will still likely undergo various changes as we get more feedback from the community.
Cascading 0.5.0 Released
Version 0.5.0 of Cascading is now available for download. For details on new features and bug fixes, see the CHANGES.txt file. For a quick summary, read on.
By far, the biggest change is support for sorting via the GroupBy operator. By default the ‘groupFields’ fields are sorted, but to sort fields that are not being grouped on, set the ‘sortFields’ argument. This will allow the values of every grouping key to be sorted before being handed to an Aggregator function.
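The effect described above can be illustrated in plain Java, without the Cascading API. In this minimal sketch the `Visit` type, its `user` and `timestamp` fields, and the `groupAndSort` method are all hypothetical stand-ins: `user` plays the role of the grouping field and `timestamp` the role of a sort field.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SecondarySortSketch {

    // Hypothetical tuple: 'user' is the grouping field, 'timestamp' the sort field.
    static final class Visit {
        final String user;
        final long timestamp;

        Visit(String user, long timestamp) {
            this.user = user;
            this.timestamp = timestamp;
        }
    }

    static Map<String, List<Visit>> groupAndSort(List<Visit> visits) {
        // A TreeMap keeps the grouping keys themselves in sorted order.
        Map<String, List<Visit>> groups = new TreeMap<>();
        for (Visit visit : visits) {
            groups.computeIfAbsent(visit.user, key -> new ArrayList<>()).add(visit);
        }
        // Sort the values inside each group before any aggregation sees them.
        for (List<Visit> values : groups.values()) {
            values.sort(Comparator.comparingLong((Visit v) -> v.timestamp));
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Visit> visits = List.of(
            new Visit("bob", 30L), new Visit("amy", 20L), new Visit("bob", 10L));
        groupAndSort(visits).forEach((user, values) -> {
            for (Visit v : values) {
                System.out.println(user + " " + v.timestamp);
            }
        });
    }
}
```

An aggregation that depends on order, such as taking the first or last value per key, can then read each group front to back without sorting it again.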
Cascading 0.4.0 Released
Version 0.4.0 of Cascading is now available for download. See below for a review of the major changes. For more details, see the changes.txt file.
Foremost, most changes can be stuffed under the heading of performance improvements. I can’t offer scientifically valid metrics, but let’s say my projects are running noticeably faster. These few enhancements constitute the changes.
One, we now skip the reducer if there is no Group in the assembly.
Cascading 0.3.0 Released
Cascading 0.3.0 has just been packaged and is available for download from our downloads page. It incorporates many great changes, read on for more.
The biggest addition is read-only support for HTTP and S3. This support was pushed down into Hadoop, so any Hfs Tap instance can include remote resources with http(s):// or s3tp:// urls. The s3tp url is similar to the s3:// url, where the authority part of the url includes an AWS account key and secret.
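To make the url shape concrete, here is a small sketch that takes such a url apart with the standard java.net.URI class. The exact `key:secret@bucket` layout, the example values, and the `credentials` helper are assumptions made for illustration; they are not taken from the Cascading or Hadoop documentation.

```java
import java.net.URI;

class S3tpUrlSketch {

    // Splits the user-info part of the authority ("key:secret") into two halves.
    // The key:secret layout is an assumption for this sketch.
    static String[] credentials(URI uri) {
        String userInfo = uri.getUserInfo();
        if (userInfo == null || !userInfo.contains(":")) {
            throw new IllegalArgumentException("no key:secret found in the url authority");
        }
        return userInfo.split(":", 2);
    }

    public static void main(String[] args) {
        // Made-up key, secret, bucket, and path.
        URI uri = URI.create("s3tp://EXAMPLEKEY:examplesecret@example-bucket/logs/part-00000");
        String[] creds = credentials(uri);
        System.out.println("scheme: " + uri.getScheme());
        System.out.println("key:    " + creds[0]);
        System.out.println("bucket: " + uri.getHost());
        System.out.println("path:   " + uri.getPath());
    }
}
```

A secret containing characters such as `/` would need to be percent-encoded before it could sit inside the authority of a url like this.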
Cascading 0.2.0 Released
Just uploaded the 0.2.0 release of Cascading. You can download it from here.
The most significant change is a “spillable” list added to CoGroup that allows it to operate on any size co-groupings.
Note there are no limitations with normal GroupBy calls as they stream directly through the stack. CoGrouping must accumulate all the groups before emitting them through some join policy (inner, outer, etc).
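The general idea behind a “spillable” list can be sketched in plain Java as follows. This is not Cascading's implementation: the class name, the threshold parameter, and the one-value-per-line temp-file format are all assumptions chosen to keep the example short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class SpillableListSketch {
    private final int threshold;                            // max values held in memory
    private final List<String> memory = new ArrayList<>();
    private final List<Path> spills = new ArrayList<>();    // spill files, oldest first

    SpillableListSketch(int threshold) {
        this.threshold = threshold;
    }

    // Values must not contain line breaks, since each is stored as one line.
    void add(String value) {
        memory.add(value);
        if (memory.size() >= threshold) {
            spill();
        }
    }

    int spillCount() {
        return spills.size();
    }

    // Writes the in-memory values to a temp file and frees the memory.
    private void spill() {
        try {
            Path file = Files.createTempFile("spill", ".txt");
            file.toFile().deleteOnExit();
            Files.write(file, memory);
            spills.add(file);
            memory.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Visits all values in insertion order, streaming spilled ones line by line.
    void forEach(Consumer<String> action) {
        for (Path file : spills) {
            try (BufferedReader reader = Files.newBufferedReader(file)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    action.accept(line);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        memory.forEach(action);
    }

    public static void main(String[] args) {
        SpillableListSketch list = new SpillableListSketch(2);
        for (String value : new String[] {"a", "b", "c", "d", "e"}) {
            list.add(value);
        }
        list.forEach(System.out::println);
    }
}
```

Because only `threshold` values are ever held in memory at once, the size of a grouping is bounded by disk space rather than by the heap, which is the property the co-grouping case needs.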
Also wanted to point out we have had 91 downloads since Jan 21.
Cascading 0.1.0 Released
A little note to let everyone know Cascading is now available for download and includes the full source. Please visit our project site for more information.
By no means is this release feature complete or to be considered final. There is still much work to do, but we believe it to be stable and useful, if under-documented.