Cascalog

Cascalog is an extension to Cascading that enables application development with Clojure, a Lisp dialect. Cascalog is built and maintained by the Cascading community.

  • Build Data Applications with Clojure
    Use regular Clojure functions as operations or filters. Because Cascalog is an ordinary Clojure library, you can call Cascalog queries from your other Clojure code.
  • Built with the Cascading framework
    Because Cascalog is built on top of the Cascading framework, it inherits the value Cascading brings to application development, including extensibility through the Cascading ecosystem, application portability, and test-driven development best practices.
  • Ad-hoc Queries
    Cascalog queries run as a series of MapReduce jobs. Through Cascading's Tap abstraction, you can query data in HDFS, in various databases, or on the local filesystem.
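
As a rough illustration of the points above, the sketch below uses a plain Clojure function as a filter, a core function as an operation, and a local text file as a source. It assumes the Cascalog 2.x API in cascalog.api (??<-, ?<-, stdout, lfs-textline) with Hadoop on the classpath. The namespace, the sample data, and the names over-30?, names-over-30, and print-long-lines are hypothetical and exist only for this example.

```clojure
(ns example.queries
  (:require [cascalog.api :refer [??<- ?<- stdout lfs-textline]]))

;; In-memory generator: a vector of tuples (illustrative data only).
(def people [["alice" 28] ["bob" 33] ["chris" 40]])

;; A regular Clojure predicate, used below as a filter.
(defn over-30? [age] (> age 30))

(defn names-over-30
  "Returns [name age-next-year] tuples for people older than 30."
  []
  (??<- [?name ?next-age]
        (people ?name ?age)         ; bind the tuple fields to logic variables
        (over-30? ?age)             ; plain function as a filter
        (inc ?age :> ?next-age)))   ; plain function as an operation

(defn print-long-lines
  "Prints the lines of a local text file that are longer than min-length.
  The path is supplied by the caller; nothing here assumes a real file."
  [path min-length]
  (let [lines (lfs-textline path)]  ; a Cascading Tap over the local filesystem
    (?<- (stdout)
         [?line]
         (lines ?line)
         (count ?line :> ?len)
         (> ?len min-length))))
```

Calling (names-over-30) from the REPL runs the query in local mode and returns the matching tuples. The order of the results is not guaranteed.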

Cascalog Benefits

  • Build data applications with Clojure or Java
  • Query HDFS, databases, local data from the Clojure REPL
  • Easily run arbitrary Clojure code in your queries
  • Simple and expressive logic and operations
  • Leverage the benefits of the Cascading application framework

USING MAVEN

Cascalog is currently under active development and available as source in the Cascalog project or on Clojars.

To add the Clojars repository, place the following inside the <repositories> element of your pom.xml:

<repository>
  <id>clojars.org</id>
  <url>http://clojars.org/repo</url>
</repository>

To include the Cascalog dependency, place the following inside the <dependencies> element:

<dependency>
  <groupId>cascalog</groupId>
  <artifactId>cascalog</artifactId>
  <version>2.1.0</version>
</dependency>

USING LEININGEN

To include Cascalog in your Leiningen or cake project, add the following to your project.clj:

General

[cascalog/cascalog-core "2.1.0"] ;; under :dependencies
[org.apache.hadoop/hadoop-core "1.2.1"] ;; under :dev-dependencies

Leiningen 2.0

:repositories {"conjars" "http://conjars.org/repo"}
:dependencies [[cascalog/cascalog-core "2.1.0"]]
:profiles { :provided {:dependencies [[org.apache.hadoop/hadoop-core "1.2.1"]]}}
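
For context, here is how these keys might fit together in a complete Leiningen 2 project.clj. This is only a sketch: the project name example-cascalog-app, its version, and the choice of Clojure 1.5.1 (one of the compatible versions listed on this page) are assumptions made for illustration.

```clojure
;; project.clj (hypothetical project name and version)
(defproject example-cascalog-app "0.1.0-SNAPSHOT"
  ;; Conjars hosts the Cascading artifacts that Cascalog depends on.
  :repositories {"conjars" "http://conjars.org/repo"}
  :dependencies [[org.clojure/clojure "1.5.1"]
                 [cascalog/cascalog-core "2.1.0"]]
  ;; Hadoop is on the classpath for development but is left out of the
  ;; uberjar, since the cluster supplies its own Hadoop jars.
  :profiles {:provided {:dependencies [[org.apache.hadoop/hadoop-core "1.2.1"]]}})
```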

Leiningen < 2.0

:dependencies [[cascalog/cascalog-core "2.1.0"]]
:dev-dependencies [[org.apache.hadoop/hadoop-core "1.2.1"]]

Note that Cascalog is compatible with Clojure 1.2.0, 1.2.1, 1.3.0, 1.4.0, and 1.5.1.

Compatibility

Cascading

Cascalog is a DSL that integrates Cascading with the Clojure programming language. Because Cascalog is built on top of Cascading, Cascalog-based code can be combined with your other Cascading applications and flows.

Driven

Cascalog applications will work with Driven. You can build your applications with Cascalog and visualize them in Driven, just like any other Cascading application.