Nathan Marz has just announced and released Cascalog.
Cascalog is an interactive query language for Hadoop with a focus on simplicity, expressiveness, and flexibility, intended for analysts and developers alike.
Cascalog eschews SQL syntax in favor of a simpler and more expressive syntax based on Datalog.
Cascalog can query existing data stores “out of the box”: no data “importing” or “under the hood” configuration is required.
Because Cascalog sits on top of Clojure, a powerful JVM-based language with an interactive shell, adding a new operation to a query is as simple as defining a new function.
Cascalog also relies on Cascading, a robust data processing API and query planner.
Here is the canonical “word count” query in Cascalog:
(?<- (stdout) [?word ?count] (sentence ?s) (split ?s :> ?word) (c/count ?count))
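The query refers to a `sentence` source and a `split` operation that this post does not define. As a rough sketch of how the pieces might fit together at the REPL, assuming Cascalog is on the classpath and that `cascalog.api` and `cascalog.ops` are the relevant namespaces: the `example.wordcount` namespace and the in-memory `sentence` data below are made up for illustration, and `split` is written with `defmapcatop` as one plausible way to define such an operation.

```clojure
(ns example.wordcount            ; hypothetical namespace name
  (:use cascalog.api)
  (:require [cascalog.ops :as c]))

;; Illustrative in-memory source: a vector of 1-tuples, one sentence each.
;; A real job would more likely read from a tap over files on HDFS.
(def sentence
  [["hello world"]
   ["hello cascalog"]])

;; A custom operation is just a function body: this one emits one
;; output tuple per whitespace-separated token of its input string.
(defmapcatop split [s]
  (seq (.split s "\\s+")))

;; ?<- defines the query and runs it immediately.
;; (stdout) is the sink, [?word ?count] are the output variables, and
;; the remaining forms are the predicates that constrain those variables.
(?<- (stdout) [?word ?count]
     (sentence ?s)
     (split ?s :> ?word)
     (c/count ?count))
```

Because `?word` is the only non-aggregated output variable, the count is grouped by word implicitly, which is why no explicit GROUP BY clause appears in the query.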
You can check out an introductory blog post here:
The project is hosted here: http://github.com/nathanmarz/cascalog