Powered By
Here are a few of the many companies using Cascading in production:
Adknowledge
Adknowledge is an ad network that provides an online pay-per-click marketplace for high-quality traffic across multiple channels of email, web and search engine inventory.
Adknowledge currently uses Cascading for ad-hoc queries (internal business intelligence) and to develop clickstream analytics, using a data warehouse based on HDFS.
BackType
BackType is a real-time, conversational search engine. We index and connect millions of conversations from blogs, social networks and other social media so you can find out what people are saying about the topics that interest you.
Read about how BackType uses Cascading on their tech blog. BackType engineers are the authors of Cascading-DBMigrate and Cascalog.
Bixolabs
Bixolabs is an elastic web mining platform that makes it easy to create web mining apps, so customers can focus on what they know best - using the data - without the challenges of building out a reliable, scalable web crawling and data processing workflow.
Bixolabs is built on top of Hadoop, Cascading & Bixo, and runs in EC2. This makes it a flexible, scalable, on-demand solution for companies processing web data for internal use, as well as companies building products based on web mining.
Delve Networks
Delve provides a complete online video solution to manage, publish, measure, and monetize high-quality video content on the web. We power video for well-known sites in the media, sports, health, finance and other verticals.
At Delve we use Cascading in conjunction with AWS services such as EC2, S3 and Elastic MapReduce to scale video analytics for our rapidly growing collection of usage data. We are planning on leveraging it further to build additional business intelligence applications.
Etsy
Etsy's mission is to enable people to make a living making things, and to reconnect makers with buyers.
Read about how Etsy leverages Cascading on their blog in Analyzing Etsy's data with Hadoop and Cascading. The Etsy engineers also maintain the Cascading.JRuby DSL.
Feeva
Feeva has created a digital bridge between fixed and mobile service providers and the digital marketing industry. This bridge solution enhances the performance of digital marketing campaigns while maintaining the highest standards of consumer privacy.
Feeva currently uses Hadoop, HBase and Cascading in two areas.
The first is analytics: we build aggregates for an OLAP cube from detailed logs that show non-PII web activity from our partners. The second synthesizes subscriber data and third-party data directly into our HBase database for use by other processes.
FlightCaster
FlightCaster predicts flight delays. We use an advanced algorithm that scours data on every domestic flight for the past 10 years and matches it to real-time conditions.
Read about how FlightCaster predicts flight delays with Cascading on InfoQ, DataWrangling, SDTimes, and InformationWeek. Or listen to an interview on Cloud Cafe. FlightCaster engineers created and contribute to Cascading-Clojure.
Nextag
Nextag is a comparison shopping website that helps consumers make informed decisions about what and where to buy products they'll love.
Nextag uses Cascading to aggregate and analyze billions of user actions in order to improve and help scale search relevance, recommendations and reporting.
Ning
Ning is the social platform for the world's interests and passions online. Ning offers an easy-to-use service that allows people to join and create Ning Networks.
Ning's data analytics team uses Cascading for its ad-hoc log and data analysis.
OneSpot
OneSpot mines content from around the web to find the best possible content for specific communities of interest. They help content creators find a bigger audience and help web publishers become the one spot their readers need to go to find the best content.
OneSpot currently uses Cascading to generate arbitrary reports on data stored in HDFS. They are in the process of migrating their content scoring code to Cascading.
RapLeaf
Every day, people use Rapleaf to discover the information about themselves that is available on the internet. Businesses use Rapleaf's search service to better understand their customers, learn how their customers use the social web, and offer their customers new and enhanced services.
Recent blog posts by the RapLeaf engineering team: Engineering Rapleaf - Goodbye MapReduce, Hello Cascading and A new Cascading pipe - MultiGroupBy.
Razorfish
Razorfish, a digital advertising and marketing firm, segments users and customers based on the collection and analysis of non-personally identifiable data from browsing sessions.
Read about how Razorfish leverages Cascading and AWS in the Razorfish AWS Case Study.
StumbleUpon
StumbleUpon helps you discover and share great websites.
StumbleUpon uses Cascading to manage data stored in an Apache HBase cluster.
Twitter
Twitter is a website that offers a social networking and microblogging service, enabling its users to send and read messages.
Twitter uses Cascading to filter and preprocess data on their Hadoop cluster as it is loaded directly into a Cassandra cluster.
Veoh
Veoh is a revolutionary Internet TV service that gives viewers the power to easily discover, watch, and personalize their online viewing experience.
Veoh developers have been using Cascading since its initial public release in early 2008.
VideoEgg
VideoEgg is a new kind of rich media advertising network that guarantees brand engagement. Our network consists of over 100 million uniques across hundreds of leading sites, blogs, and gaming sites, as well as social and mobile applications.
We are currently using Cascading to process hundreds of gigabytes of daily logs before importing them into our Hive-based data warehouse. With Cascading, we can implement an otherwise complex set of map-reduce ETL tasks as a simple Cascade of reusable and easily extensible Flows.
Visible Technologies
Visible Technologies is a leading provider of online brand management solutions for companies and individuals in today's rapidly changing new media environment.
We're using Cascading to manage workflow between all of our algorithms. It abstracts your calculations, processing, and workflow. It's very, very nice because it saves quite a bit of time writing 'pipeline' code. --Bradford Stephens
If you would like your organization listed, drop us a note.