Cascading Pattern is an extension to Cascading that provides several machine learning scoring algorithms and a utility for translating Predictive Model Markup Language (PMML) documents into applications on Apache Hadoop. Now you can deploy predictive models onto Hadoop, or use the Cascading Pattern Java API to deploy your own models or sophisticated ensembles.
Pattern Benefits
- Quickly deploy machine learning scoring applications at scale on Apache Hadoop in as few as four lines of code
- Leverage the intellectual property in your existing predictive models, along with your investments in predictive modeling tooling and core competencies
- Accelerate application development and testing
- Unlock accessibility to Hadoop
Supported Predictive Model Types
- Hierarchical Clustering
- K-Means Clustering
- Linear Regression
- Logistic Regression
- Random Forest Algorithm
Using Maven
Pattern is currently under active development. It is available as source in the Pattern project or as Maven artifacts on Conjars.
To add the Conjars repository:
<repository>
  <id>conjars.org</id>
  <url>http://conjars.org/repo</url>
</repository>
To include the Pattern core library:
<dependency>
  <groupId>cascading</groupId>
  <artifactId>pattern-core</artifactId>
  <version>1.0.0-wip-45</version>
</dependency>
To include the Pattern PMML library:
<dependency>
  <groupId>cascading</groupId>
  <artifactId>pattern-pmml</artifactId>
  <version>1.0.0-wip-45</version>
</dependency>
To include the Pattern Hadoop library:
<dependency>
  <groupId>cascading</groupId>
  <artifactId>pattern-hadoop</artifactId>
  <version>1.0.0-wip-45</version>
</dependency>
To include the Pattern local library:
<dependency>
  <groupId>cascading</groupId>
  <artifactId>pattern-local</artifactId>
  <version>1.0.0-wip-45</version>
</dependency>
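For orientation, the snippets above could be combined in a single pom.xml roughly as follows. This is a minimal sketch rather than an official project template. It assumes a PMML-driven application targeting Hadoop, so it includes only the core, PMML, and Hadoop artifacts. The `pattern.version` property name is an illustrative assumption that keeps the version in one place, and it is not something Pattern itself defines.

```xml
<!-- Fragment of a pom.xml: place these elements inside the top-level <project> element. -->

<properties>
  <!-- Hypothetical property name, used here only to avoid repeating the version -->
  <pattern.version>1.0.0-wip-45</pattern.version>
</properties>

<repositories>
  <!-- Conjars hosts the Pattern artifacts -->
  <repository>
    <id>conjars.org</id>
    <url>http://conjars.org/repo</url>
  </repository>
</repositories>

<dependencies>
  <!-- Pattern core library -->
  <dependency>
    <groupId>cascading</groupId>
    <artifactId>pattern-core</artifactId>
    <version>${pattern.version}</version>
  </dependency>
  <!-- Pattern PMML library -->
  <dependency>
    <groupId>cascading</groupId>
    <artifactId>pattern-pmml</artifactId>
    <version>${pattern.version}</version>
  </dependency>
  <!-- Pattern Hadoop library -->
  <dependency>
    <groupId>cascading</groupId>
    <artifactId>pattern-hadoop</artifactId>
    <version>${pattern.version}</version>
  </dependency>
</dependencies>
```

To work with the Pattern local library instead, the `pattern-hadoop` dependency could presumably be swapped for `pattern-local`, using the same group ID and version.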
Compatibility
Cascading
Pattern integrates Cascading with the PMML format. Because Pattern is built on top of Cascading, any Pattern-based code will work with your other Cascading applications and flows.
Driven
Pattern applications will work with Driven. You can build your applications with Pattern and visualize them in Driven, just like any other Cascading application.