Enterprise Data Workflows with Cascading – Streamlined Enterprise Data Management and Analysis

Enterprise Data Workflows with Cascading
170 Pages
ISBN 9781449358723

There is an easier way to build Hadoop applications. With this hands-on book, you’ll learn how to use Cascading, the open source abstraction framework for Hadoop that lets you easily create and manage powerful enterprise-grade data processing applications—without having to learn the intricacies of MapReduce.

Working with sample apps based on Java and other JVM languages, you’ll quickly learn Cascading’s streamlined approach to data processing, data filtering, and workflow optimization. This book demonstrates how this framework can help your business extract meaningful information from large amounts of distributed data.

Paco Nathan

About Paco Nathan (author, Santa Rosa, California)

Director of the Learning Group at O'Reilly Media, where the team focuses on applications of AI in media. Also an advisor for Amplify Partners.

Paco has 30+ years of tech industry experience, ranging from Bell Labs to early-stage start-ups. Known as a "player/coach" data scientist, with core expertise in machine learning, distributed systems, functional programming, and cloud computing.

Cited in 2015 as one of the Top 30 People in Big Data and Analytics by Innovation Enterprise.