Cloudera Developer for Apache Hadoop
Xebia's four-day developer training course delivers the key concepts and expertise participants need to create robust data processing applications using Apache Hadoop. From workflow implementation and working with APIs through writing MapReduce code and executing joins, Cloudera's training course is the best preparation for the real-world challenges faced by Hadoop developers.
Programme and Course Overview
Xebia University delivers a developer-focused Cloudera Certified training course that closely analyzes Hadoop's structure and provides hands-on exercises that teach you how to import data from existing sources; process data with a variety of techniques such as Java MapReduce programs and Hadoop Streaming jobs; and work with Apache Hive and Pig.
Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Hadoop ecosystem, learning topics such as:
- The internals of MapReduce and HDFS and how to write MapReduce code
- Best practices for Hadoop development, debugging, and implementation of workflows and common algorithms
- How to leverage Hive, Pig, Sqoop, Flume, Oozie, and other Hadoop ecosystem projects
- Creating custom components such as WritableComparables and InputFormats to manage complex data types
- Writing and executing joins to link data sets in MapReduce
- Advanced Hadoop API topics required for real-world data analysis
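To give a flavour of the exercises, the classic first MapReduce program is a word count. The sketch below simulates the map, shuffle, and reduce phases in plain Python rather than using the Hadoop API, so it runs anywhere; in a real Hadoop Streaming job the mapper and reducer would read from stdin and print tab-separated key/value lines, with Hadoop performing the sort between them. The input lines are illustrative only.

```python
# In-memory sketch of the MapReduce word-count data flow (not the Hadoop
# API): map emits (key, value) pairs, the framework sorts them by key,
# and reduce aggregates each run of identical keys.

def map_phase(lines):
    """Emit a (word, 1) pair for every word, like a Streaming mapper."""
    for line in lines:
        for word in line.strip().lower().split():
            yield (word, 1)

def reduce_phase(pairs):
    """Sum values per key; assumes pairs arrive sorted by key, as
    Hadoop guarantees after the sort-and-shuffle step."""
    counts = {}
    current, total = None, 0
    for key, value in pairs:
        if key != current:
            if current is not None:
                counts[current] = total
            current, total = key, 0
        total += value
    if current is not None:
        counts[current] = total
    return counts

lines = ["the quick brown fox", "the lazy dog"]
shuffled = sorted(map_phase(lines))  # stands in for Hadoop's shuffle
print(reduce_phase(shuffled))        # {'brown': 1, ..., 'the': 2}
```

The same two functions, reading stdin and writing stdout, are all a Hadoop Streaming job needs, which is why Streaming is covered alongside the Java API.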
Certification is a great differentiator; it helps establish you as a leader in the field, providing employers and customers with tangible evidence of your skills and expertise.
Learn more about the CCDH Certification Exam here: http://cloudera.com/content/cloudera/en/training/certification/ccdh.html
Target Group & Prerequisites:
This course is best suited to developers and engineers who have programming experience. Knowledge of Java is strongly recommended and is required to complete the hands-on exercises.
Key Promises of this Training
- The core technologies of Hadoop.
- How HDFS and MapReduce work.
- How to develop MapReduce applications.
- How to unit test MapReduce applications.
- How to use MapReduce combiners, partitioners and the distributed cache.
- Best practices for developing and debugging MapReduce applications.
- How to implement data input and output in MapReduce applications.
- Algorithms for common MapReduce tasks.
- How to join data sets in MapReduce.
- How Hadoop integrates into the data center.
- How Hive, Impala and Pig can be used for rapid application development.
- How to create large workflows using Oozie.
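Joining data sets in MapReduce, promised above, typically means a reduce-side join: each record is tagged with its source in the map phase, and the shuffle groups records sharing a key so the reducer can combine them. The sketch below simulates that flow in plain Python; the table contents and names are made up for illustration.

```python
# Plain-Python sketch of a reduce-side join (illustrative data).
# Map tags each record with its source table; reduce pairs up the
# tagged records that share a key, as Hadoop's shuffle would group them.

users = {1: "alice", 2: "bob"}                   # user_id -> name
orders = [(1, "book"), (1, "pen"), (2, "mug")]   # (user_id, item)

def map_phase():
    """Tag every record with its origin so the reducer can tell them apart."""
    for uid, name in users.items():
        yield (uid, ("user", name))
    for uid, item in orders:
        yield (uid, ("order", item))

def reduce_phase(tagged):
    """Group by key (the shuffle's job), then cross the two sides."""
    by_key = {}
    for key, value in tagged:
        by_key.setdefault(key, []).append(value)
    joined = []
    for values in by_key.values():
        names = [v for tag, v in values if tag == "user"]
        items = [v for tag, v in values if tag == "order"]
        for name in names:
            for item in items:
                joined.append((name, item))
    return sorted(joined)

print(reduce_phase(map_phase()))
# [('alice', 'book'), ('alice', 'pen'), ('bob', 'mug')]
```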
Course Outline
- The Motivation for Hadoop
- Hadoop: Basic Concepts and HDFS
- Introduction to MapReduce
- Hadoop Clusters and the Hadoop Ecosystem
- Writing a MapReduce Program in Java
- Writing a MapReduce Program Using Streaming
- Delving Deeper into the Hadoop API
- Practical Development Tips and Techniques
- Partitioners and Reducers
- Data Input and Output
- Implementing Custom InputFormats and OutputFormats
- Joining Data Sets in MapReduce Jobs
- Integrating Hadoop into the Enterprise Workflow
- An Introduction to Hive, Impala, and Pig
- An Introduction to Oozie
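One module-level example: the combiners covered in the partitioners-and-reducers material exist to pre-aggregate map output on each node before the shuffle, shrinking what crosses the network. The sketch below illustrates the idea in plain Python with made-up counts; in Hadoop a combiner is often the reducer class reused locally.

```python
# Why combiners matter: local aggregation of one mapper's output
# before the shuffle reduces network traffic (illustrative data).

def combine(pairs):
    """Per-mapper local aggregation; here, the same logic a reducer
    would apply, run early on a single node's output."""
    out = {}
    for key, value in pairs:
        out[key] = out.get(key, 0) + value
    return sorted(out.items())

# One mapper's raw output: five records for "the", one for "fox".
mapper_output = [("the", 1)] * 5 + [("fox", 1)]
print(len(mapper_output))           # 6 records would cross the network
print(len(combine(mapper_output)))  # only 2 after combining
```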
Please note that you need to bring your own laptop for this training. The laptop should meet the following requirements:
- At least 4GB RAM;
- 15GB of free hard disk space;
- VMware Player 5.x or above (Windows) / VMware Fusion 4.x or above (Mac);
- Your laptop must support a 64-bit VMware guest image. If the machine is running a 64-bit version of Windows, or Mac OS X on a Core 2 Duo processor or later, no further test is required. Otherwise, VMware provides a tool to check compatibility, which can be downloaded from http://tiny.cloudera.com/training2;
- Your laptop must have VT-x virtualization support enabled in the BIOS;
- If running Windows XP: 7-Zip or WinZip is needed (due to a bug in Windows XP's built-in Zip utility).