18 Apr

Open Software Integrators Presents a Two-day Intro to Hadoop Training

Summary: This two-day training introduces the Hadoop ecosystem. You will learn the essentials of the Hadoop Distributed File System (HDFS) and the MapReduce framework, as well as Pig (a high-level programming language designed for data processing) and Hive (data warehousing). Hands-on labs let you apply what you have learned.

The cost is $159.24 (including fees) for this event.

Provided:
Participants will receive a USB key with training materials and labs

Requirements:
– Laptop with the following specifications:
    6GB RAM
    At least 50GB free space on HD
– Familiarity with Java
– VirtualBox installed

Outline:
Day 1–
Hadoop Overview
  Ecosystem
  Use Cases
Setup for Lab
  Importing VM into VirtualBox
  Installing Java and Maven
  Setting up .bash_profile
HDFS
  Overview
    Introduction
    Architecture and Concepts
    Access Options
    Overview Exercise
  Installation and Shell
    Installation
    NameNode Safe Mode
    Secondary NameNode
    Hadoop File System Shell
  Installation and Shell Lab
    Pseudo-Distributed Mode
    Configuration Files
  Java API
    Introduction
    Configuration
    Reading Data
    Writing Data
    Browsing File System
  Java API Lab
MapReduce
  Introduction and Installation
    Introduction
    MapReduce Model
    YARN and MapReduce 2.0 Daemons
    MapReduce on YARN Installation
    MapReduce and YARN Command Line Tools

Day 2–
MapReduce
  Introduction and Installation Lab
    Pseudo-Distributed Mode
  Developing First Job
    Introduction to MapReduce Framework
    Implement First Job
  Developing First Job Lab
  Running Jobs
    Tool, ToolRunner and GenericOptionsParser
    Running MapReduce Locally
    Running MapReduce on Cluster
    Packaging MapReduce Jobs
    MapReduce CLASSPATH
    Submitting Jobs
    Logs and Web UI
  Running Jobs Lab
Pig
  Overview
  Execution Modes
  Installation
  Pig Latin Basics
  Developing Pig Script
  Resources
  Pig Lab
Hive
  Overview and Concepts
  Installation
  Table Creation and Deletion
  Loading Data into Hive
  Partitioning
  Bucketing
  Joins
  Hive Lab

Instructor: Indika Kotakadeniya, Staff Development Manager & Senior Consultant at Open Software Integrators

Indika is a Senior Developer at Open Software Integrators. His diverse background includes configuration management, agile methodologies, and system administration. He is a big believer in open source software and Linux, and in being process-oriented within reason. Indika is also passionate about cricket and salsa.
