Thursday, November 22, 2018

Training Big Data With Apache Hadoop

Big Data with Apache Hadoop

Duration : 4 Days (09.00 – 16.00)
Venue  & Price : www.purnamaacademy.com 
Registration : www.purnamaacademy.com 


Description :
Hadoop is an open source, Java-based framework (platform), released under the Apache license, for supporting applications that run on Big Data. Its design is based on the MapReduce and Google File System (GFS) papers published by Google.

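Hadoop was originally developed by Doug Cutting and Mike Cafarella around 2005, growing out of the Apache Nutch project. Cutting joined Yahoo in 2006, where much of Hadoop's early development took place. The name Hadoop comes from a toy elephant that belonged to Doug Cutting's son.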

Key points about Hadoop :
1. Hadoop is an open source, Java-based framework/platform
2. Hadoop is released under the Apache license
3. Hadoop supports applications that run on Big Data
4. Hadoop was originally developed by Doug Cutting and Mike Cafarella
5. Hadoop builds on the ideas of Google MapReduce and the Google File System (GFS)
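
To give a feel for the MapReduce idea mentioned above, here is a minimal word-count sketch in plain Python. It does not use Hadoop or any Hadoop API. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. On a real cluster the map and reduce steps would run in parallel on many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data with hadoop", "hadoop stores big data"]
    print(word_count(sample))
```

In this sketch everything runs in one process. The point is only the shape of the model: independent map calls, a grouping step by key, then independent reduce calls.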

Hadoop is well suited to handling large volumes of data, whether structured, semi-structured, or unstructured. Hadoop replicates data across several computers in a cluster, so if one computer fails or has a problem, the data can still be processed from one of the other computers that is still running.
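
The replication idea can be illustrated with a small toy simulation in plain Python. This is a sketch under stated assumptions, not how HDFS is implemented: the helper names (`place_blocks`, `readable_blocks`) are hypothetical, and the round-robin placement is a simplification (real HDFS placement is rack-aware). The replication factor of 3 matches the HDFS default.

```python
def place_blocks(block_ids, nodes, replication=3):
    """Assign each block to `replication` distinct nodes (simple round-robin).

    Assumption: round-robin stands in for the real, rack-aware placement policy.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement = {}
    for i, block in enumerate(block_ids):
        placement[block] = [nodes[(i + r) % len(nodes)] for r in range(replication)]
    return placement


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one replica on a live node."""
    failed = set(failed_nodes)
    return {
        block
        for block, replicas in placement.items()
        if any(node not in failed for node in replicas)
    }


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3", "node4"]
    placement = place_blocks(["b0", "b1", "b2", "b3"], nodes)
    # One node fails: every block is still readable from another replica.
    print(sorted(readable_blocks(placement, ["node1"])))
```

With three replicas spread over four nodes, losing any one or two nodes still leaves every block readable in this toy model. Data is lost only when all nodes holding a block's replicas fail together.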

Topics include:

Introduction to Hadoop and Big Data:

• What is Big Data?

• What are the challenges for processing big data?

• What technologies support big data?

• What is Hadoop?

• Why Hadoop?

• History of Hadoop

• Use cases of Hadoop

• RDBMS vs Hadoop

• When to use and when not to use Hadoop

• Ecosystem tour

• Vendor comparison

• Hardware Recommendations & Statistics


HDFS: Hadoop Distributed File System:

• Significance of HDFS in Hadoop

• Features of HDFS

• 5 daemons of Hadoop

• Data Storage in HDFS

    • Introduction to blocks
    • Data replication
• Accessing HDFS

    • CLI (Command Line Interface) and admin commands
    • Java-based approach
• Fault tolerance

• Download Hadoop

• Installation and set-up of Hadoop


    • Start-up and shutdown process
• HDFS Federation

Map Reduce:

• Map Reduce Story

• Map Reduce Architecture

• How Map Reduce works

• Developing Map Reduce

• Map Reduce Programming Model

• Creating Input and Output Formats in Map Reduce Jobs

PIG:

• Introduction to Apache Pig

• Map Reduce Vs. Apache Pig

• SQL vs. Apache Pig

• Different data types in Pig

• Modes of Execution in Pig

• Grunt shell

• Loading data

• Exploring Pig

• Pig Latin commands

HIVE:

• Hive introduction

• Hive architecture

• Hive vs RDBMS

• HiveQL and the shell

• Managing tables (external vs managed)

• Data types and schemas

• Partitions and buckets

HBASE:

• Architecture and schema design

• HBase vs. RDBMS

• HMaster and Region Servers

• Column Families and Regions

• Write pipeline

• Read pipeline

• HBase commands

Flume

SQOOP

Participants : System Architect, Database Administrator, IT Developer, IT Manager, CTO, CIO



