Best Pivotal HD & Hadoop Training and Industrial Training in Jalandhar

Pivotal HD & Hadoop

Course Introduction

  1. Introductions and course logistics
  2. Course objectives

Introduction to NoSQL

  1. The NoSQL paradigm
  2. NoSQL and scalability
  3. *No SQL* versus *Not Only SQL*
  4. Types of NoSQL databases (key-value store, graph, MapReduce…)

Introduction to Hadoop

  1. What is Hadoop?
  2. The Hadoop ecosystem: Pig, Hive, HBase, ZooKeeper…
  3. Understanding MapReduce and HDFS (Hadoop Distributed File System)
  4. Ensuring data integrity (checksums…)
  5. Saving space: input/output compression in Hadoop
  6. Launching a Hadoop job
  7. Configuring the Hadoop runtime

Hadoop Distributed File System

  1. Design goals: ability to run on commodity hardware, be fault tolerant…
  2. Scaling from one datanode to hundreds of datanodes
  3. HDFS commands
  4. Working with file paths
  5. HDFS administration (UI, admin commands…)
  6. Working with the Java API for HDFS
  7. Working with a Secondary NameNode, Federated NameNodes and High Availability NameNodes

Getting started with MapReduce

  1. MapReduce overview
  2. Hadoop versions
  3. Writing a mapper
  4. Writing a reducer
  5. Debugging and testing
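
As a rough illustration of the mapper and reducer topics above, here is a minimal word-count sketch in plain Python. It does not use Hadoop at all: the `mapper`, `reducer`, and `run_job` names are hypothetical helpers invented for this example, and the sort-and-group step only imitates the shuffle that the framework performs between the two phases.

```python
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Reduce phase: sum all counts that were emitted for one word."""
    return word, sum(counts)


def run_job(lines):
    """Run map, shuffle/sort, and reduce in a single process (no Hadoop)."""
    # Map: apply the mapper to every input record.
    pairs = [pair for line in lines for pair in mapper(line)]
    # Shuffle/sort: bring identical keys next to each other.
    pairs.sort(key=itemgetter(0))
    # Reduce: call the reducer once per distinct key.
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in groupby(pairs, key=itemgetter(0))
    )


result = run_job(["the quick brown fox", "the lazy dog", "The end"])
print(result)
```

Keeping the mapper and reducer as small pure functions like this is also what makes them easy to unit test before they are run on a cluster.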

In-depth MapReduce

  1. The Writable hierarchy
  2. Partitioners, Combiners, Shuffle
  3. Reusing objects and optimizing garbage collection
  4. MapReduce restrictions
  5. Joins (map-side and reduce-side)
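
The partitioner and combiner topics above can be sketched in a few lines of plain Python. This is an assumption-laden illustration, not Hadoop code: `partition` and `combine` are hypothetical names, and `zlib.crc32` stands in for the key hash (Hadoop's default `HashPartitioner` uses the key's `hashCode()` modulo the number of reduce tasks).

```python
import zlib
from collections import Counter


def partition(key, num_reducers):
    """Pick a reducer for a key: stable hash modulo the reducer count."""
    # crc32 is used here only because it is deterministic across runs.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def combine(pairs):
    """Combiner: pre-aggregate (word, count) pairs on the map side."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())


# Output of one hypothetical map task, before it is sent over the network.
map_output = [("a", 1), ("b", 1), ("a", 1), ("a", 1)]

# The combiner shrinks four pairs to two before the shuffle.
combined = combine(map_output)

# The partitioner decides which reducer receives each remaining pair.
buckets = {}
for word, count in combined:
    buckets.setdefault(partition(word, 2), []).append((word, count))
```

The point of the combiner is to cut shuffle traffic, which only works when the reduce operation (here, addition) can safely be applied to partial results.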

Spring Data Hadoop

  1. The Writable hierarchy
  2. Partitioners, Combiners, Shuffle
  3. Reusing objects and optimizing garbage collection
  4. MapReduce restrictions
  5. Joins (map-side and reduce-side)

Streaming MapReduce, Pig

  1. High-level alternatives to writing Java Mappers and Reducers
  2. Hadoop streaming
  3. Pig scripting
  4. SQL in Hadoop

Introduction to Hive

  1. Hive overview
  2. Hive tables and DDL
  3. Partitions and external tables
  4. Selecting data
  5. Joins
  6. Transforms & User Defined Functions (UDFs)

Pivotal HD architecture

  1. Apache Hadoop Components
  2. HAWQ
  3. Data Loader
  4. Command Center
  5. Hadoop Virtualization Extensions (HVE)

Getting started with HAWQ

  1. HAWQ Installation and Environment
  2. Configuration and Operation Overview
  3. Client access to HAWQ
  4. Introduction to HAWQ SQL
  5. Quick introduction to Spring JDBC and Test Support

Working with HAWQ

  1. Creating database tables
  2. Queries
  3. Joins
  4. Functions

HAWQ external tables

  1. External Tables overview
  2. Loading data with gpfdist/gpload
  3. External tables with PXF
  4. Loading & unloading data recap
  5. Hadoop and Sqoop

HAWQ practical considerations

  1. Query Plans
  2. Using ANALYZE and EXPLAIN
  3. Distributions and partitioning
  4. Data storage and I/O