Hadoop

Presentation

Training overview

Hadoop Common, HDFS, YARN, MapReduce, Oozie, Pig, Hive, HBase. The features of the Hadoop framework. The different versions. Distributions: Apache, Cloudera, Hortonworks, EMR, MapR. Specifics of each distribution. Architecture and operating principles. Terminology: NameNode, DataNode, ResourceManager, NodeManager. Role of the different components.

Goals

– Understand the architecture of a Hadoop system.
– Describe the main services, their configuration, and the security and operation of a cluster.
– Review the different software components used to manipulate big data (MapReduce, Pig, Hive, Sqoop).

Target audience

– Technical directors
– Project managers
– Architects
– Consultants
– DBAs
– Application developers

Program

    1. Study of configuration files
    – User management for hdfs and yarn daemons
    – Access rights on executables and directories
    – Architecture and management of Hadoop general services
    – HDFS
    – YARN
    – MapReduce
    – HBase

    2. Hadoop cluster monitoring
    – Load and log monitoring (JConsole)
    – Node management and JMX access
    – Implementation of a JMX client
    – HDFS Administration
    – File storage, fsck, dfsadmin
    – Centralized cache management with cacheadmin

    3. Security
    – Enabling security with Kerberos in core-site.xml and in hdfs-site.xml for NameNode and DataNode.
    – Security management with Apache Sentry

    4. Operation
    – Supervision of elements by the NodeManager
    – Graphical monitoring with Ambari, Kibana, Cloudera Manager
    – Visualization of alerts in case of unavailability of a node
    – Configuring logs with log4j

    5. HDFS
    – Architecture
    – SHELL commands

    6. MapReduce
    – MapReduce Architecture
    – Running a MapReduce job
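
As an illustration of the MapReduce model covered in this module, the sketch below runs the map, shuffle, and reduce phases of a word count on an in-memory list of lines. It is plain Python and does not use the Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example. On a real cluster the same phases would be distributed across nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, used only to exercise the three phases.
    sample = ["Hadoop stores data in HDFS", "YARN schedules jobs on Hadoop"]
    print(word_count(sample))
```

Keeping the three phases as separate functions mirrors how a MapReduce job is usually described: the mapper and reducer hold the user's logic, while the grouping step in between is handled by the framework.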

    7. HUE
    – Introduction
    – Features and use
    – HBASE
    – Architecture
    – SHELL commands
    – Creation of databases, tables, column families
    – Querying data

    8. HIVE
    – Architecture
    – Hive Access Methods
    – HiveQL
    – Creation of databases, tables, views
    – Querying data using HiveQL
    – Working with user-defined functions (UDFs)
    – Partitioning your data
    – Archiving your data
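
To make the partitioning topic concrete: Hive stores each partition of a table in its own subdirectory named `column=value` under the table directory. The sketch below reproduces only that naming convention in plain Python. It does not call Hive, and the helper names, the `/warehouse/sales` path, and the sample rows are assumptions made for this example.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def partition_path(table_dir, partition_columns, row):
    """Build the Hive-style directory for a row, e.g. .../year=2024/month=1."""
    path = PurePosixPath(table_dir)
    for column in partition_columns:
        path = path / f"{column}={row[column]}"
    return str(path)


def group_rows_by_partition(table_dir, partition_columns, rows):
    """Group rows by the partition directory they would be written to."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_path(table_dir, partition_columns, row)].append(row)
    return dict(partitions)


if __name__ == "__main__":
    # Hypothetical table location and rows, for illustration only.
    rows = [
        {"year": 2024, "month": 1, "amount": 10},
        {"year": 2024, "month": 2, "amount": 25},
        {"year": 2024, "month": 1, "amount": 7},
    ]
    for directory, members in group_rows_by_partition(
        "/warehouse/sales", ["year", "month"], rows
    ).items():
        print(directory, len(members))
```

Because a query that filters on the partition columns only needs to read the matching directories, the choice of partition columns has a direct effect on how much data is scanned.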

    9. PIG
    – Introduction
    – Methods of execution
    – Pig Latin
    – Communication between Pig and Hive

    10. SQOOP
    – Introduction
    – Use cases
    – Methods of use
    – Import and export of data

    11. OOZIE
    – Introduction
    – Scheduling parameterized workflows


Means of contact

  1. Telephone

    +216 96 803 221

  2. Email

    contact@upgradetek-engineering.com

  3. Whatsapp



    Upgradetek Engineering is a strategy and management consulting firm, specialized in the transformation of financial institutions. As one of the leaders of this sector in Tunisia, we have been supporting our banking and financial clients for more than 14 years in the evolution of their business model, in defining and implementing new target business models and improving their performance.

    Address
    23, Avenue of Naplouse 1001 Tunis, Tunisia
    Phone
    +216 71 33 93 95