Srbija
Posted November 16, 2022

Practical Guide To Setup Hadoop And Spark Cluster Using CDH
Last updated 6/2019
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
Language: English | Size: 10.02 GB | Duration: 20h 56m

Step by step instructions to set up a Hadoop and Spark Cluster using Cloudera Distribution of Hadoop (Formerly CCA 131)

What you'll learn
Learn Hadoop and Spark Administration using CDH
Provision Cluster from GCP (Google Cloud Platform) to set up Hadoop and Spark Cluster using CDH
Set up Ansible for server automation to handle the pre-requisites for a Hadoop and Spark Cluster using CDH
Set up an 8 node cluster from scratch using CDH
Understand Architecture of HDFS, YARN, Spark, Hive, Hue and many more

Requirements
Basic Linux Skills
A 64-bit computer with a minimum of 4 GB RAM
Operating System - Windows 10 or Mac or Linux Flavor

Description
Cloudera is one of the leading vendors of distributions related to Hadoop and Spark. As part of this Practical Guide, you will learn the step by step process of setting up a Hadoop and Spark Cluster using CDH.

Install - Demonstrate an understanding of the installation process for Cloudera Manager, CDH, and the ecosystem projects.
Set up a local CDH repository
Perform OS-level configuration for Hadoop installation
Install Cloudera Manager server and agents
Install CDH using Cloudera Manager
Add a new node to an existing cluster
Add a service using Cloudera Manager

Configure - Perform basic and advanced configuration needed to effectively administer a Hadoop cluster
Configure a service using Cloudera Manager
Create an HDFS user's home directory
Configure NameNode HA
Configure ResourceManager HA
Configure proxy for Hiveserver2/Impala

Manage - Maintain and modify the cluster to support day-to-day operations in the enterprise
Rebalance the cluster
Set up alerting for excessive disk fill
Define and install a rack topology script
Install new type of I/O compression library in cluster
Revise YARN resource assignment based on user feedback
Commission/decommission a node

Secure - Enable relevant services and configure the cluster to meet goals defined by security policy; demonstrate knowledge of basic security practices
Configure HDFS ACLs
Install and configure Sentry
Configure Hue user authorization and authentication
Enable/configure log and query redaction
Create encrypted zones in HDFS

Test - Benchmark the cluster operational metrics, test system configuration for operation and efficiency
Execute file system commands via HTTPFS
Efficiently copy data within a cluster/between clusters
Create/restore a snapshot of an HDFS directory
Get/set ACLs for a file or directory structure
Benchmark the cluster (I/O, CPU, network)

Troubleshoot - Demonstrate ability to find the root cause of a problem, optimize inefficient execution, and resolve resource contention scenarios
Resolve errors/warnings in Cloudera Manager
Resolve performance problems/errors in cluster operation
Determine reason for application failure
Configure the Fair Scheduler to resolve application delays

Our Approach
You will start by creating a Cloudera QuickStart VM (if you have a laptop with 16 GB RAM and a quad-core CPU). This will help you get comfortable with Cloudera Manager.
You will be able to sign up for GCP and receive up to $300 in credit while the offer lasts. Credits are valid for up to a year.
You will then get a brief overview of GCP and provision 7 to 8 Virtual Machines using templates.
You will also attach an external hard drive to configure for HDFS later.
Once servers are provisioned, you will go ahead and set up Ansible for Server Automation.
You will set up a local repository for Cloudera Manager and Cloudera Distribution of Hadoop using Packages.
You will then set up Cloudera Manager with a custom database and then Cloudera Distribution of Hadoop using the Wizard that comes as part of Cloudera Manager.
As part of setting up Cloudera Distribution of Hadoop, you will set up HDFS, learn HDFS Commands, set up YARN, configure HDFS and YARN High Availability, learn about Schedulers, set up Spark, transition to Parcels, set up Hive and Impala, set up HBase and Kafka, etc.

Overview

Section 1: Introduction - CCA 131 Cloudera Certified Hadoop and Spark Administrator
Lecture 1 Introduction to the course
Lecture 2 CCA 131 - Administrator - Official Page
Lecture 3 Understanding required skills for the certification
Lecture 4 Understanding the environment provided while taking the exam
Lecture 5 Signing up for the exam

Section 2: Getting Started - Provision instances from Google Cloud
Lecture 6 Introduction
Lecture 7 Setup Ubuntu using Windows Subsystem
Lecture 8 Sign up for GCP
Lecture 9 Create template for Big Data Server
Lecture 10 Provision Servers for Big Data Cluster
Lecture 11 Review Concepts
Lecture 12 Setting up gcloud
Lecture 13 Setup ansible on first server
Lecture 14 Format JBOD
Lecture 15 Cluster Topology

Section 3: Getting Started - Setup local yum repository server - CDH
Lecture 16 Introduction
Lecture 17 Overview of yum
Lecture 18 Setup httpd service
Lecture 19 Setup local yum repository - Cloudera Manager
Lecture 20 Setup local yum repository - Cloudera Distribution of Hadoop (CDH)
Lecture 21 Copy repo files

Section 4: Install CM and CDH - Setup CM, Install CDH and Setup Cloudera Management Service
Lecture 22 Introduction
Lecture 23 Setup Pre-requisites
Lecture 24 Install Cloudera Manager
Lecture 25 Licensing and Installation Options
Lecture 26 Install CM and CDH on all nodes
Lecture 27 CM Agents and CM Server
Lecture 28 Setup Cloudera Management Service
Lecture 29 Cloudera Management Service - Components

Section 5: Install CM and CDH - Configure Zookeeper
Lecture 30 Introduction
Lecture 31 Learning Process
Lecture 32 Setup Zookeeper
Lecture 33 Review important properties
Lecture 34 Zookeeper Concepts
Lecture 35 Important Zookeeper Commands

Section 6: Install CM and CDH - Configure HDFS and Understand Concepts
Lecture 36 Introduction
Lecture 37 Setup HDFS
Lecture 38 Copy Data into HDFS
Lecture 39 Copy Data into HDFS Contd
Lecture 40 Components of HDFS
Lecture 41 Components of HDFS Contd
Lecture 42 Configuration files and Important Properties
Lecture 43 Review Web UIs and log files
Lecture 44 Checkpointing
Lecture 45 Checkpointing Contd
Lecture 46 Namenode Recovery Process
Lecture 47 Configure Rack Awareness

Section 7: Install CM and CDH - Important HDFS Commands
Lecture 48 Introduction
Lecture 49 Getting list of commands and help
Lecture 50 Creating Directories and Changing Ownership
Lecture 51 Managing Files and File Permissions - Deleting Files from HDFS
Lecture 52 Managing Files and File Permissions - Copying Files Local File System and HDFS
Lecture 53 Managing Files and File Permissions - Copying Files within HDFS
Lecture 54 Managing Files and File Permissions - Previewing Data in HDFS
Lecture 55 Managing Files and File Permissions - Changing File Permissions
Lecture 56 Controlling Access using ACLs - Enable ACLs On Cluster
Lecture 57 Controlling Access using ACLs - ACLs On Files
Lecture 58 Controlling Access using ACLs - ACLs On Directories
Lecture 59 Controlling Access using ACLs - Removing ACLs
Lecture 60 Overriding Properties
Lecture 61 HDFS usage commands and getting metadata
Lecture 62 Creating Snapshots
Lecture 63 Using CLI for administration

Section 8: Install CM and CDH - Configure YARN + MRv2 and Understand Concepts
Lecture 64 Introduction
Lecture 65 Setup YARN + MR2
Lecture 66 Run Simple Map Reduce Job
Lecture 67 Components of YARN and MR2
Lecture 68 Configuration files and Important Properties - Overview
Lecture 69 Configuration files and Important Properties - Review YARN Properties
Lecture 70 Configuration files and Important Properties - Review Map Reduce Properties
Lecture 71 Configuration files and Important Properties - Running Jobs
Lecture 72 Review Web UIs and log files
Lecture 73 YARN and MR2 CLI
Lecture 74 YARN Application Life Cycle
Lecture 75 Map Reduce Job Execution Life Cycle

Section 9: Install CM and CDH - Configuring HDFS and YARN HA
Lecture 76 Introduction
Lecture 77 High Availability - Overview
Lecture 78 Configure HDFS Namenode HA
Lecture 79 Review Properties - HDFS Namenode HA
Lecture 80 HDFS Namenode HA - Quick Recap of HDFS typical Configuration
Lecture 81 HDFS Namenode HA - Components
Lecture 82 HDFS Namenode HA - Automatic failover
Lecture 83 Configure YARN Resource Manager HA
Lecture 84 Review - YARN Resource Manager HA
Lecture 85 High Availability - Implications

Section 10: Install CM and CDH - YARN Schedulers - FIFO, Fair, and Capacity
Lecture 86 Introduction
Lecture 87 Schedulers Overview
Lecture 88 FIFO Scheduler
Lecture 89 Introduction to Fair Scheduler
Lecture 90 Configure Fair Scheduler - Configure Cluster with Fair Scheduler
Lecture 91 Configure Fair Scheduler - Running Jobs Without Specifying Queue
Lecture 92 Configure Fair Scheduler - Running Jobs Specifying Queue
Lecture 93 Configure Fair Scheduler - Important Properties
Lecture 94 Capacity Scheduler - Introduction
Lecture 95 Capacity Scheduler - Configure using Cloudera Manager
Lecture 96 Capacity Scheduler - Run Sample Jobs

Section 11: Install Other Components - Spark Overview and Installation
Lecture 97 Introduction
Lecture 98 Setup and Validate Spark 1.6.x
Lecture 99 Review Important Properties
Lecture 100 Spark Execution Life Cycle
Lecture 101 Convert Cluster to Parcels
Lecture 102 Setup Spark 2.3.x
Lecture 103 Run Spark Jobs - Spark 2.3.x

Section 12: Install Other Components - Configuring Database Engines - Hive and Impala
Lecture 104 Introduction
Lecture 105 Setup Hive and Impala
Lecture 106 Validating Hive and Impala
Lecture 107 Components and Properties of Hive
Lecture 108 Troubleshooting Hive Issues
Lecture 109 Hive Commands and Queries
Lecture 110 Different Query Engines
Lecture 111 Components and Properties of Impala
Lecture 112 Running Queries using Impala - Overview

Section 13: Install Other Components - Configure Hadoop Ecosystem components
Lecture 113 Introduction
Lecture 114 Setup Oozie, Pig, Sqoop and Hue
Lecture 115 Review Important Properties
Lecture 116 Run Sample Oozie job
Lecture 117 Run Pig Job
Lecture 118 Validate Sqoop
Lecture 119 Overview of Hue

Section 14: Install Other Components - Install and Configure Kafka and HBase
Lecture 120 Introduction
Lecture 121 Kafka Overview
Lecture 122 Setup Parcels and Add Kafka Service
Lecture 123 Validate Kafka
Lecture 124 Setting up HBase
Lecture 125 Validate HBase

Section 15: CCA 131 - Revision for the Exam - Install the Cluster
Lecture 126 Introduction
Lecture 127 Set up a local CDH Repository
Lecture 128 Perform OS-level Configuration
Lecture 129 Install Cloudera Manager Server and Agents
Lecture 130 Install CDH using Cloudera Manager
Lecture 131 Add a New Node to an Existing Cluster
Lecture 132 Install - Add Host as Worker
Lecture 133 Add a Service using Cloudera Manager

Section 16: CCA 131 - Revision for the Exam - Configure the Cluster
Lecture 134 Introduction
Lecture 135 Configure a Service using Cloudera Manager
Lecture 136 Create an HDFS user's home directory
Lecture 137 Configure NameNode HA
Lecture 138 Configure ResourceManager HA
Lecture 139 Configure proxy for HiveServer2/Impala - Install HA Proxy
Lecture 140 Configure proxy for HiveServer2
Lecture 141 Configure proxy for Impala

Section 17: CCA 131 - Revision for the Exam - Manage the Cluster
Lecture 142 Introduction
Lecture 143 Rebalance the cluster
Lecture 144 Set up alerting for excessive disk fill
Lecture 145 Define and install a rack topology script
Lecture 146 Add I/O Compression Library
Lecture 147 YARN Resource Assignment
Lecture 148 Commission/Decommission a node

Section 18: CCA 131 - Revision for the Exam - Secure the Cluster
Lecture 149 Introduction
Lecture 150 Configure HDFS ACLs
Lecture 151 Install and Configure Sentry
Lecture 152 Configure Hue user authorization and authentication
Lecture 153 Enable or Configure Log and Query Redaction
Lecture 154 Create Encrypted Zones in HDFS - Enable Encryption
Lecture 155 Create Encrypted Zones in HDFS - Create Encryption Keys and Zones

Section 19: CCA 131 - Revision for the Exam - Test and Troubleshoot the Cluster
Lecture 156 Introduction
Lecture 157 Execute file system commands via HTTPFS
Lecture 158 Efficiently copy data within a cluster
Lecture 159 Efficiently copy data between clusters
Lecture 160 Create/Restore a snapshot of an HDFS directory
Lecture 161 Get/Set ACLs for a file or directory structure
Lecture 162 Benchmark the cluster (I/O, CPU, network)
Lecture 163 Resolve errors/warnings in Cloudera Manager
Lecture 164 Resolve performance problems/errors in cluster operation

Who this course is for
System Administrators who want to understand the Big Data ecosystem and set up clusters
Experienced Big Data Administrators who want to learn how to manage Hadoop and Spark Clusters set up using CDH
Entry level professionals who want to learn the basics and set up Big Data Clusters