Srbija
Posted November 15, 2022

Spark SQL And Spark 3 Using Scala Hands-On With Labs
Last updated 2/2022
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
Language: English | Size: 8.75 GB | Duration: 24h 12m

A comprehensive course on Spark SQL as well as Data Frame APIs using Scala with complimentary lab access

What you'll learn
All the HDFS Commands that are relevant to validate files and folders in HDFS.
Enough Scala to work on Data Engineering Projects using Scala as Programming Language
Spark Dataframe APIs to solve the problems using Dataframe style APIs.
Basic Transformations such as Projection, Filtering, Total as well as Aggregations by Keys using Spark Dataframe APIs
Inner as well as outer joins using Spark Data Frame APIs
Ability to use Spark SQL to solve the problems using SQL style syntax.
Basic Transformations such as Projection, Filtering, Total as well as Aggregations by Keys using Spark SQL
Inner as well as outer joins using Spark SQL
Basic DDL to create and manage tables using Spark SQL
Basic DML or CRUD Operations using Spark SQL
Create and Manage Partitioned Tables using Spark SQL
Manipulating Data using Spark SQL Functions
Advanced Analytical or Windowing Functions to perform aggregations and ranking using Spark SQL

Requirements
Basic programming skills
Self support lab (Instructions provided) or ITVersity lab at additional cost for appropriate environment.
Minimum memory required based on the environment you are using with 64 bit operating system
4 GB RAM with access to proper clusters or 16 GB RAM with virtual machines such as Cloudera QuickStart VM

Description
As part of this course, you will learn all the key skills to build Data Engineering Pipelines using Spark SQL and Spark Data Frame APIs using Scala as a Programming language. This course used to be a CCA 175 Spark and Hadoop Developer course for the preparation of the Certification Exam.
As of 10/31/2021, the exam has been sunset and we have renamed the course to Spark SQL and Spark 3 using Scala, as it covers industry-relevant topics beyond the scope of certification.

About Data Engineering
Data Engineering is nothing but processing the data depending on our downstream needs. We need to build different pipelines such as Batch Pipelines, Streaming Pipelines, etc. as part of Data Engineering. All roles related to Data Processing are consolidated under Data Engineering. Conventionally, they are known as ETL Development, Data Warehouse Development, etc. Apache Spark has evolved into a leading technology to take care of Data Engineering at scale.

I have prepared this course for anyone who would like to transition into a Data Engineer role using Spark (Scala). I am a Data Engineering Solution Architect with proven experience in designing solutions using Apache Spark.

Let us go through the details about what you will be learning in this course. Keep in mind that the course is created with a lot of hands-on tasks which will give you enough practice using the right tools. Also, there are tons of tasks and exercises to evaluate yourself.

Setup of Single Node Big Data Cluster
Many of you would like to transition to Big Data from Conventional Technologies such as Mainframes, Oracle PL/SQL, etc. and you might not have access to Big Data Clusters. It is very important for you to set up the environment in the right manner. Don't worry if you do not have the cluster handy, we will guide you through support via Udemy Q&A.
Set up Ubuntu-based AWS Cloud9 Instance with the right configuration
Ensure Docker is set up
Set up Jupyter Lab and other key components
Set up and Validate Hadoop, Hive, YARN, and Spark

Are you feeling a bit overwhelmed about setting up the environment? Don't worry! We will provide complimentary lab access for up to 2 months. Here are the details.
Training using an interactive environment. You will get 2 weeks of lab access to begin with.
If you like the environment and acknowledge it by providing a 5* rating and feedback, the lab access will be extended by an additional 6 weeks (2 months). Feel free to send an email to [email protected] to get complimentary lab access. Also, if your employer provides a multi-node environment, we will help you set up the material for the practice as part of the live session. On top of Q&A Support, we also provide required support via live sessions.

A quick recap of Scala
This course requires a decent knowledge of Scala. To make sure you understand Spark from a Data Engineering perspective, we added a module to quickly warm up with Scala. If you are not familiar with Scala, then we suggest you go through relevant courses on Scala as Programming Language.

Data Engineering using Spark SQL
Let us deep-dive into Spark SQL to understand how it can be used to build Data Engineering Pipelines. Spark with SQL will provide us the ability to leverage distributed computing capabilities of Spark coupled with easy-to-use developer-friendly SQL-style syntax.
Getting Started with Spark SQL
Basic Transformations using Spark SQL
Managing Spark Metastore Tables - Basic DDL and DML
Managing Spark Metastore Tables - DML and Partitioning
Overview of Spark SQL Functions
Windowing Functions using Spark SQL

Data Engineering using Spark Data Frame APIs
Spark Data Frame APIs are an alternative way of building Data Engineering applications at scale leveraging distributed computing capabilities of Spark.
Data Engineers from application development backgrounds might prefer Data Frame APIs over Spark SQL to build Data Engineering applications.
Data Processing Overview using Spark Data Frame APIs leveraging Scala as Programming Language
Processing Column Data using Spark Data Frame APIs leveraging Scala as Programming Language
Basic Transformations using Spark Data Frame APIs leveraging Scala as Programming Language - Filtering, Aggregations, and Sorting
Joining Data Sets using Spark Data Frame APIs leveraging Scala as Programming Language

All the demos are given on our state-of-the-art Big Data cluster. You can avail of one-month complimentary lab access by reaching out to [email protected] with a Udemy receipt.

Overview

Section 1: Introduction
Lecture 1 CCA 175 Spark and Hadoop Developer - Curriculum

Section 2: Setting up Environment using AWS Cloud9
Lecture 2 Getting Started with Cloud9
Lecture 3 Creating Cloud9 Environment
Lecture 4 Warming up with Cloud9 IDE
Lecture 5 Overview of EC2 related to Cloud9
Lecture 6 Opening ports for Cloud9 Instance
Lecture 7 Associating Elastic IPs to Cloud9 Instance
Lecture 8 Increase EBS Volume Size of Cloud9 Instance
Lecture 9 Setup Jupyter Lab on Cloud9
Lecture 10 [Commands] Setup Jupyter Lab on Cloud9

Section 3: Setting up Environment - Overview of GCP and Provision Ubuntu VM
Lecture 11 Signing up for GCP
Lecture 12 Overview of GCP Web Console
Lecture 13 Overview of GCP Pricing
Lecture 14 Provision Ubuntu VM from GCP
Lecture 15 Setup Docker
Lecture 16 Why we are setting up Python and Jupyter Lab for Scala related course?
Lecture 17 Validating Python
Lecture 18 Setup Jupyter Lab

Section 4: Setup Hadoop on Single Node Cluster
Lecture 19 Introduction to Single Node Hadoop Cluster
Lecture 20 Setup Prerequisites
Lecture 21 [Commands] - Setup Prerequisites
Lecture 22 Setup Password less login
Lecture 23 [Commands] - Setup Password less login
Lecture 24 Download and Install Hadoop
Lecture 25 [Commands] - Download and Install Hadoop
Lecture 26 Configure Hadoop HDFS
Lecture 27 [Commands] - Configure Hadoop HDFS
Lecture 28 Start and Validate HDFS
Lecture 29 [Commands] - Start and Validate HDFS
Lecture 30 Configure Hadoop YARN
Lecture 31 [Commands] - Configure Hadoop YARN
Lecture 32 Start and Validate YARN
Lecture 33 [Commands] - Start and Validate YARN
Lecture 34 Managing Single Node Hadoop
Lecture 35 [Commands] - Managing Single Node Hadoop

Section 5: Setup Hive and Spark on Single Node Cluster
Lecture 36 Setup Data Sets for Practice
Lecture 37 [Commands] - Setup Data Sets for Practice
Lecture 38 Download and Install Hive
Lecture 39 [Commands] - Download and Install Hive
Lecture 40 Setup Database for Hive Metastore
Lecture 41 [Commands] - Setup Database for Hive Metastore
Lecture 42 Configure and Setup Hive Metastore
Lecture 43 [Commands] - Configure and Setup Hive Metastore
Lecture 44 Launch and Validate Hive
Lecture 45 [Commands] - Launch and Validate Hive
Lecture 46 Scripts to Manage Single Node Cluster
Lecture 47 [Commands] - Scripts to Manage Single Node Cluster
Lecture 48 Download and Install Spark 2
Lecture 49 [Commands] - Download and Install Spark 2
Lecture 50 Configure Spark 2
Lecture 51 [Commands] - Configure Spark 2
Lecture 52 Validate Spark 2 using CLIs
Lecture 53 [Commands] - Validate Spark 2 using CLIs
Lecture 54 Validate Jupyter Lab Setup
Lecture 55 [Commands] - Validate Jupyter Lab Setup
Lecture 56 Integrate Spark 2 with Jupyter Lab
Lecture 57 [Commands] - Integrate Spark 2 with Jupyter Lab
Lecture 58 Download and Install Spark 3
Lecture 59 [Commands] - Download and Install Spark 3
Lecture 60 Configure Spark 3
Lecture 61 [Commands] - Configure Spark 3
Lecture 62 Validate Spark 3 using CLIs
Lecture 63 [Commands] - Validate Spark 3 using CLIs
Lecture 64 Integrate Spark 3 with Jupyter Lab
Lecture 65 [Commands] - Integrate Spark 3 with Jupyter Lab

Section 6: Scala Fundamentals
Lecture 66 Introduction and Setting up of Scala
Lecture 67 Setup Scala on Windows
Lecture 68 Basic Programming Constructs
Lecture 69 Functions
Lecture 70 Object Oriented Concepts - Classes
Lecture 71 Object Oriented Concepts - Objects
Lecture 72 Object Oriented Concepts - Case Classes
Lecture 73 Collections - Seq, Set and Map
Lecture 74 Basic Map Reduce Operations
Lecture 75 Setting up Data Sets for Basic I/O Operations
Lecture 76 Basic I/O Operations and using Scala Collections APIs
Lecture 77 Tuples
Lecture 78 Development Cycle - Create Program File
Lecture 79 Development Cycle - Compile source code to jar using SBT
Lecture 80 Development Cycle - Setup SBT on Windows
Lecture 81 Development Cycle - Compile changes and run jar with arguments
Lecture 82 Development Cycle - Setup IntelliJ with Scala
Lecture 83 Development Cycle - Develop Scala application using SBT in IntelliJ

Section 7: Overview of Hadoop HDFS Commands
Lecture 84 Getting help or usage of HDFS Commands
Lecture 85 Listing HDFS Files
Lecture 86 Managing HDFS Directories
Lecture 87 Copying files from local to HDFS
Lecture 88 Copying files from HDFS to local
Lecture 89 Getting File Metadata
Lecture 90 Previewing Data in HDFS File
Lecture 91 HDFS Block Size
Lecture 92 HDFS Replication Factor
Lecture 93 Getting HDFS Storage Usage
Lecture 94 Using HDFS Stat Commands
Lecture 95 HDFS File Permissions
Lecture 96 Overriding Properties

Section 8: Apache Spark 2 using Scala - Data Processing - Overview
Lecture 97 Introduction for the module
Lecture 98 Starting Spark Context using spark-shell
Lecture 99 Overview of Spark read APIs
Lecture 100 Previewing Schema and Data using Spark APIs
Lecture 101 Overview of Spark Data Frame APIs
Lecture 102 Overview of Functions to Manipulate Data in Spark Data Frames
Lecture 103 Overview of Spark Write APIs

Section 9: Apache Spark 2 using Scala - Processing Column Data using Pre-defined Functions
Lecture 104 Introduction to Pre-defined Functions
Lecture 105 Creating Spark Session Object in Notebook
Lecture 106 Create Dummy Data Frames for Practice
Lecture 107 Categories of Functions on Spark Data Frame Columns
Lecture 108 Using Spark Special Functions - col
Lecture 109 Using Spark Special Functions - lit
Lecture 110 Manipulating String Columns using Spark Functions - Case Conversion and Length
Lecture 111 Manipulating String Columns using Spark Functions - substring
Lecture 112 Manipulating String Columns using Spark Functions - split
Lecture 113 Manipulating String Columns using Spark Functions - Concatenating Strings
Lecture 114 Manipulating String Columns using Spark Functions - Padding Strings
Lecture 115 Manipulating String Columns using Spark Functions - Trimming unwanted characters
Lecture 116 Date and Time Functions in Spark - Overview
Lecture 117 Date and Time Functions in Spark - Date Arithmetic
Lecture 118 Date and Time Functions in Spark - Using trunc and date_trunc
Lecture 119 Date and Time Functions in Spark - Using date_format and other functions
Lecture 120 Date and Time Functions in Spark - dealing with unix timestamp
Lecture 121 Pre-defined Functions in Spark - Conclusion

Section 10: Apache Spark 2 using Scala - Basic Transformations using Data Frames
Lecture 122 Introduction to Basic Transformations using Data Frame APIs
Lecture 123 Starting Spark Context
Lecture 124 Overview of Filtering using Spark Data Frame APIs
Lecture 125 Filtering Data from Spark Data Frames - Reading Data and Understanding Schema
Lecture 126 Filtering Data from Spark Data Frames - Task 1 - Equal Operator
Lecture 127 Filtering Data from Spark Data Frames - Task 2 - Comparison Operators
Lecture 128 Filtering Data from Spark Data Frames - Task 3 - Boolean AND
Lecture 129 Filtering Data from Spark Data Frames - Task 4 - IN Operator
Lecture 130 Filtering Data from Spark Data Frames - Task 5 - Between and Like
Lecture 131 Filtering Data from Spark Data Frames - Task 6 - Using functions in Filter
Lecture 132 Overview of Aggregations using Spark Data Frame APIs
Lecture 133 Overview of Sorting using Spark Data Frame APIs
Lecture 134 Solution - Get Delayed Counts using Spark Data Frame APIs - Part 1
Lecture 135 Solution - Get Delayed Counts using Spark Data Frame APIs - Part 2
Lecture 136 Solution - Getting Delayed Counts By Date using Spark Data Frame APIs

Section 11: Apache Spark 2 using Scala - Joining Data Sets
Lecture 137 Prepare and Validate Data Sets
Lecture 138 Starting Spark Session or Spark Context
Lecture 139 Analyze Data Sets for Joins using Spark Data Frame APIs
Lecture 140 Eliminate Duplicate records from Data Frame using Spark Data Frame APIs
Lecture 141 Recap of Basic Transformations using Spark Data Frame APIs
Lecture 142 Joining Data Sets using Spark Data Frame APIs - Problem Statements
Lecture 143 Overview of Joins using Spark Data Frame APIs
Lecture 144 Inner Join using Spark Data Frame APIs - Get number of flights departed from US airports
Lecture 145 Inner Join using Spark Data Frame APIs - Get number of flights departed from US States
Lecture 146 Outer Join using Spark Data Frame APIs - Get Airports - Never Used

Section 12: Apache Spark using SQL - Getting Started
Lecture 147 Getting Started with Spark SQL - Overview
Lecture 148 Overview of Spark Documentation
Lecture 149 Launching and using Spark SQL CLI
Lecture 150 Overview of Spark SQL Properties
Lecture 151 Running OS Commands using Spark SQL
Lecture 152 Understanding Spark Metastore Warehouse Directory
Lecture 153 Managing Spark Metastore Databases
Lecture 154 Managing Spark Metastore Tables
Lecture 155 Retrieve Metadata of Spark Metastore Tables
Lecture 156 Role of Spark Metastore or Hive Metastore
Lecture 157 Exercise - Getting Started with Spark SQL

Section 13: Apache Spark using SQL - Basic Transformations
Lecture 158 Basic Transformation using Spark SQL - Introduction
Lecture 159 Spark SQL - Overview
Lecture 160 Define Problem Statement for Basic Transformations using Spark SQL
Lecture 161 Prepare or Create Tables using Spark SQL
Lecture 162 Projecting or Selecting Data using Spark SQL
Lecture 163 Filtering Data using Spark SQL
Lecture 164 Joining Tables using Spark SQL - Inner
Lecture 165 Joining Tables using Spark SQL - Outer
Lecture 166 Aggregating Data using Spark SQL
Lecture 167 Sorting Data using Spark SQL
Lecture 168 Conclusion - Final Solution using Spark SQL

Section 14: Apache Spark using SQL - Basic DDL and DML
Lecture 169 Introduction to Basic DDL and DML using Spark SQL
Lecture 170 Create Spark Metastore Tables using Spark SQL
Lecture 171 Overview of Data Types for Spark Metastore Table Columns
Lecture 172 Adding Comments to Spark Metastore Tables using Spark SQL
Lecture 173 Loading Data Into Spark Metastore Tables using Spark SQL - Local
Lecture 174 Loading Data Into Spark Metastore Tables using Spark SQL - HDFS
Lecture 175 Loading Data into Spark Metastore Tables using Spark SQL - Append and Overwrite
Lecture 176 Creating External Tables in Spark Metastore using Spark SQL
Lecture 177 Managed Spark Metastore Tables vs External Spark Metastore Tables
Lecture 178 Overview of Spark Metastore Table File Formats
Lecture 179 Drop Spark Metastore Tables and Databases
Lecture 180 Truncating Spark Metastore Tables
Lecture 181 Exercise - Managed Spark Metastore Tables

Section 15: Apache Spark using SQL - DML and Partitioning
Lecture 182 Introduction to DML and Partitioning of Spark Metastore Tables using Spark SQL
Lecture 183 Introduction to Partitioning of Spark Metastore Tables using Spark SQL
Lecture 184 Creating Spark Metastore Tables using Parquet File Format
Lecture 185 Load vs. Insert into Spark Metastore Tables using Spark SQL
Lecture 186 Inserting Data using Stage Spark Metastore Table using Spark SQL
Lecture 187 Creating Partitioned Spark Metastore Tables using Spark SQL
Lecture 188 Adding Partitions to Spark Metastore Tables using Spark SQL
Lecture 189 Loading Data into Partitioned Spark Metastore Tables using Spark SQL
Lecture 190 Inserting Data into Partitions of Spark Metastore Tables using Spark SQL
Lecture 191 Using Dynamic Partition Mode to insert data into Spark Metastore Tables
Lecture 192 Exercise - Partitioned Spark Metastore Tables using Spark SQL

Section 16: Apache Spark using SQL - Pre-defined Functions
Lecture 193 Introduction - Overview of Spark SQL Functions
Lecture 194 Overview of Pre-defined Functions using Spark SQL
Lecture 195 Validating Functions using Spark SQL
Lecture 196 String Manipulation Functions using Spark SQL
Lecture 197 Date Manipulation Functions using Spark SQL
Lecture 198 Overview of Numeric Functions using Spark SQL
Lecture 199 Data Type Conversion using Spark SQL
Lecture 200 Dealing with Nulls using Spark SQL
Lecture 201 Using CASE and WHEN using Spark SQL
Lecture 202 Query Example - Word Count using Spark SQL

Section 17: Apache Spark using SQL - Pre-defined Functions - Exercises
Lecture 203 Prepare Users Table using Spark SQL
Lecture 204 Exercise 1 - Get number of users created per year
Lecture 205 Exercise 2 - Get the day name of the birth days of users
Lecture 206 Exercise 3 - Get the names and email ids of users added in the year 2019
Lecture 207 Exercise 4 - Get the number of users by gender
Lecture 208 Exercise 5 - Get last 4 digits of unique ids
Lecture 209 Exercise 6 - Get the count of users based up on country code

Section 18: Apache Spark using SQL - Windowing Functions
Lecture 210 Introduction to Windowing Functions using Spark SQL
Lecture 211 Prepare HR Database in Spark Metastore using Spark SQL
Lecture 212 Overview of Windowing Functions using Spark SQL
Lecture 213 Aggregations using Windowing Functions using Spark SQL
Lecture 214 LEAD or LAG Functions using Spark SQL
Lecture 215 Getting first and last values using Spark SQL
Lecture 216 Ranking using Windowing Functions in Spark SQL
Lecture 217 Order of execution of Spark SQL Queries
Lecture 218 Overview of Subqueries using Spark SQL
Lecture 219 Filtering Window Function Results using Spark SQL

Section 19: Sample scenarios with solutions
Lecture 220 Introduction to Sample Scenarios and Solutions
Lecture 221 Problem Statements - General Guidelines
Lecture 222 Initializing the job - General Guidelines
Lecture 223 Getting crime count per type per month - Understanding Data
Lecture 224 Getting crime count per type per month - Implementing the logic - Core API
Lecture 225 Getting crime count per type per month - Implementing the logic - Data Frames
Lecture 226 Getting crime count per type per month - Validating Output
Lecture 227 Get inactive customers - using Core Spark API (leftOuterJoin)
Lecture 228 Get inactive customers - using Data Frames and SQL
Lecture 229 Get top 3 crimes in RESIDENCE - using Core Spark API
Lecture 230 Get top 3 crimes in RESIDENCE - using Data Frame and SQL
Lecture 231 Convert NYSE data from text file format to parquet file format
Lecture 232 Get word count - with custom control arguments, num keys and file format

Who this course is for
Any IT aspirant/professional willing to learn Data Engineering using Apache Spark
Python Developers who want to learn Spark using Scala to add additional skill to be a Data Engineer
Java or Scala Developers to learn Spark using Scala to add Data Engineering Skills to their profile