Leveraging Unstructured Data with Cloud Dataproc on Google Cloud Platform
You'll learn how to create and manage computing clusters to run Hadoop, Spark, Pig and/or Hive jobs on Google Cloud Platform.
What you'll learn
This 1-week accelerated course builds upon previous courses in the Data Engineering on Google Cloud Platform specialization. Through a combination of video lectures, demonstrations, and hands-on labs, you'll learn how to create and manage computing clusters to run Hadoop, Spark, Pig, and/or Hive jobs on Google Cloud Platform. You will also learn how to access various cloud storage options from your compute clusters and integrate Google's machine learning capabilities into your analytics programs.
Table of contents
- Introduction to Leveraging Unstructured Data Course 2m
- Introducing Cloud Dataproc 1m
- Defining Unstructured Data 5m
- Deriving Value from Unstructured Data 7m
- Approaches to Working with Big Data 4m
- MapReduce and Hadoop Origins 5m
- On-prem Hadoop has a lot of Overhead 2m
- Cloud Dataproc versus Hadoop Alternatives 3m
- Creating a Dataproc Cluster 4m
- Dataproc Customization 3m
- Dataproc and the CLI 1m
- Getting Started With GCP And Qwiklabs 4m
- Lab - Create a Dataproc Cluster 0m
- Leveraging Unstructured Data - Lab 1 : Creating a Dataproc Cluster v1.3 0m
- Create a Dataproc Cluster Lab Demo and Review 8m
- Custom Machine Types 3m
- Preemptible VMs 3m
- Wrap-up for Introduction to Cloud Dataproc 1m
- Overview of Running Dataproc jobs 3m
- Methods for Submitting Jobs 1m
- Lab - Working with Structured and Semi-Structured Data 2m
- Leveraging Unstructured Data - Lab 2 : Work with structured and semi-structured data v1.3 0m
- Working with Structured and Semi-Structured Data Lab Demo and Review 12m
- Separation of Storage and Compute 7m
- Evolution of Data Processing 5m
- The Importance of Networking in Data Processing 4m
- Separating Storage and Compute with Spark 2m
- Submitting Spark Jobs 4m
- Overview of Spark Concepts 3m
- Lab - Working with Spark jobs 1m
- Leveraging Unstructured Data - Lab 3 : Submit Dataproc jobs for unstructured data v1.3 0m
- Working with Spark Jobs Lab Demo and Review 9m
- Wrap-up for Running Dataproc Jobs 0m
- Leveraging GCP 2m
- BigQuery Support 8m
- Lab - Leverage GCP 1m
- Leveraging Unstructured Data - Lab 4 : Leverage GCP v1.3 0m
- Leverage GCP Lab Demo and Review 5m
- Cluster Customization 4m
- Installing Software on a Dataproc Cluster 8m
- Lab - Cluster Automation Using CLI commands 0m
- Leveraging Unstructured Data - Lab 5 : Cluster automation using CLI commands v1.3 0m
- Cluster Automation Using CLI Commands Lab Demo and Review 9m
- Wrap-up for Leveraging GCP 1m
- Review of Leveraging GCP 0m
- Continuing Discussion of Machine Learning 2m
- A Closer Look at Machine Learning 4m
- Examples of Applied ML 3m
- Natural Language Processing Close-Up 3m
- Lab - Adding Machine Learning 1m
- Leveraging Unstructured Data - Lab 6 : Add Machine Learning (ML) v1.3 0m
- Adding Machine Learning Lab Demo and Review 10m
- Wrap-Up for Analyzing Unstructured Data 0m