09-26-2024, 06:30 AM
The Complete Hands-On Introduction To Apache Airflow
Last updated 3/2023
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz
Language: English | Size: 2.67 GB | Duration: 3h 28m
Learn to author, schedule and monitor data pipelines through practical examples using Apache Airflow
[b]What you'll learn[/b]
Create plugins to add functionalities to Apache Airflow.
Use Docker with Airflow and different executors
Master core functionalities such as DAGs, Operators, Tasks, Workflows, etc.
Understand and apply advanced concepts of Apache Airflow such as XComs, Branching and SubDAGs.
Understand the difference between the Sequential, Local and Celery Executors, how they work and how you can use them.
Use Apache Airflow in a Big Data ecosystem with Hive, PostgreSQL, Elasticsearch, etc.
Install and configure Apache Airflow
Think through and implement solutions to real data processing problems using Airflow
[b]Requirements[/b]
VirtualBox must be installed; a 3 GB VM will have to be downloaded
At least 8 gigabytes of memory
Some prior programming or scripting experience. Python experience will help you a lot, but since it's a very easy language to learn, it shouldn't be too difficult if you are not familiar with it.
[b]Description[/b]
Apache Airflow is an open-source platform to programmatically author, schedule and monitor workflows. If you have many ETL pipelines to manage, Airflow is a must-have. In this course you are going to learn everything you need to start using Apache Airflow through theory and practical videos. Starting from very basic notions, such as what Airflow is and how it works, we will dive into advanced concepts, such as how to create plugins and build truly dynamic pipelines.
Overview
Section 1: Course Introduction
Lecture 1 Prerequisites
Lecture 2 Course Objectives
Lecture 3 Who I am?
Lecture 4 Development Environment
Section 2: Getting Started with Airflow
Lecture 5 Why Airflow?
Lecture 6 What is Airflow?
Lecture 7 Core Components
Lecture 8 Core Concepts
Lecture 9 Airflow is not.
Lecture 10 Single Node Architecture
Lecture 11 Multi Node Architecture
Lecture 12 How does it work?
Lecture 13 [Practice] Installing Apache Airflow
Lecture 14 What is Docker?
Lecture 15 The docker-compose file
Lecture 16 Key Takeaways
Section 3: The important views of the Airflow UI
Lecture 17 The DAGs View
Lecture 18 The Grid View
Lecture 19 The Graph View
Lecture 20 The Landing Times View
Lecture 21 The Calendar View
Lecture 22 The Gantt View
Lecture 23 The Code View
Lecture 24 Wrap up!
Section 4: Coding Your First Data Pipeline with Airflow
Lecture 25 The Project
Lecture 26 Advices
Lecture 27 What is a DAG?
Lecture 28 DAG Skeleton
Lecture 29 What is an Operator?
Lecture 30 Providers
Lecture 31 Create a Table
Lecture 32 Create a connection
Lecture 33 The secret weapon!
Lecture 34 What is a Sensor?
Lecture 35 Is the API available?
Lecture 36 Extract users
Lecture 37 Process users
Lecture 38 Before running process_user
Lecture 39 What is a Hook?
Lecture 40 Store users
Lecture 41 Order matters!
Lecture 42 Your DAG in action!
Lecture 43 DAG Scheduling
Lecture 44 Backfilling: How does it work?
Lecture 45 Wrap up!
Section 5: The New Way of Scheduling DAGs
Lecture 46 Why do you need that feature?
Lecture 47 What is a Dataset?
Lecture 48 Adios schedule_interval!
Lecture 49 Create the Producer DAG
Lecture 50 Create the Consumer DAG
Lecture 51 Track your Datasets with the new view!
Lecture 52 Wait for many datasets
Lecture 53 Dataset limitations
Section 6: Databases and Executors
Lecture 54 What's an executor?
Lecture 55 The default config
Lecture 56 The Sequential Executor
Lecture 57 The Local Executor
Lecture 58 The Celery Executor
Lecture 59 The current config
Lecture 60 Add the DAG parallel_dag.py into the dags folder
Lecture 61 Monitor your tasks with Flower
Lecture 62 Remove DAG examples
Lecture 63 Running tasks on Celery Workers
Lecture 64 What is a queue?
Lecture 65 Add a new Celery Worker
Lecture 66 Create a queue to better distribute tasks
Lecture 67 Send a task to a specific queue
Lecture 68 Concurrency, the parameters you must know!
Section 7: Implementing Advanced Concepts in Airflow
Lecture 69 Adios repetitive patterns
Lecture 70 Add the DAG group_dag.py
Lecture 71 How to use SubDAGs?
Lecture 72 Adios SubDAGs, welcome TaskGroups!
Lecture 73 Add the DAG xcom_dag.py
Lecture 74 Sharing data between tasks with XComs
Lecture 75 [Practice] XComs in action!
Lecture 76 Choosing a specific path in your DAG
Lecture 77 [Practice] Executing a task according to a condition
Lecture 78 Trigger rules or how tasks get triggered
Section 8: Creating Airflow Plugins with Elasticsearch and PostgreSQL
Lecture 79 Introduction
Lecture 80 What's Elasticsearch?
Lecture 81 Running Elasticsearch with Airflow
Lecture 82 How the plugin system works?
Lecture 83 Create the connection
Lecture 84 Create the ElasticHook
Lecture 85 Add ElasticHook to the Plugin system
Lecture 86 Add the DAG elastic_dag.py
Lecture 87 Your Hook in Action!
Section 9: BONUS - APPENDIX
Lecture 88 [BLOG POST] How to use the DockerOperator with Templating and Apache Spark
Lecture 89 [BLOG POST] Apache Airflow with Kubernetes Executor
Lecture 90 [BLOG POST] How to use templates and macros in Apache Airflow
Lecture 91 [BLOG POST] How to use timezones in Apache Airflow
Lecture 92 [BLOG POST] How to use the BashOperator
Lecture 93 [BLOG POST] Variables in Apache Airflow: The Guide
Lecture 94 [BLOG POST] Best Practices in Apache Airflow (part 1)