Deep Reinforcement Learning Made-Easy
Published 10/2024
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 KHz
Language: English | Size: 9.20 GB | Duration: 14h 41m
Reinforcement Learning for beginners to advanced learners
What you'll learn
To understand deep learning and reinforcement learning paradigms
To understand architectures and optimization methods for deep neural network training
To implement deep learning methods within TensorFlow and apply them to data
To understand the theoretical foundations and algorithms of reinforcement learning
To apply reinforcement learning algorithms to environments with complex dynamics
Requirements
Basic Python programming is helpful but not required
Description
This course integrates deep learning and reinforcement learning. It introduces students to deep neural networks (DNN), starting from simple neural networks (NN) and moving on to recurrent neural networks and long short-term memory (LSTM) networks. Because NNs and DNNs form part of a reinforcement learning (RL) agent, students will also learn how to design custom RL environments and use them with RL agents.
After completing the course, students will be able:
To understand deep learning and reinforcement learning paradigms
To understand architectures and optimization methods for deep neural network training
To implement deep learning methods within TensorFlow and apply them to data
To understand the theoretical foundations and algorithms of reinforcement learning
To apply reinforcement learning algorithms to environments with complex dynamics
Course Contents:
Introduction to Deep Reinforcement Learning
Artificial Neural Network (ANN)
ANN to Deep Neural Network (DNN)
Deep Learning Hyperparameters: Regularization
Deep Learning Hyperparameters: Activation Functions and Optimizations
Convolutional Neural Network (CNN)
CNN Architecture
Recurrent Neural Network (RNN)
RNN for Long Sequences
LSTM Network
Overview of Markov Decision Processes
Bellman Equations and Value Functions
Deep Reinforcement Learning with Q-Learning
Model-Free Prediction
Deep Reinforcement Learning with Policy Gradients
Exploration and Exploitation in Reinforcement Learning
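As a rough illustration of what "designing a custom RL environment and using it with an RL agent" can look like, here is a minimal sketch in plain Python. It is not taken from the course material: the `CorridorEnv` class, the `train_q_table` and `greedy_policy` helpers, and all hyperparameter values are hypothetical names and choices made up for this example. The environment follows a simplified Gym-style `reset`/`step` interface, and the agent uses tabular Q-learning with an epsilon-greedy strategy rather than a neural network.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: walk right along a corridor to reach the goal."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        # Every episode starts at the left end of the corridor.
        self.state = 0
        return self.state

    def step(self, action):
        # action 0 = move left, action 1 = move right; the position is clipped
        # so the agent cannot leave the corridor.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def train_q_table(env, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative values)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.length)]
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Explore with probability epsilon, or when both actions look equal.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            # Q-learning target: reward plus discounted best next-state value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    # Pick the higher-valued action in each state (ties default to "right").
    return [0 if left > right else 1 for left, right in q]
```

Calling `greedy_policy(train_q_table(CorridorEnv()))` returns one action per corridor position. A deep RL agent would replace the Q-table with a neural network, which is the combination the course description refers to.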
Overview
Section 1: Introduction
Lecture 1 Introduction to Deep Reinforcement Learning
Lecture 2 Reinforcement Learning and its main components (agent, environment, rewards)
Lecture 3 Comparison with supervised and unsupervised learning
Lecture 4 Overview of the RL history
Lecture 5 Recent advances in Deep Reinforcement Learning
Lecture 6 Learning objectives for the course and Introduction to Python
Section 2: Artificial Neural Network (ANN)
Lecture 7 ANN algorithm: Nontechnical explanation
Lecture 8 ANN algorithm: Mathematical Formulae
Lecture 9 ANN algorithm: A Worked-Out Example
Section 3: ANN to Deep Neural Network (DNN)
Lecture 10 Deep Neural Network
Lecture 11 Deep learning frameworks
Lecture 12 Introduction to TensorFlow and Keras
Lecture 13 Key terms in TensorFlow
Lecture 14 Keras
Lecture 15 The concept of gradient descent
Lecture 16 Learning rate
Section 4: Deep Learning Hyperparameters: Regularization
Lecture 17 Hyperparameters in Machine Learning
Lecture 18 L1 and L2 Regularization in Regression
Lecture 19 Regularization in Neural networks
Lecture 20 Regularization in Regression
Lecture 21 Data standardization in L1 and L2 regularization
Lecture 22 Dropout Regularization
Lecture 23 Early stopping method for neural networks
Lecture 24 Saving the Model
Section 5: Deep Learning Hyperparameters: Activation Functions and Optimizations
Lecture 25 Loss Functions
Lecture 26 Activation Functions
Lecture 27 Activation Function: Sigmoid
Lecture 28 Activation Function: Tanh
Lecture 29 Activation Function: ReLU
Lecture 30 Activation Function: SoftMax
Lecture 31 Optimizers: SGD, Mini-batch descent
Section 6: Convolutional Neural Network (CNN)
Lecture 32 Introduction to CNN
Lecture 33 Artificial Neural network vs Convolutional Neural Network (ANN vs CNN)
Lecture 34 Filters or kernels
Section 7: Recurrent Neural Network (RNN)
Lecture 35 Cross-sectional data vs sequential data
Lecture 36 Models for sequential data: ANN, CNN and Sequential ANN
Lecture 37 Case study of word prediction
Lecture 38 Introduction to RNN
Lecture 39 Python Code: Model Training of CNN and RNN
Section 8: Reinforcement Learning: Overview of Markov Decision Processes
Lecture 40 Review of Reinforcement Learning
Lecture 41 Introduction to Value Function Approximation
Lecture 42 Python Code: Value Function Approximation using CartPole
Lecture 43 Linear function approximation
Lecture 44 Python Code: Linear Function Approximation using CartPole
Lecture 45 Non-linear function approximation with deep neural networks
Lecture 46 Python Code: Non-Linear Function Approximation with Neural Networks
Lecture 47 Applications and limitations of Value Function Approximation
Lecture 48 Definition of Markov Decision Processes (MDPs)
Lecture 49 Python Code: MDPs and Bellman Equations and Value Functions
Lecture 50 Key components of an MDP
Lecture 51 Bellman Equations and Value Functions
Lecture 52 Policy iteration and value iteration algorithms
Lecture 53 Python Code: Policy iteration and value iteration algorithms
Section 9: Bellman Equations and Value Functions
Lecture 54 Python Code: Introduction to Python Gym Library Documentation
Lecture 55 Review of Bellman Equations
Lecture 56 Definition of value functions (state value, action value)
Lecture 57 Calculation of value functions using Bellman Equations
Lecture 58 Intuitive interpretation of value functions
Lecture 59 Markov Processes
Lecture 60 Markov Reward Processes
Lecture 61 Markov Decision Processes
Lecture 62 Extensions to MDPs
Section 10: Deep Reinforcement Learning with Q-Learning
Lecture 63 Definition of Q-Learning
Lecture 64 Calculation of Q-Values using Q-Learning
Lecture 65 Python Code: Q-Learning and Python Gym library
Lecture 66 Comparison of Q-Learning with policy iteration and value iteration algorithms
Lecture 67 Advantages and disadvantages of Q-Learning
Lecture 68 Overview of Deep Q-Network (DQN) algorithm
Lecture 69 Architecture of a DQN model
Lecture 70 Implementation of DQN in TensorFlow
Lecture 71 Python Code: Implementation of DQN
Lecture 72 Applications and limitations of DQN
Section 11: Model-Free Prediction
Lecture 73 Definition of Model-Free Prediction
Lecture 74 Calculation of state values using Model-Free Prediction methods
Lecture 75 Monte Carlo
Lecture 76 Python Code: Monte Carlo Algorithm
Lecture 77 TD Learning
Lecture 78 Python Code: Temporal Difference (TD) Learning Algorithm
Lecture 79 Python Code: SARSA Algorithm
Lecture 80 Discussion of the limitations of Model-Free Prediction
Lecture 81 Python Code: Expected SARSA Algorithm
Lecture 82 Python Code: n-Steps SARSA Algorithm
Section 12: Deep Reinforcement Learning with Policy Gradients
Lecture 83 Overview of Policy Gradient methods
Lecture 84 Policy optimization using gradient ascent
Lecture 85 Actor-critic algorithms
Lecture 86 Python code: Actor-critic algorithm
Lecture 87 Implementation of policy gradient methods in TensorFlow
Lecture 88 Python code: Deep Reinforcement Learning with Policy Gradients
Section 13: Introduction to MATLAB Reinforcement Learning Toolbox
Lecture 89 MATLAB code: Introduction to MATLAB Reinforcement Learning Designer
Lecture 90 MATLAB code: Introduction to MATLAB RL Designer and Coding
Section 14: Exploration and Exploitation in Reinforcement Learning
Lecture 91 Exploration vs. exploitation tradeoff
Lecture 92 Different strategies for exploration
Lecture 93 Python code: Exploration vs. Exploitation using the epsilon-greedy strategy
Lecture 94 Exploration in model-based and model-free reinforcement learning
Lecture 95 Implementation of Policy Gradient Methods in TensorFlow
Lecture 96 Python code: Proximal Policy Optimization PPO agent's Algorithm
Lecture 97 Python Code: PPO Algorithm
Lecture 98 Python Code: PPO using stable_baselines3 and Gym libraries
Lecture 99 Python Code: PPO using stable_baselines3 and gymnasium libraries
Section 15: Reinforcement Learning Agents' Types
Lecture 100 Reinforcement Learning Agents' Types
Lecture 101 Deep Deterministic Policy Gradient (DDPG)
Lecture 102 Python code: Deep Deterministic Policy Gradient (DDPG) agent's Algorithm
Lecture 103 Twin Delayed DDPG (TD3)
Lecture 104 Model-Based Policy Optimization (MBPO)
Lecture 105 Python code: Model-Based Policy Optimization (MBPO) agent's Algorithm
Lecture 106 Advantage Actor-Critic (A2C)
Lecture 107 Python code: Advantage Actor-Critic (A2C) agent's Algorithm
Lecture 108 Asynchronous Advantage Actor-Critic (A3C)
Lecture 109 Trust Region Policy Optimization (TRPO)
Lecture 110 Soft Actor-Critic (SAC)
Lecture 111 Multi-Agent Reinforcement Learning
Lecture 112 Python code: The Fruit Gathering Game using Cooperative Multi-agent Reinforcemen
Lecture 113 Python code: Creating Custom Environment with PPO agent
Data Scientists, Machine Learning Engineers, Robotics Programmers