![[Image: 9dd1eff00165f2dbd6c9d6b40010170a.jpg]](https://i124.fastpic.org/big/2025/0205/0a/9dd1eff00165f2dbd6c9d6b40010170a.jpg)
Contextual Multi-Armed Bandit Problems In Python
Published 3/2024
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 KHz
Language: English | Size: 2.54 GB | Duration: 9h 1m
Everything you need to master multi-armed bandit problems and apply them to real-world problems
[b]What you'll learn[/b]
Master all essential Bandit Algorithms
Learn How to Apply Bandit Algorithms to Real-World Applications with a Focus on Product Recommendation
Learn How to Implement All Essential Aspects of Bandit Algorithms in Python
Build Different Deterministic and Stochastic Environments for Bandit Problems to Simulate Different Scenarios
Learn and Apply Bayesian Inference for Bandit Problems and Beyond as a Byproduct of This Course
Understand Essential Concepts in Contextual Bandit Problems
Apply Contextual Bandit Algorithms to a Real-World Product Recommendation Dataset and Scenario
[b]Requirements[/b]
No mandatory prerequisites
[b]Description[/b]
Welcome to our course, where we'll guide you through Multi-armed Bandit Problems and Contextual Bandit Problems step by step. No prior experience is needed: we'll start from scratch and build up your skills so you can use these algorithms in your own projects.

We'll cover basic strategies like random, greedy, epsilon-greedy, and softmax, and more advanced methods like the Upper Confidence Bound (UCB) algorithm. Along the way, we'll explain the concept of regret, rather than focusing only on reward values, as it applies to Reinforcement Learning and Multi-armed Bandit Problems. Through practical examples in different types of environments (deterministic, stochastic, and non-stationary), you'll see how these algorithms perform in action.

Ever wondered how Multi-armed Bandit problems relate to Reinforcement Learning? We'll break it down for you, highlighting what's similar and what's different.

We'll also dive into Bayesian inference, introducing Thompson sampling in simple terms for both binary and real-valued rewards, using Beta and Gaussian distributions to estimate the underlying probability distributions, with clear examples that connect the theory to practice.

Then we'll explore Contextual Bandit problems, using the LinUCB algorithm as our guide. From basic toy examples to real-world data, you'll see how it works and how it compares to simpler methods like epsilon-greedy.

Don't worry if you're new to Python: we've got you covered with a section to help you get started. And to make sure you're really getting it, we'll include quizzes to test your understanding along the way.

Our explanations are clear, our code is clean, and we've added fun visualizations to help everything make sense. So join us on this journey and become a master of Multi-armed and Contextual Bandit Problems!
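To give a concrete taste of the kind of code the course builds, here is a minimal epsilon-greedy sketch with regret tracking on a made-up Bernoulli bandit. The arm probabilities, seed, and all names are illustrative assumptions, not the course's actual code:

```python
import numpy as np

# Hypothetical 3-armed Bernoulli bandit: arm k pays 1 with probability true_probs[k].
rng = np.random.default_rng(seed=0)
true_probs = np.array([0.2, 0.5, 0.7])
n_arms, n_steps, epsilon = len(true_probs), 5000, 0.1

counts = np.zeros(n_arms)   # pulls per arm
values = np.zeros(n_arms)   # incremental average reward per arm
regret = 0.0                # cumulative regret vs. always playing the best arm

for t in range(n_steps):
    if rng.random() < epsilon:            # explore with probability epsilon
        arm = rng.integers(n_arms)
    else:                                 # otherwise exploit the current best estimate
        arm = int(np.argmax(values))
    reward = float(rng.random() < true_probs[arm])
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental averaging
    regret += true_probs.max() - true_probs[arm]         # expected regret of this pull

print("estimates:", values.round(3), "| cumulative regret:", round(regret, 1))
```

Note how regret measures the gap to the best arm rather than the raw reward, which is exactly the shift in perspective the course emphasizes.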
Overview
Section 1: Introduction
Lecture 1 Course Overview
Lecture 2 Casino and Statistics
Lecture 3 Story: A Gambler in Casino
Lecture 4 Multi-armed Bandit Problems and Their Applications
Lecture 5 Multi-armed Bandit Problems for Startup Founders
Lecture 6 Similarities and Differences between Bandit Problems and Reinforcement Learning
Lecture 7 Slides
Lecture 8 Resources
Section 2: Introduction to Python
Lecture 9 Introduction to Google Colab
Lecture 10 Introduction to Python Part 1
Lecture 11 Introduction to Python Part 2
Lecture 12 Introduction to Python Part 3
Lecture 13 Code for Introduction to Python
Section 3: Fundamental Algorithms in Multi-Armed Bandit Problems
Lecture 14 Environment Design Logic
Lecture 15 Deterministic Environment
Lecture 16 Proof for Incremental Averaging
Lecture 17 Random Agent Class Implementation
Lecture 18 Incremental Average Implementation
Lecture 19 Results for Random Agent
Lecture 20 Plotting Function Part 1
Lecture 21 Plotting Function Part 2
Lecture 22 Plot Results for Random Agent
Lecture 23 Greedy Agent
Lecture 24 Epsilon Greedy Agent
Lecture 25 Epsilon Greedy Parameter Tuning Part 1
Lecture 26 Epsilon Greedy Parameter Tuning Part 2
Lecture 27 Difference Between Stochasticity, Uncertainty, and Non-Stationarity
Lecture 28 Create a Stochastic Environment
Lecture 29 Create an Instance of Stochastic Environment
Lecture 30 Agents Performance with Stochastic Environment
Lecture 31 Softmax Agent Implementation
Lecture 32 Softmax Agent Results
Lecture 33 Upper Confidence Bound (UCB) Algorithm Theory
Lecture 34 UCB Algorithm Implementation
Lecture 35 UCB Algorithm Results
Lecture 36 Comparison of All Agents' Performance and a Life Lesson
Lecture 37 Regret Concept and Implementation
Lecture 38 Regret Function Visualization
Lecture 39 Epsilon Greedy with Regret Concept
Lecture 40 Regret Curves Results for Deterministic Environment
Lecture 41 Regret Curves Results for Stochastic Environment
Lecture 42 Code for Basic Agents
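For a rough preview of the UCB material in this section (lectures 33 to 35), a minimal UCB1-style agent on an assumed Bernoulli environment might look like this. This is a sketch under illustrative parameters, not the course's implementation:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
true_probs = np.array([0.2, 0.5, 0.7])   # hypothetical Bernoulli arms
n_arms, n_steps, c = len(true_probs), 5000, 2.0

counts = np.zeros(n_arms)
values = np.zeros(n_arms)

for t in range(1, n_steps + 1):
    if t <= n_arms:
        arm = t - 1                       # play each arm once to initialise
    else:
        # UCB1 score: empirical mean plus an exploration bonus that
        # shrinks as an arm is pulled more often.
        ucb = values + np.sqrt(c * np.log(t) / counts)
        arm = int(np.argmax(ucb))
    reward = float(rng.random() < true_probs[arm])
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]

print("pull counts:", counts.astype(int), "| estimates:", values.round(3))
```

Because the bonus decays with pull count, the agent keeps probing uncertain arms without needing a hand-tuned epsilon.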
Section 4: Thompson Sampling for Multi-Armed Bandits
Lecture 43 Why and How We Can Use Thompson Sampling
Lecture 44 Design of Thompson Sampling Class Part 1
Lecture 45 Design of Thompson Sampling Class Part 2
Lecture 46 Results for Thompson Sampling with Binary Reward
Lecture 47 Thompson Sampling For Binary Reward with Stochastic Environment
Lecture 48 Theory for Gaussian Thompson Sampling
Lecture 49 Environment for Gaussian Thompson Sampling
Lecture 50 Select Arm Module for Gaussian Thompson Sampling Class
Lecture 51 Parameter Update Module for Gaussian Thompson Sampling Agent
Lecture 52 Visualization Function for Gaussian Thompson Sampling
Lecture 53 Results for Gaussian Thompson Sampling
Lecture 54 Code for Thompson Sampling
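As a preview of this section's core idea, here is a minimal Beta-Bernoulli Thompson sampling sketch. The uniform Beta(1, 1) prior, seed, and environment are assumptions for illustration, not the course's code:

```python
import numpy as np

rng = np.random.default_rng(seed=2)
true_probs = np.array([0.2, 0.5, 0.7])   # hypothetical Bernoulli arms
n_arms, n_steps = len(true_probs), 5000

alpha = np.ones(n_arms)   # Beta(1, 1) uniform prior: 1 + successes
beta = np.ones(n_arms)    # 1 + failures

for _ in range(n_steps):
    theta = rng.beta(alpha, beta)    # sample one plausible mean per arm
    arm = int(np.argmax(theta))      # play the arm whose sample is best
    reward = rng.random() < true_probs[arm]
    if reward:
        alpha[arm] += 1              # posterior update on success
    else:
        beta[arm] += 1               # posterior update on failure

posterior_mean = alpha / (alpha + beta)
print("posterior means:", posterior_mean.round(3))
```

Sampling from the posterior, rather than always taking the current best mean, is what gives Thompson sampling its built-in exploration; the Gaussian variant covered later follows the same pattern with a Normal posterior.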
Section 5: Contextual Bandit Problems
Lecture 55 Contextual Bandit Problems vs Supervised Learning
Lecture 56 LinUCB Math Notations
Lecture 57 LinUCB Algorithm Theory
Lecture 58 LinUCB Implementation Part 1
Lecture 59 LinUCB Implementation Part 2
Lecture 60 LinUCB Implementation Part 3
Lecture 61 Test LinUCB Algorithm
Lecture 62 Epsilon Greedy Algorithm Implementation
Lecture 63 Simulation Functions
Lecture 64 Comparison of Epsilon Greedy and LinUCB with Toy Data
Lecture 65 Real-world Case Dataset Explanation
Lecture 66 Split Data into Train and Test
Lecture 67 Test Agents with Accuracy Metric
Lecture 68 Evaluate Agent Performances based on Accumulated Rewards
Lecture 69 Datasets and Data Preparation Code
Lecture 70 Code for Contextual Bandit Problems
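And as a preview of the LinUCB algorithm this section builds toward, here is a compact sketch of the per-arm ("disjoint") variant on a synthetic linear-reward environment. All parameters, the noise model, and the environment are illustrative assumptions, not the course's real-world setup:

```python
import numpy as np

rng = np.random.default_rng(seed=3)
n_arms, d, n_steps, alpha = 3, 5, 3000, 1.0

# Hypothetical linear reward model: E[reward | x, arm] = theta_true[arm] @ x.
theta_true = rng.normal(size=(n_arms, d))

# Per-arm ridge-regression state (the "disjoint" LinUCB variant):
A = np.stack([np.eye(d) for _ in range(n_arms)])   # A_a = I + sum of x x^T
b = np.zeros((n_arms, d))                          # b_a = sum of r * x

for _ in range(n_steps):
    x = rng.normal(size=d)                         # context for this round
    scores = np.empty(n_arms)
    for a in range(n_arms):
        A_inv = np.linalg.inv(A[a])
        theta_hat = A_inv @ b[a]
        # Point estimate plus an upper-confidence bonus for this context.
        scores[a] = theta_hat @ x + alpha * np.sqrt(x @ A_inv @ x)
    arm = int(np.argmax(scores))
    reward = theta_true[arm] @ x + rng.normal(scale=0.1)   # noisy linear reward
    A[arm] += np.outer(x, x)
    b[arm] += reward * x

print("learned vs true coefficients (arm 0):")
print((np.linalg.inv(A[0]) @ b[0]).round(2))
print(theta_true[0].round(2))
```

Unlike epsilon-greedy, the confidence bonus here depends on the context, so the agent explores exactly where its linear estimate for that arm is still uncertain.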
[b]Who this course is for[/b]
Web Application Developers
Researchers Working on Action Optimization
Machine Learning Developers and Data Scientists
Startup Enthusiasts Driven to Develop Customized Recommendation Apps