Deadfish🐟 Studying Free
Lecture 7: Exploration and Exploitation
Introduction
Three broad families:
state-action exploration / parameter exploration
Multi-Armed Bandits
We don't know where we start.
What does the regret look like?
Initialize not only the value but also the count
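The note about initializing the count as well as the value can be illustrated with a small sketch. It assumes a greedy agent on a Bernoulli bandit with an incremental-mean update; the function name `run_greedy_bandit` and the parameters `q_init` and `n_init` are hypothetical names chosen for this example, not taken from the lecture.

```python
import random


def run_greedy_bandit(true_means, steps=1000, q_init=1.0, n_init=1, seed=0):
    """Greedy action selection with optimistic initial values (illustrative sketch).

    q_init is the optimistic starting estimate for every arm.
    n_init is the starting count: it treats q_init as n_init pseudo-observations,
    so a larger n_init makes the optimism fade more slowly.
    """
    rng = random.Random(seed)
    k = len(true_means)
    q = [q_init] * k   # value estimates, one per arm
    n = [n_init] * k   # counts, including the pseudo-observations
    pulls = [0] * k    # real pulls only

    for _ in range(steps):
        # Greedy choice: the arm with the highest current estimate.
        a = max(range(k), key=lambda i: q[i])
        # Bernoulli reward with success probability true_means[a].
        reward = 1.0 if rng.random() < true_means[a] else 0.0
        # Incremental mean update that includes the initial count.
        n[a] += 1
        q[a] += (reward - q[a]) / n[a]
        pulls[a] += 1

    return q, n, pulls


if __name__ == "__main__":
    q, n, pulls = run_greedy_bandit([0.2, 0.5, 0.8])
    print("estimates:", [round(v, 3) for v in q])
    print("pulls per arm:", pulls)
```

With `n_init=0` the first real reward overwrites the optimistic value entirely, because the update divides by a count of 1. With `n_init=1` the first update averages the optimistic value with the observed reward, so the initial optimism keeps influencing the choice for longer.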