Fiveable

Game Theory and Economic Behavior Unit 14 Review

14.3 Learning models in game theory

Written by the Fiveable Content Team • Last updated September 2025

Learning models in game theory explore how players adapt and improve their strategies over time. These models range from simple reinforcement learning to complex belief-based approaches, each offering unique insights into strategic behavior.

Reinforcement learning rewards successful actions, while belief-based models update expectations about opponents. Social learning and imitation add another layer, showing how strategies spread through populations. These models help explain real-world strategic behavior and decision-making processes.

Reinforcement and Q-Learning Models

Reinforcement Learning Fundamentals

  • Reinforcement learning involves agents learning optimal strategies through trial and error interactions with their environment
  • Agents receive rewards or punishments based on the outcomes of their actions and aim to maximize their cumulative reward over time
  • Key components of reinforcement learning include states, actions, rewards, and a value function that estimates the long-term value of being in a particular state or taking a specific action
  • Reinforcement learning algorithms update the value function based on the observed rewards and use it to guide the agent's decision-making process (Q-learning, SARSA)
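A minimal sketch of these components in the Roth–Erev style of reinforcement learning often used in game theory: each action carries a propensity, realized payoffs reinforce the chosen action, and choice probabilities are proportional to propensities. The play history and payoff values here are illustrative:

```python
def choice_probs(propensities):
    """Choice probabilities proportional to each action's propensity."""
    total = sum(propensities)
    return [p / total for p in propensities]

def reinforce(propensities, action, payoff):
    """Law of effect: the chosen action's propensity grows by its realized payoff."""
    propensities[action] += payoff

# Illustrative play history of (action, payoff) pairs; in a full simulation
# each action would be sampled from choice_probs every round.
history = [(0, 1.0), (1, 3.0), (1, 3.0), (0, 1.0)]
propensities = [1.0, 1.0]  # uniform initial propensities
for action, payoff in history:
    reinforce(propensities, action, payoff)

probs = choice_probs(propensities)  # action 1 earned more, so it is now favored
```

After this history the propensities are [3, 7], so the agent now plays the higher-paying action 70% of the time: past success directly tilts future choice.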

Q-Learning Algorithm

  • Q-learning is a model-free reinforcement learning algorithm that learns an optimal action-value function, denoted as Q(s,a), representing the expected cumulative reward of taking action a in state s
  • The Q-function is updated iteratively based on the Bellman equation: $Q(s,a) \leftarrow Q(s,a) + \alpha[r + \gamma \max_{a'} Q(s',a') - Q(s,a)]$
    • $\alpha$ is the learning rate, $\gamma$ is the discount factor, $r$ is the immediate reward, and $s'$ is the next state
  • Q-learning is an off-policy algorithm, meaning it learns the optimal policy independently of the policy being followed during the learning process
  • The agent selects actions based on an exploration-exploitation trade-off, typically using strategies like $\epsilon$-greedy or softmax to balance exploring new actions and exploiting the current best action
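The update rule and the $\epsilon$-greedy selection above can be sketched together on a toy single-state problem (the state/action setup and parameter values are illustrative, not from the text):

```python
import random

def q_learning_step(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Bellman update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])

def epsilon_greedy(Q, s, epsilon, rng):
    """Explore a random action with probability epsilon, else exploit the best one."""
    if rng.random() < epsilon:
        return rng.randrange(len(Q[s]))
    return max(range(len(Q[s])), key=lambda a: Q[s][a])

# Toy single-state problem: action 1 always pays 1, action 0 pays 0,
# so Q(s, 1) should approach r / (1 - gamma) = 10 over many updates.
rng = random.Random(0)
Q = [[0.0, 0.0]]  # Q[state][action]
for _ in range(500):
    a = epsilon_greedy(Q, 0, epsilon=0.1, rng=rng)
    r = 1.0 if a == 1 else 0.0
    q_learning_step(Q, 0, a, r, 0)  # next state is again state 0
```

Exploration is what discovers action 1 in the first place here: pure exploitation would tie-break to action 0 forever and never see the reward.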

Experience-Weighted Attraction (EWA) Learning

  • EWA learning is a hybrid model that combines elements of reinforcement learning and belief-based learning
  • Each strategy is assigned an attraction value, which is updated based on the payoffs received and the agent's experience
  • The attraction update rule incorporates both the realized payoff and the forgone payoff of unchosen strategies, weighted by an experience factor
  • The experience factor determines the relative importance of past and recent experiences in shaping the attraction values
  • EWA learning can capture both the law of effect (reinforcement) and the power law of practice (experience-based learning) observed in human learning
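The attraction update described above can be sketched as follows, with the chosen strategy's realized payoff weighted fully and forgone payoffs weighted by $\delta$; the parameter values and payoffs are illustrative:

```python
def ewa_update(attractions, N, chosen, payoffs, phi=0.9, delta=0.5, rho=0.9):
    """One EWA step. `payoffs[j]` is what strategy j would have earned this
    round; unchosen strategies' forgone payoffs count at weight delta."""
    N_new = rho * N + 1.0  # experience weight grows, discounting the past
    new_attractions = []
    for j, A in enumerate(attractions):
        weight = 1.0 if j == chosen else delta
        new_attractions.append((phi * N * A + weight * payoffs[j]) / N_new)
    return new_attractions, N_new

A, N = [0.0, 0.0], 1.0
for _ in range(50):
    # The agent keeps choosing strategy 0 even though strategy 1 pays more;
    # the forgone payoff still raises strategy 1's attraction.
    A, N = ewa_update(A, N, chosen=0, payoffs=[1.0, 3.0])
```

With $\delta = 0.5$ the unchosen strategy's forgone payoff ($0.5 \times 3.0 = 1.5$) outweighs the chosen one's realized payoff ($1.0$), so the attraction of strategy 1 ends up higher even though it was never played. Setting $\delta = 0$ would recover pure reinforcement learning.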

Belief-Based Learning Models

Fictitious Play

  • Fictitious play is a belief-based learning model where agents form beliefs about their opponents' strategies based on observed actions
  • Each agent maintains a belief distribution over the possible strategies of their opponents and best-responds to these beliefs
  • The belief distribution is updated using empirical frequencies of observed actions, assuming opponents are playing stationary strategies
  • Fictitious play can converge to Nash equilibria in certain classes of games (zero-sum, potential, and supermodular games) but may fail to converge in general
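A sketch of fictitious play in matching pennies (a zero-sum game, so empirical frequencies are known to converge to the mixed Nash equilibrium of 50/50); the initial counts are arbitrary:

```python
def best_response(payoff, opp_counts):
    """Best reply to the empirical mixed strategy implied by the opponent's
    observed action counts."""
    total = sum(opp_counts)
    expected = [sum(payoff[a][b] * opp_counts[b] for b in range(len(opp_counts))) / total
                for a in range(len(payoff))]
    return max(range(len(expected)), key=lambda a: expected[a])

# Matching pennies: the row player wants to match, the column player to mismatch.
row_pay = [[1, -1], [-1, 1]]   # indexed [row action][column action]
col_pay = [[-1, 1], [1, -1]]   # indexed [column action][row action]
row_counts, col_counts = [1, 0], [0, 1]  # arbitrary initial observations
for _ in range(2000):
    a = best_response(row_pay, col_counts)
    b = best_response(col_pay, row_counts)
    row_counts[a] += 1
    col_counts[b] += 1

row_freq = [c / sum(row_counts) for c in row_counts]
col_freq = [c / sum(col_counts) for c in col_counts]
```

Actual play cycles deterministically, but the empirical frequencies both drift toward (1/2, 1/2), illustrating that it is the beliefs, not the period-by-period actions, that converge.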

Best Response Dynamics

  • Best response dynamics is a learning model where agents repeatedly best-respond to their opponents' current strategies
  • At each time step, one or more agents update their strategies to the best response given their opponents' current strategies
  • Best response dynamics can be deterministic (agents always choose the best response) or stochastic (agents choose best responses with high probability)
  • The convergence properties of best response dynamics depend on the game structure and the specific update rules used (simultaneous or asynchronous updates)
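A sketch of asynchronous best response dynamics in a two-player coordination game (the payoff values and starting profile are illustrative):

```python
def best_reply(payoff, player, profile):
    """The player's best action holding the other player's current action fixed."""
    other_action = profile[1 - player]
    return max(range(len(payoff)), key=lambda a: payoff[a][other_action])

# Symmetric pure-coordination game: (A, A) pays 2 to each player, (B, B) pays 1,
# miscoordination pays 0.  Indexed [my action][opponent's action]; 0 = A, 1 = B.
pay = [[2, 0], [0, 1]]
profile = [0, 1]  # start miscoordinated: player 0 plays A, player 1 plays B
for step in range(10):
    player = step % 2  # asynchronous updates: players revise in turn
    profile[player] = best_reply(pay, player, profile)
```

From this starting point the dynamics lock in at (B, B), which is a Nash equilibrium but not the efficient one, illustrating how the outcome depends on the initial profile and the update order.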

Bayesian Learning

  • Bayesian learning is a belief-based model where agents update their beliefs about opponents' strategies using Bayes' rule
  • Agents start with prior beliefs over the possible strategies of their opponents and update these beliefs based on observed actions
  • The posterior belief distribution is obtained by multiplying the prior beliefs with the likelihood of observed actions, normalized by the total probability
  • Bayesian learning allows agents to incorporate prior knowledge and handle uncertainty in a principled manner
  • The convergence properties of Bayesian learning depend on the prior beliefs and the true strategies being played by the opponents
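The posterior update described above can be sketched with two hypothesized opponent types (the type names, strategies, and observed actions are illustrative):

```python
def bayes_update(prior, likelihoods):
    """Posterior over opponent types after one observed action.
    `likelihoods[k]` = probability that type k plays the observed action."""
    unnorm = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnorm)  # normalizing constant (total probability of the action)
    return [u / total for u in unnorm]

# Two hypotheses about the opponent: "mostly cooperates" vs. "mostly defects".
types = {"cooperator": {"C": 0.8, "D": 0.2},
         "defector":   {"C": 0.1, "D": 0.9}}
belief = [0.5, 0.5]  # uniform prior over (cooperator, defector)
for action in ["D", "D", "C", "D"]:  # observed play
    belief = bayes_update(belief, [types["cooperator"][action],
                                   types["defector"][action]])
```

After three defections and one cooperation, the posterior puts roughly 92% weight on the defector type; note that the single "C" observation pulled the belief back noticeably, showing how evidence in either direction is incorporated.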

Imitation and Social Learning

Imitation Learning Mechanisms

  • Imitation learning involves agents learning by observing and copying the behavior of other agents in the population
  • Agents may imitate successful strategies based on their payoffs or popularity within the population
  • Imitation can be random (copying a randomly selected agent) or guided by specific rules (copying the best-performing agent or the most common strategy)
  • Imitation learning can lead to the spread of successful strategies and the emergence of social norms and conventions
  • However, imitation can also result in suboptimal outcomes if agents copy inferior strategies or get stuck in local optima
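A sketch of "copy the best-performing agent" imitation in a small population, where payoffs depend on how common a strategy is (the payoff function and population are illustrative):

```python
def imitate_best(strategies, payoff_of):
    """Copy-the-best imitation: every agent adopts the strategy of the
    highest-payoff agent this round (ties broken by lowest index)."""
    payoffs = [payoff_of(s, strategies) for s in strategies]
    best = strategies[max(range(len(strategies)), key=lambda i: payoffs[i])]
    return [best] * len(strategies)

# Toy payoff: a strategy's base value scaled by its share of the population,
# so payoffs depend on the current population mix.
def payoff_of(s, pop):
    share = pop.count(s) / len(pop)
    base = 2.0 if s == 1 else 1.0
    return base * share

pop = [0, 0, 0, 1, 1]  # strategy 1 starts in the minority
for _ in range(5):
    pop = imitate_best(pop, payoff_of)
```

Here the minority strategy earns the single highest payoff in the first round, so imitation spreads it through the whole population in one step, illustrating how a successful strategy can propagate much faster than under individual trial-and-error learning.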

Social Learning and Cultural Evolution

  • Social learning refers to the process of acquiring knowledge, skills, or behaviors through observation, interaction, or communication with other agents
  • Social learning can occur through various mechanisms, such as imitation, teaching, or language-mediated learning
  • Cultural evolution studies how social learning shapes the distribution of behaviors, norms, and beliefs in a population over time
  • Social learning can facilitate the accumulation of knowledge across generations and the adaptation of populations to changing environments
  • However, social learning can also lead to the spread of maladaptive behaviors or the persistence of outdated practices (conformity bias, cultural inertia)
  • The interplay between individual learning, social learning, and environmental factors determines the dynamics of cultural evolution in a population