Fiveable

4.3 Operant Conditioning

5 min read, December 20, 2022

John Mohl

Haseung Jun

Operant conditioning refers to learning in which a behavior leads to an environmental response, which in turn affects the likelihood of the behavior happening again.

https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2F-rsD09i9sZzVF.png?alt=media&token=b0ac2120-625c-4f43-a4af-d3f1130012fa

Image Courtesy of Verywell Mind.

One of the earliest contributors to this aspect of learning was Edward Thorndike, who found that behaviors that had a favorable outcome became stronger. In contrast, behaviors that had an unfavorable outcome became weaker. He referred to this as his Law of Effect.

B.F. Skinner took this principle further and described the different types of consequences that can occur and how they affect the behavior. He created what is called the "Skinner Box," in which animals receive food only if they press a lever or peck a disk. The food served as a reinforcer, making this an example of positive reinforcement.

🎥 Watch: AP Psychology - Operant Conditioning with Pigeons

Reinforcement and Punishment

Reinforcing a behavior makes it more likely that the behavior will occur again. Punishing a behavior, by contrast, makes it less likely that the behavior will happen again.

https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2F-VP44R4CRgySf.JPG?alt=media&token=19cf3dd5-15f6-4c3a-a54b-794836054126

Image Courtesy of Verywell Mind.

Describing a consequence as "positive" does not mean it is "good," and describing it as "negative" does not mean it is "bad." Instead, "positive" indicates that a stimulus is presented, whereas "negative" indicates that a stimulus is removed.

Thus, when a behavior is positively reinforced, something is presented (usually something pleasant) to increase the likelihood of the behavior happening again. When a behavior is negatively reinforced, something is taken away (usually something unpleasant) to encourage that behavior to happen again.

When a behavior is positively punished, something is presented (usually something unpleasant), making the behavior happen less often. In contrast, when a behavior is negatively punished, something is taken away (usually something pleasant) to make that behavior happen less often.

Term | Description | Outcome | Example
Positive reinforcement | Add or increase a pleasant stimulus | Behavior is strengthened | You get a cookie for an “A.” 🍪
Negative reinforcement | Reduce or remove an unpleasant stimulus | Behavior is strengthened | Taking painkillers removes pain, so the behavior of taking painkillers is strengthened.
Positive punishment | Add an unpleasant stimulus | Behavior is weakened | More homework is given for misbehavior ✍️
Negative punishment | Reduce or remove a pleasant stimulus | Behavior is weakened | No phone 📱 after breaking curfew

Table adapted from Open Source Textbook.
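The four kinds of consequences come from two yes/no questions: is a stimulus added or removed, and does the behavior become more or less likely? The short Python sketch below illustrates that idea. The function name `classify_consequence` and its parameters are hypothetical, made up for this example only.

```python
def classify_consequence(stimulus_added: bool, behavior_increases: bool) -> str:
    """Name the type of consequence (hypothetical helper for illustration).

    stimulus_added:     True if something is presented, False if it is removed.
    behavior_increases: True if the behavior becomes more likely afterward.
    """
    # "Positive" means a stimulus is presented; "negative" means one is removed.
    sign = "positive" if stimulus_added else "negative"
    # Reinforcement strengthens a behavior; punishment weakens it.
    kind = "reinforcement" if behavior_increases else "punishment"
    return f"{sign} {kind}"


# The examples from this section:
print(classify_consequence(True, True))    # cookie for an "A"
print(classify_consequence(False, True))   # painkiller removes pain
print(classify_consequence(True, False))   # extra homework for misbehavior
print(classify_consequence(False, False))  # phone taken away after curfew
```

Note that neither input asks whether the consequence is "good" or "bad"; only the direction of the stimulus and the effect on the behavior matter.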

Skinner used positive and negative reinforcement and punishment to train the rats inside the Skinner Box. But sometimes, we can't wait forever for the rat to learn. To speed up the process, we can use shaping. Shaping reinforces the steps that lead toward the end goal. For example, the rat could be reinforced for touching the lever with any part of its body. Because we're rewarding any behavior close to the behavior we want, we have a greater chance of getting the rat to stumble upon the behavior we want.

There's also something called chaining. When the animals are trained enough, they can perform multiple tasks in sequence in order to get the reward. A good example of this would be going through an obstacle course to get the final reward.

The basic conditioning phenomena also describe the process of operant conditioning:

Acquisition: The rat learns to press the lever for food.
Extinction: The rat unlearns the connection between the lever and food and stops pressing the lever.
Spontaneous recovery: After a period of rest without the learned behavior, the rat presses the lever again.
Generalization: The rat presses anything that looks like a lever, as if it will also produce food.
Discrimination: The rat learns to press only a certain lever.

🎥 Watch: AP Psychology - Positive and Negative Punishments

Limitations to Operant Conditioning

Despite strict behaviorists’ claims, there are limitations to operant conditioning. When presented with a puzzle 🧩, some organisms can discover the solution without any reinforcement to guide them to it. This phenomenon is known as insight learning. Insight learning is sometimes referred to as the “a-ha moment,” when one suddenly realizes the solution to a problem 💡.

Edward Tolman found that rats did not show any noticeable improvement in getting through a maze in the absence of reinforcement. However, when reinforcement was provided, he found a marked decrease in the time needed to finish the maze. This suggests that the rats already knew the solution to the maze but had not expressed it behaviorally, meaning that they had a cognitive map of the maze. Tolman called this latent learning ⏳.

https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2Frat.png?alt=media&token=fc717723-4e5f-457d-ba0d-e0bd6cf80c5a

Image Courtesy of Wikimedia Commons.

Not all types of stimuli will necessarily be conditioned with all types of responses. John Garcia found that organisms are predisposed to be conditioned to a taste if the corresponding response is internal. For example, the response of nausea 🤢 is more likely to be conditioned to a taste stimulus than to an external stimulus, such as a sound 🔊.

Other research has shown that cognitive interpretations of conditioning also play a role. If a person believes that some other stimulus, rather than the intended one, is responsible for the outcome, then conditioning to the intended stimulus may not occur.

Reinforcement Schedules

The probability of successful operant conditioning depends upon how the reinforcements are presented.

When reinforcement is delivered on a fixed schedule, it occurs in a predictable (but not continuous) pattern. One knows when the next reinforcement will be given, assuming the behavior is performed. When reinforcement is presented on a variable schedule, it is not predictable, and it is not apparent exactly when the next reinforcement will occur.

When reinforcement is given on an interval schedule, a certain amount of time must pass before reinforcement is given, assuming the behavior is performed. When reinforcement is given on a ratio schedule, a certain number of behaviors must be performed before the reinforcement is provided.

Altogether, this makes four different types of schedules of reinforcement:

Schedule | Explanation | Real World Example
Fixed ratio | Rewarded after a specific number of responses #️⃣ | You get paid $100 after writing two columns.
Variable ratio | Rewarded after an average but unpredictable number of responses | You put money in a slot machine. It pays out after several plays, but the player is uncertain of the number because it varies.
Fixed interval | Rewarded after a set amount of time has elapsed 📅 | People who earn a monthly salary
Variable interval | Rewarded after an average but unpredictable amount of time has elapsed | A person checks email and is rewarded with a message at varying times.

Table adapted from Open Source Textbook.
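To see how the four schedules differ, it can help to write each one as a rule that decides whether a given response earns reinforcement. The Python sketch below is only an illustration under simple assumptions. The function names and the way the "unpredictable" amounts are drawn (uniformly around an average) are choices made for this example, not part of any standard definition.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response (hypothetical helper)."""
    count = 0

    def respond(t):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses averaging mean_n."""
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)  # assumption: uniform around the mean

    def respond(t):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(wait):
    """Reinforce the first response made once `wait` time units have elapsed."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= wait:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_wait, rng):
    """Reinforce the first response after an unpredictable wait averaging mean_wait."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_wait)  # assumption: uniform around the mean

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_wait)
            return True
        return False

    return respond


# One response per time unit, at t = 1, 2, ..., 12.
times = range(1, 13)
fr = fixed_ratio(3)
fi = fixed_interval(5)
print([t for t in times if fr(t)])  # [3, 6, 9, 12]
print([t for t in times if fi(t)])  # [5, 10]
```

On the ratio schedules, only the count of responses matters, so responding faster earns reinforcement sooner. On the interval schedules, extra responses before the time has elapsed earn nothing.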

🎥 Watch: AP Psychology - Operant Conditioning

Key Terms to Review (31)

Acquisition

: In psychology, acquisition refers to the initial stage of learning or conditioning. It's when a response is first established and gradually strengthened.

B.F. Skinner

: B.F. Skinner was a psychologist who developed the theory of operant conditioning, which posits that behavior is determined by its consequences, be they reinforcements or punishments.

Chaining

: Chaining is a behavioral psychology term that refers to the process of linking together individual behaviors into a sequence or chain. Each behavior in the chain serves as both an end result of the previous behavior and the cue for the next one.

Cognitive Map

: A cognitive map is a mental representation or image of the layout of one's physical environment.

Discrimination

: In psychology, discrimination refers to an organism’s ability to differentiate between a specific stimulus and similar but not identical stimuli.

E.L. Thorndike

: Edward Lee Thorndike was an American psychologist who developed the law of effect, which states that behaviors followed by pleasant outcomes are likely to be repeated, while those followed by unpleasant outcomes are not.

Edward Tolman

: Edward Tolman was an American psychologist who developed a cognitive view of learning, which became known as latent learning. He believed that people and animals are active information processors and not just passive learners as suggested by behaviorism.

Extinction

: In psychology, extinction refers to the gradual weakening and eventual disappearance of a conditioned response. This occurs when the conditioned stimulus is repeatedly presented without the unconditioned stimulus.

Fixed Interval

: A fixed interval is a schedule of reinforcement where the first response is rewarded only after a specified amount of time has elapsed.

Fixed Ratio

: A fixed ratio schedule is a system of reinforcement in operant conditioning where a response is reinforced only after a specified number of responses.

Fixed Schedule

: In psychology, fixed schedule refers to delivering reinforcement after a specific number of responses or after a specific time interval has passed.

Generalization

: Generalization in psychology refers to the tendency for the conditioned response to be evoked by stimuli that are similar to the stimulus to which it was originally conditioned.

Insight Learning

: Insight learning is a form of cognitive learning where animals or humans solve a problem using a sudden understanding or realization, rather than trial and error.

Interval Schedule

: An interval schedule in psychology refers to a schedule of reinforcement where responses are reinforced after a specific amount of time has passed.

John Garcia

: John Garcia was a psychologist known for his research on taste aversion, showing that some species are biologically prepared to make certain associations, which led to the development of the Garcia Effect.

Latent Learning

: Latent learning is a type of learning that occurs without any obvious reinforcement and isn't demonstrated until there's an incentive to do so.

Law of Effect

: The Law of Effect is a psychological principle advanced by Edward Thorndike suggesting that responses closely followed by satisfaction will become firmly attached to the situation and therefore more likely to reoccur when the situation is repeated. Conversely, if the situation is followed by discomfort, connections to the situation will become weaker.

Negative Punishment

: Negative punishment involves taking away something desirable or enjoyable to decrease the likelihood of a particular behavior reoccurring.

Negative Reinforcement

: Negative reinforcement involves removing an aversive stimulus to increase the likelihood that a desired behavior will be repeated.

Operant Conditioning

: Operant conditioning is a type of learning where behavior is controlled by consequences. Reinforcement or punishment is used to either increase or decrease the likelihood of a behavior happening again.

Positive Punishment

: Positive punishment is a concept in psychology that involves adding an undesirable consequence or stimulus after a behavior to decrease the likelihood of that behavior.

Positive Reinforcement

: Positive reinforcement involves adding a rewarding stimulus to increase the likelihood that a desired behavior will be repeated.

Punishment

: Punishment is a process that decreases the likelihood of a behavior recurring by applying an unpleasant stimulus or removing a pleasant one following the behavior.

Ratio Schedule

: A ratio schedule in psychology refers to a program by which reinforcement depends on the number of correct responses.

Reinforcement

: Reinforcement is a consequence that strengthens or increases the likelihood of a behavior by providing a desirable outcome or removing an undesirable one.

Shaping

: Shaping refers to gradually teaching a new behavior by reinforcing steps that come closer and closer to the target behavior until it is achieved.

Skinner Box

: The Skinner Box, also known as an operant conditioning chamber, is a device used in experiments conducted by B.F. Skinner to study animal behavior. It's designed to provide a controlled environment for reinforcing or punishing specific behaviors.

Spontaneous Recovery

: Spontaneous recovery refers to the reappearance of a previously extinguished conditioned response after some time has passed without exposure to the conditioned stimulus.

Variable Interval

: A variable interval is a schedule of reinforcement where a response is rewarded after an unpredictable amount of time has passed.

Variable Ratio

: In psychology, variable ratio refers to delivering reinforcements after an unpredictable number of responses.

Variable Schedule

: A variable schedule in psychology refers to a schedule of reinforcement where a response is reinforced after an unpredictable number of responses or an unpredictable amount of time.


© 2024 Fiveable Inc. All rights reserved.

AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.

