Operant Learning for Dogs

A series of articles for professional dog trainers, those who want to become professional dog trainers, and those who want to become certified dog trainers.

Operant learning is any procedure in which a behavior becomes stronger or weaker (e.g., more or less likely to occur), depending on its consequences. Also called instrumental learning. (Chance, Learning & Behavior, 5th ed., pg 453.)

What this means is that the animal makes a choice: a song popular during your “coming of age” summer plays on the radio and you turn the volume up; the electric can opener runs and the cat comes into the kitchen; you hear a police siren and you put your foot on the brake.

Edward Lee Thorndike is generally credited as one of the first people to study animal intelligence using the scientific method. Thorndike’s work was done around the turn of the 20th century, and through it he formulated the law of effect. The law of effect states that behavior is a function of its consequences. Although this seems somewhat simplistic in today’s world, the law of effect was the foundation for B.F. Skinner’s work and what we now call operant learning.

In operant learning you are making a choice based on the environmental stimuli surrounding you. The stimuli do not cause you to do something; rather, they set the occasion for the performance of a behavior that is likely to be reinforced or help avoid something unpleasant. You don’t have to put your foot on the brake when you hear a police siren, but you know that you are less likely to receive a ticket if you do. However, if your wife is in labor with your first child, you may choose to ignore the siren and continue at full speed to the hospital. So, you’re making a choice based on the various environmental stimuli surrounding you.

Regardless of your choice, your heart rate probably still increased at the sound of the siren, because that is a respondent behavior and is not as easily controlled. Many people would say that they “reflexively” step on the brake at the sound of a siren, but in reality that is an operant behavior.

Remember from last issue’s article that the difference between respondent and operant learning is in the contingency. The contingency for respondent learning can be stated as: When “a” happens, “b” follows. The contingency for operant learning can be stated as: If I do “a,” then “b” will likely follow. So, in respondent learning, you don’t have control over the events, but in operant you do.

The Operant Model

We can’t talk about operant learning without discussing the operant model. There are four procedures in the model, and visualizing it often helps.

Basically, one of three things will happen to behavior – it will increase in frequency, it will decrease in frequency, or it will remain the same. The increase or decrease is the effect the operation has on the behavior. If behavior increases or remains the same, it has been reinforced. If it decreases, it has been punished.

Another way to think about the model is that you have an operation and an effect of that operation. So the adding or removing of stimuli is the operation and the increase or decrease in behavior is the effect of that operation.

There are two ways that the environment can change to effect an increase or decrease in behavior. Something can be added to the environment or something can be removed from the environment. If something is added, the operation is positive; if something is removed, it is negative.

All of these terms are derived from the vocabulary of the science of behavior. There is no emotional baggage that goes with the terms and they should not be thought of emotionally. Positive and negative are mathematical terms; reinforcement simply means that behavior increases and punishment simply means that behavior decreases.

Here are the definitions of the relevant terms:

  • Reinforcement – The procedure of providing consequences for a behavior that increase or maintain the strength of that behavior. (Chance, Learning & Behavior, 5th ed., pg 454.)
  • Punishment – The procedure of providing consequences for a behavior that reduce the strength of that behavior. (Chance, Learning & Behavior, 5th ed., pg 454.)
  • Positive Reinforcement – A reinforcement procedure in which a behavior is followed by the presentation of, or an increase in the intensity of, a stimulus. (Chance, Learning & Behavior, 5th ed., pg 453.)
  • Negative Reinforcement – A reinforcement procedure in which a behavior is followed by the removal of, or a decrease in the intensity of, a stimulus. (Chance, Learning & Behavior, 5th ed., pg 454.)
  • Positive Punishment – A punishment procedure in which a behavior is followed by the presentation of, or an increase in the intensity of, a stimulus. (Chance, Learning & Behavior, 5th ed., pg 453.)
  • Negative Punishment – A punishment procedure in which a behavior is followed by the removal of, or a decrease in the intensity of, a stimulus. (Chance, Learning & Behavior, 5th ed., pg 454.)

Here are some examples of the four operations:

  • Positive Reinforcement (R+) – the dog sits, the trainer gives him a cookie; the dog’s frequency of sitting increases
    • A cookie was added (positive) to the environment and the sitting behavior increased (reinforcement)
  • Negative Reinforcement (R-) – the rat is in a box with an electrified floor, the rat presses a lever and the electricity stops; the rat’s lever pressing behavior increases
    • Electricity was removed (negative) from the environment and the lever pressing behavior increased (reinforcement)
  • Positive Punishment (P+) – The dog is spanked for getting in the garbage; the dog’s frequency of getting in the garbage decreases
    • Pain was added (positive) to the environment and the scavenging behavior decreased (punishment)
  • Negative Punishment (P-) – The dog paws at the owner and the owner gets up and walks out of the room; the dog’s pawing behavior decreases
    • The owner was removed (negative) from the environment and the dog’s pawing behavior decreased (punishment)
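The four examples above can also be expressed as a small lookup: name the operation (was something added or removed?), name the effect on the behavior, and the procedure follows. The Python sketch below is only an illustration of that logic. The names `Operation`, `Effect`, and `classify` are made up for this example and do not come from Chance or any other source.

```python
from enum import Enum


class Operation(Enum):
    """What changed in the environment after the behavior (hypothetical names)."""
    ADDED = "positive"    # a stimulus was added
    REMOVED = "negative"  # a stimulus was removed


class Effect(Enum):
    """What happened to the frequency of the behavior afterwards."""
    INCREASED = "increased"
    MAINTAINED = "maintained"
    DECREASED = "decreased"


def classify(operation: Operation, effect: Effect) -> str:
    """Name the operant procedure for an operation and its observed effect."""
    # Behavior that increases or stays the same has been reinforced;
    # behavior that decreases has been punished.
    kind = "punishment" if effect is Effect.DECREASED else "reinforcement"
    return f"{operation.value} {kind}"


# The four examples from the list above.
examples = {
    "dog sits, gets a cookie, sits more": (Operation.ADDED, Effect.INCREASED),
    "rat presses lever, shock stops, presses more": (Operation.REMOVED, Effect.INCREASED),
    "dog is spanked for scavenging, scavenges less": (Operation.ADDED, Effect.DECREASED),
    "dog paws, owner leaves, paws less": (Operation.REMOVED, Effect.DECREASED),
}

for description, (operation, effect) in examples.items():
    print(f"{description}: {classify(operation, effect)}")
```

Note that the classification depends only on what actually happened to the behavior, not on what the trainer intended. If the spanking did not reduce the scavenging, the sketch would not label it punishment.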

The toughest of the four operations to wrap your mind around is R-. An easy way to think of R- is to remember that it’s about escape or avoidance. A behavior increases in order to escape or avoid something undesired – pain, inclement weather, hunger, etc.

This hierarchy model was presented by Dr. Susan Friedman for a Raising Canine telecourse.

When using a humane hierarchy, we generally begin behavior modification by assessing the animal’s physical and mental well-being, then use R+ procedures, moving down the hierarchy to extinction, P-, R-, and finally P+. There is some debate as to the order of extinction, P-, and R-, but there is no debate that R+ is the optimum choice and P+ the last choice.

Extinction

There is another way behavior changes, and that is through extinction. Although there is some argument that extinction falls into the punishment category because it causes a decrease in behavior, there is no procedure to cause that decrease, so it does not fall within the operant model. Extinction is, in operant learning, the procedure of withholding the reinforcers that maintain a behavior. (Chance, Learning & Behavior, 5th ed., pg 451.)

Most trainers are familiar with the concept of extinction and often recommend it for annoying behaviors such as pawing for attention, barking to be let in, jumping up, etc. However, extinction can be very tricky and is probably not the best procedure to recommend to owners who have a very limited understanding of these procedures. A differential reinforcement schedule will work better and will be discussed in detail in a later article.

When using extinction to decrease a behavior, there are some interesting phenomena to be aware of. They are extinction bursts, spontaneous recovery and resurgence. The important thing to remember about extinction is that the word “extinction” is a behavioral term, and the reality is that an extinguished behavior does not go away entirely, but goes back to its baseline level – the level it was at before it was reinforced. The potential for any behavior we do is already in us – for example, an elephant will never fly and a salmon will never walk, because they do not have the genetic predisposition to do so. So any behavior we do is there, waiting for the right stimulus to come along and elicit that behavior.
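To make the "back to baseline, not to zero" point concrete, here is a deliberately simple toy model in Python. It is an assumption made for illustration only, not an empirical model and not something from Chance: the function `extinction_curve`, the fixed `rate`, and the numbers are all hypothetical. It also ignores extinction bursts, spontaneous recovery and resurgence.

```python
def extinction_curve(start: float, baseline: float, rate: float, trials: int) -> list[float]:
    """Return the toy response strength after each unreinforced trial.

    start    -- strength of the behavior when reinforcement stops (hypothetical units)
    baseline -- strength of the behavior before it was ever reinforced
    rate     -- fraction of the gap to baseline closed on each trial (assumed constant)
    """
    strengths = []
    strength = start
    for _ in range(trials):
        # Each unreinforced trial closes a fixed fraction of the gap to baseline,
        # so the value approaches the baseline but never drops below it.
        strength = baseline + (strength - baseline) * (1 - rate)
        strengths.append(strength)
    return strengths


# Example: a dog paws 10 times per session after reinforcement,
# and pawed about 2 times per session before it was ever reinforced.
print(extinction_curve(start=10.0, baseline=2.0, rate=0.5, trials=4))
# prints [6.0, 4.0, 3.0, 2.5]
```

In this sketch the pawing settles toward 2 per session rather than 0, which is the sense in which an extinguished behavior returns to its baseline level. Real extinction data are far less tidy.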

An extinction burst is an increase in behavior during the early stages of extinction. Understanding reinforcement schedules helps in understanding extinction bursts. We know that a variable reinforcement schedule will create a stronger behavior, so when we initially put a behavior on an extinction schedule, it could be considered a variable reinforcement schedule – those first few trials when reinforcement is no longer available can result in a stronger, more variable behavior. This is where most people give up and reinforce the behavior because they think it’s getting stronger when, in fact, it is simply in the early stages of extinction. It is important to understand that extinction bursts can also increase emotional behaviors, including aggression.

Spontaneous recovery is when the environmental stimuli are such that they set the occasion for the behavior to recur. Remember that the behavior has been reinforced in the past, so if a reinforcing stimulus is present it’s not unusual for the behavior to reappear. The classic example is the vending machine that is out of a particular product. Our behavior may have extinguished after a couple of tries, but if we walk past that machine again in a few days, we may try it again. There are many examples in our lives of spontaneous recovery – using a TV remote that has dead batteries, for instance. We know the batteries are dead but we still give it a try! If a spontaneously recovered behavior is not reinforced, it will extinguish again quite quickly; however, if it is reinforced it is now on a variable reinforcement schedule and it will be much harder to extinguish.

Resurgence is another familiar event in a trainer’s life – although most of us don’t know what it’s called. Resurgence is when a behavior has been extinguished and another behavior is now on an extinction schedule, and the previously extinguished behavior reappears. We often refer to this in a trained animal as “running through his repertoire” trying to find the behavior that will be reinforced. We’re standing there with our dog’s food bowl and he throws out every behavior he knows, including behaviors he hasn’t been reinforced for in a long time – this is resurgence.

Extinction bursts, spontaneous recovery and resurgence are the reasons it’s difficult to use extinction as a method of decreasing a behavior. Unless you are familiar with these concepts, it is very likely that a reappearance of the behavior will result in reinforcement, making it much harder to extinguish.

Summary

In this article we have discussed two of the important concepts in operant learning. However, there is a lot more to know about operant learning, so the next article will discuss more operant concepts.

Raising Canine has a school for dog trainers which focuses on operant training for dogs, dog behavior, working with clients and addressing client compliance, and the science behind behavior modification.