Teaching husbandry behaviors with clicker training: Tooth Inspection

Is your horse comfortable letting you look at his teeth?

In the last few years, I’ve encountered a variety of teeth issues with my own horses and it has made me realize the importance of being able to check their teeth on a regular basis. Without special equipment, I can’t do a complete mouth exam, but I can check their incisors for uneven wear patterns or other signs that they need the attention of a dentist. This allows me to catch problems early. Regular tooth inspection also prepares them for when the dentist does come.

When I started looking at teeth, I thought it would be simple to just move their lips and take a peek. My horses are used to being touched all over and they are comfortable having my hands near their mouths when I am hand feeding, grooming, haltering and bridling. But, I found that while they were ok with my hands near their mouths for routine tasks, asking them to open their lips was an unfamiliar behavior and they weren’t sure what to do. Some of them became confused and offered other things. Others just became anxious and put their heads up or moved them around.

So, I decided this was a great training project. I started with a fairly simple goal which was to teach each horse to do a behavior that would allow me to look at his or her incisors from the front and from both sides. This meant I needed a behavior with some duration and I needed the horse to hold his head in a position where I could see the view of the teeth that I wanted.

My first thought was that maybe I could make use of a behavior that I already had on cue. This is the flehmen response (“smile”) which I had captured or shaped with several of them as a fun trick. It is a great way to see their teeth and they all learned to do it quite easily. But, horses tend to pick their heads up quite high when they show their teeth this way and I wasn’t sure how easy it would be to build duration or be able to see the teeth from the side. For it to be useful, I would need to shape it into a more controlled behavior. I’m sure I could have done that, but I decided that rather than change a long-established behavior, it might be easier to start with something new.

I decided I would start over and teach them to hold their heads still and allow me to gently move their lips out of the way so I could look at their teeth. This would make it easier for me to keep the behavior on cue, and I could build in some flexibility in how I did it. This might be more practical, especially if I needed to see their teeth from a certain angle. I also thought it might be an interesting challenge for one of my horses who tends to have a busy mouth. Learning to keep his lips and tongue still would be a good exercise for him.

This brings up an interesting point, which is that there are always several ways to approach any husbandry behavior and it’s a good idea to think about the options before choosing one. One of the first questions I always ask myself is whether I want the horse to do the behavior on his own, or whether I want the horse to allow me to physically manipulate him.

There are advantages and disadvantages to both approaches and I always consider the horse’s individual needs (which includes past training history) as well as how, when, and where the behavior will be used.  In many cases, the first approach, where the horse does the behavior on his own, gives the horse more control of the training and encourages the horse to be a more active participant than the other approach, where he is trained to let you do a behavior to him. But, in some cases, learning to allow manipulation requires just as much participation on the horse’s part and can lead to a horse that is more comfortable about tactile information in general. On the other hand, some horses will become less eager to participate if there is an element of manipulation (or any suggestion of “making them do it”), and it’s better to set up the training so they have more control over the process.

Therefore, when choosing how to approach tooth inspections, I had to think a little bit about the possible implications of choosing what might seem like a more passive behavior. The behavior “allowing a person to look at your teeth” is different than “showing your teeth” and I didn’t want the horses to just learn to accept something that was unpleasant. Could I train it in such a way that the horses were still able to communicate when and how the behavior would be done? Yes! Once I started training, I found that I could set up a nice dialog by using specific behaviors as starting points and waiting for the horse to indicate when he or she was ready to continue.

The Teaching Progression

I spent about six weeks working on this with three of my horses: Rosie, Red and Aurora. We didn’t work on it every day and some horses had more sessions than others.  Aurora had the most, Rosie had the least, and Red was somewhere in between.  They all came into the training with slightly different repertoires of trained behaviors and they all have very different personalities.

My plan was to start with a chin target and then refine it to include a closed mouth and quiet lips. One reason I chose to use a chin target is because it’s a behavior where my hand is in close proximity to the mouth, but it’s a position that is less likely to prompt lip movement. It also made it easy to stabilize the horse’s head when I was ready to use my other hand to lift the lips. But, once I started training, I realized that each horse needed a slightly different progression to get to the same basic behavior of chin target with a closed mouth and quiet lips.

Here’s how it worked out:

• Rosie: She is the most experienced of the group. She has done a lot of chin targeting, knows how to “smile” on cue, and is the least nibbly about fingers. To avoid cueing the smile behavior, I started with a chin target and then moved to reinforcing closed mouth and quiet lips. I was careful not to touch her between her nostrils until the chin target/quiet mouth was well established. A finger to that area between the nostrils is her cue to smile. So, Rosie’s progression was chin target -> mouth closed -> lips quiet.

• Red: He is a very clickerwise horse, but had not been taught to chin target and he is the most oral of the group. He loves to lick people and can be a little nibbly with his lips if my hands are near his face. He also knows how to smile on cue. Since I knew his biggest challenge would be having a quiet mouth, I started by clicking him for that before I put my hands near his face. Once he could keep his mouth closed, then I added a chin target. Once he could do both of those, then I clicked for quiet lips. I probably clicked for quiet lips at other points in the process if the opportunity presented itself, but his general progression was mouth closed -> chin target -> quiet lips.

• Aurora: She is the least experienced. I have not taught her a chin target or to smile, but she does know a hand target. I’ve spent time touching her all over and making sure she’s comfortable with it, but because she’s only 3, she has the least life experience with medical and husbandry procedures. She tends to have a quiet mouth and be pretty passive about having things done to her. She’s also the one I have to watch because her body language is the least clear of the group. She doesn’t always tell me when something is bothering her. Her progression was hand target -> chin target -> closed mouth. I didn’t have to focus on quiet lips with her.

With the chin target, I experimented with leaving my hand near their chin vs. removing it between repetitions and found that both were useful at different times. If I was doing a few repetitions in the same position, it was less distracting if I left my hand near, but not on, the chin while I fed the treat. Then I could just ask the horse to target my hand when we were starting again. They can’t see my hand under their chin, so I would just touch the chin gently and click if the horse maintained the contact. If I wanted the horse to have a break or was changing sides, then I would remove my hand and start over again for the next repetition. For the most part, I kept my hand in position near (but not on) their chin if they moved their heads around a bit.

At a certain point, once everyone had the basic idea, the beginning sequence of behaviors started to look the same for all the horses. I would start with a chin target, wait for them to be ready (the horse indicated this by having a quiet mouth and lips) and then I would start moving the lips so I could look.   

I broke it down into small steps and followed this general progression:

  • Head still – click for relaxed/neutral position
  • Chin target – click for chin in hand
  • Mouth closed – click for mouth closed
  • Quiet lips – click for quiet lips (this step and the previous one were sometimes done together or in the other order)
  • Quiet lips when I placed my hand on them – click if the horse remained relaxed with no mouth or lip movement
  • Relaxed lips while I gently moved them away from the teeth – click if the horse allowed me to move the lips out of the way
  • Relaxed lips while I moved them more –  click for allowing me to move the lips. This step took quite a lot of time as they had to learn to relax their lips. If they wanted to move their lips out of the way for me (Rosie did this), then I was ok with that, although I had to be careful about what I was clicking.
  • Relaxed lips while I held up their lips for longer durations so I could see their teeth – click for a good moment (relaxed lips, sufficiently open) and enough duration. I built duration slowly over time.

I also monitored head and neck position to make sure they were comfortable. Once I could see the teeth, I might have to click and treat them for moments when they had their teeth together, as some of them would open their mouths slightly as I moved their lips.

One interesting challenge with training the tooth inspection behavior was that a quiet mouth included not chewing. In a lot of my training, I don’t need to wait for the horse to finish eating before I do another repetition and my horses are happy to start again while eating their last reinforcer. But, with this behavior, I found it was better to give them time to completely finish eating before I asked again. Waiting for the horse to be done chewing was something I had to be consciously aware of doing.

Waiting for them to be done chewing meant that once I was past the chin target stage, my rate of reinforcement dropped a lot unless I mixed in other behaviors. Rosie and Red didn’t seem to mind, and if they wanted to do another repetition, they would actually stop chewing even though they weren’t completely done eating. This became a great way for them to tell me when they were ready to go again.

Aurora preferred to finish eating before we did the next repetition of looking at her teeth, so I found it was helpful to add in targeting or other simple behaviors if the rate of reinforcement was getting too low.  I also had her practice the chin target outside of the tooth inspection sessions. I might mix in a few chin targets while I was grooming her, picking moments when she was standing quietly and not chewing. This strengthened the chin target behavior and she could finish eating while I moved on to grooming or something else.

What can I see?

I taught the horses to let me look at their teeth from three positions: the left side, the right side, and in front. I don’t have any professional training in dentistry, but my goal was to be able to look at and check the incisors for a few specific things. This would allow me to catch problems early and be more knowledgeable when my dentist came. As with any profession, there are different schools of thought on tooth care, but I think every horse owner can learn to identify a few simple deviations from correct alignment.

From the front:

1. Are the incisors level? Is the horse wearing the teeth on one side more than the other? Some horses will develop curved incisors (a “frown” or a “smile”) or a wedge mouth where the teeth are longer on one side than the other.

2. Are the upper and lower jaw centered over each other? From the front, I can look at how the top teeth line up with the bottom teeth. This tells me about the alignment of the jaw from side to side.

3. I can also look for any uneven wear pattern that might occur from repetitive action such as the horse biting at a stall grill or bar.  If these things are noticed early, you can take action to prevent further damage.

From the sides:

1. I can check the jaw alignment by looking at the relative positions of the last incisors on each side.  Do the edges of the top incisors and bottom incisors meet in a straight line? If not, then this tells me that the lower jaw is displaced to one side, which can indicate there is an issue in the back teeth (pre-molars and molars) and/or in the TMJ.

2. I can also check if the middle incisors (top and bottom) are meeting correctly so that the top teeth are positioned directly over the bottom teeth.  Sometimes these teeth will be misaligned so that top teeth extend farther forward than the bottom teeth (or the reverse).

3. I can look at the angle of the incisors. If the incisors are too long, the teeth (both top and bottom) will get pushed forward and the angle between them will get smaller.  Long incisors are more common with older horses but it’s worth checking on all horses.

With babies, being able to look at the teeth can help me monitor if the teeth are coming in at the right time and also if the caps are being shed. When Red was young, he retained a cap on one of his incisors and it affected how the adult tooth came in. If we had caught it sooner, he wouldn’t have had uneven front teeth for as long as he did.

I’ve made a little video to show how everyone is doing.  You can watch it at
https://www.youtube.com/watch?v=Fz78MbL8GEc. I am very pleased with their progress and we are starting to work on more duration.  At this point I can see well enough to check the alignment and watch for problems, but there are still plenty of things we can work on.

In addition to increasing duration, and continuing to work on relaxation, I might explore using the chin target in different ways. I haven’t taught them to come and line up with a chin target, so that might be fun to do. It might also be interesting to try and have them chin target on something else so I have two hands available for a more thorough check.

I also have to decide what to do about their tongues. Both Red and Aurora tend to place their tongues between their front teeth, so just the tip is peeking out. I can still see what I need to see, but I’d love to be able to see just their teeth with the tongue tucked behind them. Once we have more duration, I may try and see if I can start reinforcing for moments when the tongue is tucked neatly inside. I’ll update this blog if I decide to work on that and let you know how it goes.

Note: The best place for me to film is in my wash stall, so the video is in that location with the horses wearing halters and leads. I did do some of this training in the wash stall, and even did a few sessions where they were on the cross-ties. But, I also did some sessions in their stalls without any equipment.  I like to teach husbandry behaviors in a few different places and under different conditions because I find this makes the behaviors more robust and the horses seem to handle unexpected variations better.

If you are interested in learning more about teeth, here are some links that I have found to be helpful:

Descriptions of some common malocclusions:

Articles about the importance of teeth for digestion and proprioception: http://www.vossequine.com/

Article about the connection between feet and teeth: http://thenaturallyhealthyhorse.com/feet-teeth-connection-qa-dr-tomas-teskey/

Notes from the Art and Science of Animal Training Conference (ORCA): Choice

The idea of choice was one of the underlying themes of the conference and is always an important consideration for positive reinforcement based animal trainers. At some level, animal training is about teaching an animal to do the behaviors we want, and to do them when we want them, but there are many different ways to go about getting there.

This conference has always been about exploring how we can achieve our goals, while ensuring that the learning process is enjoyable, the learner is allowed to actively participate in the process, and that he becomes empowered by his new skills and his relationship with his trainer.

There were a lot of presentations that touched on some aspect of choice.

  • Dr. Killeen spoke on how understanding the use of the Premack Principle opens up more choices for reinforcement and can lead to a better understanding of how the value of different reinforcers can change depending upon environmental conditions.
  • Emily Larlham talked about how we can teach our dogs to make different choices instead of becoming stuck in behavior patterns that create stress for both parties. She also talked about how important attitude was in training and how being allowed to actively participate and make choices contributes to a dog’s enthusiasm for any chosen activity.
  • Alexandra Kurland talked about how trainers make choices based on the kind of relationship they want to have with their learner (authoritarian vs. nurturing), and how these decisions influence how much choice they give their animals.
  • Barbara Heidenreich provided lots of examples of how to provide choice through more types of reinforcers and a discussion of why it’s important for both the trainer and the learner to have options.
  • Dr. Andronis showed what happens when animals have limited choices about when and how to earn reinforcement, and how to recognize behaviors that indicate that the learner is no longer enjoying and engaged in the learning process.

These are just a few of the references to choice that came up in the other presentations, but they show that animal trainers have to think about choice all the time. Sometimes we are looking for ways to increase it. Sometimes we are looking for ways to limit it so the animal cannot practice behavior we don’t want. And sometimes we are educating the animal so he learns to make “good” choices.

Since choice is such an important topic, I saved it for last. The notes in this article are based on two presentations that dealt more specifically with choice. The first presentation was given by Jesús Rosales-Ruiz and was titled “Premack and Freedom.” The second presentation was given by Ken Ramirez and was titled “Teaching an animal to say ‘No.'” They go together nicely because Jesús talked about choice from a more academic point of view and Ken talked about how to use choice as part of training.

Jesús Rosales-Ruiz: “Premack and Freedom.”

He started with a quote from David Premack:

“Reward and Punishment vs. Freedom”

“Only those who lack goods can be rewarded or punished. Only they can be induced to increase their low probability responses to gain goods they lack, or be forced to make low probability responses for goods that do not belong to them. Trapped by contingencies, vulnerable to the control of others, the poor are anything but free.”

This quote puts a slightly different perspective on how reinforcement and punishment work and on the idea of choice. If you desperately need something and you work to get it, is that really choice? And how does that relate to reinforcement? We tend to think that by offering reinforcement, we are giving choices, but are we really doing this? Let’s look more closely at reinforcement and then see how it relates to choice.

How we think about reinforcement

The Language of Reinforcement

  • reinforcement as a stimulus (thing)
  • reinforcement as a behavior (activity)

Originally, it was more common to think of reinforcement as being about getting objects such as food, a toy or other item.  But, in Dr. Killeen’s talk, he spoke about the importance of recognizing that reinforcers are behaviors, an idea that came from Dr. Premack. There are some advantages to thinking of behaviors as reinforcers, because this change in thinking opens up the possibility for more types of reinforcers, and also makes it more obvious that the value of a reinforcer is variable.

Not all behaviors are going to be reinforcing at all times, and in some cases, there is reversibility between reinforcers and punishers. Dr. Premack did experiments showing the reversibility of reinforcers. He could set up contingencies that would make lever pressing, running and drinking all reinforcers (at different times), and he did this by adjusting the environment (adding constraints) so that access to some activities was contingent on other activities.

Want a drink? You have to run first. Running is reinforced by drinking.  Want to run? You have to have a drink first.  Drinking is reinforced by access to running.  He showed that behaviors can be either punishers or reinforcers,  depending upon the circumstances. So, perhaps the availability of reinforcers is not what defines choice.

What about freedom? Is choice about having freedom?

There are different ways we can think about freedom:

  • Freedom from aversive stimulation
  • Freedom to do what we want (positive reinforcement)
  • More generally (freedom from control)

How do these ideas about freedom apply to animal training?  And can they help us understand more about choice in animal training?

Clicker trainers meet all three of these “definitions,” to some degree. Jesús went over this pretty quickly, but it’s quite easy to see how clicker training can contribute to an animal’s sense of freedom or being able to make choices. Clicker trainers avoid using aversives by shaping behavior using positive reinforcement. They also avoid using aversives when the animal gives the “wrong” answer or does an unwanted behavior.

Clicker trainers don’t necessarily give the animal freedom to do whatever it wants, but they do use positive reinforcement, and over time positively reinforced behaviors do often become what the animal wants to do.  Also, during shaping, the trainer may want the animal to offer a variety of behaviors, so there is some element of choice there.

They may also choose to train behaviors or set up training so the animal has more control of its own training.  A lot of clicker trainers focus on training behaviors that the animal can use to communicate with the trainer so this can give the animal some feeling of control. We are still focusing on what we want them to do, but we are doing it in such a way that they feel they have more control.

At this point, Jesús mentioned that he doesn’t believe that total freedom exists.  If it did, then science would not exist because there would be no “laws.” Our behavior is always determined by something and while we may think we can control it, what we are really after is the feeling that we can control it, which may or may not be true.

I have to confess that at this point I found myself remembering endless college discussions about “free will” and whether or not it exists. I don’t think we need to go there, but I do think that it’s important to realize that it may sometimes be more accurate to say that our goal is for the animal to have the perception of control. Interestingly enough, one of the points that Steve White made was that perception drives behavior, not reality.

Let’s look at how we can control behavior in animal training:

With the idea of freedom and choice in mind, let’s look at four different ways to control behavior in animal training.

1. Control only through aversives:
(note: I added this one for completeness. Jesús referred to it but did not list it on his slides)

  • target behavior occurs -> animal is left alone or aversive ceases
  • target behavior absent -> aversive consequences
  • the animal has no control, choices or freedom

2.  Control through aversives, but with some added positive reinforcement

  • target behavior occurs -> positive consequences
  • target behavior absent -> aversive consequences
  • the animal can gain reinforcement, but it still does not have a choice and cues can become “poisoned” because the cue can be followed by either a positive or negative consequence, depending upon how the animal responds.

3.  Control through positive reinforcement

  • target behavior occurs -> positive consequences
  • target behavior absent -> no consequences
  • the animal can now “choose” whether or not it wants to gain reinforcement, without having to worry about aversive consequences for some choices.  But is it really choice if one option earns reinforcement and the other does not?

4.  Control through positive reinforcement with choices

  • target behavior occurs -> positive consequences
  • target behavior absent -> other positive consequences
  • animal always has another option for a way to earn reinforcement, so there is true choice between two options that both lead to reinforcement.

Jesús shared a couple of videos that showed animals making choices.  He did not spend a lot of time on these, so I can only provide a brief description and the point he was trying to make.

The first video was of a training session with a dog using clicker training and food.  The dog is loose and can participate or not.  As soon as the trainer starts clicking and treating, the dog leaves.  He followed that video with another one where the dog is trained with food alone (no clicker).  In this case, the dog stays and continues doing behaviors for reinforcement.  He said this was an example of a situation where the clicker itself was associated with “hard work” so when the clicker came out, the dog would leave.  He didn’t go into more details on the dog’s previous training history but those clips do suggest that the clicker itself can become an aversive stimulus.

The second video showed a horse being taught using clicker training and food. The horse is loose and the trainer is sitting in a chair.  Throughout the training session, the horse is eagerly offering a behavior, which the trainer clicks and reinforces with food. But, because the trainer is feeding small pellets of grain, some of the food is falling on the ground around the horse.  Despite the abundance of food on the ground, the horse prefers to keep offering behavior, and getting reinforced for it, rather than just eating the food off the ground. Jesús said this was an example of “real choice” because the same food reinforcement was available whether the horse did the behavior or not.

So far we have looked at how reinforcement and freedom contribute to the idea of providing choice for animals.  But there’s another consideration, and that’s hidden within the idea of repertoire size.

Restraint, constraint and the effect of repertoire size

Positive reinforcement trainers are usually very aware of the effect of restraint on animals and try to train under conditions in which the animal is not physically restrained.  They are also aware of the effect of constraint, but it seems to get less attention.   Constraint is when we control the “goodies” or limit the animal’s ways to earn them.

Jesús said it was important to avoid constraint in training.  Constraint can be physical, which means that the animal is in an environment where it might be “free” to move, but there are very few options for things to do.  A Skinner box is an example of an environment where the animal is constrained because the number of behaviors it can do is quite limited.

But, constraint is not always physical. It can also be more about skills or the repertoire of behaviors that are available to an individual. You could say this is a kind of “mental” constraint where the individual feels it has few options because it is only comfortable doing a few things.

He used some human examples to illustrate this. For example,  a person who has several skills has more freedom from constraint than someone who is only good at one thing.   If you are good at debating, dancing, and social interactions, then you can go to a debate, dance or eat lunch with your friends. If you are only good at debating (even if you’re really good at it), but you lack other skills, especially social ones that are important for many activities, then you are constrained by your own repertoire.

He called this being coerced by available behavior, because your options are limited if you have a limited repertoire. At the end of the presentation, Joe Layng made the comment that “feeling” free is actually being free. If you only have one way of getting the consequence you want, then you are still limited. Joe’s comment reminded me of Steve’s point that perception, not reality, drives behavior. It’s always interesting to see how these things come together.

So, one way to limit constraint and increase choices in animal training is to increase the animal’s behavioral repertoire.  This gives them more choices on several different levels because they have more options for reinforceable behavior overall, and it also may make it possible for you to give them more options at any given time.

To summarize:

Choice is not just about providing reinforcement or about removing aversives.  It’s about providing the animal with opportunities to earn reinforcement in many ways and increasing the animal’s repertoire so it has the skills and opportunities to practice many different behaviors.

Remember that:

  • A small repertoire = more constraint
  • limited skills = limited opportunities

If our goal is to increase freedom, then we need to be aware that individuals can be constrained by the available environment and available behavior.

Jesús ended with the question, “If my dog will only walk with me when I have treats, why is that?”  

That’s kind of a loaded question, and if I didn’t know Jesús was in favor of using treats, I might think he was suggesting that using treats was a problem. But I don’t think he was saying that at all. I think he was just encouraging the audience to think about what it means if your dog will only walk with you if you have food. What does that say about the choices he is making? Does it tell you anything about how much you might be using food to limit his choices, not to give him more choices?

At the end of Jesús’s presentation, I found myself pondering the practical application of the material he presented. Yes, I love the idea of giving animals choices and I know from personal experience that adding reinforcement is not the same as giving choices. But, I was thinking hard about what training would look like if several behaviors were all capable of receiving the same amount of reinforcement. The whole idea of clicker training is that we can select out and shape behavior by using differential reinforcement.

So, what would happen if you had several behaviors that could earn equal reinforcement? Well, lucky for me, Ken Ramirez’s presentation later in the day was on this exact topic.

Ken Ramirez:  Teaching an animal to say “No”

In his presentation, Ken shared some training that he did with a beluga whale who had become reluctant to participate in training sessions.  He started by saying that while his talk is titled, “Teaching an animal to say ‘No,'” he realizes that that phrase is just a convenient way to describe what they did, and is not necessarily how the animal perceived the training.

He spent a little time talking about terms like “no” and “choice.” They are labels we give to ideas so we can talk about them, but that’s not useful unless we make sure we are all using them in the same way, or have a common reference point.  He shared what he means by teaching “no,” choice, and how the two are related.

What is “no”?

  • Teaching “no” means teaching another reinforceable behavior, one that the animal can choose to do instead of the behavior that has been cued. In the example he’s going to share, they taught the whale that she could always touch a target for reinforcement.
  • Teaching “no” is different from teaching intelligent disobedience, which is more about teaching the animal that some cues override other cues. It’s also different from a Go/No-Go paradigm (e.g., a hearing test where you don’t respond if you don’t hear the tone) or an “all clear” in scent detection, which is just a new contextual cue.
  • We can only guess why the animal chooses to say “no.”  When the whale did touch the target, they had no way of knowing if she was doing it because it was easier, had a stronger reinforcement history, she didn’t want to do the other cued behavior, or …
  • But, regardless of why she chose it,  the value was that it gave her another option besides responding to the cue or ignoring the cue.  Tracking her “no” responses had the added benefit of allowing the trainer to gather information about her preferences and whether or not there was any pattern to when she chose to say “no.”

What is choice?

(These points should be familiar, as they are very similar to what Jesús discussed.)

  • It is hard to define
  • Arguably, no situation ever provides true choice (there are always consequences)
  • In “true choice,” the animal has the option to receive reinforcement in more than one manner (this goes along with Jesús’s point about true choice being where there are multiple ways to earn the same reinforcement).
  • It is about controlling outcomes

Choice Matters:

  • Real choice is rare
  • Choice is often forced (meaning it is limited, or only one option has a positive consequence)
  • Choice is a primary reinforcer (animals can be reinforced by the opportunity to control their own environment)

Choice in animal training:  a little history

The introduction of positive reinforcement training into zoos and other animal care facilities made it possible for trainers to choose training strategies that allowed their animals more choices.  In the beginning, it may just have been giving the animal a choice between earning reinforcement or not, but over time the training has gotten more sophisticated so that animals have more choices and can actively choose their level of involvement in training sessions.

Ken had some video clips that showed the ways that trainers in zoos can provide choice during husbandry behaviors.  One common practice is to teach stationing, which can be used to teach animals to stay in a specific location for husbandry behaviors.  The animal can choose whether or not to participate by either going to the station, or not.

Another option is to teach the animal an “I’m ready” behavior, which the animal offers when it is ready to start or continue. The trainer does not start until the animal offers the behavior, and she may pause and wait for the animal to offer it again, at appropriate intervals during the session, to make sure the animal is ready to continue.  Some common “I’m ready” behaviors are chin rests, targeting, the bucket game (Chirag Patel), and stationing.  These methods give the animal some choice because the animal is taught a specific way to tell the trainer whether he wants to participate or not.

Teaching stationing and “I’m ready” behaviors are examples of ways that trainers can give their animals more choices.  Teaching these kinds of behaviors usually leads to training that is more comfortable and relaxed for both the trainer and the learner.  A side benefit is that the trainers become much more skilled at observing their animals and paying attention to body language. And, they learn to wait until the animal is ready, which is always a good thing!

Husbandry behaviors can be unpleasant, but allowing the animal to control the timing and pace of the training can make a big difference in how the animal feels about what needs to be done.   However, this may still be quite different from providing “true choice.”

So, what does “true choice” look like?  This was the main part of Ken’s presentation.

The “No” Project:

The “no” project is the story of re-training Kayavak, a young beluga whale.  Kayavak was born at the Shedd Aquarium and has been trained for her entire life (5+ years) using positive reinforcement.  During that time, she developed strong relationships with several trainers and she would work both for food and for secondary reinforcers, especially tongue tickles.  She was easy to handle and responded well (to criteria and with fluency) to her cues.

In fact, she was so agreeable that they often let the younger and more inexperienced trainers work with her.  But, as she started to have more training sessions with less experienced trainers, and fewer with more advanced trainers, her behavior started to change. She became less reliable about responding to cues and was likely to just swim away, especially if they were working on medical behaviors.

This continued for several years until the “problem” was finally brought to Ken’s attention. By this time the staff was very frustrated and they needed to find a solution. Ken said that part of the reason it took so long for the problem to get to him was that she was handled by different trainers and it took a while for a clear pattern to appear.

Of course, the first thing one wonders is “What happened?”  Looking back, Ken’s best guess is that the change in her behavior was the result of many small mistakes that accumulated over time. None of them were significant events, but added up, they undermined her confidence and made her reluctant to participate in training sessions.

Here are some of the contributing factors:

  • She was trained more and more often by young trainers without strong relationships
  • They misread early, mild signs of frustration, and didn’t adjust
  • They used an LRS (least reinforcing scenario) inappropriately, letting it run long enough that it became more of a TO (time out).
  • They felt pressure to get behavior and asked for it again after refusal, instead of asking for another behavior, taking a break, or one of many other options.
  • The problem worsened over time, and she began discriminating against less experienced or unknown trainers.

The Solution

Ken proposed a unique solution.  He felt that Kayavak needed to have a way to say “no.”  He thought she might be feeling as if she didn’t have any choices, and that was why she would just leave.  But, if she had a way to say “no,” and got reinforced for it, perhaps she would choose to remain and would learn to become more engaged in training again.

He suggested that they teach her to touch a target (a buoy tethered by the pool edge).  Touching the target would ALWAYS be a reinforceable behavior, and would be reinforced in the same way as other behaviors.  This was an important point.  They didn’t reinforce targeting with a lesser value reinforcer. They made sure the reinforcement for targeting was equivalent to the reinforcement for other behaviors.

While teaching “no” might seem like a radical idea, Ken mentioned that he had a few reasons to think it might work. One was some training he had seen at another aquarium, where a sea lion was taught to touch a target at the end of his session so the trainer could leave.  The sea lion learned to do this, but also started touching the target at other times, and seemed to be using it to indicate when he wanted to be done.

The challenge was convincing the staff to try it.  They doubted it would work, because why would she do any of the “harder” behaviors if she could just touch the target all the time? This was a good question, and gets to the heart of what some clicker trainers believe, which is that animals like doing behaviors that have been trained with positive reinforcement.  But, if this is really true, then shouldn’t she be just as happy to do a variety of positive reinforcement trained behaviors instead of just repeating the same one over and over?

Since Ken is the boss, he convinced them to give it a try…

The training

  • Place a buoy close to where she is working and teach her to target it.
  • Practice targeting the buoy until it’s a very strong behavior.
  • Start mixing in some easy behaviors, but ask for targeting the buoy in between.  So they might cue behavior 1 -> click -> reinforce -> target buoy -> click -> reinforce -> behavior 1 (or 2) -> click -> reinforce -> target buoy ->…
  • Increase the number of other behaviors and/or difficulty, still mixing in targeting the buoy on a regular basis.
  • Throughout this process, she can touch the buoy as often as she likes. So if the trainer wants to alternate buoy touches with cued behaviors, and Kayavak offers several buoy touches in a row, she still gets clicked for all of them. If she makes an error, doesn’t get clicked and then touches the buoy, she gets clicked and reinforced.  A buoy touch is always a reinforceable option.
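The logic of this protocol is simple enough to sketch in a few lines of code. This is a toy illustration only, not the Shedd Aquarium’s actual setup (the behavior names and the function are invented); it just captures the key rule that the target touch is always a reinforceable option:

```python
# Toy sketch of the "always reinforceable 'no'" session logic described above.
# All names here are invented for illustration.

def is_reinforceable(response, cued_behavior):
    """A response earns a click if it matches the cue OR is a buoy touch."""
    return response == cued_behavior or response == "touch_buoy"

# A short mock session: (what the trainer cued, what the animal did).
session = [
    ("spin", "spin"),           # correct response -> click
    ("spin", "touch_buoy"),     # says "no" -> still clicked
    ("medical", "touch_buoy"),  # opts out of a medical behavior -> still clicked
    ("wave", "swim_away"),      # ignores the cue -> no click
]

for cue, response in session:
    clicked = is_reinforceable(response, cue)
    print(f"cue={cue!r:10} response={response!r:14} clicked={clicked}")
```

The point of the sketch is that the buoy touch never falls through to “no click”: whatever else is cued, there is always at least one response that pays.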

Evolution of a behavior

Ken showed us how the training progressed. He had some video of her at the various stages and some great charts to show how the number of buoy touches changed over time.  I thought this part was really fascinating because it showed how important it was to allow her to find her own way to use “no” and how challenging it was for the trainers to stick with her through the process!

After 3 weeks:

  • She touched it all the time, after every cue.
  • The staff thought her behavior meant that it wasn’t going to work.

At 4 weeks:

  • She started to work well and only chose the buoy under specific conditions: when she was ill, when she was wrong or heard no marker, when asked to do a medical behavior, with a new trainer, or when working with a trainer with whom she didn’t have a good relationship.
  • He had a clip of a training session with a trainer she didn’t like. She would touch the buoy repeatedly, usually doing it before the trainer had time to cue another behavior.
  • I asked Ken if they kept the session length the same, even if all she wanted to do was touch the buoy and he said “yes.”  That was partly because her training session is how she gets fed, but also because touching the buoy repeatedly was not “wrong.” If that’s what she felt like doing that day, that was fine.
  • With a trainer she trusted, she might not touch the buoy if she didn’t want to do a behavior, but would wait for the next cue instead.

At 4 months:

  • There were almost no refusals with experienced staff.
  • She still tests new trainers for a period of time.
  • They did use the buoy during free shaping, but she rarely touched it.  If she did, it was a sign that the trainer was not slicing the behavior finely enough.
  • They can use the buoy to test if she likes a behavior – which one does she do?
  • He had some nice charts showing how her behavior changed (#right answers, buoy touches, refusals).  They showed how she would test new trainers and then over time, the “no” behavior would get offered less and less.

Is this a useful approach?

Since doing this training with Kayavak, Ken has done the same thing with a sea lion and two dogs. They were all cases where the animal had lost confidence in the training.  Having a default behavior that was always reinforceable meant they always had a way to earn reinforcement and it gave them choices.

He did find that, as with Kayavak, once the “no” behavior had been learned, the animals were fairly discriminating in when they used it. They might offer “no” instead of doing a cued behavior if the cued behavior was difficult, uncomfortable, or unknown.  They also might offer it after an error.

Despite his success, he’s not sure you should use it with all animals. Usually if an animal is trained with positive reinforcement, it already has lots of ways to say “no,” so it’s not necessary to teach another one.  It may be more useful to work on your own training or observation skills so you notice the first signs of frustration and can adjust before the animal reaches the point where it needs to say “no.”

There may also be difficulties if you teach it too early because the animal might get “stuck” on that behavior. This point made me think of Jesús and his comments about the danger of having a limited repertoire. Ken thinks it’s better to teach the animal a larger repertoire and then add a “no” behavior if needed, either because the relationship has broken down, or the animal has lost confidence. If you do teach a “no” behavior, it’s important to choose an appropriate one, either one that is useful or is a starting point for other behaviors.

I enjoy Ken’s presentations because he always has the coolest projects and approaches them with a great blend of practical and scientific knowledge.  At some point in his presentation, he mentioned that the “no” project brought together a lot of scientific principles, including matching law, contra-freeloading, Premack, and others.  But he also said that he used what he had learned from observing other trainers, or observing the animals themselves.  I think this project was a great example of how we can give animals more choices as long as we have a well thought out plan and are willing to take the time to see it through.

This is the last of the articles I am planning to write about the ASAT conference.  I have lots of ideas for what to do with what I learned from the conference, and may blog about some of my own training later this summer. In the meantime, I hope something in these articles has caught your attention and inspired you to go out and try something new.   I want to end by thanking all the speakers for their permission to share my notes. I also want to thank all the ORCA students who work hard to plan and run the conference. They are already busy planning the conference for next year and it will take place on March 24-25, 2018 in Irving, Texas.


Notes from the Art and Science of Animal Training Conference (ORCA): Dr. Jesús Rosales-Ruiz on “Conditioned Reinforcers are Worth Maintaining.”


In this short presentation, Jesús Rosales-Ruiz revisited the question:

“Do I have to treat every time I click?”

He said that this question constantly comes up and that different trainers have different answers.

Before I share the details of his presentation, I want to mention that he said he chose to use the words “click” and “treat” because he was trying to avoid using too much scientific jargon.    But, as he pointed out at the end of his talk, it would be more accurate to say “click and reinforce,” and probably even more accurate to say “mark and reinforce.”

Since he used “click and treat,” I’m using the same words in these notes, but you should remember that he is really looking at the larger question of how we use conditioned reinforcers and whether or not they always need to be followed by a primary reinforcer in order to maintain their effectiveness.

Back to the question…

Do you have to treat after every click?

Some say YES:

  • Otherwise the effectiveness of the click may be weakened
  • Bob Bailey says: “NEVER sound the bridging stimulus idly (just to be ‘fiddling’) or teasing…it’s important that the “meaning” of the bridging stimulus is kept unambiguous and clear. It should ALWAYS signify the same event- The primary reinforcer.”  How To Train a Chicken (1997) Marian Breland Bailey, PhD and Robert E Bailey.
  • This view is supported by research that shows that the conditioned reinforcer should be a reliable predictor of the unconditioned reinforcer.

Some say NO:

  • Once a click is charged, you only have to treat occasionally
  • Once a behavior is learned, you only have to treat occasionally
  • Supported by research on extinction (in general, this means that if an animal learns that not every correct answer is reinforced, it will keep offering the correct answer for some period of time, even if there’s no reinforcement)

So maybe there is some research for both.

He said that he started thinking about this question again after reading a blog by Patricia McConnell, who was sharing some thoughts on whether or not to treat after every click. She was wondering why clicker trainers recommend it, but other positive reinforcement trainers do not.

Patricia McConnell wrote:

  • “For many years I have wondered why standard clicker training always follows a click with a treat.”
  • “Karen Pryor strongly advocates for us to reinforce every click (secondary reinforcer) with a treat (primary reinforcer). Ken Ramirez, one of the best animal trainers in the world, in my opinion, always follows a click with a treat.”
  • “But Gadbois went farther, given the link between motivation and anticipation, suggesting that it was important to balance the “seeking” and “liking” systems, with more emphasis on the former than the latter during training. He strongly advocates for not following every click (which creates anticipation) with a treat, far from it, for the reasons described above.”

You can read the blog at: http://www.patriciamcconnell.com/theotherendoftheleash/click-and-always-treat-or-not.

If you have not heard of Simon Gadbois, you can read about him here: https://www.dal.ca/academics/programs/undergraduate/psychology/a_day_in_the_life/professors/simon-gadbois.html.

What happens if you don’t treat after every click?

Jesús was intrigued by Gadbois’s statement that you don’t want, or need, to treat after every click because you want to balance “liking” with “seeking,” and that if you don’t treat after every click, you get more seeking.

One reason for his interest was that he already knew of an experiment that looked at what happens if you don’t follow every click with a treat.  About 10 years ago, one of his students wanted to compare the behavior of a dog trained under a one click = one treat protocol with the behavior of the same dog trained with multiple clicks before the treat.   The two conditions looked like this:

  • one click = one treat:  The trainer clicked and treated as normal after every correct response:  cue -> behavior -> click -> treat -> cue -> behavior ->click -> treat.
  • two clicks = one treat:  The trainer clicked for a correct response, cued another behavior and clicked and treated after the correct response: cue -> behavior -> click -> cue -> behavior -> click -> treat.
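One consequence of the second condition, which becomes important later in the talk, is that the rate of food per behavior is cut in half even though the number of clicks per behavior stays the same. A quick tally makes this concrete (a toy sketch; the event lists simply encode the two sequences above):

```python
# Count behaviors, clicks, and treats in each training sequence.
# The sequences come from the text; the encoding is invented for illustration.

def tally(events):
    return {"behaviors": events.count("behavior"),
            "clicks": events.count("click"),
            "treats": events.count("treat")}

# one click = one treat: cue -> behavior -> click -> treat, repeated
one_to_one = ["cue", "behavior", "click", "treat"] * 4

# two clicks = one treat: cue -> behavior -> click -> cue -> behavior -> click -> treat
two_to_one = ["cue", "behavior", "click", "cue", "behavior", "click", "treat"] * 2

print(tally(one_to_one))  # {'behaviors': 4, 'clicks': 4, 'treats': 4}
print(tally(two_to_one))  # {'behaviors': 4, 'clicks': 4, 'treats': 2}
```

Same number of behaviors and clicks, half the treats, which is exactly the confound the experimenters had to consider later.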

These dogs were tested by asking for previously trained behaviors. Each dog was trained under both conditions so some training sessions were under one click = one treat and some were done under two clicks = one treat.  There were multiple reversals so the dogs went back and forth between the two conditions several times over the course of the experiment.

Under the one click = one treat condition, the dogs continued to perform as they had in training sessions prior to the start of the experiment. Under the two clicks = one treat condition, both dogs showed frustration behaviors and deteriorating performance, and at times would leave the session.

There were many factors that could have contributed to the result: the dogs were originally trained under one click = one treat, the reversals themselves could have caused confusion, and the dogs might have done better if they had been transitioned more gradually.  But, it was pretty clear that omitting the treat did not activate the seeking system; instead, it created frustration. Why?

They considered two possibilities:

  • Perhaps because they were getting less food? Under the one click = one treat condition, each dog got twice as much food reinforcement as it did under the two clicks = one treat condition.
  • Perhaps the properties of the click had changed.  What does the click mean to the dog?

Can we test if it’s about the decrease in food reinforcers?

If you want to test what happens when you click without treating, you have to change the ratio of clicks to treats. You can do that by omitting some treats, or by adding some clicks. But both options are probably not going to be perceived in the same way by the animal.

In the experiment described above, the trainer changed the ratio of clicks to treats by omitting food reinforcers after half the clicks. This is a significant decrease in the number of primary reinforcers that the dog was receiving. Could the results be more about the reduction in food reinforcers, than about whether or not each click was followed by a treat?

One way to test this would be to keep the number of food reinforcers the same, but add another click.  To do this, the trainer taught the dog to do two behaviors for one click.  The dog would touch two objects. When he touched the second object, he would get clicked and treated.

Once this behavior had been learned, the trainer decided to add another click by clicking for the first object, clicking for the second object and then treating. So the pattern would be behavior (touch) -> click -> behavior (touch) -> click -> treat. This works out to clicking after every second behavior, but the trainer got there by adding a click, not by removing a treat.

What she found was that the dog just got confused.  The dog would orient to the trainer on the first click, get no response, go back to the objects and touch again (either one).  Or he might just wait and look at the trainer, or he might leave. The additional click didn’t seem to promote seeking. Instead it interrupted the behavior and created confusion.

Why?  Well, perhaps it has to do with the two functions of conditioned reinforcers. This goes along with the second point above, which is that the difference was due to how the click was being used.

The 2 Functions of Conditioned Reinforcers:

Let’s take a moment and look more closely at conditioned reinforcers.  Conditioned reinforcers are stimuli that become reinforcers through association with other reinforcers.  They usually have no inherent value. Instead, their value comes from being closely associated with another strong reinforcer for a period of time (while it is being “conditioned”), and this association must be maintained through regular pairings in order for the conditioned reinforcer to retain its value.

In training, this is usually done by deliberately pairing the new stimulus with a primary reinforcer.  There are different kinds of conditioned reinforcers and their meaning and value will depend upon how they were conditioned and how they are used.  Marker signals (the click), cues, and keep going signals (KGS) are all examples of conditioned reinforcers.

Regardless of the type, all conditioned reinforcers have two functions. They are:

  • Reinforcing
  • Discriminating (they can function either as cues or event markers, or both)

Conditioned reinforcers are not just used in training and laboratory experiments.  They are everywhere.

Jesús used the example of a sign, which is a conditioned reinforcer for someone driving to a specific destination.  Let’s say you are driving to Boston and you see a sign that says “Boston, 132 miles.” The sign provides reinforcement because it tells you that you are going the right way. It also has a discriminating function because it provides information about what to do next, telling you to stay on this road to get to Boston.

When talking about conditioned reinforcers, it’s easy to focus on only one of these functions.  Is this why there is confusion?  Perhaps the debate over whether or not to treat after every click is because some trainers are focused on the discriminating function of the click and others are focused on the reinforcing function of the click?

What does training look like if the focus is on the discriminating function?

When every click is followed by a treat, the click has a very specific discriminating function: it tells the animal it has met criteria and reinforcement is coming.  The trainer can choose what the animal does upon hearing the click (stop, go to a food station, orient to the trainer). But, regardless of which response she chooses, the click functions to cue another behavior, which is the start of the reinforcement process.

A lot of one click = one treat trainers emphasize the importance of the click as a communication tool.  There are two aspects to this. One is that it marks the behavior they want to reinforce and the other is that it tells the animal to end the behavior and get reinforcement. If the click is always followed by a treat, the meaning of the click remains clear and it provides clear and consistent information to the animal.

You can think of the click -> treat as part of a behavior chain, where the click has both a reinforcing function, from the association (click = treat), and also an operant function (click = do this).  Clicker trainers who promote the one click = one treat protocol still recognize that the click itself has value as a reinforcer, but they choose to focus on the click as an event marker and as a cue, more than as a reinforcer.

What does training look like if the focus is on the reinforcing function?

A lot of trainers who treat intermittently (not after every click) emphasize that the click is a reinforcer in itself, so it’s not necessary to also provide a treat after every click. They are looking at the reinforcing function of a conditioned reinforcer and would argue that the whole point of having a conditioned reinforcer is so that you don’t have to follow it with another reinforcer every time.

They are still using the discriminating function of the click because it can be used to mark behavior.  But, the click does not become an accurate predictor of the start of the reinforcement phase, so it is not going to have the same cue function as it does under the one click = one treat condition.

Jesús did mention that if the click is not a reliable cue for the start of the reinforcement process, then the animal will look for a more reliable way to tell when it will be reinforced. In most cases, the animal finds a new “cue” that tells it when to expect reinforcement and the click functions as a Keep Going Signal. If the animal can’t find a reliable cue for the start of reinforcement, or if it’s not clear when the conditioned reinforcer will be followed by reinforcement and when it won’t, then the animal will get frustrated.

Back to the Literature…

With this information in mind, what can we learn by going back and looking at the research on conditioned reinforcers?  Well, it turns out that the literature is incomplete for several different reasons:

  • It doesn’t look at the cue function of the conditioned reinforcer.
  • Animals in the lab are often restrained or constrained (limited in their options) so the cue function of the conditioned reinforcer may be more difficult to observe.
  • It doesn’t take into account that the most consistent predictor of food is the sound of the food magazine as it delivers the reinforcement.   Even when testing other conditioned reinforcers, the sound of the food magazine is what predicts the delivery of the food reinforcement, and it’s on a one “sound” = one “treat” schedule.
  • To test a conditioned reinforcer that was sometimes followed by food and sometimes not, you would have to use two feeders, one with food and one without, and even then you would have to worry about vibrations. Most labs are not set up with two feeders, so this work has not really been done.

He also mentioned that a lot of what we know about conditioned reinforcers in the lab is from research where the conditioned reinforcer was used as a Keep Going Signal (KGS), and not as a marker or terminal bridge.

I asked Jesús if he had an example of an experiment using a conditioned reinforcer as a KGS and he sent me an article about a study that looked at the effect of conditioned reinforcers on button pushing in a chimpanzee.

The chimpanzee could work under two different conditions. In one condition, he had to push the button 4,000 times (yikes!), and after the 4,000th push, a light over the hopper would flash and his food reinforcement would be delivered. In the other condition, he also had to push the button 4,000 times, but a light would flash over the hopper after every 400 pushes, and then again at the end when the food was delivered after the 4,000th push.

The chimpanzee was tested under both conditions for 31 days, and the results showed that he worked faster, and with fewer pauses, in the condition where he was reinforced by the flashing light every 400 pushes.

Once the chimpanzee had been tested under both conditions for 31 days, they started the second part of the experiment.  In this part, the chimpanzee could choose the condition (by pressing another button) and he usually chose the one where the light flashed after every 400 pushes.

So, having a Keep Going Signal improved the speed at which the chimpanzee completed the 4,000 pushes, and it was also the condition he preferred.  This suggests that Keep Going Signals can be useful and an animal may prefer to get some kind of feedback.
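The two schedules can be sketched as follows. The numbers come from the talk (4,000 presses, a light every 400); the code itself is purely illustrative:

```python
# Sketch of the two reinforcement schedules described above.
# Function name and encoding are invented for illustration.

def feedback_for_press(n, kgs_interval=None):
    """Return the stimuli delivered after press number n (1-based)."""
    events = []
    if kgs_interval and n % kgs_interval == 0 and n < 4000:
        events.append("light")   # keep-going signal
    if n == 4000:
        events.append("light")   # terminal light...
        events.append("food")    # ...paired with food delivery
    return events

# Condition A: light (with food) only after the 4,000th press.
lights_a = sum("light" in feedback_for_press(n) for n in range(1, 4001))
# Condition B: light every 400 presses, food still only at 4,000.
lights_b = sum("light" in feedback_for_press(n, kgs_interval=400)
               for n in range(1, 4001))
print(lights_a, lights_b)  # 1 10
```

Laid out this way, it’s easy to see that the only thing distinguishing the two conditions is the nine extra light flashes: the food, and the light-plus-food pairing at the end, are identical in both.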

In this experiment, the conditioned reinforcer they were testing (the flashing light) was functioning as a KGS and the sound of the food magazine was what told the chimpanzee that he had met criteria.  So, this is an interesting experiment about conditioned reinforcers as Keep Going Signals, but it also shows the difficulty of separating out the conditioned reinforcer from the stimulus that predicts food delivery.

An example of training a KGS with a dog

Jesús talked a little bit more about Keep Going Signals, using an example from one of his own students. She wanted to teach her dog a new conditioned reinforcer that she could use as a KGS. She started by teaching the dog to touch an object for a click and treat. Once the dog had learned the behavior, she said “bien” (her new KGS) instead of clicking, and waited for the dog to touch the object again. If the dog repeated the touch, then she would click and treat.

She was able to use the KGS to ask the dog to continue touching an object, and I think she tested it on other objects. You do have to train a KGS with multiple behaviors for it to become a true KGS, as opposed to a cue for a specific behavior. I don’t know if she tested it with other behaviors, but that would be the next step. I’m also not sure if they compared the dog’s performance with and without the KGS, to see whether adding a KGS increased the dog’s seeking behavior, as Gadbois had suggested it would.


The difficulty with the question “Do I have to treat after every click?” is that the answer depends upon how you are using the click and whether or not it cues the animal to “end the behavior” and expect reinforcement. Conditioned reinforcers have two functions. They function as reinforcers and as discriminators, and you need to consider these functions when choosing how to use the click.

If you are using the click as a Keep Going Signal, the animal learns to continue after the click and the click does not interrupt  the behavior.  This means you can click multiple times before delivering the terminal reinforcer. However, it’s likely that you will end up having a different cue that tells the animal when it has completed the behavior and can expect reinforcement. If you don’t, the animal may become confused about what it should do when it hears the click.

If you are using the click to indicate when the behavior is complete, the animal learns that the click is a cue to start the reinforcement process.  You can teach the animal a specific response to the click so that the animal knows what to do to get his reinforcement. If the click is being used in this way, then it will interrupt the behavior and you will want to wait until the behavior is complete before clicking.

We call both these types, the click as a KGS and the click as an “end of behavior” cue, conditioned reinforcers, but they are not the same thing. There are many kinds of conditioned reinforcers, and when you are not specific, it’s easy to think you are talking about the same kind, but you are not.  So both “camps” may be right, but for the wrong reasons.

Jesús finished by saying we need to study this more carefully in the laboratory and also in real-life training situations. One point he made was that an animal that initially learned one click = one treat could probably be re-trained to understand the click as a KGS, if the transition was done more slowly than it was for the dogs in his student’s experiment, but he still thinks it would change the meaning of the click from an “end of behavior” cue to a “keep going signal.”

I thought this was a very interesting talk, partly because it shows how important it is to clearly decide how you are going to use conditioned reinforcers and to make sure that you teach your animal what it means. I don’t think it was intended to be the final word on a complicated subject, but the presentation certainly made me more aware of the importance of thinking about the many functions of conditioned reinforcers and how I am using them.

But… I’m not sure it left us with an answer to the question of what happens when the same conditioned reinforcer is used both as a KGS and as an “end of behavior” cue, which is how many trainers describe their practice of clicking multiple times before delivering the terminal reinforcer. That combination still needs to be researched.

A few personal thoughts

This presentation was informative, and made me feel more confident about the system I use, but it also left me with some unanswered questions.

I have always followed a click with a treat. It is how I originally learned to clicker train and it has worked well for me. If I want to use a different reinforcer, I have a different marker. If I want to provide information or reinforcement to the horse without interrupting the behavior, I have several other conditioned reinforcers I can use.

It’s never made sense to me to have the same conditioned reinforcer sometimes be a cue to “end the behavior” and sometimes be a cue to “keep going.” I question whether that’s even possible, unless the animal learns that it has different meanings under different conditions, and that seems a bit awkward. It just seems simpler to have clearly defined conditioned reinforcers and use them in a consistent manner.

I was intrigued by the research into Keep Going Signals. I do use Keep Going Signals and have found them to be useful. But I have also found that I have to pay attention to maintaining them in such a way that they retain their value (through pairing with other reinforcers), but don’t become reliable predictors of reinforcement and morph into “end of behavior” cues. I’d love to see more research on how to effectively maintain Keep Going Signals, as well as some research on how effective they are at marking behavior.

Notes from the Art and Science of Animal Training Conference (ORCA): Dr. Paul Andronis – Adjunctive Behavior – So What Else Is Going On?


Dr. Paul Andronis is a professor at Northern Michigan University where he is an expert in the experimental and applied analysis of behavior.  In his presentation, he shared some information on “adjunctive” behavior both from an academic and a practical viewpoint. He discussed several varieties of adjunctive behavior, how it differs from other types of behavior, and the necessary conditions under which it occurs. He also talked about how it should be classified and why the concept of adjunctive behavior is relevant for animal trainers.

The history of adjunctive behavior

Adjunctive behavior was first identified in the laboratory as “schedule-induced” behavior (Falk 1961). It was observed in experiments where an animal was subjected to the same schedule or limited contingency over and over.

In this scenario, you would expect to see some regularity of behavior and that the animal would become very efficient at doing what was necessary to earn reinforcement. But, in some cases, what they saw was that the animals were doing a lot of extra, unnecessary behavior. The types of behaviors they observed, and how often these “extra” behaviors occurred, varied depending upon many factors, but they were often tied to the reinforcement schedule or to something about the environment, usually the experimental set-up.

Even though adjunctive behavior was not described until 1961, it’s likely that it had been present in other experiments and was ignored or not recognized as being of interest. Or perhaps it had occurred infrequently because the experimental set-up for many experiments had included some level of constraint or restraint that limited the animal’s behavior. Pavlov’s dogs were physically restrained and Skinner’s rats were placed in operant conditioning chambers that offered few options for alternate behaviors. In any case, it was not until 1961 that adjunctive behavior started to receive attention.

Dr. Andronis did mention that it was around this time that Keller and Marian Breland published their article “The Misbehavior of Organisms.” The article described how some behaviors were difficult to train in certain animals if the behavior the trainers wanted conflicted with a strong natural behavior. For example, the Brelands wanted to train a raccoon to put money in a slot, but the raccoon wanted to wash the objects it was given, or would go through washing motions with them. Even though washing was not part of the behavior the Brelands wanted, or were intending to reinforce, they found it was very difficult, if not impossible, to eliminate. The fact that they could not control this “misbehavior” through a reinforcement contingency was used by some scientists to try to discredit operant work.

As part of the  discussion about how experimental set-up can influence results, Dr. Andronis talked about some of his own work training key pecking with pigeons. He said that, in one of his experiments, they had to use some older keys that were stiffer than the usual keys.  They found that the pigeons pecking the stiff keys produced a different pattern of behavior than the pigeons pecking lighter keys.  I think he was comparing results from two different experiments here, not saying they deliberately compared two types of keys.

When the pigeons were put on a fixed-interval schedule, they saw a nice scalloping pattern (pauses followed by surges of pecking) with the birds pecking the stiff keys. But the birds pecking the lighter keys pecked at a more steady, consistent rate. The reason he shared this example was that the birds pecking the stiffer keys had time to do adjunctive behaviors because there were gaps in the pecking behavior. The birds with the lighter keys were busy pecking and less likely to do other behaviors. So, something as simple as key stiffness could make a difference in whether adjunctive behavior was likely to occur.

Adjunctive behavior in the literature

The history of adjunctive behavior can be difficult to trace because it goes by many different names in the literature, and while these names may refer to the same phenomenon, they may also be used for similar but distinct phenomena.

Some of the most common names in the literature are:

  • Schedule-induced behavior
  • Collateral responses
  • Adjunctive behavior
  • Ancillary behavior
  • Interim activities (or behavior)
  • Behavioral side-effects
  • Psychogenic behavior

Some of these terms are used more when referring to behaviors that are repeated because they end up being part of the behavior cycle (ABC cycle), even though they are not part of the criteria for reinforcement.  Other terms are more likely to be used to refer to behaviors that the animal does between opportunities for reinforcement, but that seem to be indicative of stress or frustration, such as might happen when the animal is placed on a lean reinforcement schedule.

For example, the term “collateral response” is usually used to describe behavior that is repeated because it gets reinforced along with the behavior the trainer wants the animal to do. This might happen if the rat presses a lever, gets food, and goes through a whole cycle of behavior before it is reinforced for lever pressing again. From the animal’s perspective, it had to do the entire cycle to get reinforcement, because all the behavior in the cycle was reinforced along with the target behavior. The animal doesn’t know that it didn’t actually have to do it; it was just filling time.

On the other hand, behavioral side-effects or psychogenic behavior are terms that are more likely to be used to describe behaviors that occur at a higher frequency when an animal is stressed. These behaviors are usually not done intentionally by the animal. Examples of these types of behavior are excessive urination and defecation.

I want to mention that professional animal trainers are unlikely to use the terms I listed above, which are used more in academic circles, but they certainly do recognize that adjunctive behaviors exist. They just have their own names for them and may also have slightly different definitions. Some common names for “extra” behavior that occurs when trying to train something else are displacement behaviors, stress or frustration behaviors, or superstitious behaviors.

Okay, so with all this background information…

How do we define adjunctive behaviors?

Critical attributes of the concept.

An adjunctive behavior is a behavior that:

  1. Reliably accompanies another operant behavior targeted by experimenter-programmed contingencies;
  2. Is not explicitly required for meeting the requirements of those (E-programmed) contingencies;
  3. Is not reinforced (directly or adventitiously) by those contingencies maintaining the operant behavior they accompany; and
  4. Is occurring at rates considered excessive under the particular procedural arrangements.

Various types of adjunctive behaviors have been observed in laboratory settings. Here’s a list of some of them:

  • Polydipsia (excessive drinking)
  • Air-licking
  • Wheel running – common, but varies among species
  • Aggression – also fairly common
  • Bolt-pecking
  • Pica
  • Locomotor activity
  • Paw grooming
  • Defecation
  • Wing-flapping
  • “displacement preening”
  • Escape from schedule-requirements
  • Cigarette smoking (I hope this is among human subjects…)
  • Alcohol consumption (most animals avoid alcohol, but will drink it under certain conditions)
  • Chronic hypertension
  • Nitrogen “drinking”
  • Pellet “pouching”
  • Self-injection with nicotine

When and where does adjunctive behavior happen?

Most adjunctive behavior has been observed in the laboratory (is this suspicious?) and it has been studied in rats, pigeons, mice, hamsters, gerbils, humans, and rhesus macaques.

Dr. Andronis pointed out that adjunctive behavior is more likely to occur when animals are put on certain types of reinforcement schedules. This is where the name “schedule-induced behavior” comes from.  It is more likely to occur on schedules that have longer intervals between reinforcement, or when reinforcement is not delivered as expected.  It also tends to occur at predictable places in the schedule.

The type of schedule-induced behavior can tell you something about the underlying state of the animal.  An animal that is showing aggressive behavior is in a different state than one that thinks it has to do some extra body movement to earn reinforcement.

But, Dr. Andronis did point out, later in the presentation, that some schedule-induced behavior is simply behavior that has been reinforced as part of the reinforcement contingency.  So, even within the category of schedule-induced behavior, you can have adjunctive behavior that is operant, and is being done deliberately, or respondent, and is more a reflection of how the animal is feeling.

I think his point was that the type of behavior can be useful information, or not. You may have to collect additional information before you can interpret it. And of course, a behavior might initially occur because the animal is frustrated and then be maintained or increase because it is being reinforced.  One of the challenges of schedule-induced behavior is that it is probably the result of several variables, not just the schedule, and it’s not always clear how all the variables interact.

I think it’s interesting that most of the behaviors on the list above are ones you would not want to have happen during training, and some are ones you wouldn’t want to have happen at all, because they indicate that the animal is stressed.

Some examples of adjunctive behavior:

Polydipsia (excessive drinking):

    • Polydipsia was the schedule-induced behavior identified by J. Falk in 1961. He was studying rats and wanted to measure water intake under natural conditions. When he changed his reinforcement schedule to an FI 1-minute schedule (one minute between deliveries of reinforcement), he saw more drinking. The rats would drink half their body weight in under one hour.


  • Dr. Andronis had an example of this with people living in a hospital. The residents would walk the halls because they were bored. At the end of each hall there was a water fountain, and they would stop to get a drink, probably just for something to do. This led to problems because the excessive water intake and urination affected their medications, and the hospital had to provide more activities to limit the hall walking and the polydipsia.


Aggression:

    • In experiments where an animal becomes frustrated due to lack of expected reinforcement, there may be increased aggression toward other animals.


  • You may also see aggression due to resource guarding in experiments where food is delivered for lever pressing.  One animal may choose to guard the lever so another animal cannot have access to it.

Increase in Social Behaviors:

  • You can see schedule-induced changes in social behavior if you have multiple animals and multiple levers.  The animals may switch places to “help” another animal earn reinforcement.


Schedule-induced defecation:

  • Schedule-induced defecation has been observed in rats. In one experiment, the researcher noticed that the rats were producing more feces than expected, and also that they were defecating at unusual times. He discovered that the amount of defecation could be manipulated by changing reinforcement schedules and was also affected by whether or not a water bottle was present. (Rayfield, 1982)

An experiment that showed how to generate different kinds of adjunctive behavior

Along with the examples of types of adjunctive behavior, Dr. Andronis also described an experiment he did with pigeons.  In the experiment, he had pigeons pecking keys under three different conditions. They were:

  • Hard: the bird had to peck a lot to earn reinforcement
  • In-between: moderate amount of pecking
  • Easy: few pecks required to earn reinforcement

Interestingly, the “in-between” schedule was the one that produced either aggression or social behavior. On the easy schedule, the birds were not stressed. On the hard schedule, the birds were too busy to do anything else. But the in-between one created some frustration and also provided opportunities for other behaviors.

The experiment had several steps:

    1. He taught the birds the schedules. Each bird was in a cage with a key.

    2. He added two new keys (in another location in the same cage) that allowed the bird to change the schedule. One key made the schedule harder, the other made it easier. The pigeons all learned to set their own requirements to “easy.”

    3. He switched the function of the keys so the key that used to make the schedule easier now made it harder. This led to some frustration because the bird would choose the key for “easy” and find that it no longer worked as it had before. But the pigeons did learn that when that happened, they should choose the other key. They also learned that the functions of the keys would change, and would anticipate the change and start to peck the other key even before he switched it.

    4. He set up a social experiment by placing two birds in side-by-side cages. I can’t remember the exact arrangement, but it was designed so that one bird could control the “difficulty” level for the other bird, meaning it got to decide whether the other bird’s schedule was easy, in-between, or hard.

    5. He observed the behavior of the two birds. He found that the bird who could control the other bird’s level of difficulty would consistently choose to peck the key that made it harder for the other bird to get food.


Main laboratory findings

He provided a brief summary of what they have learned about adjunctive behavior from work in the laboratory.

    • It’s often a “post-reinforcement phenomenon” (it happens right after reinforcement when there is a delay before the next reinforcement is available)


    • Rates of occurrence vary as an “inverted-U” function of the inducing-schedule parameters: rates rise over some range of schedule requirements and then drop off, similar to a dose-response curve.


    • They tried to show that it was not reinforced by the inducing contingency. Scientists would insert a COD (changeover delay) to try to separate the adjunctive behavior from the reinforcement, so it was not accidentally reinforced. But you don’t know how the animal experiences the COD, and some scientists argue that a temporal separation doesn’t mean there’s no effect.


    • Probably related to the potentiating effect of the inducing schedule on SDs (discriminative stimuli) or reinforcers specific to the induced behaviors.


    • Idiosyncratic patterns of substitutability among induced behaviors.


Where it fits, in theory

Scientists like to know why behavior occurs, so once they had identified and started studying adjunctive behaviors, they came up with some theories about them.

Are they? 

    • Adventitiously-reinforced, superstitious behaviors? (Skinner). This is a reference to Skinner’s experiment on the development of superstitious behavior (1948).


    • A third class of behavior, alongside respondent and operant, i.e., contingency-induced? (Wetherington, 1982). She concluded that they were not a third class of behavior and that trying to put behaviors into respondent/operant categories was problematic.


    • “Induced states” (Staddon) – purely a function of reinforcement schedules that induce certain behaviors, particularly drinking.


    • “Just plain operant behavior,” related to joint-environmental effects of programmed contingencies


Of these, Dr. Andronis favors the operant contingency relation. If you consider that animals are balancing costs and benefits all the time, it seems possible that there is an operant aspect to adjunctive behavior.

This doesn’t really tell us where adjunctive behaviors come from, but the theory that adjunctive behavior is operant fits in with some other things we know about behavior. It appears consistent with the innate response hierarchies posited by Lorenz and other ethologists, and with the probabilistic model of Epstein’s generativity theory. According to Epstein’s theory, creativity, or the generation of “novel behavior,” is predictable, and you can calculate probabilities for the different options.

Though the languages differ, both acknowledge:

    • Prior histories of occurrence for the behaviors involved;


    • Differential probabilities of the response classes in repertoire depend upon the presence of specific antecedent stimuli (“innate releasers” or SDs) and/or other potentiating variables (“motivational” variables);


    • When the currently highest-probability behavior in the repertoire is momentarily interrupted (e.g., suppressed by punishment, rendered less likely by breaking the contingency, disrupted by changes in background stimuli, physically restrained, or otherwise prevented from occurring), the next most probable behavior in the repertoire occurs, made more probable particularly by its specific antecedent stimuli being present.


Ok, there’s a lot of jargon in there. As best I understood, what he was saying was that adjunctive behaviors are behaviors that already exist in the animal’s repertoire and they are “released” by stimuli in the environment under certain conditions.  One of the conditions under which this happens is if the animal’s ability to earn reinforcement is interrupted.

An example of an experiment that supports this theory was described by Dr. Joe Layng in the question and answer session after the presentation. In the first part of the experiment, he and Dr. Andronis taught pigeons a new behavior, one that was not natural for pigeons. The behavior they chose was head banging. Yes, the pigeons would bang their heads against the wall for reinforcement. But don’t worry, they made them wear helmets. (Yes, this is true – he showed us a picture.)

After they trained this behavior, they put the pigeons on a reinforcement schedule for pecking that was likely to produce schedule-induced (adjunctive) behavior. And what did the pigeons do for the adjunctive behavior? Head banging. Head banging was a behavior with a previous history of high reinforcement, so when the pecking was interrupted, it was the behavior that re-appeared.

Increasing the likelihood of contingency adduction

In the experiment described above, Dr. Layng and Dr. Andronis intentionally introduced an undesirable behavior, but the same process can be used to produce new, desirable behaviors through “contingency adduction.” This goes back to the idea that adjunctive behaviors are operant and can become part of the trained behavior.

It is related to Dr. Epstein’s work on creativity in which he showed that what we think of as being creative is often just a new combination of some known behaviors.   What you want to do is teach several behaviors separately and then set up conditions where the learner combines them.

He stated that:

    • Evocative environmental arrangements can introduce controlled variability into behavioral stream (canalization).


  • They can make possible (or highly likely) some novel combinations of existing and emerging repertoires that in turn meet new and more complex contingency requirements posed by the trainer.

He showed a picture of Freud’s office, which was filled with lots of weird stuff. This makes it easy for the “patient” to break the ice, and it is a way to evoke behaviors without doing a lot of prompting. It’s also useful because history affects how a stimulus is perceived, so an individual’s responses to different stimuli can tell you a lot about what has happened to them in the past.

For animal trainers, the equivalent of this would be setting up the environment to get some natural behaviors going. Behavior is never isolated. You want to be alert to instances when behavior happens systematically and take advantage of it. Depending upon the type of behavior you want, you may want more controlled variables (for motor behavior), or you can leave things more open-ended if you are looking for creativity or teaching cognitive tasks.

Here are a few points from the question and answer session:

If you sometimes reinforce the adjunctive behavior, you will get it at even higher rates than would otherwise be expected. (This is why a mis-timed click that lands on an adjunctive behavior can be such a set-back.) Once you have an adjunctive behavior going, it doesn’t take much reinforcement to maintain it.

Dr. Killeen said they have done work showing that a delay is not a guarantee of separation. You can train a rat to press a lever with a 30-second delay before reinforcement, and it will still learn to press the lever.

Notes from the Art and Science of Animal Training Conference (ORCA): Barbara Heidenreich on “Maintaining Behavior the Natural Way.”

Barbara Heidenreich is a professional animal trainer who does extensive consulting with zoos and also works with individuals training many different species. In her work as a consultant, she often finds herself in situations where she cannot rely solely on food for reinforcement, so she has learned to identify and use non-food reinforcers of many different kinds.

In her presentation, she shared some tips on finding and using non-food reinforcers.  This was a great follow-up to the discussions on using The Premack Principle because she had a lot of good examples of behaviors that could be used for reinforcement.  The examples included a lot of videos, which I cannot include here, but if you go to her website (www.goodbirdinc.com), or look her up on YouTube, she has lots of videos available.

Why food might not be the best option

She started with a discussion about why food might not be the reinforcer she chooses to use.  In some cases, it has more to do with the rules of the facility at which she is consulting, but there are lots of other reasons why food might not be an option, or might not be the best option.

Here’s her list:

  • The animal has no motivation for food
  • The diet is limited for health reasons
  • The diet was fed out already or needed for other purposes
  • She has no authorization to use food
  • The animal is fasting (snakes, alligators, etc.)
  • The same reinforcer can become predictable and less effective (she didn’t expand much on this but some trainers do believe that reinforcement variety is important)

While some of these limitations are less common with horses, I have found myself in situations where I could not use food for dietary reasons, because it was not allowed at the facility, or because the horse would not be able to eat during the training (this is usually only the case with dental or some medical procedures). I’ve also found that, in some situations, food may not be the best reinforcer.  So, having other options is always a good idea.

What do you do if you can’t use food?

If you feel it’s very important to be able to use food, you may need to address the issue directly by getting permission, coming back at a better time, or locating an appropriate food. But, in many cases, the better option is to try using non-food reinforcers. These can be just as effective as food reinforcers, or even more effective, when used correctly.

She had some examples of natural behaviors that could be used as reinforcers.  One example was a beaver who could be reinforced with permission to take browse back to her cage.  Another example was a bird that could be reinforced with a colored object that he could take back to his nest.

Identifying possible non-food reinforcers

The challenge for most people is that if they are not used to thinking about non-food reinforcers, then it’s hard to know where to start.  But it all starts by observing your animal.  What does it engage in? What does it seek to acquire? What are species specific trends? Is there social behavior that is reinforcing?

She stated that “ANYTHING an animal seeks to acquire/engage in/have access to/do and can be delivered contingently has the potential to reinforce behavior.”

Here are some types of non-food reinforcers that she has used:

  • Scent stimulation (smelling bedding from another animal can function as a reinforcer)
  • Tactile stimulation – decide where, when, how  (pigs love to be scratched)
  • Visual stimulation (penguins can be reinforced by the chance to chase lasers; she worked with a gorilla that was reinforced by the chance to look at the ultrasound screen)
  • Auditory stimulation (people do this inadvertently all the time by responding to behavior with laughter or by verbal responses, also mating calls, conspecifics)
  • Social interactions (in parrots, reinforcers can be facial expressions, mimicry, head bobbing, chuffing, etc.)
  • Enrichment items (toys, boxes, blankets, etc.)
  • Mental stimulation (a chance to solve a problem can be reinforcing)
  • Physical activities (running, flying, destroying, dust bathing, etc.)
  • Access to preferred people, preferred animals, preferred locations (darkness, sunlight, water for bathing or washing food, etc.)

There are many subsets within each category and some reinforcers will fall into multiple categories.

Evaluating reinforcers (food and non-food)

Even though many behaviors can be reinforcing, you can’t assume that all of them can be used as reinforcement for other behaviors. Therefore, it’s a good idea to spend some time observing the animal so you become familiar with his body language and responses to different kinds of stimuli. An easy way to start this is by looking more closely at how he responds to food reinforcers.  Learning how he responds to food reinforcers will help you learn to read his body language when you offer opportunities to engage in behaviors as reinforcement.

You can use some of this information to help evaluate non-food reinforcers once you know how the animal responds to food it likes. It can help to compare a few different types of reinforcers to get a larger picture of the possible ways an animal can respond.

Evaluating food reinforcers:

When the animal is relaxed and comfortable and asked to do nothing:

  • Show the animal a food item and observe body language, does the animal look at, lean toward, orient toward the food? Anticipatory behaviors?
  • Offer a small item: how does the animal take it?
  • Observe how it eats the food: speed, using species specific behaviors?
  • What does it do when it’s done? Does it look for another piece? Engage in other activities?
  • Throughout: look for signs (species specific) that indicate the level of satiation

Evaluating non-food reinforcers:

The animal needs to be calm, relaxed, and in a distraction free environment before being offered the opportunity to engage in the activity.

  • Identify criteria for measuring motivation – are there behavioral signs that show the animal finds the activity reinforcing?
  • Are there ways to measure responses to the criteria?  How do you know when the animal wants more or is approaching satiation?

I think it’s a great idea to evaluate reinforcers before starting to use them. It’s so easy to jump into training with a reinforcer that you assume the animal will like, and then find that things don’t quite work out as planned.

The challenges of using non-food reinforcers

There are some differences between using food and non-food reinforcers, so if you are going to use a non-food reinforcer, you will want to think carefully about when and how it will be the most effective.

You also should consider if you need to spend time teaching the animal how it gets access to the reinforcer, how long it should expect to have the reinforcer, and how the reinforcement ends. Just as we teach our animals what to expect with food reinforcers, we need to teach them how we are going to use non-food reinforcers so they have clear expectations and we can use them effectively.

While the examples below touch on the need to have a clear understanding (on both sides) of how and when you are going to use the reinforcer, I wanted to emphasize it because I think it’s important to realize that you can’t just start using non-food reinforcers in place of food reinforcers without some preparation. This is especially true if your animal is used to food reinforcers and has some expectation about how behaviors will be reinforced.

Here are some of the challenges with non-food reinforcers:

  • You might have to wait longer between repetitions (doing a behavior can take more time than eating).
  • The animal might need to “give up” an item, so you need to plan for that (do you teach it to release it or set up the environment so the behavior naturally ends?). You might need to teach a drop, trade or retrieve behavior so the animal gives up what it has in order to do another repetition of the behavior.
  • If novelty is part of the reinforcement value, then you have to keep coming up with novel stimuli, which can be difficult.
  • Some non-food reinforcers are best used either before or after food reinforcers, and not mixed in with them. For example: if you try to mix food and non-food reinforcers with pigs, it won’t work because food is their #1 choice.  So use food before or in separate sessions. With other animals, it might be better to use food reinforcers after non-food reinforcers.
  • Some non-food reinforcers can build high levels of arousal which can flip to aggressive behavior.
  • Some non-food reinforcers can trigger unwanted behaviors (sexual).
  • Some non-food reinforcers require a lot of trainer participation.
  • If using a tactile reinforcer, you may need to train a signal that tells the animal you are going to touch. This prepares the animal and also makes it easier to tell whether the animal wants it. For example: wiggle fingers -> head scratch, repeat a few times, then delay and see if the animal will come forward to get it. If desired, you can add distance so the animal has to move toward the handler to get the reinforcer.
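
As a side note, the consent-test logic in that last bullet is simple enough to sketch out. This is just my own toy model (the function and its inputs are hypothetical, not from the presentation), but it captures the idea: pair the signal with the touch a few times, then delay the touch and only deliver it if the animal opts in.

```python
# Toy model of the tactile-reinforcer consent test described above.
# All names are hypothetical; the logic is: pair signal -> touch first,
# then delay the touch and only deliver it if the animal approaches.

def consent_check(pairings_done: int, approaches_after_signal: bool) -> str:
    """Return the trainer's next move after giving the signal
    (e.g. wiggled fingers) for a head scratch."""
    if pairings_done < 3:
        return "pair"     # still teaching signal -> head scratch
    if approaches_after_signal:
        return "scratch"  # animal opted in: deliver the reinforcer
    return "skip"         # no approach, no consent: don't touch
```

Adding distance, as mentioned above, just raises the bar for what counts as an approach.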

Video examples:

  • A bird that put a ring in a cup was reinforced with the opportunity to shred paper.
  • A tiger was shown shifting to a new cage for an enrichment item. The item was placed in the new location ahead of time (pre-baiting).
  • An orangutan was reinforced by the opportunity to play with a box if she sat in a particular spot on a log.
  • Other apes were reinforced with a chance to look in a mirror and to play with a blanket.
  • There were more, but I’ve forgotten the rest…

Benefits of Thinking Outside of the Treat Pouch 

  • Great when there is no motivation for food
  • Some reinforcers facilitate calm body language (tactile in pigs and tapirs) which can help with cooperation for medical care
  • Animals may not satiate on some types of reinforcers very quickly
  • Some may be even higher value than food under the right conditions
  • More opportunities for animals to communicate what they want
  • Adding variety of reinforcers can facilitate keeping motivation strong when in maintenance mode for a behavior or a repertoire of behaviors (you can extend your training session because you have more reinforcer options)
  • Creates an attentive, insightful, well-rounded trainer

She ended with a few other video clips.  One was of her rabbit doing an agility course. At the end, the rabbit returned to her crate and could choose her reinforcer. If she faced forward, she got a head scratch.  If she faced to the back, she was reinforced with food.

She also had a clip of a program she is developing where you can go for a walk with a hornbill. The bird accompanies the hikers, and this behavior is reinforced in a variety of ways, both by the hikers and with natural reinforcers it finds along the way.

Final thoughts

All the information Barbara presented clearly showed the value of using non-food reinforcers. Since I work mostly with horses, I find it’s easy to end up using food for most reinforcement, but there have been times when non-food reinforcers were good options.  Still, I suspect I haven’t used them as much as I could.

I sometimes think it must be easier to see how to use them when you work with a variety of species and are exposed to lots of different types of behavior.  The differences help to broaden your thinking about how animals engage with their environment and what might be reinforcing. At the same time, when you see similar behaviors functioning as reinforcers for several species, it might make you wonder whether those behaviors would be reinforcing for even more species.

I found Barbara’s presentation had a lot of great ideas for evaluating both food and non-food reinforcers and it has prompted me to look more carefully at how my horses respond to reinforcement.  Do I see signs of enjoyment?  Do I recognize when the horse wants more? When he has had enough?  When he would prefer a different reinforcer or to end the session? Her presentation was a good reminder that good trainers are constantly monitoring and adjusting reinforcers and that both food and non-food reinforcers can be used effectively.

Notes from the Art and Science of Animal Training Conference (ORCA): More on The Premack Principle

In the last article, I wrote about the Premack Principle which states that:

  • Behaviors are reinforcers, not stimuli
  • More probable behaviors reinforce less probable behaviors.
  • Less probable behaviors punish more probable behaviors.

What can I do with this information? First, the idea of behaviors as reinforcers opens up many new possibilities for ways to reinforce behavior. And second, it means I need to pay more attention to the effect that each behavior is having on other behaviors that occur either before or after it.

In many cases, especially if I train with food, this information may not change much about what I do. But there are some types of training where knowing about the Premack Principle can make it easier to come up with a training plan that will be successful and will help you make decisions along the way. At the Art and Science of Animal Training Conference, there were two presentations that explained how to deliberately use the Premack Principle in your training.

They were:

  • Emily Larlham – Reversibility of Reward: Harnessing the Power of your Animal’s worst distractions
  • Alexandra Kurland – Do it again: The use of patterns in training

I am going to share some notes from each one separately and then end with a summary about what we have learned at the conference about the Premack Principle.

Emily Larlham – Reversibility of Reward: Harnessing the Power of your Animal’s worst distractions

Emily started us off with a video showing one of her dogs who loves to chase frogs. For this dog, frog chasing is clearly a behavior that has a high probability of occurring in certain environments. Then she shared a video of her dog walking calmly among some chickens, showing that in that scenario, chasing was actually a lower probability behavior.

Her question was “Would you like to get from this to this?”

Of course, most of us would say “yes”, but it can sometimes seem impossible to imagine how we can replace one behavior with another, especially if the dog has a previous history of doing the less desirable behavior, or if it’s hard to control the environment so we can train a new behavior in small steps.

But it is very possible, and she provided some simple guidelines for how to do it. She did make a point of saying that this topic could not adequately be covered in a 20-minute presentation, so she was just providing a basic outline and some examples in order to illustrate the process. If you want to learn more about how to do this, she has additional resources on her website (www.dogmantics.com).

Before getting into details, she brought up the question of whether or not we are brainwashing dogs. I thought this was a really interesting point, because when we try to change behavior, we always need to think about the function of that behavior and what effect it has on the well-being of the animal. We also need to think about whether the new behavior can serve the same function as the original behavior, or if we need to change something else so the needs of the dog are still being met.

Every case is going to be different, but she has found that there can be positive changes in health and well-being when a dog learns a new, calmer way to respond to stimuli that would previously have caused a great deal of excitement and stress.

The steps:

  1. Train desirable behavior first, and maybe an interrupter
  2. Train controlled version of undesirable behavior (I’m not sure exactly what she means)
  3. Set up an environment in which the dog chooses to do the desirable behavior after being released to do the undesirable behavior. If using food, the dog must be calm enough to take treats.
  4. Add criteria/generalize for the final scenario

She had some great examples and video showing the steps. Here are some of them.

Example 1: Teaching a dog to come away from an item of interest.

  • She had a video of her dog learning to recall away from a snuffle mat. She also showed teaching a dog to recall away from a Frisbee.
  • In both cases, she started by making sure the reinforcer she provided for the recall was of higher value than the item of interest.
  • She also built the behavior in small steps, starting with a recall from a short distance or a recall before the dog got too close to the Frisbee.

Example 2: Teaching a dog to orient to the handler, even in the presence of a rabbit.

This was with an Irish Wolfhound and they started by practicing the desirable behavior, which was having the dog orient to the handler. The rabbit was introduced in stages. First, they had a person stand or walk around at a distance. Then they had the person carry the rabbit, and so on. The complete progression was:

  • Practice orienting to handler
  • Practice with a person (no rabbit) at a distance
  • Practice with a person holding the rabbit, still at a distance
  • Practice with a person and rabbit placed on the ground, still at a distance
  • You can vary the distance as needed, maybe adding distance with each new step, then repeating it with less distance before moving on to the next step.
  • You may want to use an interrupter so the dog doesn’t practice getting too excited
  • You can decrease distance as part of each step or as a gradual process over the whole sequence, or some combination.
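
The progression above is essentially a rule for when to advance. Here is a minimal sketch of that rule (my own, with hypothetical step labels), where the dog only moves to a harder set-up when it is calm enough to take treats at the current one.

```python
# Hypothetical easiest-to-hardest steps for the rabbit example above.
STEPS = [
    "orient to handler, no distraction",
    "person at a distance",
    "person holding rabbit at a distance",
    "person and rabbit on the ground at a distance",
]

def next_step(current: int, calm: bool) -> int:
    """Advance only when the dog is calm enough to take treats;
    otherwise repeat the current step (using the interrupter and
    adding distance as needed before trying again)."""
    if calm and current < len(STEPS) - 1:
        return current + 1
    return current
```

Varying the distance within a step, as Emily described, just means a step can be repeated several times before `next_step` is consulted again.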

Example 3: She also showed some video of a dog that wanted to chase deer and would get very excited, jumping up and down and straining at the leash. She was able to teach the dog to remain calm, even when deer were present, and the dog can now be walked without trying to chase deer.

A few other points:

  • The distraction can become a cue for the new behavior. In the second example, seeing the rabbit triggers the desire to chase it. The trainer’s goal is to change the response of the dog so that the sight of the rabbit becomes a cue to do the behavior she wants. It’s important to build this slowly by starting with a “picture” that doesn’t elicit the undesired response and then slowly change it so the rabbit cues the new response.

  • The distraction can be used as a reinforcer, if appropriate. There are some behaviors that can be used as reinforcers for the new behavior you have trained. She often uses “permission to sniff” as a reinforcer for waiting.

She ended with a review of these important concepts:

  • Maintain correct arousal level
  • Ability to control training set-ups
  • Building strength of desired behaviors first
  • Break down training into small steps
  • Reinforce dog’s choice
  • Manage/prevent unwanted behavior in between
  • Generalization/brush up

Emily’s presentation was on using the Premack Principle to decrease unwanted behavior by strengthening new behaviors and making them more probable. The next section is on how to use the Premack Principle to teach new behaviors by using patterns.

Alexandra Kurland – Do it again: The use of patterns in training

In Alexandra’s presentation, she showed how patterns can be used in training. Patterns are another way to use behaviors as reinforcers, especially if you build your patterns carefully so they include higher probability behaviors that can be used to reinforce lower probability behaviors.

Alex started with a video showing a simple pattern in which a horse went from mat to mat, passing a cone along the way. The set-up of the environment provided clear information to the horse about what to do next. This kind of exercise is a great way to teach some basic skills or behaviors in a systematic and thoughtful way.

Alexandra’s basic premise is that teachers and learners thrive on patterns.

Patterns create:

  • Predictability
  • Opportunities for do-overs (each repeat of the pattern provides another opportunity to practice or improve each behavior)
  • Time to plan ahead (you and your horse know what’s coming next so you can be prepared)
  • Consistent high rates of reinforcement (you can reinforce as much as needed. A pattern can have reinforcement at multiple places, not just at the end)
  • Emotional balance (if well designed)
  • An adjustable degree of difficulty

Patterns add a level of complexity that draws attention to areas where you need to do more work. A behavior might be easy for the horse when you practice it on its own, but be more difficult when inserted in a pattern. Figuring out why can identify areas that need more work. Combining several behaviors together in sequence may also make it easier to see a general trend such as a weakness in one particular skill or a missing component.

What do you need to know to teach patterns?

Patterns are connected to Loopy Training. Loopy training is Alexandra Kurland’s term for training where a behavior is built in loops. A loop can contain one or more ABC cycles (antecedent -> behavior -> consequence).

The basic guidelines for Loopy Training are:

  • Start with a tight loop with clean behavior (one ABC loop)
  • When the loop is clean, you get to move on. Not only do you get to move on, but you should move on. (Alex calls this phrase “The Loopy Training Mantra”)
  • Moving on means expanding the loop to include more behaviors, with additional cues and reinforcers if needed.
  • A loop is clean when it is fluid and there are no unwanted behaviors
  • Both sides of the click need to be clean
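
A loop, in other words, is a sequence of ABC cycles that only grows when it is clean. As a rough sketch (my own rendering, not Alexandra's), the mantra could be written like this:

```python
# Toy model of the Loopy Training mantra: a loop is a list of
# (antecedent, behavior, consequence) cycles. When a run of the loop
# is clean, you don't just get to move on -- you should move on,
# here by expanding the loop with another cycle.

def expand_when_clean(loop, run_was_clean, next_cycle):
    """loop: list of ABC tuples. run_was_clean: was the last run
    fluid, with no unwanted behavior on either side of the click?"""
    if run_was_clean:
        return loop + [next_cycle]  # moving on = expanding the loop
    return loop                     # not clean yet: repeat as-is
```

Starting from one tight cycle, say ("mat ahead", "walk forward", "click/treat"), a clean run earns the loop its next cycle, such as ("on mat", "stand", "click/treat").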

Some additional details about loops:

  • For every behavior you teach, there is an opposite behavior you need to teach and you need to balance behaviors within a loop. For example, a well-balanced loop would contain movement and stationary behaviors.
  • This is not recipe driven training. Every loop can be customized for every horse.
  • People get stuck in patterns because they stay too long. The Loopy Training mantra tells you to move on when the loop is clean.

Example of a Pattern created with Loopy Training: Walking with the trainer to a mat

Basic Principle: There’s always more than one way to shape a behavior. What is the future of the behavior? What skills do you want to learn?

Alex often uses mats in her training. They can be used to ask the horse to go forward (go to the mat) and to stop (stand on the mat) so they help to create well-balanced loops. But before you can use mats in your patterns, you need to have basic mat manners which include:

  • The horse walks on a slack lead beside me, or at liberty
  • He stops on the mat on his own
  • He stays on the mat
  • He leaves the mat when asked

To meet the criteria listed above, the horse and trainer need to learn how to go forward together, how to stop together and how to step back (if the horse overshoots the mat). She teaches these first.

Step 1: Teach the individual behaviors first

Come forward:

  • Can the horse come forward when asked?
  • Alex showed how you can use targeting to ask a horse to come forward
  • She can continue to use the target or add a cue to come forward
  • Stopping can also be taught out of targeting as the horse learns to come forward and stop at the target

Step back:

  • Can the horse step back when asked?
  • Alex showed how to use food delivery to teach backing as part of a basic targeting exercise.
  • Once the horse has learned to back through food delivery, she can add a cue

Once the horse can come forward, stop and step back, she can start to train matwork.

Step 2: Teach Going to the mat

Alex teaches matwork using what she calls a “runway.” The runway is made of two sets of cones arranged to form an open V (wide). A mat is placed at the base of the V. You can click and reinforce at any point in this pattern. When Alex first teaches it, she will reinforce for most correct responses. Once the horse knows the pattern, she can thin out the reinforcement if desired.

Runway lesson:

  • Start at the open end of the V
  • Ask the horse to come forward one step and then step back one step (needlepoint) at the open part of the V – this allows the trainer to practice asking for one step at a time. The trainer can move on when she chooses, usually when she sees some improvement.
  • Walk with the horse to the mat
  • Stop with the horse on the mat
  • If the horse doesn’t step on the mat, you can ask the horse to come slightly forward (using your needlepoint skills). When the horse does put a foot on the mat, you can use a high rate of reinforcement to reinforce that behavior.
  • Walk off the mat and turn so you are coming around the side (outside the cones) to return to the top of the V. You can switch directions so you work from both sides.
  • The pattern builds the skills that are needed for performance work (turns, forward and back).

The matwork pattern works well because it shows a nice balance of behaviors. It also allows for a lot of adjustability. The trainer can spend more time on the behaviors that need improvement, but is less likely to get stuck on one for too long, as the pattern itself encourages the trainer to move on after any small improvement.

You can make lots of different patterns to teach different behaviors and skills. I use a lot of patterns in my own training and have found that they provide a nice framework for teaching more advanced behavior combinations and can be adjusted to keep the training interesting.

Here are some of the main points from the three presentations on the Premack Principle:

• Think of reinforcement as being about behaviors. What does your animal want to do? Can you use it as a reinforcer?

• Don’t forget that Premack is also about punishers. If you follow a behavior you want with one that the horse is less likely to choose on his own, there will be a decrease in the behavior you want.

• The value of reinforcers can change. Premack showed that the same behaviors can function as reinforcers and punishers, so a behavior can be reinforcing in one situation and punishing in the next.

• You can shift a behavior from more probable to less probable by building reinforcement history for that behavior in an environment where the animal can be successful, and then slowly raising the criteria until he can do the new behavior under the original conditions.

• When choosing a behavior to use as reinforcer, consider behaviors that the animal enjoys or that have a strong reinforcement history.

• As with any type of reinforcer, you do want to choose one that provides the appropriate level of motivation and doesn’t tip the animal over into a state where he is too excited to learn. Barbara Heidenreich gave a presentation on non-food reinforcers that included some tips on choosing and using non-food reinforcers.

• Chains and sequences of behaviors will be stronger if the behaviors are arranged so that more probable behaviors follow less probable behaviors.

Notes from The Art and Science of Animal Training Conference (ORCA): Dr. Peter Killeen on “Skinner’s Rats, Pavlov’s Dogs, Premack’s Principles.”

Dr. Killeen is a professor of psychology at Arizona State University and has been a visiting scholar at the University of Texas, Cambridge University, and the Centre for Advanced Study, Oslo.  He gave the keynote address on Saturday morning.  Here’s the description from the conference website:

“Reinforcement is a central concept in the enterprise of training, and yet it remains a controversial one. Much of the opinion about its nature is derived from laboratory protocols involving food or water deprived animals. This does not always translate into the more complex and pragmatic world of animal training. In this talk I take a step back, to re-embed the concept of reinforcement in an ecological context. Reinforcement is always caused by the opportunity for an animal to make a transition from one action pattern to the next. The Premack principle is a simple deployment of this insight. I will discuss the Premack principle, alternate versions of it, and the relevance of the emotional state of the animal.”

I want to preface this article with a few thoughts.  This is a long article and at times it may seem overly academic for the needs of most animal trainers.  By the time I was done writing it, I found myself wondering if anyone would want to read it. 

But I hope that you will do so, because I think Dr. Killeen has shared an important perspective on animal training and behavior that combines the work of psychologists, ethologists and other professionals in related fields.

This is somewhat unusual.  In Karen Pryor’s closing remarks, she commented it was common for professionals in related fields to be isolated from one another, even though they each have important information that they would benefit from sharing.  A presentation that shows the connections between different fields (psychology, ethology, biology), takes the information we have learned from all of them, and puts it in a larger framework, is a great resource.

But this presentation was not just about the big picture.  He included a lot of useful information about what we have learned in the past 100 years, and I found lots of practical tidbits scattered throughout. I also found it very helpful to see the context in which each “discovery” was made, and how new information built on previous discoveries, either complementing them or requiring some re-thinking. I hear references to the work of Pavlov, Skinner and Premack all the time, but without understanding the historical significance, how the work was actually done, and its applications, my knowledge of how to use that information would remain somewhat limited. Putting their work in context has made it easier for me to see what we can learn from the science, as well as what we still need to learn.

And finally, I think learning this stuff can be fun. Yes, I said it. Ok, I am a bit of a behavior geek and I like reading about scientific discoveries, but I think that it can be very eye-opening to read about the actual research and what it tells us about behavior. The first year I attended ClickerExpo, I went to Kathy Sdao’s “A Moment of Science” lectures and found my brain fizzing with excitement.  Previously I had only had a limited understanding of the science behind clicker training, so learning more about it was exciting, but there was something more. Something about seeing all the little connections (and how we learned about them),  seeing more clearly that behavior is not generated randomly, but follows predictable (well mostly…) patterns, and that by observing, analyzing and changing the conditions under which behavior happens, we can influence it.

So what did Dr. Killeen have to say? The following article is based on my notes from his talk and is shared with his permission. He also generously shared the slides with me, so I could study them in more detail and include some diagrams.

Dr. Killeen started by saying that training requires art and science.  He has spent most of his life as a laboratory scientist, but recognizes that knowing the science is only part of animal training. Still, he feels it’s very important to get the scientific information out to the public so that the knowledge can be shared and also viewed in the proper context. With this in mind, he took us on a “brief tour of modern learning theory” and looked at the contributions of Pavlov, Skinner, Premack and a few other scientists along the way.

He started with a little review of what we’ve learned from studying behavior:

  • Classical (Pavlovian conditioning)
    -sign learning – pairing of stimuli to create associations
  • Effect (Skinnerian conditioning)
    -self learning – responses get connected to consequences
  • Attraction (Thorndikian conditioning)
    -approach to incentives – surprisingly general and powerful law
  •  Premack Principle
    -transition to higher probability response
  • Timberlake’s Ethograms
    -organizes the Premackian insight

Then he went into more detail:

Ivan Pavlov:

Ivan Pavlov was a Russian scientist who was studying digestion in dogs in the early 1900s. In his experiment he wanted to measure the amount of saliva produced when dogs were fed meat.  But he started to have trouble because the dogs were salivating before he could feed them, and eventually even before he showed them the meat.

He referred to these as “psychic secretions” and ended up studying them instead. He did this by pairing a sound (typically a metronome) with the presentation of the meat and studying how the response to the metronome changed over time.  After a few pairings the dog would salivate in response to the metronome, instead of to the meat itself.

This work led to an understanding of the process through which conditioned stimuli can become associated with unconditioned stimuli to form new associations and responses, the process we now call “classical” or “Pavlovian” conditioning.  It also led to the basic laws of association, which describe the relationships between the US (unconditional stimulus), CS (conditional stimulus), UR (unconditional response) and CR (conditional response).

Note:  Dr. Killeen used the terms “conditional,” not “conditioned” as is often seen.  According to Paul Chance, the term conditional is closer to Pavlov’s original meaning, but the two terms (conditioned and conditional) are often used interchangeably.

Pavlov’s motto was “Control your conditions and you will see order.”

A few items of note from his experiments:

  • Pavlov’s dogs were restrained and he was positioned such that he could not see all their responses to the stimuli.
  • The investigators only paid attention to the smooth muscle (visceral) response, not to other behaviors that the animals did. This is important because it led to a limited view of classical conditioning, with scientists assuming it only occurred with certain types of responses.
  • The original description of classical conditioning was one of substitution, where you could replace one stimulus with another through conditioning.

Further research into Pavlovian conditioning has shown that it should be viewed somewhat differently. In the 1970’s scientists (H.M. Jenkins and others) were studying Pavlovian conditioning in unrestrained animals and found that there were numerous responses to the unconditional stimulus. They said it was more accurate to call the conditioned response a “conditional release” because it was releasing a number of natural responses.

Their conclusion was that:

  • The CS-US episode mimics a naturally occurring sequence for which preorganized action patterns exist. The CS “substitutes for a natural signal, not for the object being signaled as in the Pavlovian concept of substitution …”
  • CR should mean Conditional Release
  • The topographies of CRs “are imported from the species’ evolutionary history and the individual’s pre-experimental history”

H.M. Jenkins’ work showed that the textbook description of the CS as a “faint image” of the US is not accurate. It is more accurate to say that it is a signal to engage some new action patterns.  This is called induction.

Edward Lee Thorndike

“Psychology is the science of the intellects, characters, and behaviors of animals including man.”

Dr. Killeen remarked that most psychologists study the behavior of man and perhaps it’s time to turn that around a bit…

Thorndike is best known for stating the Law of Effect, which he formulated after observing the behavior of cats placed in puzzle boxes. The cats learned to escape through trial and error, but learned from each experience, so they were quicker at escaping once they had done it successfully.

“The Law of Effect is that: Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal, will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur; those which are accompanied or closely followed by discomfort to the animal will, other things being equal, have their connections with that situation weakened, so that, when it recurs, they will be less likely to occur. The greater the satisfaction or discomfort, the greater the strengthening or weakening of the bond.” (taken from his slide, which credits Kenneth M. Steele)

The Law of Effect is the foundation for Skinner’s work on operant conditioning because it clearly states the connection between response and reinforcing context. Dr. Killeen described it as “A law of selection by consequences. It is a probabilistic law.”

So far we have:

  • Pavlovian Conditioning – connection of context to stimuli, CS, US
    -> Similarity, proximity, regularity
  • Thorndike – connection of response to context, S-> R
    -> when they lead to satisfiers
  • Skinner – connection of response to reinforcers, R -> S^R
    -> dropped the need for satisfaction
    -> wanted to vest all variables in the environment

These are different ways of looking at behavior, but you have to realize that they are all going on at the same time.

To understand the importance of these discoveries (laws), Dr. Killeen had a slide showing  where they would be placed in a list of either the top 10 laws in Psychology or the most important laws in Psychology at the end of the 20th century.

Top 10 Laws in Psychology  (this list was taken from textbooks):

  • 2. Law of Effect
  • 3. Laws of association
  • 6. Laws of contiguity
  • 8. Law of exercise

(note: the laws of association, contiguity and exercise are roughly equivalent to, or contain essential components of, numbers 6, 8 and 10 below.)

Important laws in Psychology at the end of the 20th century (this list was taken from a journal article.):

  • 2. The Law of Effect
  • 6. Premack’s Principle
  • 8. Classical conditioning
  • 10. Reinforcement/operant conditioning

Dr. Killeen said that he, personally, thinks that Premack’s Principle is the most powerful of all.

David Premack

Dr. David Premack was a psychologist who studied reinforcement and cognition in chimpanzees.  Two of his most notable contributions are his work on Theory of Mind in chimpanzees and the Premack Principle.

The Premack Principle states that:

  • Behaviors are reinforcers, not stimuli
  • More probable behaviors reinforce less probable behaviors.
  • Less probable behaviors punish more probable behaviors.

This re-defining of reinforcers as behaviors was a very important shift in thinking and changed the way that scientists (and others) looked at reinforcers.  Previously, reinforcers had been defined as stimuli (food, objects, etc.) but Dr. Premack showed that it was the activity associated with that stimulus that was reinforcing.  It’s EATING the food that is reinforcing.  It’s PLAYING with the ball that is reinforcing.

If you aren’t sure about this, think about some of the activities you enjoy doing and ask yourself if the end goal is reinforcing, or if it is the activity itself. Why do you eat? Is it just to feel full? Why do you read a book? Is the book better if someone tells you the ending ahead of time?

When you are trying to change behavior, you want to look at possible activities and see which ones are more probable and which ones are less probable. This gives you a preference hierarchy of possible activities, which can then be used to shift behavior in the direction you want.

Dr. Premack did a number of interesting experiments looking at changing more probable and less probable behavior by limiting access to resources. He found that there was a “reversibility” of reinforcers, so an activity that was reinforcing in one situation might function as a punisher in another.

This work was done in the laboratory, but the Premack Principle explains the relationship between behavior and reinforcement under many conditions. Dr. Killeen showed an example (from Jesús Rosales-Ruiz) of using the Premack Principle with a barking dog. The amount of barking could be decreased by moving the dog either toward or away from the other dog, depending upon which behavior was more reinforcing for the individual dog at that moment.

Dr. Killeen stated that he thinks all reinforcement principles come down to Premack’s Principle.  But, there are objections and some difficulties in figuring out how to measure probabilities.

One problem is how to measure probability. It’s difficult because probability isn’t based solely on duration or intensity; it may depend upon many factors. Eventually, Dr. Premack settled on measuring how much time the animal will spend on a task when it is not satiated.

There was also the question of whether the animal had to be in a state of deprivation for an activity in order for that activity to become more probable. In his experiment with rats, where he was able to reverse the probabilities of wheel running and drinking, he did use deprivation to make one of the behaviors more likely.

While deprivation can certainly change probabilities, there was an interest in finding a better way to calculate (or predict) more and less probable behaviors. There were also scientists who were interested in finding a larger framework within which to view Premack’s Principle. William Timberlake, a psychologist at Indiana University, had developed a way of mapping behavior that proved to be useful. His Behavioral System offered a way to describe and map behavior that showed how an animal will naturally progress through a sequence of behaviors, with each one reinforcing the previous one.

Timberlake’s Behavioral System was based on looking at the natural sequence of behaviors that are part of specific activities. Dr. Killeen had a series of slides on predatory behavior that showed how each step leads to several choices, which lead to several more choices, and so on. When an animal makes one choice, it makes some new choices more likely and other choices less likely. If you want to read more about Timberlake’s work, this article goes into more detail: http://indiana.edu/~bsl/behavior.pdf.

Here’s one of Timberlake’s Behavioral Systems charts that Dr. Killeen showed:

For the purposes of this article, what you need to know is that Timberlake looked at patterns of behavior, identifying systems and subsystems, modes, modules, and actions. You could trace and predict an animal’s behavior by making a diagram of the Behavioral System that showed possible pathways.

This provided a framework for viewing how one behavior might reinforce another. For example, in predation,  the mode “focal search” might lead to the behaviors “investigate”, “chase”, “lie in wait”,  “capture”, and “test”.  If an animal continued down the chase pathway, it might “track the animal” or try to “cut it off”.

Once you can identify different behavior states (modes, modules, or actions), you can collect data by observing animals to see which pathways are more likely. This gives you general tendencies, not absolute values, since there are many variables and an animal can start down one pathway and be forced to shift to a different one. So it’s not useful in an absolute sense, but it can provide information about which behaviors tend to reinforce other behaviors, and it can help to identify the most common sequences. This information helps you see the connection between what Premack learned from his laboratory work and the behavior of animals in their natural habitat.
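A minimal sketch of that data-collection idea: represent part of a Behavioral System as a graph of states with observed transition counts, and read off which next behavior is most likely from each state. All of the labels and counts below are invented, loosely following the predation example; real numbers would come from observation.

```python
# Sketch of a Timberlake-style Behavioral System as a transition graph.
# States and counts are hypothetical; they stand in for observation data.

transitions = {
    "focal search": {"investigate": 14, "chase": 9, "lie in wait": 4},
    "chase":        {"track the animal": 6, "cut it off": 3},
    "investigate":  {"test": 10, "capture": 2},
}

def most_likely_next(state):
    """Return the most frequently observed next behavior and its relative
    frequency. This is a tendency, not an absolute prediction."""
    counts = transitions[state]
    total = sum(counts.values())
    best = max(counts, key=counts.get)
    return best, counts[best] / total

behavior, freq = most_likely_next("focal search")
print(behavior, round(freq, 2))   # investigate 0.52
```

This matches the caveat in the text: the counts only give you the most common sequences, because an animal that starts down one pathway can still be forced onto another.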

I want to make a comment here.  When I see people describing the application of the Premack Principle in training, they often put an emphasis on using an available activity, one that is what the animal would choose to do on its own.  So a dog might be taught to orient to its owner in the presence of squirrels, and they would try to reinforce that behavior by providing the opportunity to run in the direction of a squirrel.

But there’s nothing in the Premack Principle that says you need to use a “naturally” reinforcing activity. I asked Dr. Killeen about that and he said that you can use any behavior, as long as you take the time to build a strong reinforcement history so that it can function as a reinforcer.  In Emily Larlham’s presentation, she talked about how to use Premack to decrease deer chasing and she did it by building a high probability for an alternative behavior that had nothing to do with chasing deer.

I think it can be helpful to look at the Premack Principle in the context of naturally occurring behavior sequences, and you may be able to use them in some cases, but don’t let that limit how you think about using it.

Unified Theory of Connection (Peter Killeen)

Dr. Killeen pulled all the Laws of Connection (Pavlov, Skinner, Thorndike, Premack) and Timberlake’s Behavioral Systems together to make his Unified Theory of Connection. This is where you start to see how the different laws fit together to create a complex repertoire of behavior. The Behavioral Systems provide a framework, and movement through them can be explained using the Laws of Connection.

Some Key Points of the Unified Theory of Connection:

  • Different subsystems (predatory, defensive, sexual) make different modes attractive.
  • Reinforcers are responses, not stimuli.
  • Movement down the modules constitutes reinforcement.
  • Movement from state to state (subsystem -> mode -> module ->  action) is possible because of satisfying events.
  • Animals approach stimuli that make progress possible (these stimuli are unconditioned or may be classically conditioned).
  • Within modules, the actions and how they are done are subject to the law of effect, operant strengthening, etc.

Using this chart as an example, he provided some specifics on what movement within each column indicates, and what prompts transitions:

unified theory

The “action” column:

  • More probable (and thus reinforcing) responses are ones lower in their action space (lower responses reinforce higher responses).
  • Transition points enable progression through the actions. An animal moves through transition points for one (or more) of these reasons:
    1. They are satisfying (Thorndike)
    2. They are approached (Thorndike); they are incentive motivators
    3. They elicit other species-typical actions (Pavlov)
    4. They reinforce the particular responses that lead to them (Skinner)

The “module” column:

  • Moving from one module to the next provides a “conditional release” (Jenkins) for what classes of responses are most likely.
  • The topography of the conditional release comes from the animal’s natural behavior (pre-organized action patterns).
  • Signs of such transitions are Pavlovian CSs – the CS substitutes for a natural signal. (In this context, I think a “sign” is what we might call a cue or a signal to proceed to the next behavior.)

The “mode” column:

  • Moving from one mode to the next “sets the occasion” (Holland) for what classes of stimuli are most effective and what responses are most likely.
  • Such transitions are “motivating operations.”
  • “Occasion setters” follow different rules than CSs.
  • Training, and interactions in general, are as much about configuring motivating operations as about applying reinforcers. You want the animal to be in the right mode.

Readiness and Regulatory Fit:

Understanding how animals move either horizontally or vertically down the chart is essential when trying to change behavior. He provided a little additional information on this subject by looking more at readiness and regulatory fit.

Thorndike’s Law of Readiness (1914) already provided some information about how moving into a module provides readiness to move down the chain.

“When a child sees an attractive object at a distance, his neurons may be said to prophetically prepare for the whole series of fixating it with the eyes, running toward it, seeing it within reach, grasping, feeling it in his hand, and curiously manipulating it.”

Skinner and Premack had also talked about how behavior tends to move down action chains, as the body is already anticipating the next action.  Some of the behavior in the action chain may be innate and some may be learned.

Dr. Killeen provided a simple example of how our behavior can be influenced by the mode we are in. The example comes from Tory Higgins, who studies human behavior to see the effects of different modes (promotion vs. prevention, approach vs. avoidance) on behavior.

If you are trying to sell someone something, you have to put them in the right mode.

  •  If you want to sell them a yacht, you put them in “adventure” mode by telling them stories that make them want to go out and do something new.
  • If you want to sell them life insurance, you put them in “life is dangerous” mode by sharing stories about people who have died, accidents, etc.

Key Points from the Unified Theory of Connection:

  • Animals approach satisfiers. He shared a number of slides citing research supporting the basic idea that behavior is motivated by approach or withdrawal.  The research included field data and experimental data.
  • Satisfiers are contexts with higher rates of reinforcement/action relevant to their current state, or contexts associated with more attractive actions.
  • Satisfying contexts are those that lead to actions at deeper levels in their Behavioral System. Always look at behavior from an ethological viewpoint.  Sometimes these actions become satisfying in their own right and animals get stuck in them.  We have purposely bred dogs to get “stuck” at some actions (retrievers, pointers, etc.).
  • Behavior is a trajectory through a field of attractors — modules.
  • Conditioned stimuli are signposts on the journey. If they are extrinsic, it’s Pavlovian sign learning, if they are intrinsic/proprioception, it’s Skinnerian act learning.
  • If moving to a better state, CSs function as conditioned reinforcers. If moving to a worse state, they function as conditioned punishers.
  • Many actions are shared by different systems and some are shared by different modes, so an action can have one meaning in one context and a different meaning in another one. A bite can be predatory or sexual. Actions shared between multiple systems or modes can lead to short-circuiting.
  • What gets learned are more efficient routes/actions: ways to get to satisfying actions within modules, or ways to get to modules that are deeper/more satisfying in the action hierarchy.

Sign Tracking:

As stated above, one of the key points of the Unified Theory of Connection is that animals approach satisfiers. An example of this can be found in sign tracking, which has been documented in dozens of species and demonstrates the tendency to approach and contact signs of reinforcement.

He described an experiment by Hearst and Jenkins (1974) in which they put a pigeon in a long cage with a light on one end and a food hopper on the other.  When the light came on, the pigeon would approach it, which meant the pigeon was actually moving away from the food hopper when the light came on.

But the food hopper was set up so that the food was only available for a short time after the light came on. By the time the pigeon got back to the food hopper (after going to the light), the food would no longer be available. You would think the birds would learn to wait at the food hopper and watch for the light, but they never did. That’s how powerful sign tracking, and therefore the desire to approach, can be.

Role of Affect:

He finished up by looking a little bit at the role of affect (emotions).

  • No matter how we think about stimuli and their settings, we must also know how to feel about them.
  • Affect tells us which action modes to engage: what kind of “readiness,” in Thorndike’s terms.
  • Different action modes are associated with different emotions, and these emotions can tell us whether to approach, avoid, or kick back and wait it out.

You can think of emotions as the signatures of different behavioral modes. They:

  • differentially prime perception
  • prime motor systems
  • inhibit competing systems
  • tell us what to do, and simultaneously empower that action
  • hold us in relevant modes

And this leaves us with the New Laws of Connection:

  • Approach -> To stimuli that mark transitions/routes down our hierarchy (Pavlovian sign-learning). They are pleasurable/satisfying or scary; emotion empowers responses relevant to modes.
  • Effect -> In similar contexts we approach the actions that gained that improvement (Skinnerian self-learning)
  • Act for Action -> It is access to better actions that constitutes reinforcement (Premack Principle); it is imposition of adverse actions that constitutes punishment.

There were two other presentations that looked specifically at using the Premack Principle in training, and I was originally going to include them as part of this article. But I think it would be better to write about them separately, so they will appear in a future article.