equine clicker training

using precision and positive reinforcement to teach horses and people

ASAT Conference 2017: Dr. Peter Killeen on “Skinner’s Rats, Pavlov’s Dogs, Premack’s Principles.”

Dr. Killeen is a professor of psychology at Arizona State University and has been a visiting scholar at the University of Texas, Cambridge University, and the Centre for Advanced Study, Oslo.  He gave the keynote address on Saturday morning.  Here’s the description from the conference website:

“Reinforcement is a central concept in the enterprise of training, and yet it remains a controversial one. Much of the opinion about its nature is derived from laboratory protocols involving food or water deprived animals. This does not always translate into the more complex and pragmatic world of animal training. In this talk I take a step back, to re-embed the concept of reinforcement in an ecological context. Reinforcement is always caused by the opportunity for an animal to make a transition from one action pattern to the next. The Premack principle is a simple deployment of this insight. I will discuss the Premack principle, alternate versions of it, and the relevance of the emotional state of the animal.”

I want to preface this article with a few thoughts.  This is a long article and at times it may seem overly academic for the needs of most animal trainers.  By the time I was done writing it, I found myself wondering if anyone would want to read it. 

But I hope that you will do so, because I think Dr. Killeen has shared an important perspective on animal training and behavior that combines the work of psychologists, ethologists and other professionals in related fields.
This is somewhat unusual.  In Karen Pryor’s closing remarks, she commented it was common for professionals in related fields to be isolated from one another, even though they each have important information that they would benefit from sharing.  A presentation that shows the connections between different fields (psychology, ethology, biology), takes the information we have learned from all of them, and puts it in a larger framework, is a great resource.

But this presentation was not just about the big picture.  He included a lot of useful information about what we have learned in the past 100 years, and I found lots of practical tidbits scattered throughout it. I also found it very helpful to see the context in which each “discovery” was made and how the new information built on previous discoveries, and either complemented them or required some re-thinking of them. I hear references to the work of Pavlov, Skinner and Premack all the time, but without understanding more about the historical significance, how the work was actually done, and future applications, my knowledge of how to use that information has been and will be somewhat limited. Putting their work in context has made it easier for me to see what we can learn from the science, as well as what we still need to learn.

And finally, I think learning this stuff can be fun. Yes, I said it. Ok, I am a bit of a behavior geek and I like reading about scientific discoveries, but I think that it can be very eye-opening to read about the actual research and what it tells us about behavior. The first year I attended ClickerExpo, I went to Kathy Sdao’s “A Moment of Science” lectures and found my brain fizzing with excitement.  Previously I had only had a limited understanding of the science behind clicker training, so learning more about it was exciting, but there was something more. Something about seeing all the little connections (and how we learned about them),  seeing more clearly that behavior is not generated randomly, but follows predictable (well mostly…) patterns, and that by observing, analyzing and changing the conditions under which behavior happens, we can influence it.

So what did Dr. Killeen have to say? The following article is based on my notes from his talk and is shared with his permission. He also generously shared the slides with me, so I could study them in more detail and include some diagrams.

Dr. Killeen started by saying that training requires art and science.  He has spent most of his life as a laboratory scientist, but recognizes that knowing the science is only part of animal training. Still, he feels it’s very important to get the scientific information out to the public so that the knowledge can be shared and also viewed in the proper context. With this in mind, he took us on a “brief tour of modern learning theory” and looked at the contributions of Pavlov, Skinner, Premack and a few other scientists along the way.

He started with a little review of what we’ve learned from studying behavior:

  • Classical (Pavlovian conditioning)
    -sign learning – pairing of stimuli to create associations
  • Effect (Skinnerian conditioning)
    -self learning – responses get connected to consequences
  • Attraction (Thorndikian conditioning)
    -approach to incentives – surprisingly general and powerful law
  •  Premack Principle
    -transition to higher probability response
  • Timberlake’s Ethograms
    -organizes the Premackian insight

Then he went into more detail:

Ivan Pavlov:

Ivan Pavlov was a Russian scientist who was studying digestion in dogs in the early 1900s. In his experiment he wanted to measure salivation (the amount of saliva produced) when dogs were fed meat.  But he started to have trouble because the dogs were salivating before he could feed them, and eventually even before he showed them the meat.

He referred to these as “psychic secretions” and ended up studying them instead. He did this by pairing a sound (typically a metronome) with the presentation of the meat and studying how the response to the metronome changed over time.  After a few pairings the dog would salivate in response to the metronome, instead of to the meat itself.

This work led to an understanding of the process through which conditioned stimuli can become associated with unconditioned stimuli to form new associations and responses, the process we now call “classical” or “Pavlovian” conditioning.  It also led to the basic laws of association, which describe the relationships between the US (unconditional stimulus), CS (conditional stimulus), UR (unconditional response) and CR (conditional response).

Note:  Dr. Killeen used the terms “conditional,” not “conditioned” as is often seen.  According to Paul Chance, the term conditional is closer to Pavlov’s original meaning, but the two terms (conditioned and conditional) are often used interchangeably.

Pavlov’s motto was “Control your conditions and you will see order.”

A few items of note from his experiments:

  • Pavlov’s dogs were restrained and he was positioned such that he could not see all their responses to the stimuli.
  • The investigators only paid attention to the smooth muscle (visceral) response, not to other behaviors that the animals did. This is important because it led to a limited view of classical conditioning, with scientists assuming it only occurred with certain types of responses.
  • The original description of classical conditioning was one of substitution, where you could replace one stimulus with another through conditioning.

Further research into Pavlovian conditioning has shown that it should be viewed somewhat differently. In the 1970’s scientists (H.M. Jenkins and others) were studying Pavlovian conditioning in unrestrained animals and found that there were numerous responses to the unconditional stimulus. They said it was more accurate to call the conditioned response a “conditional release” because it was releasing a number of natural responses.

Their conclusion was that:

  • The CS-US episode mimics a naturally occurring sequence for which preorganized action patterns exist. The CS “substitutes for a natural signal, not for the object being signaled as in the Pavlovian concept of substitution …”
  • CR should mean Conditional Release
  • The topographies of CRs “are imported from the species’ evolutionary history and the individual’s pre-experimental history”

H.M. Jenkins’ work showed that the textbook description of the CS as a “faint image” of the US is not accurate. It is more accurate to say that it is a signal to engage some new action patterns.  This is called induction.

Edward Lee Thorndike

“Psychology is the science of the intellects, characters, and behaviors of animals including man.”

Dr. Killeen remarked that most psychologists study the behavior of man and perhaps it’s time to turn that around a bit…

Thorndike is best known for stating the Law of Effect, which he formulated after observing the behavior of cats placed in puzzle boxes. The cats learned to escape through trial and error, but learned from each experience, so they were quicker at escaping once they had done it successfully.

“The Law of Effect is that: Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal, will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur; those which are accompanied or closely followed by discomfort to the animal will, other things being equal, have their connections with that situation weakened, so that, when it recurs, they will be less likely to occur. The greater the satisfaction or discomfort, the greater the strengthening or weakening of the bond.” (taken from his slide, which credits Kenneth M. Steele)

The Law of Effect is the foundation for Skinner’s work on operant conditioning because it clearly states the connection between response and reinforcing context. Dr. Killeen described it as “A law of selection by consequences. It is a probabilistic law.”
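For readers who like to see an idea written out as code, here is a rough Python sketch of what “selection by consequences” and “probabilistic” could look like. It is only an illustration, not something from Dr. Killeen’s slides: the response names, the starting weights, the learning rate and the outcome values are all made-up assumptions.

```python
import random

def choose(weights, rng):
    """Pick one response; stronger connections are chosen more often."""
    responses = list(weights)
    return rng.choices(responses, weights=[weights[r] for r in responses], k=1)[0]

def apply_effect(weights, response, outcome, rate=0.5, floor=0.01):
    """Return new weights after one trial.

    outcome > 0 stands for satisfaction, outcome < 0 for discomfort.
    A larger outcome changes the connection more. `rate` and `floor`
    are arbitrary values chosen for this illustration.
    """
    updated = dict(weights)
    updated[response] = max(floor, updated[response] + rate * outcome)
    return updated

def probability(weights, response):
    """Chance that `response` is the one chosen on the next trial."""
    return weights[response] / sum(weights.values())

# Hypothetical cat in a puzzle box: only pulling the loop opens the door.
weights = {"scratch at bars": 1.0, "meow": 1.0, "pull loop": 1.0}
rng = random.Random(0)  # fixed seed so the run is repeatable
for _ in range(50):
    response = choose(weights, rng)
    outcome = 1.0 if response == "pull loop" else -0.1
    weights = apply_effect(weights, response, outcome)

print(round(probability(weights, "pull loop"), 2))
```

The point of the sketch is that no single trial guarantees the “right” response. The satisfying response just becomes more and more likely over repeated trials.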

So far we have:

  • Pavlovian Conditioning – connection of context to stimuli, CS, US
    -> Similarity, proximity, regularity
  • Thorndike – connection of response to context, S-> R
    -> when they lead to satisfiers
  • Skinner – connection of response to reinforcers, R -> S (superscript R)
    -> dropped the need for satisfaction
    -> wanted to vest all variables in the environment

These are different ways of looking at behavior, but you have to realize that they are all going on at the same time.

To understand the importance of these discoveries (laws), Dr. Killeen had a slide showing  where they would be placed in a list of either the top 10 laws in Psychology or the most important laws in Psychology at the end of the 20th century.

Top 10 Laws in Psychology  (this list was taken from textbooks):

  • 2. Law of Effect
  • 3. Laws of association
  • 6. Laws of contiguity
  • 8. Law of exercise

(note:  the laws of association, contiguity and exercise are roughly equivalent, or contain essential components of numbers 6, 8 and 10 below.)

Important laws in Psychology at the end of the 20th century (this list was taken from a journal article.):

  • 2. The Law of Effect
  • 6. Premack’s Principle
  • 8. Classical conditioning
  • 10. Reinforcement/operant conditioning

Dr. Killeen said that he, personally, thinks that Premack’s Principle is the most powerful of all.

David Premack

Dr. David Premack was a psychologist who studied reinforcement and cognition in chimpanzees.  Two of his most notable contributions are his work on Theory of Mind in chimpanzees and the Premack Principle.

The Premack Principle states that:

  • Behaviors are reinforcers, not stimuli
  • More probable behaviors reinforce less probable behaviors.
  • Less probable behaviors punish more probable behaviors.

This re-defining of reinforcers as behaviors was a very important shift in thinking and changed the way that scientists (and others) looked at reinforcers.  Previously, reinforcers had been defined as stimuli (food, objects, etc.) but Dr. Premack showed that it was the activity associated with that stimulus that was reinforcing.  It’s EATING the food that is reinforcing.  It’s PLAYING with the ball that is reinforcing.

If you aren’t sure about this, think about some of the activities you enjoy doing and ask yourself if the end goal is reinforcing, or if it is the activity itself. Why do you eat? Is it just to feel full? Why do you read a book? Is the book better if someone tells you the ending ahead of time?

When you are trying to change behavior, you want to look at possible activities and see which ones are more probable and which ones are less probable. This gives you a preference hierarchy of possible activities, which can then be used to shift behavior in the direction you want.

Dr. Premack did a number of interesting experiments looking at changing more probable and less probable behavior by limiting access to resources. He found that there was a “reversibility” of reinforcers, so an activity that was reinforcing in one situation might function as a punisher in another.

This work was done in the laboratory, but the Premack Principle explains the relationship between behavior and reinforcement under many conditions. Dr. Killeen showed an example (from Jesús Rosales-Ruiz) of using the Premack Principle with a barking dog. The amount of barking could be decreased by moving the dog either toward or away from the other dog, depending upon which behavior was more reinforcing for the individual dog at that moment.

Dr. Killeen stated that he thinks all reinforcement principles come down to Premack’s Principle.  But, there are objections and some difficulties in figuring out how to measure probabilities.

One problem is how to measure probability. It’s difficult because probability is not based solely on duration or intensity, but may depend upon many factors.  Eventually Dr. Premack decided to use how much time the animal will spend on an activity, provided the activity is not satiating.

There was also the question of whether or not the animals had to be in a state of deprivation for an activity, in order for it to become more probable. In his experiment with rats where he was able to reverse the probabilities of wheel running and drinking, he did use deprivation to make one of the behaviors more likely.
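Here is a small Python sketch of how a preference hierarchy based on time spent could be used to predict whether one activity will reinforce or punish another, and how that prediction can reverse. The minute values are invented for illustration (they are not Dr. Premack’s data), and the function names are my own.

```python
def preference_hierarchy(baseline_minutes):
    """Rank activities from most to least probable.

    Probability is estimated as the share of observed time spent
    on each activity.
    """
    total = sum(baseline_minutes.values())
    shares = {activity: t / total for activity, t in baseline_minutes.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)

def predicted_effect(baseline_minutes, behavior, consequence):
    """Predict what happens when `consequence` is made contingent on `behavior`.

    More probable activities reinforce less probable ones;
    less probable activities punish more probable ones.
    """
    b = baseline_minutes[behavior]
    c = baseline_minutes[consequence]
    if c > b:
        return "reinforce"
    if c < b:
        return "punish"
    return "no clear prediction"

# Invented numbers: a rat with water always available
normal = {"drink": 5, "run in wheel": 20}
# Invented numbers: the same rat after water deprivation
thirsty = {"drink": 25, "run in wheel": 10}

print(preference_hierarchy(normal))
print(predicted_effect(normal, "drink", "run in wheel"))
print(predicted_effect(thirsty, "drink", "run in wheel"))
```

With the first set of numbers, the chance to run is predicted to reinforce drinking. With the second set, the same activity is predicted to punish it, which is the kind of reversal described above.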

While deprivation can certainly change probabilities, there was an interest in looking for a better way to calculate (or predict) more and less probable behaviors.  There were also scientists who were interested in finding a larger framework within which to view Premack’s Principle.  William Timberlake, a psychologist at Indiana University, had developed a way of mapping behavior that proved to be useful. His Behavioral System offered a way to describe and map behavior that showed how an animal will naturally progress through a sequence of behaviors, with each one reinforcing the previous one.

Timberlake’s Behavioral System was based on looking at the natural sequence of behaviors that are part of specific activities. Dr. Killeen had a series of slides that showed predatory behavior, and showed how each step leads to several choices, which leads to several more choices, and so on.  When an animal makes one choice, it makes some new choices more likely and some other choices less likely.  If you want to read more about Timberlake’s work, this article goes into more detail: http://indiana.edu/~bsl/behavior.pdf.

Here’s one of Timberlake’s Behavioral Systems charts that Dr. Killeen showed:

[Figure: one of Timberlake’s Behavioral System charts]

For the purposes of this article, what you need to know is that Timberlake looked at patterns of behavior, identifying systems and subsystems, modes, modules, and actions. You could trace and predict an animal’s behavior by making a diagram of the Behavioral System that showed possible pathways.

This provided a framework for viewing how one behavior might reinforce another. For example, in predation,  the mode “focal search” might lead to the behaviors “investigate”, “chase”, “lie in wait”,  “capture”, and “test”.  If an animal continued down the chase pathway, it might “track the animal” or try to “cut it off”.

Once you can identify different behavior states (modes, modules or actions), you can collect data by observing animals to see which pathways are more likely.  This gives you general tendencies, not absolute values, as there are many variables and an animal can start down one pathway and be forced to shift to a different one. So it’s not useful in the absolute sense, but it can provide information about what behaviors tend to reinforce other behaviors, and it can help to identify the most common sequences.  This information helps to see the connection between what Premack learned from his laboratory work and the behavior of animals in their natural habitat.
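To make the idea of collecting data on pathways a little more concrete, here is a minimal Python sketch that counts how often one behavior follows another in a set of observation records and turns the counts into estimated probabilities. The behavior labels come from the predation example above, but the observation records are invented, and this is just one simple way to tally transitions, not Timberlake’s actual method.

```python
from collections import Counter, defaultdict

def transition_probabilities(sequences):
    """Estimate how likely each behavior is to follow another.

    `sequences` is a list of observed behavior sequences, each a list
    of labels in the order they occurred.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair each behavior with the one that came right after it.
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for state, followers in counts.items()
    }

def most_likely_next(probs, state):
    """Return the most common follower of `state`, or None if never observed."""
    followers = probs.get(state, {})
    return max(followers, key=followers.get) if followers else None

# Invented observation records, for illustration only
observed = [
    ["investigate", "chase", "track", "capture"],
    ["investigate", "lie in wait", "capture"],
    ["investigate", "chase", "cut off", "capture"],
    ["investigate", "chase", "track", "investigate"],
]

probs = transition_probabilities(observed)
print(probs["investigate"])
print(most_likely_next(probs, "investigate"))
```

As noted above, the results are tendencies rather than absolute values. In the last record, for example, the animal starts down the chase pathway and ends up back at “investigate.”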

I want to make a comment here.  When I see people describing the application of the Premack Principle in training, they often put an emphasis on using an available activity, one that the animal would choose to do on its own.  So a dog might be taught to orient to its owner in the presence of squirrels, and they would try to reinforce that behavior by providing the opportunity to run in the direction of a squirrel.

But there’s nothing in the Premack Principle that says you need to use a “naturally” reinforcing activity. I asked Dr. Killeen about that and he said that you can use any behavior, as long as you take the time to build a strong reinforcement history so that it can function as a reinforcer.  In Emily Larlham’s presentation, she talked about how to use Premack to decrease deer chasing and she did it by building a high probability for an alternative behavior that had nothing to do with chasing deer.

I think it can be helpful to look at the Premack Principle in the context of naturally occurring behavior sequences, and you may be able to use them in some cases, but don’t let that limit how you think about using it.

Unified Theory of Connection (Peter Killeen)

Dr. Killeen pulled all the Laws of Connection (Pavlov, Skinner, Thorndike, Premack) and Timberlake’s Behavioral Systems together to make his Unified Theory of Connection. This is where you start to see how the different laws fit together to create a complex repertoire of behavior. The Behavioral Systems provide a framework  and movement through the Behavioral Systems can be explained using the Laws of Connection.

Some Key Points of the Unified Theory of Connection:

  • Different subsystems (predatory, defensive, sexual) make different modes attractive.
  • Reinforcers are responses, not stimuli.
  • Movement down the modules constitutes reinforcement.
  • Movement from state to state (subsystem -> mode -> module ->  action) is possible because of satisfying events.
  • Animals approach stimuli that make progress possible (these stimuli are unconditioned or may be classically conditioned.)
  • Within modules, the actions and how they are done are subject to the law of effect, operant strengthening, etc.

Using this chart as an example, he provided some specifics on what movement within each column indicates, and what prompts transitions:

[Figure: chart illustrating the Unified Theory of Connection]

The “action” column:

  • More probable (and thus reinforcing) responses are ones lower in their action space (lower responses reinforce higher responses).
  • Transition points enable progression through the actions. An animal moves through transition points for one (or more) of these reasons:
    1. They are satisfying (Thorndike)
    2. They are approached (Thorndike); they are incentive motivators
    3. They elicit other species-typical actions (Pavlov)
    4. They reinforce the particular responses that lead to them (Skinner)

The “module” column:

  • Moving from one module to the next provides a “conditional release” (Jenkins) for what classes of responses are most likely.
  • The topography of the conditional release comes from the animal’s natural behavior (pre-organized action patterns).
  • Signs of such transitions are Pavlovian CSs – the CS substitutes for a natural signal. (In this context, I think a “sign” is what we might call a cue or a signal to proceed to the next behavior.)

The “mode” column:

  • Moving from one mode to the next “sets the occasion” (Holland) for what classes of stimuli are most effective, what responses are most likely.
  • Such transitions are “motivating operations.”
  • “Occasion setters” follow different rules than CSs.
  • Training and interactions in general are as much about configuring motivational operations as about applying reinforcers. You want the animal to be in the right mode.

Readiness and Regulatory Fit:

Understanding how animals move either horizontally or vertically down the chart is essential when trying to change behavior. He provided a little additional information on this subject by looking more at readiness and regulatory fit.

Thorndike’s Law of Readiness (1914) already provided some information about how moving into a module provides readiness to move down the chain.

“When a child sees an attractive object at a distance, his neurons may be said to prophetically prepare for the whole series of fixating it with the eyes, running toward it, seeing it within reach, grasping, feeling it in his hand, and curiously manipulating it.”

Skinner and Premack had also talked about how behavior tends to move down action chains, as the body is already anticipating the next action.  Some of the behavior in the action chain may be innate and some may be learned.

Dr. Killeen provided a simple example of how our behavior can be influenced by the mode we are in.  This example comes from Tory Higgins, who studies human behavior to see the effects of different modes (promotion vs. prevention, approach vs. avoidance) on behavior.

If you are trying to sell someone something, you have to put them in the right mode.

  •  If you want to sell them a yacht, you put them in “adventure” mode by telling them stories that make them want to go out and do something new.
  • If you want to sell them life insurance, you put them in “life is dangerous” mode by sharing stories about people who have died, accidents, etc.

Key Points from the Unified Theory of Connection:

  • Animals approach satisfiers. He shared a number of slides citing research supporting the basic idea that behavior is motivated by approach or withdrawal.  The research included field data and experimental data.
  • Satisfiers are contexts with higher rates of reinforcement/action relevant to their current state, or contexts associated with more attractive actions.
  • Satisfying contexts are those that lead to actions at deeper levels in their Behavioral System. Always look at behavior from an ethological viewpoint.  Sometimes these actions become satisfying in their own right and animals get stuck in them.  We have purposely bred dogs to get “stuck” at some actions (retrievers, pointers, etc.).
  • Behavior is a trajectory through a field of attractors — modules.
  • Conditioned stimuli are signposts on the journey. If they are extrinsic, it’s Pavlovian sign learning, if they are intrinsic/proprioception, it’s Skinnerian act learning.
  • If moving to a better state, CSs function as conditioned reinforcers. If moving to a worse state, they function as conditioned punishers.
  • Many actions are shared by different systems and some are shared by different modes, so an action can have one meaning in one context and a different meaning in another one. A bite can be predatory or sexual. Actions shared between multiple systems or modes can lead to short-circuiting.
  • What gets learned are more efficient routes or actions: ways to get to satisfying actions within modules, or ways to get to modules that are deeper/more satisfying in the action hierarchy.

Sign Tracking:

As stated above, one of the key points of the Unified Theory of Connection is that animals approach satisfiers. An example of this can be found by looking at sign tracking, which has been found in dozens of species, and shows a tendency to approach and contact signs of reinforcement.

He described an experiment by Hearst and Jenkins (1974) in which they put a pigeon in a long cage with a light on one end and a food hopper on the other.  When the light came on, the pigeon would approach it, which meant the pigeon was actually moving away from the food hopper when the light came on.

But the food hopper was set up so that the food was only available for a short time after the light came on. By the time the pigeon got back to the food hopper (after going to the light), the food would no longer be available. You would think the birds would learn to wait at the food hopper and watch for the light, but they never did. That’s how powerful sign tracking, and therefore the desire to approach, can be.

Role of Affect:

He finished up by looking a little bit at the role of affect (emotions).

  • No matter how we think about stimuli and their settings –
  • We must also know how to feel about them –
  • Affect tells us which action modes to engage, what kind of “readiness” in Thorndike’s terms.
  • Different action modes are associated with different emotions and they can tell us whether to approach, avoid or kick back and wait it out.

You can think of emotions as the signatures of different behavioral modes. They:

  • differentially prime perception
  • prime motor systems
  • inhibit competing systems
  • tell us what to do, and simultaneously empower that action
  • hold us in relevant modes

And this leaves us with the New Laws of Connection:

  • Approach -> To stimuli that mark transitions/routes down our hierarchy (Pavlovian sign-learning). They are pleasurable/satisfying or scary; emotion empowers responses relevant to modes
  • Effect -> In similar contexts we approach the actions that gained that improvement (Skinnerian self-learning)
  • Act for Action -> It is access to better actions that constitutes reinforcement (Premack Principle), and it is the imposition of adverse actions that constitutes punishment.

There were two other presentations that looked specifically at using the Premack Principle in training and I was originally going to include them as part of this article. But I think it would be better to write about them separately, so they will be in a future article.


4 replies

  1. I wanted to read it, and I did 🙂 Thank you for all of the work of putting this down and sharing it.


  2. Thanks so much for this analysis. I find it very interesting and useful in the future, especially for talking with researchers who use Pavlovian procedures rather than Skinnerian, and deprivation as a motivator rather than anticipation of positive outcomes.


    • Thank you, Karen. He shared a lot of interesting information. I found it very useful to understand the context in which discoveries were made and how they fit together to make the bigger picture of what we have learned so far.

