ASAT Conference 2018: Steve White on “Keep Going Signals.”

This is the third in a series of posts based on my notes from the 2018 Art and Science of Animal Training Conference that was held in Irving, Texas on March 24-25, 2018.

While I try to take accurate notes, it is possible that there are errors or that some detail is lacking. If you post a comment or email me, I can try to clarify or provide some additional information. Many thanks to the speakers and organizers who allow me to share.  To learn more about the conference, you can visit the conference website.

Steve White gave a presentation titled “Keep Going Signals: what you need to know before you even consider using them.”

Keep Going Signals (KGS) are used by some trainers to build duration or to train more complex sequences of behavior.  Steve talked about seeing Attila and Fly’s freestyle routine at Crufts and then meeting Attila at ClickerExpo.  In his demonstration, Attila showed how he used a single click as a KGS and multiple clicks as the end of behavior (EOB) marker.

Steve was intrigued by this, because it was a different use of the click.  He shared a quote from Karen Pryor: “A training method will be successful to the extent it complies with the principles of learning.”  Attila was clearly successful, so maybe there was something to the idea of having another marker in addition to the end of behavior marker.  I liked that Steve talked about how important it is to be curious when you see someone doing something different, and how the dog is the “ultimate arbitrator” of what works.

Steve had already been using Keep Going Signals, but seeing Attila’s training made him look more closely at how he had been using them and at what trainers should consider before teaching one.  In this presentation, he talked about how they use them with police dogs.

What is a Keep Going Signal?

  • A “cue” that means “I like that, give me more”
  • May also be called an intermediate bridge (IB)
  • Useful in some training situations
  • The original KGS was continuous (used by Bob Bailey)
  • In his work they usually use an intermittent one

Why use a Keep Going Signal?

Operational uses:

  • Law enforcement tracking
  • Detector dogs
  • SAR (search and rescue)
  • Service dogs
  • Remote guidance

Training benefits of a KGS:

  • Useful for duration or repetition
  • Connect movement with position
  • Parallel shaping of multiple behaviors.  Police dogs need to do behaviors like indicating a gun while they are also doing another behavior like scanning the environment.
  • Maintain situational awareness
  • Can build/maintain behavioral momentum

Example of a KGS in police scent work:  The dog is following a scent, loses it and then picks it up again. They want to be able to tell the dog to follow it again. They train this by intentionally setting up a track and putting a break in it – maybe by stopping the dog at a road, putting him in a car and driving him to where he can pick up the track again. When he finds the new track, they use the KGS to tell him to follow it.

It’s an operational “cheat”:

  • They have to cover a lot of material in entry-level K9 training (the repertoire includes tracking, obedience, search, evidence search, handler protection, suspect control, and obstacles)
  • Compressed timeline – they have to work fast
  • They don’t always have time to finish training before the dogs go to work

Pitfalls of a KGS

  • It can be a form of prompting, and if used when the dog is struggling, it can lead to handler dependence. (It’s better to use it when the dog is doing well)
  • It can be a distraction
  • Preferred route to R+ – most police dogs will prefer a toy over food. (I’m not sure what he meant here. He made a comment about not making the KGS too valuable.)
  • Must be used with precision. If it is misdirected and you do end up marking the wrong thing, the only solution is to dilute it
  • You can’t be sure if the dog is responding to the KGS as intended.  With any cue, the dog may be paying more attention to some aspect of it (and not necessarily the one you intended), and this can also be true with a KGS.

You must have solid EOB (end of behavior) marker skills before you teach a KGS – you should NOT teach one unless you are clearly proficient. This means:

  • Clean mechanics
  • Precise timing
  • Being able to handle subtle criteria shifts
  • You need to know how to maintain the momentum of the behavior
  • You need to be able to divide your attention (multi-tasking slows you down)

“You can’t break the rules until you know how to play the game.” – Rickie Lee Jones

More pitfalls:

  • Shortchanging fluency – the single biggest problem he sees is trainers trying to build chains when each behavior is not strong enough.
  • Shortchanging generalization – trainers also don’t build enough generalization before they start; the behavior should be doable anytime, anywhere.
  • Inconsistent criteria shifts – trainers are not paying attention to what else is happening because they are distracted and multi-tasking.

Building an effective KGS

Choose what stimulus you want to use:

  • They use verbal ones in police work because the dog is usually facing away.
  • If you already use a click as an EOB marker, a verbal “good” can work (he had a video of this)
  • You could also use “nice” as the KGS and “good” as the EOB marker
  • Any variation is fine as long as you are clear and consistent

Introduce it:

  • Train with all four classes of activity: stable duration, dynamic behavior, homogeneous chains, and heterogeneous chains
  • At a minimum, train with stable duration and one form of dynamic behavior.
  • Introduce it as a “subliminal” pre-cue, inserting it before the next cue (cue2).
  • Gradually increase salience so the dog is more aware of it
  • Gradually fade cue2, if appropriate
  • Simple…but not always easy

Example:  They cue the dog to do a behavior (Beh1) and then give the KGS (“good”) while the dog is doing the behavior.  This is followed by another cue (cue2).

Beh1 -> “good” -> cue2

They want to introduce it very quietly so it doesn’t distract the dog from the task. If you were building duration, you would re-cue the dog to do the same behavior. If you were building a chain, you would use the KGS and then follow it with the next cue.
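To make the two variants concrete, here is a minimal sketch in Python (my own illustration, not anything shown at the conference) of the insertion pattern: the KGS slips in after the ongoing behavior and is followed by cue2, where cue2 is either a re-cue of the same behavior (building duration) or the cue for the next link (building a chain). The function names and example behaviors are made up for illustration.

```python
# Stand-ins for the trainer's actual mechanics.
def cue(behavior): print(f"cue: {behavior}")
def kgs(): print('KGS: "good" (quiet at first, salience added gradually)')
def eob(): print("EOB marker")
def reinforce(): print("reinforce")

def duration_trial(behavior, repetitions):
    """Build duration: insert the KGS mid-behavior, then re-cue the same behavior."""
    cue(behavior)
    for _ in range(repetitions - 1):
        kgs()          # inserted just before cue2, as a "subliminal" pre-cue
        cue(behavior)  # cue2 = the same behavior; faded later if appropriate
    eob()
    reinforce()

def chain_trial(behaviors):
    """Build a chain: KGS after each link, followed by the cue for the next link."""
    cue(behaviors[0])
    for next_behavior in behaviors[1:]:
        kgs()
        cue(next_behavior)
    eob()
    reinforce()

duration_trial("down", repetitions=3)
chain_trial(["send out", "jump", "return"])
```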

Remember the J-curve of learning – don’t get frustrated when you’re in the low part of the curve.

Rounding out your KGS

  • Introduce it in the remaining classes of activity
  • Introduce it in environmentally cued chains

You should see an increase in the behavior after the KGS – this is how you know it’s working. He had a story about sending a dog out on an obstacle course without him.  He said “good” when the dog was at the top of jumps and doing well.  The dog learned to associate the word “good” with good moments.

Before you consider a KGS:

  • Benefits
  • Costs
  • Risks
  • Training
  • Operations
  • You don’t NEED it – but that doesn’t mean it can’t be useful in some way

Wrapping up:

  • KGS can have operational and training benefits
  • KGS poses very real risks
  • Develop a KGS installation plan – how are you going to teach it
  • Generalize your KGS
  • Use your KGS mindfully (keep emotion out of it)

 

ASAT Conference 2018: Dr. Jesús Rosales-Ruiz on “How Movement Cycles Can Improve Your Shaping.”

This is the second in a series of posts based on my notes from the 2018 Art and Science of Animal Training Conference that was held in Irving, Texas on March 24-25, 2018.

While I try to take accurate notes, it is possible that there are errors or that some detail is lacking. If you post a comment or email me, I can try to clarify or provide some additional information. Many thanks to the speakers and organizers who allow me to share.  To learn more about the conference, you can visit the conference website.

Dr. Jesús Rosales-Ruiz gave a short (20-minute) talk on movement cycles. This was part of a series of talks on how to build precise behaviors by understanding and analyzing movement.

Jesús started by sharing B. F. Skinner’s definition of behavior (1938):

What is behavior?

“The movement of an organism or its parts in a frame of reference provided by the organism itself or by various external objects or fields of force.”

The definition includes two components:

  • What part of the organism was involved (movement)
  • How it relates to the environment (space and time)

Examples:  Pressing (movement) a lever (environment) or looking towards (movement) the light (environment)

It is convenient to speak of this as the action of the organism on the outside world, but it is sometimes easier to deal with an effect than with the movement itself.  In the case of the production of sounds, we can’t see the vocal cords moving so we have to rely on some other movement, such as the movement of the lips or mouth.  Therefore, when defining behavior, we need to be aware of both the observable behavior and the physical action that creates it.

Movement Cycles

Ogden Lindsley, who studied under B.F. Skinner, introduced the idea of “movement cycles” in 1969.  He said that it was not enough to look at an isolated behavior. Instead, we should study the entire cycle that contains the behavior.

  • Each response has a beginning and an end.
  • The behavior is not done until the organism is in a position to do a new one.
  • Sitting – we usually define sitting as contact with the chair, but the cycle starts when you are standing, includes all the steps that immediately precede sitting down, the sit itself, and then all the steps that follow, until you are back in a position from which you could sit down again.

How do movement cycles relate to shaping?

Jesús had a slide with a graphic of a movement cycle.  The picture at the beginning of this post shows something similar, with the movement cycle for a dog sitting.  The cycle starts at 9:00, the middle is at 3:00, and the end is back at 9:00 again.   Between 9:00 and 3:00, there would be all the steps a dog goes through as preparation for sitting. Between 3:00 and 9:00, there would be all the steps a dog goes through as he stands up.  In many cases, the behavior in the middle of the cycle is the one that the trainer would click.
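To make the clock analogy concrete, here is a minimal sketch (my own illustration, not from the talk) that represents a movement cycle as an ordered list of micro-behaviors and shows how the choice of click point selects a moment within the cycle. The micro-behavior names are hypothetical.

```python
# The "clock" runs from 9:00 (start), through 3:00 (mid-cycle), back to 9:00.
SIT_CYCLE = [
    "standing",               # 9:00 - a position from which the dog can sit
    "shifts weight back",
    "folds hindquarters",
    "sits",                   # 3:00 - the moment most trainers would click
    "shifts weight forward",
    "stands back up",         # back at 9:00, ready to begin the cycle again
]

def click_point(cycle, fraction):
    """Return the micro-behavior marked when the click falls a given
    fraction of the way through the cycle (0.5 = mid-cycle)."""
    index = min(int(fraction * len(cycle)), len(cycle) - 1)
    return cycle[index]

print(click_point(SIT_CYCLE, 0.5))    # "sits" - the usual click point
print(click_point(SIT_CYCLE, 0.75))   # a click moved later, into the return phase
```

The same fraction idea applies to the stool-touching example later in this post, where the click is moved from halfway through the cycle to three-quarters of the way through.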

He had another graphic of a chain as a series of movement cycles that were linked together.

How can we use the idea of movement cycles in shaping?

We can focus on the process of getting the behavior, instead of the outcome.

  • Begin shaping at the beginning of the movement cycle.
  • Follow the movement cycle as you shape.
  • You can feed to produce the beginning of the movement cycle, then click the action in the middle of the cycle.

Video examples:

Kay Laurence shaping Quiz to put her foot on the dice, showing the difference between clicking for touching the dice with her foot (the middle of the movement cycle) and clicking for lifting the leg (the beginning of the movement cycle).  Quiz learned the behavior more quickly and with fewer errors when Kay clicked for lifting the leg and added in touching the dice later.

Alexandra Kurland’s students using microshaping to teach a horse to step back, showing how you can shape a step back by clicking for a tiny weight shift (starting at the beginning of the cycle) and then change the timing so the click marks a behavior that is farther into the movement cycle to get a full step back.

Mary Hunter shaping Ginger to go out, touch a stool and return.  The video showed what happened when they moved the click later and later in the movement cycle.  When you reinforce, you reinforce the whole movement cycle, not just the part where you click.

  • Teach going out and touching a stool. Click for stool touch (1/2 way through the cycle)
  • Then move the click to ¾ of the way through the cycle. Now she is clicking as the dog comes back after touching the stool.
  • This is moving the click in the direction the behavior is going.
  • When you move the click later in the cycle, you can do hundreds of repetitions before you see any deterioration in the behavior.  I don’t think they actually did this – he was saying it as a general observation –  but someone in the audience pointed out that the behavior was already changing as soon as he moved the click.  Perhaps more data is needed…

Mary Hunter with Drill Bit (a dog – unusual name!), clicking for attention.  Mary is sitting and Drill Bit is lying down, watching her.

  • She is clicking for attention.
  • When she starts to shape for more duration, several unwanted “extra” behaviors start to creep in (tail and leg movement).
  • She went back to the beginning and shaped in small increments to clean up the behavior.

Summary

When we teach and analyze behaviors, the tendency is to focus on a specific moment in time, the moment when the desired behavior happens.  But, it can sometimes be more effective to look at the entire movement cycle, which includes what happens both before and after the behavior occurs.

  • Focus on Movement, not Outcome
  • Use movement cycles to define or plan the shaping steps
  • Use movement cycles to clean up behaviors

A personal note: 

A few years ago when I was teaching hoof handling to my young horse, I became more aware of the importance of looking at the entire movement cycle in order to avoid reinforcing unwanted behaviors between the click and treat.  She arrived with the habit of striking when her front feet were handled. I could shape a nice leg lift, but as soon as I clicked, she would strike out and then put her foot down. Not what I wanted.

So, I changed my shaping plan and I taught her to pick up her foot a tiny bit, and then allow me to place it back down.  To do this, I started by clicking for a foot lift as normal, but then, instead of building duration for holding it up, I mixed in some clicks for allowing me to put the foot back down.  When she could pick up and put her foot down nicely, I slowly added more to the middle of the behavior until I could pick the foot up and hold it up for longer and longer periods of time.  By paying attention to the entire cycle from the very beginning, I was able to avoid reinforcing unwanted behavior between the click and treat.

ASAT Conference 2018: Ken Ramirez on “No Reward Markers (NRMs): Science and Practice”

This is the first in a series of posts based on my notes from the 2018 Art and Science of Animal Training Conference that was held in Irving, Texas on March 24-25, 2018.  To learn more about the conference, you can visit the conference website.

While I try to take accurate notes, it is possible that there are errors or that some detail is lacking.  If you post a comment or email me, I can try to clarify or provide some additional information. Many thanks to the speakers and organizers who allow me to share.

Ken Ramirez:  No Reward Markers (NRMs): Science and Practice

Whether or not No Reward Markers can be used as part of a positive reinforcement training strategy is always a controversial topic.  As part of the Sunday morning presentations on reinforcers and conditioned reinforcers, Ken shared his thoughts on the subject. This was a 20-minute talk.

Ken started by clarifying what he meant by a No Reward Marker.   The term is used by many trainers, but there are significant variations in both definition and practice, so it’s a good idea to start by defining it.

What is an NRM?

  • Most common use is that it marks the moment the animal gives a wrong or incorrect response
  • Opposite of the click
  • Conditioned punisher

If you use NRMs, you might agree with the first two points, but you will probably question the third point.  Most trainers who use NRMs would not describe them as conditioned punishers.  Instead, they prefer to describe them as providing information to the animal so that he doesn’t waste time pursuing behaviors that will not earn reinforcement.  But Ken said that in all his years of training, he has only seen 13 people (out of thousands) who can use an NRM without any visible side effects.

It may be easier to see this if you look at how conditioned punishers are taught and the possible side effects.

What are conditioned punishers?

A punisher is a stimulus that, when applied immediately after a behavior, decreases the likelihood (frequency) of that behavior happening in the future. A conditioned punisher is a stimulus that has been conditioned, through association with another punisher, so that it can be used to decrease behavior.
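Ken’s argument rests on this functional definition, which can be stated as a simple test (my own framing, not a formula from the talk): a stimulus only qualifies as a punisher if the behavior it follows actually becomes less frequent. The numbers below are hypothetical.

```python
def functions_as_punisher(rate_before: float, rate_after: float) -> bool:
    """True if the marked behavior's future frequency decreased."""
    return rate_after < rate_before

# Hypothetical numbers: 6 incorrect responses per session before the NRM was
# introduced, 2 per session after -- the NRM functioned as a punisher.
print(functions_as_punisher(6, 2))   # True
# If the rate is unchanged, the NRM isn't punishing anything; it is either
# doing nothing or functioning as something else (a cue, or redirection).
print(functions_as_punisher(6, 6))   # False
```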

He shared a video example of a verbal conditioned punisher that was learned through pairing with a finger poke (I’m sure you can guess who).  The dogs clearly responded to the sound with defensive body posture and by recoiling.   The video showed that the conditioned punisher was effective, but also that it had side effects.  There’s no argument that conditioned punishers can be effective, but they are not without risks.

The rest of the talk was looking at various applications of NRMs and evaluating both their effectiveness and side effects.  The conundrum is this… If the NRM is effective at reducing the behavior, then it is, by definition, punishment.  If the NRM is not effective – it does not function as a punisher to decrease the behavior in the future – then why use it?

To unravel this, you have to look at the different applications of NRMs to see whether the NRM is functioning as a punisher, has no effect, or is perhaps functioning as something else like a new cue or a means of redirection.

NRMs: Varied uses and applications

To indicate “no” or “wrong”

  • Marks incorrect response
  • Trainers say they just want it to be information
  • Trainers think it’s ok if it is delivered in a passive manner. How about a passive “oops”?
  • The problem is that if it is effective, then by definition it is a punisher

As a warning signal

  • Last chance before something bad is coming
  • Warning prior to a more aversive stimulus (or a more severe one)
  • Varied effectiveness
  • Can become a new cue for the behavior
  • Do generate an emotional response

Ken had two examples to show some of the things that can happen when an NRM is used as a warning signal.

Example 1:  When he was a kid, his Mom would ask him to take out the garbage.  She might ask him a few times (“Kenny, take out the garbage”) and then if he didn’t do it, she would call him by his full name.  When he heard his full name, he got up and did it.  The use of his full name was effective in that it did cause him to get up and take the garbage out, but it didn’t change his future behavior – he was still likely to ignore her when she said “Kenny, take out the garbage.”  If his full name as an NRM was effective, then he should have learned to take the garbage out when she asked him the first time, but he didn’t. And, over time, the use of his full name just became the new cue (or part of the new cue) to take the garbage out.

Example 2:  The warning “ding, ding, ding” in his car when he leaves the lights on.  The sound is aversive and he feels a moment of frustration when he hears it.  It is effective because when he hears it, he does turn the lights off. But, has it made him less likely to leave the lights on? Maybe a little over time, but it could only be considered a weak punisher because it doesn’t change his behavior very quickly.  He did joke that if it was followed by a strong aversive, it might be more effective, but then he would probably sell the car.   In addition to being a warning, the sound also becomes a cue for a specific behavior – turn off the lights.

To indicate “correct the behavior or you will not be reinforced” 

He had a video showing a blood draw in a hyena where the hyena moved away before the trainer was done holding off the spot. The trainer said “ah ah” and cued the hyena to move back.  He came back into position, she finished, clicked and reinforced him.

Was the NRM effective?

  • Ken can’t see any change in the hyena (good or bad)
  • She uses her cue to bring him back
  • It’s possible the “ah ah” will just become a cue to come back into position
  • The “ah ah” is possibly just superstitious behavior on the trainer’s part

One of the points he made, using this example, was that since the trainer only uses positive reinforcement, it’s likely that the “ah ah” has no meaning to the hyena, which is why Ken doesn’t see any response.  The hyena doesn’t return to position when she says “ah ah” (he responds to her cue), but he might learn to over time if she continues to follow it with her cue.

Used as an interrupter

  • Stops behavior in the moment, but doesn’t always change future behavior
  • Still aversive
  • Weak or ineffective punisher (more like redirection)

Used as a “stop” cue?

This is a more common (growing) use among R+ trainers. The idea is to use the NRM and then immediately redirect and reinforce the alternative behavior.  Again, you have to look at the effect on behavior and the animal’s emotional response. What does the animal look like?

Example:  He had a clip of Susan Garrett teaching a dog to do weave poles.   The video shows several NRMs being used.  In each case, the dog is not reinforced and is re-started. He shared this as an example of an NRM that doesn’t seem to have any aversive side-effects.

  • She has a variety of NRMs (I think he said 4)
  • The dog maintains a high level of enthusiasm even after the NRM
  • In the last part of the clip, she placed a toy a short distance from the end of the weave poles. If the dog went through correctly, he retrieved the toy and she would play tug.  If he made an error, she used her NRM and he returned (without retrieving the toy) and was re-started.
  • Note: Steve White pointed out that it’s not the toy that is the reinforcer, but playing with the toy. If the dog learns that he won’t get to play with the toy, then there’s no point in going and getting it.

Final thoughts

  • Traditional use is that an NRM functions as a punisher
  • Can assist in shaping behavior, but can also create frustration
  • Other similar uses may not actually be an NRM (it’s more likely they are a cue or redirection)
  • Often conditioned inadvertently
  • Only skilled and disciplined trainers can use them well. They are not necessarily a bad tool, but they should be used with thought and care.

Cones, mats, poles, and targets: Putting them to use in ground and ridden work to teach new behaviors and facilitate learning

Some of the most practical behaviors that I teach with clicker training are ones that involve the use of objects. These objects can function as prompts for specific behaviors or visual cues, or provide guidance for how to do other behaviors.  Because they are taught with positive reinforcement, they usually take on additional value (positive valence) through classical conditioning.  Clicker trained horses eagerly approach targets, mats and other physical items used as part of their training and will seek out opportunities to interact with them.

Often this happens automatically without any deliberate effort on the trainer’s part, but we can also choose to build or take advantage of these associations so that the positive emotions associated with clicker training are carried over into new behaviors or new activities.  This is very helpful when training any behavior, but I find it is especially helpful when training new behaviors that might have an aversive component (medical or husbandry behaviors) or for behaviors that might require physical effort (riding or groundwork).  It changes how the horse feels about the activity and makes it easier for him to learn.

I ride in an arena most of the time and I know that it’s important to plan my sessions carefully so that my horse stays mentally engaged and enjoys the work.  Therefore, I’m always looking for ways to combine familiar material (behaviors and skills) with new ideas (can you do it in a slightly different way?) to keep the work interesting. It’s always a challenge to find the right combination so that the horse can be successful but also continue to advance in his training.

One thing I have found helpful is to set up patterns using objects like poles, cones and mats.   They can be used as markers or visual guides to indicate changes of bend, direction, or transitions.  They can also be used to cue specific behaviors like standing (mats), touching, following, or head lowering (targets).  I usually use portable objects so I can set up many different configurations and move them around as needed to make the exercises suitable for the needs of each horse, and to build the right combination of consistency and flexibility.

Using objects to guide a horse and rider through a training exercise is not new and certainly not unique to clicker training. But, when we combine this training strategy with positive reinforcement, and use objects that are associated with known behaviors and positive emotions, the benefits are even greater.

Here’s a short list of some of the reasons I like to use objects:

  • They can be associated with specific behaviors and desirable emotions.
  • With some horses, especially ones that have emotional baggage about traditional ridden work, using objects that are associated with positive reinforcement changes the context and allows me to re-introduce groundwork or ridden work in a new way.
  • They encourage active participation on the part of the horse, give the horse more control, and can provide a “sense of purpose.” It may seem anthropomorphic to talk about a “sense of purpose,” but I know Rosie does better when she knows what she is supposed to do, as opposed to if I ride random patterns (as it may seem to her) until I click.
  • When used as markers, objects provide visual guidance for the horse and rider and also make it easier to evaluate how successfully the exercise has been done.  How close did he come to the cone? Was he straight between the poles?
  • For some behaviors, the horse’s eagerness to participate can be useful information. Did he walk directly to the target? How did he touch it?
  • If used in patterns, they can also help both horse and trainer learn new movement patterns and develop a feel for correct movement.  Once you can ride a straight line between two poles or a line of cones, you know what “straight” feels like.
  • Objects can also be used as an intermediate step between liberty work and more traditional handling.  You can go both ways –  start at liberty with objects and add cues for more traditional handling, or start with more traditional handling using objects and work toward liberty work.

What kind of objects can you use?

Here are some of the objects that I use, and a few ideas for how to use them.  This is not a complete list, but just some examples to get you started.

Targets: 

I use a variety of targets including both hand-held and stationary targets.

They can be useful for:

  • Leading (following a hand-held target)
  • Standing (standing at a stationary target)
  • Isolating and moving body parts (head, legs, shoulders and hips)
  • Going to a location (go to a stationary target)

Cones:

They can be used as stationary targets or for marking patterns (visual guides).   I find it can be confusing for some horses if cones are used as both markers and targets in the same pattern, so I don’t recommend doing that unless you have already taught different behaviors for different types of cones.  A simple way to use a cone as a target, without confusing the horse, is to place a target stick upright in the cone.

I tend to use them for:

  • visual guides – marking geography to practice patterns (turns, circles, serpentines, etc.)
  • Destinations – if used with a stationary target, a cone/target combination can encourage a lower head upon approach.
  • Turns – to encourage bend and help a horse learn to turn without falling on to his inside shoulder or counter-bending.
  • They work well in the middle of patterns as they provide guidance but don’t necessarily become associated with stopping
  • I sometimes use two cones placed close together to make a “cone gate” and direct a horse on to a line of travel. Having two cones is a different visual than a single cone and can help a horse learn to follow the desired path a little more closely.

Mats: 

Mats come in a variety of shapes and sizes. I tend to use solid mats (stiff rubber or wood) for movement exercises as I don’t want to have to keep adjusting a mat if it gets crinkled as can happen with some of the thinner mats.   Most of the time, I am just looking for two front feet on the mat, but I have sometimes used mats for hind foot targeting, or asked for all four feet on a mat, or on two mats placed close together.

They are useful for:

  • Stationary behaviors
  • Destinations (go to the mat)
  • They can be used to isolate body parts (moving front or hind feet while the other feet stay on the mat)
  • the terminal behavior in a chain
  • teaching a horse to go forward – going from mat to mat can encourage forward behavior
  • teaching a horse to slow down – going from mat to mat can provide places to stop for a horse that wants to go
  • Mats can also be placed in patterns and used to teach turns (toward and away from the trainer) and even backing.

Poles:

I also use ground poles, either placed on the ground or on short risers (blocks that elevate them a few inches).  Cavaletti also work well.

I have used them as:

  • visual guides – a pole placed parallel to the line of travel (or two poles placed to create a chute) can tell a horse where to go.
  • visual guides – a pole placed perpendicular to the line of travel can also be used to direct a horse. I sometimes place one or more poles to mark the size of the circle I want, when working at liberty or on the lunge line.
  • cues/visual guides – horses can learn to go from pole to pole (going over them) when laid out in a pattern in the ring. The poles function as cues for the behavior of “step over the pole,” but they are associated with movement, so they also function as visual guides for where to go.  You can teach a horse to do several in a row before he is clicked and reinforced.
  • shaping movement – different spacing or configurations of poles can be used to shape movement.

Other objects:

In addition to targets, mats, cones, and poles, I’ve used other objects to make movement training more interesting, or tap into existing behaviors that have been trained under other conditions.  I think as long as something is safe, portable (or can be easily incorporated), has value to the horse, and will contribute to your training goals, then it could be a useful addition.

Here are some other ideas:

  • hula hoops
  • balls
  • pedestals or platforms
  • buckets
  • mounting blocks – most people focus on using a mounting block to get on, but if you’ve trained it with positive reinforcement, it has high value and can become a place to stop or do some targeting or …
  • toys for fetch (giving your horse a fun activity at the end of a chain can build enthusiasm)
  • what does your horse like?…

When using objects, there are some important things to consider:

  • Have you clearly defined the behavior associated with the object?  When teaching a behavior, it’s important to have clear criteria and be consistent about only reinforcing those efforts that meet them. But, when I start using the object as part of a larger pattern, I might have to adjust or relax the criteria, at least initially.  If so, I need to plan for that and also consider how I am going to shift back to the original criteria, or if that’s even necessary.  For example, with mat work, how much precision do I want? 2 feet on? Fronts square? All square? Orientation to the mat? With targeting, do I want a touch? A touch and hold? An approach?
  • How much or what kind of stimulus control do I want? Is the object the cue, or do I want to use another cue to tell the horse when to go? I find that verbal cues often function more as “release” cues and the verbal combined with the object tells the horse what to do. Objects as visual cues are very strong and this may become problematic if I don’t consider this in my plan.
  • If I am building patterns, I want to consider the best way to assemble the pattern. I can teach it adding each individual behavior one at a time, or by teaching sections and then combining them. I also have to decide if I want to assemble it by forward chaining or backchaining.
  • There’s often a fine line between the object being helpful as part of the pattern and the object becoming the most relevant piece of information.  If I am using the object to set up a pattern where the horse can practice a specific movement pattern, it may be acceptable if the arrangement of objects becomes the cue.  My horses know the cone set-up for a serpentine and that’s ok with me. On the other hand, I have other cone set-ups that are associated with multiple patterns and I want them to use the cones for guidance, but also pay attention to my cues so they know which one we are doing. This builds in flexibility.  My experience has been that if I don’t plan for flexibility, I don’t get it…
  • Do I want the objects to remain as part of the behavior, or do I want to fade them out?  Always consider the long term goal.   If I want to fade them out, then I need to include that as part of my training plan.  There are lots of ways to do this, but I usually do it gradually.  Sometimes I can tell when the horse no longer needs the object(s) because they will start to anticipate or offer the behavior in the absence of the object. Other times I have to play with the set-up to see if they are ready to have them removed.

Common Patterns Using A Combination of Objects:

  • cone circle with mats – Alexandra Kurland teaches balanced work on a circle and turns using cones, with mats placed at various locations to provide breaks, reinforcement or direction
  • mat circle – mats placed in a circle to teach a horse to go from mat to mat on a curve line
  • exploding cone circle – another from Alexandra Kurland, although the name is mine (I’m pretty sure she wouldn’t use the term “exploding”) – start with a tight circle of cones and then slowly expand it. Useful for teaching a horse to stay on the outside of the cones at liberty.
  • connected cone circles – two circles with a path in between. Alexandra Kurland uses these for mat work, changes of bend, etc.
  • cone lozenge – stretch your cone circle out to include straight lines so your horse learns to go from a curved line to a straight line. Also lots of possibilities for patterns across the lozenge.
  • serpentines using cones, poles, or cone gates to help the horse learn to bend and straighten
  • shallow serpentines to teach bending lines at walk, trot, canter and introduce counter-canter
  • chutes of poles to teach straightness – I’ve used these when teaching Aurora to trot on a lead so she learned to stay in her lane, without crossing into mine. I placed a mat or cone gates at various distances so she knew where to stop trotting

Aurora’s pole circle – a more detailed description of how to choose, adjust, and expand upon a basic object-defined exercise:

This fall I wanted to introduce Aurora to the idea of trotting on a circle so I put some thought into various options, taking into account what I had available (cones, mats, and poles) and what she knew about them.

I had previously done a little work on a cone circle, a useful set-up that I had learned from Alexandra Kurland, who uses cone circles a lot.  But I only had enough cones to mark a small circle, so we had just done it at a walk.  I had also used cones and poles as visual guides to teach her to trot in a straight line so I could jog her for the vet, if needed.  To do this, I set up “cone gates” to mark the start and end of the trot, and placed poles in between to mark the line of travel. She would start at one cone gate, trot through the chute of poles and stop trotting at the second cone gate.

This basic pattern with ground poles and cones was easy to convert into a new configuration that would set her up to trot on a circle.  I knew I would initially need to define the line of travel on both sides because she has a tendency to either want to be very close to me, or to zoom off.  Finding a middle distance is difficult for her.  So,  I simply laid out the ground poles in a large circle and placed cones on the inside so she had a “track” with poles on one side and cones on the other.

I did debate whether to have the poles mark the inside or outside of the circle.  Both would work, and ultimately I did have her do both, but I started with the poles on the outside because I thought this had the additional benefit of teaching her the idea of working along the track in a defined space like an arena.

Therefore, my plan was to start with the poles on the outside and use cones to mark the inside of the track.  With this set-up, I would be able to mark and reinforce her for staying between the  poles and cones. Then, as she got better at staying between them, I could slowly decrease the number of cones and reinforce her for staying near the poles, until eventually she learned to just follow the poles and didn’t need the cones anymore.   When I use objects, I always try to plan ahead if I want to fade some of them out.  It seemed like it would be easy to decrease the number of cones and maintain the behavior, if she built up enough reinforcement history for going around next to the poles.

I want to mention here that my goal was not to have her trot around and around.  She’s still young and I knew she didn’t know how to balance on a curved line in trot.  I didn’t want to stress her either physically or mentally.  This exercise was more about introducing the idea of a circle and teaching some basic skills like how to go out, stay at a distance, and maintain the trot with a little duration.  My goal was to get her to trot twice around.  At the same time, I wanted a chance to observe how she carried herself (balance and posture) so I could start to put together some groundwork exercises that would be beneficial to her.

The initial set-up was this:

[Photo: the initial pole circle set-up]

I simply laid out my available rails (12) and set them on small horse blocks, which raise the poles about 2 inches off the ground. This created a round ring about 45 feet in diameter, which left room for a wide track around the outside.  I did leave a “gate,” which is where the larger white blocks are in the picture. I could have made the circle slightly smaller and just used a pole as a gate if I didn’t want a clearly defined entrance and exit.
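As a quick sanity check of the geometry (my own arithmetic; the post doesn’t state the rail length, so standard 12-foot rails are an assumption), twelve rails laid end to end approximate a circle whose circumference is about 144 feet, which works out to a diameter very close to the 45 feet described:

```python
import math

num_rails = 12
rail_length_ft = 12.0                      # assumed standard rail length
circumference_ft = num_rails * rail_length_ft
diameter_ft = circumference_ft / math.pi
print(f"diameter ~ {diameter_ft:.0f} ft")  # ~46 ft, close to the 45 ft described
```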

I introduced the circle by walking her in and around the “track” next to the poles.  The first time I walked with her into the circle, she was quite funny because she walked around the entire thing with her nose on the poles. I think she was sniffing them, but maybe she was just tracking them with her nose.  After that first inspection, she walked with her head in a more normal position.

Over the next week, I did several short sessions of just walking with her next to the poles. I just let her walk around with me in the circle and did this for a few days. I didn’t want the circle to become a cue to trot, so I wanted her to learn to walk quietly in there before we went to a faster gait.

Once she was comfortable walking in there, then I asked her to trot and jogged around the track with her.   I gave her enough space that she didn’t have to go too close to the poles, if she didn’t want to.   She didn’t seem worried about them and was happy to trot next to me.

Then I added the cones. It looked like this:

[Photo: the circle with cones added to mark the inside of the track]

We went back to walk and I spent a few sessions reinforcing her for staying on her side of the cones, while I was on my side of the cones.  Over the course of these sessions, I added more distance and reinforced her for staying in her “track” even when I was not right next to her.  On several occasions, she clearly adjusted her line of travel if she started to cut to the inside of the cones, so I knew she was getting the idea.

Once she got the idea, I started doing the same exercise at the trot.  This was interesting as she got a little confused and went out over the poles a few times.  No big deal. I just had her stop, walked her back in, and we tried again.  We haven’t worked on trotting over poles, so I don’t think she was doing it deliberately. She would just get moving and keep going in a straight line.

Once she figured out that I wanted her to stay inside the poles, she had moments when she followed them and stayed between the poles and cones, but she would sometimes veer in and come to the inside of the cones.  This was where the cones were useful as a visual marker, because if she cut in, I could just cue her to move out and click her for going back to “her” side of the cones.  I had taught her to move out away from me with a target during her regular leading to the field and back, and this cue came in handy when she started to cut in.  After a few sessions in the trot, she started to correct herself if she cut in, and would change direction to get on “her” side of the cones.  I was a little careful about what I reinforced, as I didn’t want her to think the goal was to weave around the cones.

We’ve been working on this on and off for about a month. I often do two or three days in a row and then leave it for a bit.   So far she’s learned to:

  • trot when asked, but not before
  • go out on the circle
  • stay next to the poles (there are a few spots where she often drifts slightly in, but not a significant amount)
  • stop trotting when I click

We still need to work on:

  • staying out and waiting for me to bring her reinforcement to her (she wants to come to me)
  • relaxation in the trot (she’s a little high headed)
  • a little more duration
  • clarifying that the click is not for being in a certain location. She seems to think she gets clicked for being at a certain spot on the circle – probably because I clicked at the same spot a few too many times in an early session.  So she has some confusion about whether the click is for duration or location. (This is not uncommon, as horses often fall into patterns where they do the same thing at the same spot, so the location and the clickable behavior get intertwined.)

Other uses for the circle:

While Aurora has been working on her training goals, I have left the circle set up in my arena. One disadvantage of using poles is that they take more time to set up and take down.  Since I don’t feel like doing that every day, and I have other horses that do either groundwork or riding sessions in the arena, I started thinking about ways I could incorporate the circle into their training.  This has turned out to be a lot of fun, as the circle has a lot of possibilities, especially if I remove a few poles so there are openings.

Here’s one possible configuration for the circle with some poles removed:

[Photo: one possible configuration of the circle with some poles removed]

I can ride around and through the circle at all three gaits and have used it to practice familiar patterns as well as create some new ones. Sometimes I leave the cones in and use them to add clarity or to encourage better turns, etc. Other times I take them out so there are more options.  For groundwork I sometimes add a mat or two, or I might do the patterns with a target stick. To avoid confusing my horses, I don’t ask them to do any patterns that require going over a pole.   It keeps things simpler if I only use them to mean “go around” and not “go over.”

I’ve used this set-up for both groundwork and ridden work. What I’ve found is that the different options create a lot of “clickable” moments and I can click a nice turn, response to a cue, balance shift, or change in the horse’s gaits.  The combination of straight and curved lines and different types of turns make it easy to explore what the horse knows and find some new variations that challenge him a bit, or let him practice what he needs to learn.

Horses who struggle with changes of bend seem to find it easier to go from one clearly defined opening to another clearly defined opening and I can change which poles are present/removed to create different options.  In the past I’ve done similar things with cones, but I found that I really liked the clarity of poles vs. openings.

Here are some of the things I’ve played around with so far:

  • go around the outside (larger circle – useful if a horse tends to fall in)
  • go around the inside (smaller circle – useful if a horse tends to drift out)
  • practice changes of direction by going around the outside and then taking a path through the inside.  Sometimes I just remove two poles so there’s only one path. Other times I remove several so I can practice different lines of travel (bending lines, leg yielding lines, etc.)
  • practice turns and circles by going around one or more poles
  • practice figure 8’s by going around one pole and then another pole. I can do these using poles opposite each other, or along the edges
  • practice leg yield in combination with bending lines by weaving along the outside (I often remove every other pole for this)
  • practice loopy turns (serpentine type turns) by removing a pole and then leaving two so I have bigger turns
  • I can use a turn out of the center of the circle to get more engagement if I ride it with the idea of a square turn.  Once I’m on the line of the circle, I can either ask the horse to collect more or extend.
  • I have also used it with Red in his long-line sessions and found that having specific openings in the circle improved my ability to steer him.
  • I don’t have to stay “tight” to the circle to use it. Sometimes I use the full arena but just pop in and out of the circle at various points, doing a turn and then going out on to a bigger pattern.

Other possible configurations:

One day I removed some of the poles and placed cones across the openings. The cones still provided a visual barrier but were easier to move around than the poles. If you only had a few poles and wanted to do a combination of poles and cones, this option might work well.  Eventually I will probably shift Aurora to more cones than poles, so that I can set the circle up more easily and/or make it bigger.  I do think that, for her, the poles were much clearer and it was worth doing in the beginning, but I don’t think she’ll need them forever.

Here’s the circle with a combination of poles and cones:

[Photo: the circle with a combination of poles and cones]

I also set this up one day, adding a pole to go over and just one cone to mark the track on the opposite side.

[Photo: the circle with one pole to go over and one cone marking the track on the opposite side]

The pattern of one cone to go around and one pole to go over is how I have defined Red’s liberty circle for quite a while, so I was comfortable adding “go over the pole” and didn’t expect it to confuse him.  I set it up inside the pole circle to see if he was really doing a round circle or if it was getting more egg-shaped. Turns out, he is pretty accurate.

I’m sure I will come up with some new ideas for patterns to do through the circle. I haven’t done much at the canter. And then it will be time to come up with a new set-up.

Teaching husbandry behaviors with clicker training: Tooth Inspection

Is your horse comfortable letting you look at his teeth?

In the last few years, I’ve encountered a variety of teeth issues with my own horses and it has made me realize the importance of being able to check their teeth on a regular basis. Without special equipment, I can’t do a complete mouth exam, but I can check their incisors for uneven wear patterns or other signs that they need the attention of a dentist. This allows me to catch problems early. Regular tooth inspection also prepares them for when the dentist does come.

When I started looking at teeth, I thought it would be simple to just move their lips and take a peek. My horses are used to being touched all over and they are comfortable having my hands near their mouths when I am hand feeding, grooming, haltering and bridling. But, I found that while they were ok with my hands near their mouths for routine tasks, asking them to open their lips was an unfamiliar behavior and they weren’t sure what to do. Some of them became confused and offered other things. Others just became anxious and put their heads up or moved them around.

So, I decided this was a great training project. I started with a fairly simple goal which was to teach each horse to do a behavior that would allow me to look at his or her incisors from the front and from both sides. This meant I needed a behavior with some duration and I needed the horse to hold his head in a position where I could see the view of the teeth that I wanted.

My first thought was that maybe I could make use of a behavior that I already had on cue. This is the flehmen response (“smile”) which I had captured or shaped with several of them as a fun trick. It is a great way to see their teeth and they all learned to do it quite easily. But, horses tend to pick their heads up quite high when they show their teeth this way and I wasn’t sure how easy it would be to build duration or be able to see the teeth from the side. For it to be useful, I would need to shape it into a more controlled behavior. I’m sure I could have done that, but I decided that rather than change a long-established behavior, it might be easier to start with something new.

I decided I would start over and teach them to hold their heads still and allow me to gently move their lips out of the way so I could look at their teeth. This would make it easier for me to keep the behavior on cue, and I could build in some flexibility in how I did it. This might be more practical, especially if I needed to see their teeth from a certain angle. I also thought it might be an interesting challenge for one of my horses who tends to have a busy mouth. Learning to keep his lips and tongue still would be a good exercise for him.

This brings up an interesting point which is that there are always several ways to approach any husbandry behavior and it’s a good idea to think about the options before choosing one.  One of the first questions I always ask myself is if I want the horse to do the behavior on his own, or if I want the horse to allow me to physically manipulate him.

There are advantages and disadvantages to both approaches and I always consider the horse’s individual needs (which includes past training history) as well as how, when, and where the behavior will be used.  In many cases, the first approach, where the horse does the behavior on his own, gives the horse more control of the training and encourages the horse to be a more active participant than the other approach, where he is trained to let you do a behavior to him. But, in some cases, learning to allow manipulation requires just as much participation on the horse’s part and can lead to a horse that is more comfortable about tactile information in general. On the other hand, some horses will become less eager to participate if there is an element of manipulation (or any suggestion of “making them do it”), and it’s better to set up the training so they have more control over the process.

Therefore, when choosing how to approach tooth inspections, I had to think a little bit about the possible implications of choosing what might seem like a more passive behavior. The behavior “allowing a person to look at your teeth” is different than “showing your teeth” and I didn’t want the horses to just learn to accept something that was unpleasant. Could I train it in such a way that the horses were still able to communicate when and how the behavior would be done? Yes! Once I started training, I found that I could set up a nice dialog by using specific behaviors as starting points and waiting for the horse to indicate when he or she was ready to continue.

The Teaching Progression

I spent about six weeks working on this with three of my horses: Rosie, Red and Aurora. We didn’t work on it every day and some horses had more sessions than others.  Aurora had the most, Rosie had the least, and Red was somewhere in between.  They all came into the training with slightly different repertoires of trained behaviors and they all have very different personalities.

My plan was to start with a chin target and then refine it to include a closed mouth and quiet lips. One reason I chose to use a chin target is because it’s a behavior where my hand is in close proximity to the mouth, but it’s a position that is less likely to prompt lip movement. It also made it easy to stabilize the horse’s head when I was ready to use my other hand to lift the lips. But, once I started training, I realized that each horse needed a slightly different progression to get to the same basic behavior of chin target with a closed mouth and quiet lips.

Here’s how it worked out:

• Rosie: She is the most experienced of the group. She has done a lot of chin targeting, knows how to “smile” on cue, and is the least nibbly about fingers. To avoid cueing the smile behavior, I started with a chin target and then moved to reinforcing closed mouth and quiet lips. I was careful not to touch her between her nostrils until the chin target/quiet mouth was well established. A finger to that area between the nostrils is her cue to smile. So, Rosie’s progression was chin target -> mouth closed -> lips quiet.

• Red: He is a very clickerwise horse, but had not been taught to chin target, and he is the most oral of the group. He loves to lick people and can be a little nibbly with his lips if my hands are near his face. He also knows how to smile on cue. Since I knew his biggest challenge would be having a quiet mouth, I started by clicking him for that before I put my hands near his face. Once he could keep his mouth closed, then I added a chin target. Once he could do both of those, then I clicked for quiet lips. I probably clicked for quiet lips at other points in the process if the opportunity presented itself, but his general progression was mouth closed -> chin target -> quiet lips.

• Aurora: She is the least experienced. I have not taught her a chin target or to smile, but she does know a hand target. I’ve spent time touching her all over and making sure she’s comfortable with it, but because she’s only 3, she has the least life experience with medical and husbandry procedures. She tends to have a quiet mouth and be pretty passive about having things done to her. She’s also the one I have to watch because her body language is the least clear of the group. She doesn’t always tell me when something is bothering her. Her progression was hand target -> chin target -> closed mouth. I didn’t have to focus on quiet lips with her.

With the chin target, I experimented with leaving my hand near their chin vs. removing it between repetitions and found that both were useful at different times. If I was doing a few repetitions in the same position, it was less distracting if I left my hand near, but not on, the chin while I fed the treat.  Then I could just ask the horse to target my hand when we were starting again. They can’t see it so I would just touch the chin gently and click if the horse maintained the contact. If I wanted the horse to have a break or was changing sides, then I would remove my hand and start over again for the next repetition.   For the most part, I kept my hand in position near (but not on) their chin if they moved their heads around a bit.

At a certain point, once everyone had the basic idea, the beginning sequence of behaviors started to look the same for all the horses. I would start with a chin target, wait for them to be ready (the horse indicated this by having a quiet mouth and lips) and then I would start moving the lips so I could look.   

I broke it down into small steps and followed this general progression:

  • Head still – click for relaxed/neutral position
  • Chin target – click for chin in hand
  • Mouth closed – click for mouth closed
  • Quiet lips – click for quiet lips (this step and the previous one were sometimes done together or in the other order)
  • Quiet lips when I placed my hand on them – click if the horse remained relaxed with no mouth or lip movement
  • Relaxed lips while I gently moved them away from the teeth – click if the horse allowed me to move the lips out of the way
  • Relaxed lips while I moved them more –  click for allowing me to move the lips. This step took quite a lot of time as they had to learn to relax their lips. If they wanted to move their lips out of the way for me (Rosie did this), then I was ok with that, although I had to be careful about what I was clicking.
  • Relaxed lips while I held their lips up for longer durations so I could see their teeth – click for a good moment (relaxed lips, sufficiently open) and enough duration. I built duration slowly over time.

I also monitored head and neck position to make sure they were comfortable. Once I could see the teeth, I might have to click and treat them for moments when they had their teeth together, as some of them would open their mouths slightly as I moved their lips.

One interesting challenge with training the tooth inspection behavior was that a quiet mouth included not chewing. In a lot of my training, I don’t need to wait for the horse to finish eating before I do another repetition and my horses are happy to start again while eating their last reinforcer. But, with this behavior, I found it was better to give them time to completely finish eating before I asked again. Waiting for the horse to be done chewing was something I had to be consciously aware of doing.

Waiting for them to be done chewing meant that once I was past the chin target stage, my rate of reinforcement dropped a lot unless I mixed in other behaviors. Rosie and Red didn’t seem to mind, and if they wanted to do another repetition, they would actually stop chewing even though they weren’t completely done eating. This became a great way for them to tell me when they were ready to go again.

Aurora preferred to finish eating before we did the next repetition of looking at her teeth, so I found it was helpful to add in targeting or other simple behaviors if the rate of reinforcement was getting too low.  I also had her practice the chin target outside of the tooth inspection sessions. I might mix in a few chin targets while I was grooming her, picking moments when she was standing quietly and not chewing. This strengthened the chin target behavior and she could finish eating while I moved on to grooming or something else.

What can I see?

I taught the horses to let me look at their teeth from three positions: the left side, the right side, and in front. I don’t have any professional training in dentistry, but my goal was to be able to look at and check the incisors for a few specific things. This would allow me to catch problems early and be more knowledgeable when my dentist came. As with any profession, there are different schools of thought on tooth care, but I think every horse owner can learn to identify a few simple deviations from correct alignment.

From the front:

1. Are the incisors level? Is the horse wearing the teeth on one side more than the other? Some horses will develop curved incisors (a “frown” or a “smile”) or a wedge mouth where the teeth are longer on one side than the other.

2. Are the upper and lower jaw centered over each other? From the front, I can look at how the top teeth line up with the bottom teeth. This tells me about the alignment of the jaw from side to side.

3. I can also look for any uneven wear pattern that might occur from repetitive action such as the horse biting at a stall grill or bar.  If these things are noticed early, you can take action to prevent further damage.

From the sides:

1. I can check the jaw alignment by looking at the relative positions of the last incisors on each side.  Do the edges of the top incisors and bottom incisors meet in a straight line? If not, then this tells me that the lower jaw is displaced to one side, which can indicate there is an issue in the back teeth (pre-molars and molars) and/or in the TMJ.

2. I can also check if the middle incisors (top and bottom) are meeting correctly so that the top teeth are positioned directly over the bottom teeth.  Sometimes these teeth will be misaligned so that top teeth extend farther forward than the bottom teeth (or the reverse).

3. I can look at the angle of the incisors. If the incisors are too long, the teeth (both top and bottom) will get pushed forward and the angle between them will get smaller.  Long incisors are more common with older horses but it’s worth checking on all horses.

With babies, being able to look at the teeth can help me monitor if the teeth are coming in at the right time and also if the caps are being shed. When Red was young, he retained a cap on one of his incisors and it affected how the adult tooth came in. If we had caught it sooner, he wouldn’t have had uneven front teeth for as long as he did.

I’ve made a little video to show how everyone is doing.  You can watch it at
https://www.youtube.com/watch?v=Fz78MbL8GEc. I am very pleased with their progress and we are starting to work on more duration.  At this point I can see well enough to check the alignment and watch for problems, but there are still plenty of things we can work on.

In addition to increasing duration, and continuing to work on relaxation, I might explore using the chin target in different ways. I haven’t taught them to come and line up with a chin target, so that might be fun to do. It might also be interesting to try and have them chin target on something else so I have two hands available for a more thorough check.

I also have to decide what to do about their tongues.  Both Red and Aurora tend to place their tongues between their front teeth, so just the tip is peeking out.  I can still see what I need to see, but I'd love to be able to see just their teeth with the tongue tucked behind them.  Once we have more duration, I may try and see if I can start reinforcing for moments when the tongue is tucked neatly inside.  I'll update this blog if I decide to work on that and let you know how it goes.

Note: The best place for me to film is in my wash stall, so the video is in that location with the horses wearing halters and leads. I did do some of this training in the wash stall, and even did a few sessions where they were on the cross-ties. But, I also did some sessions in their stalls without any equipment.  I like to teach husbandry behaviors in a few different places and under different conditions because I find this makes the behaviors more robust and the horses seem to handle unexpected variations better.

If you are interested in learning more about teeth, here are some links that I have found to be helpful:

Descriptions of some common malocclusions:
http://discerninghandsequinedentistry.com/malocclusions.html

Articles about the importance of teeth for digestion and proprioception: http://www.vossequine.com/

Article about the connection between feet and teeth: http://thenaturallyhealthyhorse.com/feet-teeth-connection-qa-dr-tomas-teskey/

Notes from the Art and Science of Animal Training Conference (ORCA): Choice

The idea of choice was one of the underlying themes of the conference and is always an important consideration for positive reinforcement based animal trainers. At some level, animal training is about teaching an animal to do the behaviors we want, and to do them when we want them, but there are many different ways to go about getting there.

This conference has always been about exploring how we can achieve our goals, while ensuring that the learning process is enjoyable, the learner is allowed to actively participate in the process, and that he becomes empowered by his new skills and his relationship with his trainer.

There were a lot of presentations that touched on some aspect of choice.

  • Dr. Killeen spoke on how understanding the use of the Premack Principle opens up more choices for reinforcement and can lead to a better understanding of how the value of different reinforcers can change depending upon environmental conditions.
  • Emily Larlham talked about how we can teach our dogs to make different choices instead of becoming stuck in behavior patterns that create stress for both parties. She also talked about how important attitude was in training and how being allowed to actively participate and make choices contributes to a dog’s enthusiasm for any chosen activity.
  • Alexandra Kurland talked about how trainers make choices based on the kind of relationship they want to have with their learner (authoritarian vs. nurturing), and how these decisions influence how much choice they give their animals.
  • Barbara Heidenreich provided lots of examples of how to provide choice through more types of reinforcers and a discussion of why it’s important for both the trainer and the learner to have options.
  • Dr. Andronis showed what happens when animals have limited choices about when and how to earn reinforcement, and how to recognize behaviors that indicate that the learner is no longer enjoying and engaged in the learning process.

These are just a few of the references to choice that came up in the other presentations, but they show that animal trainers have to think about choice all the time.  Sometimes we are looking for ways to increase it.  Sometimes we are looking for ways to limit it so the animal cannot practice behavior we don't want.  And sometimes we are educating the animal so he learns to make “good” choices.

Since choice is such an important topic, I saved it for last. The notes in this article are based on two presentations that dealt more specifically with choice.  The first presentation was given by Jesus Rosales-Ruiz and was titled “Premack and Freedom.” The second presentation was given by Ken Ramirez and was titled “Teaching an animal to say ‘No.'” They go together nicely because Jesús talked about choice from a more academic point of view and Ken talked about how to use choice as part of training.

Jesús Rosales-Ruiz: “Premack and Freedom.”

He started with a quote from David Premack:

“Reward and Punishment vs. Freedom”

“Only those who lack goods can be rewarded or punished. Only they can be induced to increase their low probability responses to gain goods they lack, or be forced to make low probability responses for goods that do not belong to them. Trapped by contingencies, vulnerable to the control of others, the poor are anything but free.”

This quote puts a slightly different perspective on how reinforcement and punishment work and on the idea of choice.  If you desperately need something and you work to get it, is that really choice? And how does that relate to reinforcement? We tend to think that by offering reinforcement, we are giving choices, but are we really doing this? Let's look more closely at reinforcement and then see how it relates to choice.

How we think about reinforcement

The Language of Reinforcement

  • reinforcement as a stimulus (thing)
  • reinforcement as a behavior (activity)

Originally, it was more common to think of reinforcement as being about getting objects such as food, a toy or other item.  But, in Dr. Killeen’s talk, he spoke about the importance of recognizing that reinforcers are behaviors, an idea that came from Dr. Premack. There are some advantages to thinking of behaviors as reinforcers, because this change in thinking opens up the possibility for more types of reinforcers, and also makes it more obvious that the value of a reinforcer is variable.

Not all behaviors are going to be reinforcing at all times, and in some cases, there is reversibility between reinforcers and punishers. Dr. Premack did experiments showing the reversibility of reinforcers. He could set up contingencies that would make lever pressing, running and drinking all reinforcers (at different times), and he did this by adjusting the environment (adding constraints) so that access to some activities was contingent on other activities.

Want a drink? You have to run first. Running is reinforced by drinking.  Want to run? You have to have a drink first.  Drinking is reinforced by access to running.  He showed that behaviors can be either punishers or reinforcers,  depending upon the circumstances. So, perhaps the availability of reinforcers is not what defines choice.
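
This reversibility is easier for me to see written out. Here is a little sketch (mine, not Jesús's) with made-up momentary probabilities for each activity; Premack's rule is just a comparison between them.

    # Not from the presentation, just an illustration of Premack's reversibility,
    # with invented momentary probabilities for each activity.
    baselines = {
        "thirsty":    {"drink": 0.7, "run": 0.2, "press_lever": 0.1},
        "just_drank": {"drink": 0.1, "run": 0.6, "press_lever": 0.3},
    }

    def reinforces(state, contingent_activity, instrumental_activity):
        """Premack's rule: the contingent activity reinforces the instrumental
        one only if it is currently the more probable of the two."""
        p = baselines[state]
        return p[contingent_activity] > p[instrumental_activity]

    # Want a drink? Run first. Drinking reinforces running in a thirsty animal.
    print(reinforces("thirsty", "drink", "run"))       # True
    # After drinking freely, the relation reverses: running reinforces drinking.
    print(reinforces("just_drank", "run", "drink"))    # True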

What about freedom? Is choice about having freedom?

There are different ways we can think about freedom:

  • Freedom from aversive stimulation
  • Freedom to do what we want (positive reinforcement)
  • More generally (freedom from control)

How do these ideas about freedom apply to animal training?  And can they help us understand more about choice in animal training?

Clicker trainers meet all three of these “definitions,” to some degree.  Jesús went over this pretty quickly but it’s quite easy to see how clicker training can contribute to an animal’s sense of freedom or being able to make choices.  Clicker trainers avoid using aversives by shaping behavior using positive reinforcement.  They also avoid using aversives when the animal gives the “wrong” answer or does an unwanted behavior.

Clicker trainers don’t necessarily give the animal freedom to do whatever it wants, but they do use positive reinforcement, and over time positively reinforced behaviors do often become what the animal wants to do.  Also, during shaping, the trainer may want the animal to offer a variety of behaviors, so there is some element of choice there.

They may also choose to train behaviors or set up training so the animal has more control of its own training.  A lot of clicker trainers focus on training behaviors that the animal can use to communicate with the trainer so this can give the animal some feeling of control. We are still focusing on what we want them to do, but we are doing it in such a way that they feel they have more control.

At this point, Jesús mentioned that he doesn’t believe that total freedom exists.  If it did, then science would not exist because there would be no “laws.” Our behavior is always determined by something and while we may think we can control it, what we are really after is the feeling that we can control it, which may or may not be true.

I have to confess that at this point I found myself remembering endless college discussions about “free will” and whether or not it exists. I don’t think we need to go there, but I do think that it’s important to realize that it may sometimes be more accurate to say that our goal is for the animal to have the perception of control. Interestingly enough, one of the points that Steve White made was that perception drives behavior, not reality.

Let’s look at how we can control behavior in animal training:

With the idea of freedom and choice in mind, let's look at four different ways to control behavior in animal training (a short sketch comparing them follows the fourth one).

1. Control only through aversives:
(note: I added this one for completeness. Jesús referred to it but did not list it on his slides)

  • target behavior occurs -> animal is left alone or aversive ceases
  • target behavior absent -> aversive consequences
  • the animal has no control, choices or freedom

2.  Control through aversives, but with some added positive reinforcement

  • target behavior occurs -> positive consequences
  • target behavior absent -> aversive consequences
  • the animal can gain reinforcement, but it still does not have a choice and cues can become “poisoned” because the cue can be followed by either a positive or negative consequence, depending upon how the animal responds.

3.  Control through positive reinforcement

  • target behavior occurs -> positive consequences
  • target behavior absent -> no consequences
  • the animal can now “choose” whether or not it wants to gain reinforcement, without having to worry about aversive consequences for some choices.  But is it really choice if one option earns reinforcement and the other does not?

4.  Control through positive reinforcement with choices

  • target behavior occurs -> positive consequences
  • target behavior absent -> other positive consequences
  • animal always has another option for a way to earn reinforcement, so there is true choice between two options that both lead to reinforcement.
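
To keep the four arrangements straight, I find it helps to lay them out as a simple consequence table. This compact restatement is mine, not from the slides:

    # My restatement of the four arrangements: what follows the target
    # behavior occurring, and what follows its absence.
    paradigms = {
        "1. aversives only":                  ("aversive stops", "aversive"),
        "2. aversives plus some positive":    ("positive",       "aversive"),
        "3. positive reinforcement":          ("positive",       "nothing"),
        "4. positive reinforcement, choices": ("positive",       "other positive"),
    }

    for name, (occurs, absent) in paradigms.items():
        print(f"{name:38s} occurs -> {occurs:15s} absent -> {absent}")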

Jesús shared a couple of videos that showed animals making choices.  He did not spend a lot of time on these, so I can only provide a brief description and the point he was trying to make.

The first video was of a training session with a dog using clicker training and food.  The dog is loose and can participate or not.  As soon as the trainer starts clicking and treating, the dog leaves.  He followed that video with another one where the dog is trained with food alone (no clicker).  In this case, the dog stays and continues doing behaviors for reinforcement.  He said this was an example of a situation where the clicker itself was associated with “hard work” so when the clicker came out, the dog would leave.  He didn’t go into more details on the dog’s previous training history but those clips do suggest that the clicker itself can become an aversive stimulus.

The second video showed a horse being taught using clicker training and food. The horse is loose and the trainer is sitting in a chair.  Throughout the training session, the horse is eagerly offering a behavior, which the trainer clicks and reinforces with food. But, because the trainer is feeding small pellets of grain, some of the food is falling on the ground around the horse.  Despite the abundance of food on the ground, the horse prefers to keep offering behavior, and getting reinforced for it, rather than just eating the food off the ground. Jesús said this was an example of “real choice” because the same food reinforcement was available whether the horse did the behavior or not.

So far we have looked at how reinforcement and freedom contribute to the idea of providing choice for animals.  But there’s another consideration, and that’s hidden within the idea of repertoire size.

Restraint, constraint and the effect of repertoire size

Positive reinforcement trainers are usually very aware of the effect of restraint on animals and try to train under conditions in which the animal is not physically restrained.  They are also aware of the effect of constraint, but it seems to get less attention.   Constraint is when we control the “goodies” or limit the animal’s ways to earn them.

Jesús said it was important to avoid constraint in training.  Constraint can be physical, which means that the animal is in an environment where it might be “free” to move, but there are very few options for things to do.  A Skinner box is an example of an environment where the animal is constrained because the number of behaviors it can do is quite limited.

But, constraint is not always physical. It can also be more about skills or the repertoire of behaviors that are available to an individual. You could say this is a kind of “mental” constraint where the individual feels it has few options because it is only comfortable doing a few things.

He used some human examples to illustrate this. For example,  a person who has several skills has more freedom from constraint than someone who is only good at one thing.   If you are good at debating, dancing, and social interactions, then you can go to a debate, dance or eat lunch with your friends. If you are only good at debating (even if you’re really good at it), but you lack other skills, especially social ones that are important for many activities, then you are constrained by your own repertoire.

He called this being coerced by available behavior, because your options are limited if you have a limited repertoire. At the end of the presentation, Joe Layng made the comment that “feeling” free is actually being free. If you only have one way of getting the extraneous consequence, then you are still limited. Joe’s comment reminded me of Steve’s point that perception, not reality, drives behavior.  It’s always interesting to see how these things come together.

So, one way to limit constraint and increase choices in animal training is to increase the animal’s behavioral repertoire.  This gives them more choices on several different levels because they have more options for reinforceable behavior overall, and it also may make it possible for you to give them more options at any given time.

To summarize:

Choice is not just about providing reinforcement or about removing aversives.  It’s about providing the animal with opportunities to earn reinforcement in many ways and increasing the animal’s repertoire so it has the skills and opportunities to practice many different behaviors.

Remember that:

  • A small repertoire = more constraint
  • limited skills = limited opportunities

If our goal is to increase freedom, then we need to be aware that individuals can be constrained by the available environment and available behavior.

Jesús ended with the question, “If my dog will only walk with me when I have treats, why is that?”  

That’s kind of a loaded question and if I didn’t know Jesús was in favor of using treats, I might think he was suggesting that using treats was a problem.
But I don’t think he was saying that at all. I think he was just encouraging the audience to think about what it means if your dog will only walk with you if you have food.  What does that say about the choices he is making?  Does it tell you anything about how much you might be using food to limit his choices, not to give him more choices?

At the end of Jesús’s presentation, I found myself pondering the practical application of the material he presented.  Yes, I love the idea of giving animals choices and I know from personal experience that adding reinforcement is not the same as giving choices. But, I was thinking hard about what training would look like if several behaviors were all capable of receiving the same amount of reinforcement. The whole idea of clicker training is that we can select out and shape behavior by using differential reinforcement.

So, what would happen if you had several behaviors that could earn equal reinforcement? Well, lucky for me, Ken Ramirez’s presentation later in the day was on this exact topic.

Ken Ramirez:  Teaching an animal to say “No”

In his presentation, Ken shared some training that he did with a beluga whale who had become reluctant to participate in training sessions.  He started by saying that while his talk is titled, “Teaching an animal to say ‘No,'” he realizes that that phrase is just a convenient way to describe what they did, and is not necessarily how the animal perceived the training.

He spent a little time talking about terms like “no” and “choice.” They are labels we give to ideas so we can talk about them, but that’s not useful unless we make sure we are all using them in the same way, or have a common reference point.  He shared what he means by teaching “no,” choice, and how the two are related.

What is “no”?

  • Teaching “no” means teaching another reinforceable behavior, one that the animal can choose to do instead of the behavior that has been cued. In the example he’s going to share, they taught the whale that she could always touch a target for reinforcement.
  • Teaching “no” is different than teaching intelligent disobedience, which is more about teaching the animal that some cues override other cues. It’s also different than a Go, No-Go paradigm (hearing test where you don’t respond if you don’t hear the tone), or an “all clear” in scent detection which is just a new contextual cue.
  • We can only guess why the animal chooses to say “no.”  When the whale did touch the target, they had no way of knowing if she was doing it because it was easier, had a stronger reinforcement history, she didn’t want to do the other cued behavior, or …
  • But, regardless of why she chose it,  the value was that it gave her another option besides responding to the cue or ignoring the cue.  Tracking her “no” responses had the added benefit of allowing the trainer to gather information about her preferences and whether or not there was any pattern to when she chose to say “no.”

What is choice?

(these points should be familiar, as they are very similar to what Jesús discussed)

  • It is hard to define
  • Arguably, no situation ever provides true choice (there are always consequences)
  • In “true choice,” the animal has the option to receive reinforcement in more than one manner (this goes along with Jesús’s point about true choice being where there are multiple ways to earn the same reinforcement).
  • It is about controlling outcomes

Choice Matters:

  • Real choice is rare
  • Choice is often forced (meaning it is limited, or only one option has a positive consequence)
  • Choice is a primary reinforcer (animals can be reinforced by the opportunity to control their own environment)

Choice in animal training:  a little history

The introduction of positive reinforcement training into zoos and other animal care facilities made it possible for trainers to choose training strategies that allowed their animals more choices.  In the beginning, it may just have been giving the animal a choice between earning reinforcement or not, but over time the training has gotten more sophisticated so that animals have more choices and can actively choose their level of involvement in training sessions.

Ken had some video clips that showed the ways that trainers in zoos can provide choice during husbandry behaviors.  One common practice is to teach stationing, which can be used to teach animals to stay in a specific location for husbandry behaviors.  The animal can choose whether or not to participate by either going to the station, or not.

Another option is to teach the animal an “I’m ready” behavior, which the animal offers when it is ready to start or continue. The trainer does not start until the animal offers the behavior, and she may pause and wait for the animal to offer it again, at appropriate intervals during the session, to make sure the animal is ready to continue.  Some common “I’m ready” behaviors are chin rests, targeting, the bucket game (Chirag Patel), and stationing.  These methods give the animal some choice because the animal is taught a specific way to tell the trainer whether he wants to participate or not.

Teaching stationing and “I’m ready” behaviors are examples of ways that trainers can give their animals more choices.  Teaching these kinds of behaviors usually leads to training that is more comfortable and relaxed for both the trainer and the learner.  A side benefit is that the trainers become much more skilled at observing their animals and paying attention to body language. And, they learn to wait until the animal is ready, which is always a good thing!

Husbandry behaviors can be unpleasant, but allowing the animal to control the timing and pace of the training can make a big difference in how the animal feels about what needs to be done.   However, this may still be quite different than providing “true choice.”

So, what does “true choice” look like?  This was the main part of Ken’s presentation.

The “No” Project:

The “no” project is the story of re-training Kayavak, a young beluga whale.  Kayavak was born at the Shedd Aquarium and has been trained for her entire life (5+ years) using positive reinforcement.  During that time, she developed strong relationships with several trainers and she would work both for food and for secondary reinforcers, especially tongue tickles.  She was easy to handle and responded well (to criteria and with fluency) to her cues.

In fact, she was so agreeable that they often let the younger and more inexperienced trainers work with her.  But, as she started to have more training sessions with less experienced trainers, and less with more advanced trainers, her behavior started to change. She became less reliable about responding to cues and was likely to just swim away, especially if they were working on medical behaviors.

This continued for several years until the “problem” was finally brought to Ken’s attention. By this time the staff was very frustrated and they needed to find a solution. Ken said that part of the reason it took so long for the problem to get to him was that she was handled by different trainers and it took a while for a clear pattern to appear.

Of course, the first thing one wonders is “What happened?”  Looking back, Ken’s best guess is that the change in her behavior was the result of many small mistakes that accumulated over time. None of them were significant events, but added up, they undermined her confidence and made her reluctant to participate in training sessions.

Here are some of the contributing factors:

  • She was trained more and more often by young trainers without strong relationships
  • They misread early, mild signs of frustration, and didn’t adjust
  • They used an LRS (least reinforcing scenario) inappropriately and it became long enough that it was more of a TO (time out).
  • They felt pressure to get behavior and asked for it again after refusal, instead of asking for another behavior, taking a break, or one of many other options.
  • The problem worsened over time and she began discriminating against less experienced or unknown trainers.

The Solution

Ken proposed a unique solution.  He felt that Kayavak needed to have a way to say “no.”  He thought she might be feeling as if she didn’t have any choices, and that was why she would just leave.  But, if she had a way to say “no,” and got reinforced for it, perhaps she would choose to remain and would learn to become more engaged in training again.

He suggested that they teach her to touch a target (a buoy tethered by the pool edge).  Touching the target would ALWAYS be a reinforceable behavior, and would be reinforced in the same way as other behaviors.  This was an important point.  They didn’t reinforce targeting with a lesser value reinforcer. They made sure the reinforcement for targeting was equivalent to the reinforcement for other behaviors.

While teaching “no” might seem like a radical idea, Ken mentioned that he had a few reasons he thought this might work. One was some training he had seen at another aquarium where a sea lion was taught to touch a target at the end of his session so the trainer could leave.  The sea lion learned to do this, but also started touching the target at other times, and seemed to be using it to indicate when he wanted to be done.

The challenge was convincing the staff to try it.  They doubted it would work, because why would she do any of the “harder” behaviors if she could just touch the target all the time? This was a good question, and gets to the heart of what some clicker trainers believe, which is that animals like doing behaviors that have been trained with positive reinforcement.  But, if this is really true, then shouldn’t she be just as happy to do a variety of positive reinforcement trained behaviors instead of just repeating the same one over and over?

Since Ken is the boss, he convinced them to give it a try…

The training

  • Place a buoy close to where she is working and teach her to target it.
  • Practice targeting the buoy until it’s a very strong behavior.
  • Start mixing in some easy behaviors, but ask for targeting the buoy in between.  So they might cue behavior 1 -> click -> reinforce -> target buoy -> click -> reinforce -> behavior 1 (or 2) -> click -> reinforce -> target buoy ->…
  • Increase the number of other behaviors and/or difficulty, still mixing in targeting the buoy on a regular basis.
  • Throughout this process, she can touch the buoy as often as she likes. So if the trainer wants to alternate buoy touches with cued behaviors, and Kayavak offers several buoy touches in a row, she still gets clicked for all of them. If she makes an error, doesn’t get clicked and then touches the buoy, she gets clicked and reinforced.  A buoy touch is always a reinforceable option (see the sketch after this list).
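
The key property, as I understood it, is that the buoy touch is reinforceable no matter what came before it. Here is a minimal sketch of that rule; the cue name “pec wave” is my invention, not something from Ken's talk:

    # A sketch of the rule as I understood it; "pec wave" is an invented cue.
    def reinforceable(offered, cued, met_criteria):
        """True if this response earns a click and equivalent reinforcement."""
        if offered == "buoy touch":
            return True                # saying "no" is always reinforceable
        return offered == cued and met_criteria

    print(reinforceable("buoy touch", cued="pec wave", met_criteria=False))  # True
    print(reinforceable("pec wave",   cued="pec wave", met_criteria=False))  # False (an error)
    print(reinforceable("buoy touch", cued="pec wave", met_criteria=False))  # True, even right after an error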

Evolution of a behavior

Ken showed us how the training progressed. He had some video of her at the various stages and some great charts to show how the number of buoy touches changed over time.  I thought this part was really fascinating because it showed how important it was to allow her to find her own way to use “no” and how challenging it was for the trainers to stick with her through the process!

After 3 weeks:

  • She touched it all the time, after every cue.
  • The staff thought her behavior meant that it wasn’t going to work.

At 4 weeks:

  • She started to work well and only chose the buoy under specific conditions, such as when she was ill, when she was wrong or heard no marker, when asked to do a medical behavior, with a new trainer, or when working with a trainer with whom she didn’t have a good relationship.
  • He had a clip of a training session with a trainer she didn’t like. She would touch the buoy repeatedly, usually doing it before the trainer had time to cue another behavior.
  • I asked Ken if they kept the session length the same, even if all she wanted to do was touch the buoy and he said “yes.”  That was partly because her training session is how she gets fed, but also because touching the buoy repeatedly was not “wrong.” If that’s what she felt like doing that day, that was fine.
  • With a trainer she trusted, she might not touch the buoy if she didn’t want to do a behavior, but would wait for the next cue instead.

At 4 months:

  • There were almost no refusals with experienced staff.
  • She still tests new trainers for a period of time.
  • They did use the buoy during free shaping, but she rarely touched it.  If she did, it was a sign that the trainer was not slicing the behavior finely enough.
  • They can use the buoy to test if she likes a behavior – which one does she do?
  • He had some nice charts showing how her behavior changed (#right answers, buoy touches, refusals).  They showed how she would test new trainers and then over time, the “no” behavior would get offered less and less.

Is this a useful approach?

Since doing this training with Kayavak, Ken has done the same thing with a sea lion and two dogs. They were all cases where the animal had lost confidence in the training.  Having a default behavior that was always reinforceable meant they always had a way to earn reinforcement and it gave them choices.

He did find that, as with Kayavak, once the “no” behavior had been learned, the animals were fairly discriminating in when they used it. They might offer “no” instead of doing a cued behavior if the cued behavior was difficult, uncomfortable, or unknown.  They also might offer it after an error.

Despite his success, he’s not sure you should use it with all animals. Usually if an animal is trained with positive reinforcement, it already has lots of ways to say “no,” so it’s not necessary to teach another one.  It may be more useful to work on your own training or observation skills so you notice the first signs of frustration and can adjust before the animal reaches the point where it needs to say “no.”

There may also be difficulties if you teach it too early because the animal might get “stuck” on that behavior. This point made me think of Jesús and his comments about the danger of having a limited repertoire. Ken thinks it’s better to teach the animal a larger repertoire and then add a “no” behavior if needed, either because the relationship has broken down, or the animal has lost confidence. If you do teach a “no” behavior, it’s important to choose an appropriate one, either one that is useful or is a starting point for other behaviors.

I enjoy Ken’s presentations because he always has the coolest projects and approaches them with a great blend of practical and scientific knowledge.  At some point in his presentation, he mentioned that the “no” project brought together a lot of scientific principles, including matching law, contra-freeloading, Premack, and others.  But he also said that he used what he had learned from observing other trainers, or observing the animals themselves.  I think this project was a great example of how we can give animals more choices as long as we have a well thought out plan and are willing to take the time to see it through.

This is the last of the articles I am planning on writing on the ASAT conference.  I have lots of ideas for what to do with what I learned from the conference, and may blog about some of my own training later this summer. In the meantime, I hope something in these articles has caught your attention and inspired you to go out and try something new.  I want to end by thanking all the speakers for their permission to share my notes. I also want to thank all the ORCA students who work hard to plan and run the conference. They are already busy planning next year’s conference.

 

Notes from the Art and Science of Animal Training Conference (ORCA): Dr. Jesús Rosales-Ruiz on “Conditioned Reinforcers are Worth Maintaining.”


In this short presentation, Jesús Rosales-Ruiz revisited the question:

“Do I have to treat every time I click?”

He said that this question constantly comes up and that different trainers have different answers.

Before I share the details of his presentation, I want to mention that he said he chose to use the words “click” and “treat” because he was trying to avoid using too much scientific jargon.    But, as he pointed out at the end of his talk, it would be more accurate to say “click and reinforce,” and probably even more accurate to say “mark and reinforce.”

Since he used “click and treat,” I’m using the same words in these notes, but you should remember that he is really looking at the larger question of how we use conditioned reinforcers and whether or not they always need to be followed by a primary reinforcer in order to maintain their effectiveness.

Back to the question…

Do you have to treat after every click?

Some say YES:

  • Otherwise the effectiveness of the click may be weakened
  • Bob Bailey says: “NEVER sound the bridging stimulus idly (just to be ‘fiddling’) or teasing…it’s important that the “meaning” of the bridging stimulus is kept unambiguous and clear. It should ALWAYS signify the same event- The primary reinforcer.”  How To Train a Chicken (1997) Marian Breland Bailey, PhD and Robert E Bailey.
  • This view is supported by research that shows that the conditioned reinforcer should be a reliable predictor of the unconditioned reinforcer.

Some say NO:

  • Once a click is charged, you only have to treat occasionally
  • Once a behavior is learned, you only have to treat occasionally
  • Supported by research on extinction (in general, this means that if an animal learns that not every correct answer is reinforced, then it will keep offering the correct answer for some period of time, even if there’s no reinforcement).

So maybe there is research to support both positions.

He said that he started thinking about this question again after reading a blog by Patricia McConnell, who was sharing some thoughts on whether or not to treat after every click. She was wondering why clicker trainers recommend it, but other positive reinforcement trainers do not.

Patricia McConnell wrote:

  • “For many years I have wondered why standard clicker training always follows a click with a treat.”
  • “Karen Pryor strongly advocates for us to reinforce every click (secondary reinforcer) with a treat (primary reinforcer). Ken Ramirez, one of the best animal trainers in the world, in my opinion, always follows a click with a treat.”
  • “But Gadbois went farther, given the link between motivation and anticipation, suggesting that it was important to balance the “seeking” and “liking” systems, with more emphasis on the former than the latter during training. He strongly advocates for not following every click (which creates anticipation) with a treat, far from it, for the reasons described above.”

You can read the blog at: http://www.patriciamcconnell.com/theotherendoftheleash/click-and-always-treat-or-not.

If you have not heard of Simon Gadbois, you can read about him here: https://www.dal.ca/academics/programs/undergraduate/psychology/a_day_in_the_life/professors/simon-gadbois.html.

What happens if you don’t treat after every click?

Jesús was intrigued by Gadbois’s statement that you don’t want, or need, to treat after every click because you want to balance “liking” with “seeking,” and that if you don’t treat after every click, you get more seeking.

One reason for his interest was that he already knew of an experiment that had been done to look at what happens if you don’t follow every click with a treat.  About 10 years ago, one of his students wanted to compare how the behavior of a dog trained under conditions where one click = one treat was different than a dog that was trained with multiple clicks before the treat.   The two conditions looked like this (a small sketch follows the list):

  • one click = one treat:  The trainer clicked and treated as normal after every correct response:  cue -> behavior -> click -> treat -> cue -> behavior ->click -> treat.
  • two clicks = one treat:  The trainer clicked for a correct response, cued another behavior and clicked and treated after the correct response: cue -> behavior -> click -> cue -> behavior -> click -> treat.

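To make the arithmetic of the two schedules concrete, here is a tiny sketch; the number of correct responses is arbitrary, just to show the ratio:

    # Arbitrary numbers: for the same count of correct responses,
    # the two-clicks condition delivers half the food.
    def treats_earned(correct_responses, clicks_per_treat):
        return correct_responses // clicks_per_treat

    print(treats_earned(20, 1))   # one click = one treat   -> 20 treats
    print(treats_earned(20, 2))   # two clicks = one treat  -> 10 treats
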
These dogs were tested by asking for previously trained behaviors. Each dog was trained under both conditions so some training sessions were under one click = one treat and some were done under two clicks = one treat.  There were multiple reversals so the dogs went back and forth between the two conditions several times over the course of the experiment.

Under the one click = one treat condition, the dogs continued to perform as they had in training sessions prior to the start of the experiment. Under the two clicks = one treat condition, both dogs showed frustration behaviors, deterioration in behavior and at times the dog would leave the session.

There were many factors that could have contributed to the result, including the fact that the dogs were originally trained under one click = one treat, that the reversals themselves could have caused confusion, and that the dogs might have done better if they had been transitioned more gradually.  But it was pretty clear that omitting the treat did not activate the seeking system; instead, it created frustration. Why?

They considered two possibilities:

  • Perhaps because they were getting less food? Under the one click = one treat condition, each dog was getting twice as much food reinforcement as it did under the two clicks = one treat condition.
  • Properties of the click had changed.  What does the click mean to the dog?

Can we test if it’s about the decrease in food reinforcers?

If you want to test what happens when you click without treating, you have to change the ratio of clicks to treats. You can do that by omitting some treats, or by adding some clicks. But both options are probably not going to be perceived in the same way by the animal.

In the experiment described above, the trainer changed the ratio of clicks to treats by omitting food reinforcers after half the clicks. This is a significant decrease in the number of primary reinforcers that the dog was receiving. Could the results be more about the reduction in food reinforcers, than about whether or not each click was followed by a treat?

One way to test this would be to keep the number of food reinforcers the same, but add another click.  To do this, the trainer taught the dog to do two behaviors for one click.  The dog would touch two objects. When he touched the second object, he would get clicked and treated.

Once this behavior had been learned, the trainer decided to add another click by clicking for the first object, clicking for the second object and then treating. So the pattern would be behavior (touch) -> click -> behavior (touch) -> click -> treat. This works out to clicking after every second behavior, but the trainer got there by adding a click, not by removing a treat.

What she found was that the dog just got confused.  The dog would orient to the trainer on the first click, get no response, go back to the objects and touch again (either one).  Or he might just wait and look at the trainer, or he might leave. The additional click didn’t seem to promote seeking. Instead it interrupted the behavior and created confusion.

Why?  Well, perhaps it has to do with the two functions of conditioned reinforcers. This goes along with the second point above, which is that the difference was due to how the click was being used.

The 2 Functions of Conditioned Reinforcers:

Let’s take a moment and look more closely at conditioned reinforcers.  Conditioned reinforcers are stimuli that become reinforcers through association with other reinforcers.  They usually have no inherent value. Instead, their value comes from being closely associated with another strong reinforcer for a period of time (while it is being “conditioned”), and this association must be maintained through regular pairings in order for the conditioned reinforcer to retain its value.

In training, this is usually done by deliberately pairing the new stimulus with a primary reinforcer.  There are different kinds of conditioned reinforcers and their meaning and value will depend upon how they were conditioned and how they are used.  Marker signals (the click), cues, and keep going signals (KGS) are all examples of conditioned reinforcers.

Regardless of the type, all conditioned reinforcers have two functions. They are:

  • Reinforcing
  • Discriminating (they can function either as cues or event markers, or both)

Conditioned reinforcers are not just used in training and laboratory experiments.  They are everywhere.

Jesús used the example of a sign, which is a conditioned reinforcer for someone driving to a specific destination.  Let’s say you are driving to Boston and you see a sign that says “Boston, 132 miles.” The sign provides reinforcement because it tells you that you are going the right way. It also has a discriminatory function because it provides information about what to do next, telling you to stay on this road to get to Boston.

When talking about conditioned reinforcers, it’s easy to focus on only one of these functions.  Is this why there is confusion?  Perhaps the debate over whether or not to treat after every click is because some trainers are focused on the discriminating function of the click and others are focused on the reinforcing function of the click?

What does training look like if the focus is on the discriminating function?

When every click is followed by a treat, the click has a very specific discriminating function. It tells the animal it has met criteria and reinforcement is coming.  The trainer can decide what she wants the animal to do upon hearing the click (stop, go to a food station, orient to the trainer). But, regardless of which behavior she chooses, the click functions to cue another behavior, which is the start of the reinforcement process.

A lot of one click = one treat trainers emphasize the importance of the click as a communication tool.  There are two aspects to this. One is that it marks the behavior they want to reinforce and the other is that it tells the animal to end the behavior and get reinforcement. If the click is always followed by a treat, the meaning of the click remains clear and it provides clear and consistent information to the animal.

You can think of the click -> treat as part of a behavior chain, where the click has both a reinforcing function, from the association (click = treat), and also an operant function (click = do this).  Clicker trainers who promote the one click = one treat protocol still recognize that the click itself has value as a reinforcer, but they choose to focus on the click as an event marker and as a cue, more than as a reinforcer.

What does training look like if the focus is on the reinforcing function?

A lot of trainers who treat intermittently (not after every click) emphasize that the click is a reinforcer in itself, so it’s not necessary to also provide a treat after every click. They are looking at the reinforcing function of a conditioned reinforcer and would argue that the whole point of having a conditioned reinforcer is so that you don’t have to follow it with another reinforcer every time.

They are still using the discriminating function of the click because it can be used to mark behavior.  But, the click does not become an accurate predictor of the start of the reinforcement phase, so it is not going to have the same cue function as it does under the one click = one treat condition.

Jesús did mention that if the click is not a reliable cue for the start of the reinforcement process, then the animal will look for a more reliable way to tell when it will be reinforced. In most cases, the animal finds a new “cue” that tells it when to expect reinforcement and the click functions as a Keep Going Signal. If the animal can’t find a reliable cue for the start of reinforcement, or if it’s not clear when the conditioned reinforcer will be followed by reinforcement, and when it won’t, then he will get frustrated.

Back to the Literature…

With this information in mind, what can we learn by going back and looking at the research on conditioned reinforcers?  Well, it turns out that the literature is incomplete for several different reasons:

  • It doesn’t look at the cue function of the conditioned reinforcer.
  • Animals in the lab are often restrained or constrained (limited in their options) so the cue function of the conditioned reinforcer may be more difficult to observe.
  • It doesn’t take into account that the most consistent predictor of food is the sound of the food magazine as it delivers the reinforcement.   Even when testing other conditioned reinforcers, the sound of the food magazine is what predicts the delivery of the food reinforcement, and it’s on a one “sound” = one “treat” schedule.
  • To test a conditioned reinforcer that was sometimes followed by food and sometimes not, you would have to use two feeders, one with food and one without, and even then you would have to worry about vibrations. Most labs are not set up with two feeders, so this work has not really been done.

 
He also mentioned that a lot of what we know about conditioned reinforcers in the lab is from research where the conditioned reinforcer was used as a Keep Going Signal (KGS), and not as a marker or terminal bridge.

I asked Jesús if he had an example of an experiment using a conditioned reinforcer as a KGS and he sent me an article about a study that looked at the effect of conditioned reinforcers on button pushing in a chimpanzee.

The chimpanzee could work under two different conditions. In one condition, he had to push the button 4,000 times (yikes!) and after the 4,000th push, a light over the hopper would flash and his food reinforcement would be delivered. In the other condition, he also had to push the button 4,000 times, but a light would flash over the hopper after every 400 pushes, and then again at the end when the food was delivered after the 4,000th push.

The chimpanzee was tested under both conditions for 31 days and the results showed that he worked faster, and with fewer pauses on the way to the 4,000th push, under the condition where he was reinforced by the flashing light every 400 pushes.

Once the chimpanzee had been tested under both conditions for 31 days, they started the second part of the experiment.  In this part, the chimpanzee could choose the condition (by pressing another button) and he usually chose the one where the light flashed after every 400 pushes.
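
Here is my reconstruction of the two conditions as event streams, assuming I have the details right (4,000 pushes, a flash every 400 pushes in the KGS condition, and a terminal flash with food in both):

    # My reconstruction of the two conditions, not the original apparatus.
    def session(total=4000, kgs_every=None):
        events = []
        for push in range(1, total + 1):
            events.append("push")
            if kgs_every and push % kgs_every == 0 and push < total:
                events.append("light")      # KGS flash, no food yet
        events.append("light + food")       # terminal flash with food delivery
        return events

    plain = session()                 # feedback only after push 4,000
    kgs = session(kgs_every=400)      # a flash after every 400 pushes
    print(plain.count("light"), kgs.count("light"))   # 0 vs. 9 intermediate flashes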

So, having a Keep Going Signal improved the speed at which the chimpanzee completed the 4,000 pushes, and it was also the condition the chimpanzee preferred.  This suggests that Keep Going Signals can be useful and that an animal may prefer to get some kind of feedback.

In this experiment, the conditioned reinforcer they were testing (the flashing light) was functioning as a KGS and the sound of the food magazine was what told the chimpanzee that he had met criteria.  So, this is an interesting experiment about conditioned reinforcers as Keep Going Signals, but it also shows the difficulty of separating out the conditioned reinforcer from the stimulus that predicts food delivery.

An example of training a KGS with a dog

Jesús talked a little bit more about Keep Going Signals, using an example from one of his own students. She wanted to teach her dog a new conditioned reinforcer that she could use as a KGS. She started by teaching the dog to touch an object for a click and treat. Once the dog had learned the behavior, she said “bien” (her new KGS) instead of clicking, and waited for the dog to touch the object again. If the dog repeated the touch, then she would click and treat.

She was able to use the KGS to ask the dog to continue touching an object and I think she tested it on other objects. You do have to train a KGS with multiple behaviors in order for it to become a KGS, as opposed to a cue for a specific behavior. I don’t know if she tested it with other behaviors, but that would be the next step. I’m also not sure if they compared the dog’s performance, with and without the KGS, to see if adding a KGS increased the dog’s seeking behavior, as Gadbois had suggested it would.
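
As I understood it, once “bien” was in play, a rep looked like touch -> “bien” -> touch -> click -> treat. A minimal sketch of that sequence, with a stand-in for the dog's response:

    # A stand-in for the dog's response lets the sketch run; in real
    # training, of course, the dog decides.
    def kgs_rep(dog_touches):
        """One rep: touch -> 'bien' -> touch -> click -> treat."""
        events = []
        if dog_touches():
            events.append("bien")              # KGS: keep going, no treat yet
            if dog_touches():
                events += ["click", "treat"]   # terminal marker and reinforcer
        return events

    print(kgs_rep(lambda: True))   # ['bien', 'click', 'treat']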

Conclusion

The difficulty with the question “Do I have to treat after every click?” is that the answer depends upon how you are using the click and whether or not it cues the animal to “end the behavior” and expect reinforcement. Conditioned reinforcers have two functions. They function as reinforcers and as discriminators, and you need to consider these functions when choosing how to use the click.

If you are using the click as a Keep Going Signal, the animal learns to continue after the click and the click does not interrupt  the behavior.  This means you can click multiple times before delivering the terminal reinforcer. However, it’s likely that you will end up having a different cue that tells the animal when it has completed the behavior and can expect reinforcement. If you don’t, the animal may become confused about what it should do when it hears the click.

If you are using the click to indicate when the behavior is complete, the animal learns that the click is a cue to start the reinforcement process.  You can teach the animal a specific response to the click so that the animal knows what to do to get his reinforcement. If the click is being used in this way, then it will interrupt the behavior and you will want to wait until the behavior is complete before clicking.

We call both these types, the click as a KGS and the click as an “end of behavior” cue, conditioned reinforcers, but they are not the same thing. There are many kinds of conditioned reinforcers, and when you are not specific, it’s easy to think you are talking about the same kind, but you are not.  So both “camps” may be right, but for the wrong reasons.

Jesús finished by saying we need to study this more carefully in the laboratory and also in real life training situations.  One point he made was that an animal that initially learned one click = one treat could probably be re-trained to understand that the click was a KGS, if the transition was done more slowly than it was for the dogs in his student’s experiment, but he still thinks it would change the meaning of the click from an “end of behavior” cue to a “keep going signal.”

I thought this was a very interesting talk, partly because it shows how important it is to clearly decide how you are going to use conditioned reinforcers and to make sure that you teach your animal what it means. I don’t think it was intended to be the final word on a complicated subject, but the presentation certainly made me more aware of the importance of thinking about the many functions of conditioned reinforcers and how I am using them.

But… I’m not sure it left us with an answer to the question of what happens when the same conditioned reinforcer is used both as a KGS and to end the behavior, which is how many trainers describe their practice of clicking multiple times before delivering the terminal reinforcer. That dual use still needs to be researched.

A few personal thoughts

This presentation was informative, and made me feel more confident about the system I use, but it also left me with some unanswered questions.

I have always followed a click with a treat. It is how I originally learned to clicker train and it has worked well for me. If I want to use a different reinforcer, I have a different marker. If I want to provide information or reinforcement to the horse without interrupting the behavior, I have several other conditioned reinforcers I can use.

It’s never made sense to me to have the same conditioned reinforcer sometimes be a cue to “end the behavior” and sometimes be a cue to “keep going.” I question if that’s even really possible, unless the animal learns it has different meanings under different conditions, and that seems a bit awkward. It just seems simpler to have clearly defined conditioned reinforcers and use them in a consistent manner.

I was intrigued by the research into Keep Going Signals. I do use Keep Going Signals and have found them to be useful. But I have also found that I have to pay attention to maintaining them in such a way that they retain their value (through pairing with other reinforcers), but don’t become reliable predictors of reinforcement and morph into “end of behavior” cues. I’d love to see more research on how to effectively maintain Keep Going Signals, as well as some research on how effective they are at marking behavior.