I was fortunate to be able to attend Clicker Expo in Lexington, Kentucky in March 2008. This was my third expo and like the ones before, it was a great experience. I met some very nice people, learned a lot, and had the opportunity to share ideas with other clicker trainers. I thought I would share some of the highlights of my experience because I think that this expo generated a lot of new ideas for me about how to clicker train horses more efficiently and more creatively. I need to point out that Clicker Expo runs for 3 days with 3 time blocks of lectures and labs each day. Within each time block, there are five choices for what to attend, so every attendee puts together his or her own program and the sessions I am going to describe are just those that I attended.
Friday morning starts with an opening session where Aaron Clayton presents some information on who is attending (what states we come from, what kinds of jobs) and then the faculty are introduced. Don’t hold me to these numbers but someone told me there were about 400 attendees with 150 dogs. The sessions I attended are listed below.
- Ken Ramirez: “Working for the joy of it: A Systematic Look at Non-Food Reinforcers”
- Steve White: “No Problem! Solve any Training Issue in Four Steps”
- Jesus Rosales-Ruiz: “Broken Clicks: How Reinforcer Delivery Impacts Learning”
- Ken Ramirez: LAB: “Next! Finding and Using New Reinforcers”
- Morten Egtvedt and Cecilie Koeste: “Reliability: Thy Name is Backchaining”
- Kathy Sdao: LAB: “What a Cue can do in Action: cue control”
- Michelle Pouliot: “Training Guide Dogs for the Blind”
- Jesus Rosales-Ruiz: “The Poisoned Cue Anew”
- Theresa McKeon: LAB: “TAGteach in Action”
Ken Ramirez: “Working for the joy of it: A Systematic Look at Non-Food Reinforcers.”
Ken Ramirez is the vice president of animal collections and training at the Shedd Aquarium in Chicago and he is always a good presenter with a nice mix of stories and information. The focus of his talk was how to condition and use non-food reinforcers to provide reinforcement variety in your training. He started by explaining the difference between primary and secondary reinforcers. Primary reinforcers are those things that inherently satisfy a biological need and include food, water, shelter, air, reproduction, and safety. Secondary reinforcers are reinforcers that acquire their value through association with a primary reinforcer. There are lots of different kinds of secondary reinforcers including event markers and keep going signals. For the purpose of his discussion, he was focusing on what he called “reinforcement substitutes” which are reinforcers that a trainer can use in place of a primary reinforcer. I want to point out that this session was an “advanced” level one and he was very clear that novice trainers should not be using reinforcement substitutes without supervision or guidance from more experienced trainers, although there are exceptions.
Reinforcement substitutes are useful for trainers for those times when a primary reinforcer cannot be used either because it is not available or because the animal does not want it. He explained how they used reinforcement substitutes with one of the dolphins when it was sick and would not eat. They needed to feed her medicated fish so they actually got her to eat the fish by using reinforcement substitutes in training sessions and including some fish eating as a desired behavior. One of the points he made with this example is that the value of reinforcers can change so it is useful to have several reinforcers available to give the trainer a choice of what to use in any given situation.
He presented detailed information on how to condition reinforcement substitutes. He uses classical conditioning to associate the new reinforcement substitute with a primary reinforcer. The procedure starts with choosing a reinforcement substitute you want to use. He gave us some guidelines for choosing these. It is easier to start with something you think the animal might find reinforcing, but you can start with something neutral. He does not recommend starting with something the animal finds aversive. And he is very specific that even if you choose something that you already think the animal finds reinforcing, you still need to go through the conditioning process, because the fact that an animal likes something doesn’t mean it will work for it. Later, he mentioned that one of the cautions about using reinforcement substitutes is that in order for them to be strong reinforcers, you need to control access to them. Therefore if your animal has a behavior that it finds strongly reinforcing, and you want to condition it as a reinforcement substitute, you may need to limit access to that reinforcer under some situations.
The trainer is going to start associating the reinforcement substitute with the primary reinforcer by presenting the reinforcement substitute and then immediately presenting food. He does not click because he does not want the animal to think it needs to “do anything” to earn the primary reinforcer. At this point the reinforcement substitute is just indicating that primary reinforcement is coming. He does this for many sessions over weeks, months and maybe years, as long as it takes for the animal to make the association. This is the first step in the process.
The second step is to start using the reinforcement substitute in training and see if the animal accepts it. He had some general rules to follow. Each step is repeated multiple times.
1. Start with an easy, well established behavior. For the first reps, ask for the behavior, click, present the new reinforcement substitute, and follow with the primary.
2. Mix in some times where you follow the click with the reinforcement substitute and NO primary. Only do this 3x max per session.
3. Now start using the reinforcement substitute after a harder but well established behavior, by asking for the behavior, clicking, offering the new reinforcement substitute, and following with the primary.
4. Start following a harder but well established behavior with the reinforcement substitute only. Again, only do this 3x max per session.
5. Increase use of reinforcement substitutes but do not use more reinforcement substitutes than primary reinforcers. His general guideline is 20/80 max but this can be over multiple sessions so you can have some sessions where you use more reinforcement substitutes than others.
6. You must continue to pair the reinforcement substitute with the primary reinforcer on a regular basis to keep the association strong.
7. With beginners, he had more specific rules for introducing reinforcement substitutes gradually. These were: never use an RS (reinforcement substitute) after 2 consecutive behaviors, avoid using the same RS twice in a row, always ask for behavior followed by a primary reinforcer more often than a secondary, and keep the association strong by reinforcing the reinforcement substitute regularly.
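For anyone who keeps a training log, the ratio guideline in step 5 can be tallied with a few lines of Python. This is only a rough sketch: the function names and the log format (a pair of counts per session) are made up for illustration, and treating “20/80” as “no more than 20% of reinforcements are substitute-only” is one reading of the guideline, not something Ken spelled out.

```python
def substitute_share(sessions):
    """Fraction of all reinforcements that were substitute-only.

    sessions is a list of (primary_count, substitute_only_count) pairs,
    one pair per training session (a hypothetical log format).
    """
    primary = sum(p for p, _ in sessions)
    substitute = sum(s for _, s in sessions)
    total = primary + substitute
    return substitute / total if total else 0.0


def within_guideline(sessions, max_share=0.20):
    """True if the share across ALL sessions stays at or under max_share.

    The 0.20 default is an assumed reading of the "20/80 max" guideline.
    """
    return substitute_share(sessions) <= max_share


# One session on its own can run above 20% as long as the total does not.
log = [(7, 3), (17, 3)]  # 24 primaries, 6 substitute-only reinforcements
print(substitute_share(log))
print(within_guideline(log))
```

Because the counts are summed across sessions before the share is computed, a heavier session can be balanced by a lighter one, which matches the point that the guideline can be met over multiple sessions.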
He showed how to use reinforcement substitutes in chains and take advantage of Premack. He gave examples of chains where there were very few primary reinforcers and how this initially looks like a variable reinforcement schedule. But included in the chain, he had behaviors that had been conditioned as reinforcement substitutes so when you identified those in the chain, it showed that the animal was getting steady reinforcement.
When talking about introducing reinforcement substitutes, he said that it is important to keep in mind the animal’s expectations. This means that if you have used the same reinforcement for every click, the animal expects that type of reinforcement. If you now try adding in reinforcement substitutes, you have to be very careful to avoid frustration because the animal has expectations about what follows a click. I found this particular detail very relevant because 99% of the time, I reinforce my horses with food. If I want to start adding in substitute reinforcers, I need to do it slowly and read my animal to make sure I don’t create frustration or confusion.
He also pointed out that not everyone needs to use reinforcement substitutes and not every animal does well with them. He wanted to share how he conditions them because he finds that people are using them whether they are aware of it or not, and he finds that behavior can break down if the trainer is depending upon reinforcement substitutes that have not been systematically conditioned and introduced.
On a personal note, this session had some “ah-ha’s” for me. I found the discussion on expectations very interesting because I have run into some of that and not been quite sure how to address it. I also found it added to my understanding of how to create stronger chains. From Kathy Sdao, I had gotten the piece that cues are reinforcers and they help to keep chains strong. From Ken, I got a better idea of how to mix in reinforcement substitutes as components of a chain to further strengthen it and to avoid having to keep clicking, stopping and treating as I am working my horse.
Steve White: “No Problem! Solve any Training Issue in Four Steps”
Steve White trains police dogs, but recently he has been working with people and he found that some of the problem solving strategies that people use in their jobs work well for animal trainers too. He presented several models for how organizations use problem solving (OODA loop, SWOT, drill down, IDEAL, SARA) and his own DIP-IT.
The components of DIP-IT are Define the problem, Isolate the problem, Plan your remediation, Implement your plan and Take another look. These were the basic steps and could be used for solving problems in current behaviors or training new behaviors. He pointed out that teaching new behaviors is as much about problem solving as is working on modifications to existing behaviors. Some of the problem solving pitfalls that people encounter are focusing on the problem, asking too much, working too long and being short-sighted. He reviewed the 10 rules of shaping (Karen Pryor) and pointed out that he believes the ultimate solution to many training issues is to change the motivation.
He believes that many people have problems because they stop training too soon. Good trainers overtrain. They also train equal and opposite behaviors. The example he used for this was a DVD clip of a police dog learning to be called off the person wearing the sleeve. I saw this movie last year and it made a big impression on me. In the movie, they have a police dog that is happy to go out and grab the “bad guy” but ignores requests to let go or to abort his run toward the victim if he is called off. To address this issue, they sent the dog toward one person who offered no reinforcement by remaining passive as the dog approached. They called the dog off and at the same time, they presented a new “bad guy” and sent the dog to him. To do this, the dog had to leave the first “bad guy” and go to the new “bad guy” who gave the dog a good reinforcement by being very reactive to the dog’s “attack.” The dog learned to listen to the handler because the handler always knew where the best reinforcement was. When I saw this last year it made a big impression on me about how we can teach our horses that listening to us is the best way to earn reinforcement instead of getting carried away offering their own favorite behaviors.
He showed us a training form, the taproot form, that he uses with his dogs and students. It is a way of keeping data on what behaviors are being trained so that the most important behaviors get the time they need. He has examples on his web site in the libraries section (www.i2ik9.com). The form has boxes for behaviors with the most important ones in the middle and additional behaviors on each side. The trainer works on a behavior in the middle and then can go to one of the “side” behaviors but then has to return to a middle behavior.
He ended with some general thoughts on training. Training is a mixture of art and science. If you want to break the “rules,” then you need to know the rules. It is important to be creative and think outside the box, especially with problem solving, where most people are tempted to lose their sense of where they are in the training process. He pointed out that knowing where you are is very important and you have to be realistic about it. He compared it to using Mapquest to get directions. Mapquest doesn’t care where you have been, it just wants to know where you are now.
Jesus Rosales-Ruiz: “Broken Clicks: How Reinforcer Delivery Impacts Learning”
Jesus Rosales-Ruiz and his graduate students are studying various aspects of operant conditioning and clicker training. He did a series of experiments to see how a delay in the presentation of the treat after the click affected the animal’s behavior. Before he described the studies, he went over some basic information about the use and selection of the click (or other marker signal) as a conditioned reinforcer.
He believes that the clicker has two functions. It serves as a marker signal to strengthen and select behavior. It also serves as a cue to tell the animal to go collect its reinforcement. To illustrate this, he went back to some basic information about how to condition the clicker from BF Skinner. He showed how the click ends up being the first step in a chain which is click, approach (or turn head), eat. The conventional thought has been that you pair the clicker with the food using classical conditioning, but he thinks that it is more useful to think that we are teaching that the clicker is a cue to approach the trainer with the expectation of getting reinforcement. With this in mind, he explains why it is so important to maintain the one to one ratio of clicks to treats (or other reinforcement).
If the trainer is not consistent about reinforcing after each click, then the use of the clicker as a cue that reinforcement is available is no longer reliable. This matters because if the click is no longer a reliable predictor of food, the animal is going to spend time trying to identify other predictors of food and you can end up with an animal that is easily distracted and spends a lot of time looking at the handler for information he should be getting from the click. This is inefficient and adds a level of confusion to the training process. He had some movie clips of dogs trained where they were not treated after every click and in both cases the dogs got confused, showed frustration behavior and their level of performance decreased significantly.
The use of treatless clicks also poisoned the whole training experience so that their general attitude and enthusiasm for the training sessions decreased so much that they no longer enjoyed the training sessions. When the student was first training them using a one click:one treat ratio, the dogs were excited to get started. When she switched to two clicks: one treat, they would avoid the student when she came and the owner had to go get them for the sessions.
He then did another experiment with adding in extra clicks during a training session. His student trained a dog to touch two foot targets by using a one-click:one-treat ratio. The dog was focused on the task and very prompt and clear about what behavior it needed to do. The student then added in a click for the first target touch, but still clicked and treated after the second target touch. The dog did keep working but she seemed to get confused. She looked to the handler more often than before and there were hesitations as she went from one target to another.
They did another experiment that was similar, but instead of using a treatless click to mark the first target touch, they used the word “bien.” This worked out ok. In this scenario, the word “bien” was being used as a keep going signal and it did not interfere with the one-click:one-treat ratio. The dog seemed to interpret it as additional information that she was working correctly although I did not see any big difference between the behavior as trained with only a click for completion of both target touches vs. trained with “bien” as a keep going signal.
The remainder of his talk focused on what happens when there is a delay between the click and treat. According to his earlier statement, the click acts as both a marker signal and the start of a chain. What happens if other behaviors creep into the chain when there is a delay between click and treat? He showed results of a targeting experiment with sheep. They asked the sheep to target and either reinforced immediately or delayed for 5, 10, and 20 seconds and measured how much other behavior occurred between click and treat. As the delay increased, more frustration behavior (hoofing, nibbling the target stick, walking away) happened between the click and treat.
He also shared how Virginia Broitman and Sherri Lipman did the same experiment with their very clickerwise dogs and also found that the dogs repeated behaviors that occurred between the click and treat. Not only that, but after the experiment was over, they were unable to completely eliminate the frustration behaviors that those dogs had inadvertently gotten reinforced for doing. The same types of changes in behavior were shown on movie clips studying the effect of a 5 second delay on targeting behavior in monkeys. In all cases, the monkeys’ behavior deteriorated. Not only did they add a lot of superstitious behavior between click and treat, but after receiving the treat, they did not immediately go back to the target. Some of the behaviors they showed were submissive or frustration behaviors.
From these studies, he has concluded that while it is common to say that the clicker allows us to have a delay between click and treat, this is not as straightforward as it seems. A delay in delivering the treat can allow the animal to include superstitious behaviors and also lead to frustration and confusion over what was being clicked. Someone asked what to do when we are training a behavior where we cannot immediately deliver the reinforcement, such as when working at a distance or at speed. In that case, he thinks the delay is not a problem as long as the process of getting and delivering the treat starts as soon as the click is given, and the animal can see that the trainer has started what he calls the “reinforcement cycle.” What he stressed was that it was important to recognize that the click starts a behavior chain that ends with the delivery of the treat. If the trainer keeps this chain intact and always does food delivery in the same way, it is not an issue. I think what good trainers really need to do is ensure that the animal is not throwing a lot of behaviors in between the click and treat. This can be accomplished by keeping the time between click and treat short or having a set routine for presentation of the treat so that the animal patiently waits for the treat.
Ken Ramirez: LAB: “Next! Finding and Using New Reinforcers”
This lab was a chance for people who attended Ken’s lecture to practice with their dogs. I attended as an observer and got to watch several people start conditioning new reinforcement substitutes with their dogs. He started by reviewing how to select a good reinforcement substitute and suggested that people practice it without their dogs if the substitute was a hand signal or some kind of physical gesture. I suppose you might practice a verbal reinforcer too to make sure you were saying it consistently.
A few different people demonstrated with their dogs. The important thing to watch for is the dog orienting toward the handler when it hears or sees the reinforcement substitute. We had one woman who wanted to use a thumbs-up, another who wanted to use a verbal and one who used clapping. The dog that belonged to the woman who chose clapping took a while to figure out that clapping was not a new cue for begging, which he offered a lot. Ken had her shorten the duration of clapping and present the food before the dog had a chance to offer to beg. He stressed that it is important to keep the time between the reinforcement substitute and primary reinforcer short so that the dog does not throw in other behaviors, but there does need to be a delay in order for classical conditioning to take place.
I enjoyed watching the dogs work and it was interesting to see how some of them were fine with food just appearing and others seemed to want to figure out why the food was appearing. It was good to see some conditioning of reinforcement substitutes to see how it worked out and how the dogs handled it.
Morten Egtvedt and Cecilie Koeste: “Reliability: Thy Name is Backchaining”
Morten and Cecilie had a number of presentations and I only attended the one on backchaining. They are top level obedience competitors who also run a chain of dog training schools and publish dog books and videos in Norway. They have a different approach to training in that they do not put behaviors on cue until they are completely finished. Rather than put early versions on a temporary cue, they do not cue the dog at all. The dogs figure out which behaviors they might want by context (handler position, presence of equipment) and run through their repertoire until they get clicked. They covered all this material in the lecture and lab I did not attend, but a friend of mine attended and gave me the basic outline.
The lecture I did attend was on how they combine behaviors together by backchaining. Most of the behaviors they teach are used in obedience trials and the dogs perform them in a set order. To achieve precision and speed, they backchain finished behaviors to prepare for competition. They stated that they never forward chain if they can backchain. They have found that backchaining produces faster and more reliable performance because the dog knows what is coming next and is anticipating the cue. The success of backchaining is due partly to Premack where a less probable behavior is reinforced by a more probable behavior. In a backchain, the last behavior has the strongest reinforcement history and is therefore more likely to occur. Therefore you can use that behavior to reinforce the previous behavior and so on. When you create a chain in this way, the animal is working from less probable behaviors toward more probable behaviors and therefore it builds enthusiasm.
They showed some examples of how they backchain. If the chain is short, they just start by reinforcing the last behavior (4) and then asking for the last two behaviors (3 & 4). They reinforce for 3 & 4 until they are really fluent. They explained that they do not backchain until each individual behavior is very fluent. A lot of errors in chaining come about when trainers try to chain together behaviors that are not fluent enough. They make sure that the dog really knows each individual piece before they create the chain. Once 3 & 4 are fluent, they add in 2 and practice 2 & 3 & 4 until fluent before adding in 1.
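The order in which the pieces get added can be written out as a tiny Python sketch. The function name is made up for illustration, and it only generates the practice order described above; it says nothing about how long to stay at each stage, which depends on fluency.

```python
def backchain_stages(behaviors):
    """Return the practice stages for a backchain, shortest first.

    Each stage is the tail end of the full chain: the last behavior alone,
    then the last two, and so on until the whole chain is included.
    """
    return [behaviors[i:] for i in range(len(behaviors) - 1, -1, -1)]


for stage in backchain_stages([1, 2, 3, 4]):
    print(" & ".join(str(b) for b in stage))
# prints:
# 4
# 3 & 4
# 2 & 3 & 4
# 1 & 2 & 3 & 4
```

Every stage ends with the same final behavior, so the dog is always working toward the piece with the longest reinforcement history.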
If they are building a longer chain, they might make two mini chains separately and then combine them using overlap, if they can. So they might train a chain with behaviors 1, 2 & 3 and another chain with 3, 4, & 5 and then combine them. The overlap helps to make the connection strong. There are other ways to keep the chain strong. One way is to reinforce the last behavior with a really premium reward. Another is not to break the chain unless absolutely necessary. One of the problems with forward chaining is that you have to occasionally reinforce each component of the chain to keep it strong, which reinforces individual behaviors, but breaks the chain. By making sure each behavior is perfect before backchaining them together, they can avoid having to break out individual components of the chain.
They had a few other guidelines. They test each chain 5 times to see if it is reliable. They emphasized that the strength of chains comes from using positively trained behaviors. Behaviors that have a “do it or else” component will actually weaken a chain. If they have taught a backchain and the dog makes an error, they abort and try again. If the dog makes an error again, then they go back to looking at individual components of the chain. If the dog makes an error, it could be that they need to go back and work on one component, but they try to correct the problem without breaking the chain if possible.
The last thing they talked about was how dogs will go through a “testing phase” to see if they can skip steps of the chain and still get their reward. They stressed how important it is that the animal goes through this phase and Cecilie said she would never compete a dog that had not tested her. In most cases, if the dog skips a behavior, they can just prevent it from being reinforced and ask it to start again. They had a movie clip of a dog checking blinds for a person. The dog has to run a figure 8 pattern checking every blind until it gets to the last one where the person is hiding with a sleeve. They backchain this search sequence so the dog always knows that the person is in the last blind. When they start backchaining, if the dog skips and goes directly to the last blind, the person just steps out and does not give the opportunity to bite the sleeve which is the reinforcement for this exercise. Usually just removing the reinforcement is enough to convince the dog to complete the entire chain.
Their presentation had a nice mix of explanations and movie clips. They showed some very good footage of dogs being backchained. They also had the audience participate in some activities by pairing up and backchaining each other. I found this was interesting as I could feel how the backchaining did build anticipation. This also showed how important it was to be very clear about each individual behavior. Sometimes I was cued to do the next behavior, and I started to do it without stopping the previous behavior. One chain was turn, clap, sit and I found myself sitting while clapping until we got more specific about how many times I was supposed to clap. I could also see how keeping the cues in the chain let the trainer use anticipation to their advantage. Prior to this, I had thought that part of the point of creating a chain was to give the final chain one cue instead of retaining the cues for each individual behavior. But Morten and Cecilie keep the cues in place, which means they can control the timing of when the dog goes from one part of the chain to the next.
I am very intrigued by the idea of using more backchaining and I am trying to think of ways to implement this in horse training. So much of riding is about using one behavior to set the horse up for the next behavior that forward chaining seems more obvious, but I have sometimes taken advantage of anticipation by careful selection of a behavior that improves the one before it, and I think this is part of what backchaining is all about. For example, if I have a horse that is slow at the walk, I will do a lot of walk trot transitions, clicking for the trot. As the horse starts to anticipate the trot, the walk will improve. In the past I have been clicking the improved walk, but if I think of this as a backchaining exercise, clicking the walk might not be necessary because I can reinforce the improved walk by asking for and rewarding the trot as well. I think this is definitely worth pursuing and I am putting together some ideas for things to try with my horses.
Kathy Sdao: LAB: “What a Cue can do in Action: cue control”
This was the second of two labs that accompanied her lecture. I did not attend her lecture or the first lab, but I wanted to see her teaching so I signed up for this lab. I spent the winter watching her DVDs so I was hoping to be prepared. Kathy is a dog trainer from Washington State with extensive experience with marine mammals, and she is a very dynamic and enthusiastic teacher. She kept us busy with several exercises on cue discrimination and control.
She started by reviewing the qualities of a good cue. Cues are distinctive, consistent, salient, simple (make sure you know exactly what the cue is – harder than you think), and precious. By precious she meant that we should be careful with our cues. Don’t present a cue unless you are willing to bet money that the dog will do it.
Then we started off with a cue discrimination exercise. She had each dog owner write down 5 behaviors their dog had on cue. The helper shuffled the cards and made a random list of 10 behaviors. Then they tested to see how the dog was doing in the new environment. This lab was held in an outdoor tent with blowers going for heat and a lot of distractions. I noticed that it took some of the dogs a while to get focused on their owners and respond reliably to the cues they knew. She gave each team a data collection sheet and they recorded how many times the dog was correct. The owner was to only ask once and this exercise was not about getting the behaviors, but about seeing how the dogs responded to the cues. Every correct behavior has to be clicked. You cannot use variable reinforcement for this exercise as the dogs need the information that they have responded to each cue correctly.
She recommends that dog trainers do discrimination exercises like these on a regular basis to keep the dogs sharp. I think it would also help to make sure that your cues are not morphing over time or getting sloppy. When she asked for feedback on how the sessions went, a number of handlers reported that they were more careful about their cues than usual.
For the second cue discrimination exercise, she explained how to test to see if you and your dog agreed on the correct cue for any given behavior. She calls this the “prove it” game. The idea is to vary the cue slightly and see if the dog still responds. So if you wanted to test the cue “sit,” you might ask the dog to “hit,” “sat,” “upsit,” or some other variation on “sit.” Is the dog sitting for any one syllable word that starts with s? How about any word that ends in -it? If you are using body language cues, you could test your hand gestures by changing some details. Ideas for this included standing on a chair, using the other hand, holding your hand at a different height, making the motion bigger or smaller, wearing sunglasses, wearing a glove, kneeling and so on. She gave us a handout with suggestions for ways to test cues and said that some people get very creative about doing this.
Before she let people try this out, she reminded them that they had to decide ahead of time how they wanted the dog to respond to any given cue. Do you want your dog to sit when you say “sat?” Do you want your dog to respond to hand gestures with both hands or just one? It is important that the dog has the possibility to earn a click each time a cue is presented, but it could earn a click by doing the behavior, or by not doing the behavior, depending upon what the owner has decided. Again, she had people ask their dogs for behaviors in random order so that the dog was not influenced by recent reinforcement for any one behavior. People had a lot of fun with this.
This was a fun lab with lots of good training to watch and people were very creative in coming up with variations on cues for the prove it game. She also handed out a paper that had a list of reasons people might not go at a green light. I am familiar with this list from her DVDs and I think it is a great way to help people understand about cues. In her DVD she makes the distinction between cues and commands. In her view, commands have a component of “do it or…” whereas cues are just indicators that an opportunity for reinforcement is available for a particular behavior.
To help people understand this, she compares a dog presented with a cue to a driver waiting at a red light that turns green. The handout lists reasons why you might not go even if the light turned green. Some reasons are: you can’t see the signal, the signal was brief and you sneezed or blinked, you didn’t recognize the signal because it was different somehow (flashing green), you were distracted by another sight or sound, another overriding signal prevented you (e.g. siren), another car is in your way (inhibition), it is unsafe (someone ran a red light), you ran out of gas or broke down, and there is a new criterion (standard car stalled on hill). Her point is that there are lots of reasons a dog might not respond when presented with a known cue and we need to recognize and troubleshoot why instead of assuming the dog chose to ignore it.
Michelle Pouliot: “Training Guide Dogs for the Blind”
Michelle Pouliot presented information on the training program for Guide Dogs for the Blind which has been converting over to clicker training. She presented some historical information about guide dog training and then showed how they are now adding in clicker training. I didn’t know much about guide dog training so I was interested to see how they taught some of the behaviors. They use treadmills to teach the dogs to lead and the movie clips showed that the dogs really love the treadmill training. She had movie clips that showed them teaching the dogs to respond to the collar, ignore food on the ground and back up in a straight line.
She also showed how they teach them to go around obstacles and about intelligent disobedience. They used to teach intelligent disobedience by asking the dog to proceed and then mimicking a fall or bad event. This was stressful on the dogs and with the use of the clicker, they came up with a better way by asking the dog to go forward and clicking before the dog could respond. The dog learned to evaluate the safety of the situation before responding to the command “forward.”
I thought it was interesting to see how they had added clicker work into their training and I loved seeing how happy the dogs were when they were working. So even though I am not doing guide dog work, I was happy to have attended this session.
Jesus Rosales-Ruiz: “The Poisoned Cue Anew”
The poisoned cue lecture is one that Jesus has presented at several Clicker Expos and he is constantly adding new information to it. He started the session by distinguishing between cues, commands and poisoned cues. Cues are discriminative stimuli that indicate the possibility of reinforcement as established through positive reinforcement. Commands are discriminative stimuli established through negative reinforcement. In training a command, the command is presented and if the animal does not comply, negative reinforcement is applied until the animal performs the target behavior at which point the aversive stimulus is terminated. I think it is important to note his terminology here because he does specify that the animal must find the use of negative reinforcement aversive in his example. Through this scenario, a command becomes a conditioned aversive stimulus. The presentation of the command can be used to decrease behavior, or removal of the command can be used to shape and capture behavior. Because of the way we use negative reinforcement with horses, I think we have to look carefully at how we are using negative reinforcement and how our horses interpret our use of negative reinforcement. In most of his examples, negative reinforcement was used as more of a correction than as a teaching tool.
He argues that a poisoned cue is a cue that is ambiguous because it has been trained with both positive reinforcement and the use of aversive stimuli. I want to note here that the first time I attended this talk, he said that a poisoned cue was one that was trained with both positive and negative reinforcement. So he has changed his terminology somewhat here. He now refers to aversive stimuli instead of negative reinforcement. I think this is important for horse trainers who use so many pressure and release cues. It is the ambiguity of the cue that causes the problem. Possible effects on behavior from poisoned cues are reluctance in the trainee with signs of stress, behavior breaking down both before and after the cue, longer latencies, freezing and avoidance behavior.
There are three ways to create a poisoned cue. The first is to add aversive stimulation to a positive reinforcement program. The second is to teach with aversive stimulation where the good behaviors were positively reinforced and the incorrect responses were “corrected” with an aversive stimulus. The third way is to elicit behavior with an aversive stimulus and capture it with positive reinforcement.
He showed a movie clip of how they created a poisoned cue with a miniature poodle. The poodle had prior experience with clicker training and no experience with aversives. There were two parts to the experiment. In the first part, they taught the cues. The cue “ven” was used to teach the poodle to come using only positive reinforcement. The cue “punir” was trained by presenting the cue (saying “punir”) and giving the dog 2 seconds to respond. If the dog did not respond, it was pulled to the handler and then clicked and treated in position. They repeated this over a hundred times. The leash correction was discontinued somewhere around the 60th trial. They measured the tail height (high vs. low), presence of whining, and snorting. He showed a movie clip of some of the training sessions and the dog’s body language for “ven” was bright and enthusiastic. For “punir,” the dog looked depressed, with a low tail and low energy. He also pointed out that in order to present the cue, the dog had to be a certain distance away. In “ven,” the dog willingly waited at a distance for the cue. In “punir,” the dog stayed so close to the handler that she had a hard time getting the required distance. It seemed as though the dog hoped to avoid the leash correction by staying right next to the handler.
In the second part of the experiment, they used the cues “ven” and “punir” to capture a new behavior. They chose several behaviors such as the dog’s position in the room, back stepping, and touching an object. For example, when they were working on capturing stepping back, the handler said “ven” when the dog stepped back. The dog returned to the handler and got his reinforcement. In “ven,” the dog figured out the game quickly and offered the new behavior with a lot of enthusiasm and the handler was able to maintain a high rate of reinforcement. In “punir,” the dog seemed hesitant and slightly aimless. Even if the handler said “punir” and reinforced the dog for the new behavior, it did not immediately return to the new behavior that caused the handler to say “punir.” In some cases, the dog created chains where it consistently performed the behavior that had been marked by “ven” even though that was never reinforced during the “punir” sessions.
Once they were done capturing behaviors with “ven” and “punir,” the handler did some additional experiments to rule out the presence of the leash and harness used in “punir” as being the source of the dog’s attitude. The dog was fine with the leash and harness in other settings. In the original experiment, the dog was loose for “ven,” but on a leash and harness for “punir,” so they switched and used the leash and harness for “ven” and had the dog loose for “punir.” There were some small changes in the dog’s attitude, but “ven” was still significantly different than “punir.”
From this study, Jesus and his student concluded that combining positive and negative reinforcement during training can have detrimental effects. They also concluded that emotional behaviors produced by this procedure do not disappear over time despite the use of positive reinforcement, and that while a poisoned cue can be used to mark and select new behavior, the results will be different than using a cue that was trained with positive reinforcement only. He said that he had gotten a lot of requests to repeat this study but he was reluctant to put another dog through the correction process. Then he realized that there are a lot of cues out there that are already poisoned and they didn’t need to poison a new cue, they just needed to find one that was already poisoned.
As an example, he showed some work they did with Caesar, a German Shepherd, who reacted to the presence of the leash as a poisoned cue. If the leash was in the house, the dog happily engaged with his student and responded correctly to cues such as “sit.” As soon as the leash came out, he started ignoring and avoiding the student. This was with the leash on the ground, not attached. They put the leash in various locations and the dog did not respond enthusiastically until the leash was back in the house.
I have gone into this in such detail because I think poisoned cues are a real problem for horse trainers, especially if we start with crossover horses. And no matter how careful we are with our rein handling, there are going to be times when the reins are used in a manner that the horse finds aversive. I am still trying to work out in my brain if the fact that we educate the horses about negative reinforcement means that we can use negative reinforcement without having poisoned cues. With my own horses, I have used negative reinforcement to shape a lot of behaviors and they don’t seem to view those as poisoned cues. But I do have a few issues that have come about where I think that the cue has gotten poisoned and I would like to avoid that in the future. I am going to take a closer look at how I use negative reinforcement to both shape and maintain behavior.
One of his early statements was that cues or commands trained only with -R (negative reinforcement) are better than poisoned cues because it is the ambiguity of the poisoned cue that creates the problem. This makes it seem as if it would be better to just use -R alone, instead of using -R and +R (positive reinforcement) combined, which is something a lot of horse clicker trainers do on a regular basis. But I don’t see that trend in my training or with other people’s horses. On the contrary, horses trained with -R and +R combined are much happier and more willing than if -R alone is used. I do think that if you use escalating pressure and the -R becomes too aversive, it negates the effect of the positive reinforcement, so my guess is that there is a balance here that works. In any case, I think that horse trainers need to be very aware of poisoned cues because we are somewhat limited in what we can use for cues under saddle and we need to make sure that we don’t poison them.
Theresa McKeon: LAB: “TAGteach in Action”
I attended the TAGteach lecture at Clicker Expo 2006, but had not attended a lab so I signed up for the lab to get more information on what TAGteaching really looks like. Theresa McKeon led the discussion and activities and started us off by making a TAGulator out of beads. She had us string 10 beads and then gave us ideas for how to use them. I had thought of TAGteach as being used in a formal teaching situation, but she described how she uses it to keep track of things in her daily life. She might TAG herself each time she drank a bottle of water or when she made a decision about whether or not to have a cookie. I have to say that this was a big light bulb moment for me as I suddenly realized that there were lots of places in my everyday life where I could TAG myself for doing things.
She talked us through a few exercises on how to implement TAGteach. The first exercise was just practice tagging other people. We divided up into groups of 3 and one person tagged while the other two people passed a balloon back and forth. The TAG point might be touching the balloon with two hands, or calling the balloon. I learned the importance of picking a good TAG point. If the people were too close together, it was important to pick a really obvious TAG point as there was not much time to decide if the person should get tagged. It is important to pick a TAG point that is easy for the trainee at the start, and easy for the tagger to identify.
She had us do another exercise where we set up “runways” by laying out two strings and putting objects between them to create an obstacle course. One member of the team was blindfolded and another team member guided her/him through the runway using instructions of 5 words or less. This exercise demonstrated that it is hard to explain something in 5 words or less unless you have previously decided upon some definitions, and part of the ability to set good TAG points was knowing your trainee. A further demonstration on picking TAG points showed how the trainee could define her own tag points by saying what she was feeling in her body when it was correct.
There were some demonstrations by a TAGteach team from Quebec who use TAGteach in their dog training. They start by training the handlers, not the dogs, and they use tagging to teach basic positions such as a hand position for heeling, how to handle treat delivery, and recognizing when the dog is giving them attention. Theresa also showed with a helper how it was important to be unemotional and specific about TAG points. She went over some basics of good tagging: have a short phrase to describe the TAG point (5 words or less), keep emotion out by saying “The TAG point is..,” and don’t correct or announce that anything is changing, but just state “the new TAG point is..” She also stressed the importance of separating out information from praise. Praise should come at the end so that the trainee just pays attention to the TAG, which is the most important information. Some people find praise distracting and she warned of the danger of creating praise junkies.
This was a very entertaining session. She had all the attendees involved and she presented enough information to give me some good ideas about how to use TAGteach. Later in the final wrap-up with Karen Pryor, they played a movie clip showing different applications for TAGteach and I was amazed at the breadth and scope of where TAGteach is being used. The movie showed TAGteach being used by commercial fishermen, painters, teachers and in various sports applications.
In addition to these sessions, I had a lot of great discussions with other clicker trainers. There are so many interesting people who attend these expos and I learned something from each one of them.
Here are a few other odds and ends that I pulled out specifically to share with my horse friends.
There was talk recently on one of the lists about using jackpots and whether or not they were useful. I do use jackpots, but I have to say that I have never really studied whether they work for me or not. At Clicker Expo, there were a variety of opinions about jackpots and from listening to the trainers, the idea I got was that if you are going to use jackpots, make sure they work for you.
For example, Kay Laurence does not use jackpots (as defined by multiple goodies, not different goodies). She points out that if your animal has just performed a really good effort, you want to have it repeat the behavior right away before it forgets what it did. You don’t want to have a big time lapse (due to chewing) between a successful effort and the next effort. I think this is a good point.
I think the key here is to think about all aspects of the treat from choice of treat to delivery and make sure they all work to your advantage. Here are some of my common strategies.
Vary the treat. Give better treats for efforts that are better because the animal added enthusiasm and energy. When I taught Willy Spanish Walk, I found out he would lift his leg higher for an apple than for a carrot. He likes apples better so it made him more enthusiastic and generated a better Spanish Walk. If I am training a behavior where I am trying to get the horse to relax or calm down, I am more likely to jackpot by offering more of a lesser quality treat. Chewing is calming and I don’t mind the time lag between offered behaviors. Sometimes I can even have the horse eat while performing the behavior. This works if you are going for duration on the mat. You can feed many treats while the horse is standing on the mat and this will strengthen the behavior.
No Reward Markers:
This was discussed at a panel discussion. Most of the trainers do not use a no reward marker (NRM). If they do, it is for very specific circumstances. They do not use it for shaping. In one example, an NRM was used to show a dog the difference between an incorrect behavior and one that was correct but that the trainer was not going to click at that moment. In this case, no information (no click, no NRM) means the dog is correct.
I was introduced to the idea of cues as reinforcers for behavior. I think that with horses, we do a lot of redirecting. I often ask my horse for something because I want him to do something else, not because he has done one thing correctly and I am ready to do the next thing. If you want to chain together many behaviors, the horse needs to know that being given the next cue means he was successful, not that you are redirecting him. One way to do this would be with a keep going signal or verbal praise. My thoughts now are that while I might not use words in shaping, I will use them when I start to connect together several behaviors.
Differences between horses and dogs:
It is tempting to try to apply all the same ideas from clicker training for dogs to horses. In many cases, such as husbandry and simple discrete behaviors, this works. But with riding, I think we have to be a bit more creative.
If you want the horse to catch on quickly to the idea that the next cue means success, then you need to start chaining behaviors together early. It is tempting to work on the walk and then the trot and then the canter and reinforce each effort, going for refinement. I think the horse needs the trainer to start chaining behaviors together early, even if it is just asking for a few behaviors in a row. The horse needs to really understand that each new cue is a new opportunity for reinforcement.
Katie Bartlett, 2008 – please do not copy or distribute without my permission