In this short presentation, Jesús Rosales-Ruiz revisited the question:
“Do I have to treat every time I click?”
He said that this question constantly comes up and that different trainers have different answers.
Before I share the details of his presentation, I want to mention that he said he chose to use the words “click” and “treat” because he was trying to avoid using too much scientific jargon. But, as he pointed out at the end of his talk, it would be more accurate to say “click and reinforce,” and probably even more accurate to say “mark and reinforce.”
Since he used “click and treat,” I’m using the same words in these notes, but you should remember that he is really looking at the larger question of how we use conditioned reinforcers and whether or not they always need to be followed by a primary reinforcer in order to maintain their effectiveness.
Back to the question…
Do you have to treat after every click?
Some say YES:
- Otherwise the effectiveness of click may be weakened
- Bob Bailey says: “NEVER sound the bridging stimulus idly (just to be ‘fiddling’) or teasing… it’s important that the ‘meaning’ of the bridging stimulus is kept unambiguous and clear. It should ALWAYS signify the same event - the primary reinforcer.” From How To Train a Chicken (1997), Marian Breland Bailey, PhD, and Robert E. Bailey.
- This view is supported by research that shows that the conditioned reinforcer should be a reliable predictor of the unconditioned reinforcer.
Some say NO:
- Once a click is charged, you only have to treat occasionally
- Once a behavior is learned, you only have to treat occasionally
- Supported by research on extinction. In general, this means that if an animal learns that not every correct answer is reinforced, it will keep offering the correct answer for some period of time, even if there’s no reinforcement.
So maybe there is some research for both.
He said that he started thinking about this question again after reading a blog by Patricia McConnell, who was sharing some thoughts on whether or not to treat after every click. She was wondering why clicker trainers recommend it, but other positive reinforcement trainers do not.
Patricia McConnell wrote:
- “For many years I have wondered why standard clicker training always follows a click with a treat.”
- “Karen Pryor strongly advocates for us to reinforce every click (secondary reinforcer) with a treat (primary reinforcer). Ken Ramirez, one of the best animal trainers in the world, in my opinion, always follows a click with a treat.”
- “But Gadbois went farther, given the link between motivation and anticipation, suggesting that it was important to balance the “seeking” and “liking” systems, with more emphasis on the former than the latter during training. He strongly advocates for not following every click (which creates anticipation) with a treat, far from it, for the reasons described above.”
You can read the blog at: http://www.patriciamcconnell.com/theotherendoftheleash/click-and-always-treat-or-not.
If you have not heard of Simon Gadbois, you can read about him here: https://www.dal.ca/academics/programs/undergraduate/psychology/a_day_in_the_life/professors/simon-gadbois.html.
What happens if you don’t treat after every click?
Jesús was intrigued by Gadbois’s statement that you don’t want, or need, to treat after every click because you want to balance “liking” with “seeking,” and that if you don’t treat after every click, you get more seeking.
One reason for his interest was that he already knew of an experiment that had been done to look at what happens if you don’t follow every click with a treat. About 10 years ago, one of his students wanted to compare how the behavior of a dog trained under one click = one treat differed from that of a dog trained with multiple clicks before the treat. The two conditions looked like this:
- one click = one treat: The trainer clicked and treated as normal after every correct response: cue -> behavior -> click -> treat -> cue -> behavior -> click -> treat.
- two clicks = one treat: The trainer clicked for a correct response, cued another behavior and clicked and treated after the correct response: cue -> behavior -> click -> cue -> behavior -> click -> treat.
These dogs were tested by asking for previously trained behaviors. Each dog was trained under both conditions, so some sessions were run under one click = one treat and some under two clicks = one treat. There were multiple reversals, so the dogs went back and forth between the two conditions several times over the course of the experiment.
Under the one click = one treat condition, the dogs continued to perform as they had in training sessions prior to the start of the experiment. Under the two clicks = one treat condition, both dogs showed frustration behaviors, their performance deteriorated, and at times a dog would leave the session.
There were many factors that could have contributed to the result: the dogs were originally trained under one click = one treat, the reversals themselves could have caused confusion, and the dogs might have done better if they had been transitioned more gradually. But, it was pretty clear that omitting the treat did not activate the seeking system; instead, it created frustration. Why?
They considered two possibilities:
- Perhaps they were getting less food. Under the one click = one treat condition, a dog was getting twice as much food reinforcement as it did under the two clicks = one treat condition.
- Perhaps the properties of the click had changed. What does the click mean to the dog?
Can we test if it’s about the decrease in food reinforcers?
If you want to test what happens when you click without treating, you have to change the ratio of clicks to treats. You can do that by omitting some treats, or by adding some clicks. But the two options are probably not going to be perceived in the same way by the animal.
In the experiment described above, the trainer changed the ratio of clicks to treats by omitting food reinforcers after half the clicks. This is a significant decrease in the number of primary reinforcers that the dog was receiving. Could the results be more about the reduction in food reinforcers, than about whether or not each click was followed by a treat?
One way to test this would be to keep the number of food reinforcers the same, but add another click. To do this, the trainer taught the dog to do two behaviors for one click. The dog would touch two objects. When he touched the second object, he would get clicked and treated.
Once this behavior had been learned, the trainer decided to add another click by clicking for the first object, clicking for the second object and then treating. So the pattern would be behavior (touch) -> click -> behavior (touch) -> click -> treat. This works out to clicking after every second behavior, but the trainer got there by adding a click, not by removing a treat.
What she found was that the dog just got confused. The dog would orient to the trainer on the first click, get no response, go back to the objects and touch again (either one). Or he might just wait and look at the trainer, or he might leave. The additional click didn’t seem to promote seeking. Instead it interrupted the behavior and created confusion.
Why? Well, perhaps it has to do with the two functions of conditioned reinforcers. This goes along with the second point above: the difference was due to how the click was being used.
The Two Functions of Conditioned Reinforcers:
Let’s take a moment and look more closely at conditioned reinforcers. Conditioned reinforcers are stimuli that become reinforcers through association with other reinforcers. They usually have no inherent value. Instead, their value comes from being closely associated with another strong reinforcer for a period of time (while it is being “conditioned”), and this association must be maintained through regular pairings in order for the conditioned reinforcer to retain its value.
In training, this is usually done by deliberately pairing the new stimulus with a primary reinforcer. There are different kinds of conditioned reinforcers and their meaning and value will depend upon how they were conditioned and how they are used. Marker signals (the click), cues, and keep going signals (KGS) are all examples of conditioned reinforcers.
Regardless of the type, all conditioned reinforcers have two functions. They are:
- Reinforcing (they strengthen the behavior they follow)
- Discriminating (they can function either as cues or event markers, or both)
Conditioned reinforcers are not just used in training and laboratory experiments. They are everywhere.
Jesús used the example of a sign, which is a conditioned reinforcer for someone driving to a specific destination. Let’s say you are driving to Boston and you see a sign that says “Boston, 132 miles.” The sign provides reinforcement because it tells you that you are going the right way. It also has a discriminating function because it provides information about what to do next, telling you to stay on this road to get to Boston.
When talking about conditioned reinforcers, it’s easy to focus on only one of these functions. Is this why there is confusion? Perhaps the debate over whether or not to treat after every click exists because some trainers are focused on the discriminating function of the click, while others are focused on the reinforcing function.
What does training look like if the focus is on the discriminating function?
When every click is followed by a treat, the click has a very specific discriminating function. It tells the animal it has met criteria and reinforcement is coming. The trainer can choose what the animal does upon hearing the click (stop, go to a food station, orient to the trainer). But, regardless of which behavior she chooses, the click functions to cue another behavior, which is the start of the reinforcement process.
A lot of one click = one treat trainers emphasize the importance of the click as a communication tool. There are two aspects to this. One is that it marks the behavior they want to reinforce and the other is that it tells the animal to end the behavior and get reinforcement. If the click is always followed by a treat, the meaning of the click remains clear and it provides clear and consistent information to the animal.
You can think of the click -> treat as part of a behavior chain, where the click has both a reinforcing function, from the association (click = treat), and also an operant function (click = do this). Clicker trainers who promote the one click = one treat protocol still recognize that the click itself has value as a reinforcer, but they choose to focus on the click as an event marker and as a cue, more than as a reinforcer.
What does training look like if the focus is on the reinforcing function?
A lot of trainers who treat intermittently (not after every click) emphasize that the click is a reinforcer in itself, so it’s not necessary to also provide a treat after every click. They are looking at the reinforcing function of a conditioned reinforcer and would argue that the whole point of having a conditioned reinforcer is so that you don’t have to follow it with another reinforcer every time.
They are still using the discriminating function of the click because it can be used to mark behavior. But, the click does not become an accurate predictor of the start of the reinforcement phase, so it is not going to have the same cue function as it does under the one click = one treat condition.
Jesús did mention that if the click is not a reliable cue for the start of the reinforcement process, then the animal will look for a more reliable way to tell when it will be reinforced. In most cases, the animal finds a new “cue” that tells it when to expect reinforcement, and the click functions as a Keep Going Signal. If the animal can’t find a reliable cue for the start of reinforcement, or if it’s not clear when the conditioned reinforcer will be followed by reinforcement and when it won’t, then it will become frustrated.
Back to the Literature…
With this information in mind, what can we learn by going back and looking at the research on conditioned reinforcers? Well, it turns out that the literature is incomplete for several different reasons:
- It doesn’t look at the cue function of the conditioned reinforcer.
- Animals in the lab are often restrained or constrained (limited in their options) so the cue function of the conditioned reinforcer may be more difficult to observe.
- It doesn’t take into account that the most consistent predictor of food is the sound of the food magazine as it delivers the reinforcement. Even when testing other conditioned reinforcers, the sound of the food magazine is what predicts the delivery of the food reinforcement, and it’s on a one “sound” = one “treat” schedule.
- To test a conditioned reinforcer that was sometimes followed by food and sometimes not, you would have to use two feeders, one with food and one without, and even then you would have to worry about vibrations. Most labs are not set up with two feeders, so this work has not really been done.
He also mentioned that a lot of what we know about conditioned reinforcers in the lab is from research where the conditioned reinforcer was used as a Keep Going Signal (KGS), and not as a marker or terminal bridge.
I asked Jesús if he had an example of an experiment using a conditioned reinforcer as a KGS and he sent me an article about a study that looked at the effect of conditioned reinforcers on button pushing in a chimpanzee.
The chimpanzee could work under two different conditions. In one condition, he had to push the button 4,000 times (yikes!) and after the 4,000th push, a light over the hopper would flash and his food reinforcement would be delivered. In the other condition, he also had to push the button 4,000 times, but a light would flash over the hopper after every 400 pushes, and then again at the end when the food was delivered after the 4,000th push.
The chimpanzee was tested under both conditions for 31 days, and the results showed that he worked faster, and with fewer pauses, on his way to the 4,000th push when the light flashed every 400 pushes.
Once the chimpanzee had been tested under both conditions for 31 days, they started the second part of the experiment. In this part, the chimpanzee could choose the condition (by pressing another button) and he usually chose the one where the light flashed after every 400 pushes.
So, having a Keep Going Signal improved the speed at which the chimpanzee completed the 4,000 pushes, and it was also the condition the chimpanzee preferred. This suggests that Keep Going Signals can be useful and an animal may prefer to get some kind of feedback.
In this experiment, the conditioned reinforcer they were testing (the flashing light) was functioning as a KGS and the sound of the food magazine was what told the chimpanzee that he had met criteria. So, this is an interesting experiment about conditioned reinforcers as Keep Going Signals, but it also shows the difficulty of separating out the conditioned reinforcer from the stimulus that predicts food delivery.
An example of training a KGS with a dog
Jesús talked a little bit more about Keep Going Signals, using an example from one of his own students. She wanted to teach her dog a new conditioned reinforcer that she could use as a KGS. She started by teaching the dog to touch an object for a click and treat. Once the dog had learned the behavior, she said “bien” (her new KGS) instead of clicking, and waited for the dog to touch the object again. If the dog repeated the touch, then she would click and treat.
She was able to use the KGS to ask the dog to continue touching an object and I think she tested it on other objects. You do have to train a KGS with multiple behaviors in order for it to become a KGS, as opposed to a cue for a specific behavior. I don’t know if she tested it with other behaviors, but that would be the next step. I’m also not sure if they compared the dog’s performance, with and without the KGS, to see if adding a KGS increased the dog’s seeking behavior, as Gadbois had suggested it would.
The difficulty with the question “Do I have to treat after every click?” is that the answer depends upon how you are using the click and whether or not it cues the animal to “end the behavior” and expect reinforcement. Conditioned reinforcers have two functions. They function as reinforcers and as discriminators, and you need to consider these functions when choosing how to use the click.
If you are using the click as a Keep Going Signal, the animal learns to continue after the click and the click does not interrupt the behavior. This means you can click multiple times before delivering the terminal reinforcer. However, it’s likely that you will end up having a different cue that tells the animal when it has completed the behavior and can expect reinforcement. If you don’t, the animal may become confused about what it should do when it hears the click.
If you are using the click to indicate when the behavior is complete, the animal learns that the click is a cue to start the reinforcement process. You can teach the animal a specific response to the click so that the animal knows what to do to get his reinforcement. If the click is being used in this way, then it will interrupt the behavior and you will want to wait until the behavior is complete before clicking.
We call both these types, the click as a KGS and the click as an “end of behavior” cue, conditioned reinforcers, but they are not the same thing. There are many kinds of conditioned reinforcers, and when you are not specific, it’s easy to think you are talking about the same kind, but you are not. So both “camps” may be right, but for the wrong reasons.
Jesús finished by saying we need to study this more carefully in the laboratory and also in real life training situations. One point he made was that an animal that initially learned one click = one treat could probably be re-trained to understand the click as a KGS, if the transition were done more slowly (than with the dogs in his student’s experiment), but he still thinks it would change the meaning of the click from an “end of behavior” cue to a “keep going signal.”
I thought this was a very interesting talk, partly because it shows how important it is to clearly decide how you are going to use conditioned reinforcers and to make sure that you teach your animal what it means. I don’t think it was intended to be the final word on a complicated subject, but the presentation certainly made me more aware of the importance of thinking about the many functions of conditioned reinforcers and how I am using them.
But… I’m not sure it left us with an answer to the question of what happens when the same conditioned reinforcer is used both as a KGS and to end the behavior, which is how many trainers describe their practice of clicking multiple times before delivering the terminal reinforcer. That question still needs to be studied.
A few personal thoughts
This presentation was informative, and made me feel more confident about the system I use, but it also left me with some unanswered questions.
I have always followed a click with a treat. It is how I originally learned to clicker train and it has worked well for me. If I want to use a different reinforcer, I have a different marker. If I want to provide information or reinforcement to the horse without interrupting the behavior, I have several other conditioned reinforcers I can use.
It’s never made sense to me to have the same conditioned reinforcer sometimes be a cue to “end the behavior” and sometimes be a cue to “keep going.” I question whether that’s even possible, unless the animal learns it has different meanings under different conditions, and that seems a bit awkward. It just seems simpler to have clearly defined conditioned reinforcers and use them in a consistent manner.
I was intrigued by the research into Keep Going Signals. I do use Keep Going Signals and have found them to be useful. But I have also found that I have to pay attention to maintaining them in such a way that they retain their value (through pairing with other reinforcers), but don’t become reliable predictors of reinforcement and morph into “end of behavior” cues. I’d love to see more research on how to effectively maintain Keep Going Signals, as well as some research on how effective they are at marking behavior.