Training a behavior from start to finish

Clicker trainers work towards finished behaviors a little differently than trainers using some other methods, and I think it is helpful to have a general overview of how they work. I am going to outline the steps and then explain each one in a bit more detail further down the page. More detailed information can be found in my book, Teaching Horses with Positive Reinforcement, available in paperback and Kindle editions on Amazon.

For a horse that already has some clicker training experience, an outline of how I would clicker train a new behavior would look something like this:

Step 1: Get the behavior
Step 2: Add or select the cue
Step 3: Build duration
Step 4: Fine tune the behavior and finish stimulus control
Step 5: Transition out of the teaching phase to long term maintenance

Let’s look at each step in more detail.

I do want to point out that this is not necessarily a one-way path, and the steps are not fixed in this order. I may revisit each step multiple times. For example, I might teach a behavior and put it on cue, then go back and reshape it and add a different cue. Depending upon the behavior, I may introduce stimulus control earlier or later. So this is flexible. The idea here is to show the steps you need to cover to get from start to finish.

Step 1: Get the behavior

What does this mean?

In the previous article (what is equine clicker training?), I gave brief descriptions of some ways that clicker trainers get behavior. Clicker trainers have a variety of ways to get behavior, some of which are used by other training methods and others which are specific to clicker training or pure positive reinforcement methods. In that article I wanted to share the “big picture” of how clicker trainers get behaviors so that new trainers didn’t get locked into the idea that clicker training was a very narrowly defined application of positive reinforcement. There is a lot of variety in how clicker trainers work, but we are all looking for animals that are happy in their work and freely offer behavior at appropriate times.

Understanding how clicker trainers get behavior is an important piece of information for new clicker trainers. I sometimes encounter people who think clicker trainers sit around waiting for the horse to do something. Not at all! Clicker trainers learn to recognize the basic building blocks for new behaviors, manipulate the environment to make desirable behavior more likely, and become very creative at coming up with ways to jump start behavior.

If you want to read more about getting behavior, you might want to read the three-article series on operant conditioning in the articles section. One article explains the four quadrants of operant conditioning, another covers negative reinforcement, and the third covers getting behavior using only positive reinforcement.

One key point here is that most traditional trainers and clicker trainers approach training from different perspectives. Think about what happens when a behavior occurs. It can be written out simply as antecedent -> behavior -> consequence. The antecedent is what triggers the behavior, then the behavior happens, and the consequence is what follows it. Most traditional trainers focus on the antecedent: they believe that if you ask for the behavior “correctly,” the correct behavior will occur. Clicker trainers are more concerned with the second half of the equation, which is reinforcing correct versions of the behavior.

This is why clicker trainers say “get the behavior” and worry about the cue later. In the early stages, cues don’t get behavior, although environmental and context cues can help the animal figure out which behavior you might want. This touches on the distinction between cues and commands that Karen Pryor talks about, but that is a separate topic.

How long does this take/do I stay in this phase?

In this phase, I am actively shaping a behavior and I can stay in it for quite a long time if I am shaping a complex behavior, or I can move through it quickly if I am using other methods. I want to stay in this phase until I have a discrete and recognizable behavior that the horse clearly and deliberately repeats multiple times within each training session. This does not have to be the perfect finished behavior, but it has to be far enough along that it is clearly recognizable as a specific behavior. Kathy Sdao says that before you attach a cue, the behavior should be predictable and freely offered by the animal.

Step 2: Add or select the cue

Once you have a recognizable and predictable behavior, you are ready to add the cue. If you have been shaping the behavior, you probably already have some kind of “working cue” that tells the animal which behavior it should offer, but now you are ready to add a more formal cue. If you read any clicker training texts, you will come across statements such as “clicker trainers add the cue after they have the finished behavior,” or some variation on that.

I think that statement causes a lot of confusion because it implies that prior to this, the trainer has no way of indicating to the horse which behavior they want. So just remember that there are lots of kinds of cues. When a clicker trainer talks about adding the cue, they mean adding a formal cue that was not part of the training process.

There are a number of reasons to add a new cue to a finished behavior but it is not always necessary. In some cases, the working cue evolves along with the behavior and you don’t have to do the separate step of adding the cue. I can teach my horses to back by touching their chest and then fade the cue to my finger pointing at them. That would be an example of “selecting” the cue. Out of all the stimuli that I have used in shaping, I am going to identify the one I want to have the horse associate with the behavior and deliberately start using it. I don’t need to add another cue, but I do need to make sure that the horse pays attention to my selected cue (the pointing finger) and not to other environmental stimuli that might not always be present.

In other cases, your “working cue” is not practical. When I taught my mini to lie down, the working cue was my presence in his stall at 10 at night. That was when I taught him to lie down so he had a cue based on several different requirements (location, my presence, time of day). That was pretty limiting so I added a new cue which was for me to bend down and pat the floor of his stall. I added that cue after he was reliably lying down when I entered.

There are different ways to add a cue. The simplest is to give the cue right before you know the behavior is going to occur (remember Kathy Sdao says it must be predictable). If you already have a working cue, you can instead give the new cue before the old one, so the sequence is new cue -> old cue -> behavior. If I want to add the verbal “back” to the backing behavior that I taught with a finger on the chest, I can say “back”, touch the horse or point, and reinforce backing. The horse will start to anticipate and back when it hears “back”, before I touch with my finger. Once I can click the horse for backing before I touch it, I have transferred backing to the new cue. This process takes a bit of time, but as horses get more savvy about cues, they learn to attach new cues to behaviors very quickly.

Step 3: Build duration

Once you have the behavior on cue, you can still shape it further. The cue just means you can start to ask for it in different situations and take it on the road while still refining it.

In this phase, I might work on teaching the horse to perform the behavior for longer (duration) or multiple times. If you are interested in reading more about training duration, I wrote an article on it which is on the articles page.

Step 4: Fine tune the behavior and finish stimulus control

It’s unusual for me to shape a behavior into its final form in the first step. Usually I get something pretty close and then I work on stimulus control and duration. Once I have addressed those aspects, I go back and fine tune the behavior. Sometimes I do this because it’s more practical. I may need to build more duration so the horse can practice, become stronger, or offer desired variations that I can select out. Sometimes I need to use the behavior for a while before I know exactly what criteria I need to train. Behavior is also fluid so I may find that I want to change something for other reasons.

It’s in this step that I also finish addressing stimulus control. Adding a cue to a behavior is the first step in stimulus control. It means the trainer now has a way to ask for a specific behavior, but in order to have a behavior completely trained, it needs to be under complete stimulus control, which includes three other conditions (as defined by Karen Pryor).

In this phase, the trainer is going to make sure the behavior (A) meets the rest of the requirements for stimulus control which are:

The behavior doesn’t occur in response to another cue (if you cue B or C, you don’t get A)
Some other behavior doesn’t happen when you give the cue (if you cue A, you don’t get B)
The behavior doesn’t happen when there is no cue given
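
For readers who keep written training logs, the conditions above (together with the behavior actually occurring on cue) can be sketched as a simple checklist. This is only an illustration, not a tool described in this article: the Trial record, the function name stimulus_control_report, and the cue and behavior labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Trial:
    """One logged moment in a session (hypothetical record format)."""
    cue: Optional[str]       # cue the trainer gave, or None if no cue was given
    behavior: Optional[str]  # behavior the animal offered, or None if it did nothing


def stimulus_control_report(trials, target_cue, target_behavior):
    """Report, per condition, whether every logged trial satisfied it."""
    report = {
        "occurs_on_cue": True,             # cue A gets behavior A
        "no_other_behavior_on_cue": True,  # cue A does not get behavior B
        "not_on_other_cues": True,         # cue B or C does not get behavior A
        "not_without_cue": True,           # no cue does not get behavior A
    }
    for trial in trials:
        if trial.cue == target_cue:
            if trial.behavior != target_behavior:
                report["occurs_on_cue"] = False
                if trial.behavior is not None:
                    report["no_other_behavior_on_cue"] = False
        elif trial.behavior == target_behavior:
            if trial.cue is None:
                report["not_without_cue"] = False
            else:
                report["not_on_other_cues"] = False
    return report


# Example log: the horse backs on cue, but also offers backing once uncued.
session_log = [
    Trial(cue="back", behavior="back"),
    Trial(cue="head down", behavior="head down"),
    Trial(cue=None, behavior="back"),
    Trial(cue="back", behavior="back"),
]
report = stimulus_control_report(session_log, "back", "back")
# Only report["not_without_cue"] is False for this log.
```

A log like this only covers what happens during training time, which matches the scope discussed in the next paragraph.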

If you want to read more about stimulus control, I wrote an article on stimulus control which is in the articles section. The title is about clicking offered behavior, but it covers stimulus control in depth. Please note that depending upon the behavior, stimulus control may only apply to what happens during training time. I can put “down” under stimulus control so that my dog doesn’t offer down during a training session without being cued, but that doesn’t mean the dog can’t lie down when it feels like it in other situations.

Step 5: Transition out of the teaching phase to long term maintenance

The idea behind clicker training is that the click and treat (reinforcement) are used for the training process and that the trainer doesn’t have to click and treat forever to maintain behaviors. The theory is good. I find that the application of it is a bit trickier. The reason is that in order to maintain a behavior, it does have to be reinforced in some way, and if you want to maintain a behavior with some precision, then you need a way of communicating that.

For now I am just going to say that there are options for maintaining behavior long term and you have to choose what works for you. My plan is to write an article with more detail as time permits. I do have to say that this is a topic that I do not often see covered in clicker training books. There is a lot of information on getting behavior and training issues, but I have not read that much on maintaining behaviors long term.

Here are a few examples of ways that I have moved a behavior from a training situation to part of my everyday routine with my animals.

1. Some behaviors are easy to maintain by using other reinforcers, either with a different marker signal (that is not so strongly associated with food), or with food alone. I taught my dogs to sit at the back door before going out. I clicker trained the behavior with food. I now maintain the behavior by simply waiting to open the door until they sit. Their reinforcement is getting to go out, which is something they want.

2. Some behaviors are easy to just randomly reward with or without a click. I taught polite haltering to my horses. Most of the time I don’t click and reinforce for haltering. I reinforced haltering often enough that it just became habit, and usually the halter means they are going somewhere they want to go (in or out from turnout, to be groomed, etc.), so there is some reinforcement that follows haltering. But I do occasionally click and treat for a good effort in a difficult time. So if it is a windy day and everyone is running around and the horse I am haltering stands quietly, he gets a click and treat.

3. Some behaviors can be reinforced indirectly by inserting them into a chain (or sequence) of behaviors. I could argue that this is just a variation on using another reinforcer (the cue for the next behavior) or using random reinforcement, but I find it takes conscious thought on my part to establish and maintain a good chain, so I am writing it up as a separate option. Foot cleaning can be considered a chain. I started out by reinforcing each foot; now I just reinforce at the end.

4. Condition some secondary reinforcers. This is different than option 1 (use other reinforcers). In this case, you systematically teach the animal to accept other forms of reinforcement (praise, patting, etc.). I wrote quite a bit about this in the Clicker Expo reports on the articles page.

One thing to consider when you think about how much (or if) you want to click and treat a behavior long term is that for some horses, the click and treat is more than positive reinforcement; it is information. They want the information and verification that they were right as much as the food. If you phase out the click and treat, you still need to give them information about how they are doing in some other way. I like to get the behaviors really well established and have an alternative communication system in place before I start cutting back on clicking and treating.

I am going to add some personal thoughts here. One of the things I like about clicker training is that there are aspects of the training where every trainer can make their own choices about how they want to do things. I think it is important not to get hung up on the idea that you should be able to stop clicking and treating, or that you must follow some set of rules about which behaviors you maintain by clicking and which ones you don’t. The important thing is to find what works for you and your animals.

I have made some of my own choices about how I work with my own horses. For example, I always click before I treat. I know there is a school of thought that says this dilutes the clicker and if I am not marking anything precise, I don’t need to use it, but I like that my horses don’t expect a treat unless I click. I could switch to a different marker signal but that actually makes life more complicated. So I click and treat and the horses figure out when the precise timing matters and when it doesn’t.

In addition, I have some behaviors in my horses that I keep on a high rate of reinforcement and I have some horses that are on a higher rate of reinforcement in general. One of my horses had a number of bad experiences and gets flustered easily, especially over hoof handling. I really don’t care if I click and treat at a high rate of reinforcement with her forever, if it allows me to get the job done without stressing her out. I have another horse who is so laid back he doesn’t care if I click and treat for hoof care or not.

Usually when I think about maintaining behavior long term, I take into account other possible reinforcers and/or ways to maintain the behavior. Some behaviors are maintained very nicely with pressure and release if it is taught as an alternative communication system and there is no aversive element to it. Other behaviors can be reinforced randomly and that works out. If I have a behavior that I want to reinforce less often, I have found it is helpful to write out a systematic plan for slowly weaning the horse down to fewer clicks and treats. I find that with some horses this works better than just being random. They slowly adjust to the idea that they have to do a few behaviors before being clicked and treated, as long as I am consistent about how I do it.
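
If it helps to see what a written weaning plan might look like, here is a minimal sketch. The function name thinning_plan and all of the numbers are hypothetical and are not taken from my own training records; the only idea it illustrates is raising the number of behaviors per click one small step at a time and holding each step for several sessions.

```python
def thinning_plan(start_ratio=1, target_ratio=4, sessions_per_step=3):
    """Build a list of (session_number, behaviors_per_click) pairs.

    The ratio rises by one behavior per click at each step, and every
    step is held for `sessions_per_step` sessions before moving on.
    All default values are illustrative assumptions.
    """
    if start_ratio < 1 or target_ratio < start_ratio or sessions_per_step < 1:
        raise ValueError("ratios must be >= 1, target >= start, sessions_per_step >= 1")

    plan = []
    session = 1
    for ratio in range(start_ratio, target_ratio + 1):
        for _ in range(sessions_per_step):
            plan.append((session, ratio))
            session += 1
    return plan


# Example: move from one behavior per click to three, two sessions per step.
example_plan = thinning_plan(start_ratio=1, target_ratio=3, sessions_per_step=2)
for session_number, behaviors_per_click in example_plan:
    print(f"Session {session_number}: click after {behaviors_per_click} behavior(s)")
```

A real plan would be adjusted to the individual horse, for example by holding a step longer or dropping back a step if the behavior starts to deteriorate.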

Another way to think about it is to consider what might happen if you decrease the rate of reinforcement or change reinforcers. Is the behavior established enough to be maintained? What would you do if the behavior deteriorates or you get unwanted behavior? Are you comfortable using negative punishment or negative reinforcement to maintain it? Are there other competing behaviors that are more reinforcing to the horse, and would those take over?

How you answer those questions will determine how you choose to maintain different behaviors. For example, I still click and treat Rosie for standing quietly on the cross-ties. When I first used the cross-ties, she developed the habit of pawing non-stop, so I trained her to keep her feet on the floor. I have her weaned down to just a few treats per session, but I do still reinforce this behavior because I would rather do that than the alternatives, which are to give her a time-out (negative punishment) for pawing or to apply an aversive.

It is important to remember that we are not the only one controlling the reinforcers, and we have to evaluate what might happen if we change things. Don’t let this paralyze you. Part of the way you learn how to maintain behaviors long term is through trial and error until you find what works for you and your horse.

The interesting thing that I have found is that while I get some flak for “spoiling that horse with food” when I take my horses out, they are much better behaved about basic things than a lot of the other horses I meet. That makes me happy to do a little clicking and treating to maintain their behavior.
