Steve White has worked in law enforcement and professional dog training for over 40 years. He specializes in teaching behavior modification, tracking, and scent work through the use of positive reinforcement-based operant conditioning.
Steve’s presentation was about some of the practical aspects of using chains in the real world. It was a nice follow-up to Dr. Reid’s description of work in the laboratory and Dr. Layng’s presentation on the different kinds of sequences and how to create them.
He started his talk with a story about what happens when you modify one element of a behavior chain without considering all the possible ways it could affect the functionality of the entire chain. I won’t repeat the story here, but I’ll say it involved a bad guy, a dog, and what happens when a dog learns to let go when the bad guy puts his hands up.
Once he had us awake and listening (it was after lunch), he went into some definitions.
What is a behavioral chain?
- A sequence of stimulus – response pairs
- Each response produces a change in the environment that acts as the discriminative stimulus for the next response.
- note: Steve’s “behavior(al) chain” is the equivalent of Dr. Layng’s “chain sequence.” Since this is Steve’s talk, I am going to use his terminology so I will write “chain” not “chain sequence.” You can see why chain sequence and tandem sequence might have gotten shortened to chain and sequence.
There are two kinds of constructed chains:
- Technical chains – A sequence of behaviors where each one “sets the occasion” for the next one. A detection dog doing a search is an example of a technical chain (search – locate – report).
- Common chains – A sequence of behaviors where each behavior has its own external cue. A dog working on an agility course is cued (by the handler) for each obstacle and completes multiple obstacles before finishing and being reinforced. In a common chain, it’s important to have good stimulus control and fluent behaviors.
This is a quote Steve shared:
“…behavior does not occur as random strings of unrelated responses but in organized sequences, called chains, in which each successive response produces the stimuli, internal or external, that determine what comes next. And when the time arrives to initiate a new sequence of behavior, that too, is signaled by a change in stimulus conditions.”
Constructing, Maintaining, and Trouble-shooting Chains
Before he explained some of the options for building chains and the challenges with maintaining them, Steve pointed out that chains can be a problem from two different angles. Sometimes the problem is that you can’t maintain a chain that you do want, but equally problematic is when you have accidentally created a chain that you don’t want. These are two sides of the same coin and understanding more about chains will help you work through both these problems.
Example of cleaning up a chain:
Steve showed a video of a dog that had been trained to search a building for a suspect. The dog has to approach the building, wait, and then search for the decoy. In the video, they were “cleaning up” the beginning of the chain because the dog’s wait behavior had deteriorated. Instead of waiting quietly, the dog was vocalizing because he was so excited about getting to search for the suspect. The solution was to practice the first part of the chain, with careful attention to when the decoy (pretend suspect) showed himself to the dog so that the dog was reinforced for quiet waiting by the appearance of the decoy.
This is one of the most important things to remember about a chain. The cue or stimulus change that indicates the start of the next behavior functions as a reinforcer to the previous behavior, so you want to be careful about if and when you allow the animal to continue to the next step in the chain.
Building a chain: the retrieve
Step 1: Task analysis for the retrieve. The component behaviors are:
- go out
Step 2: Teach and build fluency in each individual behavior. This can be done individually or in groups.
Step 3: Assemble the chain. This can be done individually or in groups. Remember that you have:
Dual function chained stimuli:
- Each stimulus reinforces the preceding response.
- Each stimulus also acts as a discriminative stimulus for the following response.
For example, if you are creating a chain with three behaviors (1, 2 and 3), each with their own antecedents and consequences, it could be written as:
A1 -> B1 -> C1 + A2 -> B2 -> C2 + A3 -> B3 -> C3 (terminal)
When chained, C1 becomes the antecedent (A2) for the next part of the chain, same with C2 and A3. We could write it like this:
A1 -> B1 -> (C1A2) -> B2 -> (C2A3) -> B3 -> C3 (terminal)
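To make the notation above concrete, here is a minimal Python sketch that writes out both forms for a chain of any length. The function names (separate_notation, chained_notation) are my own, made up for illustration; they are not from Steve's talk. The sketch only assumes what the text says: each consequence except the last doubles as the antecedent for the next behavior.

```python
def separate_notation(n):
    """Write n behaviors as independent antecedent -> behavior -> consequence units."""
    units = [f"A{i} -> B{i} -> C{i}" for i in range(1, n + 1)]
    return " + ".join(units) + " (terminal)"


def chained_notation(n):
    """Write the same n behaviors as one chain.

    Each consequence except the last is merged with the next antecedent,
    e.g. (C1A2), because it serves both functions at once.
    """
    parts = ["A1"]
    for i in range(1, n + 1):
        parts.append(f"B{i}")
        if i < n:
            # Dual-function stimulus: reinforces B(i) and cues B(i+1).
            parts.append(f"(C{i}A{i + 1})")
        else:
            # Only the final consequence stands alone.
            parts.append(f"C{i} (terminal)")
    return " -> ".join(parts)


print(separate_notation(3))
print(chained_notation(3))
```

Running it for three behaviors prints the two lines of notation shown above. Changing the number shows how a longer chain has more dual-function stimuli in the middle and still only one terminal consequence.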
There are different ways to build chains. They are:
- Forward chaining – Start with behavior 1, then ask for behaviors 1 and 2, then behaviors 1, 2, and 3. The learner is reinforced for each successful sequence (after 1, then after 1 and 2, then after 1, 2 and 3).
- Back-chaining – Start with behavior 3, then ask for behaviors 2 and 3, then ask for behaviors 1, 2, and 3. Back-chaining is Steve’s choice for most behaviors. The learner is reinforced for each successful sequence (after 3, then after 2 and 3, then after 1, 2, and 3).
- Total task chaining – The learner is expected to do the entire chain from beginning to end right from the start. Reinforcement only happens at the end of the chain.
- Each behavior must be fluent
- Each behavior must be on cue (internal or external cues)
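The three ways of building a chain described above differ only in which behaviors are requested at each training stage before reinforcement is delivered. As a rough sketch (the function names and the idea of a "stage list" are my own illustration, not something Steve presented), each method can be written as a list of stages, where every stage is the run of behaviors the learner completes before being reinforced:

```python
def forward_chaining(behaviors):
    """Stages grow from the front: [1], [1, 2], [1, 2, 3]."""
    return [behaviors[:k] for k in range(1, len(behaviors) + 1)]


def back_chaining(behaviors):
    """Stages grow from the end: [3], [2, 3], [1, 2, 3]."""
    return [behaviors[-k:] for k in range(1, len(behaviors) + 1)]


def total_task_chaining(behaviors):
    """A single stage: the whole chain, reinforced only at the end."""
    return [list(behaviors)]


chain = [1, 2, 3]
for name, plan in [
    ("forward", forward_chaining(chain)),
    ("back", back_chaining(chain)),
    ("total task", total_task_chaining(chain)),
]:
    # Reinforcement is delivered after the last behavior in each stage.
    print(name, plan)
```

Laid out this way, one feature of back-chaining stands out: every stage ends on the same final behavior, so the learner is always moving toward the part of the chain that has been practiced and reinforced the most.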
There are a number of things that can go wrong with chains. The quality of the individual behaviors can deteriorate or the animal can add or skip behaviors. The behaviors near the beginning of the chain, or those that are the weakest (harder to do or have less reinforcement history) can be more vulnerable. Steve offered some tips:
- Fixing broken chains
- Premack rules (use the Premack Principle to your advantage)
- Excessive suspension – when chains get long, they get wobbly in the middle
- Disassembling runaway chains – requires stimulus chain analysis and checking for hidden reinforcers.
Premack – Put it to work for you.
- Remember that Premack says that you can reinforce low probability behaviors with the chance to do high probability behaviors.
- Keep relative probability of behaviors in mind when constructing, maintaining, and repairing chains.
- You want to be working toward higher probability behaviors.
- Broken behaviors may indicate a problem with the relative value of the behaviors in the chain.
- There must be sufficient reinforcement to maintain the chain. This is a problem with bomb detection dogs because they do many searches where they don’t find anything. The trainers often have to sneak in other reinforcers or the chains fall apart.
- note: Premack was one of the themes of the ASAT conference in 2017. If you want to read more about the Premack Principle, you can read my notes in two articles on the Archives page. The first one is here.
- As a chain gets longer, the behaviors in the middle often deteriorate. Steve calls this “excessive suspension.” It is a sagging (slowing down, quitting, or losing quality) in the middle of the chain.
- More of an issue with technical chains.
- Usually happens when the first and last behaviors are highly motivating, but the ones in the middle are less motivating.
- The only solution is to occasionally shore up the intermediate behaviors by reinforcing in the middle of the chain.
- note: I asked Steve if you can cause problems by interrupting a chain. Will the animal start to anticipate, leading to hesitation at points in the chain? He said that can happen so you have to be sure to teach the animal to keep going, unless the handler indicates otherwise.
Analyzing Problem Chains
- Use reverse task analysis – What is different between the end result and what you want?
- Look for discriminative stimuli and responses that are not functioning as intended.
- Look for additional stimuli or responses that are disrupting the chain.
Look for Hidden Reinforcers
- What is reinforcing the “errors” in the chain? What is the dog getting out of them?
- Common reinforcers (that we fail to consider) are relief, attention, and intrinsic reinforcers.
- Use video analysis. Watch the video three times: once to see the big picture, once to check the relevant stimuli, and once to look for reinforcers.
- Many daily behaviors are actually chains (or sequences)
- When chains go awry, analyze and then fix
- Check for fluency
- Check for stimulus control
- Pick behaviors that have a high probability of occurrence
- All chains and sequences rely on your ability to control reinforcers
I loved that Steve pointed out that trainers can have problems both with chains they want and with chains they don’t want. It’s always been a puzzle to me how it can be so easy to create unwanted chains, but then when you do want them, they can be so hard to maintain. It must be the Murphy’s Law of Chains.
After listening to all three presentations, I feel like I have a better understanding of how animals learn sequences of behavior and how we, as trainers, need to be very clear about how we are using them. In particular, after thinking about Dr. Reid’s talk and how guiding cues are replaced by practice cues, I found myself with a slightly better understanding of why animals often skip ahead in chains, or modify their behavior in ways that work for them but are not necessarily appropriate for our goals.
I always seem to come back to riding examples, but that’s probably because it’s where I find I have to be the most careful about how I use both chain and tandem sequences. In a lot of my ridden work, I use chains both to teach the horse the sequence of behaviors I want and as a teaching tool for how to do something on her own. So, sometimes I do want to fade out the guiding cues and let the horse become more autonomous, but at other times I want to keep those guiding cues and maintain the original composition of the chain. Thinking more about what kinds of cues I want to use and the level of autonomy I want is going to make it easier for me to identify my end goal and be consistent about how I get there.
I also found that these presentations made me think more about sequences from the animal’s point of view. If you think about it, animals are innately programmed to find better and more efficient ways of obtaining desirable consequences. It makes total sense that they would modify and adjust their behavior, especially if they repeat the same sequence many times. Asking them to repeat the sequence with no adjustments (or only in the direction we want) is a more unnatural process than letting them learn and adjust in the manner that makes the most sense to them. It’s a good thing to keep in mind the next time an animal skips a step or tries to modify a chain to suit their own needs.
This was the last talk that was specifically on sequences of behavior. The remaining talks were on related subjects (cues, anticipation, and reinforcement) which are all connected.
While I try to be accurate in my note taking, there may be some errors either in my understanding or my presentation of the information from this talk. If you have questions, feel free to contact me or leave a comment.
Thanks to Steve White for allowing me to share his presentation. Thanks to the ORCA students and to the organizers of the Art and Science of Animal Training Conference for all their hard work on putting on this great event.