This is the first in a series of posts based on my notes from the 2018 Art and Science of Animal Training Conference that was held in Irving, Texas on March 24-25, 2018. To learn more about the conference, you can visit the conference website.
While I try to take accurate notes, it is possible that there are errors or that some detail is lacking. If you post a comment or email me, I can try to clarify or provide some additional information. Many thanks to the speakers and organizers who allow me to share.
Ken Ramirez: No Reward Markers (NRMs): Science and Practice
Whether or not No Reward Markers can be used as part of a positive reinforcement training strategy is always a controversial topic. As part of the Sunday morning presentations on reinforcers and conditioned reinforcers, Ken shared his thoughts on the subject. This was a 20-minute talk.
Ken started by clarifying what he meant by a No Reward Marker. The term is used by many trainers, but there are significant variations in both definition and practice, so it’s a good idea to start by defining it.
What is an NRM?
- Most common use is to mark the moment the animal gives a wrong or incorrect response
- Opposite of the click
- Conditioned punisher
If you use NRMs, you might agree with the first two points, but you will probably question the third. Most trainers who use NRMs would not describe them as conditioned punishers. Instead, they prefer to describe them as providing information to the animal so that he doesn’t waste time pursuing behaviors that will not earn reinforcement. But Ken said that in all his years of training, he has only seen 13 people (out of thousands) who can use an NRM without any visible side effects.
It may be easier to see this if you look at how conditioned punishers are taught and the possible side effects.
What are conditioned punishers?
A punisher is a stimulus that, when applied immediately after a behavior, decreases the likelihood (frequency) of that behavior happening in the future. A conditioned punisher is a stimulus that has been conditioned, through association with another punisher, so that it can be used to decrease behavior.
He shared a video example of a verbal conditioned punisher that was learned through pairing with a finger poke (I’m sure you can guess who). The dogs clearly responded to the sound with defensive body posture and by recoiling. The video showed that the conditioned punisher was effective, but also that it had side effects. There’s no argument that conditioned punishers can be effective, but they are not without risks.
The rest of the talk was looking at various applications of NRMs and evaluating both their effectiveness and side effects. The conundrum is this… If the NRM is effective at reducing the behavior, then it is, by definition, punishment. If the NRM is not effective – it does not function as a punisher to decrease the behavior in the future – then why use it?
To unravel this, you have to look at the different applications of NRMs to see whether the NRM is functioning as a punisher, has no effect, or is perhaps functioning as something else like a new cue or a means of redirection.
NRMs: Varied uses and applications
To indicate “no” or “wrong”
- Marks incorrect response
- Trainers say they just want it to be information
- Trainers think it’s ok if it is delivered in a passive manner. How about a passive “oops?”
- The problem is that if it is effective, then by definition it is a punisher
As a warning signal
- Last chance before something bad is coming
- Warning prior to a more aversive stimulus (or a more severe one)
- Varied effectiveness
- Can become a new cue for the behavior
- Does generate an emotional response
Ken had two examples to show some of the things that can happen when an NRM is used as a warning signal.
Example 1: When he was a kid, his Mom would ask him to take out the garbage. She might ask him a few times (“Kenny, take out the garbage”) and then if he didn’t do it, she would call him by his full name. When he heard his full name, he got up and did it. The use of his full name was effective in that it did cause him to get up and take the garbage out, but it didn’t change his future behavior – he was still likely to ignore her when she said “Kenny, take out the garbage.” If his full name had been an effective NRM, then he should have learned to take the garbage out when she asked him the first time, but he didn’t. And, over time, the use of his full name just became the new cue (or part of the new cue) to take the garbage out.
Example 2: The warning “ding, ding, ding” in his car when he leaves the lights on. The sound is aversive and he feels a moment of frustration when he hears it. It is effective because when he hears it, he does turn the lights off. But, has it made him less likely to leave the lights on? Maybe a little over time, but it could only be considered a weak punisher because it doesn’t change his behavior very quickly. He did joke that if it was followed by a strong aversive, it might be more effective, but then he would probably sell the car. In addition to being a warning, the sound also becomes a cue for a specific behavior – turn off the lights.
To indicate “correct the behavior or you will not be reinforced”
He had a video showing a blood draw in a hyena where the hyena moved away before the trainer was done holding off the spot. The trainer said “ah ah” and cued the hyena to move back. He came back into position, she finished, clicked and reinforced him.
Was the NRM effective?
- Ken can’t see any change in the hyena (good or bad)
- She uses her cue to bring him back
- It’s possible the “ah ah” will just become a cue to come back into position
- The “ah ah” is possibly just superstitious behavior on the trainer’s part
One of the points he made, using this example, was that since the trainer only uses positive reinforcement, it’s likely that the “ah ah” has no meaning to the hyena, which is why Ken doesn’t see any response. The hyena doesn’t return to position when she says “ah ah” (he responds to her cue), but he might learn to over time if she continues to follow it with her cue.
Used as an interrupter
- Stops behavior in the moment, but doesn’t always change future behavior
- Still aversive
- Weak or ineffective punisher (more like redirection)
Used as a “stop” cue?
This is an increasingly common use among R+ trainers. The idea is to use the NRM and then immediately redirect and reinforce the alternative behavior. Again, you have to look at the effect on behavior and the animal’s emotional response. What does the animal look like?
Example: He had a clip of Susan Garrett teaching a dog to do weave poles. The video shows several NRMs being used. In each case, the dog is not reinforced and is re-started. He shared this as an example of an NRM that doesn’t seem to have any aversive side-effects.
- She has a variety of NRMs (I think he said 4)
- The dog maintains a high level of enthusiasm even after the NRM
- In the last part of the clip, she placed a toy a short distance from the end of the weave poles. If the dog went through correctly, he retrieved the toy and she would play tug. If he made an error, she used her NRM and he returned (without retrieving the toy) and was re-started.
- Note: Steve White pointed out that it’s not the toy that is the reinforcer, but playing with the toy. If the dog learns that he won’t get to play with the toy, then there’s no point in going and getting it.
- Traditional use is that an NRM functions as a punisher
- Can assist in shaping behavior, but can also create frustration
- Other similar uses may not actually be an NRM (it’s more likely they are a cue or redirection)
- Often conditioned inadvertently
- Only skilled and disciplined trainers can use them well. NRMs are not necessarily a bad tool, but at the very least they should be used with thought and care.