NRM! No-reward marker

I will try to analyze what really lies behind No Reward Markers (NRMs) and how they work. Concepts like NRM are very popular in the dog training world. They are considered effective because they have been in use for ages and no one ever questions them. They just work, according to common knowledge.

To begin, here are some examples I found online of what people consider an NRM to be:
– “signal informing the dog he made an error”
– “signal informing the dog that he should try harder next time”
– “words like wrong!, again!, uh uh (…) informing the dog he has done something poorly, in a sloppy way”

DICTIONARY:
Before we go further, I will explain a few terms that appear in the article:

1. Operant learning – ability to learn through consequences
2. ABC – the three-term operant contingency. A – antecedent stimuli preceding the behavior, B – behavior, C – consequence.
3. Reinforcement and punishment – the two main types of consequences in operant learning. Each is further divided into two types: unconditioned and conditioned.
4. Extinction – or, more correctly, operant extinction: the process by which a previously reinforced behavior is weakened by withholding reinforcement.
5. Frustration – an effect that accompanies extinction.
6. Resurgence – the reappearance of a previously extinguished behavior during the extinction of a more recently reinforced behavior.
7. Intermittent schedule of reinforcement – a schedule of reinforcement in which some but not all occurrences of a behavior produce reinforcement. Behavior maintained on such a schedule is more resistant to extinction.

“Non-Reward Marker”: let’s start with the word reward, because this is where we encounter the first challenge. The word reward implies an unscientific attitude.
According to the dictionary definition:
Reward – “a thing given in recognition of service, effort, or achievement”. The correct term we should use here is reinforcement.
Is it really necessary to pay attention to the words we use? Of course, because the way we name certain things defines how we perceive them and what we really know about them.
When we read reinforcement, it encourages us to learn about operant conditioning. It makes us see and think in a certain, scientific way and pushes us to understand what we are using, and how, while working with dogs.
There are many topics covered in this article. I know some are not explained enough, but I tried not to stray from the main subject. In the future I will return to the more important ones in another blog post.
Let’s start with reinforcement. Why is it the correct word to use?
Because when working with dogs and teaching them new behaviors we rely, for the most part, on the rules of operant learning. It is based on the assumption that we (dogs, humans, cats, rats, horses) learn through consequences that occur in the environment.

Consequences are either reinforcement or punishment. If something that follows a behavior causes that behavior to be repeated or maintained in the future, it is reinforcement. If the result is the opposite, that is, the behavior occurs less often in the future, we are talking about punishment. It is not relevant whether a given consequence gives us pleasure or pain, or is widely regarded as something “nice” or “unpleasant”.
An example is a situation in which the handler yells at the dog every time the dog barks. As a result, barking occurs more frequently in the future. This means that yelling at the dog acts as reinforcement. It does not matter that we generally perceive it as punishment.
Unfortunately, the terms reinforcement and punishment (punishment in particular) have everyday meanings that deviate completely from their scientific definitions used in behavior analysis.
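
Because the definition is purely functional, it can be written down as a small check. The Python sketch below is only an illustration: the function name classify_consequence and the bark counts are made up for this example, and a real assessment would need proper baseline data rather than two numbers.

```python
def classify_consequence(rate_before: float, rate_after: float) -> str:
    """Label a consequence by its effect on the future rate of a behavior.

    Simplification: comparing two rates cannot detect a consequence that
    merely maintains a behavior, which would also count as reinforcement.
    """
    if rate_after > rate_before:
        return "reinforcement"
    if rate_after < rate_before:
        return "punishment"
    return "undetermined"


# Hypothetical numbers: barks per hour before and after the handler started yelling.
print(classify_consequence(rate_before=10.0, rate_after=14.0))  # reinforcement
```

Note that the function never asks whether the yelling was meant as a punishment or felt unpleasant; only the change in the behavior decides the label.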

In order to influence the lives of dogs, we need to understand what processes are involved. Our main interest is how animals learn, and so we need to know the principles that apply to learning. We didn’t invent those rules, just as we didn’t invent the rules for matrices or subtraction. We simply learned to use them. Knowing how it all works will help us not only to understand how our dogs learn, but also to explain why something works and why something doesn’t.

Back to punishment and reinforcement. When we work with dogs, we should know how to recognize punishment and reinforcement by, among other things, their basic and inseparable attribute: the effect on a given behavior. Without this effect, there is no punishment and no reinforcement. In this video you will see more about this topic. It can be helpful in understanding the rest of the text. Now that we have briefly discussed the inaccuracies in definitions, we can go further.

BEHAVIOR!
First of all, let’s focus only on what we can influence directly: behavior. Not emotions or internal states. We tend to analyze things that we have no influence on, such as the psychological and internal states of our dog. We treat behavior not as the subject of our analysis but as a symptom of something happening inside the dog. However, when we consider what we actually use when working with dogs, it turns out to be nothing but behavior. It is through changes in the environment that we influence our student’s behavior. That is why it is worth making behavior an independent subject of our interest. We will not presume what is going on inside the dog. Behavior is our focus. This doesn’t mean emotions are not happening; we are simply unable to measure them. They are a crucial and extremely important part of a dog’s life, but we can only alter them by working on behavior. That’s why we should focus on behavior and the rules that govern it. This also implies an ethical approach: since dogs cannot use verbal behavior to explain how they feel in the moment, we should be even more careful with the methods we use. That’s why our first choice should always be positive reinforcement. Therefore, we should focus on the real object of our work: behavior!
Let us ask ourselves what lies behind NRM, i.e. WHAT effect it has on behavior. Since we use it in training, we do so for a purpose: it is supposed to have a desired effect on a behavior. It does not matter what we suspect the dog “thinks”, because we do not know that. We cannot assume that when we say the NRM, the dog knows that he has to “try harder” or “add more effort”, or that he “knows about the error”.

CONTINGENCY:
A-B-C

A = antecedent stimuli
B = behavior
C = consequence

Example:
A = cue for sit
B = dog sits
C = click and treat

Let’s think for a second. Where do we want to insert our NRM?
To answer that, we first have to ask: what function is the NRM supposed to have on the behavior?

There are many discrepancies here:
1. When we use NRM, what behavior does the dog do?
2. Do we use another cue after NRM?

While researching this topic, I found two main ways in which people use NRM.
1. NRM(-) – the cue for the behavior is not repeated after the NRM
2. NRM(+) – the cue for the behavior is repeated after the NRM

Now I will use our A-B-C example to explain the first type of NRM and analyze what function it has on behavior.
1. NRM(-) – the cue for the behavior is not repeated after the NRM

A = cue for sit
B = dog sits
C = click and treat

A1 = cue sit
B1 = lies down
NRM(-) = S delta (ext)
B2 = sits
C = C/T

But more often:
A1 = cue sit
B1 = lies down
NRM(-) = S delta (ext)
B2 = stands up
NRM(-) = S delta (ext)
B3 = sits
C = C/T

What does it all mean?
In the case of NRM(-) (no cue afterwards), the NRM starts to function as a stimulus for extinction. This means we start an operant extinction procedure, along with its accompanying effect: frustration. What may happen next is resurgence, which occurs much more often than the clean first version of our ABC sequence. This is the situation in which the handler says: “the dog begins to try harder”, “the dog is thinking”, “the dog has more ‘motivation’ (sic!) to make further attempts”, “the dog begins to do more”. What is actually happening is the consequence of withdrawing reinforcement for a behavior that was previously reinforced. A similar scenario can be observed when shaping is used incorrectly to teach a new behavior. The student begins to show resurgence, either in variations of the topography of a given behavior or by presenting other previously reinforced behaviors.

What is the function of NRM(-)?

It is an extinction procedure, with the side effect that the presented behavior may also be extinguished in other situations. Incorrect use of extinction leads to frustration, sometimes to a degree that prevents the learning process from continuing. Extinction is the process of withholding reinforcement for a behavior that was reinforced earlier. This means that the goal is to remove the contingency between behavior and consequence. We want the extinguished behavior to cease to exist.

The problems that arise here are:
1. Very often, we do not want to extinguish this other behavior at all, because it is a behavior we need in other situations. Therefore, first and foremost, we should consider why the dog responded to this stimulus with such behavior, rather than subjecting it to extinction.
2. Intermittent schedule of reinforcement. Very often the undesirable behavior that appeared on cue A1 is a behavior that is reinforced in other situations. The extinction process is then doomed to failure, because what we are actually doing is putting the behavior on an intermittent schedule of reinforcement. It is the same kind of schedule that operates, for example, in a one-armed bandit. As a result, the behavior becomes much more resistant to extinction.
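
The second problem can be illustrated by simply counting. In the Python sketch below, the session log and the helper reinforced_fraction are invented for illustration; the point is only that a behavior reinforced in some contexts and not in others ends up with “some but not all” occurrences reinforced, which matches the definition of an intermittent schedule.

```python
def reinforced_fraction(log, behavior):
    """Fraction of occurrences of `behavior` that were followed by reinforcement."""
    outcomes = [reinforced for name, reinforced in log if name == behavior]
    return sum(outcomes) / len(outcomes)


# Hypothetical log of (behavior, was_reinforced) pairs across different contexts.
log = [
    ("down", True),   # given on the "down" cue: click and treat
    ("down", False),  # offered after the "sit" cue: NRM, no treat
    ("down", True),
    ("down", False),
    ("down", True),
]

fraction = reinforced_fraction(log, "down")
# Some but not all occurrences reinforced: an intermittent schedule, not extinction.
is_intermittent = 0 < fraction < 1
print(fraction, is_intermittent)
```

From the dog’s side there is only one behavior with a mixed history of consequences, no matter how the handler divides the situations into “right” and “wrong” contexts.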

2. NRM(+) – cue for the behavior is repeated after NRM

A1 = cue sit
B1 = lies down
C1 = NRM = P+
A2 = cue sit
B2 = sits
C2 = C/T

What does it all mean?
In the case of the sequence with NRM(+) (NRM followed by a new cue for the behavior), the NRM begins to function as positive punishment. We add something to the environment immediately after the behavior, which makes this behavior less frequent in the future.
What is the function of NRM(+)?
– positive punishment, which may have the side effect of decreasing the presented behavior in other situations as well. Again, however, the basic issue we should discuss is why the dog responded in this way and what we can do to ensure success.
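
The two analyses above differ only in what follows the NRM. As a rough sketch in Python, the trial sequences can be written as lists of events and each NRM labeled according to this article’s reading. The event labels and the function nrm_function are my own hypothetical names, and the result encodes the analysis presented here rather than a measurement of any real dog’s behavior.

```python
from typing import List


def nrm_function(events: List[str]) -> List[str]:
    """Label each NRM in a trial: extinction stimulus if no cue follows it,
    positive punishment if the cue is repeated right after it."""
    labels = []
    for i, event in enumerate(events):
        if event != "NRM":
            continue
        next_event = events[i + 1] if i + 1 < len(events) else None
        if next_event is not None and next_event.startswith("cue"):
            labels.append("P+ (positive punishment)")
        else:
            labels.append("S-delta (extinction)")
    return labels


# The two example sequences from the text, written as event lists.
nrm_minus = ["cue sit", "lies down", "NRM", "stands up", "NRM", "sits", "click/treat"]
nrm_plus = ["cue sit", "lies down", "NRM", "cue sit", "sits", "click/treat"]

print(nrm_function(nrm_minus))
print(nrm_function(nrm_plus))
```

Whether a given NRM really functions as punishment or as an extinction stimulus can only be confirmed by its effect on future behavior; the sketch just makes the structural difference between the two uses explicit.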

The concept of error.
What is an error in training? This issue is discussed in more detail in this article: Wrong! I invite interested readers to read it first.
In short, a mistake is nothing more than an undesirable response to a given stimulus. It is feedback for the handler that at this point, in this environment, with this dog, cue X did not mean behavior X. Why? There can be many reasons, and as good trainers we should find them, understand them and adjust the training. Training a dog means teaching behaviors by manipulating the variables available to us as handlers. We operate on the antecedents (our A in the equation) as well as on the consequences (our C in the equation) occurring in the environment. If the response to the stimulus is undesirable, then first of all we should ask why. Such a perspective on error excludes the use of NRM.

How do we teach NRM?
Just saying uh uh, again, wrong or any of the other words most often used as NRM will not make the dog perform the next repetition of the behavior correctly. In themselves those words mean nothing to the dog (just like sit, spin, etc.). The fact that we suddenly start to use them in training will not affect the behavior or the dog’s “understanding” of what we expect from him. A lot of people say that they would like someone to tell them when they are doing things incorrectly.

But would you like it to be done:
– in a language you don’t understand?
– with nothing but that one word, and no explanation of how to do it correctly?

Imagine I am handed a tennis racket and manage to return a difficult ball the first time. Someone clicks and says it was great! OK, but did it teach me to do it correctly? Do I now have the skills that will allow me to repeat this behavior? On the next ball I manage to do it again.
However, the third time I miss, and instead of clicking, someone tells me uh uh. I try again, but I do not know what to do to make it work. I hear uh uh again. Does it help me in any way to develop the necessary skills? Skills that will allow me to perform the behavior properly in the future?
Another version of the same situation: this time I already have the skills to return the ball. However, not in this wind. That’s why I miss. I hear uh uh. Shouldn’t my teacher first consider teaching me how to do this in such an environment?
Another example: distance control. The dog performs position changes on a large field. At the time of the cue for one change:
– he experienced a muscle contraction
– he heard the sound that caught his attention.
What would I achieve if I used an NRM here?

How does the dog know what the correct behavior is?
When I work with a dog, I teach him behaviors. Behaviors that often require many skills to be done “correctly”, for example in the manner required by the regulations of a given sport. Success is largely dependent on my skills as a teacher.
What I achieve is the result of, among other things:
– my technique
– my knowledge
– my student’s learning history

When I take all this into account, it may turn out that the way this behavior is taught needs to be refined on my side, as the teacher. “It is not about teaching spectacular behaviors. It’s about spectacular teaching of behaviors”. Very often the situations that the handler describes as “sloppy”, “poor quality” or a “fault” on the dog’s side result from the way we work. We are responsible for the training program, not the dogs.

In that case, what should you do when you get an unwanted reaction to the stimulus?
We are not able to go back in time. You cannot change the behavior that your dog just performed. The worst thing we can do is to try to fix the error the moment it happens. To quote Dr. Jesús Rosales-Ruiz:
“The worst moment to correct an error is when it happens.” If we do that, we fall into the trap of our first type, NRM(-). The most important thing is to keep fluency in training. Make sure that what happens next is the correct behavior. It will not happen if you use NRM.
Then analyze: think about what happened and why it happened, and draw conclusions. It is the teacher, not the student, whose job in this team is to fill in the gaps in training.
That is why we should think about what the NRM is really all about: what effect it is supposed to have on behavior in the future and how we will achieve it. Maybe when you really think about it, you will see there are many better ways to ensure successful training.

Agnieszka Janarek Dog Trainer