Suppose dataset B consists of one variable, called INDEX, with values 1-10, in order.
Obs INDEX
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
If I run this code:
data b;
  set b;
  counter = _N_;   /* copy the automatic iteration counter into a permanent variable */
run;
I get:
Obs INDEX counter
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7
8 8 8
9 9 9
10 10 10
Suppose instead I were to run:
data b;
  set b;
  counter = _N_;   /* record the value SAS placed in _N_ for this iteration */
  _N_ = 1;         /* try to reset the counter */
run;
One might expect counter to be 2 for every observation after the first, since _N_ is reset to 1 at the end of every iteration of the DATA step loop and would then be incremented to 2 at the start of the next. But this is not the case:
Obs INDEX counter
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7
8 8 8
9 9 9
10 10 10
This means that _N_ is only a place where SAS puts the value of an internal counter, which we never see and have no way of changing. Whatever you do to _N_ in the DATA step, when the loop returns to the top of the step, SAS places the incremented value of the internal counter in _N_. An assignment to _N_ therefore lasts only until the end of the current iteration.
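As a minimal sketch of this behaviour, the following step records _N_, overwrites it, and records it again within the same iteration. The dataset name demo and the variables before and after are made up for illustration, and the first step simply rebuilds B as described at the start of this section.

```sas
/* Rebuild B as described above: one variable, INDEX, with values 1-10 */
data b;
  do INDEX = 1 to 10;
    output;
  end;
run;

/* DEMO, BEFORE and AFTER are hypothetical names used only for this sketch */
data demo;
  set b;
  before = _N_;   /* value SAS placed in _N_ at the top of this iteration */
  _N_ = 100;      /* overwrite it */
  after = _N_;    /* still 100 here, within the same iteration */
run;

proc print data=demo;
run;
```

If _N_ behaves as described above, before should run from 1 to 10 while after is 100 on every row: the assignment holds for the rest of the iteration and is discarded when the loop returns to the top of the step.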
One useful construct is:
data b;
  set b;
  if _n_ = 1 then do;
    <some stuff>
  end;
run;
which does ``some stuff'' only on the first iteration of the DATA step, that is, when the first observation is read from b, and nowhere else. This is useful for initializing variables and hash objects (see section 12).
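One possible way to fill in the placeholder, sketched under assumptions rather than taken from the original example: load a small lookup table into a hash object once, on the first iteration, and then look up every observation of B against it. The lookup dataset labels, the output dataset b_labeled, and the variables label and rc are all invented for this illustration.

```sas
/* Rebuild B as described above: one variable, INDEX, with values 1-10 */
data b;
  do INDEX = 1 to 10;
    output;
  end;
run;

/* LABELS is a hypothetical lookup table invented for this sketch */
data labels;
  length label $8;
  INDEX = 1; label = 'one';   output;
  INDEX = 2; label = 'two';   output;
  INDEX = 3; label = 'three'; output;
run;

data b_labeled;
  length label $8;              /* host variable for the hash data item */
  set b;
  if _n_ = 1 then do;
    /* executed on the first iteration only: build the hash from LABELS */
    declare hash h(dataset: 'labels');
    h.defineKey('INDEX');
    h.defineData('label');
    h.defineDone();
  end;
  rc = h.find();                /* rc = 0 when INDEX is found; label is then filled in */
  drop rc;
run;
```

Without the _n_ = 1 guard, the hash would be declared and reloaded from labels on every iteration, which is wasteful; with it, the table is loaded once and reused for all ten observations. Observations whose INDEX is not in labels simply keep a blank label.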