

The counter _N_

The automatic variable _N_ counts the iterations of the implicit loop of the DATA step. It starts at 1 and is incremented each time execution returns to the top of the loop.

Suppose dataset B consists of one variable, called INDEX, with values 1-10, in order.

                                 Obs    INDEX

                                   1       1 
                                   2       2 
                                   3       3 
                                   4       4 
                                   5       5 
                                   6       6 
                                   7       7 
                                   8       8 
                                   9       9 
                                  10      10
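
A dataset like B can be built with a short DATA step. The following is only a sketch of one way to do it, not necessarily how B was actually created; it uses an explicit DO loop with an OUTPUT statement. Note that this step makes a single pass through the implicit loop, so _N_ stays at 1 while all ten observations are written: _N_ counts iterations of the DATA step, not observations.

     data b;
        /* one pass through the implicit loop writes all ten observations */
        do INDEX = 1 to 10;
           output;
        end;
     run;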

If I run this code

     data b;
        set b;
        counter = _N_;
     run;

I get

                            Obs    INDEX    counter

                              1       1         1  
                              2       2         2  
                              3       3         3  
                              4       4         4  
                              5       5         5  
                              6       6         6  
                              7       7         7  
                              8       8         8  
                              9       9         9  
                             10      10        10

Now consider running this instead:

     data b;
        set b;
        counter = _N_;
        _N_ = 1;
     run;

One might expect counter to be 2 for every observation after the first, since _N_ is reset to 1 at the end of every iteration of the DATA step loop and would then be incremented to 2 at the top of the next. But this is not the case:

                            Obs    INDEX    counter

                              1       1         1  
                              2       2         2  
                              3       3         3  
                              4       4         4  
                              5       5         5  
                              6       6         6  
                              7       7         7  
                              8       8         8  
                              9       9         9  
                             10      10        10

This means that _N_ is merely a variable into which SAS copies the value of an internal counter that we never see and have no way of changing. Whatever you do to _N_ inside the DATA step, SAS overwrites it with the incremented value of the internal counter when execution returns to the top of the step.
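
A small experiment makes this concrete. The sketch below assumes B is as above; the dataset name c and the variables before and after are made up for illustration. An assignment to _N_ should be visible for the rest of the current iteration, but it is gone once the next iteration begins:

     data c;
        set b;
        before = _N_;    /* value SAS placed in _N_ at the top of this iteration */
        _N_ = 100;       /* overwrite it */
        after = _N_;     /* the new value, within this iteration only */
     run;

If _N_ behaves as described, before runs from 1 to 10 just like counter, while after is 100 on every observation.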

One useful construct is:

     data b;
        set b;
        if _N_ = 1 then do;
           /* some stuff */
        end;
     run;

which does "some stuff" only on the first iteration, that is, when the first observation is read from b, and nowhere else. This is useful for initializing variables and hashes (see section 12).
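
As a sketch of the initialization use, the step below keeps a running sum of INDEX. The dataset name b_total and the variable total are made up for illustration; RETAIN stops SAS from resetting total to missing at the top of each iteration, and the _N_ = 1 test sets its starting value exactly once.

     data b_total;
        set b;
        retain total;
        if _N_ = 1 then total = 0;   /* runs on the first iteration only */
        total = total + INDEX;       /* carried forward from the previous iteration */
     run;

With B as above, total takes the values 1, 3, 6, and so on, ending at 55 on the last observation.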


Andre de Souza 2012-11-19