The DATA step is nothing but an implicit loop. The standard DATA step looks like this:
data a;
    set b;
    <do something>
run;
This reads dataset b one observation at a time, does something to that observation, and then outputs it to dataset a. The loop is implicit; in pseudocode, it might be made explicit by saying:
DO I = 1 TO ROWS(B);
    CURRENT_OBSERVATION = B[I,.];
    <do something to CURRENT_OBSERVATION>;
    OUTPUT A;
END;
Here A and B are treated as matrices, so that ROWS(B) and B[I,.] (by which I mean the Ith row of the matrix B, that is, the Ith observation) have meaning.
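As a rough illustration of the same idea in real SAS, the sketch below runs the implicit loop and then an explicit version of it side by side. The input values, the variable names x and y, the calculation y = x * 2, and the dataset name a2 are all made up for the example; the explicit version relies on the POINT= and NOBS= options of the SET statement, which is one way to write such a loop, not necessarily the only one.

```sas
/* Hypothetical input dataset b with three observations */
data b;
    input x;
    datalines;
1
2
3
;
run;

/* Implicit loop: the body of the step runs once per observation of b */
data a;
    set b;
    y = x * 2;          /* stands in for <do something> */
    put _N_= x= y=;     /* writes the iteration counter and values to the log */
run;

/* Roughly the same loop written out explicitly */
data a2;
    do i = 1 to n;              /* n is filled in by NOBS= at compile time */
        set b point=i nobs=n;   /* read observation number i directly */
        y = x * 2;
        output;                 /* write the current observation to a2 */
    end;
    stop;   /* needed: with POINT= the step never reaches end-of-file by itself */
run;
```

Both a and a2 should end up with the same three observations. In the first step the log shows _N_ counting 1, 2, 3, because SAS itself iterates the step; in the second step SAS runs the step only once and the DO loop does the iterating.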
Two things are important to know here. One is the role of the OUTPUT statement. The other is the role of the automatic counter _N_.