For instance, suppose I wanted to find the mean return for each permno in the MSF file. I could do this in two ways.
The obvious way is:
    proc means data=crsp.msf noprint;
        by permno;
        var ret;
        output out=means mean=meanret;
    run;

The DOW-loop way is:

    data means;
        /* these run once per DATA step iteration, i.e. once per permno */
        count=0;
        sum=0;
        do until (last.permno);
            set crsp.msf;
            where ret is not missing;
            by permno;
            sum=sum+ret;
            count=count+1;
        end;
        /* the loop has exited, so the whole permno has been read */
        meanret=sum/count;
        output means;
    run;
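What's going on here? Briefly, each pass through the data step handles one permno. count and sum are reset to zero, then the do-loop reads that permno's non-missing returns one observation at a time, accumulating them, and exits when the last observation for that permno has been reached. When this happens, meanret is calculated and output, so the means dataset ends up with one observation per permno.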
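If you don't have the MSF file to hand, a minimal sketch along the following lines lets you watch the loop at work. The dataset names toy and toy_means and all of the values in them are made up for illustration; only the loop itself is taken from the example above, with a keep= option added so that just permno and meanret are written out.

```sas
/* Hypothetical toy data standing in for crsp.msf; the values are invented. */
data toy;
    input permno ret;
    datalines;
10001 0.10
10001 0.20
10001 .
10002 -0.05
10002 0.15
;
run;

/* The same DOW-loop as above, pointed at the toy data. */
data toy_means (keep=permno meanret);
    count=0;                      /* reset for each permno */
    sum=0;
    do until (last.permno);
        set toy;
        where ret is not missing; /* drops the third row of permno 10001 */
        by permno;                /* toy is already sorted by permno */
        sum=sum+ret;
        count=count+1;
    end;
    meanret=sum/count;
    output;
run;

proc print data=toy_means noobs;
run;
```

The printed dataset should have two observations, with meanret equal to 0.15 for permno 10001 and 0.05 for permno 10002. Note that a permno whose returns are all missing would be dropped by the where statement and so would not appear in the output at all.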
This might seem like a fancy construct that has no obvious practical use, but its utility is enormous, particularly when you want to merge two datasets and then calculate summary statistics on the merged dataset, as will be explained in section 12.5.
For further reading, see Dorfman and Shajenko (2007).