Use the fact that the datasets you use are already sorted, so that you can merge them or process them in by-groups without sorting them again. DSF and MSF are sorted by permno date. A splendid example of this is the code on the WRDS website which merges TAQ.CT with TAQ.CQ.
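A minimal sketch of such a merge, assuming a hypothetical dataset work.signals (variables permno, date, signal) that is itself already sorted by permno date. Because both inputs are in the same order, no proc sort is needed before the data step:

```sas
/* work.signals is a hypothetical dataset, assumed sorted by permno date. */
/* crsp.msf is already sorted by permno date, so it is read as is.        */
data merged;
    merge crsp.msf(in=inmsf keep=permno date ret)
          work.signals(in=insig);
    by permno date;
    if inmsf and insig;   /* keep only observations found in both inputs */
run;
```

If either input is not actually in permno date order, the data step stops with an error saying that the by variables are not properly sorted, so a wrong assumption is caught rather than silently producing a bad merge.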
Instead of sorting in order to merge, use hashes (Section 12), which do not require sorted input and are faster anyway.
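As a rough illustration of a hash merge, the sketch below assumes a hypothetical lookup dataset work.names with one row per permno and a variable ticker. The lookup table is loaded into memory once, and crsp.msf is then read in whatever order it happens to be in:

```sas
/* work.names (permno, ticker) is a hypothetical lookup table. */
data merged;
    /* Never executed; only defines permno and ticker in the program data vector. */
    if 0 then set work.names(keep=permno ticker);
    if _n_ = 1 then do;
        declare hash h(dataset: 'work.names');
        h.defineKey('permno');
        h.defineData('ticker');
        h.defineDone();
    end;
    set crsp.msf(keep=permno date ret);
    /* find() returns 0 when the key is found and then fills in ticker. */
    if h.find() = 0;
run;
```

This works best when the lookup table fits comfortably in memory; neither dataset has to be sorted.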
Instead of using proc means with a by statement, which requires the data to be sorted first, use a class statement. That is, instead of saying
    proc sort data=crsp.msf out=msf; by date; run;
    proc means data=msf noprint; by date; var ret; output out=summarystats(drop=_type_ _freq_) mean=meanret; run;
say
    proc means data=crsp.msf noprint nway; class date; var ret; output out=summarystats(drop=_type_ _freq_) mean=meanret; run;
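I replaced the ``by'' with a ``class'', which does not need sorted input. I also used the NWAY option, which restricts the output to the ``outermost level'', that is, to the observations with the highest _TYPE_ value, in which all CLASS variables are used together. Without it, SAS produces output for every combination of the CLASS variables, including the empty one. With the single CLASS variable here, that means one extra row (_TYPE_=0) holding the mean over all dates; with two CLASS variables, it also means rows for each variable taken on its own.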
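To see what the option does, here is a small sketch on a made-up dataset (toy, grp, yr, and x are illustrative names only) with two CLASS variables, run once without and once with NWAY:

```sas
/* Made-up data: two groups, two years, one value each. */
data toy;
    input grp $ yr x;
    datalines;
a 2001 1
a 2002 2
b 2001 3
b 2002 4
;
run;

/* Without NWAY: every combination of the CLASS variables is output.   */
/*   _TYPE_=0  overall mean           (1 row)                          */
/*   _TYPE_=1  by yr only             (2 rows)                         */
/*   _TYPE_=2  by grp only            (2 rows)                         */
/*   _TYPE_=3  by grp and yr together (4 rows)                         */
proc means data=toy noprint;
    class grp yr;
    var x;
    output out=allways mean=meanx;
run;

/* With NWAY: only the _TYPE_=3 rows are kept. */
proc means data=toy noprint nway;
    class grp yr;
    var x;
    output out=nwayonly mean=meanx;
run;
```

Printing allways and nwayonly side by side shows the difference: the first has 9 rows, the second only the 4 rows for the grp-yr cells.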