%let startdate= '31Jan1980'd;
%let enddate= '31Jan2000'd;
%macro justaloop;
%let startdate=%sysfunc(putn(&startdate, 8.));
%let enddate=%sysfunc(putn(&enddate, 8.));
%let date=&startdate;
%do %while(&date <= &enddate);
data tempdsf;
set crsp.dsf;
where intnx('month', date, 1)-1 = &date;
run;
<code to calculate correlations>
%let date=%eval(%sysfunc(intnx(month, &date+1, 1))-1);
%end;
%mend;
%justaloop;
This is inefficient because every iteration scans all of DSF to extract a single subset. Instead, sort DSF by date once and build an index that records where each date's observations begin and end. Both steps are simple:
proc sort data=crsp.dsf out=dsf;
where date>=&startdate and date<=&enddate;
by date;
run;
data dsfindex(keep=date firstobs lastobs);
set dsf;
by date;
retain firstobs;
if first.date then do;
firstobs=_n_;
end;
if last.date then do;
lastobs=_n_;
output;
end;
run;
DSFINDEX now contains, for each date, the observation numbers at which the observations for that date begin and end.
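To see what the index looks like, the same DATA step can be run on a few made-up rows. This is only an illustrative sketch: TOY and TOYINDEX are hypothetical names and the values are invented, standing in for the sorted DSF.

```sas
/* Hypothetical toy data standing in for the date-sorted DSF */
data toy;
input date :date9. ret;
format date date9.;
datalines;
02Jan1980 0.01
02Jan1980 -0.02
03Jan1980 0.00
03Jan1980 0.03
03Jan1980 0.01
04Jan1980 -0.01
;
run;

/* Same logic as the DSFINDEX step: one output row per date */
data toyindex(keep=date firstobs lastobs);
set toy;
by date;
retain firstobs;
if first.date then firstobs=_n_;
if last.date then do;
lastobs=_n_;
output;
end;
run;

proc print data=toyindex noobs;
run;
/* Prints:
   02JAN1980  firstobs=1  lastobs=2
   03JAN1980  firstobs=3  lastobs=5
   04JAN1980  firstobs=6  lastobs=6 */
```

Each row of the index gives the contiguous block of observation numbers that belongs to one date, which is only meaningful because the data are sorted by date.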
Within the loop, you then look up the observation range for the current date and read only that range:
data _null_;
set dsfindex;
where date=&date;
/* symputx stores the numbers without leading blanks or conversion notes */
call symputx('firstobs', firstobs);
call symputx('lastobs', lastobs);
run;
data tempdsf;
set dsf(firstobs=&firstobs obs=&lastobs);
run;
The DATA _NULL_ statement just means that I am not creating a new dataset when I subset DSFINDEX. The only reason I am opening DSFINDEX is to get the values of FIRSTOBS and LASTOBS into the corresponding macro variables.
I then subset DSF to get TEMPDSF, keeping only the observations numbered from &FIRSTOBS through &LASTOBS. Note that the OBS= data set option specifies the number of the last observation to read, not how many observations to read.
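The meaning of OBS= is easy to check on a small dataset. The sketch below uses SASHELP.CLASS, a sample dataset that ships with SAS; SUBSET is a hypothetical output name.

```sas
/* SASHELP.CLASS ships with SAS and has 19 observations */
data subset;
set sashelp.class(firstobs=3 obs=5);
run;
/* SUBSET holds observations 3, 4 and 5 of SASHELP.CLASS: three rows, not five */
```

The number of observations read is therefore OBS minus FIRSTOBS plus one.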
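TEMPDSF now contains all the observations for which DATE equals &DATE. The savings are substantial: SAS reads only observations &FIRSTOBS through &LASTOBS instead of testing a WHERE condition against every row of DSF, so each subset takes very little time.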
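The lookup assumes that DSFINDEX has a row for &DATE. If it does not (for example, when a month-end falls on a weekend and so is not a trading day), the DATA _NULL_ step reads no observations and FIRSTOBS and LASTOBS keep whatever values they had before. One way to guard against this is sketched below. The macro name GETSUBSET is hypothetical and not part of the original code; it assumes DSF and DSFINDEX exist as built above.

```sas
/* Hypothetical helper macro; assumes DSF and DSFINDEX exist as built above */
%macro getsubset(date);
/* Declare the range variables locally so they start out empty on every call */
%local firstobs lastobs;
data _null_;
set dsfindex;
where date=&date;
call symputx('firstobs', firstobs);
call symputx('lastobs', lastobs);
run;
/* Only read DSF if the lookup actually found a row for this date */
%if %length(&firstobs) > 0 %then %do;
data tempdsf;
set dsf(firstobs=&firstobs obs=&lastobs);
run;
%end;
%else %put NOTE: no observations found for date &date, skipping.;
%mend getsubset;
```

Inside the loop this would be called as %getsubset(&date). When no row is found, TEMPDSF is left over from the previous iteration, so the correlation code should be skipped for that date as well.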
The complete code thus reads:
%let startdate= '31Jan1980'd;
%let enddate= '31Jan2000'd;
proc sort data=crsp.dsf out=dsf;
where date>=&startdate and date<=&enddate;
by date;
run;
data dsfindex(keep=date firstobs lastobs);
set dsf;
by date;
retain firstobs;
if first.date then do;
firstobs=_n_;
end;
if last.date then do;
lastobs=_n_;
output;
end;
run;
%macro justaloop;
%let startdate=%sysfunc(putn(&startdate, 8.));
%let enddate=%sysfunc(putn(&enddate, 8.));
%let date=&startdate;
%do %while(&date <= &enddate);
data _null_;
set dsfindex;
where date=&date;
/* symputx stores the numbers without leading blanks or conversion notes */
call symputx('firstobs', firstobs);
call symputx('lastobs', lastobs);
run;
data tempdsf;
set dsf(firstobs=&firstobs obs=&lastobs);
run;
<code to calculate correlations>
%let date=%eval(%sysfunc(intnx(month, &date+1, 1))-1);
%end;
%mend;
%justaloop;
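One difference from the first loop is worth noting. There, the WHERE clause kept every observation in the month ending on &DATE, whereas DSFINDEX is keyed on individual dates. If whole-month subsets are what the correlation code needs, one possible adaptation is to collapse DSFINDEX to one row per month. Because DSF is sorted by date, each month occupies a contiguous block of observations. MONTHINDEX and MONTHEND below are hypothetical names, not part of the original code.

```sas
/* Assumes DSFINDEX was built from the date-sorted DSF, as above */
proc sql;
create table monthindex as
select intnx('month', date, 0, 'end') as monthend format=date9.,
       min(firstobs) as firstobs,   /* first observation of the month */
       max(lastobs) as lastobs      /* last observation of the month */
from dsfindex
group by calculated monthend;
quit;
```

The lookup inside the loop would then read MONTHINDEX with WHERE MONTHEND=&DATE in place of DSFINDEX with WHERE DATE=&DATE. For the first month to be complete, the WHERE clause in PROC SORT would also have to start at the first day of that month rather than at &STARTDATE.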