[an error occurred while processing this directive]
data dsf; set crsp.dsf(keep=permno date ret); where permno in(10001, 10002, 10003); run;
instead of
data dsf; set crsp.dsf(keep=permno date ret); if permno=10001 or permno=10002 or permno= 10003; run;
What happens when you do not know in advance what permnos you want to keep? Or if you do know, but the list is a headache to type out?
In that case, you can replace the contents of the where with a macro variable. Suppose some previous steps produce a dataset, named KEEPTHESE that contains the permnos you want to keep:
Obs PERMNO 1 10000 2 10006 3 10009 4 10021
Given this dataset, I want to subset CRSP.DSF, keeping only the permnos listed in KEEPTHESE. To do this, I first run the following data step, which creates several macro variables.
data _null_; set keepthese nobs=nobs; if _n_ =1 then call symput("nobs", nobs); call symput("permname"||trim(left(put(_N_,4.))), permno); run;The first line after the SET statement creates a macro variable named NOBS, which contains the number of observations in KEEPTHESE. The second line creates a series of macro variables named permname1, permname2, permname3, ... containing the permnos in the order in which they appeared in KEEPTHESE. Now inside a macro, I concatenate all these macro variables.
%macro justconcatenate; %let listofpermnos=&permname1; %do i =2 %to &nobs; %let listofpermnos =%sysfunc(catx(%str(,),&listofpermnos, "%trim(&&permname&i)")); %end; %let listofpermnos=(&listofpermnos); %mend; %justconcatenate;This macro just creates a macro variable named LISTOFPERMNOS whose value is set to the contents of the macro variable PERMNAME1. I then join all the remaining PERMNAMEs to this LISTOFPERMNOS with commas as delimiters.
Finally, I can use the macro variable so created:
data dsf; set crsp.dsf(keep=permno date ret); where permno in &listofpermnos; run;
This might seem like a lot of bother for nothing, but supposing you are work with a large dataset, like TAQ, and you need to keep observations with symbols in a dataset like KEEPTHESE. Sorting and merging is not an option. This method works well, as would hashes(see section 12). I have not compared this method against a hash.