1                                   The SAS System                  12:07 Thursday, September 22, 2011

NOTE: Copyright (c) 2002-2008 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software 9.2 (TS2M2)
      Licensed to NEW YORK UNIVERSITY - STERN SCHOOL OF BUSINESS-T/R, Site 70062232.
NOTE: This session is executing on the Linux 2.6.9-78.0.22.ELsmp (LINUX) platform.

You are running SAS 9. Some SAS 8 files will be automatically converted
by the V9 engine; others are incompatible.  Please see
http://support.sas.com/rnd/migration/planning/platform/64bit.html

PROC MIGRATE will preserve current SAS file attributes and is
recommended for converting all your SAS libraries from any
SAS 8 release to SAS 9.  For details and examples, please see
http://support.sas.com/rnd/migration/index.html

This message is contained in the SAS news file, and is presented upon
initialization.  Edit the file "news" in the "misc/base" directory to
display site-specific news and information in the program log.
The command line option "-nonews" will prevent this display.

NOTE: SAS initialization used:
      real time           0.87 seconds
      cpu time            0.03 seconds

1          *
2          ________________________________________________________________________________________________________________________
3
4          dpRegression.sas
5          Joel Hasbrouck
6          September 2011
7
8          This program runs on rnd.stern.nyu. It builds a dataset of transactions for one ticker symbol, computes signed trades,
9          and estimates the generalized Roll model.
10         _______________________________________________________________________________________________________________________
11         ;
12         options source nodate nocenter nonumber ps=max ls=110 orientation=landscape;
13         libname taq '/homedir/fin/fac/jhasbrou/public_html/ftp/phd2011Fall';
NOTE: Libref TAQ was successfully assigned as follows:
      Engine:        V9
      Physical Name: /homedir/fin/fac/jhasbrou/public_html/ftp/phd2011Fall
14         libname this '.';
NOTE: Libref THIS was successfully assigned as follows:
      Engine:        V9
      Physical Name: /homedir/fin/fac/jhasbrou/phd2011Fall
15
16         *__________________________________________________________________________________________________
16       ! _____________________
17
18         Subset a ticker symbol and print out a few records.
19         ___________________________________________________________________________________________________
19       ! _____________________;
20         data this.myTicker;
21             set taq.ctqall;
NOTE: Data file TAQ.CTQALL.DATA is in a format that is native to another host, or the file encoding does not
      match the session encoding. Cross Environment Data Access will be used, which might require additional
      CPU resources and might reduce performance.
22             * Next statement restricts the output to the symbol we want with nonmissing values for the bbid
22       ! and bofr
23             Outliers (with extremely high offers) and records after the 'normal' market close are removed.
24             'condFlag' flags trades with special condition codes.;
25             where symbol='ESSX' and bbid^=. and bofr^=. and bofr/bbid<5 and time<='16:10't and condflag^=1;
26         run;

NOTE: There were 4471 observations read from the data set TAQ.CTQALL.
      WHERE (symbol='ESSX') and (bbid not = .) and (bofr not = .) and ((bofr/bbid)<5) and
      (time<='16:10:00'T) and (condflag not = 1);
NOTE: The data set THIS.MYTICKER has 4471 observations and 14 variables.
NOTE: DATA statement used (Total process time):
      real time           2.19 seconds
      cpu time            1.22 seconds

27
28         *__________________________________________________________________________________________________
28       ! _____________________
29
30         Compute quote midpoints
31         ___________________________________________________________________________________________________
31       ! _____________________;
32         data quotes;
33             set this.myTicker;
34             by date time;
35             if last.time then do;
36                 if (bbid^=. and bofr^=.) then qMid=(bbid+bofr)/2;
37                 time = time+1;  * Sign with respect to prior second's quote;
38                 output;
39                 keep symbol date time bbid bofr qMid;
40             end;
41         run;

NOTE: There were 4471 observations read from the data set THIS.MYTICKER.
NOTE: The data set WORK.QUOTES has 3714 observations and 6 variables.
NOTE: DATA statement used (Total process time):
      real time           0.07 seconds
      cpu time            0.01 seconds

42
43         *__________________________________________________________________________________________________
43       ! _____________________
44
45         Merge back in & sign trades.
46         ___________________________________________________________________________________________________
46       ! _____________________;
47         data qt;
48             merge this.myTicker (where=(price^=.) keep=symbol date time price size)
49                 quotes (keep=symbol date time qMid);
50             by symbol date time;
51             retain qMidPrevailing tradeSeqno 0;
52             if (qMid^=.) then qMidPrevailing=qMid;
53             if price^=. then do;
54                 q = sign(price-qMidPrevailing);
55                 tradeSeqno = tradeSeqno+1;
56                 output;
57                 drop qMid;
58             end;
59         run;

NOTE: There were 1715 observations read from the data set THIS.MYTICKER.
      WHERE price not = .;
NOTE: There were 3714 observations read from the data set WORK.QUOTES.
NOTE: The data set WORK.QT has 1715 observations and 8 variables.
NOTE: DATA statement used (Total process time):
      real time           0.26 seconds
      cpu time            0.02 seconds

60
61         *__________________________________________________________________________________________________
61       !
_____________________
62
63         Compute first differences
64         ;
65         data qt;
66             set qt;
67             by date;
68             dp = dif(price);
69             dq = dif(q);
70             if first.date then do;
71                 dp=.;
72                 dq=.;
73             end;
74         run;

NOTE: There were 1715 observations read from the data set WORK.QT.
NOTE: The data set WORK.QT has 1715 observations and 10 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

75         proc print data=qt (obs=50);
76             title "qt";
77         run;

NOTE: There were 50 observations read from the data set WORK.QT.
NOTE: The PROCEDURE PRINT printed page 1.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.93 seconds
      cpu time            0.03 seconds

78
79         %let nImpulse=10;
80         *__________________________________________________________________________________________________
80       ! _
81
82         Estimate the generalized Roll model.
83         ___________________________________________________________________________________________________
83       ! _;
84         title "Generalized Roll model";
85         proc model data=qt outmodel=var;
86             Dp = c*dif(q) + lambda*q;
87
87       !     fit Dp;
88         run;

NOTE: At OLS Iteration 1 CONVERGE=0.001 Criteria Met.
89         quit;
NOTE: The program 'MODEL' was written to data set WORK.VAR.
NOTE: Starting in SAS 9.2 model files are being saved as XML-based data sets. To change this behavior, use
      the SAS global option 'CMPMODEL'. Currently the 'CMPMODEL' option is set to write BOTH formats, but
      only the new format will be used in future releases.
NOTE: The PROCEDURE MODEL printed pages 2-4.
NOTE: PROCEDURE MODEL used (Total process time):
      real time           0.57 seconds
      cpu time            0.05 seconds

90         *__________________________________________________________________________________________________
90       ! _
91
92         To compute impulse response function (IRF), first build a dataset with 'innovation' values.
93         The one below looks at the impact of a one unit buy order (q=+1).
94         ___________________________________________________________________________________________________
94       !
_;
95         title2 'IRF calculations based on an initial one-unit buy order (q=+1)';
96         data u;
97             do t=-2 to -1;
98                 q=0; Dp=0; output;
99             end;
100            q=1; Dp=.; output;
101            do t=1 to &nImpulse;
102                q=0;
103                Dp=.;
104                output;
105            end;
106        run;

NOTE: The data set WORK.U has 13 observations and 3 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

107        proc print data=u noobs;
108        run;

NOTE: There were 13 observations read from the data set WORK.U.
NOTE: The PROCEDURE PRINT printed page 5.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

109        *__________________________________________________________________________________________________
109      ! _
110
111        Now 'solve' the model with 'innovation' values...
112        ___________________________________________________________________________________________________
112      ! _;
113        proc model model=var;
NOTE: The program was read from the file WORK.var.MODEL.
114
114      !     solve / data=u out=irf (keep=t Dp q) forecast;
115            id t;
116        run;

NOTE: The data set WORK.IRF has 12 observations and 3 variables.
117        *__________________________________________________________________________________________________
117      ! _
118
119        ... and cumulate them.
120        ___________________________________________________________________________________________________
120      ! _;

NOTE: The PROCEDURE MODEL printed pages 6-7.
NOTE: PROCEDURE MODEL used (Total process time):
      real time           0.04 seconds
      cpu time            0.01 seconds

121        data cirf;
122            cumDp=0;
123            cumq=0;
124            do until (eof);
125                set irf end=eof;
126                cumDp = cumDp+Dp;
127                cumq = cumq+q;
128                output;
129            end;

NOTE: There were 12 observations read from the data set WORK.IRF.
NOTE: The data set WORK.CIRF has 12 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

130        proc print data=cirf noobs;
131        run;

NOTE: There were 12 observations read from the data set WORK.CIRF.
NOTE: The PROCEDURE PRINT printed page 8.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

132
133        *__________________________________________________________________________________________________
133      ! _
134
135        Estimate the generalized Roll model WITH autocorrelated q(t)
136        ___________________________________________________________________________________________________
136      ! _;
137        title "Generalized Roll model with autocorrelated q(t)";
138        proc model data=qt outmodel=var;
139            Dp = c*dif(q) + lambda*q;
140            q = a*lag1(q);
141
141      !     fit Dp q / covs;
142        run;

NOTE: At OLS Iteration 1 CONVERGE=0.001 Criteria Met.
143        quit;
NOTE: The program 'MODEL' was written to data set WORK.VAR.
NOTE: Starting in SAS 9.2 model files are being saved as XML-based data sets. To change this behavior, use
      the SAS global option 'CMPMODEL'. Currently the 'CMPMODEL' option is set to write BOTH formats, but
      only the new format will be used in future releases.
NOTE: The PROCEDURE MODEL printed pages 9-11.
NOTE: PROCEDURE MODEL used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds

144
145        *__________________________________________________________________________________________________
145      ! _
146
147        To compute IRF, first build a dataset with 'innovation' values.
148        The one below looks at the impact of a one unit buy order (q=+1).
149        ___________________________________________________________________________________________________
149      !
_;
150        title2 'IRF calculations based on an initial one-unit buy order (q=+1)';
151        data u;
152            do t=-2 to -1;
153                q=0; Dp=0; output;
154            end;
155            q=1; Dp=.; output;
156            do t=1 to &nImpulse;
157                q=.;
158                Dp=.;
159                output;
160            end;
161        run;

NOTE: The data set WORK.U has 13 observations and 3 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

162        proc print data=u noobs;
163        run;

NOTE: There were 13 observations read from the data set WORK.U.
NOTE: The PROCEDURE PRINT printed page 12.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

164        *__________________________________________________________________________________________________
164      ! _
165
166        Now 'solve' the model with 'innovation' values...
167        ___________________________________________________________________________________________________
167      ! _;
168        proc model model=var;
NOTE: The program was read from the file WORK.var.MODEL.
169
169      !     solve / data=u out=irf (keep=t Dp q) forecast;
170            id t;
171        run;

NOTE: The data set WORK.IRF has 12 observations and 3 variables.
NOTE: The PROCEDURE MODEL printed pages 13-14.
NOTE: PROCEDURE MODEL used (Total process time):
      real time           0.28 seconds
      cpu time            0.03 seconds

172        proc print data=irf noobs;
173        run;

NOTE: There were 12 observations read from the data set WORK.IRF.
NOTE: The PROCEDURE PRINT printed page 15.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

174        *__________________________________________________________________________________________________
174      ! _
175
176        ... and cumulate them.
177        ___________________________________________________________________________________________________
177      ! _;
178        data cirf;
179            cumDp=0;
180            cumq=0;
181            do until (eof);
182                set irf end=eof;
183                cumDp = cumDp+Dp;
184                cumq = cumq+q;
185                output;
186            end;
187

NOTE: There were 12 observations read from the data set WORK.IRF.
NOTE: The data set WORK.CIRF has 12 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

188        proc print data=cirf noobs;
189        run;

NOTE: There were 12 observations read from the data set WORK.CIRF.
NOTE: The PROCEDURE PRINT printed page 16.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
NOTE: The SAS System used:
      real time           5.56 seconds
      cpu time            1.43 seconds
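
Annotation (not part of the SAS output): the log traces impulse responses numerically with PROC MODEL's SOLVE statement and then cumulates them. As a hand check, the same paths can probably be written in closed form. The sketch below rests on assumptions that the log itself does not state: the shock is q_0 = 1 with q_t = 0 for t < 0, all residuals are set to zero, and in the second model q evolves as q_t = a q_{t-1} for t >= 1. Here \Delta p_t and q_t stand for the program's Dp and q, and c, \lambda, a are the fitted parameters, whose estimates appear in the procedure output rather than in this log.

```latex
% Assumes amsmath. Shock path: q_0 = 1, q_t = 0 for t < 0, residuals set to zero.
\begin{align*}
\Delta p_t &= c\,(q_t - q_{t-1}) + \lambda\, q_t
  && \text{(price equation in both models)} \\[4pt]
\intertext{Model 1, with $q_t = 0$ for $t \ge 1$:}
\Delta p_0 &= c + \lambda, \qquad \Delta p_1 = -c, \qquad \Delta p_t = 0 \ \ (t \ge 2), \\
\sum_{s=0}^{t} \Delta p_s &= \lambda \qquad (t \ge 1). \\[4pt]
\intertext{Model 2, with $q_t = a\,q_{t-1}$, so that $q_t = a^{t}$:}
\Delta p_0 &= c + \lambda, \qquad
\Delta p_t = c\,(a^{t} - a^{t-1}) + \lambda\, a^{t} \ \ (t \ge 1), \\
\sum_{s=0}^{t} \Delta p_s &= c\,a^{t} + \lambda\,\frac{1 - a^{t+1}}{1 - a}
  \;\longrightarrow\; \frac{\lambda}{1 - a} \quad \text{as } t \to \infty,\ |a| < 1.
\end{align*}
```

Under these assumptions, the cumDp column printed from WORK.CIRF should settle at \lambda in the first model and approach \lambda/(1-a) in the second. The transient term c a^t is the bid-ask bounce component dying out.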