Price Discovery in High Resolution

Last updated: Sunday, November 25, 2018, 5:08 PM.

These pages contain documents, programs, and program outputs related to my paper "Price Discovery in High Resolution".

Basic documents

Programs, outputs and listings

These are contained in a zipped directory HRVAR_V01.zip. The table at the end of this page summarizes the programs and maps them to tables and figures in the paper.

There are three program subdirectories, organized by source language:

Mathematica is used only for the computational appendix. The SAS programs are used mainly to extract the TAQ data (from WRDS). Most of the estimation code is written in Matlab.

Notes on the SAS programs

The first extraction program (taqExtract01.sas) runs on WRDS. It extracts the consolidated trade and quote data for the sample and builds the various BBO (best bid and offer) series used in the paper. The second program (taqExtract02.sas) runs locally and converts the SAS datasets produced by taqExtract01 to csv files, which can be read directly into Matlab. The output from the extraction programs is placed in the wrdsSASdatasets directory. As the data are proprietary, I cannot supply these files.
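For readers unfamiliar with the term, the following is a minimal sketch of what building a BBO series involves: keep the latest quote from each exchange and report the highest bid and lowest ask whenever that pair changes. It is written in Python for illustration only and is not the logic of taqExtract01.sas. The function name, the tuple layout, and the use of integer prices in cents are all assumptions, and the sketch ignores quote sizes, quote conditions, and withdrawn quotes.

```python
# Hypothetical illustration of a BBO (best bid and offer) series.
# This is NOT the logic of taqExtract01.sas; names and data layout are assumed.

def bbo_series(quotes):
    """Build a BBO series from time-ordered quote updates.

    quotes: iterable of (timestamp, exchange, bid, ask) tuples.
    Returns a list of (timestamp, best_bid, best_ask), with one entry
    each time the best bid or best ask changes.
    """
    latest = {}   # exchange -> (bid, ask), the most recent quote per exchange
    series = []
    previous = None
    for timestamp, exchange, bid, ask in quotes:
        latest[exchange] = (bid, ask)
        best_bid = max(b for b, _ in latest.values())
        best_ask = min(a for _, a in latest.values())
        if (best_bid, best_ask) != previous:
            series.append((timestamp, best_bid, best_ask))
            previous = (best_bid, best_ask)
    return series


# Made-up quotes: timestamps in microseconds, prices in cents.
example_quotes = [
    (1, "N", 10000, 10005),
    (2, "Q", 10001, 10006),
    (3, "N", 9999, 10004),
    (4, "Q", 10001, 10006),   # repeats Q's quote, so the BBO does not change
]
example_bbo = bbo_series(example_quotes)
```

In this made-up example the fourth update leaves the BBO unchanged, so the series has three entries rather than four.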

Notes on the Matlab programs

The main programs and scripts are in /MatlabNovember2018. The /mFiles subdirectory contains most functions; the /mClasses subdirectory contains the class definitions. Program names that start with "MVARi" are computational. An MVARi object (Microstructure Vector Autoregression, version i) describes a VAR/VECM and serves as a container for estimation results. MVARi methods set up the VAR/VECM, build the cross-product matrix, compute impulse response functions, and so forth.
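As a point of reference for the impulse response step, here is a minimal sketch of the standard textbook recursion for a reduced-form VAR(p): Psi_0 = I and Psi_k = A_1 Psi_{k-1} + ... + A_p Psi_{k-p}, with terms dropped when k - j < 0. It is written in plain Python rather than Matlab, and it is not the MVARi implementation; the function names and the list-of-lists matrix layout are assumptions made for the example.

```python
# Hypothetical illustration of the textbook VAR impulse response recursion.
# This is NOT the MVARi code; all names here are assumed for the example.

def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def mat_add(a, b):
    """Add two matrices of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def var_irf(coefs, horizon):
    """Return [Psi_0, ..., Psi_horizon] for a VAR with coefficients [A_1, ..., A_p]."""
    n = len(coefs[0])
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    psi = [identity]
    for k in range(1, horizon + 1):
        acc = [[0.0] * n for _ in range(n)]
        for lag, a_lag in enumerate(coefs, start=1):
            if lag <= k:
                acc = mat_add(acc, mat_mul(a_lag, psi[k - lag]))
        psi.append(acc)
    return psi


# Made-up VAR(1) coefficient matrix, for illustration only.
a1 = [[0.5, 0.0],
      [0.25, 0.5]]
irf = var_irf([a1], horizon=2)
```

For a VAR(1) the recursion reduces to matrix powers, so irf[2] equals a1 multiplied by itself. These are responses to reduced-form innovations; any orthogonalization or VECM-specific treatment is outside this sketch.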

There are three sorts of analyses: "PartSIP" (participant vs. securities information processor time stamps); "LexExl" (bids and asks from listing and non-listing exchanges); and "DarkLit" (quotes, lit trades, and dark trades).

Each analysis has three numerical suffixes, for example MVARiPartSIPxx, where xx is 01, 02, or 03. The 01 analyses cover IBM and NVDA for October 3, 2016, at resolutions ranging from one second down to 10 microseconds; the 02 analyses cover October 3 through November 11 at a 10-microsecond resolution; the 03 analyses are low-order, multiple-resolution estimations that are subsequently combined into bridged IRF analyses.
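To give a sense of scale for these resolutions, the sketch below counts the time intervals in one session and maps event time stamps onto a time grid, storing only the intervals in which something happens. It is a Python illustration, not the author's code. The 6.5-hour (23,400-second) session length is an assumption about the sample window, and the function names and dictionary storage are hypothetical.

```python
# Hypothetical illustration of time grids at different resolutions.
# The 23,400-second session length is an assumption, not taken from the paper.

US_PER_SECOND = 1_000_000
SESSION_SECONDS = 23_400   # assumed 6.5-hour session


def n_intervals(session_seconds, resolution_us):
    """Number of grid intervals in a session at the given resolution (microseconds)."""
    return session_seconds * US_PER_SECOND // resolution_us


def to_sparse(events, start_us, resolution_us):
    """Map (timestamp_us, value) events to {interval_index: last value in that interval}."""
    sparse = {}
    for timestamp_us, value in events:
        sparse[(timestamp_us - start_us) // resolution_us] = value
    return sparse


intervals_1s = n_intervals(SESSION_SECONDS, 1_000_000)
intervals_10us = n_intervals(SESSION_SECONDS, 10)

# Made-up events: microseconds after the session start, prices in cents.
events = [(5, 10000), (9, 10001), (25, 10002)]
grid = to_sparse(events, start_us=0, resolution_us=10)
```

Under the assumed session length, the one-second grid has 23,400 intervals and the 10-microsecond grid has 2,340,000,000, nearly all of them empty for any single series, which is why storing only the non-empty intervals is attractive at the finest resolutions.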

The production runs are executed as batch array jobs (one for each symbol/date) on NYU's High Performance Computing (HPC) system. Each job produces three files: a listing file that contains the Matlab diary (a copy of the screen output) for the run; an "out" file that contains the log; and a "mat" file that contains the saved results of the run. Most of these files are downloaded to the /HPC directory. Program names that start with "summary" summarize the HPC runs and produce figures.
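The one-job-per-symbol/date layout can be sketched as a simple enumeration in which each array task index selects one (symbol, date) pair. The Python below is an illustration only and does not reproduce the actual batch scripts. The symbol list, the year 2016 for the full date range, the weekday-only calendar with no holiday adjustment, and the function names are all assumptions.

```python
# Hypothetical illustration of mapping array task indices to symbol/date pairs.
# Symbols, year, and the weekday-only calendar are assumptions for this sketch.
from datetime import date, timedelta


def weekdays(first, last):
    """All Monday-to-Friday dates from first to last, inclusive (no holiday calendar)."""
    days = []
    current = first
    while current <= last:
        if current.weekday() < 5:
            days.append(current)
        current += timedelta(days=1)
    return days


def array_tasks(symbols, dates):
    """One task per symbol/date pair; the list position plays the role of the array index."""
    return [(symbol, day) for symbol in symbols for day in dates]


dates = weekdays(date(2016, 10, 3), date(2016, 11, 11))
tasks = array_tasks(["IBM", "NVDA"], dates)
symbol, day = tasks[0]   # the pair a job with array index 0 would process
```

With these assumptions the date range contains 30 weekdays, so two symbols give 60 tasks. A real job would read its index from the scheduler and use it to select the input files for that symbol and date.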

Also note:

Notes on the live script files

In addition to the production runs, there are two Matlab mlx files. These are "live script" files. They document the essential classes and computations in the paper, and also contain executable Matlab code:

Program cross-reference table:

Analysis | Description | Language | Program 1 | Program 2 | Result Directory | Output Files | Output Figures | Paper Tables | Paper Figures
PartSIP | Resolutions 1s-10us | Matlab | MVARiPartSIP01.m | summaryPartSIP01.m | summary01 | summaryPartSIP01.xlsx | summaryPartSIP01 IBM/NVDA.fig/jpg | 5A | 1
LexExl | Resolutions 1s-10us | Matlab | MVARiLexExl01.m | summaryLexExl01.m | summary01 | summaryLexExl01.xlsx | summaryLexExl01 IBM/NVDA.fig/jpg | 6A | 2
DarkLit | Resolutions 1s-10us | Matlab | MVARiDarkLit01.m | summaryDarkLit01.m | summary01 | summaryDarkLit01.xlsx | summaryDarkLit01 IBM/NVDA.fig/jpg | 7A | 3
PartSIP | Event time | Matlab | MVARiPartSIP01ET.m | MVARi01ETStats.m | summary01 | allStatsET.txt | | 5A,B |
LexExl | Event time | Matlab | MVARiLexExl01ET.m | MVARi01ETStats.m | summary01 | allStatsET.txt | | 6A,B |
DarkLit | Event time | Matlab | MVARiDarkLit01ET.m | MVARi01ETStats.m | summary01 | allStatsET.txt | | 7A,B |
PartSIP | 10us res; 30 days | Matlab | MVARiPartSIP02.m | MVARi02Stats.m | summary02 | allStats.txt | | 5B |
LexExl | 10us res; 30 days | Matlab | MVARiLexExl02.m | MVARi02Stats.m | summary02 | allStats.txt | | 6B |
DarkLit | 10us res; 30 days | Matlab | MVARiDarkLit02.m | MVARi02Stats.m | summary02 | allStats.txt | | 7B |
PartSIP | Bridged info shares | Matlab | MVARiPartSIP03.m | summaryPartSIP03.m | summary03 | summaryPartSIP03.txt | summaryPartSIP03 IBM/NVDA.fig/jpg | 9A | 5
LexExl | Bridged info shares | Matlab | MVARiLexExl03.m | summaryLexExl03.m | summary03 | summaryLexExl03.txt | summaryLexExl03 IBM/NVDA.fig/jpg | 9B |
DarkLit | Bridged info shares | Matlab | MVARiDarkLit03.m | summaryDarkLit03.m | summary03 | summaryDarkLit03.txt | summaryDarkLit03 IBM/NVDA.fig/jpg | 9C |
 | WRDS TAQ extraction | SAS | taqExtract01.sas | | | | | |
 | SAS datasets to csv | SAS | taqExtract02.sas | | | | | |
 | Descriptive statistics | SAS | DescStats01.sas | | odsDescStats | | tables234.rtf/docx | 2,3,4 |
 | Bridged models | SAS | varsim04.sas | | odsVarsim04 | varsim04_1.rtf/docx | varsim04_1.rtf/docx | 8 | 4
 | VECM examples | MLX | VECM_demo.mlx | | | VECM_demo.pdf | | |
 | Sparse vector examples | MLX | MVARClassesDemo.mlx | | | MVARClassesDemo.pdf | | |