caveat


I am flattered that there are so many people who are interested in my data and that it is getting used in so many contexts, but I am also a little nervous about how it is being used. So, here are some things that you may want to consider before you use the data.

1. History

I started putting my datasets online in the early 1990s. At the time, I did not expect them to be used by anyone other than the students in my classes. In fact, for the first few years, the only datasets I had were for US companies and I had only a handful of data items: average betas by sector, averages of PE, Price to Book and EV to EBITDA multiples and a few dividend/debt ratio statistics. I did not provide the individual company data for download. The only data source I used was Value Line, which included data on 1700 US companies. Over the last decade, two things have happened that are interrelated.

2. Data Sources

I am dependent upon my data sources for my data, which gives rise to three issues.

3. Sector averages versus Individual company data

My focus has always been on the industry average data for two reasons. The first is that it is the data that I most use in valuations and corporate finance. The second is that you can get much more detailed individual company data from the company's own financial reports. I am not a data service (and I do not have the resources to be one) and the individual company data was never meant to be used as a research database. So, if it is missing items you wish it had, there is a reason.

4. Cross sectional versus Time Series Data

My primary objective each year is to provide updated sector averages for different statistics for that year. Thus, the 2023 update has the industry averages using the most recent market price data (end of 2021) and the most recent financial statements (for annual data, that may be 2021). I never intended to provide a time series of data. I do provide the archived data sets from prior years, and while I have tried to be consistent in my industry groupings and data definitions, the raw data sources have changed at least five times in the last 25 years, making comparisons dangerous. I also often change my views on how to compute a statistic, and when I do, I try to go back and change my historical numbers. Thus, do not be surprised, if you go back and look at the 2004 data, to see an average beta for the telecom services businesses that is different from the value you looked up in 2004.

5. Use for data

  1. I am a valuation/corporate finance person. When looking at a company, I am less interested in where it has been and more in where it is going. I look at data as raw material that I can use in making better estimates for the future. Consequently, this valuation mission drives how I come up with my numbers and what I report. For instance, when defining beta, my primary concern is to get as good a beta estimate as I can for the future, not to get the best estimate I can for the past. This remains the best use for my data.
  2. If you are interested in assessing the past (doing a post-mortem), you will be better served using a service that focuses on providing just that: historical time series information. You can try to string together my sector average datasets over time, but it may not serve your purposes as well.
  3. I do know that this data ends up in the legal arena more often than it should. If you are using my data from a prior year to back up your position or repudiate your opponent's in a court of law, please leave me (personally) out of that food fight. While I stand behind my data, I never intended it to be used for that purpose. In fact, I don't put much weight on two factors that the legal system values, precedent and consistency. Put differently, if I feel that I have been computing a ratio incorrectly for ten years, I have no qualms about changing the way I do it in year 11.
  4. Finally, this data was also never meant for public policy debates. In 2011, for instance, the New York Times used the tax rates that I had computed by sector to make a case that the tax code in the US was unfair. That may very well be, but I computed tax rates for a prosaic purpose, which is to value companies. It was not to make judgments on whether companies pay enough in taxes.

6. Description

I know that I have not been very good about providing enough background on how I compute some ratios (say, the return on invested capital), expecting users to be familiar with my writings or books. That is not fair and I will try to remedy it over time, by augmenting my variable description section and providing supporting YouTube videos for some of my datasets. It may take me a while to get it done. So, please have patience.

7. Fixing errors

Are there errors in the data? I am sure that there are, just as there are in any large dataset. Some of those errors may come from the data service and some are mine. If you do find an error, please let me know. Remember, though, that this site has a staff of one (me) and I may not get the fix done or get back to you as soon as you would like me to. But I promise that I will, sooner rather than later.