




This is a course of webcasts (about 15-30 minutes apiece), designed to introduce you to the basics of statistics, primarily as practiced in finance and investing. As with my accounting class, I will start with the open disclosure that my knowledge of statistics is limited to what I use on a regular basis, and that I have no interest (or expertise) in delving into the depths of statistical theory. The class webcasts are right below, followed by links to the statistical tools that I find useful, and readings on each topic. With each session, I also have a post-class test and solution, some more involved than others, testing your grasp of the material in the session.

Class Webcasts


Session Webcast

Short Description

Supplementary Material


Class Preview

Provides an introduction to my version of a statistics class, including a lead-in as to why I think it matters in finance.

  1. Slides


Statistics 101: What and Why?

Lay the groundwork for what statistics covers, and why the need for understanding it is greater than ever before.

  1. Slides
  2. Post-class test & solution


Sampling: Lead In

Examine why we use samples instead of looking at populations, and the perils of sampling bias and error.

  1. Slides
  2. Post-class test & solution


Sampling: Applications

Stock indices as samples, albeit non-random, and use of sampling to test investment strategies and effects of events on stock prices.

  1. Slides
  2. Post-class test & solution


Data Descriptives: Lead In

Describe the metrics that we use to describe data, as well as how and why we use them.

  1. Slides
  2. Post-class test & solution


Data Descriptives: Applications

Compute data descriptives for a variety of data in finance, from asset returns over time to PE ratios, costs of capital and dividend yields across companies.

  1. Slides
  2. Post-class test & solution


Data Distributions: Lead In

Examine how we pick distributions to describe data, and the advantages of doing so.

  1. Slides
  2. Post-class test & solution


Data Distributions: Applications

Use stock price and return data to test for normality in distributions.

  1. Slides
  2. Post-class test & solution


Data Relationships: Lead In

Describe how to measure relationships between two or more variables, and how those measurements can be used in prediction/analysis.

  1. Slides
  2. Post-class test & solution


Data Relationships: Applications

Look at both micro and macro examples of regression uses in finance and investing.

  1. Slides
  2. Post-class test & solution


Data Relationships: More Applications

Describe the process of building a multiple regression, checking for significance and non-linearities, with finance examples.

  1. Slides
  2. Post-class test & solution


Probabilities: Lead In

Lay out the rules on estimating probabilities, and describe probabilistic techniques that can be used in analysis (decision trees, probit/logit, scenario analysis).

  1. Slides
  2. Post-class test & solution


Probability Tools: Applications

Provide examples of decision trees, simulation & probability estimation in finance.

  1. Slides
  2. Post-class test & solution


Simulations: Applications

Look at the use of Monte Carlo simulation in the valuation of a company.

  1. Slides
  2. Post-class test & solution

Statistical Tools

| Tool | My comments | How to get |
| --- | --- | --- |
| Microsoft Excel | As an Apple loyalist, even I have to admit the world runs on Microsoft Office, and I am no exception. Microsoft Excel is my workhorse for collecting data, and its Analysis ToolPak is surprisingly versatile. I can do most of what I want with its pre-set functions, and if I were truly an Excel ninja (which I am not), almost everything. | Subscribe to Office 365 |
| StatPlus | There are many statistical add-ons to Excel, but my favorite (perhaps because it is one of the few that has expended resources to build and maintain a Mac version) is StatPlus. If you are working with a limited budget, it should do the trick for you. | StatPlus Home |
| Wizard Pro | I find this data analysis program delightfully intuitive, both for creating pivot tables and for making sense of very large datasets. It is often my first stop, before I go on to StatPlus and SPSS to do statistical analysis. | Wizard |
| SPSS | There are many standalone pure statistics packages, and many predate personal computers. I used SPSS on mainframe computers when I first started looking at data, and it is that loyalty, and the fact that SPSS has a Mac version, that keep me in its corner. That said, it is overkill for almost everything I do, containing powers that I not only have never used, but would not know how to use. | IBM SPSS site |
| Crystal Ball | This is my go-to program for simulations. As an add-on to Excel, the learning curve is not steep, and it comes with an impressive array of choices for distributions. I have also heard good things about @Risk, and having seen output from it, it seems to be very similar to Crystal Ball. A significant caveat is that neither program works on a Mac, and I have to do unpleasant (for me) end runs to get around that limitation, including turning my Mac into a PC (a skin-crawling exercise) using Parallels Desktop. | Oracle Crystal Ball |


While all these readings were accessible at the time that I put together this list, a few came with restrictions (Scientific American allows you a maximum of five free articles, and I have four on my list). It is also a reality that more and more of these readings will end up behind paywalls. The readings still remain worthwhile, but not if you have to pay significant amounts to subscribe to a publication for a whole year to read them.

Session Readings
Books on Statistics/Numbers/Data Analysis
  1. How to lie with statistics, Darrell Huff, W.W. Norton
  2. Moneyball: The Art of Winning an Unfair Game, Michael Lewis, W.W. Norton.
  3. The Signal and the Noise: Why so many predictions fail, Nate Silver, Penguin Books.
  4. Innumeracy, John Allen Paulos, Holt McDougal
Session 1: General Statistics
  1. Data overload is real, Sheryl Estrada, Fortune
  2. How to lie with statistics, Peter Corning, Psychology Today
  3. How politicians poisoned statistics, Tim Harford, Financial Times
Sessions 2 & 2A: Sampling
  1. The Subtle Sources of Sampling Bias hiding in your data, Sam Ransbotham, MIT Sloan Management Review
  2. Sampling Bias and Twitter, Sally Raskoff, Everyday Sociology.
  3. Biased Survey Samples - The Risk To Findings, GreatBrook
  4. Big Data and Large Sample Size: A Cautionary Note on the Potential for Bias, Kaplan, Chambers and Glasgow
  5. What is survivorship bias?, Corporate Finance Institute
  6. Hedge Funds: The Living and the Dead, Bing Liang, Journal of Financial & Quantitative Analysis.
Sessions 3 & 3A: Data Descriptives
  1. The Median vs the Mean in the Age of Average, NPR
  2. Mean vs Median Life Expectancy of Retirement Planning, Oblivious Investor
  3. Developing intuition for standard deviation, Liminal
  4. Explaining n-1 - The Intuition behind Bessel's correction for sample variance, Chris Combs.
  5. The Dangerous Disregard for Fat Tails in Quantitative Finance, SRSV
  6. Discussion of Skewness, Fusion Investing
Sessions 4 & 4A: Statistical Distributions
  1. Six Reasons why you should stop using histograms, Samuele Mazzanti
  2. The Belle Curve, James Taranto, Wall Street Journal
  3. Normal Distributions in the Wild, Evan Stewart, Sociological Images
  4. The Distribution of S&P 500 Returns, William Egan, SSRN
Sessions 5 & 5A: Correlations and Regressions
  1. When correlation does not imply causation: Why your gut microbes may not (yet) be a silver bullet to all your problems, Dawn Chen, Science in the News, Harvard University.
  2. Spurious Correlations, Tyler Vigen
  3. Correlation vs Causation, Lee Falin, Scientific American.
  4. Covariance Matrices, Covariance Structures and Bears: Oh My!, Karen Grace-Martin, The Analysis Factor
  5. A Refresher on Regression Analysis, Amy Gallo, Harvard Business Review.
  6. Multicollinearity in Regression Analysis: Problems, Detection and Solutions, Jim Frost, Statistics by Jim.
Sessions 6, 6A & 6B: Probabilities and Tools
  1. Bayes' Theorem: The Maths Tool we probably use every day, but what is it?, The Conversation
  2. Bayes' Theorem: What's the big deal?, John Horgan, Scientific American.
  3. People are really bad at probability, and this study shows how easy it is to trick us, Charlie Sorrel, Fast Company
  4. Why our brains do not intuitively grasp probabilities, Michael Shermer, Scientific American
  5. How Gerber used a decision tree in strategic decision making, Jay Buckley and Thomas Dudley, Graziadio Business Review
  6. DCF Myth 3.2: If you don't look, it's not there!, Aswath Damodaran, Musings on Markets