Ira Rennert Professor of Entrepreneurship
Professor of Data Science
Professor of Information Systems
Andre Meyer Faculty Fellow
Paduano Fellow in Business Ethics (Emeritus)
Professor Provost won the Best Teacher Award from the Stern MSBA Class of 2014. (Thanks!)
Data Science for Business (the book) has been the best-selling data science book for several years now!
One of Fortune Magazine’s 5 “must read books for MBAs”, December 2014.
"...this is the first book of its kind ... whether you are looking for a good comprehensive overview of data science or are a budding data scientist in need of the basics, this is a must-read."
-- Chris Volinsky, Director, Statistics Research, AT&T Labs; Winner of the $1Million Netflix Challenge
Prof. Provost retired in 2010 from being Editor-in-Chief of the journal Machine Learning.
Prof. Provost won the 2009 INFORMS Design Science Award for his work on Social Network-based Marketing Systems. Previously he received IBM Faculty Awards for outstanding research in data mining and machine learning. He was elected as a founding board member of the International Machine Learning Society. He is a member of the editorial boards of the journals Machine Learning and Data Mining and Knowledge Discovery. In 2001, he co-chaired the program of the premier data mining conference (ACM SIGKDD).
Prof. Provost co-founded several successful NYC-based companies, including Detectica, Dstillery (formerly Media6degrees), Integral Ad Science, and Everyscreen Media. He advises other companies on data science and strategy.
Maytal Saar-Tsechansky, Associate Professor, Univ. Texas at Austin
Gary Weiss, Associate Professor, Fordham University (Ph.D. from Rutgers University, Computer Science; Co-advised with Haym Hirsh)
Claudia Perlich, Senior Data Scientist, TwoSigma (Formerly at IBM Research and Chief Scientist, Dstillery)
Shawndra Hill, Senior Researcher, Microsoft Research (Formerly at Wharton)
Brian Dalessandro, Director, Data Science, Zocdoc (Formerly: Director, Data Science, Zocdoc; Vice President Research/Data Science, Dstillery)
Josh Attenberg, Head of Data Science, Urbint; CoFounder Detectica (Formerly: Data Science Lead, Etsy)
Xiaohan Zhang, Director, Data Science, Integral Ad Science
Enric Junqué de Fortuny, Assistant Professor, NYU Shanghai (Formerly: Assistant Professor, Rotterdam School of Management; Senior Research Fellow, INSEAD; Ph.D. from U. Antwerp; Co-advised with David Martens)
Jessica Clark, Assistant Professor, Univ. Maryland College Park
Rob Moakler, Quantitative Researcher, Facebook
Prior Postdocs
Sofus Macskassy, Head of Data Analytics, Branch Metrics (Formerly: Director, Fetch Labs @ Fetch Technologies, and also Manager, Applied Machine Learning, Facebook and Assistant Adjunct Professor, USC)
Victor (Shengli) Sheng, Associate Professor, University of Central Arkansas
“Classification over bipartite graphs through projection.” M. Stankova, S. Praet, D. Martens, and F. Provost. Machine Learning, forthcoming 2020.
“Instance-level explanation algorithms SEDC, LIME, SHAP for behavioral and textual data: a counterfactual-oriented comparison.” Y. Ramon, D. Martens, F. Provost, & T. Evgeniou. Forthcoming in Advances in Data Analysis and Classification 2020.
“A Benchmarking Study of Classification Techniques for Behavioral Data.” S. De Cnudde, D. Martens, T. Evgeniou & F. Provost. International Journal of Data Science and Analytics 2/2020.
2019
“Unsupervised Dimensionality Reduction vs. Supervised Regularization for Classification from Sparse Data.” J. Clark & F. Provost (2019). Data Mining and Knowledge Discovery 33(4):871–916.
“Deep Learning on Big, Sparse, Behavioral Data.” S. De Cnudde, Y. Ramon, D. Martens & F. Provost (2019). Big Data 7(4): 286-307.
“Big Data, Data Science, and Civil Rights.” Barocas, Bradley, Honavar, and Provost (2017). Invited paper for the Computing Community Consortium of the Computing Research Association (CRA). http://cra.org/ccc/resources/ccc-led-whitepapers/
Measuring causal impact of online actions via natural experiments: application to display advertising. D. Hill, R. Moakler, A. Hubbard, V. Tsemekhman, F. Provost, K. Tsemekhman. In the Proceedings of the Twenty-first ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2015).
Iteratively Refining SVMs using Priors. E. Junque de Fortuny, T. Evgeniou, F. Provost, and D. Martens. IEEE International Conference on Big Data (IEEE BigData 2015).
Corporate Residence Fraud Detection. D. Martens, et al. In Proceedings of the Twentieth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2014).
A Data Scientist’s Guide to Startups. F. Provost, G. Webb, R. Bekkerman, O. Etzioni, U. Fayyad, C. Perlich. Big Data 2(3):117-128, September 2014.
Pleasing the advertising oracle: Probabilistic prediction from sampled, aggregated ground truth. M. Williams, C. Perlich, B. Dalessandro, F. Provost. In Proceedings of the Eighth International Workshop on Data Mining for Online Advertising (ADKDD 2014).
Causal impact of online advertisements using viewability as a method of treatment. R. Moakler, et al. Winter Conference on Business Intelligence, Feb. 2014. (See Moakler et al. KDD 2015 above.)
Scalable Supervised Dimensionality Reduction Using Clustering. Raeder, T., C. Perlich, B. Dalessandro, O. Stitelman, and F. Provost. In Proceedings of the Nineteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2013).
Bid Optimizing and Inventory Scoring in Targeted Online Advertising. Perlich, C., B. Dalessandro, R. Hook, O. Stitelman, T. Raeder, and F. Provost. In Proceedings of the Eighteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2012). Best Paper Award, Industry & Government Track.
Design Principles of Massive, Robust Prediction Systems. Raeder, T., O. Stitelman, B. Dalessandro, C. Perlich, and F. Provost. In Proceedings of the Eighteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2012).
2011
Online Active Inference and Learning. J. Attenberg and F. Provost. To appear in Proceedings of the Seventeenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2011).
Selective Data Acquisition for Machine Learning. J. Attenberg, P. Melville, F. Provost, and M. Saar-Tsechansky. To appear in B. Krishnapuram, S. Yu, B. Rao (eds.), Cost-Sensitive Machine Learning, 2011.
A Unified Approach to Active Dual Supervision. J. Attenberg, P. Melville and F. Provost. To appear in Proceedings of the European Conference on Machine Learning and Principles of Knowledge Discovery in Databases (ECML PKDD 2010).
Social Network Collaborative Filtering. Zheng, R., D. Wilkinson, and F. Provost. Working paper CeDER-8-08. Center for Digital Economy Research, Stern School of Business, New York University. 2008.
ROC Confidence Bands: An Empirical Evaluation. S. Macskassy, F. Provost, and S. Rosset. In Proceedings of the 22nd International Conference on Machine Learning (ICML-2005). [Also appears in the ICML-2005 Workshop on ROC Analysis in Machine Learning (ROCML-2005).]
An Expected Utility Approach to Active Feature-value Acquisition. P. Melville, M. Saar-Tsechansky, F. Provost, and R. Mooney. In Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM-2005), pp. 483-486. Also appeared in Proceedings of the KDD-05 Workshop on Utility-Based Data Mining, Chicago, IL, August 2005.
Suspicion scoring based on guilt-by-association, collective inference, and focused data access. S. Macskassy and F. Provost. In Annual Conference of the North American Association for Computational Social and Organizational Science (NAACSOS), 2005. [This is a followup paper to the IA paper above, with new results but considerable overlap, and unfortunately with the same title.]
Knowledge Discovery Using Concept-Class Taxonomies. V. Kolluri, F. Provost, B. Buchanan, and D. Metzler. In AI 2004: Advances in Artificial Intelligence: 17th Australian Joint Conference on Artificial Intelligence. Lecture Notes in Computer Science, Springer-Verlag Heidelberg.
The Gift of Gab: Evidence TelE-Commerce Firms can Profit from Viral Marketing. S. Hill, F. Provost and C. Volinsky. First Interdisciplinary Symposium between Information Systems, Statistics and Related Fields. Decision and Information Technologies Department, Robert H. Smith School of Business, Univ. of Maryland. May 2005
Preliminary version: CeDER Working Paper #IS-01-02, Stern School of Business, New York University, NY, NY 10012. Fall 2001.
A Simple Relational Classifier. S. Macskassy and F. Provost. Proceedings of the KDD-2003 Workshop on Multirelational Data Mining.
Relational Learning Problems and Simple Models. F. Provost, C. Perlich, and S. Macskassy. Proceedings of the IJCAI-2003 Workshop on Learning Statistical Models from Relational Data.
Aggregation and Concept Complexity in Relational Learning. C. Perlich and F. Provost. Proceedings of the IJCAI-2003 Workshop on Learning Statistical Models from Relational Data.
2002
Perlich, C. and F. Provost. "A Modular Approach to Relational Data Mining." American Conference on Information Systems (AMCIS) 2002.
Bernstein, A., S. Clearwater, S. Hill, C. Perlich, and F. Provost. “Discovering Knowledge from Relational Data Extracted from Business News.” In Proceedings of the KDD-2002 Workshop on Multi-Relational Data Mining, 2002.
Provost, F. and V. Kolluri, "Scalability." In W. Kloesgen and J. Zytkow (eds.), Handbook of Knowledge Discovery and Data Mining.
Danyluk, A. and F. Provost, "Telecommunications Network Diagnosis." In W. Kloesgen and J. Zytkow (eds.), Handbook of Knowledge Discovery and Data Mining.
Fawcett, T. and F. Provost, "Data Mining for Fraud Detection." In W. Kloesgen and J. Zytkow (eds.), Handbook of Knowledge Discovery and Data Mining.
Macskassy, S., H. Hirsh, F. Provost, R. Sankaranarayanan, V. Dhar. “Intelligent Information Triage.” In Proceedings of SIGIR-2001.
Bernstein, A. and F. Provost. "An Intelligent Assistant for the Knowledge Discovery Process." In Proceedings of IJCAI-01 Workshop on Wrappers for Performance Enhancement in KDD. (CeDER Working Paper #IS-01-01, Stern School of Business, New York University, January 2001.)
Provost, F., D. Jensen and T. Oates, "Progressive Sampling." In H. Liu and H. Motoda (eds.), Instance Selection and Construction, A Data Mining Perspective.
Provost, F., D. Jensen and T. Oates, "Efficient Progressive Sampling." Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining (KDD-99).
Danyluk, A., T. Fawcett, and F. Provost, "AI Approaches to Time-series Problems." Workshop report in AI Magazine, 1999.
Provost, F. and D. Jensen, "Evaluating Machine Learning, Knowledge Discovery, and Data Mining." Tutorial presented at the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI-99) and at the Sixteenth National Conference on Artificial Intelligence (AAAI-99).
Fawcett, T., I. Haimowitz, F. Provost, and S. Stolfo, "AI Approaches to Fraud Detection and Risk Management." Workshop report in AI Magazine, 1998.
Provost, F. and D. Jensen, "Evaluating Data Mining and the Knowledge Discovered." Tutorial presented at the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98).
Fawcett, T. and F. Provost, "Automatic Design of Fraud Detection Systems." U.S. Patent #5,790,645.
1997
Fawcett, T. and F. Provost, "Adaptive Fraud Detection." Data Mining and Knowledge Discovery 1 (1997).
Provost, F. and V. Kolluri, "Scaling Up Inductive Algorithms: An Overview." In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD-97).
Krenzelok, E. and F. Provost, "The Ten Most Common Plant Exposures Reported to Poison Information Centers in the United States." Journal of Natural Toxins (1995).
Provost, F. and A. Danyluk, "Learning from Bad Data." In Proceedings of the ML-95 Workshop on Applying Machine Learning in Practice, 1995.
Krenzelok, E., F. Provost, T. Jacobsen, J. Aronis, B. Buchanan, "Assessing Patient Referral Patterns to a Health Care Facility in Plant Exposure Patients Using Computer Artificial Intelligence." European Association of Poison Centres and Clinical Toxicologists Scientific Meeting. May 18-20, 1995, Krakow, Poland.
Krenzelok, E., F. Provost, T. Jacobsen, J. Aronis, B. Buchanan, "Poinsettia (Euphorbia pulcherrima) Exposures Have Good Outcomes...Just As We Thought." European Association of Poison Centres and Clinical Toxicologists Scientific Meeting, 1995.
1994
Provost, F. and D. Hennessy, "Distributed Machine Learning: Scaling up with Coarse-grained Parallelism." In Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology (ISMB-94).
Provost, F., "Goal-Directed Inductive Learning: Trading Off Accuracy for Reduced Error Cost." In Proceedings of the AAAI Spring Symposium on Goal-Directed Learning, 1994.
1993
Provost, F., "Iterative Weakening: Optimal and Near-Optimal Policies for the Selection of Search Bias." In Proceedings of the Eleventh National Conference on Artificial Intelligence (AAAI-93).
Danyluk, A. and F. Provost, "Small Disjuncts in Action: Learning to Diagnose Errors in the Telephone Network Local Loop." In Proceedings of the Tenth International Conference on Machine Learning (ICML-93).
Danyluk, A. and F. Provost, "Adaptive Expert Systems: Applying Machine Learning to NYNEX MAX." In Proceedings of the AAAI-93 Workshop: AI in Service and Support--Bridging the Gap between Research and Applications, 1993.
1992
Provost, F. and R. Melhem, "A Distributed Algorithm for Embedding Trees in Hypercubes with Modifications for Run-Time Fault Tolerance." Journal of Parallel and Distributed Computing 14 (1992).
Provost, F. and B. Buchanan, "Inductive Policy." In Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI-92).
Provost, F., "ClimBS: Searching the Bias Space." In Proceedings of the Fourth International IEEE Conference on Tools with Artificial Intelligence (TAI-92).
Provost, F., "A Baseline Taxonomy of Bias Adjustment Policies." In Proceedings of the ML-92 Workshop on Biases in Learning, 1992.
Provost, F. and B. Buchanan, "Inductive Strengthening: The effects of a simple heuristic for restricting hypothesis space search." In K.P. Jantke (ed.), Analogical and Inductive Inference (Lecture Notes in Artificial Intelligence 642). Springer-Verlag, 1992.
Clearwater, S., W. Cleland, F. Provost, E. Stern and Z. Zhang, "A Real-Time Expert System for Experimental High Energy/Nuclear Physics." In D. Perrett-Gallix and W. Wojcik (eds.), New Computing Techniques in Physics Research. Paris: Centre National de la Recherche Scientifique, 1990.
1991
Provost, F. and R. Melhem, "Embedding Rings in Hypercubes for Run-Time Fault Tolerance." In Proceedings of the Fourth ISMM/IASTED Intl. Conference on Parallel and Distributed Computing and Systems, 1991.
1990
Clearwater, S., W. Cleland, F. Provost, E. Stern and Z. Zhang, "A Real-Time Expert System for Trigger Logic Monitoring." Nuclear Instruments and Methods in Physics Research A293 (1990).
Clearwater, S. and F. Provost, "RL4: A Tool for Knowledge-Based Induction." In Proceedings of the Second International IEEE Conference on Tools for Artificial Intelligence (TAI-90).
1989
Provost, F. and R. Melhem, "Distributed Fault Tolerant Embedding of Trees and Rings in Hypercubes." In I. Koren (ed.), Defect and Fault Tolerance in VLSI Systems, Volume 1. New York, NY: Plenum Press, 1989.