One can ask this basic question in the context of individual decisions — will I be happy if I marry this particular person? — but here let us consider Big Events that are unique, or for which there is little data on relevant past events. Is it *likely* that there will be a self-sustaining…

(1) The importance of this type of contest is a central point of David Donoho's 2017 article "50 years of data science".

(2) Was the choice of criterion "RMSE" rather than "predicting exact rating (1-5) correctly" a stroke of genius, a stroke of idiocy, or mere thoughtlessness? After all, the "kitchen sink" outcome, that for RMSE it is optimal to average over many equally good algorithms, was certainly predictable. The alternative criterion might have produced a winning algorithm that was both more implementable and more human-interpretable.
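The averaging effect mentioned in this footnote can be illustrated with a toy simulation. This is only a sketch under loud assumptions: the "algorithms" below are hypothetical predictors whose errors are independent Gaussian noise around made-up ratings, which is not how the actual contest entries behaved. It shows that the blended prediction has a lower RMSE than the individual predictors in this idealized setting.

```python
import math
import random


def rmse(predictions, truth):
    """Root mean squared error between two equal-length sequences."""
    squared = sum((p - t) ** 2 for p, t in zip(predictions, truth))
    return math.sqrt(squared / len(truth))


random.seed(0)
n_items, n_models = 5000, 5

# Hypothetical "true" ratings on the 1-5 scale (made-up data).
truth = [random.randint(1, 5) for _ in range(n_items)]

# Assumption: each model is the truth plus independent noise of equal size,
# i.e. the models are "equally good" and their errors are uncorrelated.
models = [[t + random.gauss(0, 1) for t in truth] for _ in range(n_models)]

# The "kitchen sink" blend: a plain average of all model predictions.
blend = [sum(column) / n_models for column in zip(*models)]

individual = [rmse(m, truth) for m in models]
blend_rmse = rmse(blend, truth)

print("individual RMSEs:", [round(r, 3) for r in individual])
print("blended RMSE:    ", round(blend_rmse, 3))
```

The inequality "RMSE of the average is at most the average of the RMSEs" holds for any set of predictors; how much the blend actually gains depends on how correlated the errors are, which is the part this sketch simply assumes away.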

This is an unsolicited enthusiastic recommendation for the new book by two Berkeley colleagues. Data Science is perhaps unique in the following sense. A physicist writes technical papers to be read by other physicists, as do chemists, mathematicians, and so on. In contrast, a data scientist needs to communicate results…

As a mathematician of the old school, I don’t seek to engage Big Data. Instead, this post is about Small Data on a Big Subject. Each January since 2006 the annual Global Risks Report (GRR) has been published, as background material for the annual World Economic Forum (see Footnote 1…

I occasionally let slip that I am an academic great-grandchild of G.H. Hardy, whose non-technical writings about mathematics, in particular A Mathematician’s Apology, are often quoted in general discussions of the nature of mathematics. Here I give my personal thoughts related to four of these well-known quotations. These of…

A famous 2010 paper "To Explain or to Predict?" by Galit Shmueli examined the essential difference between classical mathematical statistics and modern machine learning. Amazon retail uses its data on you and others to *predict* what you might buy, without caring *why* you might want to buy it — an…

Bayesian updating — revising an estimate when new information is available — is a key concept in data science. It seems intuitively obvious that (within an accurate probability model of a real-world phenomenon) the revised estimate will typically be better. …
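As a concrete instance of Bayesian updating, here is a minimal sketch using the standard Beta-Binomial conjugate pair. The prior, the observed counts, and the function names are all illustrative assumptions, not taken from the post: we start from a uniform Beta(1, 1) prior on an unknown success probability and revise it after seeing 7 successes and 3 failures.

```python
from fractions import Fraction


def update_beta(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing Bernoulli outcomes.

    With a Beta(alpha, beta) prior, the posterior is
    Beta(alpha + successes, beta + failures).
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, as an exact fraction."""
    return Fraction(alpha, alpha + beta)


# Hypothetical numbers chosen only for illustration.
prior = (1, 1)  # uniform prior on the success probability
posterior = update_beta(*prior, successes=7, failures=3)

print("prior estimate:  ", beta_mean(*prior))      # 1/2
print("revised estimate:", beta_mean(*posterior))  # 2/3
```

The revised estimate sits between the prior mean (1/2) and the observed frequency (7/10). Whether it is closer to the true value in any particular case depends on the data actually seen.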

Popular writings about mathematics, and autobiographies by eminent mathematicians, provide some sense of what doing research-level mathematics is like. In this article I seek to convey the underlying implicit “flavor” of the profession, not the day-to-day explicit activity. Let’s work by analogy. That is, I will make 4 “doing research…

To start with an analogy, what is money? Textbooks and Wikipedia say that money is a medium of exchange, and a store of value, and a unit of account. But that is really the answer to a different question: “what is money *for*”? Asking “what *is* money?”, …

It is widely said that the Kolmogorov axioms provide the standard mathematical formalization of Probability (capitalized, to mean the discipline). This is true, but is not very informative to a non-mathematical reader, so let me explain its significance.
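For reference, the axioms in their usual modern textbook form are stated below. The notation (a sample space $\Omega$, a collection $\mathcal{F}$ of events, and a probability measure $P$) is the standard one and is an assumption about presentation, not a quotation from this post.

```latex
\begin{align*}
&\text{(i)}   && P(A) \ge 0 \quad \text{for every event } A \in \mathcal{F}, \\
&\text{(ii)}  && P(\Omega) = 1, \\
&\text{(iii)} && P\Bigl(\bigcup_{i=1}^{\infty} A_i\Bigr) = \sum_{i=1}^{\infty} P(A_i)
               \quad \text{for pairwise disjoint events } A_1, A_2, \dots \in \mathcal{F}.
\end{align*}
```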

Around 1900 the axiomatic approach to mathematics had spread well beyond its…