A step-by-step guide to quantifying uncertainty with probabilistic modeling.
Machine learning models are usually developed as deterministic machines that map input to output using a point estimate of the parameter weights, calculated from data by maximum-likelihood methods. However, a lot of statistical randomness is going on in the background. For instance, a dataset itself is a random sample of points of arbitrary size from an unknown distribution, superimposed by additive noise, and for a particular collection of data, different models (i.e. different parameter combinations) might be equally reasonable. Hence, there is considerable uncertainty about both the parameters and the predictions being made. Bayesian statistics provides a framework to deal with aleatoric uncertainty (irreducible noise in the data) and epistemic uncertainty (uncertainty about the model parameters), and with the release of TensorFlow Probability, probabilistic modeling has become a lot easier, as I shall demonstrate in this post.
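To make the contrast concrete, here is a minimal sketch (my own illustration, not code from a specific dataset) of how TensorFlow Probability turns a deterministic regression network into a probabilistic one: instead of emitting a point estimate, the last layer parameterizes a full Normal distribution, so the network learns the noise level alongside the mean.

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# Training objective: maximize the likelihood of the observed targets
# under the predicted distribution (equivalently, minimize the NLL).
negloglik = lambda y, rv_y: -rv_y.log_prob(y)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(2),  # one unit for the mean, one for the scale
    # Map the two outputs to a Normal distribution; softplus keeps scale > 0.
    tfp.layers.DistributionLambda(
        lambda t: tfd.Normal(loc=t[..., :1],
                             scale=1e-3 + tf.math.softplus(0.05 * t[..., 1:]))),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss=negloglik)

# model.fit(x_train, y_train, epochs=500)  # x_train/y_train: your data
# y_dist = model(x_test)  # a distribution object: use .mean(), .stddev()
```

Because the model now returns a distribution object rather than a tensor, the predicted spread can be read off directly with `.stddev()`.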
A Bayesian neural network is characterized by its distribution over weights (parameters) and/or outputs. Depending on whether aleatoric, epistemic, or both uncertainties are considered, the code for a Bayesian neural network looks slightly different. To demonstrate the working principle, the Air Quality dataset from De Vito will serve as an example. It contains time-series data from different sensors together with reference measurements. The data was collected at a main street in an Italian city characterized by heavy car traffic, and the goal is to construct a mapping from sensor responses to reference concentrations (Figure 1), i.e. to build a calibration function as a regression task.
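The sketch below shows the "both uncertainties" variant under assumed placeholders: sensor responses `X` (shape `[N, d]`) and reference concentrations `y` are taken to be already-loaded NumPy arrays, and `N` stands in for the training-set size. A `DenseVariational` layer places a distribution over the weights (epistemic), while the `DistributionLambda` output layer models the observation noise (aleatoric).

```python
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def posterior_mean_field(kernel_size, bias_size=0, dtype=None):
    """Variational posterior: an independent trainable Normal per weight."""
    n = kernel_size + bias_size
    c = np.log(np.expm1(1.0))
    return tf.keras.Sequential([
        tfp.layers.VariableLayer(2 * n, dtype=dtype),
        tfp.layers.DistributionLambda(lambda t: tfd.Independent(
            tfd.Normal(loc=t[..., :n],
                       scale=1e-5 + tf.nn.softplus(c + t[..., n:])),
            reinterpreted_batch_ndims=1)),
    ])

def prior_trainable(kernel_size, bias_size=0, dtype=None):
    """Weight prior: a Normal with trainable mean and unit scale."""
    n = kernel_size + bias_size
    return tf.keras.Sequential([
        tfp.layers.VariableLayer(n, dtype=dtype),
        tfp.layers.DistributionLambda(lambda t: tfd.Independent(
            tfd.Normal(loc=t, scale=1.0),
            reinterpreted_batch_ndims=1)),
    ])

N = 1000  # placeholder: number of training points, scales the KL penalty

model = tf.keras.Sequential([
    # Distribution over weights -> epistemic uncertainty
    tfp.layers.DenseVariational(2, posterior_mean_field, prior_trainable,
                                kl_weight=1.0 / N),
    # Distribution over outputs -> aleatoric uncertainty
    tfp.layers.DistributionLambda(
        lambda t: tfd.Normal(loc=t[..., :1],
                             scale=1e-3 + tf.math.softplus(0.05 * t[..., 1:]))),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss=lambda y, rv_y: -rv_y.log_prob(y))

# model.fit(X, y, epochs=1000)
# Each forward pass resamples the weights, so repeated calls scatter:
# means = np.stack([model(X_test).mean() for _ in range(100)])
```

Dropping the `DenseVariational` layer in favor of a plain `Dense` layer leaves only aleatoric uncertainty; dropping the `DistributionLambda` output leaves only epistemic, which is exactly the "slightly different code" the three cases require.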