Live online training

Probabilistic modeling with TensorFlow Probability

Rethinking machine learning

Deepak Kanungo
Panos Lambrianides

Probabilistic models let you encode your own or your company’s institutional knowledge into the model before you start collecting data, so you can make probabilistic inferences automatically from datasets that need not be large or even clean. Unlike many popular machine learning models, such as neural networks, probabilistic models are not black boxes. These models enable you to infer causes from effects in a fairly transparent manner. This is important in heavily regulated industries, such as finance and health care, where you have to explain the basis of your decisions. In addition, the conventional reliance on maximum likelihood estimates (MLE), which are point estimates that carry no measure of their own uncertainty, can lead to costly errors in risk assessment. It’s imperative that all models quantify the uncertainty inherent in their point estimates so that sound business decisions can be made under uncertainty.

You can quantify the uncertainty in your estimates quite easily using TensorFlow Probability (TFP), one of the most powerful open source probabilistic machine learning libraries. TFP gives you the tools to build and fit complex probabilistic models using a few simple lines of Python code—letting you focus on model building and evaluation while automating the necessary statistical inferences.

In this hands-on four-hour course, Deepak Kanungo and Panos Lambrianides teach you to use TFP to quantify the uncertainty inherent in all point estimates. Join in to learn how to make realistic probabilistic predictions without making unrealistic assumptions in your models, enabling you to make sound business decisions in the face of uncertainty.

What you'll learn and how you can apply it

By the end of this live online course, you’ll understand:

  • The sources of errors in models
  • The hazards of using conventional statistics to quantify uncertainty in estimates
  • The benefits of quantifying uncertainty using Bayesian inference
  • How to explicitly encode personal and institutional knowledge into your models
  • The advantages of using TFP to learn from small datasets
  • The concepts behind Bayesian linear regression
  • The underlying principles of change point test analysis of your business processes
  • State-of-the-art algorithms like Markov chain Monte Carlo (MCMC), No-U-Turn Sampler (NUTS), and automatic differentiation variational inference (ADVI) at a high level

And you’ll be able to:

  • Build probabilistic models in TFP for your business processes
  • Use these models to quantify the uncertainty in your company’s cost of capital so that you can make better capital budgeting decisions
  • Use these models to estimate the uncertainty around change point tests in your business processes for quality control, intrusion detection, medical diagnostics, spam filtering, and website tracking
  • Continually update your estimates based on new data

This training course is for you because...

  • You’re an analyst or developer who needs to build probabilistic models that quantify the uncertainty in your estimates or forecasts.

Prerequisites

  • A basic understanding of probability and statistics (Read “Seeing Theory” for a visual overview.)
  • A working knowledge of Python programming

About your instructors

  • Deepak Kanungo is the founder and CEO of Hedged Capital LLC, an AI-powered trading and advisory firm. Previously, Deepak was a financial advisor at Morgan Stanley, a Silicon Valley fintech entrepreneur, and a director in the Global Planning Department at MasterCard International. Deepak was educated at Princeton University (Astrophysics) and The London School of Economics (Finance and Information Systems). Hedged Capital’s trading algorithms use probabilistic models and technologies such as TFP. In 2005, Deepak invented a project portfolio management system using Bayesian inference, the foundation of all probabilistic programming languages.

  • Panos Lambrianides has a software engineering background in many industries, including finance, biotech, and aerospace. He has served in various developer roles, from systems engineer in startups to enterprise architect in a Fortune 500 company. Panos has a passion for machine learning and AI and is currently researching reinforcement learning and control theory with applications in robotics and control of swarms. He holds a BA and MA from Cambridge University and is expecting his PhD in applied mathematics from UC Santa Cruz later this fall.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Introduction and the Monty Hall problem (15 minutes)

  • Group discussion: Introduction; your experience with Python and statistics
  • Hands-on exercises: Explore the Monty Hall problem through the online simulator
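For readers who want to try the game before the session, the following is a minimal sketch of a Monty Hall simulation in plain Python. It is not the online simulator used in the course; the function names `play_monty_hall` and `win_rate`, the trial count, and the seed are all illustrative assumptions.

```python
import random


def play_monty_hall(switch, rng):
    """Play one round and return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Estimate the win probability of a strategy by repeated play."""
    rng = random.Random(seed)
    wins = sum(play_monty_hall(switch, rng) for _ in range(trials))
    return wins / trials


stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay:   {stay_rate:.3f}")
print(f"switch: {switch_rate:.3f}")
```

Under these assumptions, the estimated win rate for staying should settle near 1/3 and the rate for switching near 2/3.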

Epistemic probability (15 minutes)

  • Group discussion: Epistemic probability; how it differs from the frequentist view of probability, on which much of conventional statistics is based

Bayesian inference (25 minutes)

  • Lecture: The results from the game; why Bayesian inference offers a solution to the apparent paradox; Bayes’s theorem, the fundamental algorithm of all probabilistic programming languages
  • Group discussion and Q&A
  • Break (5 minutes)
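As an illustration of how Bayes’s theorem resolves the Monty Hall problem, here is a small sketch that computes the exact posterior with Python's `fractions` module. The scenario (the contestant picks door 1 and the host opens door 3) and the variable names are assumptions chosen for the example, not material from the course.

```python
from fractions import Fraction

# Prior: the car is equally likely to be behind any of the three doors.
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# Likelihood: P(host opens door 3 | car is behind door k), given that the
# contestant picked door 1 and the host never reveals the car.
likelihood = {
    1: Fraction(1, 2),  # host chooses at random between doors 2 and 3
    2: Fraction(1),     # host is forced to open door 3
    3: Fraction(0),     # host never opens the door hiding the car
}

# Bayes's theorem: posterior = prior * likelihood / evidence.
evidence = sum(prior[d] * likelihood[d] for d in prior)
posterior = {d: prior[d] * likelihood[d] / evidence for d in prior}

for door, prob in posterior.items():
    print(f"P(car behind door {door} | host opens door 3) = {prob}")
```

The posterior puts probability 1/3 on the door originally picked and 2/3 on the other closed door, which is why switching is the better strategy.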

Setup (5 minutes)

  • Lecture: A quick validation of the environment; a brief review of the features of Colab notebook

TensorFlow Probability (TFP) (50 minutes)

  • Lecture: The basic concepts and declarative commands in Python code used for building probabilistic models in TFP
  • Hands-on exercises: Walk through the built-in change point test analysis model in the Colab notebook and analyze its output graphs
  • Group discussion and Q&A
  • Break (5 minutes)

Statistical analysis (15 minutes)

  • Hands-on exercises: Run the built-in market model (MM), which uses standard linear regression; vary the start and end dates to draw 10 random samples and compute the alpha, beta, and sample error of your company’s stock (or a proxy stock if your company is private), including the 95% confidence intervals for all parameters; note your company’s cost of capital and other results in your notebook
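The course's built-in market model is not reproduced here. As a rough sketch of the kind of calculation it performs, the code below fits an ordinary least squares line to synthetic daily returns and reports alpha, beta, the residual standard error, and approximate 95% confidence intervals. The function name `fit_market_model`, the synthetic data, and the use of a normal critical value (rather than a Student's t value, which matters for small samples) are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def fit_market_model(market, stock):
    """Ordinary least squares fit of stock = alpha + beta * market."""
    n = len(market)
    mean_x = sum(market) / n
    mean_y = sum(stock) / n
    sxx = sum((x - mean_x) ** 2 for x in market)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(market, stock))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    residuals = [y - (alpha + beta * x) for x, y in zip(market, stock)]
    # Residual standard error with n - 2 degrees of freedom.
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    se_beta = sigma / math.sqrt(sxx)
    se_alpha = sigma * math.sqrt(1 / n + mean_x ** 2 / sxx)
    # Assumption: normal approximation to the critical value (about 1.96).
    z = NormalDist().inv_cdf(0.975)
    return {
        "alpha": alpha,
        "beta": beta,
        "sigma": sigma,
        "alpha_ci": (alpha - z * se_alpha, alpha + z * se_alpha),
        "beta_ci": (beta - z * se_beta, beta + z * se_beta),
    }


# Synthetic daily returns, generated with a true beta of 1.2 (hypothetical).
rng = random.Random(42)
market = [rng.gauss(0.0005, 0.01) for _ in range(250)]
stock = [0.0002 + 1.2 * m + rng.gauss(0.0, 0.005) for m in market]

fit = fit_market_model(market, stock)
print(f"beta = {fit['beta']:.3f}, 95% CI = {fit['beta_ci']}")
```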

Types of modeling errors (15 minutes)

  • Group discussion: The sources of errors in models; the imperative need for quantifying uncertainty in your estimates

Confidence intervals (10 minutes)

  • Lecture: The conventional meaning of probability; how confidence intervals are actually meant to be used

Quantifying uncertainty (15 minutes)

  • Group discussion: Why it’s inappropriate to use confidence intervals to quantify uncertainty in estimates that are not normally distributed
  • Break (5 minutes)

TFP algorithms (30 minutes)

  • Lecture: The basic concepts behind the Markov chain Monte Carlo (MCMC), No-U-Turn Sampler (NUTS), and automatic differentiation variational inference (ADVI) algorithms; what problems they’re best suited to address
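To give a feel for the idea behind MCMC, here is a minimal sketch of a random-walk Metropolis sampler in plain Python. It is deliberately simpler than NUTS or ADVI and does not use TFP. The example problem (inferring a coin's bias from 7 heads in 10 flips under a uniform prior), the function names, and the tuning values are illustrative assumptions.

```python
import math
import random


def log_posterior(theta, heads=7, flips=10):
    """Unnormalized log posterior of a coin's bias under a uniform prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return heads * math.log(theta) + (flips - heads) * math.log(1.0 - theta)


def metropolis(log_target, start, steps, step_size, rng):
    """Random-walk Metropolis: propose a nearby point, then accept or reject."""
    samples = []
    current = start
    current_lp = log_target(current)
    for _ in range(steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


rng = random.Random(0)
chain = metropolis(log_posterior, start=0.5, steps=20_000, step_size=0.2, rng=rng)
kept = chain[2_000:]  # discard early samples as burn-in
posterior_mean = sum(kept) / len(kept)
print(f"estimated posterior mean: {posterior_mean:.3f}")
```

For this toy problem the exact posterior is Beta(8, 4), whose mean is 8/12, so the sampled mean can be checked against a known answer.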

Bayesian regression (20 minutes)

  • Hands-on exercises: Recode the MM in TFP using Bayesian linear regression; produce credible intervals for your company’s cost of capital and all other relevant parameters
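The TFP version of this exercise is part of the course notebook and is not shown here. As a simplified sketch of what a credible interval for beta looks like, the code below uses a closed-form conjugate update instead of sampling. It assumes a zero intercept, a known noise standard deviation, and a normal prior on beta; the function name `posterior_beta`, the prior values, and the synthetic data are hypothetical.

```python
import random
from statistics import NormalDist


def posterior_beta(x, y, prior_mean, prior_sd, noise_sd):
    """Posterior of beta in y = beta * x + noise, with a normal prior on beta.

    Assumes a zero intercept and a known noise standard deviation, so the
    posterior is normal and available in closed form.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = sum(xi * xi for xi in x) / noise_sd ** 2
    post_precision = prior_precision + data_precision
    weighted_data = sum(xi * yi for xi, yi in zip(x, y)) / noise_sd ** 2
    post_mean = (prior_mean * prior_precision + weighted_data) / post_precision
    return NormalDist(post_mean, post_precision ** -0.5)


# A small synthetic dataset: 20 daily returns with a true beta of 1.2.
rng = random.Random(1)
x = [rng.gauss(0.0, 0.01) for _ in range(20)]
y = [1.2 * xi + rng.gauss(0.0, 0.01) for xi in x]

# Prior belief: beta is probably near 1, give or take 0.5.
post = posterior_beta(x, y, prior_mean=1.0, prior_sd=0.5, noise_sd=0.01)
lower, upper = post.inv_cdf(0.025), post.inv_cdf(0.975)
print(f"posterior mean of beta: {post.mean:.3f}")
print(f"95% credible interval: ({lower:.3f}, {upper:.3f})")
```

With so few data points the interval stays wide and the prior pulls the estimate toward 1, which illustrates how prior knowledge and new data are combined.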

Wrap-up and Q&A (10 minutes)