Quickstart: Bayesian Analysis of time-Variable Growth Rate

The assumption of continuous unlimited growth with a maximum growth rate \(\mu_{max}\) often does not hold in reality. Even Corynebacterium glutamicum exhibits an overshoot in the transition into the stationary phase that is not explained by most growth models.

This notebook demonstrates how bletl.growth.fit_mu_t can be used to infer the specific growth rate over time \(\mu(t)\) from backscatter measurements. Internally, this is done with PyMC and requires a calibr8 calibration model that models the relationship between cell dry weight and backscatter.

First install PyMC, optionally also with Graphviz for visualization of the model.

[1]:
import numpy
from matplotlib import pyplot, cm
import pathlib
import scipy.stats

import arviz
import pymc as pm

import calibr8
# The next line imports the "models.py" file lying to this notebook.
import models

import bletl
# the `growth` submodule must be imported explicitly because is uses optional dependencies:
import bletl.growth

The Data

A calibration model for the CDW/Backscatter correlation at the experiment settings (“Pahpshmir” BioLector, 1400 rpm*, Corynebacterium glutamicum WT) is loaded via the quickstart_04_helpers. This calibration model is identical to the one used in the calibr8 paper.

_(*The experiment actually ran at 1300 rpm, but we don’t have a calibration model for that. But this is just an example, so it’s fine.)_

This example uses a BioLector Pro test dataset from bletl. The well D06 that is analyzed was oxygen-limited in the late exponential phase.

[2]:
cm_cdw = models.get_biomass_calibration()
# rename the dependent key for less clutter:
cm_cdw.dependent_key = "BS3"
calibr8.plot_model(cm_cdw)
pyplot.show()
_images/04_mut_3_0.png
[3]:
bldata = bletl.parse(
    pathlib.Path(r"..\tests\data\BLPro\22-HM_Coryne_Batch_O2Up-HM-2018-05-09-08-56-27.csv"),
)
bs3 = bldata['BS3']
t, y = bs3.get_timeseries("D06")

fig, ax = pyplot.subplots(dpi=140)
ax.scatter(t, y, marker="x")
ax.set(
    ylabel="backscatter$_{gain=3}$",
    xlabel="time / h",
)
ax.annotate("", xy=(15, 55), xytext=(12, 60), arrowprops=dict(arrowstyle="->"))
ax.annotate(
    "switchpoints",
    xy=(16, 71), xytext=(12, 60),
    arrowprops=dict(arrowstyle="->"), horizontalalignment="right"
)
pyplot.show()
_images/04_mut_4_0.png

Fitting

The fitting is done with the fit_mu_t function from bletl.growth. It takes the time- and backscatter vectors of the data along with the calibr8 calibration model and a manually specified list of switchpoint-times.

The model describes the timeseries of biomass \(X_t\) as a function from two parameters \(X_0\) and \(\vec{\mu_t}\) where \(\vec{\mu_t}\) is the concatenation of growth rate vectors for individual phases:

\[\vec{\mu_t} = concat(\vec{\mu}_{phase\ 0}, \ldots, \vec{\mu}_{phase\ n}, \vec{\mu}_{stationary})\]

The biomass timeseries is calculated from a cumulative sum:

\[\vec{X_t} = X_0 + X_0 \cdot e^{cumsum(\vec{\mu_t}\cdot \vec{dt})}\]

The \(\vec{\mu_t}\) vectors of individual phases are described by either a Gaussian or StudentT random walk. If a StudentT random walk is used, the model can automatically detect switchpoints in the growth rate.

Most Corynebacterium curves have a switchpoint when the substrate is depleted. Conveniently, this switchpoint falls together with the highest backscatter value, so we can get its time with t[numpy.argmax(y)].

[4]:
result = bletl.growth.fit_mu_t(
    t, y,
    cm_cdw,
    # drift_scale is the standard deviation with which the growth rate mu may drift per measurement cycle.
    drift_scale=0.025,
    # the (optional) switpoints may be a sequence or dict:
    switchpoints=[
        t[numpy.argmax(y)],
    ],
    # When switchpoints are set, it defaults to a Gaussian random walk.
    # By enforcing the StudentT random walk, we can detect _additional_ switchpoints:
    student_t=True,
    # Initial biomass concentration around 1 g/L
    x0_prior=1,
)
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
100.00% [410/410 00:01<00:00 logp = 458.99, ||grad|| = 5.3408]

Interpreting the Result

The returned object is a bletl.growth.GrowthRateResult that has all relevant variables available attached as properties. You can run help(result) to get a full description.

The result.pmodel property is the PyMC model object, which is visualized in the next cell. From this so called plate model you can see that the length 173 mu_t is modeled from two shorter fragments, a length 101 mu_phase_0 and a length 73 mu_stationary.

[5]:
plot = pm.model_to_graphviz(result.pmodel)
display(plot)
_images/04_mut_8_0.svg

Visualizing the Result

With the default settings, the fit_mu_t function just fitted the maximum-a-posteriori (MAP) estimate to the data. This is the best fit and can be plotted directly from result.mu_map and a few additional properties:

[6]:
fig, (left, right) = pyplot.subplots(ncols=2, figsize=(14,6), dpi=120)

# backscatter observations are translated to CDW via `predict_independent`
left.scatter(t, cm_cdw.predict_independent(y), s=25, label="$y_{cal}$", marker="x", color="black")
left.plot(t, result.x_map, color="orange", label="MAP fit")

left.set(
    ylabel="$\mathrm{X\ /\ g\ L ^{-1}}$",
    xlabel="time / h",
    ylim=(0, None),
)
left.legend(loc='upper left', frameon=False)

right.axhline(0)
right.plot(result.t_segments, result.mu_map, color='orange', label='MAP fit')
right.set(
    ylabel="$\mathrm{\mu\ /\ h^{-1}}$",
    xlabel="time / h",
)

# draw vertical lines at the (detected) switchpoints
for t_switch, label in result.switchpoints.items():
    left.axvline(t_switch, linestyle=":", color="orange", label=label)
    left.text(
        t_switch, 1,
        s=f"{label}\n",
        color="gray",
        rotation=90,
        horizontalalignment="center",
        verticalalignment="bottom",
    )

# zoom in on the range containing the switchpoints
left.set_xlim(13, 18)

pyplot.show()
_images/04_mut_10_0.png

Sampling the posterior probability distribution of \(\vec{\mu}_t\)

The GrowthRateResult object has a convenience function that may be called to run MCMC sampling. This sampling is not executed by default, because it takes significantly longer than the MAP estimation. For models with StudentT random walks, the MCMC sampling is much slower than for models with Gaussian random walks!

With result.sample() you can run the default settings. Any keyword-arguments will be forwarded to the pymc.sample call, if you want to customize.

[7]:
result.sample()
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
Auto-assigning NUTS sampler...
Initializing NUTS using adapt_diag...
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [studentt_rv{0, (0, 0, 0), floatX, False}.0, studentt_rv{0, (0, 0, 0), floatX, False}.out]
  warnings.warn(
c:\Users\zufal\miniconda3\envs\murefidev\lib\site-packages\pymc\logprob\joint_logprob.py:167: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: [normal_rv{0, (0, 0), floatX, False}.0, normal_rv{0, (0, 0), floatX, False}.out]
  warnings.warn(
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [mu_phase_0, mu_phase_1, X0]
100.00% [4000/4000 02:54<00:00 Sampling 4 chains, 0 divergences]
Sampling 4 chains for 500 tune and 500 draw iterations (2_000 + 2_000 draws total) took 194 seconds.

You can save the sampling result with ArviZ for reproducing the analysis later:

[8]:
result.idata.to_netcdf(f"Quickstart_04_Bayesian Analysis of time-Variable Growth Rate.nc")
[8]:
'Quickstart_04_Bayesian Analysis of time-Variable Growth Rate.nc'

Visualizing the \(\vec{\mu}_t\) posterior distribution

The posterior samples are accessible as a 2D array via result.mu_mcmc and can be plotted conveniently using the pymc.gp.util.plot_gp_dist helper function.

The following plot also shows the DO profile that explains why the growth rate suddenly breaks down before depletion of the carbon source.

[9]:
fig, (left, right) = pyplot.subplots(ncols=2, figsize=(14,6), dpi=120)

pm.gp.util.plot_gp_dist(
    ax=left,
    x=t,
    samples=result.x_mcmc,
    samples_alpha=0,
)
left.scatter(t, cm_cdw.predict_independent(y), s=50, label="$y_{cal}$", marker="x")
left.set(
    ylabel="$\mathrm{X\ /\ g\ L ^{-1}}$",
    xlabel="time /2",
)
# draw vertical lines at switchpoints
for t_switch, label in result.switchpoints.items():
    left.axvline(t_switch, linestyle=":", color="orange", label=label)
    left.text(
        t_switch, 1,
        s=f"{label}\n",
        color="gray",
        rotation=90,
        horizontalalignment="center",
        verticalalignment="bottom",
    )

pm.gp.util.plot_gp_dist(
    ax=right,
    x=result.t_segments,
    samples=result.mu_mcmc,
    samples_alpha=0,
)
right.plot(result.t_segments, result.mu_map, color="orange", label="MAP fit")
right.set(
    ylabel="$\mathrm{\mu\ /\ h^{-1}}$",
    xlabel="time / h",
    ylim=(-0.1, 0.6),
)
right.legend(frameon=False)

# plot the DO curve for comparison
tdo, do = bldata["DO"].get_timeseries("D06")
right2 = right.twinx()
right2.plot(tdo, do, color="green")
right2.set_ylim(0)
right2.set_ylabel("DO / %", color="green")

# zoom in on the switchpoints
left.set_xlim(13, 18)

pyplot.show()
_images/04_mut_16_0.png

The comparison of posterior distribution and MAP estimate is consistent, indicating that the MAP estimate can be used for a quick look, before hitting the computationally expensive inference button™.

MCMC Diagnostics

If the sampling returned any warnings about convergence or sample size, you can create diagnostic plots with ArviZ.

In this case the convergence doesn’t look great - for a publication you should take more draws, or switch to Gaussian random walks with manually specified switchpoints.

[10]:
arviz.plot_trace(result.idata)
pyplot.gcf().tight_layout()
pyplot.show()
_images/04_mut_19_0.png

Supplementary Explanations

The key difference between a Normal (Gaussian) distribution and a Student-t distribution are the tails of the distribution. The Student-t distribution may be described as having fat tails, with the degree-of-freedom parameter df or \(\nu\) tuning how fat they are. With \(\nu \to \infty\), the shape of the distribution coverges to the shape of a Normal distribution.

In the \(\vec{\mu}_t\) model, decreasing distance between \(\mu_{t+1}\) and \(\mu_t\) is rewarded according to the logpdf of the distribution used for the random walk.

With a Gaussian distribution, the logpdf decreases faster and faster as the distance increases. In other words, a large distance between \(\mu_{t+1}\) and \(\mu_t\) is regarded as very unlikely. This makes jumps in \(\vec{\mu}_t\) virtually impossible.

With a Student-t distribution, the logpdf decreases slower as the distance increases, making it possible to get large jumps in \(\vec{\mu}_t\).

The automatic switchpoint detection relies on the Student-t distribution permitting these large jumps. They are detected automatically if they lie outside of the interval that contains 99 % of the distribution’s probability mass.

[11]:
fig, (left, right) = pyplot.subplots(dpi=140, ncols=2, figsize=(10, 5))

x = numpy.linspace(-10, 10, 300)
left.plot(x, scipy.stats.norm.pdf(x), label="Normal")
right.plot(x, -scipy.stats.norm.logpdf(x), label="Normal")
for nu in [1, 10]:
    left.plot(x, scipy.stats.t.pdf(x, df=nu), label=rf"Student-t ($\nu={nu}$)")
    right.plot(x, -scipy.stats.t.logpdf(x, df=nu), label=rf"Student-t ($\nu={nu}$)")
left.legend(frameon=False)
right.legend(frameon=False)
left.set_ylabel("pdf(x)")
right.set_ylabel("− logpdf(x)")
pyplot.show()
_images/04_mut_21_0.png
[12]:
%load_ext watermark
%watermark -n -u -v -iv -w
Last updated: Mon Dec 19 2022

Python implementation: CPython
Python version       : 3.9.15
IPython version      : 8.7.0

bletl     : 1.2.0
matplotlib: 3.6.2
scipy     : 1.9.3
arviz     : 0.14.0
sys       : 3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 08:41:22) [MSC v.1929 64 bit (AMD64)]
numpy     : 1.23.5
calibr8   : 7.0.0
pymc      : 5.0.0

Watermark: 2.3.1

[ ]: