Optimal Experiment Design

The pyro.contrib.oed module provides tools to create optimal experiment designs for pyro models. In particular, it provides estimators for the average posterior entropy (APE) criterion.

To estimate the APE for a particular design, use:

from pyro.contrib.oed.eig import vi_ape

def model(design):
    ...

ape = vi_ape(model, design, ...)

APE can then be minimised using existing optimisers in pyro.optim.

Expected Information Gain

vi_ape(model, design, observation_labels, target_labels, vi_parameters, is_parameters, y_dist=None)[source]

Estimates the average posterior entropy (APE) loss function using variational inference (VI).

The APE loss function estimated by this method is defined as

\(APE(d)=E_{Y\sim p(y|\theta, d)}[H(p(\theta|Y, d))]\)

where \(H[p(x)]\) is the differential entropy. The APE is related to expected information gain (EIG) by the equation

\(EIG(d)=H[p(\theta)]-APE(d)\)

In particular, minimising the APE is equivalent to maximising the EIG.

Parameters:
  • model (function) – A pyro model accepting design as only argument.
  • design (torch.Tensor) – Tensor representation of design
  • observation_labels (list) – A subset of the sample sites present in model. These sites are regarded as future observations and other sites are regarded as latent variables over which a posterior is to be inferred.
  • target_labels (list) – A subset of the sample sites over which the posterior entropy is to be measured.
  • vi_parameters (dict) – Variational inference parameters which should include: optim: an instance of pyro.Optim, guide: a guide function compatible with model, num_steps: the number of VI steps to make, and loss: the loss function to use for VI
  • is_parameters (dict) – Importance sampling parameters for the marginal distribution of \(Y\). May include num_samples: the number of samples to draw from the marginal.
  • y_dist (pyro.distributions.Distribution) – (optional) the distribution assumed for the response variable \(Y\)
Returns: Loss function estimate
Return type: torch.Tensor
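
For concreteness, here is a minimal sketch of how these arguments might fit together for a toy linear model. Everything in it (the model, the guide, the site names "theta" and "y", and the hyperparameter values) is an illustrative assumption rather than part of the API; in particular, the form of the loss entry is a guess at what a compatible VI loss looks like.

import torch
from torch.distributions import constraints

import pyro
import pyro.distributions as dist
from pyro.contrib.oed.eig import vi_ape
from pyro.infer import Trace_ELBO
from pyro.optim import Adam

def model(design):
    # Latent effect size "theta"; its posterior entropy is what APE measures.
    theta = pyro.sample("theta", dist.Normal(torch.zeros(design.shape), 1.0))
    # Future observation "y" at the given design points.
    return pyro.sample("y", dist.Normal(theta * design, 1.0))

def guide(design):
    # Mean-field Normal guide over "theta" with learnable parameters.
    loc = pyro.param("guide_loc", torch.zeros(design.shape))
    scale = pyro.param("guide_scale", torch.ones(design.shape),
                       constraint=constraints.positive)
    pyro.sample("theta", dist.Normal(loc, scale))

design = torch.tensor([1.0, 2.0, 3.0])
ape = vi_ape(model, design,
             observation_labels=["y"], target_labels=["theta"],
             vi_parameters={"optim": Adam({"lr": 0.01}), "guide": guide,
                            "num_steps": 500, "loss": Trace_ELBO()},
             is_parameters={"num_samples": 100})

In a design loop, the same call could be repeated for each candidate design, keeping the design with the smallest estimated APE.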

naive_rainforth_eig(model, design, observation_labels, target_labels=None, N=100, M=10, M_prime=None)[source]

Naive Rainforth (i.e. Nested Monte Carlo) estimate of the expected information gain (EIG). The estimate is

\[\frac{1}{N}\sum_{n=1}^N \log p(y_n | \theta_n, d) - \log \left(\frac{1}{M}\sum_{m=1}^M p(y_n | \theta_m, d)\right)\]

Monte Carlo estimation is attempted for the \(\log p(y | \theta, d)\) term if the parameter M_prime is passed. Otherwise, it is assumed that \(\log p(y | \theta, d)\) can safely be read from the model itself.
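
To make the estimator concrete, the following self-contained sketch computes the same nested Monte Carlo sum by hand for a toy model with theta ~ Normal(0, 1) and y | theta, d ~ Normal(theta * d, 1); it does not call the library function and all names are illustrative.

import math
import torch
from torch.distributions import Normal

def toy_nmc_eig(d, N=100, M=10):
    # Outer samples: theta_n ~ p(theta), then y_n ~ p(y | theta_n, d)
    theta_n = torch.randn(N)
    y_n = Normal(theta_n * d, 1.0).sample()
    log_lik = Normal(theta_n * d, 1.0).log_prob(y_n)
    # Fresh inner samples theta_m ~ p(theta) estimate log p(y_n | d)
    theta_m = torch.randn(M)
    inner = Normal(theta_m.unsqueeze(0) * d, 1.0).log_prob(y_n.unsqueeze(1))
    log_marginal = torch.logsumexp(inner, dim=1) - math.log(M)
    # EIG estimate: mean of log p(y_n | theta_n, d) - log p(y_n | d)
    return (log_lik - log_marginal).mean()

print(toy_nmc_eig(d=2.0))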

Parameters:
  • model (function) – A pyro model accepting design as only argument.
  • design (torch.Tensor) – Tensor representation of design
  • observation_labels (list) – A subset of the sample sites present in model. These sites are regarded as future observations and other sites are regarded as latent variables over which a posterior is to be inferred.
  • target_labels (list) – A subset of the sample sites over which the posterior entropy is to be measured.
  • N (int) – Number of outer expectation samples.
  • M (int) – Number of inner expectation samples for p(y|d).
  • M_prime (int) – Number of samples for p(y | theta, d) if required.
Returns: EIG estimate
Return type: torch.Tensor
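
Going by the signature above, a call could look as follows; it reuses the illustrative model, design, and site names from the vi_ape sketch earlier, and the values of N and M are arbitrary.

from pyro.contrib.oed.eig import naive_rainforth_eig

# model and design as in the vi_ape sketch above (illustrative only)
eig = naive_rainforth_eig(model, design,
                          observation_labels=["y"], target_labels=["theta"],
                          N=200, M=20)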

donsker_varadhan_eig(model, design, observation_labels, target_labels, num_samples, num_steps, T, optim, return_history=False, final_design=None, final_num_samples=None)[source]

Donsker-Varadhan estimate of the expected information gain (EIG).

The Donsker-Varadhan representation of EIG is

\[\sup_T E_{p(y, \theta | d)}[T(y, \theta)] - \log E_{p(y|d)p(\theta)}[\exp(T(\bar{y}, \bar{\theta}))]\]

where \(T\) is any (measurable) function.

This method optimises the loss function over a pre-specified class of functions T.
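
As an illustration of what such a function class might look like, the sketch below parameterises T as a small neural network mapping a (y, theta) pair to a scalar score. The class name, architecture, and forward signature are assumptions for this example; the exact interface the implementation expects of T is not described here.

import torch
import torch.nn as nn

class CriticT(nn.Module):
    # Illustrative critic T(y, theta): a small MLP returning a scalar score.
    def __init__(self, y_dim, theta_dim, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(y_dim + theta_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, y, theta):
        # Scalar score T(y, theta) for each pair in the batch.
        return self.net(torch.cat([y, theta], dim=-1)).squeeze(-1)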

Parameters:
  • model (function) – A pyro model accepting design as only argument.
  • design (torch.Tensor) – Tensor representation of design
  • observation_labels (list) – A subset of the sample sites present in model. These sites are regarded as future observations and other sites are regarded as latent variables over which a posterior is to be inferred.
  • target_labels (list) – A subset of the sample sites over which the posterior entropy is to be measured.
  • num_samples (int) – Number of samples per iteration.
  • num_steps (int) – Number of optimisation steps.
  • T (function or torch.nn.Module) – optimisable function T for use in the Donsker-Varadhan loss function.
  • optim (pyro.optim.Optim) – Optimiser to use.
  • return_history (bool) – If True, also returns a tensor giving the loss function at each step of the optimisation.
  • final_design (torch.Tensor) – The final design tensor to evaluate at. If None, uses design.
  • final_num_samples (int) – The number of samples to use at the final evaluation. If None, uses num_samples.
Returns: EIG estimate, optionally including the full optimisation history
Return type: torch.Tensor or tuple
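
Again going only by the signature above, a call might look like the following, reusing the illustrative model, design, and CriticT from the earlier sketches; whether a plain (y, theta) critic such as CriticT matches the interface the implementation expects is an assumption.

from pyro.contrib.oed.eig import donsker_varadhan_eig
from pyro.optim import Adam

# model, design, and CriticT as in the earlier sketches (illustrative only)
dv_estimate = donsker_varadhan_eig(model, design,
                                   observation_labels=["y"], target_labels=["theta"],
                                   num_samples=100, num_steps=1000,
                                   T=CriticT(y_dim=3, theta_dim=3),
                                   optim=Adam({"lr": 0.001}))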

barber_agakov_ape(model, design, observation_labels, target_labels, num_samples, num_steps, guide, optim, return_history=False, final_design=None, final_num_samples=None)[source]

Barber-Agakov estimate of average posterior entropy (APE).

The Barber-Agakov representation of APE is

\(\sup_{q} E_{p(y, \theta | d)}[\log q(\theta | y, d)]\)

where \(q\) is any distribution on \(\theta\).

This method optimises the loss over a given guide family, guide, representing \(q\).
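
As a rough illustration, such a guide could be an amortised Normal posterior whose mean is an affine function of the observed y. The argument list, site names, and parameter shapes below are assumptions made for this example; the exact calling convention the implementation expects of the guide is not spelled out here.

import torch
from torch.distributions import constraints

import pyro
import pyro.distributions as dist

def posterior_guide(y_dict, design, observation_labels, target_labels):
    # Amortised q(theta | y, d): a Normal whose mean is affine in y.
    y = y_dict["y"]
    w = pyro.param("ba_w", torch.zeros(design.shape))
    b = pyro.param("ba_b", torch.zeros(design.shape))
    scale = pyro.param("ba_scale", torch.ones(design.shape),
                       constraint=constraints.positive)
    pyro.sample("theta", dist.Normal(w * y + b, scale))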

Parameters:
  • model (function) – A pyro model accepting design as only argument.
  • design (torch.Tensor) – Tensor representation of design
  • observation_labels (list) – A subset of the sample sites present in model. These sites are regarded as future observations and other sites are regarded as latent variables over which a posterior is to be inferred.
  • target_labels (list) – A subset of the sample sites over which the posterior entropy is to be measured.
  • num_samples (int) – Number of samples per iteration.
  • num_steps (int) – Number of optimisation steps.
  • guide (function) – guide family for use in the (implicit) posterior estimation. The parameters of guide are optimised to maximise the Barber-Agakov objective.
  • optim (pyro.optim.Optim) – Optimiser to use.
  • return_history (bool) – If True, also returns a tensor giving the loss function at each step of the optimisation.
  • final_design (torch.Tensor) – The final design tensor to evaluate at. If None, uses design.
  • final_num_samples (int) – The number of samples to use at the final evaluation. If None, uses num_samples.
Returns: APE estimate, optionally including the full optimisation history
Return type: torch.Tensor or tuple

class EwmaLog(alpha)[source]

Logarithm function with exponentially weighted moving average for gradients.

For an input inputs, this function returns inputs.log(). However, it computes the gradient as

\(\frac{\sum_{t=0}^{T-1} \alpha^t}{\sum_{t=0}^{T-1} \alpha^t x_{T-t}}\)

where \(x_t\) are historical input values passed to this function, \(x_T\) being the most recently seen value.

This gradient may help with numerical stability when the sequence of inputs to the function is convergent.
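
As a purely numeric illustration of this gradient (not of the class's own interface): for alpha = 0.5 and historical inputs 4, 2, 1 (most recent last), the formula gives (1 + 0.5 + 0.25) / (1*1 + 0.5*2 + 0.25*4) ≈ 0.58, instead of the instantaneous gradient 1 / 1 = 1.

def ewma_log_gradient(history, alpha):
    # history lists inputs seen so far; history[-1] is the most recent, x_T.
    # This mirrors the gradient formula above; it is not the EwmaLog class API.
    weights = [alpha ** t for t in range(len(history))]
    numerator = sum(weights)
    denominator = sum(w * x for w, x in zip(weights, reversed(history)))
    return numerator / denominator

print(ewma_log_gradient([4.0, 2.0, 1.0], alpha=0.5))  # 1.75 / 3.0 ~= 0.583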