
Algorithms & Computationally Intensive Inference seminars

The seminars will take place on Fridays at 1 pm (UK time) in room MB0.08.

2023-2024 Organisers: Shenggang Hu and Andi Wang.

If you would like to speak, or you want to be included in any emails, please contact one of the organisers.

Website URL: www.warwick.ac.uk/compstat

Mailing List Sign-Up: http://mailman1.csv.warwick.ac.uk/mailman/listinfo/algorithmseminar

Mailing List: algorithmseminar@listserv.csv.warwick.ac.uk (NB - only approved members can post)

Term 2:
Date Speaker Title
12/01 Shenggang Hu Statistical Disaggregation — a Monte Carlo Approach for Imputation under Constraints

Note:

Hybrid

Slides

Statistical disaggregation has become increasingly important for smart energy systems. A typical example of such a disaggregation problem is to learn energy consumption at a higher resolution from data recorded at a lower resolution, e.g., to decompose a daily reading into readings at a finer level. Constrained models are often used in such problems and, compared with their unconstrained counterparts, they reduce uncertainty and improve overall performance. However, these constrained models usually cannot be expressed as ordinary distributions because their density functions are intractable, which makes further analysis difficult. We introduce a novel Monte Carlo sampling algorithm based on Langevin diffusions and rejection sampling to solve the problem of sampling from equality-constrained models. We test our method on statistical disaggregation problems for electricity consumption datasets, and our approach provides better uncertainty estimation and accuracy in data imputation than naive/unconstrained methods.
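As a much simpler illustration of what sampling under an equality constraint means, the sketch below runs a random-walk Metropolis chain that never leaves the hyperplane {x : sum(x) = s}: Gaussian proposals are projected onto the zero-sum directions before the accept/reject step. This is only a toy sketch of the constrained-sampling setting, not the Langevin-plus-rejection algorithm from the talk; the Gaussian target, dimension, and step size are arbitrary illustrative choices.

```python
# Toy illustration (not the Langevin + rejection algorithm from the talk):
# random-walk Metropolis restricted to the hyperplane {x : sum(x) = s}.
import numpy as np

rng = np.random.default_rng(0)

def log_density(x):
    # Hypothetical unconstrained log-density (standard Gaussian here).
    return -0.5 * np.sum(x**2)

def constrained_rwm(x0, n_iter=5000, step=0.3):
    x = x0.copy()
    samples = []
    for _ in range(n_iter):
        z = rng.standard_normal(len(x))
        z -= z.mean()                    # project onto {z : sum(z) = 0}
        prop = x + step * z              # proposal keeps sum(x) unchanged
        if np.log(rng.uniform()) < log_density(prop) - log_density(x):
            x = prop
        samples.append(x.copy())
    return np.array(samples)

# Start from a point satisfying the constraint sum(x) = 3, e.g. a daily total
# split evenly across d = 6 sub-periods.
out = constrained_rwm(np.full(6, 0.5))
print(out.mean(axis=0), out.sum(axis=1)[:3])   # every sample still sums to 3
```

Because the projected proposal is symmetric on the hyperplane, the usual Metropolis ratio suffices to target the restriction of the density to the constraint set.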
19/01 Andrew Duncan (Imperial) Energy Discrepancies: Fitting Unnormalised Probability Models without Scores

Note:

Hybrid

Slides

The problem of fitting unnormalised densities to data is a common challenge across statistics (e.g. likelihood-free inference) and machine learning (e.g. energy-based models). Existing solutions either involve score matching or its variants, which require computation of score functions and their derivatives, or require expensive Markov chain Monte Carlo simulations. I present a very simple alternative approach called Energy Discrepancy (ED), which requires neither. I show that, as a divergence, ED interpolates between score matching and KL divergence. As a result, minimum ED estimation overcomes the problem of “near-sightedness” encountered in score-based estimation methods, while also enjoying theoretical guarantees. This score-free approach also opens the door to fitting unnormalised models on discrete spaces. I will demonstrate how to implement the method in practice, and its application to a few problems.
26/01 Matthew Thorpe (Warwick) Discrete-To-Continuum Limits in Graph-Based Semi-Supervised Learning

Note:

In-person

Slides

Semi-supervised learning (SSL) is the problem of finding missing labels from a partially labelled data set. The heuristic one uses is that “similar feature vectors should have similar labels”. The notion of similarity between feature vectors explored in this talk comes from a graph-based geometry, where an edge is placed between feature vectors that are closer than some connectivity radius. A natural variational solution to the SSL problem is to minimise a Dirichlet energy built from the graph topology, and a natural question is what happens as the number of feature vectors goes to infinity. In this talk I will give results on the asymptotics of graph-based SSL using an optimal transport topology. The results will include a lower bound on the number of labels needed for consistency.
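As a concrete instance of the variational approach described above, the following sketch builds an epsilon-neighbourhood graph on toy two-cluster data, forms the graph Laplacian, and extends two known labels to the unlabelled points by minimising the Dirichlet energy subject to the label constraints (the classical harmonic-extension solution). The data, connectivity radius, and labels are invented for illustration and are not from the talk.

```python
# Minimal sketch of graph-based SSL via Dirichlet-energy minimisation
# (harmonic extension of the labels); toy data, not from the talk.
import numpy as np

rng = np.random.default_rng(1)

# Feature vectors: two noisy clusters in the plane.
X = np.vstack([rng.normal(0, 0.2, (30, 2)), rng.normal(2, 0.2, (30, 2))])
labelled = np.array([0, 30])            # one labelled point per cluster
y_lab = np.array([0.0, 1.0])            # their labels

# epsilon-graph: edge iff points are closer than the connectivity radius.
eps = 0.8
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = (D2 < eps**2).astype(float)
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(1)) - W               # graph Laplacian

# Minimise u^T L u subject to u = y on the labelled set:
# solve L_uu u_u = -L_ul y_lab for the unlabelled nodes.
unlab = np.setdiff1d(np.arange(len(X)), labelled)
u = np.empty(len(X))
u[labelled] = y_lab
u[unlab] = np.linalg.solve(L[np.ix_(unlab, unlab)],
                           -L[np.ix_(unlab, labelled)] @ y_lab)

print((u > 0.5).astype(int))            # recovered labels, cluster by cluster
```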
30/01

Rocco Caprio (Warwick)

Andi Wang (Warwick)

1. A calculus for Markov chain Monte Carlo: studying approximations in algorithms

2. Comparison theorems for Hybrid Slice Sampling

Note:

In-person

Slides 1

Slides 2

1.

Markov chain Monte Carlo (MCMC) algorithms are based on the construction of a Markov chain with transition probabilities $P_\mu(x,\cdot)$, where $\mu$ denotes an invariant distribution of interest. In this work, we view these transition probabilities as functions of their invariant distributions, and we develop a notion of derivative of an MCMC kernel with respect to its invariant distribution. We build around this concept a set of tools that we refer to as Markov Chain Monte Carlo Calculus. This allows us to compare Markov chains with different invariant distributions within a suitable class via what we refer to as mean value inequalities. We explain how MCMC Calculus provides a natural framework to study algorithms that use an approximation of an invariant distribution, and illustrate how it suggests practical guidelines for the efficiency of MCMC algorithms. We conclude by showing how the tools developed can be applied to prove convergence of interacting and sequential MCMC algorithms, which arise in the context of particle filtering.

2.

Hybrid Slice Sampling is a variant of Simple Slice Sampling, whereby exact sampling from the uniform distribution on a level set is replaced with sampling from a Markov kernel which is merely invariant with respect to this distribution. It is known that the spectral gap of the Hybrid Slice Sampling algorithm, if it exists, is smaller than that of the Simple Slice Sampling algorithm. Following the framework first presented in Andrieu, Lee, Power, Wang (2022), we shall derive new comparison results, giving convergence guarantees for Hybrid Slice Sampling, relative to convergence rates of Simple Slice Sampling. These comparisons are valid even in subgeometric scenarios when the spectral gap is 0, and significantly extend and simplify previous results in the literature.
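To make the distinction concrete, here is a one-dimensional toy sketch for a standard Gaussian target: the simple slice sampler draws the next point exactly uniformly on the level set, while the hybrid variant replaces that exact draw with a few random-walk Metropolis moves that merely leave the uniform distribution on the slice invariant. The target, inner step size, and iteration counts are arbitrary illustrative choices, not from the talk.

```python
# Toy 1D comparison of simple vs. hybrid slice sampling for a N(0,1) target.
import numpy as np

rng = np.random.default_rng(2)
log_pi = lambda x: -0.5 * x**2           # unnormalised log-density

def simple_slice_step(x):
    # Draw the auxiliary height, then sample exactly uniformly on the slice
    # {x' : log_pi(x') > log_u}, which here is the interval (-a, a).
    log_u = log_pi(x) + np.log(rng.uniform())
    a = np.sqrt(-2.0 * log_u)
    return rng.uniform(-a, a)

def hybrid_slice_step(x, inner=5, step=1.0):
    # Same auxiliary height, but the exact uniform draw is replaced by a few
    # random-walk Metropolis moves that leave Uniform(slice) invariant.
    log_u = log_pi(x) + np.log(rng.uniform())
    for _ in range(inner):
        prop = x + step * rng.standard_normal()
        if log_pi(prop) > log_u:         # accept iff the proposal stays on the slice
            x = prop
    return x

x_s, x_h = 0.0, 0.0
simple, hybrid = [], []
for _ in range(20000):
    x_s = simple_slice_step(x_s); simple.append(x_s)
    x_h = hybrid_slice_step(x_h); hybrid.append(x_h)

print(np.var(simple), np.var(hybrid))    # both should be close to 1
```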

02/02 Connor Duffin (Cambridge) Time-Dependent Statistical Finite Element Problems: Unstructured Data, Unknown Observation Operators, and Inversion

Note:

In-person

Slides

The so-called "statistical finite element method" (statFEM) is a novel approach to conducting statistically coherent data assimilation for PDEs. A spatially smooth additive noise is included in the PDE, yielding a prior distribution which can be used to assimilate data in an online fashion, the well-known filtering problem. Results show that the method performs well and deals with model misspecification across many contexts (such as internal waves, reaction-diffusion systems, and shallow water flows). However, an obvious problem with the method is its reliance on the observation operator being known. The observation operator is the mapping from the model to the data and, in many scenarios, can be an unknown and highly nontrivial object. In this talk, I will present a statFEM-based approach to conducting data assimilation with unknown observation operators, motivated through solving the inverse problem for a collection of nonlinear differential equations. We will motivate our work through the Korteweg-de Vries equation, and discuss further case studies with the Lorenz-63 system of ODEs and the wave equation.
 
This is joint work with Alex Glyn-Davies (University of Cambridge), Dr. Deniz Akyildiz (Imperial College London), and Prof. Mark Girolami (University of Cambridge & Alan Turing Institute).
09/02 Guglielmo Gattiglio (Warwick)
Nearest Neighbor GParareal: Improving Scalability of Gaussian Processes for Parallel-in-Time Solvers

Note:

Hybrid

Slides

With the advent of supercomputers, multi-processor environments and parallel-in-time (PiT) algorithms provide ways to integrate ordinary differential equations (ODEs) over long time intervals, a task often infeasible with sequential time-stepping solvers within realistic timeframes. A recent approach, GParareal, combines machine learning (Gaussian processes) with traditional PiT methodology (Parareal) to achieve faster parallel speed-ups. Unfortunately, the applicability of the model is limited to a small number of computer cores and ODE dimensions. We present Nearest Neighbor GParareal (NN-GParareal), a data-enriched parallel-in-time integration algorithm that builds upon GParareal by improving its scalability properties for higher-dimensional systems and increased processor counts. Through data reduction, the model complexity is reduced from cubic in the sample size to log-linear, yielding a fast, automated procedure to integrate initial value problems over long intervals. The practical utility of NN-GParareal is demonstrated theoretically and empirically through its evaluation on nine different systems. Our analysis offers tangible evidence of NN-GParareal's behavior, advantages, and validity.
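The data-reduction idea behind the scalability gain can be illustrated independently of Parareal: predict with a Gaussian process trained only on the k nearest neighbours of the query point, so each prediction costs O(k^3) rather than O(n^3). The kernel, data, and value of k below are illustrative choices, not those used in NN-GParareal.

```python
# Sketch of the data-reduction idea behind NN-GParareal: predict with a GP
# trained only on the k nearest neighbours of the query (toy 1D regression).
import numpy as np

rng = np.random.default_rng(3)
kernel = lambda a, b: np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2)

# Training data: noisy observations of sin(x).
X = np.sort(rng.uniform(0, 10, 200))
y = np.sin(X) + 0.1 * rng.standard_normal(200)
noise = 0.1**2

def gp_mean(Xtr, ytr, xstar):
    # Standard GP posterior mean with a squared-exponential kernel.
    K = kernel(Xtr, Xtr) + noise * np.eye(len(Xtr))
    return kernel(np.atleast_1d(xstar), Xtr) @ np.linalg.solve(K, ytr)

def nn_gp_mean(xstar, k=15):
    # Keep only the k nearest training points: O(k^3) instead of O(n^3).
    idx = np.argsort(np.abs(X - xstar))[:k]
    return gp_mean(X[idx], y[idx], xstar)

xq = 4.2
print(gp_mean(X, y, xq), nn_gp_mean(xq), np.sin(xq))   # all close to sin(4.2)
```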
16/02 Giorgos Vasdekis (UCL) Skew-symmetric sampling schemes for SDEs and where to find them

Note:

Hybrid

Slides

Locally balanced algorithms are a new class of MCMC algorithms, recently introduced by Livingstone and Zanella (2022). One of these algorithms, the Barker algorithm, has been shown to be robust both to heteroskedasticity of the posterior target and to the choice of step size. At the same time, the algorithm seems to preserve the high-dimensional properties of state-of-the-art MCMC, making it an interesting alternative to the existing literature. It turns out that in order to sample from the Barker algorithm, one can use ideas from sampling from skew-symmetric distributions. We will transfer these ideas to the context of simulating from diffusion processes and suggest a new class of unadjusted MCMC algorithms which are robust with respect to the step size.
 
This is joint work with S. Livingstone, N. Nusken and R. Zhang.
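For concreteness, the sketch below implements one step of the Barker proposal of Livingstone and Zanella (2022) for a one-dimensional target: a Gaussian increment whose sign is skewed towards the gradient, followed by a Metropolis-Hastings correction. The unadjusted schemes for diffusions discussed in the talk are not reproduced here; the target and step size are placeholders.

```python
# Sketch of the Barker proposal (Livingstone & Zanella, 2022) in 1D.
import numpy as np

rng = np.random.default_rng(4)
log_pi = lambda x: -0.5 * x**2           # toy target: N(0,1)
grad_log_pi = lambda x: -x

def barker_step(x, sigma=1.0):
    z = sigma * rng.standard_normal()
    # Skew the sign of the increment towards the gradient direction.
    p_plus = 1.0 / (1.0 + np.exp(-z * grad_log_pi(x)))
    b = 1.0 if rng.uniform() < p_plus else -1.0
    y = x + b * z
    xi = y - x
    # Metropolis-Hastings correction; the proposal ratio simplifies to
    # (1 + exp(-xi * grad(x))) / (1 + exp(xi * grad(y))).
    log_alpha = (log_pi(y) - log_pi(x)
                 + np.log1p(np.exp(-xi * grad_log_pi(x)))
                 - np.log1p(np.exp(xi * grad_log_pi(y))))
    return y if np.log(rng.uniform()) < log_alpha else x

x, chain = 0.0, []
for _ in range(20000):
    x = barker_step(x)
    chain.append(x)
print(np.mean(chain), np.var(chain))     # should be close to 0 and 1
```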
23/02 Pierre Nyquist (Chalmers University of Technology, Gothenburg University) Large deviations for Markov chain Monte Carlo methods: the surprisingly curious case of Metropolis-Hastings

Note:

Hybrid

Slides

Markov chain Monte Carlo (MCMC) methods have become the workhorse for numerical computations in a range of scientific disciplines, e.g., computational chemistry and physics, statistics, and machine learning. The performance of MCMC methods has therefore become an important topic at the intersection of probability theory and (computational) statistics: e.g., when the underlying distribution one is trying to sample from becomes sufficiently complex, convergence speed and/or the cost per iteration becomes an issue for most MCMC methods.
The analysis, and subsequent design, of MCMC methods has to a large degree relied on classical tools used to determine the speed of convergence of Markov chains, e.g., mixing times, spectral gaps and functional inequalities (Poincaré, log-Sobolev). An alternative avenue is to use the theory of large deviations for empirical measures. In this talk I will first give a general outline of this approach to analysing MCMC methods, along with some recent examples. I will then consider the specific case of the Metropolis-Hastings algorithm, the most classical amongst all MCMC methods and a foundational building block for many more advanced methods. Despite the simplicity of this method, its theoretical analysis remains a rich area, and from the large deviation perspective it is surprisingly difficult to treat. As a first step we show a large deviation principle for the underlying Markov chain, extending the celebrated Donsker-Varadhan theory. Time permitting, I will also discuss ongoing and future work on using this result to better understand both the Metropolis-Hastings method and more advanced methods, such as approximate Bayesian computation (ABC-MCMC) and the Metropolis-adjusted Langevin algorithm (MALA).
 
The talk will be self-contained and no prior knowledge of either MCMC methods or large deviations is required.
 
The talk is primarily based on joint work with Federica Milinanni (KTH).
01/03 Evandro Konzen (Warwick) Efficient Bayesian modelling of infectious diseases in wildlife: an application to bovine tuberculosis in badgers

Note:

In-person

Slides

To better understand the dynamics of infectious diseases of wildlife, it is crucial to be able to fit dynamic transmission models to observed data in a robust and efficient way, in order to estimate key epidemiological parameters and generate well-calibrated predictive information. In practice, epidemiological events are at best only partially observed, and so it is necessary to infer missing information alongside the model parameters as part of the inference routine, requiring computationally intensive inference algorithms whose computational load increases non-linearly with population size and with the dimensionality of the hidden states.
With this in mind, we implement a recently proposed individual forward filtering backward sampling algorithm to fit a complex individual-based epidemic model to data from a large-scale longitudinal study of bovine tuberculosis in badgers. This data set, from Woodchester Park in south-west England, comprises >2,000 badgers across 34 social groups over a 40-year period. We deal with many complexities typical to endemic wildlife disease systems: incomplete sampling of individuals over time (through capture-mark-recapture events), the use of multiple diagnostic tests, spatial meta-population structures, and non-Markovian demographic aspects such as age-dependent mortality rates (with censoring), all alongside a hidden stochastic compartmental model of disease spread. The method produces full posterior distributions for the parameters, and predictive distributions for the hidden states over time for each individual, and fits in just a few hours on a desktop machine.
We also propose a novel individual-level reproduction number which accounts for major sources of uncertainty of the disease system, and from it provide quantitative evidence for the presence of superspreader badgers in Woodchester Park. The inference framework is very flexible, and could be applied to other individual-level disease systems, and we will discuss future extensions to explore further important epidemiological questions.
08/03 Emma Horton (Warwick) A theoretical analysis of one-dimensional discrete generation ensemble Kalman particle filters

Note:

Hybrid

Slides

Despite the widespread use of discrete generation ensemble Kalman particle filtering methodology to solve nonlinear and high-dimensional filtering and inverse problems, little is known about their mathematical foundations. In this talk, we give a gentle introduction to these models and build on the mathematical analysis of these sophisticated particle filters developed by Le Gland–Monbet–Tran and by Mandel–Cobb–Beezley. We discuss a variety of results including stochastic perturbation analysis of the fluctuations, the stability, and the long-time performance of these discrete generation ensemble Kalman particle filters, including time-uniform and nonasymptotic mean-error estimates that apply to possibly unstable signals.
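For readers unfamiliar with the methodology, the sketch below shows a single stochastic (perturbed-observation) ensemble Kalman analysis step, the basic update underlying ensemble Kalman filtering. It is a generic, textbook-style update on a toy linear-Gaussian example, not the specific discrete generation models analysed in the talk; the state dimension, observation operator, and noise levels are illustrative.

```python
# Sketch of one stochastic (perturbed-observation) ensemble Kalman analysis
# step on a toy linear-Gaussian example.
import numpy as np

rng = np.random.default_rng(5)

def enkf_analysis(ensemble, y, H, R):
    """ensemble: (N, d) forecast members; y: observation; H: obs operator."""
    N = ensemble.shape[0]
    Hx = ensemble @ H.T                               # (N, m) predicted observations
    x_mean, Hx_mean = ensemble.mean(0), Hx.mean(0)
    A, B = ensemble - x_mean, Hx - Hx_mean
    C_xy = A.T @ B / (N - 1)                          # state-observation covariance
    C_yy = B.T @ B / (N - 1)                          # observation covariance
    K = C_xy @ np.linalg.inv(C_yy + R)                # Kalman gain estimate
    # Each member assimilates an independently perturbed observation.
    perturbed = y + rng.multivariate_normal(np.zeros(len(y)), R, size=N)
    return ensemble + (perturbed - Hx) @ K.T

# Toy example: 2D state, observe the first coordinate with noise.
H = np.array([[1.0, 0.0]])
R = np.array([[0.25]])
prior = rng.multivariate_normal([0.0, 0.0], np.eye(2), size=500)
posterior = enkf_analysis(prior, np.array([1.0]), H, R)
print(posterior.mean(0))   # first coordinate pulled towards the observation
```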
15/03

Harita Dellaporta (Warwick)

Robust Bayesian inference for Berkson and Classical Measurement Error models

Note:

Hybrid

Slides

Measurement error occurs when a set of covariates influencing a response variable is corrupted by noise. This can lead to misleading inference outcomes, particularly in problems where accurately estimating the relationship between covariates and response variables is crucial, such as causal effect estimation. Existing methods for dealing with measurement error often rely on strong assumptions, such as knowledge of the error distribution or its variance and the availability of replicated measurements of the covariates. We propose a Bayesian Nonparametric Learning method which is robust to mismeasured covariates, does not require the preceding assumptions, and is able to incorporate prior beliefs about the true error distribution. This approach gives rise to a general framework which is suitable for both Classical and Berkson error models via the specification of the prior centering measure of a Dirichlet Process (DP). Moreover, it allows for some flexibility in the choice of loss function depending on the type of regression model and the desired robustness properties. In this talk I will introduce the framework, provide some theoretical results on bounding the generalisation error based on the Maximum Mean Discrepancy (MMD) loss, and empirically show the effectiveness of the proposed method on synthetic and real-world examples.
08/04

Mareike Hasenpflug (Passau)

 

Note:

 

Term 1:

Date Speaker Title
06/10 Artavazd Maranjyan (KAUST) GradSkip: Communication-Accelerated Local Gradient Methods with Better Computational Complexity

Note:

Hybrid

Slides

Abstract:

We study a class of distributed optimization algorithms that aim to alleviate high communication costs by allowing the clients to perform multiple local gradient-type training steps prior to communication. While methods of this type have been studied for about a decade, the empirically observed acceleration properties of local training eluded all attempts at theoretical understanding. In a recent breakthrough, Mishchenko et al. (ICML 2022) proved that local training, when properly executed, leads to provable communication acceleration, and this holds in the strongly convex regime without relying on any data similarity assumptions. However, their method ProxSkip requires all clients to take the same number of local training steps in each communication round. Inspired by a common-sense intuition, we start our investigation by conjecturing that clients with "less important" data should be able to get away with fewer local training steps without this impacting the overall communication complexity of the method. It turns out that this intuition is correct: we managed to redesign the original ProxSkip method to achieve this. In particular, we prove that our modified method, for which we coin the name GradSkip, converges linearly under the same assumptions and has the same accelerated communication complexity, while the number of local gradient steps can be reduced relative to a local condition number. We further generalize our method by extending the randomness of probabilistic alternations to arbitrary unbiased compression operators and considering a generic proximable regularizer. This generalization, which we call GradSkip+, recovers several related methods in the literature as special cases. Finally, we present an empirical study on carefully designed toy problems that confirms our theoretical claims.
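As a rough illustration of the "local steps before communication" idea, the toy sketch below runs local gradient descent on heterogeneous quadratic objectives and synchronises (averages) the client models only with a small probability at each step. This is neither ProxSkip nor GradSkip, only the communication pattern they build on; the objectives, learning rate, and communication probability are all invented for illustration.

```python
# Toy sketch of local training with infrequent communication: each client runs
# local gradient steps and the models are averaged (a "communication") only
# with probability p per step. Not ProxSkip/GradSkip, just the basic idea.
import numpy as np

rng = np.random.default_rng(6)

# Each client holds a simple quadratic objective f_i(x) = 0.5 * ||x - b_i||^2.
n_clients, dim = 10, 5
b = rng.standard_normal((n_clients, dim))
optimum = b.mean(0)                        # minimiser of the average objective

def local_gd(p_comm=0.2, lr=0.1, n_steps=500):
    x = np.zeros((n_clients, dim))         # one model copy per client
    comms = 0
    for _ in range(n_steps):
        grads = x - b                      # gradient of each local objective
        x = x - lr * grads                 # local gradient step on every client
        if rng.uniform() < p_comm:         # occasional synchronisation round
            x[:] = x.mean(0)
            comms += 1
    return x.mean(0), comms

x_hat, comms = local_gd()
print(np.linalg.norm(x_hat - optimum), comms)   # small error, few communications
```

In this quadratic toy the averaged iterate still reaches the global minimiser; in general, heterogeneous local objectives cause client drift, which is what the corrections developed in ProxSkip and GradSkip are designed to handle.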

13/10 Robin Evans (Oxford) Parameterizing and Simulating from Causal Models

Note:

Hybrid

Slides

Many statistical problems in causal inference involve a probability distribution other than the one from which data are actually observed; as an additional complication, the object of interest is often a marginal quantity of this other probability distribution. This creates many practical complications for statistical inference, even where the problem is non-parametrically identified. In particular, it is difficult to perform likelihood-based inference, or even to simulate from the model in a general way.

We introduce the frugal parameterization, which places the causal effect of interest at its centre, and then builds the rest of the model around it. We do this in a way that provides a recipe for constructing a regular, non-redundant parameterization using causal quantities of interest. In the case of discrete variables we can use odds ratios to complete the parameterization, while in the continuous case copulas are the natural choice. Our methods allow us to construct and simulate from models with parametrically specified causal distributions, and fit them using likelihood-based methods, including fully Bayesian approaches. Our proposal includes parameterizations for the average causal effect and effect of treatment on the treated, as well as other common quantities of interest.

I will also discuss some other applications of the frugal parameterization, including to survival analysis, parameterizing nested Markov models, and ‘Many Data’: combining randomized and observational datasets in a single parametric model. This is joint work with Vanessa Didelez.

 

Reference

Evans, R.J. and Didelez, V. Parameterizing and Simulating from Causal Models (with discussion), J. Roy. Statist. Soc. Ser. B, 2023.

20/10 Julian Hofstadler (Universität Passau) Error Bounds for Adaptive MCMC within Doubly Intractable Distributions

Note: 

Hybrid

Slides

We study the problem of sampling from doubly intractable distributions, where, in addition to an unknown normalisation constant, we have to deal with an intractable factor in the non-normalised density. Distributions (or densities) of this form occur, e.g., in biophysics or statistical mechanics. Not being able to evaluate the non-normalised density is problematic in classical MCMC algorithms, and different approaches have been suggested to overcome this issue. We consider an adaptive MCMC approach and show error bounds for the corresponding Monte Carlo estimator for numerical integration.


The talk is based on ongoing work with B. Eltzner, B. de Groot, M. Habeck and D. Rudolf.

27/10 Michael Faulkner (Bristol) Phase transitions, metastability and critical slowing down in statistical physics

Note:

In-person

Slides

Sampling algorithms are commonplace in statistics and machine learning – in particular, in Bayesian computation – and have been used for decades to enable inference, prediction and model comparison in many different settings. They are also widely used in statistical physics, where many popular sampling algorithms first originated [1, 2]. At a high level, the goals within each discipline are the same – to sample from and approximate statistical expectations with respect to some probability distribution – but the motivations, nomenclature and methods of explanation differ significantly. This has led to challenges in communicating between the fields, and indeed the fundamental goals of one field are often misunderstood in the other. In this talk, we elucidate statistical physics for the statistician, emphasising that probability models are studied as functions of thermodynamic hyperparameters such as the temperature. This is particularly useful for characterising phase transitions, i.e., boundaries in thermodynamic hyperparameter space between distinct thermodynamic phases.
We then move on to sampling algorithms, with a particular focus on the behaviour of the Metropolis algorithm [1] when simulating the 2D Ising and 2D XY models of magnetism. Metropolis dynamics are metastable in the low-temperature phase of each model, mixing between states of equal probability density on a timescale that diverges with system size (proportional to the dimensionality of parameter space). Moreover, the Metropolis algorithm also suffers from the closely related phenomenon of critical slowing down at phase transitions. These strongly correlated dynamics are characterised by asymptotically long integrated autocorrelation times, due to a flattening of the target density that essentially results from the system trying to exist simultaneously in both thermodynamic phases. Indeed, these key aspects of statistical physics have led to innovations in sampling algorithms that inform the Bayesian world. In particular, we present the Swendsen–Wang [3], Wolff [4] and event-chain Monte Carlo [5-7] algorithms. The first two simulate the 2D Ising model and were developed in response to the metastability and critical slowing down of the Metropolis algorithm. They circumvent both phenomena to mix with low autocorrelation, independently of system size. We then show that event-chain Monte Carlo similarly circumvents the low-temperature Metropolis metastability of the 2D XY model [7] and discuss its potential utility in bypassing a hypothesised critical slowing down at the phase transition. This talk is based on a recent review paper on the subject [8]. (A minimal Metropolis sketch for the 2D Ising model follows the reference list below.)
[1] Metropolis et al., J. Chem. Phys. 21 1087 (1953)
[2] Alder & Wainwright, J. Chem. Phys. 27 1208 (1957)
[3] Swendsen & Wang, Phys. Rev. Lett. 58 86 (1987)
[4] Wolff, Phys. Rev. Lett. 62 361 (1989)
[5] Bernard, Krauth & Wilson, Phys. Rev. E 80 056704 (2009)
[6] Michel, Mayer & Krauth, EPL (Europhys. Lett.) 112 20003 (2015)
[7] Faulkner, arXiv:2209.03699 (2022)
[8] Faulkner & Livingstone, arXiv:2209.03699 (2022)
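To make the sampling setting above concrete, here is a minimal single-spin-flip Metropolis simulation of the 2D Ising model in the spirit of [1]; the lattice size, inverse temperature, and number of sweeps are arbitrary toy choices.

```python
# Minimal single-spin-flip Metropolis sampler for the 2D Ising model
# (toy parameters, periodic boundary conditions).
import numpy as np

rng = np.random.default_rng(7)

def metropolis_sweep(spins, beta):
    L = spins.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(0, L, size=2)
        # Sum of the four nearest neighbours (periodic boundaries).
        nb = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
              + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        dE = 2.0 * spins[i, j] * nb        # energy change of flipping spin (i, j)
        if dE <= 0 or rng.uniform() < np.exp(-beta * dE):
            spins[i, j] *= -1
    return spins

L, beta = 32, 0.6                          # beta above the critical value ~0.4407
spins = rng.choice([-1, 1], size=(L, L))
for sweep in range(200):
    spins = metropolis_sweep(spins, beta)

# Magnetisation per spin; from a random start the chain can linger in
# multi-domain (metastable) configurations, illustrating the slow mixing
# discussed in the abstract.
print(abs(spins.mean()))
```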
03/11 Marie-Therese Wolfram (Warwick) Ensemble Inference Methods for Models with Noisy and Expensive Likelihoods (joint work with O.R. Dunbar, A. Duncan, A.M. Stuart)

Note: 

Hybrid

Slides

The increasing availability of data presents an opportunity to calibrate unknown parameters which appear in complex models of phenomena in the biomedical, physical, and social sciences. However, model complexity often leads to parameter-to-data maps which are expensive to evaluate and are only available through noisy approximations.
 
In this talk we focus on interacting particle systems for the solution of the resulting inverse problems for parameters. Of particular interest is the case where the available forward model evaluations are subject to rapid fluctuations, in parameter space, superimposed on the smoothly varying large-scale parametric structure of interest. Multiscale analysis is then used to analyze the behavior of interacting particle system algorithms when rapid fluctuations, which we refer to as noise, pollute the large-scale parametric dependence of the parameter-to-data map. Ensemble Kalman methods and Langevin-based methods (the latter use the derivative of the parameter-to-data map) are compared in this light. The ensemble Kalman methods are shown to behave favorably in the presence of noise in the parameter-to-data map, whereas Langevin methods are adversely affected. On the other hand, Langevin methods have the correct equilibrium distribution in the setting of noise-free forward models, while ensemble Kalman methods only provide an uncontrolled approximation, except in the linear case. Therefore a new class of algorithms, ensemble Gaussian process samplers, which combine the benefits of both ensemble Kalman and Langevin methods, are introduced and shown to perform favorably.
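A basic ensemble Kalman inversion loop is sketched below to fix notation: an ensemble of parameter candidates is repeatedly updated using empirical covariances between parameters and forward-model outputs. The linear forward map, noise level, and iteration count are illustrative choices only and do not reflect the noisy, multiscale setting analysed in the talk.

```python
# Sketch of basic ensemble Kalman inversion (EKI) for a toy inverse problem:
# recover parameters theta from y = G(theta) + noise using ensemble updates.
import numpy as np

rng = np.random.default_rng(8)

A_op = np.array([[1.0, 2.0], [0.5, -1.0]])
G = lambda theta: A_op @ theta                    # toy linear forward map
theta_true = np.array([1.0, 2.0])
Gamma = 0.01 * np.eye(2)                          # observation noise covariance
y = G(theta_true) + rng.multivariate_normal(np.zeros(2), Gamma)

ensemble = rng.normal(0.0, 2.0, size=(200, 2))    # prior ensemble of parameters
for _ in range(20):                               # EKI iterations
    Gt = np.array([G(t) for t in ensemble])       # forward map on each member
    t_mean, g_mean = ensemble.mean(0), Gt.mean(0)
    A, B = ensemble - t_mean, Gt - g_mean
    C_tg = A.T @ B / (len(ensemble) - 1)          # parameter-output covariance
    C_gg = B.T @ B / (len(ensemble) - 1)          # output covariance
    K = C_tg @ np.linalg.inv(C_gg + Gamma)
    perturbed = y + rng.multivariate_normal(np.zeros(2), Gamma, size=len(ensemble))
    ensemble = ensemble + (perturbed - Gt) @ K.T

print(ensemble.mean(0))                           # approximately recovers theta_true
```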
10/11 Joe Benton (Oxford) From Denoising Diffusions to Denoising Markov Models

Note:

In-person

Slides

Denoising diffusions are state-of-the-art generative models exhibiting remarkable empirical performance. They work by diffusing the data distribution into a Gaussian distribution and then learning to reverse this noising process to obtain synthetic datapoints. In this talk, we will introduce denoising Markov models (DMMs), a unifying framework generalising diffusion models to a wide class of spaces. We will show how to derive suitable training objectives for DMMs and discover that DMMs lead to a novel and principled generalization of score matching to arbitrary state spaces.
17/11 Woojung Kim (Warwick) Latent variable approaches for multimorbidity analysis

Note:

In-person

at MB0.02

12.45-13.30

Slides

Multimorbidity refers to the acquisition of multiple long-term chronic health conditions in a single person. It is becoming an increasing public health issue with ageing populations, and gaining insight into patterns of multimorbidity is essential for managing the increased burden on health systems.

In this talk, we focus on latent variable approaches for the analysis of multimorbidity. We introduce a novel discrete latent variable approach for multimorbidity acquisition within a Bayesian framework of latent feature allocation, which allows an individual's morbidity profile to be driven by multiple latent factors and allows the modelling of age-dependent multimorbidity trajectories. We demonstrate the utility of our model in applications to both simulated data and disease event data from patient electronic health records. In each setting, we show our model can reconstruct clinically meaningful latent multimorbidity patterns and their age-dependent prevalence trajectories.

24/11

Filippo Ascolani (Università Bocconi)

Dimension-free mixing times of coordinate-wise samplers for Bayesian hierarchical models

Note:

In-person

Slides

Coordinate-wise MCMC schemes (e.g. Gibbs and Metropolis-within-Gibbs) are popular algorithms to sample from posterior distributions arising from Bayesian hierarchical models. We introduce a novel technique to analyse the asymptotic behaviour of their mixing times, based on tools from Bayesian asymptotics. We apply our methodology to high-dimensional hierarchical models, obtaining dimension-free convergence results for Gibbs under random data-generating assumptions, for a broad class of two-level models with generic likelihood function. Specific examples with Gaussian, binomial and categorical likelihoods are discussed. We then extend the results to Metropolis-within-Gibbs schemes combining the Bayesian asymptotics approach with a novel notion of conditional conductance.
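To fix ideas about the coordinate-wise schemes being analysed, the sketch below gives a minimal Gibbs sampler for a two-level Gaussian hierarchical model with known variances; the specific model, priors, and sizes are illustrative and far simpler than the general likelihoods covered by the results above.

```python
# Minimal Gibbs sampler for a two-level Gaussian hierarchical model:
#   y_ij ~ N(theta_i, sigma2),  theta_i ~ N(mu, tau2),  mu ~ N(0, s2),
# with sigma2, tau2, s2 treated as known for simplicity.
import numpy as np

rng = np.random.default_rng(9)

I, n = 20, 30                              # groups and observations per group
sigma2, tau2, s2 = 1.0, 0.5, 100.0
theta_true = rng.normal(2.0, np.sqrt(tau2), I)
y = theta_true[:, None] + np.sqrt(sigma2) * rng.standard_normal((I, n))
ybar = y.mean(1)

mu, theta = 0.0, np.zeros(I)
mu_draws = []
for _ in range(2000):
    # theta_i | mu, y: Gaussian conditional (conjugate update of each group mean).
    prec = 1.0 / tau2 + n / sigma2
    mean = (mu / tau2 + n * ybar / sigma2) / prec
    theta = mean + rng.standard_normal(I) / np.sqrt(prec)
    # mu | theta: Gaussian conditional given the group-level means.
    prec_mu = I / tau2 + 1.0 / s2
    mean_mu = (theta.sum() / tau2) / prec_mu
    mu = mean_mu + rng.standard_normal() / np.sqrt(prec_mu)
    mu_draws.append(mu)

print(np.mean(mu_draws[500:]))             # posterior mean of mu, near 2.0
```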
01/12 Tengyao Wang (LSE)
Sparse change detection in high-dimensional linear regression

Note:

In-person

Slides

We introduce a new methodology, 'charcoal', for estimating the location of sparse changes in high-dimensional linear regression coefficients, without assuming that those coefficients are individually sparse. The procedure works by constructing different sketches (projections) of the design matrix at each time point, where consecutive projection matrices differ in sign in exactly one column. The sequence of sketched design matrices is then compared against a single sketched response vector to form a sequence of test statistics whose behaviour shows a surprising link to the well-known CUSUM statistics of univariate changepoint analysis. The procedure is computationally attractive, and strong theoretical guarantees are derived for its estimation accuracy. Simulations confirm that our method performs well across a wide range of settings, and a real-world application to a large single-cell RNA sequencing dataset showcases its practical relevance.
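For readers unfamiliar with the univariate ingredient referred to above, the sketch below computes the classical CUSUM statistic for a change in mean in a single sequence; it is not the 'charcoal' procedure itself, and the simulated data are purely illustrative.

```python
# Classical univariate CUSUM scan for a change in mean: compare the means on
# either side of every candidate split point (toy data, not the 'charcoal'
# method from the talk).
import numpy as np

rng = np.random.default_rng(10)

n, true_cp = 400, 250
x = np.concatenate([rng.normal(0.0, 1.0, true_cp),
                    rng.normal(1.0, 1.0, n - true_cp)])

def cusum(x):
    n = len(x)
    k = np.arange(1, n)                       # candidate changepoints
    csum, total = np.cumsum(x)[:-1], x.sum()
    left_mean = csum / k
    right_mean = (total - csum) / (n - k)
    # Standardised difference of means on either side of each split point.
    return np.sqrt(k * (n - k) / n) * np.abs(left_mean - right_mean)

T = cusum(x)
print(1 + np.argmax(T), T.max())              # estimated changepoint, near 250
```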
08/12

Richard Everitt (Warwick)

Massimiliano Tamborrino (Warwick)

1. Ensemble Kalman inversion approximate Bayesian computation
2. Network inference in a stochastic multi-population neural mass model via approximate Bayesian computation

Note:

In-person

Slides 1

Fig 1, Fig 2

Slides 2

1.
Approximate Bayesian computation (ABC) is the most popular approach to inferring parameters in the case where the data model is specified in the form of a simulator. It is not possible to directly implement standard Monte Carlo methods for inference in such a model, because the likelihood is not available to evaluate pointwise. The main idea of ABC is to perform inference on an alternative model with an approximate likelihood, sometimes known as the ABC likelihood. The ABC likelihood is chosen such that an unbiased estimator of it is easy to construct from simulations from the data model, allowing the use of pseudo-marginal Monte Carlo algorithms for inference under the approximate model. The central challenge of ABC is then to trade off the bias introduced by approximating the model against the variance introduced by estimating the ABC likelihood. Stabilising the variance of the ABC likelihood requires a computational cost that is exponential in the dimension of the data; thus the most common approach to reducing variance is to perform inference conditional on summary statistics.
In this talk we introduce a new approach to estimating the ABC likelihood: using ensemble Kalman inversion (EnKI). Ensemble Kalman algorithms are Monte Carlo approximations of Bayesian inference for linear/Gaussian models. These methods are often applied outside of the linear/Gaussian setting, being used, for example, as an alternative to particle filtering for inference in non-linear state space models. Loosely speaking, EnKI can be used as an alternative to an SMC sampler on a sequence of annealed likelihoods. We see that EnKI has some appealing properties when used to estimate the ABC likelihood. It circumvents the exponential scaling with dimension of standard ABC, and does not require the reparameterisation imposed by the rare-event SMC approach of Prangle et al. (2018). It is able to achieve this with no additional simulations from the data model, thus it is likely to bring the most benefit in cases where this model is very expensive to simulate.

2.

The aim of this work is to infer the connectivity structures of brain regions before and during an epileptic seizure. Our contributions are fourfold. First, we propose a 6N-dimensional stochastic differential equation for modelling the activity of N coupled populations of neurons in the brain. This model further develops the (single-population) stochastic Jansen and Rit neural mass model, which describes human electroencephalography (EEG) rhythms, in particular signals with epileptic activity. Second, we construct a reliable and efficient numerical scheme for simulating the model, extending a splitting procedure proposed for one neural population. Third, we propose an adapted Sequential Monte Carlo Approximate Bayesian Computation algorithm for simulation-based inference of both the relevant real-valued model parameters and the {0,1}-valued network parameters, the latter describing the coupling directions among the N modelled neural populations. Fourth, after illustrating and validating the proposed statistical approach on different types of simulated data, we apply it to a set of multi-channel EEG data recorded before and during an epileptic seizure. The real-data experiments suggest, for example, a larger activation in each neural population and a stronger connectivity on the left brain hemisphere during seizure.
Reference: S. Ditlevsen, M. Tamborrino, I. Tubikanec, Preprint at arXiv:2306.15787
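As background for the simulation-based inference used above, the following sketch shows plain ABC rejection sampling, the basic ingredient that SMC-ABC refines, on a deliberately simple, hypothetical simulator; the prior, tolerance, and summary statistics are arbitrary illustrative choices.

```python
# Basic ABC rejection sampling on a deliberately simple, hypothetical simulator.
import numpy as np

rng = np.random.default_rng(11)

def simulator(theta, n=100):
    # Hypothetical simulator: data generated as N(theta, 1).
    return theta + rng.standard_normal(n)

summary = lambda data: np.array([data.mean(), data.std()])

theta_true = 1.5
y_obs = simulator(theta_true)
s_obs = summary(y_obs)

accepted = []
tol = 0.2
for _ in range(20000):
    theta = rng.uniform(-5, 5)                      # draw from the prior
    s_sim = summary(simulator(theta))               # simulate and summarise
    if np.linalg.norm(s_sim - s_obs) < tol:         # accept if close enough
        accepted.append(theta)

print(len(accepted), np.mean(accepted))             # posterior mean near 1.5
```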