Algorithms & Computationally Intensive Inference seminars
The seminars take place on Fridays at 1 pm (UK time) in room MB0.08.
2025-2026 Organisers: Adrien Corenflos and Shishen Lin
If you would like to speak, or want to be included in any emails, please contact the organisers.
Website URL: www.warwick.ac.uk/compstat
Mailing List Sign-Up: https://ivory-cuscus.lnx.warwick.ac.uk/mailman3/lists/algorithmseminar.newlistserv.warwick.ac.uk/ (currently accessible only on campus)
Mailing List: algorithmseminar@listserv.csv.warwick.ac.uk (NB - only approved members can post)
Upcoming (see below for future and past speakers):
| 07/11 | Michela Ottobre (Heriot-Watt University) | A multiscale perspective on Maximum Marginal Likelihood estimation |
Note: Hybrid
Abstract: We provide a multiscale perspective on the problem of maximum marginal likelihood estimation. We consider and analyse a diffusion-based maximum marginal likelihood estimation scheme using ideas from multiscale dynamics. Our perspective is based on stochastic averaging; we make an explicit connection between ideas in applied probability and parameter inference in computational statistics. In particular, we consider a general class of coupled Langevin diffusions for joint inference of latent variables and parameters in statistical models, where the latent variables are sampled from a fast Langevin process (which acts as a sampler), and the parameters are updated using a slow Langevin process (which acts as an optimiser). We show that the resulting system of stochastic differential equations (SDEs) can be viewed as a two-time scale system. To demonstrate the utility of such a perspective, we show that the averaged parameter dynamics obtained in the limit of scale separation can be used to estimate the optimal parameter, within the strongly convex setting. We do this by using recent uniform-in-time non-asymptotic averaging bounds. Finally, we conclude by showing that the slow-fast algorithm we consider here, termed Slow-Fast Langevin Algorithm, performs on par with state-of-the-art methods on a variety of examples. We believe that the stochastic averaging approach we provide in this paper enables us to look at these algorithms from a fresh angle, as well as unlocking the path to develop and analyse new methods using well-established averaging principles.
Term 1:
| Date | Speaker | Title |
| 14/11 | Alistair Benford (Birmingham) | Stability Analysis of Coevolutionary Algorithms |
Note: Hybrid
Abstract: Coevolutionary algorithms (CoEAs) are a useful approach for optimising games with strategy spaces too large for a Nash equilibrium to be computed directly. Through selection and mutation, opposing populations discover stronger strategies in an iterative simulated arms race. The presence of randomisation is essential to the flexibility and success of CoEAs; however, it also introduces vulnerabilities which limit their trustworthiness. Most critically, many CoEAs cannot guarantee that optimal strategies will be retained and remembered, even if they are discovered. The question of how to design and parameterise CoEAs which are stable (that is, which retain optimal strategies after discovery) is therefore essential for having confidence in their output.
This seminar will present a theoretical tool for assessing the stability of a CoEA. Given the algorithm's core design and basic properties of the application (such as whether the game is zero-sum or has a unique optimum), the tool explains how to parameterise the algorithm's mutation operator in order to ensure stability. Our results additionally describe parameterisations for which unstable behaviour is inevitable, and hence must be avoided. While these two regimes do not fully encompass all possible parameterisations, the gap between them is small, and empirical analysis not only verifies their validity but also indicates that tighter results may only be possible by sacrificing generality.
This is joint work with Per Kristian Lehre.
| 07/11 | Michela Ottobre (Heriot-Watt University) | A multiscale perspective on Maximum Marginal Likelihood estimation |
Note: Hybrid
Abstract: We provide a multiscale perspective on the problem of maximum marginal likelihood estimation. We consider and analyse a diffusion-based maximum marginal likelihood estimation scheme using ideas from multiscale dynamics. Our perspective is based on stochastic averaging; we make an explicit connection between ideas in applied probability and parameter inference in computational statistics. In particular, we consider a general class of coupled Langevin diffusions for joint inference of latent variables and parameters in statistical models, where the latent variables are sampled from a fast Langevin process (which acts as a sampler), and the parameters are updated using a slow Langevin process (which acts as an optimiser). We show that the resulting system of stochastic differential equations (SDEs) can be viewed as a two-time scale system. To demonstrate the utility of such a perspective, we show that the averaged parameter dynamics obtained in the limit of scale separation can be used to estimate the optimal parameter, within the strongly convex setting. We do this by using recent uniform-in-time non-asymptotic averaging bounds. Finally, we conclude by showing that the slow-fast algorithm we consider here, termed Slow-Fast Langevin Algorithm, performs on par with state-of-the-art methods on a variety of examples. We believe that the stochastic averaging approach we provide in this paper enables us to look at these algorithms from a fresh angle, as well as unlocking the path to develop and analyse new methods using well-established averaging principles.
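To make the two-time-scale structure concrete, here is a minimal Euler-type sketch of a coupled slow-fast Langevin system in Python. It only illustrates the general idea, not the Slow-Fast Langevin Algorithm of the talk: the gradient functions, the scale separation `eps` and the optional temperature `tau` on the slow chain are placeholder choices for this example.

```python
import numpy as np

def slow_fast_langevin(grad_x_logp, grad_theta_logp, x0, theta0,
                       h=1e-3, eps=1e-2, tau=0.0, n_steps=10_000, seed=0):
    """Euler discretisation of a coupled slow-fast Langevin system (sketch only).

    The latent variable x follows a fast Langevin process (time scale eps)
    targeting p_theta(x, data), i.e. it acts as a sampler; the parameter
    theta follows a slow Langevin/gradient update (noiseless when tau = 0),
    i.e. it acts as an optimiser.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_steps):
        # fast chain: draws x approximately from p_theta(x | data)
        x = x + (h / eps) * grad_x_logp(x, theta) \
              + np.sqrt(2.0 * h / eps) * rng.normal(size=x.shape)
        # slow chain: (noisy) gradient step on theta using the current latent x
        theta = theta + h * grad_theta_logp(x, theta) \
                      + np.sqrt(2.0 * h * tau) * rng.normal(size=theta.shape)
    return theta
```

In the limit of scale separation the slow update effectively sees the latent variable averaged out under p_theta(x | data), which is the averaging mechanism the talk uses to relate the scheme to maximum marginal likelihood estimation.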
| 31/10 | Yuga Iguchi (Lancaster) | Skew-symmetric schemes for stochastic differential equations with non-Lipschitz drift: an unadjusted Barker algorithm |
Note: Hybrid
Abstract: We propose a new, simple and explicit numerical scheme for time-homogeneous stochastic differential equations. The scheme is based on sampling increments at each time step from a skew-symmetric probability distribution, with the level of skewness determined by the drift and volatility of the underlying process. We show that as the step-size decreases the scheme converges weakly to the diffusion of interest. We then consider the problem of simulating from the limiting distribution of an ergodic diffusion process using the numerical scheme with a fixed step-size. We establish conditions under which the numerical scheme converges to equilibrium at a geometric rate, and quantify the bias between the equilibrium distributions of the scheme and of the true diffusion process. Notably, our results do not require a global Lipschitz assumption on the drift, in contrast to those required for the Euler–Maruyama scheme for long-time simulation at fixed step-sizes. Our weak convergence result relies on an extension of the theory of Milstein & Tretyakov to stochastic differential equations with non-Lipschitz drift, which could also be of independent interest. We support our theoretical results with numerical simulations. This is joint work with Samuel Livingstone, Nikolas Nüsken, Giorgos Vasdekis and Rui-Yang Zhang.
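As a rough, one-dimensional illustration of the skew-symmetric idea, the sketch below draws the magnitude of a Gaussian Euler-type increment and then chooses its sign with a drift-dependent probability. The logistic sign probability used here is borrowed from the Barker proposal and is an assumption for illustration only; the exact skewing function, step-size scaling and multivariate construction in the talk may differ.

```python
import numpy as np

def skew_symmetric_step(x, mu, sigma, h, rng):
    """One Barker-type skew-symmetric Euler step for dX = mu(X) dt + sigma(X) dW.

    The drift enters only through the sign probability, so a very large
    drift cannot blow up the size of an individual step; this is the
    intuition behind robustness to non-Lipschitz drift.
    """
    z = np.abs(rng.normal(0.0, sigma(x) * np.sqrt(h)))             # increment magnitude
    p_plus = 1.0 / (1.0 + np.exp(-2.0 * mu(x) * z / sigma(x) ** 2))
    b = 1.0 if rng.uniform() < p_plus else -1.0                    # drift-skewed sign
    return x + b * z

# e.g. long-run simulation of an ergodic diffusion with non-Lipschitz drift
rng = np.random.default_rng(0)
x = 0.0
for _ in range(100_000):
    x = skew_symmetric_step(x, mu=lambda u: -u ** 3, sigma=lambda u: 1.0, h=0.01, rng=rng)
```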
| 24/10 | Shishen Lin (University of Warwick) | Learning in Games: Overcoming Binary Adversarial Optimisation with Competitive Coevolution |
Note: Hybrid
Abstract: Co-evolutionary algorithms (CoEAs), which pair candidate designs with test cases, are frequently used in adversarial optimisation, particularly for binary test-based problems where designs and tests yield binary outcomes. The effectiveness of designs is determined by their performance against tests, and the value of tests is based on their ability to identify failing designs, often leading to more sophisticated tests and improved designs. However, CoEAs can exhibit complex, sometimes pathological behaviours like disengagement. Through runtime analysis, we aim to rigorously analyse whether CoEAs can efficiently solve test-based adversarial optimisation problems in expected polynomial runtime. This paper carries out the first rigorous runtime analysis of the (1, λ)-CoEA for binary test-based adversarial optimisation problems. In particular, we introduce a binary test-based benchmark problem called the Diagonal problem and initiate the first runtime analysis of a competitive CoEA on this problem. The mathematical analysis shows that the (1, λ)-CoEA can efficiently find an ε-approximation to the optimal solution of the Diagonal problem, i.e. in expected polynomial runtime, assuming sufficiently low mutation rates and a large offspring population size. On the other hand, the standard (1, λ)-EA fails to find an ε-approximation to the optimal solution of the Diagonal problem in polynomial runtime. This illustrates the potential of coevolution for solving binary adversarial optimisation problems. arXiv link: https://arxiv.org/pdf/2407.17875.
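For readers unfamiliar with the algorithm family, the sketch below is a generic (1,λ)-CoEA for a binary test-based problem. The payoff function is a hypothetical placeholder (the Diagonal problem of the talk has its own specific payoff), and the pairing and selection rules shown are one common variant rather than necessarily the exact one analysed in the paper.

```python
import numpy as np

def one_comma_lambda_coea(payoff, n, lam, chi, n_generations, seed=0):
    """Generic (1,lambda)-CoEA sketch for a binary test-based game.

    payoff(design, test) -> float is a hypothetical placeholder for the
    underlying game; designs and tests are length-n bit strings.  Each
    generation, lam offspring of each parent are created by standard
    bitwise mutation with rate chi/n, and the parents are replaced
    (comma selection) by the offspring performing best against the
    current opponent.
    """
    rng = np.random.default_rng(seed)
    design = rng.integers(0, 2, n)
    test = rng.integers(0, 2, n)
    for _ in range(n_generations):
        mutate = lambda z: np.where(rng.random(n) < chi / n, 1 - z, z)
        design_offspring = [mutate(design) for _ in range(lam)]
        test_offspring = [mutate(test) for _ in range(lam)]
        # the design maximises the payoff; the test tries to expose failures
        design = max(design_offspring, key=lambda d: payoff(d, test))
        test = min(test_offspring, key=lambda t: payoff(design, t))
    return design, test
```

The conditions in the abstract (sufficiently low mutation rate and a large offspring population) correspond to the chi and lam parameters above.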
| 17/10 | Marina Riabiz (King's College London) | Extrapolation (and smoothing) of Tempered Posterior Expectations |
Note: Hybrid
Abstract: Tempering is a popular tool in Bayesian computation, being used to transform a posterior distribution into a reference distribution that is more easily approximated. Several algorithms exist that start by approximating the reference distribution and proceed through a sequence of intermediate distributions until an approximation to the posterior is obtained. Our contribution reveals that high-quality approximation of terms up to the posterior is not essential, as knowledge of the intermediate distributions enables posterior quantities of interest to be extrapolated. Specifically, we establish conditions under which posterior expectations are determined by their associated tempered expectations on any non-empty interval. Harnessing this result, we propose novel methodology for approximating posterior expectations based on extrapolation and smoothing of tempered expectations, which we implement as a post-processing variance-reduction tool for sequential Monte Carlo.
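A minimal sketch of the extrapolation idea: fit a smooth curve (here a low-degree polynomial, an assumption made purely for illustration) to SMC estimates of the tempered expectations and evaluate it at temperature 1. The talk's methodology, in particular its smoothing step and theoretical guarantees, is more refined than this.

```python
import numpy as np

def extrapolate_posterior_expectation(betas, tempered_means, degree=2):
    """Fit a smooth curve to tempered expectations E_{pi_beta}[f] over an
    interval of inverse temperatures beta < 1 and evaluate it at beta = 1
    to approximate the posterior expectation E_{pi}[f] (sketch only)."""
    coeffs = np.polyfit(np.asarray(betas, dtype=float),
                        np.asarray(tempered_means, dtype=float), deg=degree)
    return np.polyval(coeffs, 1.0)

# hypothetical SMC estimates along a tempering schedule stopping short of beta = 1
print(extrapolate_posterior_expectation([0.2, 0.4, 0.6, 0.8], [1.90, 1.52, 1.31, 1.18]))
```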
| 10/10 | Yingzhen Li (Imperial College) | Neural Flow Samplers: Improved Training and Architectures |
Note: Hybrid
Abstract: Sampling from unnormalized densities, either for continuous or discrete variables, presents a fundamental challenge with wide-ranging applications, from posterior inference to molecular dynamics simulations and combinatorial optimisation. Continuous or discrete flow-based neural samplers offer a promising approach, learning a velocity field for continuous-variable densities (or a rate matrix of a continuous-time Markov chain (CTMC) for discrete-variable densities) that satisfies key principles of marginal density evolution (e.g., the continuity equation in the continuous case and the Kolmogorov equation in the discrete case) to generate samples. However, this learning procedure requires accurate estimation of intractable terms linked to the computationally challenging partition function, for which existing estimators often suffer from high variance or low accuracy. To overcome this, we introduce an improved estimator for these challenging quantities, employing an improved Sequential Monte Carlo method enhanced with control variates. We further introduce specific adjustments and advances for the trained sampler, tailored to the continuous- and discrete-variable cases. In both scenarios, our proposed Neural Flow Samplers empirically outperform existing flow-based neural samplers on both synthetic datasets and complex targets motivated by real-world applications.
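To show what the intractable term looks like in the continuous case, the sketch below uses the identity d/dt log Z_t = -E_{p_t}[∂_t U_t] for p_t(x) ∝ exp(-U_t(x)) and estimates it with plain self-normalised importance weights. It is a stand-in for, not a reproduction of, the SMC-with-control-variates estimator proposed in the talk.

```python
import numpy as np

def dlogZ_dt_estimate(particles, log_weights, dU_dt):
    """Estimate d/dt log Z_t for the density p_t(x) ∝ exp(-U_t(x)) / Z_t.

    Uses d/dt log Z_t = -E_{p_t}[∂_t U_t(X)], the term that makes the
    continuity-equation residual ∂_t log p_t = -∂_t U_t + E_{p_t}[∂_t U_t]
    intractable.  `particles` and `log_weights` are assumed to come from a
    particle approximation of p_t (e.g. an SMC run), and `dU_dt(x)`
    evaluates ∂_t U_t at x.
    """
    w = np.exp(log_weights - np.max(log_weights))
    w /= w.sum()                                  # self-normalised weights
    return -np.sum(w * dU_dt(particles))
```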
| 03/10 | Yvann Le Fay (ENSAE Paris) | Least squares variational inference |
Note: Hybrid
Abstract: Variational inference seeks the best approximation of a target distribution within a chosen family, where "best" means minimising Kullback-Leibler divergence. When the approximation family is an exponential family, the optimal approximation satisfies a fixed-point equation. We introduce LSVI (Least Squares Variational Inference), a gradient-free, Monte Carlo-based scheme for the fixed-point recursion, where each iteration boils down to performing ordinary least squares regression on tempered log-target evaluations under the variational approximation. We show that LSVI is equivalent to biased stochastic natural gradient descent and use this to derive convergence rates with respect to the numbers of samples and iterations. When the approximation family is Gaussian, LSVI involves inverting the Fisher information matrix, whose size grows quadratically with dimension d. We exploit the regression formulation to eliminate the need for this inversion, yielding O(d³) complexity in the full-covariance case and O(d) in the mean-field case. Finally, we numerically demonstrate LSVI’s performance on various tasks, including logistic regression, discrete variable selection, and Bayesian synthetic likelihood, showing competitive results with state-of-the-art methods, even when gradients are unavailable.
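As a toy illustration of the "least squares on tempered log-target evaluations" step, the sketch below runs the idea for a one-dimensional Gaussian family: sample from the current approximation, regress the tempered log-target on the sufficient statistics (1, x, x²), map the fitted coefficients back to the Gaussian natural parameters, and damp the update. The temperature, damping and iteration count are placeholder choices, not the recursion analysed in the talk.

```python
import numpy as np

def lsvi_gaussian_1d(log_target, mu, sigma2, n_samples=2_000,
                     temperature=1.0, damping=0.5, n_iters=50, seed=0):
    """Least-squares-flavoured VI sketch for a 1-D Gaussian family.

    Each iteration performs ordinary least squares of the tempered
    log-target on the sufficient statistics (1, x, x^2) under samples from
    the current Gaussian q, then reads the new mean and variance off the
    fitted coefficients via the natural parameterisation
    beta2 = -1/(2 sigma^2), beta1 = mu / sigma^2.
    """
    rng = np.random.default_rng(seed)
    for _ in range(n_iters):
        x = rng.normal(mu, np.sqrt(sigma2), n_samples)
        design = np.column_stack([np.ones_like(x), x, x ** 2])
        beta, *_ = np.linalg.lstsq(design, temperature * log_target(x), rcond=None)
        if beta[2] < 0:                          # fitted curvature must be negative
            sigma2_fit = -0.5 / beta[2]
            mu_fit = beta[1] * sigma2_fit
            mu = (1 - damping) * mu + damping * mu_fit
            sigma2 = (1 - damping) * sigma2 + damping * sigma2_fit
    return mu, sigma2

# e.g. recovering a Gaussian target N(3, 0.5) from its unnormalised log-density
print(lsvi_gaussian_1d(lambda x: -(x - 3.0) ** 2 / (2 * 0.5), mu=0.0, sigma2=1.0))
```

In higher dimensions the regression formulation is what lets the scheme avoid explicitly inverting the Fisher information matrix, which is the source of the complexity gains quoted in the abstract.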