Publications
Empirical Privacy Variance
Yuzheng Hu, Fan Wu, Ruicheng Xian, Yuhang Liu, Lydia Zakynthinou, Pritish Kamath, Chiyuan Zhang, David Forsyth
ICML 2025
We propose the notion of empirical privacy variance and study it in the context of differentially private fine-tuning of language models. Specifically, we show that models calibrated to the same (ε,δ)-DP guarantee using DP-SGD with different hyperparameter configurations can exhibit significant variations in empirical privacy, which we quantify through the lens of memorization. We investigate the generality of this phenomenon across multiple dimensions and discuss why it is surprising and relevant. Through regression analysis, we examine how individual and composite hyperparameters influence empirical privacy. The results reveal a no-free-lunch trade-off: existing practices of hyperparameter tuning in DP-SGD, which focus on optimizing utility under a fixed privacy budget, often come at the expense of empirical privacy. To address this, we propose refined heuristics for hyperparameter selection that explicitly account for empirical privacy, showing that they are both precise and practically useful. Finally, we take preliminary steps to understand empirical privacy variance. We propose two hypotheses, identify limitations in existing techniques like privacy auditing, and outline open questions for future research.
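The hyperparameters the abstract refers to (clipping norm, noise multiplier, learning rate, batch size) enter through the DP-SGD update. Below is a minimal illustrative sketch of a single DP-SGD step for linear regression, not the paper's fine-tuning setup; the function name and default values are made up for illustration.

```python
import numpy as np

def dp_sgd_step(theta, X, y, clip_norm=1.0, noise_mult=1.1, lr=0.1, rng=None):
    """One DP-SGD step for linear regression (illustrative sketch).

    Per-example gradients are clipped to `clip_norm` in L2 norm, summed,
    and Gaussian noise with std `noise_mult * clip_norm` is added before
    averaging -- the hyperparameters that, together with the learning rate
    and batch size, are calibrated to a target (eps, delta) guarantee.
    """
    rng = rng or np.random.default_rng(0)
    n = X.shape[0]
    residuals = X @ theta - y                    # (n,)
    per_example_grads = residuals[:, None] * X   # (n, d)
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale          # each row now has norm <= clip_norm
    noisy_sum = clipped.sum(axis=0) + rng.normal(
        0.0, noise_mult * clip_norm, size=theta.shape)
    return theta - lr * noisy_sum / n

rng = np.random.default_rng(1)
X = rng.normal(size=(32, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=32)
theta = dp_sgd_step(np.zeros(3), X, y)
```

Different (clip_norm, noise_mult, lr, batch size) tuples calibrated to the same (ε,δ) are exactly the configurations whose empirical privacy the paper compares.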
Finite-Sample Convergence Bounds for Trust Region Policy Optimization in Mean-Field Games
Antonio Ocello, Daniil Tiapkin, Lorenzo Mancini, Mathieu Laurière, Eric Moulines
ICML 2025
We introduce Mean-Field Trust Region Policy Optimization (MF-TRPO), a novel algorithm designed to compute approximate Nash equilibria for ergodic Mean-Field Games (MFG) in finite state-action spaces. Building on the well-established performance of TRPO in the reinforcement learning (RL) setting, we extend its methodology to the MFG framework, leveraging its stability and robustness in policy optimization. Under standard assumptions in the MFG literature, we provide a rigorous analysis of MF-TRPO, establishing theoretical guarantees on its convergence. Our results cover both the exact formulation of the algorithm and its sample-based counterpart, where we derive high-probability guarantees and finite sample complexity. This work advances MFG optimization by bridging RL techniques with mean-field decision-making, offering a theoretically grounded approach to solving complex multi-agent problems.
Scaling of piecewise deterministic Monte Carlo for anisotropic targets
Joris Bierkens, Kengo Kamatani, Gareth O. Roberts
Bernoulli
Piecewise deterministic Markov processes (PDMPs) are a type of continuous-time Markov process that combines deterministic flows with jumps. Recently, PDMPs have garnered attention within the Monte Carlo community as a potential alternative to traditional Markov chain Monte Carlo (MCMC) methods. The Zig-Zag sampler and the Bouncy Particle Sampler are commonly used examples of the PDMP methodology which have also yielded impressive theoretical properties, but little is known about their robustness to extreme dependence or anisotropy of the target density. It turns out that PDMPs may suffer from poor mixing due to anisotropy, and this paper investigates this effect in detail in the stylised but important Gaussian case. To this end, we employ a multi-scale analysis framework. Our results show that when the Gaussian target distribution has two scales, of order 1 and ε, the computational cost of the Bouncy Particle Sampler is of order ε^{−1}, and the computational cost of the Zig-Zag sampler is of order ε^{−2}. In comparison, the cost of traditional MCMC methods such as RWM is of order ε^{−2}, at least when the dimensionality of the small component is more than 1. Therefore, there is a robustness advantage to using PDMPs in this context.
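To make the PDMP mechanics concrete, here is a minimal one-dimensional Zig-Zag sampler for a standard Gaussian target, with event times simulated exactly by inverting the integrated switching rate. This is a sketch of the sampler itself only; the paper's anisotropic, multi-scale analysis is not reproduced, and the function names are ours.

```python
import numpy as np

def zigzag_event_time(x, v, e):
    """First time t with ∫_0^t max(0, v*(x + v*s)) ds = e, for v = ±1.

    For the standard Gaussian target U(x) = x^2/2, the switching rate
    along the linear flow x(t) = x + v*t is λ(t) = max(0, v * x(t)).
    """
    w = v * x  # reduce to the v = +1 case by symmetry
    return -w + np.sqrt(max(w, 0.0) ** 2 + 2.0 * e)

def zigzag_second_moment(n_events=100_000, seed=0):
    """Time-averaged x^2 along the Zig-Zag trajectory (→ 1 for N(0,1))."""
    rng = np.random.default_rng(seed)
    x, v, total_time, acc = 0.0, 1.0, 0.0, 0.0
    for e in rng.exponential(1.0, size=n_events):
        t = zigzag_event_time(x, v, e)
        # ∫_0^t (x + v s)^2 ds in closed form along the linear segment
        acc += ((x + v * t) ** 3 - x ** 3) / (3.0 * v)
        total_time += t
        x += v * t
        v = -v  # in 1D, the only event is flipping the velocity
    return acc / total_time

print(zigzag_second_moment())  # ≈ 1.0, the variance of N(0, 1)
```

In higher dimensions each coordinate carries its own rate and only one velocity component flips per event, which is where the anisotropy effects studied in the paper arise.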
Unbiased and consistent nested sampling via sequential Monte Carlo
Robert Salomone, Leah F South, Christopher Drovandi, Dirk P Kroese, Adam M Johansen
Journal of the Royal Statistical Society Series B: Statistical Methodology
We introduce a new class of sequential Monte Carlo methods which reformulates the essence of the nested sampling (NS) method of Skilling in terms of sequential Monte Carlo techniques. Two new algorithms are proposed: nested sampling via sequential Monte Carlo (NS-SMC) and adaptive nested sampling via sequential Monte Carlo (ANS-SMC). The new framework allows convergence results to be obtained in the setting when Markov chain Monte Carlo (MCMC) is used to produce new samples. An additional benefit is that marginal-likelihood (normalizing constant) estimates given by NS-SMC are unbiased. In contrast to NS, the analysis of our proposed algorithms does not require the (unrealistic) assumption that the simulated samples be independent. We show that a minor adjustment to our ANS-SMC algorithm recovers the original NS algorithm, which provides insights as to why NS seems to produce accurate estimates despite a typical violation of its assumptions. A numerical study is conducted where the performance of the proposed algorithms and temperature-annealed SMC is compared on challenging problems.
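For reference, the Skilling-style nested sampling baseline that the paper reformulates can be sketched in a few lines on a toy model where constrained-prior sampling is exact, so no MCMC is needed. This is a sketch of classical NS with the usual deterministic mass schedule, not of NS-SMC; the toy likelihood is our choice.

```python
import math
import random

def nested_sampling_evidence(n_live=500, n_iter=5000, seed=0):
    """Basic nested sampling for a toy model: uniform prior on [0, 1],
    likelihood L(x) = exp(-x), so the true evidence is 1 - exp(-1).

    Because L is decreasing, sampling the prior constrained to L > L* is
    just sampling uniform(0, x*), which keeps this sketch exact.
    """
    rng = random.Random(seed)
    live = [rng.random() for _ in range(n_live)]
    z, x_prev = 0.0, 1.0
    for i in range(1, n_iter + 1):
        worst = max(live)                 # largest x = smallest likelihood
        x_cur = math.exp(-i / n_live)     # deterministic prior-mass schedule
        z += math.exp(-worst) * (x_prev - x_cur)
        x_prev = x_cur
        live[live.index(worst)] = rng.uniform(0.0, worst)
    return z

print(nested_sampling_evidence())  # ≈ 0.632 = 1 - exp(-1)
```

The deterministic schedule exp(-i/N) is precisely the assumption whose violation (dependent samples, stochastic shrinkage) the SMC reformulation removes.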
Enhancing Feature-Specific Data Protection via Bayesian Coordinate Differential Privacy
Maryam Aliakbarpour, Syomantak Chaudhuri, Thomas A. Courtade, Alireza Fallah, Michael I. Jordan
AISTATS 2025
Local Differential Privacy (LDP) offers strong privacy guarantees without requiring users to trust external parties. However, LDP applies uniform protection to all data features, including less sensitive ones, which degrades performance of downstream tasks. To overcome this limitation, we propose a Bayesian framework, Bayesian Coordinate Differential Privacy (BCDP), that enables feature-specific privacy quantification. This more nuanced approach complements LDP by adjusting privacy protection according to the sensitivity of each feature, enabling improved performance of downstream tasks without compromising privacy. We characterize the properties of BCDP and articulate its connections with standard non-Bayesian privacy frameworks. We further apply our BCDP framework to the problems of private mean estimation and ordinary least-squares regression. The BCDP-based approach obtains improved accuracy compared to a purely LDP-based approach, without compromising on privacy.
Scaffold with Stochastic Gradients: New Analysis and Faster Rates
Paul Mangold, Alain Durmus, Aymeric Dieuleveut, Eric Moulines
ICML 2025
This paper proposes a novel analysis for the Scaffold algorithm, a popular method for dealing with data heterogeneity in federated learning. While its convergence in deterministic settings, where local control variates mitigate client drift, is well established, the impact of stochastic gradient updates on its performance is less understood. To address this problem, we first show that its global parameters and control variates define a Markov chain that converges to a stationary distribution in the Wasserstein distance. Leveraging this result, we prove that Scaffold achieves linear speed-up in the number of clients up to higher-order terms in the step size. Nevertheless, our analysis reveals that Scaffold retains a higher-order bias, similar to FedAvg, that does not decrease as the number of clients increases. This highlights opportunities for developing improved stochastic federated learning algorithms.
Refined Analysis of Federated Averaging’s Bias and Federated Richardson-Romberg Extrapolation
Paul Mangold, Alain Durmus, Aymeric Dieuleveut, Sergey Samsonov, Eric Moulines
AISTATS 2025
In this paper, we present a novel analysis of FedAvg with constant step size, relying on the Markov property of the underlying process. We demonstrate that the global iterates of the algorithm converge to a stationary distribution and analyze its resulting bias and variance relative to the problem’s solution. We provide a first-order expansion of the bias in both homogeneous and heterogeneous settings. Interestingly, this bias decomposes into two distinct components: one that depends solely on stochastic gradient noise and another on client heterogeneity. Finally, we introduce a new algorithm based on the Richardson-Romberg extrapolation technique to mitigate this bias.
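The Richardson-Romberg correction underlying the proposed algorithm can be illustrated on a toy estimator with a known first-order bias expansion. This is a sketch of the extrapolation principle only, not of FedAvg itself; the function `est` below is a made-up stand-in for "run the algorithm at step size gamma".

```python
def richardson_romberg(estimate, gamma):
    """Combine runs at step sizes gamma and 2*gamma to cancel the O(gamma) bias.

    If estimate(gamma) = target + c*gamma + O(gamma^2), then
    2*estimate(gamma) - estimate(2*gamma) = target + O(gamma^2).
    """
    return 2.0 * estimate(gamma) - estimate(2.0 * gamma)

# toy "biased estimator": target 1.0, linear bias 3*gamma, curvature 5*gamma^2
est = lambda gamma: 1.0 + 3.0 * gamma + 5.0 * gamma ** 2

print(est(0.1))                      # ≈ 1.35 (first-order bias dominates)
print(richardson_romberg(est, 0.1))  # ≈ 0.90 (linear term cancelled, -10*gamma^2 left)
```

In the federated setting the two "runs" are two FedAvg chains with different step sizes, and the first-order term being cancelled is the stationary bias characterised in the paper.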
Robust inference for the unification of confidence intervals in meta-analysis
Wei Liang, Haicheng Huang, Hongsheng Dai, Yinghui Wei
Journal of Nonparametric Statistics
Traditional meta-analysis assumes that the effect sizes estimated in individual studies follow a Gaussian distribution. However, this distributional assumption is not always satisfied in practice, leading to potentially biased results. When the number of studies, denoted K, is large, the cumulative Gaussian approximation errors from each study could make the final estimation unreliable. When K is small, it is not realistic to assume the random effect follows a Gaussian distribution. In this paper, we present a novel empirical likelihood method for combining confidence intervals under the meta-analysis framework. This method is free of the Gaussian assumption both in the effect size estimates from individual studies and in the random effects. We establish the large sample properties of the nonparametric estimator and introduce a criterion governing the relationship between the number of studies, K, and the sample size of each study, n_i. Our methodology supersedes conventional meta-analysis techniques in both theoretical robustness and computational efficiency. We assess the performance of our proposed methods using simulation studies and apply them to two examples.
Parallel Square-Root Statistical Linear Regression for Inference in Nonlinear State Space Models
Fatemeh Yaghoobi, Adrien Corenflos, Sakira Hassan, Simo Särkkä
SIAM Journal on Scientific Computing
In this article, we first derive parallel square-root methods for state estimation in linear state-space models. We then extend the formulations to general nonlinear, non-Gaussian state-space models using statistical linear regression and iterated statistical posterior linearization paradigms. We finally leverage the fixed-point structure of our methods to derive parallel square-root likelihood-based parameter estimation methods. We demonstrate the practical performance of the methods by comparing the parallel and the sequential approaches on a set of numerical experiments.
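One standard building block of square-root filters is propagating a Cholesky factor through the Kalman prediction step via a QR decomposition, so the full covariance is never formed. Below is a minimal sequential sketch of that step under the usual array form; it is not the parallel (prefix-sum) formulation of the paper, and the function name is ours.

```python
import numpy as np

def sqrt_predict(S, F, Lq):
    """Square-root Kalman prediction: from P = S S^T to P_pred = F P F^T + Q.

    Stacks [F S, L_q] (with Q = L_q L_q^T) and takes a QR decomposition of
    its transpose, so only triangular factors are ever stored -- the key to
    the numerical robustness of square-root filters.
    """
    A = np.hstack([F @ S, Lq])            # (n, 2n), with A A^T = F P F^T + Q
    _, R = np.linalg.qr(A.T, mode="reduced")
    return R.T                            # lower-triangular factor of P_pred

F = np.array([[1.0, 0.1], [0.0, 1.0]])   # constant-velocity toy model
S = np.linalg.cholesky(np.array([[2.0, 0.3], [0.3, 1.0]]))
Lq = np.linalg.cholesky(np.array([[0.1, 0.0], [0.0, 0.1]]))
S_pred = sqrt_predict(S, F, Lq)
P_pred = S_pred @ S_pred.T               # equals F P F^T + Q up to round-off
```

The identity A A^T = R^T R for A^T = QR is what lets the factor be updated without squaring, which preserves positive definiteness in finite precision.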
Modelling pathwise uncertainty of Stochastic Differential Equations samplers via Probabilistic Numerics
Yvann Le Fay, Simo Särkkä, Adrien Corenflos
Bayesian Analysis
Probabilistic ordinary differential equation (ODE) solvers have been introduced over the past decade as uncertainty-aware numerical integrators. They typically proceed by placing a functional prior on the ODE solution, which is then queried on a grid to form a posterior distribution over the ODE solution. As the queries span the integration interval, the approximate posterior solution converges to the true deterministic one. Gaussian ODE filters, in particular, have enjoyed a lot of attention due to their computational efficiency, the simplicity of their implementation, and their provable fast convergence rates. In this article, we extend the methodology to stochastic differential equations (SDEs) and propose a probabilistic simulator for SDEs. Our approach involves transforming the SDE into a sequence of random ODEs using piecewise differentiable approximations of the Brownian motion. We then apply probabilistic ODE solvers to the individual ODEs, resulting in a pathwise probabilistic solution to the SDE. We establish worst-case strong local and global convergence orders of 1.5 and 1.0, respectively, for a specific instance of our method. We further show how we can marginalise the Brownian approximations by incorporating their coefficients as part of the prior ODE model, allowing for computing exact transition densities under our model. Finally, we numerically validate the theoretical findings, showcasing reasonable weak convergence properties in the marginalised version.
Auxiliary MCMC samplers for parallelisable inference in high-dimensional latent dynamical systems
Adrien Corenflos, Simo Särkkä
Electronic Journal of Statistics
Sampling from the full posterior distribution of high-dimensional non-linear, non-Gaussian latent dynamical models presents significant computational challenges. While Particle Gibbs (also known as conditional sequential Monte Carlo) is considered the gold standard for this task, it quickly degrades in performance as the latent space dimensionality increases. Conversely, globally Gaussian-approximated methods like extended Kalman filtering, though more robust, are seldom used for posterior sampling due to their inherent bias. We introduce novel auxiliary sampling approaches that address these limitations. By incorporating artificial observations of the system as auxiliary variables in our MCMC kernels, we develop both efficient exact Kalman-based samplers and enhanced Particle Gibbs algorithms that maintain performance in high-dimensional latent spaces. Some of our methods support parallelization along the time dimension, achieving logarithmic scaling when implemented on GPUs. Empirical evaluations demonstrate superior statistical and computational performance compared to existing approaches for high-dimensional latent dynamical systems.
Conditioning diffusion models by explicit forward-backward bridging
Adrien Corenflos, Zheng Zhao, Simo Särkkä, Jens Sjölund, Thomas B. Schön
AISTATS 2025
Given an unconditional diffusion model targeting a joint distribution π(x,y), using it to perform conditional simulation from π(x∣y) is still largely an open question, typically addressed by learning conditional drifts for the denoising SDE after the fact. In this work, we express exact conditional simulation within the approximate diffusion model as an inference problem on an augmented space corresponding to a partial SDE bridge. This perspective allows us to implement efficient and principled particle Gibbs and pseudo-marginal samplers marginally targeting the conditional distribution π(x∣y). Contrary to existing methodology, our methods do not introduce any additional approximation to the unconditional diffusion model aside from the Monte Carlo error. We showcase the benefits and drawbacks of our approach on a series of synthetic and real data examples.
Prediction-Aware Learning in Multi-Agent Systems
Aymeric Capitaine, Etienne Boursier, Eric Moulines, Michael I. Jordan, Alain Durmus
ICML 2025
The framework of uncoupled online learning in multiplayer games has made significant progress in recent years. In particular, the development of time-varying games has considerably expanded its modeling capabilities. However, current regret bounds quickly become vacuous when the game undergoes significant variations over time, even when these variations are easy to predict. Intuitively, the ability of players to forecast future payoffs should lead to tighter guarantees, yet existing approaches fail to incorporate this aspect. This work aims to fill this gap by introducing a novel prediction-aware framework for time-varying games, where agents can forecast future payoffs and adapt their strategies accordingly. In this framework, payoffs depend on an underlying state of nature that agents predict in an online manner. To leverage these predictions, we propose the POMWU algorithm, a contextual extension of the optimistic Multiplicative Weights Update algorithm, for which we establish theoretical guarantees on social welfare and convergence to equilibrium. Our results demonstrate that, under bounded prediction errors, the proposed framework achieves performance comparable to the static setting. Finally, we empirically demonstrate the effectiveness of POMWU in a traffic routing experiment.
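The optimistic multiplicative-weights core that the algorithm extends is compact enough to sketch. The update below is the standard optimistic MWU step (weight the current loss twice, subtract last round's, as a one-step payoff prediction); the contextual, prediction-aware machinery of the paper is not reproduced, and the toy losses are ours.

```python
import numpy as np

def omwu_step(x, loss_t, loss_prev, eta=0.1):
    """Optimistic MWU over a probability vector x of action weights.

    Using 2*loss_t - loss_prev as the predicted next loss is the standard
    optimism trick that yields faster convergence in games.
    """
    logits = np.log(x) - eta * (2.0 * loss_t - loss_prev)
    w = np.exp(logits - logits.max())  # stabilised softmax
    return w / w.sum()

x = np.full(3, 1.0 / 3.0)              # uniform start over 3 actions
prev = np.zeros(3)
for loss in [np.array([1.0, 0.0, 0.5])] * 10:
    x = omwu_step(x, loss, prev)
    prev = loss
# x now concentrates on action 1, which has the lowest loss
```

In the prediction-aware setting, the hand-crafted prediction 2*loss_t - loss_prev is replaced by a forecast of the payoff induced by the predicted state of nature.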
Data Acquisition via Experimental Design for Data Markets
Charles Lu, Baihe Huang, Sai Praneeth Karimireddy, Praneeth Vepakomma, Michael Jordan, Ramesh Raskar
NeurIPS 2024
The acquisition of training data is crucial for machine learning applications. Data markets can increase the supply of data, particularly in data-scarce domains such as healthcare, by incentivizing potential data providers to join the market. A major challenge for a data buyer in such a market is choosing the most valuable data points from a data seller. Unlike prior work in data valuation, which assumes centralized data access, we propose a federated approach to the data acquisition problem that is inspired by linear experimental design. Our proposed data acquisition method achieves lower prediction error without requiring labeled validation data and can be optimized in a fast and federated procedure. The key insight of our work is that a method that directly estimates the benefit of acquiring data for test set prediction is particularly compatible with a decentralized market setting.
Statistical disaggregation—A Monte Carlo approach for imputation under constraints
Shenggang Hu, Hongsheng Dai, Fanlin Meng, Louis Aslett, Murray Pollock, Gareth O. Roberts
Scandinavian Journal of Statistics
Equality-constrained models naturally arise in problems in which the measurements are taken at different levels of resolution. The challenge in this setting is that the models usually induce a joint distribution which is intractable. Resorting to instead sampling from the joint distribution by means of a Monte Carlo approach is also challenging. For example, a naive rejection sampler does not work when the probability mass of the constraint is zero. A typical example of such constrained problems is to learn energy consumption for a higher resolution level based on data at a lower resolution, for example, to decompose a daily reading into readings at a finer level. We introduce a novel Monte Carlo sampling algorithm based on Langevin diffusions and rejection sampling to solve the problem of sampling from equality-constrained models. Our method has the advantage of being exact for linear constraints and naturally deals with multimodal distributions on arbitrary constraints. We test our method on statistical disaggregation problems for electricity consumption datasets, and our approach provides better uncertainty estimation and accuracy in data imputation compared with other naive/unconstrained methods.
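For the linear-constraint special case mentioned above (e.g. sub-readings summing to a daily total), exact sampling is possible by Gaussian conditioning, which makes a useful reference point. This is a sketch of that special case only, under an assumed Gaussian prior; the paper's Langevin-plus-rejection sampler is what handles general constraints and multimodality.

```python
import numpy as np

def sample_gaussian_under_constraint(mu, Sigma, A, b, n, rng):
    """Exact draws from N(mu, Sigma) conditioned on the event A x = b.

    Uses the Gaussian conditioning formula: the constrained law is again
    Gaussian with mean mu + Sigma A^T (A Sigma A^T)^{-1} (b - A mu) and
    covariance Sigma - Sigma A^T (A Sigma A^T)^{-1} A Sigma.
    """
    K = Sigma @ A.T @ np.linalg.inv(A @ Sigma @ A.T)
    mu_c = mu + K @ (b - A @ mu)
    Sigma_c = Sigma - K @ A @ Sigma
    # Sigma_c is rank-deficient along the constraint: factor it via eigh
    vals, vecs = np.linalg.eigh(Sigma_c)
    L = vecs @ np.diag(np.sqrt(np.clip(vals, 0.0, None)))
    return mu_c + rng.normal(size=(n, len(mu))) @ L.T

# toy disaggregation: split a daily total of 10.0 into 4 sub-readings
rng = np.random.default_rng(0)
mu, Sigma = np.full(4, 2.0), np.eye(4)
A, b = np.ones((1, 4)), np.array([10.0])
x = sample_gaussian_under_constraint(mu, Sigma, A, b, 1000, rng)
```

Every draw satisfies the equality constraint exactly (up to floating-point error), which is the behaviour the rejection-based approach must reproduce when the constraint has zero probability mass.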
Semi-implicit variational inference (SIVI) enriches the expressiveness of variationalfamilies by utilizing a kernel and a mixing distribution to hierarchically define thevariational distribution. Existing SIVI methods parameterize the mixing distributionusing implicit distributions, leading to intractable variational densities. As a result,directly maximizing the evidence lower bound (ELBO) is not possible, so theyresort to one of the following: optimizing bounds on the ELBO, employing costlyinner-loop Markov chain Monte Carlo runs, or solving minimax objectives. In thispaper, we propose a novel method for SIVI called Particle Variational Inference(PVI) which employs empirical measures to approximate the optimal mixingdistributions characterized as the minimizer of a free energy functional. PVI arisesnaturally as a particle approximation of a Euclidean–Wasserstein gradient flow and,unlike prior works, it directly optimizes the ELBO whilst making no parametricassumption about the mixing distribution. Our empirical results demonstrate thatPVI performs favourably compared to other SIVI methods across various tasks.Moreover, we provide a theoretical analysis of the behaviour of the gradient flowof a related free energy functional: establishing the existence and uniqueness ofsolutions as well as propagation of chaos results.
Pareto-Optimality, Smoothness, and Stochasticity in Learning-Augmented One-Max-Search
Ziyad Benomar, Lorenzo Croissant, Vianney Perchet, Spyros Angelopoulos
ICML 2025
One-max search is a classic problem in online decision-making, in which a trader acts on a sequence of revealed prices and accepts one of them irrevocably to maximise its profit. The problem has been studied both in probabilistic and in worst-case settings, notably through competitive analysis, and more recently in learning-augmented settings in which the trader has access to a prediction on the sequence. However, existing approaches either lack smoothness, or do not achieve optimal worst-case guarantees: they do not attain the best possible trade-off between the consistency and the robustness of the algorithm. We close this gap by presenting the first algorithm that simultaneously achieves both of these important objectives. Furthermore, we show how to leverage the obtained smoothness to provide an analysis of one-max search in stochastic learning-augmented settings which capture randomness in both the observed prices and the prediction.
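The classical worst-case baseline the paper improves upon is the reservation-price rule: with prices known to lie in [m, M], accept the first price at least √(mM), which is √(M/m)-competitive. A minimal sketch (the paper's learning-augmented algorithm, which trades consistency against robustness smoothly, is more refined):

```python
import math

def one_max_search(prices, m, M):
    """Accept the first price >= sqrt(m*M); if none arrives, take the last.

    Classical reservation-price rule for one-max search with prices in
    [m, M]: guarantees profit >= OPT / sqrt(M/m), the optimal deterministic
    worst-case competitive ratio without predictions.
    """
    threshold = math.sqrt(m * M)
    for p in prices[:-1]:
        if p >= threshold:
            return p
    return prices[-1]  # forced to accept the final revealed price

print(one_max_search([2.0, 5.0, 3.0], m=1.0, M=9.0))  # 5.0, first price >= 3
```

A prediction of the maximum price shifts this threshold; the trade-off is that trusting the prediction (consistency) can degrade the √(M/m) guarantee (robustness) when it is wrong.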
Addressing Bias in Online Selection with Limited Budget of Comparisons
Ziyad Benomar, Evgenii Chzhen, Nicolas Schreuder, Vianney Perchet
NeurIPS 2024
Consider a hiring process with candidates coming from different universities. It is easy to order candidates with the same background, yet it can be challenging to compare them otherwise. The latter case requires additional costly assessments, leading to a potentially high total cost for the hiring organization. Given an assigned budget, what would be an optimal strategy to select the most qualified candidate? We model the above problem as a multicolor secretary problem, allowing comparisons between candidates from distinct groups at a fixed cost. Our study explores how the allocated budget enhances the success probability of online selection algorithms.
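The single-group baseline that the multicolor model generalises is the classical secretary problem, solved by the 1/e rule: observe the first n/e candidates, then accept the first one better than all of them. A quick simulation sketch of that baseline (not the budgeted multicolor algorithm of the paper):

```python
import math
import random

def secretary(values):
    """Classical 1/e rule; returns the accepted candidate's value."""
    n = len(values)
    cutoff = int(n / math.e)
    best_seen = max(values[:cutoff], default=float("-inf"))
    for v in values[cutoff:]:
        if v > best_seen:
            return v          # first candidate beating the observation phase
    return values[-1]         # forced to take the last candidate

rng = random.Random(0)
trials, wins = 20_000, 0
for _ in range(trials):
    vals = [rng.random() for _ in range(50)]
    if secretary(vals) == max(vals):
        wins += 1
print(wins / trials)  # ≈ 0.37, the classical 1/e success probability
```

In the multicolor setting the rule above is unavailable across groups without paying for comparisons, which is exactly the budget trade-off the paper studies.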
Optimizing the coalition gain in Online Auctions with Greedy Structured Bandits
Dorian Baudry, Hugo Richard, Maria Cherifa, Vianney Perchet, Clément Calauzènes
NeurIPS 2024
Motivated by online display advertising, this work considers repeated second-price auctions, where agents sample their value from an unknown distribution with cumulative distribution function F. In each auction t, a decision-maker bound by limited observations selects n_t agents from a coalition of N to compete for a prize with p other agents, aiming to maximize the cumulative reward of the coalition across all auctions. The problem is framed as an N-armed structured bandit, each number n of players sent being an arm, with expected reward r(n) fully characterized by F and p+n. We present two algorithms, Local-Greedy (LG) and Greedy-Grid (GG), both achieving constant problem-dependent regret. This relies on three key ingredients: (1) an estimator of r(n) from feedback collected from any arm k, (2) concentration bounds on these estimates for k within an estimation neighborhood of n, and (3) the unimodality property of r under standard assumptions on F. Additionally, GG exhibits problem-independent guarantees on top of best problem-dependent guarantees. However, by avoiding reliance on confidence intervals, LG practically outperforms GG, as well as standard unimodal bandit algorithms such as OSUB or multi-armed bandit algorithms.
Divide-and-Conquer Posterior Sampling for Denoising Diffusion Priors
Yazid Janati, Badr Moufad, Alain Durmus, Eric Moulines, Jimmy Olsson
NeurIPS 2024
Recent advancements in solving Bayesian inverse problems have spotlighted denoising diffusion models (DDMs) as effective priors. Although these have great potential, DDM priors yield complex posterior distributions that are challenging to sample from. Existing approaches to posterior sampling in this context address this problem either by retraining model-specific components, leading to stiff and cumbersome methods, or by introducing approximations with uncontrolled errors that affect the accuracy of the produced samples. We present an innovative framework, divide-and-conquer posterior sampling, which leverages the inherent structure of DDMs to construct a sequence of intermediate posteriors that guide the produced samples to the target posterior. Our method significantly reduces the approximation error associated with current techniques without the need for retraining. We demonstrate the versatility and effectiveness of our approach for a wide range of Bayesian inverse problems. The code is available at https://github.com/Badr-MOUFAD/dcps
Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents
Safwan Labbi, Daniil Tiapkin, Lorenzo Mancini, Paul Mangold, Eric Moulines
AISTATS 2025
In this paper, we present the Federated Upper Confidence Bound Value Iteration algorithm (Fed-UCBVI), a novel extension of the UCBVI algorithm (Azar et al., 2017) tailored for the federated learning framework. We prove an upper bound on the regret of our algorithm, which matches the minimax lower bound up to polylogarithmic factors, while in the multi-agent scenario, Fed-UCBVI has linear speed-up. To conduct our analysis, we introduce a new measure of heterogeneity, which may hold independent theoretical interest. Furthermore, we show that, unlike existing federated reinforcement learning approaches, Fed-UCBVI's communication complexity only marginally increases with the number of agents.
Bernoulli factory: The 2p-coin problem
Shenggang Hu, Bo Zhang, Hongsheng Dai, Wei Liang
Monte Carlo Methods and Applications
This paper aims to address the Bernoulli factory problem for 2p-coins by analysing the relationship between the negative binomial distributions and binomial distributions generated on the same chain of coin flips. The proposed algorithm requires fewer conditions on the constructed sequences compared with existing algorithms. The feasibility of obtaining such a 2p-coin from p-coins will also be considered.
Importance sampling-based gradient method for dimension reduction in Poisson log-normal model
Bastien Batardière, Julien Chiquet, Joon Kwon, Julien Stoehr
Electronic Journal of Statistics
High-dimensional count data poses significant challenges for statistical analysis, necessitating effective methods that also preserve explainability. We focus on a low-rank constrained variant of the Poisson log-normal model, which relates the observed data to a latent low-dimensional multivariate Gaussian variable via a Poisson distribution. Variational inference methods have become the gold standard solution to infer such a model. While computationally efficient, they usually lack theoretical statistical properties with respect to the model. To address this issue we propose a projected stochastic gradient scheme that directly maximizes the log-likelihood. We prove the convergence of the proposed method when using importance sampling for estimating the gradient. Specifically, we obtain a rate of convergence of O(T^{−1/2} + N^{−1}), with T the number of iterations and N the number of Monte Carlo draws. The latter follows from a novel descent lemma for nonconvex L-smooth objective functions with biased random gradient estimates. We also demonstrate numerically the efficiency of our solution compared to its variational competitor. Our method not only scales with respect to the number of observed samples but also provides access to the desirable properties of the maximum likelihood estimator.
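The importance-sampling ingredient is easy to illustrate in one dimension: the Poisson log-normal marginal p(y) = ∫ Pois(y; e^z) N(z; μ, σ²) dz has no closed form, but the latent Gaussian prior serves as a proposal. This sketch shows the estimator only, under assumed toy parameters; the paper uses importance sampling inside a projected stochastic gradient scheme, which is not reproduced here.

```python
import math
import numpy as np

def pln_marginal_is(y, mu, sigma, n_draws=20_000, seed=0):
    """Importance-sampling estimate of p(y) for the 1-D Poisson log-normal
    model, using the latent prior N(mu, sigma^2) as proposal:
    p(y) = E_z[ Poisson(y | exp(z)) ],  z ~ N(mu, sigma^2).
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(mu, sigma, size=n_draws)
    log_pois = y * z - np.exp(z) - math.lgamma(y + 1)  # log pmf at y, rate e^z
    return np.exp(log_pois).mean()

# sanity check: the estimated probabilities over y should nearly sum to one
total = sum(pln_marginal_is(y, mu=0.0, sigma=0.5) for y in range(30))
print(total)  # ≈ 1.0
```

Differentiating the same Monte Carlo average with respect to the model parameters gives the biased-but-controlled gradient estimate whose effect the paper's descent lemma quantifies.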
Simulating signed mixtures
Christian P. Robert, Julien Stoehr
Statistics and Computing, Vol 35 (2025)
Simulating mixtures of distributions with both positive and negative (signed) weights is a challenge, as standard simulation algorithms are inefficient in handling the negative weights. In particular, the natural representation of mixture random variates as being associated with latent component indicators is no longer available. We propose an exact accept–reject algorithm for the general case of finite signed mixtures that relies on optimally pairing positive and negative components and designing a stratified sampling scheme on these pairs. We analyze the performance of our approach relative to the inverse cdf approach, since the cdf of the targeted distribution remains available for signed mixtures of common distributions.
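For a single positive/negative pair, the accept–reject idea is short enough to sketch: propose from the positive component and thin with probability proportional to the signed density. The pair below is our own toy choice (chosen so the signed density is nonnegative everywhere); the paper's algorithm additionally pairs components optimally and stratifies over pairs, which is not reproduced.

```python
import numpy as np

def sample_signed_pair(n, rng):
    """Accept-reject for the signed Gaussian mixture
        f(x) = 1.2 * N(x; 0, 2^2) - 0.2 * N(x; 0, 1),
    a valid density since 0.2*N(x;0,1) <= 1.2*N(x;0,4)/3 everywhere.
    Propose from the positive component and thin.
    """
    def npdf(x, s):
        return np.exp(-0.5 * (x / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

    out = []
    while len(out) < n:
        x = rng.normal(0.0, 2.0, size=n)
        # accept with probability f(x) / (1.2 * N(x; 0, 2^2)) = 1 - ratio
        ratio = 0.2 * npdf(x, 1.0) / (1.2 * npdf(x, 2.0))
        keep = rng.uniform(size=n) < 1.0 - ratio
        out.extend(x[keep])
    return np.array(out[:n])

rng = np.random.default_rng(0)
xs = sample_signed_pair(100_000, rng)
print(xs.var())  # ≈ 4.6 = 1.2 * 4 - 0.2 * 1, the variance under f
```

The acceptance rate here is 1/1.2; the optimal pairing studied in the paper is what keeps this rate high when many positive and negative components compete.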
Composite likelihood inference for the Poisson log-normal model
Julien Stoehr, Stéphane Robin
Journal of the Royal Statistical Society Series C: Applied Statistics
The Poisson log-normal model is a latent variable model that provides a generic framework for the analysis of multivariate count data. Inferring its parameters can be a daunting task since the conditional distribution of the latent variables given the observed ones is intractable. For this model, variational approaches are the gold standard solution as they prove to be computationally efficient but lack theoretical guarantees on the estimates. Sampling-based solutions are quite the opposite. Starting from already available variational approximations, we define a first Monte Carlo EM algorithm to obtain maximum likelihood estimators. We then extend this algorithm to the case of a composite likelihood in order to be able to handle higher-dimensional count data.
Improved Algorithms for Contextual Dynamic Pricing
Matilde Tullii, Solenne Gaucher, Nadav Merlis, Vianney Perchet
NeurIPS 2024
In contextual dynamic pricing, a seller sequentially prices goods based on contextual information. Buyers will purchase products only if the prices are below their valuations. The goal of the seller is to design a pricing strategy that collects as much revenue as possible. We focus on two different valuation models. The first assumes that valuations linearly depend on the context and are further distorted by noise. Under minor regularity assumptions, our algorithm achieves an optimal regret bound of Õ(T^{2/3}), improving on existing results. The second model removes the linearity assumption, requiring only that the expected buyer valuation is β-Hölder in the context. For this model, our algorithm obtains a regret Õ(T^{(d+2β)/(d+3β)}), where d is the dimension of the context space.
The Value of Reward Lookahead in Reinforcement Learning
Nadav Merlis, Dorian Baudry, Vianney Perchet
NeurIPS 2024
In reinforcement learning (RL), agents sequentially interact with changing environments while aiming to maximize the obtained rewards. Usually, rewards are observed only after acting, and so the goal is to maximize the expected cumulative reward. Yet, in many practical settings, reward information is observed in advance: prices are observed before performing transactions; nearby traffic information is partially known; and goals are oftentimes given to agents prior to the interaction. In this work, we aim to quantifiably analyze the value of such future reward information through the lens of competitive analysis. In particular, we measure the ratio between the value of standard RL agents and that of agents with partial future-reward lookahead. We characterize the worst-case reward distribution and derive exact ratios for the worst-case reward expectations. Surprisingly, the resulting ratios relate to known quantities in offline RL and reward-free exploration. We further provide tight bounds for the ratio given the worst-case dynamics. Our results cover the full spectrum between observing the immediate rewards before acting to observing all the rewards before the interaction starts.
Strategic Multi-Armed Bandit Problems Under Debt-Free Reporting
Ahmed Ben Yahmed, Clément Calauzènes, Vianney Perchet
NeurIPS 2024
We examine multi-armed bandit problems featuring strategic arms under debt-free reporting. In this context, each arm is characterized by a bounded-support reward distribution and strategically aims to maximize its own utility by retaining a portion of the observed reward, potentially disclosing only a fraction of it to the player. This scenario unfolds as a game over T rounds, leading to a competition of objectives between the player, who aims to minimize regret, and the arms, motivated by the desire to maximize their individual utilities. To address these dynamics, we propose an algorithm that establishes an equilibrium wherein each arm behaves truthfully and discloses as much of its rewards as possible. Using this algorithm, the player can attain the second-highest average (true) reward among arms, with a cumulative regret bounded by O(log(T)/Δ) (problem-dependent) or O(√(T log(T))) (worst-case).
Improved learning rates in multi-unit uniform price auctions
Marius Potfer, Dorian Baudry, Hugo Richard, Vianney Perchet, Cheng Wan
NeurIPS 2024
Motivated by the strategic participation of electricity producers in the electricity day-ahead market, we study the problem of online learning in repeated multi-unit uniform price auctions, focusing on the adversarial opposing-bid setting. The main contribution of this paper is the introduction of a new modeling of the bid space. Indeed, we prove that a learning algorithm leveraging the structure of this problem achieves a regret of Õ(K^(4/3) T^(2/3)) under bandit feedback, improving over the bound of Õ(K^(7/4) T^(3/4)) previously obtained in the literature. This improved regret rate is tight up to logarithmic terms. Inspired by electricity reserve markets, we further introduce a different feedback model under which all winning bids are revealed. This feedback interpolates between the full-information and bandit scenarios depending on the auctions' results. We prove that, under this feedback, the algorithm that we propose achieves regret Õ(K^(5/2) √T).
Dimension-free Private Mean Estimation for Anisotropic Distributions
Yuval Dagan, Michael I. Jordan, Xuelin Yang, Lydia Zakynthinou, Nikita Zhivotovskiy
NeurIPS 2024
We present differentially private algorithms for high-dimensional mean estimation. Previous private estimators on distributions over R^d suffer from a curse of dimensionality, as they require Ω(d^(1/2)) samples to achieve non-trivial error, even in cases where O(1) samples suffice without privacy. This rate is unavoidable when the distribution is isotropic, namely, when the covariance is a multiple of the identity matrix. Yet, real-world data is often highly anisotropic, with signals concentrated on a small number of principal components. We develop estimators that are appropriate for such signals: our estimators are (ε,δ)-differentially private and have sample complexity that is dimension-independent for anisotropic subgaussian distributions. Given n samples from a distribution with known covariance proxy Σ and unknown mean μ, we present an estimator μ̂ that achieves error ‖μ̂ − μ‖₂ ≤ α as long as n ≳ tr(Σ)/α² + tr(Σ^(1/2))/(αε). We show that this is the optimal sample complexity for this task up to logarithmic factors. Moreover, for the case of unknown covariance, we present an algorithm whose sample complexity has improved dependence on the dimension, from d^(1/2) to d^(1/4).
Privacy Can Arise Endogenously in an Economic System with Learning Agents
Nivasini Ananthakrishnan, Tiffany Ding, Mariel Werner, Sai Praneeth Karimireddy, Michael I. Jordan
FORC 2024
We study price-discrimination games between buyers and a seller where privacy arises endogenously–that is, utility maximization yields equilibrium strategies where privacy occurs naturally. In this game, buyers with a high valuation for a good have an incentive to keep their valuation private, lest the seller charge them a higher price. This yields an equilibrium where some buyers will send a signal that misrepresents their type with some probability; we refer to this as buyer-induced privacy. When the seller is able to publicly commit to providing a certain privacy level, we find that their equilibrium response is to commit to ignore buyers’ signals with some positive probability; we refer to this as seller-induced privacy. We then turn our attention to a repeated interaction setting where the game parameters are unknown and the seller cannot credibly commit to a level of seller-induced privacy. In this setting, players must learn strategies based on information revealed in past rounds. We find that, even without commitment ability, seller-induced privacy arises as a result of reputation building. We characterize the resulting seller-induced privacy and seller’s utility under no-regret and no-policy-regret learning algorithms and verify these results through simulations.
Piecewise deterministic generative models
Andrea Bertazzi, Alain Oliviero-Durmus, Dario Shariatian, Umut Simsekli, Eric Moulines
NeurIPS 2024
We introduce a novel class of generative models based on piecewise deterministic Markov processes (PDMPs), a family of non-diffusive stochastic processes consisting of deterministic motion and random jumps at random times. Similarly to diffusions, such Markov processes admit time reversals that turn out to be PDMPs as well. We apply this observation to three PDMPs considered in the literature: the Zig-Zag process, Bouncy Particle Sampler, and Randomised Hamiltonian Monte Carlo. For these three particular instances, we show that the jump rates and kernels of the corresponding time reversals admit explicit expressions depending on some conditional densities of the PDMP under consideration before and after a jump. Based on these results, we propose efficient training procedures to learn these characteristics and consider methods to approximately simulate the reverse process. Finally, we provide bounds in the total variation distance between the data distribution and the resulting distribution of our model in the case where the base distribution is the standard d-dimensional Gaussian distribution. Promising numerical simulations support further investigations into this class of models.
Learning to Mitigate Externalities: the Coase Theorem with Hindsight Rationality
Antoine Scheid, Aymeric Capitaine, Etienne Boursier, Eric Moulines, Michael I Jordan, Alain Durmus
NeurIPS 2024
In economics, the concept of externality refers to any indirect effect resulting from an interaction between players that affects social welfare. Most of the models within which externality has been studied assume that agents have perfect knowledge of their environment and preferences. This is a major hindrance to the practical implementation of many proposed solutions. To address this issue, we consider a two-player bandit setting where the actions of one of the players affect the other player, and we extend the Coase theorem [Coase, 1960], which shows that the optimal approach for maximizing social welfare in the presence of externality is to establish property rights, i.e., enable transfers and bargaining between the players. Our work removes the classical assumption that bargainers possess perfect knowledge of the underlying game. We first demonstrate that in the absence of property rights, social welfare breaks down. We then design a policy for the players that allows them to learn a bargaining strategy maximizing the total welfare, recovering the Coase theorem under uncertainty.
Unravelling in Collaborative Learning
Aymeric Capitaine, Etienne Boursier, Antoine Scheid, Eric Moulines, Michael I. Jordan, El-Mahdi El-Mhamdi, Alain Durmus
NeurIPS 2024
Collaborative learning offers a promising avenue for leveraging decentralized data. However, collaboration in groups of strategic learners is not a given. In this work, we consider strategic agents who wish to train a model together but have sampling distributions of different quality. The collaboration is organized by a benevolent aggregator who gathers samples so as to maximize total welfare, but is unaware of data quality. This setting allows us to shed light on the deleterious effect of adverse selection in collaborative learning. More precisely, we demonstrate that when data quality indices are private, the coalition may undergo a phenomenon known as unravelling, wherein it shrinks up to the point that it becomes empty or solely comprised of the worst agent. We show how this issue can be addressed without making use of external transfers, by proposing a novel method inspired by probabilistic verification. This approach makes the grand coalition a Nash equilibrium with high probability despite information asymmetry, thereby breaking unravelling.
Sampling using adaptive regenerative processes
Hector McKimm, Andi Wang, Murray Pollock, Christian P Robert, Gareth O Roberts
Bernoulli 31 (1), 509-536, (February 2025) DOI: 10.3150/24-BEJ1737.
Enriching Brownian motion with regenerations from a fixed regeneration distribution μ at a particular regeneration rate κ results in a Markov process that has a target distribution π as its invariant distribution. For the purpose of Monte Carlo inference, implementing such a scheme requires firstly selection of regeneration distribution μ, and secondly computation of a specific constant C. Both of these tasks can be very difficult in practice for good performance. We introduce a method for adapting the regeneration distribution, by adding point masses to it. This allows the process to be simulated with as few regenerations as possible and obviates the need to find said constant C. Moreover, the choice of fixed μ is replaced with the choice of the initial regeneration distribution, which is considerably less difficult. We establish convergence of this resulting self-reinforcing process and explore its effectiveness at sampling from a number of target distributions. The examples show that adapting the regeneration distribution guards against poor choices of fixed regeneration distribution and can reduce the error of Monte Carlo estimates of expectations of interest, especially when π is skewed.
Improved High-Probability Bounds for the Temporal Difference Learning Algorithm via Exponential Stability
Sergey Samsonov, Daniil Tiapkin, Alexey Naumov, Eric Moulines
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:4511-4547, 2024.
In this paper we consider the problem of obtaining sharp bounds for the performance of temporal difference (TD) methods with linear function approximation for policy evaluation in discounted Markov decision processes. We show that a simple algorithm with a universal and instance-independent step size together with Polyak-Ruppert tail averaging is sufficient to obtain near-optimal variance and bias terms. We also provide the respective sample complexity bounds. Our proof technique is based on refined error bounds for linear stochastic approximation together with the novel stability result for the product of random matrices that arise from the TD-type recurrence.
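As illustrative background only, a minimal tabular sketch of TD(0) with a constant, instance-independent step size and Polyak-Ruppert tail averaging on a toy two-state Markov reward process (the paper's setting uses linear function approximation; the chain, rewards, and step size here are hypothetical choices):

```python
import numpy as np

def td0_tail_averaged(P, r, gamma, steps, alpha, seed=0):
    """TD(0) policy evaluation with a constant step size and Polyak-Ruppert
    tail averaging: average the iterates over the second half of the run."""
    rng = np.random.default_rng(seed)
    n = len(r)
    v = np.zeros(n)        # running TD iterate
    avg = np.zeros(n)      # accumulator for the tail average
    s = 0
    tail = steps - steps // 2
    for t in range(steps):
        s_next = rng.choice(n, p=P[s])
        v[s] += alpha * (r[s] + gamma * v[s_next] - v[s])
        if t >= steps // 2:
            avg += v
        s = s_next
    return avg / tail

# Two-state chain with uniform transitions: the true values solve
# V = r + gamma * P V, giving V = (5.5, 4.5) for r = (1, 0), gamma = 0.9.
P = np.array([[0.5, 0.5], [0.5, 0.5]])
V = td0_tail_averaged(P, r=np.array([1.0, 0.0]), gamma=0.9,
                      steps=100_000, alpha=0.05)
```

The tail average suppresses the variance of the constant-step iterate, which is the mechanism behind the near-optimal variance term in the bounds above.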
The importance Markov chain
Charly Andral, Randal Douc, Hugo Marival, Christian P. Robert
Stochastic Processes and their Applications, 171, 2024
The Importance Markov chain is a novel algorithm bridging the gap between rejection sampling and importance sampling, moving from one to the other through a tuning parameter. Based on a modified sample of an instrumental Markov chain targeting an instrumental distribution (typically via an MCMC kernel), the Importance Markov chain produces an extended Markov chain where the marginal distribution of the first component converges to the target distribution. For example, when targeting a multimodal distribution, the instrumental distribution can be chosen as a tempered version of the target, which allows the algorithm to explore its modes more efficiently. We obtain a Law of Large Numbers and a Central Limit Theorem, as well as geometric ergodicity for this extended kernel under mild assumptions on the instrumental kernel. Computationally, the algorithm is easy to implement, and preexisting libraries can be used to sample from the instrumental distribution.
Sequential estimation for mixture of regression models for heterogeneous population
Na You, Hongsheng Dai, Xueqin Wang, Qingyun Yu
Computational Statistics & Data Analysis
Heterogeneity among patients commonly exists in clinical studies and leads to challenges in medical research. It is widely accepted that there exist various sub-types in the population and that they are distinct from each other. The approach of identifying the sub-types and thus tailoring disease prevention and treatment is known as precision medicine. The mixture model is a classical statistical model for clustering a heterogeneous population into homogeneous sub-populations. However, for a highly heterogeneous population with multiple components, its parameter estimation and clustering results may be ambiguous due to the dependence of the EM algorithm on initial values. For sub-typing purposes, the finite mixture of regression models with concomitant variables is considered, and a novel statistical method is proposed to sequentially identify the main components with large proportions in the mixture. Compared to existing statistical inference approaches, the new method not only requires no pre-specification of the number of components for model fitting, but also provides more reliable parameter estimation and clustering results. Simulation studies demonstrate the superiority of the proposed method. Real data analysis on drug response prediction illustrates its reliability in parameter estimation and its ability to identify important subgroups.
Classifier Calibration with ROC-Regularized Isotonic Regression
Eugène Berta, Francis Bach, Michael Jordan
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:1972-1980, 2024
Calibration of machine learning classifiers is necessary to obtain reliable and interpretable predictions, bridging the gap between model outputs and actual probabilities. One prominent technique, isotonic regression (IR), aims at calibrating binary classifiers by minimizing the cross entropy with respect to monotone transformations. IR acts as an adaptive binning procedure that is able to achieve a calibration error of zero but leaves open the issue of the effect on performance. We first prove that IR preserves the convex hull of the ROC curve, an essential performance metric for binary classifiers. This ensures that a classifier is calibrated while controlling for over-fitting of the calibration set. We then present a novel generalization of isotonic regression to accommodate classifiers with K classes. Our method constructs a multidimensional adaptive binning scheme on the probability simplex, again achieving a multi-class calibration error equal to zero. We regularize this algorithm by imposing a form of monotonicity that preserves the K-dimensional ROC surface of the classifier. We show empirically that this general monotonicity criterion is effective in striking a balance between reducing cross entropy loss and avoiding over-fitting of the calibration set.
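For readers unfamiliar with the binary case, a minimal pool-adjacent-violators (PAVA) sketch of plain isotonic calibration; this illustrates standard IR only, not the paper's ROC-regularized multi-class method, and the toy scores and labels are made up:

```python
import numpy as np

def isotonic_calibrate(scores, labels):
    """Pool-Adjacent-Violators: fit the nondecreasing step function that maps
    raw classifier scores to calibrated probabilities under squared loss."""
    order = np.argsort(scores)
    y = np.asarray(labels, dtype=float)[order]
    blocks = []                      # each block: [sum of labels, count]
    for v in y:
        blocks.append([v, 1.0])
        # merge while the previous block's mean violates monotonicity
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = np.concatenate([np.full(int(c), s / c) for s, c in blocks])
    calibrated = np.empty_like(fitted)
    calibrated[order] = fitted       # undo the sort
    return calibrated

cal = isotonic_calibrate([0.1, 0.4, 0.2, 0.3], [0, 1, 1, 0])
```

The fitted values act as adaptive bins: PAVA pools the two middle points into one bin of probability 0.5, and the average calibrated probability matches the empirical label frequency.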
On Counterfactual Metrics for Social Welfare: Incentives, Ranking, and Information Asymmetry
Serena Wang, Stephen Bates, P Aronow, Michael Jordan
International Conference on Artificial Intelligence and Statistics, 2024
From the social sciences to machine learning, it is well documented that metrics do not always align with social welfare. In healthcare, Dranove et al. (2003) showed that publishing surgery mortality metrics actually harmed sicker patients by increasing provider selection behavior. Using a principal-agent model, we analyze the incentive misalignments that arise from such average treated outcome metrics, and show that the incentives driving treatment decisions would align with maximizing total patient welfare if the metrics (i) accounted for counterfactual untreated outcomes and (ii) considered total welfare instead of averaging over treated patients. Operationalizing this, we show how counterfactual metrics can be modified to behave reasonably in patient-facing ranking systems. Extending to realistic settings where providers observe more about patients than regulatory agencies do, we bound the decay in performance by the degree of information asymmetry between principal and agent. In doing so, our model connects principal-agent information asymmetry with unobserved heterogeneity in causal inference.
Queuing dynamics of asynchronous Federated Learning
Leconte, L., Jonckheere, M., Samsonov, S., & Moulines, E.
International Conference on Artificial Intelligence and Statistics, 2024
We study asynchronous federated learning mechanisms with nodes having potentially different computational speeds. In such an environment, each node is allowed to work on models with potential delays and to contribute updates to the central server at its own pace. Existing analyses of such algorithms typically depend on intractable quantities such as the maximum node delay and do not consider the underlying queuing dynamics of the system. In this paper, we propose a non-uniform sampling scheme for the central server that allows for lower delays with better complexity, taking into account the closed Jackson network structure of the associated computational graph. Our experiments clearly show a significant improvement of our method over current state-of-the-art asynchronous algorithms on image classification problems.
Incentivized Learning in Principal-Agent Bandit Games
Antoine Scheid, Daniil Tiapkin, Etienne Boursier, Aymeric Capitaine, Mahdi El Mhamdi, Eric Moulines, Michael I. Jordan, Alain Durmus
ICML 2024
This work considers a repeated principal-agent bandit game, where the principal can only interact with her environment through the agent. The principal and the agent have misaligned objectives, and the choice of action is left to the agent alone. However, the principal can influence the agent's decisions by offering incentives that are added to his rewards. The principal aims to iteratively learn an incentive policy that maximizes her own total utility. This framework extends the usual bandit problems and is motivated by several practical applications, such as healthcare or ecological taxation, where traditionally used mechanism design theories often overlook the learning aspect of the problem. We present nearly optimal (with respect to a horizon T) learning algorithms for the principal's regret in both multi-armed and linear contextual settings. Finally, we support our theoretical guarantees with numerical experiments.
Delegating Data Collection in Decentralized Machine Learning
Nivasini Ananthakrishnan, Stephen Bates, Michael I. Jordan, Nika Haghtalab
International Conference on Artificial Intelligence and Statistics, 2024
Motivated by the emergence of decentralized machine learning ecosystems, we study the delegation of data collection. Taking the field of contract theory as our starting point, we design optimal and near-optimal contracts that deal with two fundamental machine learning challenges: lack of certainty in the assessment of model quality and lack of knowledge regarding the optimal performance of any model. We show that lack of certainty can be dealt with via simple linear contracts that achieve a 1 − 1/e fraction of the first-best utility, even if the principal has only a small test set. Furthermore, we give sufficient conditions on the size of the principal's test set that achieve a vanishing additive approximation to the optimal utility. To address the lack of a priori knowledge regarding the optimal performance, we give a convex program that can adaptively and efficiently compute the optimal contract.
Class-Conditional Conformal Prediction with Many Classes
Tiffany Ding, Anastasios Angelopoulos, Stephen Bates, Michael Jordan, Ryan J. Tibshirani
NeurIPS 2024
Standard conformal prediction methods provide a marginal coverage guarantee, which means that for a random test point, the conformal prediction set contains the true label with a user-specified probability. In many classification problems, we would like to obtain a stronger guarantee: that for test points of a specific class, the prediction set contains the true label with the same user-chosen probability. For the latter goal, existing conformal prediction methods do not work well when there is a limited amount of labeled data per class, as is often the case in real applications where the number of classes is large. We propose a method called clustered conformal prediction that clusters together classes having "similar" conformal scores and performs conformal prediction at the cluster level. Based on empirical evaluation across four image data sets with many (up to 1000) classes, we find that clustered conformal typically outperforms existing methods in terms of class-conditional coverage and set size metrics.
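As background, the standard split-conformal baseline with marginal coverage can be sketched as follows; clustered conformal prediction applies the same quantile computation within clusters of classes rather than globally. The toy scores and probabilities below are made up:

```python
import numpy as np

def conformal_qhat(cal_scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest
    calibration nonconformity score, guaranteeing >= 1-alpha marginal coverage."""
    scores = np.sort(np.asarray(cal_scores))
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return scores[min(k, n) - 1]

def prediction_set(class_probs, qhat):
    """Include every label whose nonconformity score 1 - p(label) is <= qhat."""
    return [k for k, p in enumerate(class_probs) if 1 - p <= qhat]

# toy calibration scores (1 - probability assigned to the true class)
cal_scores = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.9]
qhat = conformal_qhat(cal_scores, alpha=0.2)       # 8th smallest of 9
sets = prediction_set([0.7, 0.2, 0.08, 0.02], qhat)
```

With few calibration points per class, running this per class makes the quantile very noisy, which is the failure mode the clustered method addresses.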
Computing Bayes: From Then ‘Til Now
Gael M. Martin, David T. Frazier, Christian P Robert
Statistical Science 39(1): 3-19 (February 2024). DOI: 10.1214/22-STS876
This paper takes the reader on a journey through the history of Bayesian computation, from the 18th century to the present day. Beginning with the one-dimensional integral first confronted by Bayes in 1763, we highlight the key contributions of: Laplace, Metropolis (and, importantly, his coauthors), Hammersley and Handscomb, and Hastings, all of which set the foundations for the computational revolution in the late 20th century—led, primarily, by Markov chain Monte Carlo (MCMC) algorithms. A very short outline of 21st century computational methods—including pseudo-marginal MCMC, Hamiltonian Monte Carlo, sequential Monte Carlo and the various “approximate” methods—completes the paper.
Approximating Bayes in the 21st Century
Gael M. Martin, David T. Frazier, Christian P Robert
Statistical Science 39(1): 20-45 (February 2024). DOI: 10.1214/22-STS875
The 21st century has seen an enormous growth in the development and use of approximate Bayesian methods. Such methods produce computational solutions to certain "intractable" statistical problems that challenge exact methods like Markov chain Monte Carlo: for instance, models with unavailable likelihoods, high-dimensional models and models featuring large data sets. These approximate methods are the subject of this review. The aim is to help new researchers in particular, and more generally those interested in adopting a Bayesian approach to empirical work, distinguish between different approximate techniques, understand the sense in which they are approximate, appreciate when and why particular methods are useful, and see the ways in which they can be combined.
Living on the Edge: An Unified Approach to Antithetic Sampling
Roberto Casarin, Radu Craiu, Lorenzo Frattarolo, Christian P Robert
Statistical Science 39(1): 115-136 (February 2024). DOI: 10.1214/23-STS888
We identify recurrent ingredients in the antithetic sampling literature leading to a unified sampling framework. We introduce a new class of antithetic schemes that includes the most used antithetic proposals. This perspective enables the derivation of new properties of the sampling schemes: (i) optimality in the Kullback–Leibler sense; (ii) closed-form multivariate Kendall’s τ and Spearman’s ρ; (iii) ranking in concordance order and (iv) a central limit theorem that characterizes stochastic behaviour of Monte Carlo estimators when the sample size tends to infinity. The proposed simulation framework inherits the simplicity of the standard antithetic sampling method, requiring the definition of a set of reference points in the sampling space and the generation of uniform numbers on the segments joining the points. We provide applications to Monte Carlo integration and Markov Chain Monte Carlo Bayesian estimation.
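The simplest instance of such a scheme, with reference points {0, 1} and uniforms reflected across the midpoint of the segment (classical antithetic variates), can be sketched as follows; the integrand is an arbitrary illustrative choice:

```python
import math
import random

def mc_antithetic(f, n_pairs, seed=0):
    """Monte Carlo integration of f over [0, 1] with antithetic pairs:
    each uniform u is paired with its reflection 1 - u, which induces
    negative correlation (hence variance reduction) for monotone f."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        u = rng.random()
        total += f(u) + f(1.0 - u)
    return total / (2 * n_pairs)

# integral of e^x over [0, 1] equals e - 1
est = mc_antithetic(math.exp, 50_000)
```

For e^x the pair (f(U), f(1−U)) is negatively correlated, so the estimator's variance is far below that of plain Monte Carlo with the same number of function evaluations.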
Preprints
Tukey Depth Mechanisms for Practical Private Mean Estimation
Gavin Brown, Lydia Zakynthinou
arXiv:2502.18698
Mean estimation is a fundamental task in statistics and a focus within differentially private statistical estimation. While univariate methods based on the Gaussian mechanism are widely used in practice, more advanced techniques such as the exponential mechanism over quantiles offer robustness and improved performance, especially for small sample sizes. Tukey depth mechanisms carry these advantages to multivariate data, providing similar strong theoretical guarantees. However, practical implementations fall behind these theoretical developments. In this work, we take the first step to bridge this gap by implementing the (Restricted) Tukey Depth Mechanism, a theoretically optimal mean estimator for multivariate Gaussian distributions, yielding improved practical methods for private mean estimation. Our implementations enable the use of these mechanisms for small sample sizes or low-dimensional data. Additionally, we implement variants of these mechanisms that use approximate versions of Tukey depth, trading off accuracy for faster computation. We demonstrate their efficiency in practice, showing that they are viable options for modest dimensions. Given their strong accuracy and robustness guarantees, we contend that they are competitive approaches for mean estimation in this regime. We explore future directions for improving the computational efficiency of these algorithms by leveraging fast polytope volume approximation techniques, paving the way for more accurate private mean estimation in higher dimensions.
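For context, the widely used univariate Gaussian-mechanism baseline mentioned above can be sketched as follows; this is not the Tukey depth mechanism itself, and the clipping bound and toy data are illustrative assumptions:

```python
import numpy as np

def private_mean_gaussian(x, clip_bound, eps, delta, seed=0):
    """(eps, delta)-DP mean of univariate data via the Gaussian mechanism:
    clip to [-B, B] so the mean has sensitivity 2B/n, then add noise with
    the standard calibration sigma = sensitivity * sqrt(2 ln(1.25/delta)) / eps."""
    rng = np.random.default_rng(seed)
    x = np.clip(np.asarray(x, dtype=float), -clip_bound, clip_bound)
    sensitivity = 2.0 * clip_bound / len(x)
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return x.mean() + rng.normal(0.0, sigma)

data = np.random.default_rng(1).normal(0.5, 0.1, size=2_000)
est = private_mean_gaussian(data, clip_bound=1.0, eps=1.0, delta=1e-5)
```

The clipping bound must be chosen a priori, which is exactly the kind of brittleness that quantile- and depth-based mechanisms avoid.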
Exact Bayesian inference for Markov switching diffusions
Timothée Stumpf-Fétizon, Krzysztof Łatuszyński, Jan Palczewski, Gareth Roberts
arXiv:2502.09126
We give the first exact Bayesian methodology for the problem of inference in discretely observed regime switching diffusions. We design an MCMC and an MCEM algorithm that target the exact posterior of diffusion parameters and the latent regime process. The algorithms are exact in the sense that they target the correct posterior distribution of the continuous model, so that the errors are due to Monte Carlo only. Switching diffusion models extend ordinary diffusions by allowing for jumps in instantaneous drift and volatility. The jumps are driven by a latent, continuous time Markov switching process. We illustrate the method on numerical examples, including an empirical analysis of the method’s scalability in the length of the time series, and find that it is comparable in computational cost with discrete approximations while avoiding their shortcomings.
Transient regime of piecewise deterministic Monte Carlo algorithms
Sanket Agrawal, Joris Bierkens, Kengo Kamatani, Gareth O. Roberts
arXiv:2509.16062
Piecewise Deterministic Markov Processes (PDMPs) such as the Bouncy Particle Sampler and the Zig-Zag Sampler, have gained attention as continuous-time counterparts of classical Markov chain Monte Carlo. We study their transient regime under convex potentials, namely how trajectories that start in low-probability regions move toward higher-probability sets. Using fluid-limit arguments with a decomposition of the generator into fast and slow parts, we obtain deterministic ordinary differential equation descriptions of early-stage behaviour. The fast dynamics alone are non-ergodic because once the event rate reaches zero it does not restart. The slow component reactivates the dynamics, so averaging remains valid when taken over short micro-cycles rather than with respect to an invariant law. Using the expected number of jump events as a cost proxy for gradient evaluations, we find that for Gaussian targets the transient cost of PDMP methods is comparable to that of random-walk Metropolis. For convex heavy-tailed families with subquadratic growth, PDMP methods can be more efficient when event simulation is implemented well. Forward Event-Chain and Coordinate Samplers can, under the same assumptions, reach the typical set with an order-one expected number of jumps. For the Zig-Zag Sampler we show that, under a diagonal-dominance condition, the transient choice of direction coincides with the solution of a box-constrained quadratic program; outside that regime we give a formal derivation and a piecewise-smooth update rule that clarifies the roles of the gradient and the Hessian. These results provide theoretical insight and practical guidance for the use of PDMP samplers in large-scale inference.
On Global Convergence Rates for Federated Policy Gradient under Heterogeneous Environment
Safwan Labbi, Paul Mangold, Daniil Tiapkin, Eric Moulines
arXiv:2505.23459
Ensuring convergence of policy gradient methods in federated reinforcement learning (FRL) under environment heterogeneity remains a major challenge. In this work, we first establish that heterogeneity, perhaps counter-intuitively, can necessitate optimal policies to be non-deterministic or even time-varying, even in tabular environments. Subsequently, we prove global convergence results for federated policy gradient (FedPG) algorithms employing local updates, under a Łojasiewicz condition that holds only for each individual agent, in both entropy-regularized and non-regularized scenarios. Crucially, our theoretical analysis shows that FedPG attains linear speed-up with respect to the number of agents, a property central to efficient federated learning. Leveraging insights from our theoretical findings, we introduce b-RS-FedPG, a novel policy gradient method that employs a carefully constructed softmax-inspired parameterization coupled with an appropriate regularization scheme. We further demonstrate explicit convergence rates for b-RS-FedPG toward near-optimal stationary policies. Finally, we demonstrate that empirically both FedPG and b-RS-FedPG consistently outperform federated Q-learning on heterogeneous settings.
Permutations accelerate Approximate Bayesian Computation
Antoine Luciano, Charly Andral, Christian P. Robert, Robin J. Ryder
arXiv:2507.06037
Approximate Bayesian Computation (ABC) methods have become essential tools for performing inference when likelihood functions are intractable or computationally prohibitive. However, their scalability remains a major challenge in hierarchical or high-dimensional models. In this paper, we introduce permABC, a new ABC framework designed for settings with both global and local parameters, where observations are grouped into exchangeable compartments. Building upon the Sequential Monte Carlo ABC (ABC-SMC) framework, permABC exploits the exchangeability of compartments through permutation-based matching, significantly improving computational efficiency. We then develop two further, complementary sequential strategies: Over Sampling, which facilitates early-stage acceptance by temporarily increasing the number of simulated compartments, and Under Matching, which relaxes the acceptance condition by matching only subsets of the data. These techniques allow for robust and scalable inference even in high-dimensional regimes. Through synthetic and real-world experiments, including a hierarchical Susceptible-Infectious-Recovered model of the early COVID-19 epidemic across 94 French departments, we demonstrate the practical gains in accuracy and efficiency achieved by our approach.
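As background, the plain rejection-ABC scheme that ABC-SMC (and hence permABC) accelerates can be sketched on a toy Gaussian model; the model, prior, and tolerance here are illustrative assumptions:

```python
import numpy as np

def abc_rejection(obs_stat, n_sims, eps, seed=0):
    """Rejection ABC for the mean of a Normal(theta, 1) model with a flat
    prior on [-5, 5]: keep prior draws whose simulated summary statistic
    (the sample mean of 50 observations) lands within eps of the observed one."""
    rng = np.random.default_rng(seed)
    kept = []
    for _ in range(n_sims):
        theta = rng.uniform(-5.0, 5.0)
        sim_stat = rng.normal(theta, 1.0, size=50).mean()
        if abs(sim_stat - obs_stat) <= eps:
            kept.append(theta)
    return np.array(kept)

post = abc_rejection(obs_stat=2.0, n_sims=20_000, eps=0.1)
```

The low acceptance rate of this baseline, which worsens sharply as the number of compartments or parameters grows, is what sequential and permutation-based refinements are designed to mitigate.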
Backward Conformal Prediction
We introduce Backward Conformal Prediction, a method that guarantees conformal coverage while providing flexible control over the size of prediction sets. Unlike standard conformal prediction, which fixes the coverage level and allows the conformal set size to vary, our approach defines a rule that constrains how prediction set sizes behave based on the observed data, and adapts the coverage level accordingly. Our method builds on two key foundations: (i) recent results by Gauthier et al. [2025] on post-hoc validity using e-values, which ensure marginal coverage of the form ℙ(Y_test ∈ Ĉ_{α̃,n}(X_test)) ≥ 1 − 𝔼[α̃], up to a first-order Taylor approximation, for any data-dependent miscoverage level α̃; and (ii) a novel leave-one-out estimator α̂_LOO of the marginal miscoverage 𝔼[α̃] based on the calibration set, ensuring that the theoretical guarantees remain computable in practice. This approach is particularly useful in applications where large prediction sets are impractical, such as medical diagnosis. We provide theoretical results and empirical evidence supporting the validity of our method, demonstrating that it maintains computable coverage guarantees while ensuring interpretable, well-controlled prediction set sizes.
Information technology is in the midst of a revolution in which omnipresent data collection and machine learning are impacting the human world as never before. The word "intelligence" is being used as a North Star for the development of this technology, with human cognition viewed as a baseline. This view neglects the fact that humans are social animals, and that much of our intelligence is social and cultural in origin. A related issue is that the current view treats the social consequences of technology as an afterthought. The path forward is not merely more data and compute, and not merely more attention paid to cognitive or symbolic representations, but a thorough blending of economic and social concepts with computational and inferential concepts, in the service of system-level designs in which social welfare is a first-class citizen, and with the aspiration that a new human-centric engineering field will emerge.
Online Rolling Controlled Sequential Monte Carlo
Liwen Xue, Axel Finke, Adam M. Johansen
arXiv:2508.00696
We introduce methodology for real-time inference in general-state-space hidden Markov models. Specifically, we extend recent advances in controlled sequential Monte Carlo (CSMC) methods, originally proposed for offline smoothing, to the online setting via a rolling window mechanism. Our novel online rolling controlled sequential Monte Carlo (ORCSMC) algorithm employs two particle systems to simultaneously estimate twisting functions and perform filtering, ensuring real-time adaptivity to new observations while maintaining bounded computational cost. Numerical results on linear-Gaussian, stochastic volatility, and neuroscience models demonstrate improved estimation accuracy and robustness in higher dimensions, compared to standard particle filtering approaches. The method offers a statistically efficient and practical solution for sequential and real-time inference in complex latent variable models.
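As background for CSMC-style methods, a minimal bootstrap particle filter illustrates the baseline that twisted approaches improve upon: propagate particles through the transition, weight by the observation density, and resample. This is an illustrative sketch, not the paper's ORCSMC algorithm; the helper callables (`f_sample`, `g_logpdf`, `x0_sample`) are assumed interfaces.

```python
import numpy as np

def bootstrap_pf(y, f_sample, g_logpdf, x0_sample, n, rng):
    """Minimal bootstrap particle filter: returns an unbiased estimate of
    the log marginal likelihood of the observation sequence y."""
    x = x0_sample(n, rng)              # initial particles
    log_lik = 0.0
    for yt in y:
        x = f_sample(x, rng)           # propagate through the transition
        logw = g_logpdf(yt, x)         # weight by the observation density
        m = logw.max()
        w = np.exp(logw - m)           # stabilised weights
        log_lik += m + np.log(w.mean())
        idx = rng.choice(n, size=n, p=w / w.sum())
        x = x[idx]                     # multinomial resampling
    return log_lik
```

CSMC methods replace the blind transition proposal above with twisted proposals that look ahead at future observations, which is what ORCSMC estimates online.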
The Statistical Fairness-Accuracy Frontier
Alireza Fallah, Michael I. Jordan, Annie Ulichney
arXiv:2508.17622
Machine learning models must balance accuracy and fairness, but these goals often conflict, particularly when data come from multiple demographic groups. A useful tool for understanding this trade-off is the fairness-accuracy (FA) frontier, which characterizes the set of models that cannot be simultaneously improved in both fairness and accuracy. Prior analyses of the FA frontier provide a full characterization under the assumption of complete knowledge of population distributions — an unrealistic ideal. We study the FA frontier in the finite-sample regime, showing how it deviates from its population counterpart and quantifying the worst-case gap between them. In particular, we derive minimax-optimal estimators that depend on the designer’s knowledge of the covariate distribution. For each estimator, we characterize how finite-sample effects asymmetrically impact each group’s risk, and identify optimal sample allocation strategies. Our results transform the FA frontier from a theoretical construct into a practical tool for policymakers and practitioners who must often design algorithms with limited data.
Conditional Diffusion Models with Classifier-Free Gibbs-like Guidance
Badr Moufad, Yazid Janati, Alain Durmus, Ahmed Ghorbel, Eric Moulines, Jimmy Olsson
arXiv:2505.21101
Classifier-Free Guidance (CFG) is a widely used technique for improving conditional diffusion models by linearly combining the outputs of conditional and unconditional denoisers. While CFG enhances visual quality and improves alignment with prompts, it often reduces sample diversity, leading to a challenging trade-off between quality and diversity. To address this issue, we make two key contributions. First, we show that CFG generally does not correspond to a well-defined denoising diffusion model (DDM). In particular, contrary to common intuition, CFG does not yield samples from the target distribution associated with the limiting CFG score as the noise level approaches zero — where the data distribution is tilted by a power w>1 of the conditional distribution. We identify the missing component: a Rényi divergence term that acts as a repulsive force and is required to correct CFG and render it consistent with a proper DDM. Our analysis shows that this correction term vanishes in the low-noise limit. Second, motivated by this insight, we propose a Gibbs-like sampling procedure to draw samples from the desired tilted distribution. This method starts with an initial sample from the conditional diffusion model without CFG and iteratively refines it, preserving diversity while progressively enhancing sample quality. We evaluate our approach on both image and text-to-audio generation tasks, demonstrating substantial improvements over CFG across all considered metrics.
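The linear combination at the heart of CFG can be sketched in a few lines. This is the standard guidance formula the abstract refers to, not the paper's corrected Gibbs-like procedure; the function name and arguments are illustrative.

```python
import numpy as np

def cfg_output(eps_uncond, eps_cond, w):
    """Classifier-free guidance: linearly combine the unconditional and
    conditional denoiser outputs with guidance weight w. Setting w = 1
    recovers the purely conditional model; w > 1 extrapolates past it,
    which sharpens prompt alignment but reduces diversity."""
    return eps_uncond + w * (eps_cond - eps_uncond)
```

The paper's point is that for w > 1 this combination is not the score of any proper denoising diffusion model unless a Rényi-divergence repulsion term is added.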
Valid Selection among Conformal Sets
Mahmoud Hegazy, Liviu Aolaritei, Michael I. Jordan, Aymeric Dieuleveut
arXiv:2506.20173
Conformal prediction offers a distribution-free framework for constructing prediction sets with coverage guarantees. In practice, multiple valid conformal prediction sets may be available, arising from different models or methodologies. However, selecting the most desirable set, such as the smallest, can invalidate the coverage guarantees. To address this challenge, we propose a stability-based approach that ensures coverage for the selected prediction set. We extend our results to the online conformal setting, propose several refinements in settings where additional structure is available, and demonstrate its effectiveness through experiments.
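For context, a single conformal set in the selection pool above might be built by the standard split conformal recipe. This is a background sketch of split conformal regression under an absolute-residual score, not the paper's stability-based selection procedure; names and the score choice are assumptions.

```python
import numpy as np

def split_conformal_interval(cal_pred, cal_label, test_pred, alpha=0.1):
    """Split conformal prediction interval: calibrate a quantile of
    absolute residuals on held-out data, then inflate the test
    prediction by that margin. Guarantees >= 1 - alpha marginal
    coverage under exchangeability."""
    scores = np.abs(cal_label - cal_pred)
    n = len(scores)
    # finite-sample corrected quantile level
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, level, method="higher")
    return test_pred - q, test_pred + q
```

Selecting the smallest such interval among several valid ones is exactly the step that, without care, breaks the coverage guarantee the paper restores.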
Exact parameter and trajectory inference in state-space models is typically achieved by one of two methods: particle marginal Metropolis-Hastings (PMMH) or particle Gibbs (PGibbs). PMMH is a pseudo-marginal algorithm which jointly proposes a new trajectory and parameter, and accepts or rejects both at once. PGibbs instead alternates between sampling the trajectory, using an algorithm known as conditional sequential Monte Carlo (CSMC), and the parameter, in a Hastings-within-Gibbs fashion. While particle independent Metropolis-Hastings (PIMH), the parameter-free version of PMMH, is known to be statistically worse than CSMC, PGibbs can induce slow mixing if the parameter and the state trajectory are very correlated. This has made PMMH the method of choice for many practitioners, despite theory and experiments favouring CSMC over PIMH for the parameter-free problem. In this article, we describe a formulation of PGibbs which bypasses the Gibbs step, essentially marginalizing over the trajectory distribution in a fashion similar to PMMH. This is achieved by considering the implementation of a CSMC algorithm for the state-space model integrated over the joint distribution of the current parameter and the parameter proposal. We illustrate the benefits of the method on a simple example known to be challenging for PMMH.
Online Decision-Making in Tree-Like Multi-Agent Games with Transfers
Antoine Scheid, Etienne Boursier, Alain Durmus, Eric Moulines, Michael Jordan
arXiv:2501.19388
The widespread deployment of Machine Learning systems raises challenges, such as dealing with interactions or competition between multiple learners. To that end, we study multi-agent sequential decision-making by considering principal-agent interactions in a tree structure. In this problem, the reward of a player is influenced by the actions of her children, who are all self-interested and non-cooperative, hence the complexity of making good decisions. Our main finding is that it is possible to steer all the players towards the globally optimal set of actions by simply allowing single-step transfers between them. A transfer is established between a principal and one of her agents: the principal actually offers the proposed payment if the agent picks the recommended action. The analysis poses specific challenges due to the intricate interactions between the nodes of the tree and the propagation of the regret within this tree. Considering a bandit setup, we propose algorithmic solutions for the players to end up being no-regret with respect to the optimal pair of actions and incentives. In the long run, allowing transfers between players makes them act as if they were collaborating together, although they remain self-interested and non-cooperative: transfers restore efficiency.
Online Decision-Focused Learning
Aymeric Capitaine, Maxime Haddouche, Eric Moulines, Michael I. Jordan, Etienne Boursier, Alain Durmus
arXiv:2505.13564
Decision-focused learning (DFL) is an increasingly popular paradigm for training predictive models whose outputs are used in decision-making tasks. Instead of merely optimizing for predictive accuracy, DFL trains models to directly minimize the loss associated with downstream decisions. This end-to-end strategy holds promise for tackling complex combinatorial problems; however, existing studies focus solely on scenarios where a fixed batch of data is available and the objective function does not change over time. We instead investigate DFL in dynamic environments where the objective function and data distribution evolve over time. This setting is challenging because the objective function has zero or undefined gradients — which prevents the use of standard first-order optimization methods — and is generally non-convex. To address these difficulties, we (i) regularize the objective to make it differentiable and (ii) make use of the optimism principle, based on a near-optimal oracle along with an appropriate perturbation. This leads to a practical online algorithm for which we establish bounds on the expected dynamic regret, both when the decision space is a simplex and when it is a general bounded convex polytope. Finally, we demonstrate the effectiveness of our algorithm by comparing its performance with a classic prediction-focused approach on a simple knapsack experiment.
Asymptotics of approximate Bayesian computation when summary statistics converge at heterogeneous rates
Caroline Lawless, Christian P. Robert, Judith Rousseau, Robin J. Ryder
arXiv:2311.10080
We consider the asymptotic properties of Approximate Bayesian Computation (ABC) for the realistic case of summary statistics with heterogeneous rates of convergence. We allow some statistics to converge faster than the ABC tolerance, other statistics to converge slower, and cover the case where some statistics do not converge at all. We give conditions for the ABC posterior to converge, and provide an explicit representation of the shape of the ABC posterior distribution in our general setting; in particular, we show how the shape of the posterior depends on the number of slow statistics. We then quantify the gain brought by the local linear post-processing step.
On integral priors for multiple comparison in Bayesian model selection
Diego Salmerón, Juan Antonio Cano, Christian P. Robert
arXiv:2406.14184
Noninformative priors constructed for estimation purposes are usually not appropriate for model selection and testing. The methodology of integral priors was developed to get prior distributions for Bayesian model selection when comparing two models, modifying initial improper reference priors. We propose a generalization of this methodology to more than two models. Our approach adds an artificial copy of each model under comparison by compactifying the parametric space and creating an ergodic Markov chain across all models that returns the integral priors as marginals of the stationary distribution. Besides the guarantee of their existence and the lack of paradoxes attached to estimation reference priors, an additional advantage of this methodology is that the simulation of this Markov chain is straightforward as it only requires simulations of imaginary training samples for all models and from the corresponding posterior distributions. This renders its implementation automatic and generic, both in the nested case and in the non-nested case.
Differential privacy guarantees of Markov chain Monte Carlo algorithms
Andrea Bertazzi, Tim Johnston, Gareth O. Roberts, Alain Durmus
arXiv:2502.17150
This paper aims to provide differential privacy (DP) guarantees for Markov chain Monte Carlo (MCMC) algorithms. In the first part, we establish DP guarantees on samples output by MCMC algorithms as well as Monte Carlo estimators associated with these methods under assumptions on the convergence properties of the underlying Markov chain. In particular, our results highlight the critical condition of ensuring the target distribution is differentially private itself. In the second part, we specialise our analysis to the unadjusted Langevin algorithm and stochastic gradient Langevin dynamics and establish guarantees on their (Rényi) DP. To this end, we develop a novel methodology based on Girsanov’s theorem combined with a perturbation trick to obtain bounds for an unbounded domain and in a non-convex setting. We establish: (i) uniform-in-n privacy guarantees when the state of the chain after n iterations is released, (ii) bounds on the privacy of the entire chain trajectory. These findings provide concrete guidelines for privacy-preserving MCMC.
A Principled Approach to Bayesian Transfer Learning
Adam Bretherton, Joshua J. Bon, David J. Warne, Kerrie Mengersen, Christopher Drovandi
arXiv:2502.19796
Updating a priori information given some observed data is the core tenet of Bayesian inference. Bayesian transfer learning extends this idea by incorporating information from a related dataset to improve the inference on the observed data which may have been collected under slightly different settings. The use of related information can be useful when the observed data is scarce, for example. Current Bayesian transfer learning methods that are based on the so-called power prior can adaptively transfer information from related data. Unfortunately, it is not always clear under which scenario Bayesian transfer learning performs best or even if it will improve Bayesian inference. Additionally, current power prior methods rely on conjugacy to evaluate the posterior of interest. We propose using leave-one-out cross validation on the target dataset as a means of evaluating Bayesian transfer learning methods. Further, we introduce a new framework, transfer sequential Monte Carlo, for power prior approaches that efficiently chooses the transfer parameter while avoiding the need for conjugate priors. We assess the performance of our proposed methods in two comprehensive simulation studies.
Annealed variational mixtures for disease subtyping and biomarker discovery
Emma Prevot, Rory Toogood, Filippo Pagani, Paul D. W. Kirk
arXiv:2411.19262
Cluster analyses of high-dimensional data are often hampered by the presence of large numbers of variables that do not provide relevant information, as well as the perennial issue of choosing an appropriate number of clusters. These challenges are frequently encountered when analysing 'omics datasets, such as in molecular precision medicine, where a key goal is to identify disease subtypes and the biomarkers that define them. Here we introduce an annealed variational Bayes algorithm for fitting high-dimensional mixture models while performing variable selection. Our algorithm is scalable and computationally efficient, and we provide an open source Python implementation, VBVarSel. In a range of simulated and real biomedical examples, we show that VBVarSel outperforms the current state of the art, and demonstrate its use for cancer subtyping and biomarker discovery.
Beyond Log-Concavity and Score Regularity: Improved Convergence Bounds for Score-Based Generative Models in W2-distance
Marta Gentiloni-Silveri, Antonio Ocello
arXiv:2501.02298
Score-based Generative Models (SGMs) aim to sample from a target distribution by learning score functions using samples perturbed by Gaussian noise. Existing convergence bounds for SGMs in the W2-distance rely on stringent assumptions about the data distribution. In this work, we present a novel framework for analyzing W2-convergence in SGMs, significantly relaxing traditional assumptions such as log-concavity and score regularity. Leveraging the regularization properties of the Ornstein-Uhlenbeck (OU) process, we show that weak log-concavity of the data distribution evolves into log-concavity over time. This transition is rigorously quantified through a PDE-based analysis of the Hamilton-Jacobi-Bellman equation governing the log-density of the forward process. Moreover, we establish that the drift of the time-reversed OU process alternates between contractive and non-contractive regimes, reflecting the dynamics of concavity. Our approach circumvents the need for stringent regularity conditions on the score function and its estimators, relying instead on milder, more practical assumptions. We demonstrate the wide applicability of this framework through explicit computations on Gaussian mixture models, illustrating its versatility and potential for broader classes of data distributions.
Discrete Markov Probabilistic Models
Le-Tuyet-Nhi Pham, Dario Shariatian, Antonio Ocello, Giovanni Conforti, Alain Durmus
arXiv:2502.07939
This paper introduces the Discrete Markov Probabilistic Model (DMPM), a novel algorithm for discrete data generation. The algorithm operates in the space of bits {0,1}d, where the noising process is a continuous-time Markov chain that can be sampled exactly via a Poissonian clock that flips labels uniformly at random. The time-reversal process, like the forward noise process, is a jump process, with its intensity governed by a discrete analogue of the classical score function. Crucially, this intensity is proven to be the conditional expectation of a function of the forward process, strengthening its theoretical alignment with score-based generative models while ensuring robustness and efficiency. We further establish convergence bounds for the algorithm under minimal assumptions and demonstrate its effectiveness through experiments on low-dimensional Bernoulli-distributed datasets and high-dimensional binary MNIST data. The results highlight its strong performance in generating discrete structures. This work bridges theoretical foundations and practical applications, advancing the development of effective and theoretically grounded discrete generative modeling.
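The forward noising mechanism described above can be sketched directly: a Poisson clock triggers flip events, each of which changes a uniformly chosen bit. This is an illustrative reading of the abstract (function name, the overall rate `rate * d`, and the uniform-coordinate choice are assumptions), not the paper's exact kernel.

```python
import numpy as np

def forward_noise(x, t, rate, rng):
    """Sketch of a forward noising chain on {0,1}^d: over time horizon t,
    a Poisson clock of total intensity rate * d rings, and each ring
    flips one uniformly chosen coordinate."""
    x = x.copy()
    d = x.shape[0]
    n_events = rng.poisson(rate * d * t)       # number of clock rings in [0, t]
    for i in rng.integers(0, d, size=n_events):
        x[i] ^= 1                              # flip the chosen bit
    return x
```

As t grows, this chain forgets its initial condition and each coordinate approaches an independent fair Bernoulli, which is the tractable reference distribution the reverse-time generative process starts from.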
Self-normalized Cramér-type Moderate Deviation of Stochastic Gradient Langevin Dynamics
Hongsheng Dai, Xiequan Fan, Jianya Lu
arXiv:2410.22047
In this paper, we study the self-normalized Cramér-type moderate deviation of the empirical measure of the stochastic gradient Langevin dynamics (SGLD). Consequently, we also derive the Berry-Esseen bound for SGLD. Our approach constructs a stochastic differential equation (SDE) to approximate the SGLD and then applies Stein’s method, as developed in [9,19], to decompose the empirical measure into a martingale difference series sum and a negligible remainder term.
Optimal Design for Reward Modeling in RLHF
Antoine Scheid, Etienne Boursier, Alain Durmus, Michael I. Jordan, Pierre Ménard, Eric Moulines, Michal Valko
arXiv:2410.17055
Reinforcement Learning from Human Feedback (RLHF) has become a popular approach to align language models (LMs) with human preferences. This method involves collecting a large dataset of human pairwise preferences across various text generations and using it to infer (implicitly or explicitly) a reward model. Numerous methods have been proposed to learn the reward model and align an LM with it. However, the costly process of collecting human preferences has received little attention and could benefit from theoretical insights. This paper addresses this issue and aims to formalize reward model training in RLHF. We frame the selection of an effective dataset as a simple regret minimization task, using a linear contextual dueling bandit method. Given the potentially large number of arms, this approach is more coherent than the best-arm identification setting. We then propose an offline framework for solving this problem. Under appropriate assumptions — linearity of the reward model in the embedding space, and boundedness of the reward parameter — we derive bounds on the simple regret. Finally, we provide a lower bound that matches our upper bound up to constant and logarithmic terms. To our knowledge, this is the first theoretical contribution in this area to provide an offline approach as well as worst-case guarantees.
Learning Contracts in Hierarchical Multi-Agent Systems
Antoine Scheid, Etienne Boursier, Alain Durmus, Eric Moulines, Michael Jordan
arXiv:2501.19388
The emergence of Machine Learning systems raises new challenges, such as dealing with interactions or competition between multiple learners. To that end, we study multi-agent sequential decision-making by considering principal-agent interactions in a tree structure. In this problem, the reward of a player is influenced by the actions of her children, who are all self-interested and non-cooperative. Our main finding is that it is possible to steer all the players towards the globally optimal set of actions by simply allowing single-step contracts between them. A contract is established between a principal and one of her agents: the principal actually offers the proposed payment if the agent picks the recommended action. The analysis poses specific challenges due to the intricate interactions between the nodes of the tree. Within a bandit setup, we propose algorithmic solutions for the players to end up being no-regret with respect to the optimal pair of actions and contracts. In the long run, allowing contracts makes the players act as if they were collaborating together, although they remain non-cooperative.
We study time-changed Markov processes to speed up the convergence of Markov chain Monte Carlo (MCMC) algorithms in the context of multimodal distributions and rare event simulation. The time-changed process is defined by adjusting the speed of time of a base process via a user-chosen, state-dependent function. We apply this framework to several Markov processes from the MCMC literature, such as Langevin diffusions and piecewise deterministic Markov processes, obtaining novel modifications of classical algorithms and also re-discovering known MCMC algorithms. We prove theoretical properties of the time-changed process under suitable conditions on the base process, focusing on connecting the stationary distributions and qualitative convergence properties such as geometric and uniform ergodicity, as well as a functional central limit theorem. A comparison with the framework of space transformations is provided, clarifying the similarities between the approaches. Throughout the paper we give various visualisations and numerical simulations on simple tasks to gain intuition on the method and its performance.
Importance sampling-based gradient method for dimension reduction in Poisson log-normal model
Bastien Batardière, Julien Chiquet, Joon Kwon, Julien Stoehr
arXiv:2410.00476
High-dimensional count data poses significant challenges for statistical analysis, necessitating effective methods that also preserve explainability. We focus on a low-rank constrained variant of the Poisson log-normal model, which relates the observed data to a latent low-dimensional multivariate Gaussian variable via a Poisson distribution. Variational inference methods have become a gold standard solution to infer such a model. While computationally efficient, they usually lack theoretical statistical properties with respect to the model. To address this issue we propose a projected stochastic gradient scheme that directly maximizes the log-likelihood. We prove the convergence of the proposed method when using importance sampling for estimating the gradient. Specifically, we obtain a rate of convergence of O(T^{-1/2} + N^{-1}) with T the number of iterations and N the number of Monte Carlo draws. The latter follows from a novel descent lemma for non-convex L-smooth objective functions and biased random gradient estimates. We also demonstrate numerically the efficiency of our solution compared to its variational competitor. Our method not only scales with respect to the number of observed samples but also provides access to the desirable properties of the maximum likelihood estimator.
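By Fisher's identity, the gradient of the log-likelihood in a latent variable model is a posterior expectation of the complete-data score, which can be estimated with self-normalised importance sampling from a tractable proposal. The sketch below is a generic illustration of that estimator under assumed callable interfaces, not the paper's specific projected scheme for the Poisson log-normal model.

```python
import numpy as np

def is_score_estimate(z, log_joint, grad_log_joint, log_q):
    """Self-normalised importance-sampling estimate of
    E[grad log p(x, Z) | x], which by Fisher's identity equals the
    gradient of log p(x). Draws z come from the proposal q."""
    logw = log_joint(z) - log_q(z)     # unnormalised log-weights
    w = np.exp(logw - logw.max())      # stabilise before exponentiating
    w /= w.sum()                       # self-normalise
    return w @ grad_log_joint(z)       # weighted average of the score
```

In practice one would plug this noisy (and biased, due to self-normalisation) gradient into a projected stochastic ascent step, which is the structure the paper's convergence analysis addresses.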
Composite likelihood inference for the Poisson log-normal model
Julien Stoehr, Stéphane Robin
arXiv:2402.14390
The Poisson log-normal model is a latent variable model that provides a generic framework for the analysis of multivariate count data. Inferring its parameters can be a daunting task since the conditional distribution of the latent variables given the observed ones is intractable. For this model, variational approaches are the gold standard solution as they prove to be computationally efficient but lack theoretical guarantees on the estimates. Sampling-based solutions are quite the opposite. Starting from already available variational approximations, we define a first Monte Carlo EM algorithm to obtain maximum likelihood estimators. We then extend this algorithm to the case of a composite likelihood in order to be able to handle higher dimensional count data.
Safety vs. Performance: How Multi-Objective Learning Reduces Barriers to Market Entry
Meena Jagadeesan, Michael I. Jordan, Jacob Steinhardt
arXiv:2409.03734
Emerging marketplaces for large language models and other large-scale machine learning (ML) models appear to exhibit market concentration, which has raised concerns about whether there are insurmountable barriers to entry in such markets. In this work, we study this issue from both an economic and an algorithmic point of view, focusing on a phenomenon that reduces barriers to entry. Specifically, an incumbent company risks reputational damage unless its model is sufficiently aligned with safety objectives, whereas a new company can more easily avoid reputational damage. To study this issue formally, we define a multi-objective high-dimensional regression framework that captures reputational damage, and we characterize the number of data points that a new company needs to enter the market. Our results demonstrate how multi-objective considerations can fundamentally reduce barriers to entry — the required number of data points can be significantly smaller than the incumbent company’s dataset size. En route to proving these results, we develop scaling laws for high-dimensional linear regression in multi-objective environments, showing that the scaling rate becomes slower when the dataset size is large, which could be of independent interest.
Learning Variational Inequalities from Data: Fast Generalization Rates under Strong Monotonicity
Eric Zhao, Tatjana Chavdarova, Michael Jordan
arXiv:2410.20649
Variational inequalities (VIs) are a broad class of optimization problems encompassing machine learning problems ranging from standard convex minimization to more complex scenarios like min-max optimization and computing the equilibria of multi-player games. In convex optimization, strong convexity allows for fast statistical learning rates requiring only Θ(1/ϵ) stochastic first-order oracle calls to find an ϵ-optimal solution, rather than the standard Θ(1/ϵ²) calls. In this paper, we explain how one can similarly obtain fast Θ(1/ϵ) rates for learning VIs that satisfy strong monotonicity, a generalization of strong convexity. Specifically, we demonstrate that standard stability-based generalization arguments for convex minimization extend directly to VIs when the domain admits a small covering, or when the operator is integrable and suboptimality is measured by potential functions, such as when finding equilibria in multi-player games.
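A strongly monotone VI asks for a feasible point x* with ⟨F(x*), x − x*⟩ ≥ 0 for all feasible x, and the classical solver is a projected fixed-point iteration. The sketch below shows that textbook iteration (not the paper's statistical learning analysis); the function names are illustrative.

```python
import numpy as np

def solve_vi(F, proj, x0, eta=0.1, iters=500):
    """Projected iteration x <- proj(x - eta * F(x)) for a variational
    inequality. When F is strongly monotone and Lipschitz, the map is a
    contraction for small enough eta, giving linear convergence to x*."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = proj(x - eta * F(x))
    return x
```

For F = ∇f with f strongly convex this reduces to projected gradient descent, which is the analogy behind transferring fast Θ(1/ϵ) rates to the VI setting.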
Stereographic Markov Chain Monte Carlo
Shenggang Hu, Louis Aslett, Hongsheng Dai, Murray Pollock, Gareth O. Roberts
arXiv:2205.12112
High-dimensional distributions, especially those with heavy tails, are notoriously difficult for off-the-shelf MCMC samplers: the combination of unbounded state spaces, diminishing gradient information, and local moves results in empirically observed "stickiness" and poor theoretical mixing properties — lack of geometric ergodicity. In this paper, we introduce a new class of MCMC samplers that map the original high-dimensional problem in Euclidean space onto a sphere and remedy these notorious mixing problems. In particular, we develop random-walk Metropolis type algorithms as well as versions of the Bouncy Particle Sampler that are uniformly ergodic for a large class of light and heavy-tailed distributions and also empirically exhibit rapid convergence in high dimensions. In the best scenario, the proposed samplers can enjoy the "blessings of dimensionality", with convergence becoming faster in higher dimensions.
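The map from Euclidean space onto a sphere referred to above is the stereographic projection, which sends R^d bijectively onto the unit sphere S^d minus its north pole, pulling the tails of the distribution into a compact set. A minimal sketch of the standard projection pair (a common convention; the paper may use a rescaled variant):

```python
import numpy as np

def to_sphere(x):
    """Inverse stereographic projection: map x in R^d to the unit
    sphere S^d embedded in R^{d+1}. Points far from the origin land
    near the north pole (0, ..., 0, 1)."""
    s = x @ x
    return np.append(2.0 * x, s - 1.0) / (s + 1.0)

def from_sphere(z):
    """Stereographic projection back to R^d; undefined at the north
    pole itself, which corresponds to infinity in Euclidean space."""
    return z[:-1] / (1.0 - z[-1])
```

Running a random-walk or bouncy-particle sampler on the sphere and mapping back is what compactness buys: heavy Euclidean tails become an ordinary neighbourhood of the pole.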
Privacy Guarantees in Posterior Sampling under Contamination
Shenggang Hu, Louis Aslett, Hongsheng Dai, Murray Pollock, Gareth O. Roberts
arXiv:2403.07772
In recent years, differential privacy has been adopted by tech companies and governmental agencies as the standard for measuring privacy in algorithms. We study the level of differential privacy in Bayesian posterior sampling setups. As opposed to the common privatization approach of injecting Laplace/Gaussian noise into the output, Huber’s contamination model is considered, where we replace data points at random with samples from a heavy-tailed distribution. We derive bounds for the differential privacy level (ϵ,δ) for our approach while lifting the common restriction of assuming bounded observation and parameter spaces seen in the existing literature. We further consider the effect of sample size on the privacy level and the convergence rate of (ϵ,δ) to zero. Asymptotically, the contamination approach is fully private at no cost in information loss. We also provide examples of inference models to which our setup applies, together with theoretical estimates of the convergence rate.
Information Elicitation in Agency Games
Serena Wang, Michael I. Jordan, Katrina Ligett, R. Preston McAfee
arXiv:2402.14005
Rapid progress in scalable, commoditized tools for data collection and data processing has made it possible for firms and policymakers to employ ever more complex metrics as guides for decision-making. These developments have highlighted a prevailing challenge — deciding *which* metrics to compute. In particular, a firm’s ability to compute a wider range of existing metrics does not address the problem of *unknown unknowns*, which reflects informational limitations on the part of the firm. To guide the choice of metrics in the face of this informational problem, we turn to the evaluated agents themselves, who may have more information than a principal about how to measure outcomes effectively. We model this interaction as a simple agency game, where we ask: *When does an agent have an incentive to reveal the observability of a cost-correlated variable to the principal?* There are two effects: better information reduces the agent’s information rents but also makes some projects go forward that otherwise would fail. We show that the agent prefers to reveal information that exposes a strong enough differentiation between high and low costs. Expanding the agent’s action space to include the ability to *garble* their information, we show that the agent often prefers to garble over full revelation. Still, giving the agent the ability to garble can lead to higher total welfare. Our model has analogies with price discrimination, and we leverage some of these synergies to analyze total welfare.
Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning
Sergey Samsonov, Eric Moulines, Qi-Man Shao, Zhuo-Song Zhang, Alexey Naumov
arXiv:2405.16644
In this paper, we obtain the Berry-Esseen bound for multivariate normal approximation for the Polyak-Ruppert averaged iterates of the linear stochastic approximation (LSA) algorithm with decreasing step size. Our findings reveal that the fastest rate of normal approximation is achieved when setting the most aggressive step size α_k ≍ k^{-1/2}. Moreover, we prove the non-asymptotic validity of the confidence intervals for parameter estimation with LSA based on multiplier bootstrap. This procedure updates the LSA estimate together with a set of randomly perturbed LSA estimates upon the arrival of subsequent observations. We illustrate our findings in the setting of temporal difference learning with linear function approximation.
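The object being analysed above can be sketched concretely: an LSA recursion driven by noisy observations of a linear system, with the Polyak-Ruppert average of the iterates as the final estimate. This is an illustrative toy (the step-size constant, noise model, and problem instance are assumptions), not the paper's TD-learning setting.

```python
import numpy as np

def averaged_lsa(A, b, steps, noise_std, rng):
    """Linear stochastic approximation theta_{k+1} = theta_k +
    alpha_k (b - A theta_k + noise) with decreasing step alpha_k ~ k^{-1/2},
    returning the Polyak-Ruppert average of the iterates, which targets
    the solution of A theta = b."""
    theta = np.zeros_like(b)
    avg = np.zeros_like(b)
    for k in range(1, steps + 1):
        alpha = 0.5 * k ** -0.5
        noise = rng.normal(0.0, noise_std, size=b.shape)
        theta = theta + alpha * (b - A @ theta + noise)
        avg += (theta - avg) / k      # running Polyak-Ruppert average
    return avg
```

Averaging smooths out both the step-size noise and the transient, which is what makes the aggressive k^{-1/2} schedule viable in the paper's analysis.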
Probability and moment inequalities for additive functionals of geometrically ergodic Markov chains
Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov
arXiv:2109.00331
In this paper, we establish moment and Bernstein-type inequalities for additive functionals of geometrically ergodic Markov chains. These inequalities extend the corresponding inequalities for independent random variables. Our conditions cover Markov chains converging geometrically to the stationary distribution either in V-norms or in weighted Wasserstein distances. Our inequalities apply to unbounded functions and depend explicitly on constants appearing in the conditions that we consider.
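For reference, the classical Bernstein inequality for independent, bounded, centered random variables, which is the benchmark these Markov-chain results extend (the chain versions replace the variance proxy and range with quantities depending on the mixing constants of the conditions above):

```latex
% X_1,\dots,X_n independent, \mathbb{E}[X_i]=0, |X_i|\le M,
% \sigma^2 = \sum_{i=1}^n \mathbb{E}[X_i^2]. Then for all t \ge 0,
\mathbb{P}\Big(\Big|\sum_{i=1}^n X_i\Big| \ge t\Big)
  \le 2\exp\!\Big(-\frac{t^2}{2\big(\sigma^2 + Mt/3\big)}\Big).
```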
Almost sure convergence rates of adaptive increasingly rare Markov chain Monte Carlo
Julian Hofstadler, Krzysztof Latuszynski, Gareth O. Roberts, Daniel Rudolf
arXiv:2402.04114
We consider adaptive increasingly rare Markov chain Monte Carlo (AIR MCMC), an adaptive MCMC method in which adaptation based on the past happens less and less frequently over time. Under a contraction assumption for a Wasserstein-like function, we deduce upper bounds on the convergence rate of Monte Carlo sums, taking into account a renormalisation factor close to the one that appears in a law of the iterated logarithm. We demonstrate the applicability of our results in several settings, including simultaneous geometric and uniform ergodicity. All proofs are carried out on an augmented state space, with the classical non-augmented setting as a special case. In contrast to other adaptive MCMC limit theory, some technical assumptions, such as diminishing adaptation, are not needed.
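A minimal sketch of the "increasingly rare" idea, assuming a random-walk Metropolis sampler whose proposal scale is adapted only at doubling times 2, 4, 8, …; the 0.44 target acceptance rate and the specific update rule are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def air_rwm(logpi, n_iter=50000, x0=0.0, scale=1.0):
    """Random-walk Metropolis with adaptive increasingly rare (AIR) tuning:
    the proposal scale is only adapted at doubling times 2, 4, 8, ..."""
    x, chain = x0, np.empty(n_iter)
    accepts, since_last, next_adapt = 0, 0, 2
    for i in range(n_iter):
        prop = x + scale * rng.standard_normal()
        if np.log(rng.uniform()) < logpi(prop) - logpi(x):
            x = prop
            accepts += 1
        since_last += 1
        chain[i] = x
        if i + 1 == next_adapt:            # increasingly rare adaptation times
            rate = accepts / since_last
            scale *= np.exp(rate - 0.44)   # nudge toward the 1-d target rate (illustrative rule)
            accepts, since_last = 0, 0
            next_adapt *= 2
    return chain

chain = air_rwm(lambda x: -0.5 * x * x)    # standard normal target
print(chain.mean(), chain.var())
```

Because the gaps between adaptation times grow geometrically, adaptation happens only O(log n) times over n iterations, which is the mechanism that lets the theory dispense with assumptions like diminishing adaptation.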
SCAFFLSA: Taming Heterogeneity in Federated Linear Stochastic Approximation and TD Learning
Paul Mangold, Sergey Samsonov, Safwan Labbi, Ilya Levin, Reda Alami, Alexey Naumov, Eric Moulines
arXiv:2402.04114
In this paper, we analyze the sample and communication complexity of the federated linear stochastic approximation (FedLSA) algorithm. We explicitly quantify the effects of local training with agent heterogeneity. We show that the communication complexity of FedLSA scales polynomially with the inverse of the desired accuracy ϵ. To overcome this, we propose SCAFFLSA, a new variant of FedLSA that uses control variates to correct for client drift, and establish its sample and communication complexities. We show that for statistically heterogeneous agents, its communication complexity scales logarithmically with the desired accuracy, similar to Scaffnew. An important finding is that, compared to the existing results for Scaffnew, the sample complexity scales with the inverse of the number of agents, a property referred to as linear speed-up. Achieving this linear speed-up requires completely new theoretical arguments. We apply the proposed method to federated temporal difference learning with linear function approximation and analyze the corresponding complexity improvements.
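The control-variate mechanism can be sketched on a deterministic toy problem. The code below applies a Scaffold-style drift correction to heterogeneous local linear systems; it is an illustrative reading of the idea, not the SCAFFLSA algorithm itself, and every constant is invented.

```python
import numpy as np

rng = np.random.default_rng(2)
N, d, H, alpha, rounds = 5, 2, 10, 0.1, 200   # agents, dim, local steps, step size, rounds

# Heterogeneous local linear systems A_i theta = b_i; the target solves the averaged system.
A = [np.eye(d) * (1.0 + 0.5 * rng.uniform()) for _ in range(N)]
b = [rng.standard_normal(d) for _ in range(N)]
A_bar = sum(A) / N
b_bar = sum(b) / N
theta_star = np.linalg.solve(A_bar, b_bar)

theta = np.zeros(d)
c = [np.zeros(d) for _ in range(N)]           # control variates, one per agent
for _ in range(rounds):
    c_bar = sum(c) / N
    new_thetas, new_c = [], []
    for i in range(N):
        y = theta.copy()
        for _ in range(H):                    # H local steps with drift correction
            y = y + alpha * (b[i] - A[i] @ y - c[i] + c_bar)
        # control-variate update: average drift actually used over the local pass
        new_c.append(c[i] - c_bar + (y - theta) / (H * alpha))
        new_thetas.append(y)
    theta = sum(new_thetas) / N               # server averaging
    c = new_c

print(np.linalg.norm(theta - theta_star))
```

Without the `- c[i] + c_bar` correction, each agent's local pass drifts toward its own solution A_i⁻¹b_i; the control variates cancel that heterogeneity bias so the iterates converge to the solution of the averaged system.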
On integral priors for multiple comparison in Bayesian model selection
Diego Salmerón, Juan Antonio Cano, Christian P. Robert
arXiv:2406.14184
Noninformative priors constructed for estimation purposes are usually not appropriate for model selection and testing. The methodology of integral priors was developed to get prior distributions for Bayesian model selection when comparing two models, modifying initial improper reference priors. We propose a generalization of this methodology to more than two models. Our approach adds an artificial copy of each model under comparison by compactifying the parametric space and creates an ergodic Markov chain across all models that returns the integral priors as marginals of the stationary distribution. Besides the guarantee of their existence and the lack of paradoxes attached to estimation reference priors, an additional advantage of this methodology is that simulating this Markov chain is straightforward, as it only requires simulating imaginary training samples for all models and from the corresponding posterior distributions. This renders its implementation automatic and generic, in both the nested and the nonnested case.
Fair Allocation in Dynamic Mechanism Design
Alireza Fallah, Michael I. Jordan, Annie Ulichney
arXiv:2406.00147
We consider a dynamic mechanism design problem where an auctioneer sells an indivisible good to two groups of buyers in every round, for a total of T rounds. The auctioneer aims to maximize their discounted overall revenue while adhering to a fairness constraint that guarantees a minimum average allocation for each group. We begin by studying the static case (T=1) and establish that the optimal mechanism involves two types of subsidization: one that increases the overall probability of allocation to all buyers, and another that favors the group which otherwise has a lower probability of winning the item. We then extend our results to the dynamic case by characterizing a set of recursive functions that determine the optimal allocation and payments in each round. Notably, our results establish that in the dynamic case, the seller, on the one hand, commits to a participation reward to incentivize truth-telling, and on the other hand, charges an entry fee for every round. Moreover, the optimal allocation once more involves subsidization in favor of one group, where the extent of subsidization depends on the difference in future utilities for both the seller and buyers when allocating the item to one group versus the other. Finally, we present an approximation scheme to solve the recursive equations and determine an approximately optimal and fair allocation efficiently.
Fairness-Aware Meta-Learning via Nash Bargaining
Yi Zeng, Xuelin Yang, Li Chen, Cristian Canton Ferrer, Ming Jin, Michael I. Jordan, Ruoxi Jia
arXiv:2406.07029
To address issues of group-level fairness in machine learning, it is natural to adjust model parameters based on specific fairness objectives over a validation set annotated with sensitive attributes. Such an adjustment procedure can be cast within a meta-learning framework. However, naive integration of fairness goals via meta-learning can cause hypergradient conflicts between subgroups, resulting in unstable convergence and compromising model performance and fairness. To navigate this issue, we frame the resolution of hypergradient conflicts as a multi-player cooperative bargaining game. We introduce a two-stage meta-learning framework in which the first stage uses a Nash Bargaining Solution (NBS) to resolve hypergradient conflicts and steer the model toward the Pareto front, and the second stage optimizes with respect to specific fairness goals. Our method is supported by theoretical results, notably a proof of the NBS for gradient aggregation free from linear independence assumptions, a proof of Pareto improvement, and a proof of monotonic improvement in validation loss. We also demonstrate empirical effects across various fairness objectives on six key fairness datasets and two image classification tasks.
Defection-Free Collaboration between Competitors in a Learning System
Mariel Werner, Sai Praneeth Karimireddy, Michael I. Jordan
arXiv:2406.15898
We study collaborative learning systems in which the participants are competitors who will defect from the system if they lose revenue by collaborating. As such, we frame the system as a duopoly of competitive firms who are each engaged in training machine-learning models and selling their predictions to a market of consumers. We first examine a fully collaborative scheme in which both firms share their models with each other and show that this leads to a market collapse with the revenues of both firms going to zero. We next show that one-sided collaboration in which only the firm with the lower-quality model shares improves the revenue of both firms. Finally, we propose a more equitable, *defection-free* scheme in which both firms share with each other while losing no revenue, and we show that our algorithm converges to the Nash bargaining solution.
Learning to Mitigate Externalities: the Coase Theorem with Hindsight Rationality
Antoine Scheid, Aymeric Capitaine, Etienne Boursier, Eric Moulines, Michael I Jordan, Alain Durmus
NeurIPS 2024
In economics, an externality is any indirect effect of an interaction between players that affects social welfare. Most models in which externalities have been studied assume that agents have perfect knowledge of their environment and preferences. This is a major hindrance to the practical implementation of many proposed solutions. To address this issue, we consider a two-player bandit setting where the actions of one player affect the other, and we extend the Coase theorem [Coase, 1960]. This result shows that the optimal approach for maximizing social welfare in the presence of externalities is to establish property rights, i.e., to enable transfers and bargaining between the players. Our work removes the classical assumption that bargainers possess perfect knowledge of the underlying game. We first demonstrate that in the absence of property rights, social welfare breaks down. We then design a policy that allows the players to learn a bargaining strategy which maximizes the total welfare, recovering the Coase theorem under uncertainty.
A discussion of the paper “Safe testing” by Grünwald, de Heide, and Koolen
Joshua Bon, Christian P Robert
arXiv:2402.15574, February 2024
This is a discussion of the paper “Safe testing” by Grünwald, de Heide, and Koolen, read before the Royal Statistical Society at a meeting organized by the Research Section on Wednesday, 24 January 2024. It argues that the central proposal of this quite original paper could prove relevant for the so-called objective Bayes approach (Berger, 1985; Robert, 2001), as it may open a way to deriving prior distributions from (frequentist) first principles. However, it remains unclear to the discussants (i) how to construct the least favourable prior on H0 on a general basis, especially from a computational viewpoint, and, more importantly, (ii) whether it is at all of inferential interest (i.e., whether it degenerates into a point mass). With respect to the sequential directions of the paper, we also wonder about potential connections with sequential Monte Carlo, for instance towards conducting sequential model choice by efficiently constructing an amalgamated evidence value when the product of Bayes factors is not itself a Bayes factor (see, e.g., Buchholz et al., 2023).
Asymptotics of approximate Bayesian computation when summary statistics converge at heterogeneous rates
Caroline Lawless, Christian P Robert, Judith Rousseau, Robin J Ryder
arXiv:2311.10080, November 2023
We consider the asymptotic properties of Approximate Bayesian Computation (ABC) for the realistic case of summary statistics with heterogeneous rates of convergence. We allow some statistics to converge faster than the ABC tolerance, other statistics to converge slower, and cover the case where some statistics do not converge at all. We give conditions for the ABC posterior to converge, and provide an explicit representation of the shape of the ABC posterior distribution in our general setting; in particular, we show how the shape of the posterior depends on the number of slow statistics. We then quantify the gain brought by the local linear post-processing step.
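For background, here is a minimal vanilla ABC rejection sampler with two summary statistics (sample mean and median of a Gaussian location model), followed by the local linear (regression) post-processing step mentioned in the abstract. Tolerances, sample sizes, and the prior are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n, n_sims, eps, theta0 = 200, 50000, 0.1, 1.0

# Observed data from N(theta0, 1); two summary statistics: sample mean and median.
x_obs = rng.normal(theta0, 1.0, n)
s_obs = np.array([x_obs.mean(), np.median(x_obs)])

# Vanilla ABC rejection with a uniform prior on [-5, 5].
theta = rng.uniform(-5.0, 5.0, n_sims)
sims = rng.normal(theta[:, None], 1.0, (n_sims, n))
s = np.column_stack([sims.mean(axis=1), np.median(sims, axis=1)])
keep = np.linalg.norm(s - s_obs, axis=1) < eps
post = theta[keep]

# Local linear post-processing: regress accepted parameters on summary discrepancies
# and shift the draws to the observed summaries (regression adjustment).
X = s[keep] - s_obs
design = np.column_stack([np.ones(keep.sum()), X])
beta, *_ = np.linalg.lstsq(design, post, rcond=None)
post_adj = post - X @ beta[1:]

print(keep.sum(), post.mean(), post_adj.mean())
```

In this toy model both summaries concentrate at the same √n rate; the paper's setting is precisely the harder one where different summaries converge at different rates, or not at all, relative to the tolerance eps.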
An analysis of the noise schedule for score-based generative models
Stanislas Strasman, Antonio Ocello, Claire Boyer, Sylvain Le Corff, Vincent Lemaire
arXiv:2402.04650, February 2024
Score-based generative models (SGMs) aim at estimating a target data distribution by learning score functions using only noise-perturbed samples from the target. Recent literature has focused extensively on assessing the error between the target and estimated distributions, gauging the generative quality through the Kullback-Leibler (KL) divergence and Wasserstein distances. All existing results so far have been obtained for a time-homogeneous speed of the noise schedule. Under mild assumptions on the data distribution, we establish an upper bound for the KL divergence between the target and the estimated distributions that depends explicitly on any time-dependent noise schedule. Assuming that the score is Lipschitz continuous, we provide an improved error bound in Wasserstein distance, taking advantage of favourable underlying contraction mechanisms. We also propose an algorithm to automatically tune the noise schedule using the proposed upper bound. We illustrate empirically the performance of the noise schedule optimization in comparison to standard choices in the literature.
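For concreteness, a small sketch of how a time-dependent schedule β(t) enters the forward noising process of a variance-preserving diffusion, whose marginals are available in closed form. The specific schedules and constants below are common illustrative choices, not those studied in the paper.

```python
import numpy as np

# Forward (noising) marginals of a variance-preserving diffusion with schedule beta(t):
#   x_t | x_0 ~ N( x_0 * exp(-B(t)/2), 1 - exp(-B(t)) ),  B(t) = integral of beta over [0, t].
def vp_marginal(x0, t, beta, n_grid=1000):
    s = np.linspace(0.0, t, n_grid)
    vals = beta(s)
    B = np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(s))   # trapezoidal integration
    return x0 * np.exp(-0.5 * B), np.sqrt(1.0 - np.exp(-B))

linear = lambda s: 0.1 + 19.9 * s          # a common linear schedule (illustrative constants)
const = lambda s: np.full_like(s, 10.0)    # time-homogeneous baseline with comparable total noise

for t in (0.25, 0.5, 1.0):
    print(t, vp_marginal(1.0, t, linear), vp_marginal(1.0, t, const))
```

Both schedules inject roughly the same total noise by t = 1, but distribute it differently over time, which is exactly the degree of freedom the paper's KL bound makes explicit.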
Simulating mixtures of distributions with signed weights is a challenge, as standard simulation algorithms are inefficient in handling the negative weights. In particular, the natural representation of mixture variates via latent component indicators is no longer available. We propose here an exact accept-reject algorithm in the general case of finite signed mixtures that relies on optimally pairing positive and negative components and designing a stratified sampling scheme on the pairs. We analyze the performance of our approach relative to the inverse-cdf approach, since the cdf of the distribution remains available for standard signed mixtures.
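The atomic case of the pairing idea, one positive and one negative component, already admits a simple exact accept-reject step: propose from the (scaled) positive component and thin according to the signed density. This is an illustrative reduction with made-up weights and scales, not the paper's stratified scheme over all pairs.

```python
import numpy as np

rng = np.random.default_rng(4)

# A two-component signed mixture (one positive/negative pair):
#   f(x) = w_pos * N(x; 0, 1) - w_neg * N(x; 0, 0.5^2),  w_pos - w_neg = 1.
# With these choices f is nonnegative everywhere and integrates to 1.
w_pos, w_neg = 1.5, 0.5
phi = lambda x, s: np.exp(-0.5 * (x / s) ** 2) / (s * np.sqrt(2 * np.pi))
f = lambda x: w_pos * phi(x, 1.0) - w_neg * phi(x, 0.5)

def sample_signed_pair(n):
    """Exact accept-reject: propose from the positive component and accept
    with probability f(x) / (w_pos * N(x; 0, 1)), which lies in [1/3, 1] here."""
    out = []
    while len(out) < n:
        x = rng.normal(0.0, 1.0, n)
        u = rng.uniform(0.0, 1.0, n)
        acc = u < f(x) / (w_pos * phi(x, 1.0))
        out.extend(x[acc])
    return np.array(out[:n])

draws = sample_signed_pair(50000)
print(draws.mean(), draws.var())   # target moments: mean 0, variance 1.375
```

The dominating envelope w_pos · N(0, 1) is valid because the negative part only removes mass, so the overall acceptance rate is 1/w_pos = 2/3; the paper's pairing and stratification generalize this to many components while keeping the envelope tight.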
Insufficient Gibbs Sampling
Antoine Luciano, Christian P Robert, Robin Ryder
arXiv:2307.14973, July 2023
In some applied scenarios, the availability of complete data is restricted, often due to privacy concerns, and only aggregated, robust and inefficient statistics derived from the data are accessible. These robust statistics are not sufficient, but they demonstrate reduced sensitivity to outliers and offer enhanced data protection due to their higher breakdown point. In this article, operating within a parametric framework, we propose a method to sample from the posterior distribution of parameters conditioned on different robust and inefficient statistics: specifically, the pairs (median, MAD) or (median, IQR), or one or more quantiles. Leveraging a Gibbs sampler and the simulation of latent augmented data, our approach facilitates simulation according to the posterior distribution of parameters belonging to specific families of distributions. We demonstrate its applicability on the Gaussian, Cauchy, and translated Weibull families.
Other communications
Michael I. Jordan shares his (Oceanic) thoughts on AI!
In this video, Michael I. Jordan shares his view on the future of AI. He explains how blending classical machine learning with economic reasoning could give rise to a new discipline capable of designing large-scale, intelligent systems for the collective good. The charming setting of Trieste provides the perfect backdrop for this intellectual journey!
