Scientific publications


Incentivized Learning in Principal-Agent Bandit Games

Antoine Scheid, Daniil Tiapkin, Etienne Boursier, Aymeric Capitaine, Mahdi El Mhamdi, Eric Moulines, Michael I. Jordan, Alain Durmus

ICML Proceedings 2024

This work considers a repeated principal-agent bandit game, where the principal can only interact with her environment through the agent. The principal and the agent have misaligned objectives and the choice of action is left to the agent alone. However, the principal can influence the agent's decisions by offering incentives that are added to his rewards. The principal aims to iteratively learn an incentive policy that maximizes her own total utility. This framework extends the usual bandit problem and is motivated by several practical applications, such as healthcare or ecological taxation, where traditional mechanism design theories often overlook the learning aspect of the problem. We present nearly optimal (with respect to the horizon T) learning algorithms for the principal's regret in both multi-armed and linear contextual settings. Finally, we support our theoretical guarantees with numerical experiments.
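
To make the interaction protocol concrete, here is a minimal Python simulation sketch. All names are ours, and the incentive rule below is an oracle that assumes the agent's mean rewards are known; the point of the paper is precisely that the principal must learn them, so this illustrates the game, not the learning algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
K, T = 5, 10_000
mu_agent = rng.uniform(size=K)       # agent's mean rewards (unknown to the principal)
mu_principal = rng.uniform(size=K)   # principal's mean rewards

total_utility = 0.0
for t in range(T):
    target = int(np.argmax(mu_principal))         # arm the principal wants played
    incentive = np.zeros(K)
    # Oracle transfer: just enough to make the target arm the agent's best response.
    incentive[target] = max(mu_agent.max() - mu_agent[target], 0.0) + 1e-3
    arm = int(np.argmax(mu_agent + incentive))    # agent maximizes reward + incentive
    reward = rng.binomial(1, mu_principal[arm])   # principal observes her own reward
    total_utility += reward - incentive[arm]      # utility = reward minus transfer
```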

Delegating Data Collection in Decentralized Machine Learning

Nivasini Ananthakrishnan, Stephen Bates, Michael I. Jordan, Nika Haghtalab

Proc. International Conference on Artificial Intelligence and Statistics, 2024.

Motivated by the emergence of decentralized machine learning ecosystems, we study the delegation of data collection. Taking the field of contract theory as our starting point, we design optimal and near-optimal contracts that deal with two fundamental machine learning challenges: lack of certainty in the assessment of model quality and lack of knowledge regarding the optimal performance of any model. We show that lack of certainty can be dealt with via simple linear contracts that achieve a 1-1/e fraction of the first-best utility, even if the principal has only a small test set. Furthermore, we give sufficient conditions on the size of the principal's test set under which a vanishing additive approximation to the optimal utility is achieved. To address the lack of a priori knowledge regarding the optimal performance, we give a convex program that can adaptively and efficiently compute the optimal contract.
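
As a toy illustration of what a linear contract looks like, the Python sketch below pays the agent a fixed share of the model quality measured on the principal's test set. The function name and the choice of share are ours; the 1-1/e value merely echoes the approximation ratio quoted above and is not a prescription from the paper.

```python
import math

def linear_contract_payment(measured_quality: float, alpha: float) -> float:
    """Pay the agent a fixed fraction `alpha` of the measured model quality."""
    return alpha * measured_quality

# Quality in [0, 1], estimated on the principal's (possibly small) test set.
payment = linear_contract_payment(measured_quality=0.87, alpha=1 - 1 / math.e)
```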

Class-Conditional Conformal Prediction with Many Classes

Tiffany Ding, Anastasios Angelopoulos, Stephen Bates, Michael I. Jordan, Ryan J. Tibshirani

NeurIPS Proceedings 2023

Standard conformal prediction methods provide a marginal coverage guarantee, which means that for a random test point, the conformal prediction set contains the true label with a user-specified probability. In many classification problems, we would like to obtain a stronger guarantee: that for test points of a specific class, the prediction set contains the true label with the same user-chosen probability. For the latter goal, existing conformal prediction methods do not work well when there is a limited amount of labeled data per class, as is often the case in real applications where the number of classes is large. We propose a method called clustered conformal prediction that clusters together classes having "similar" conformal scores and performs conformal prediction at the cluster level. Based on empirical evaluation across four image data sets with many (up to 1000) classes, we find that clustered conformal prediction typically outperforms existing methods in terms of class-conditional coverage and set size metrics.
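
The following Python sketch conveys the core mechanism: embed each class by a few empirical quantiles of its calibration scores, cluster the embeddings, and compute one conformal threshold per cluster. It is a simplified rendering, not the authors' exact procedure, and it assumes every class has at least a handful of calibration points.

```python
import numpy as np
from sklearn.cluster import KMeans

def clustered_conformal_thresholds(cal_scores, cal_labels, n_classes,
                                   n_clusters=10, alpha=0.1):
    """Cluster classes with similar score distributions, then calibrate per cluster."""
    qs = np.linspace(0.1, 0.9, 5)
    # Embed each class by empirical quantiles of its conformal scores.
    embed = np.stack([np.quantile(cal_scores[cal_labels == c], qs)
                      for c in range(n_classes)])
    cluster_of = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embed)
    thresholds = np.empty(n_clusters)
    for k in range(n_clusters):
        pooled = cal_scores[np.isin(cal_labels, np.where(cluster_of == k)[0])]
        level = min(np.ceil((len(pooled) + 1) * (1 - alpha)) / len(pooled), 1.0)
        thresholds[k] = np.quantile(pooled, level)
    return cluster_of, thresholds

# A prediction set then contains every class whose conformal score for the test
# point falls below the threshold of that class's cluster.
```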

Computing Bayes: From Then ‘Til Now

Gael M. Martin, David T. Frazier, Christian P. Robert

Statistical Science 39(1): 3-19 (February 2024). DOI: 10.1214/22-STS876

This paper takes the reader on a journey through the history of Bayesian computation, from the 18th century to the present day. Beginning with the one-dimensional integral first confronted by Bayes in 1763, we highlight the key contributions of: Laplace, Metropolis (and, importantly, his coauthors), Hammersley and Handscomb, and Hastings, all of which set the foundations for the computational revolution in the late 20th century—led, primarily, by Markov chain Monte Carlo (MCMC) algorithms. A very short outline of 21st century computational methods—including pseudo-marginal MCMC, Hamiltonian Monte Carlo, sequential Monte Carlo and the various “approximate” methods—completes the paper.

Approximating Bayes in the 21st Century

Gael M. Martin, David T. Frazier, Christian P. Robert

Statistical Science 39(1): 20-45 (February 2024). DOI: 10.1214/22-STS875

The 21st century has seen an enormous growth in the development and use of approximate Bayesian methods. Such methods produce computational solutions to certain “intractable” statistical problems that challenge exact methods like Markov chain Monte Carlo: for instance, models with unavailable likelihoods, high-dimensional models and models featuring large data sets. These approximate methods are the subject of this review. The aim is to help new researchers in particular—and more generally those interested in adopting a Bayesian approach to empirical work—distinguish between different approximate techniques, understand the sense in which they are approximate, appreciate when and why particular methods are useful and see the ways in which they can be combined.

Living on the Edge: A Unified Approach to Antithetic Sampling

Roberto Casarin, Radu Craiu, Lorenzo Frattarolo, Christian P. Robert

Statistical Science 39(1): 115-136 (February 2024). DOI: 10.1214/23-STS888

We identify recurrent ingredients in the antithetic sampling literature, leading to a unified sampling framework. We introduce a new class of antithetic schemes that includes the most commonly used antithetic proposals. This perspective enables the derivation of new properties of the sampling schemes: (i) optimality in the Kullback–Leibler sense; (ii) closed-form multivariate Kendall’s τ and Spearman’s ρ; (iii) ranking in concordance order; and (iv) a central limit theorem that characterizes the stochastic behaviour of Monte Carlo estimators when the sample size tends to infinity. The proposed simulation framework inherits the simplicity of the standard antithetic sampling method, requiring only the definition of a set of reference points in the sampling space and the generation of uniform numbers on the segments joining the points. We provide applications to Monte Carlo integration and Markov chain Monte Carlo Bayesian estimation.
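
The classical antithetic pair (u, 1-u) is the simplest member of the class studied here: two reference points 0 and 1, with a uniform draw reflected through the midpoint of the segment joining them. A minimal Python comparison with plain Monte Carlo integration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
f = np.exp  # integrand on [0, 1]; the target is the integral e - 1

u = rng.uniform(size=n)
plain = f(rng.uniform(size=2 * n)).mean()      # 2n independent evaluations
antithetic = 0.5 * (f(u) + f(1.0 - u)).mean()  # n antithetic pairs, same budget
# f(u) and f(1 - u) are negatively correlated for monotone f, so the antithetic
# estimator has lower variance than the plain one at equal cost.
```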

Preprints


A discussion of the paper "Safe testing" by Grünwald, de Heide, and Koolen

Joshua Bon, Christian P. Robert

ArXiv:2402.15574, February 2024

This is a discussion of the paper "Safe testing" by Grünwald, de Heide, and Koolen, read before the Royal Statistical Society at a meeting organized by the Research Section on Wednesday, 24 January 2024. It argues that the central proposal of this quite original paper could prove relevant for the so-called objective Bayes approach (Berger, 1985; Robert, 2001), as it may open a way to deriving prior distributions from (frequentist) first principles. However, it remains unclear to the discussants (i) how to construct the least favourable prior on H0 on a general basis, especially from a computational viewpoint, and, more importantly, (ii) whether it is at all of inferential interest [i.e., whether it degenerates into a point mass]. With respect to the sequential directions of the paper, we also wonder about potential connections with sequential Monte Carlo, for instance towards conducting sequential model choice by efficiently constructing an amalgamated evidence value when the product of Bayes factors is not itself a Bayes factor (see, e.g., Buchholz et al., 2023).

Asymptotics of approximate Bayesian computation when summary statistics converge at heterogeneous rates

Caroline Lawless, Christian P. Robert, Judith Rousseau, Robin J. Ryder

ArXiv:2311.10080, November 2023

We consider the asymptotic properties of Approximate Bayesian Computation (ABC) for the realistic case of summary statistics with heterogeneous rates of convergence. We allow some statistics to converge faster than the ABC tolerance, other statistics to converge slower, and cover the case where some statistics do not converge at all. We give conditions for the ABC posterior to converge, and provide an explicit representation of the shape of the ABC posterior distribution in our general setting; in particular, we show how the shape of the posterior depends on the number of slow statistics. We then quantify the gain brought by the local linear post-processing step.
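
For readers new to the topic, here is the vanilla ABC rejection sampler whose asymptotics the paper studies, in a Python sketch with two summary statistics. The heterogeneity of convergence rates analysed in the paper arises with other choices of statistics; the names and numbers below are ours.

```python
import numpy as np

rng = np.random.default_rng(2)
y_obs = rng.normal(loc=1.5, size=200)               # observed data (toy example)
s_obs = np.array([y_obs.mean(), np.median(y_obs)])  # two summary statistics

def abc_rejection(n_draws=20_000, tol=0.1):
    """Keep prior draws whose simulated summaries land within `tol` of s_obs."""
    theta = rng.normal(0.0, 3.0, size=n_draws)      # draws from the prior
    sims = rng.normal(theta[:, None], 1.0, size=(n_draws, 200))
    s = np.stack([sims.mean(axis=1), np.median(sims, axis=1)], axis=1)
    keep = np.linalg.norm(s - s_obs, axis=1) < tol
    return theta[keep]                              # approximate posterior sample

posterior_sample = abc_rejection()
```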

An analysis of the noise schedule for score-based generative models

Stanislas Strasman, Antonio Ocello, Claire Boyer, Sylvain Le Corff, Vincent Lemaire

ArXiv:2402.04650, February 2024

Score-based generative models (SGMs) aim at estimating a target data distribution by learning score functions using only noise-perturbed samples from the target. Recent literature has focused extensively on assessing the error between the target and estimated distributions, gauging the generative quality through the Kullback-Leibler (KL) divergence and Wasserstein distances. So far, all existing results have been obtained for a time-homogeneous speed of the noise schedule. Under mild assumptions on the data distribution, we establish an upper bound for the KL divergence between the target and the estimated distributions, explicitly depending on any time-dependent noise schedule. Assuming that the score is Lipschitz continuous, we provide an improved error bound in Wasserstein distance, taking advantage of favourable underlying contraction mechanisms. We also propose an algorithm to automatically tune the noise schedule using the proposed upper bound. We empirically illustrate the performance of the noise schedule optimization in comparison with standard choices from the literature.
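
To fix ideas on what a time-dependent noise schedule is, the Python sketch below computes the marginal noise level it induces under the common variance-preserving parameterization; the linear schedule and all names are illustrative defaults of ours, not the schedule optimized in the paper.

```python
import numpy as np

def linear_beta(t, beta_min=0.1, beta_max=20.0):
    """An illustrative time-dependent noise schedule beta(t)."""
    return beta_min + t * (beta_max - beta_min)

def marginal_std(t, beta=linear_beta, n_grid=1000):
    """Std of the noise injected by time t under the variance-preserving SDE:
    sigma(t)^2 = 1 - exp(-int_0^t beta(s) ds), here via a simple Riemann sum."""
    s = np.linspace(0.0, t, n_grid)
    integral = beta(s).sum() * (t / n_grid)
    return np.sqrt(1.0 - np.exp(-integral))

# marginal_std(1.0) is close to 1: at the end of the forward process the
# perturbed sample is essentially pure noise.
```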

Sampling signed mixtures

Julien Stoehr, Christian P. Robert

ArXiv:2401.16828, February 2024

Simulating mixtures of distributions with signed weights proves a challenge, as standard simulation algorithms are inefficient in handling the negative weights. In particular, the natural representation of mixture variates as associated with latent component indicators is no longer available. We propose here an exact accept-reject algorithm in the general case of finite signed mixtures that relies on optimally pairing positive and negative components and designing a stratified sampling scheme on the pairs. We analyze the performance of our approach relative to the inverse cdf approach, since the cdf of the distribution remains available for standard signed mixtures.
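
A two-component toy case shows the accept-reject idea in Python: drawing from the positive component and thinning with the ratio of the negative part leaves exactly the signed-mixture density. This is only the basic building block; the paper's contribution lies in optimally pairing components and stratifying over pairs, which the sketch omits.

```python
import numpy as np
from scipy.stats import norm

# Target: f(x) ∝ a * N(x; 0, 1) - b * N(x; 0, 0.5), chosen so f >= 0 everywhere.
a, b = 1.0, 0.2
p, q = norm(0.0, 1.0), norm(0.0, 0.5)
rng = np.random.default_rng(3)

def sample_signed_mixture(n):
    """Accept-reject: x ~ p is kept with probability 1 - b*q(x)/(a*p(x)),
    so accepted draws have density proportional to a*p - b*q."""
    out = []
    while len(out) < n:
        x = p.rvs(size=n, random_state=rng)
        u = rng.uniform(size=n)
        out.extend(x[u < 1.0 - (b * q.pdf(x)) / (a * p.pdf(x))])
    return np.array(out[:n])
```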

Insufficient Gibbs Sampling

Antoine Luciano, Christian P. Robert, Robin Ryder

ArXiv:2307.14973, July 2023

In some applied scenarios, the availability of complete data is restricted, often due to privacy concerns, and only aggregated, robust and inefficient statistics derived from the data are accessible. These robust statistics are not sufficient statistics, but they demonstrate reduced sensitivity to outliers and offer enhanced data protection due to their higher breakdown point. In this article, operating within a parametric framework, we propose a method to sample from the posterior distribution of parameters conditioned on various robust and inefficient statistics: specifically, the pairs (median, MAD) or (median, IQR), or sets of one or more quantiles. Leveraging a Gibbs sampler and the simulation of latent augmented data, our approach facilitates simulation from the posterior distribution of parameters belonging to specific families of distributions. We demonstrate its applicability on the Gaussian, Cauchy, and translated Weibull families.
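
The flavour of the approach can be conveyed with a crude Python sketch for the Gaussian case: alternate between completing a latent dataset that matches the observed (median, MAD) exactly and updating the parameters given the completed data. The rescaling step below is a stand-in of ours, not the authors' exact Gibbs kernel, which simulates the latent data from their correct conditional distribution.

```python
import numpy as np

rng = np.random.default_rng(4)
n, med_obs, mad_obs = 101, 2.0, 1.3   # only these robust statistics are observed

def mad(x):
    return np.median(np.abs(x - np.median(x)))

mu, sigma, draws = 0.0, 1.0, []
for it in range(5_000):
    # (1) Data augmentation: simulate latent data under the current parameters,
    # rescaled so the completed sample matches (median, MAD) exactly.
    z = rng.normal(mu, sigma, size=n)
    x = med_obs + mad_obs * (z - np.median(z)) / mad(z)
    # (2) Standard conjugate-style updates given the completed data.
    sigma = np.sqrt(np.sum((x - x.mean()) ** 2) / rng.chisquare(n - 1))
    mu = rng.normal(x.mean(), sigma / np.sqrt(n))
    draws.append((mu, sigma))
```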

Other communications


Michael I. Jordan shares his (Oceanic) thoughts on AI!

In this video, Michael I. Jordan shares his view on the future of AI. He explains how blending classical machine learning with economic reasoning could lead to the emergence of a new discipline able to design large-scale, intelligent systems for the collective good. The charming setting of Trieste provides the perfect backdrop for this intellectual journey!

A special issue of Statistical Science on Bayesian Computations in the 21st Century

Christian P. Robert and Dennis Prangle (University of Bristol) have put together this special issue of Statistical Science, which just appeared as the February 2024 issue, with five papers on the future of Bayesian computational science (and a survey of the past). It involves a large group of researchers who contributed to collective articles, bringing their own perspectives and research interests to these surveys. This is an exciting time for Bayesian computing: the Monte Carlo revolution of the 1990s continues to have a huge influence on today's work, and is now complemented by a diverse range of new directions informed by modern machine learning.

How has geopolitics influenced the evolution of sign languages?

Robin Ryder was recently interviewed by the French magazine Pour la Science to explain how sign languages have evolved in recent history. Using mathematical methods inspired by phylogenetics, one can derive similarities between sign languages and understand them in light of the historical and geopolitical context.
