Why OCEAN?

Machine learning and artificial intelligence (AI) have made major strides in the last two decades. This progress has been driven by a dramatic increase in data and computing capacity, within a centralized paradigm that requires aggregating data in a single location where massive computing resources can be brought to bear.

This fully centralized machine learning paradigm is, however, increasingly at odds with real-world use cases, for reasons that are both technological and societal. In particular, centralized learning risks exposing user privacy, makes inefficient use of communication resources, creates data-processing bottlenecks, and may lead to concentration of economic and political power.

It thus appears timely to develop the theory and practice of a new form of machine learning that targets heterogeneous, massively decentralized networks, involving self-interested agents who expect to receive value (rewards or incentives) for their participation in data exchanges.

What is our research agenda?

The science behind OCEAN is a blend of new methods from numerical probability, Bayesian computational statistics, machine learning, distributed algorithms, multi-agent systems, and game theory. Advancing theory is critical to our vision, as quantitative and rigorous statements about performance are essential to formulating meaningful trade-offs between computational, economic, and inferential goals.

Optimization and dynamic systems

The classical optimization toolbox offers an insufficiently rich corpus of methods for analyzing connections and trade-offs between computational, inferential, and strategic goals. A richer toolbox can be achieved by treating constraints as forces rather than as geometric regions in the configuration space. We aim to extend previous work on first-order methods to the more powerful concept of accelerated gradient descent and to the general setting of variational inequalities.
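To illustrate the accelerated first-order methods mentioned above, here is a minimal sketch of Nesterov-style accelerated gradient descent on an ill-conditioned quadratic. The function name, step size, and momentum value are illustrative choices, not part of the proposal:

```python
import numpy as np

def nesterov_agd(grad, x0, lr, momentum=0.9, steps=100):
    """Nesterov acceleration: the gradient is evaluated at a look-ahead
    point along the momentum direction, which yields faster convergence
    than plain gradient descent on ill-conditioned problems."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(steps):
        lookahead = x + momentum * v          # peek ahead before stepping
        v = momentum * v - lr * grad(lookahead)
        x = x + v
    return x

# Minimize f(x) = 0.5 * x^T A x with a poorly conditioned A (minimum at 0).
A = np.diag([1.0, 100.0])
x_star = nesterov_agd(lambda x: A @ x, x0=[5.0, 5.0], lr=0.009, steps=300)
```

On this quadratic the iterates contract toward the origin far faster than unaccelerated gradient descent would at the same step size.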


Bayesian inference and sampling

We will develop uncertainty quantification methods within a coherent Bayesian framework that are applicable to the general federated learning setting. The FL setting requires us to reframe the basic Bayesian inferential paradigm to cope with high dimensionality, insufficient information, and heterogeneity. We will also need to develop the theory and methods for efficient approximate Bayesian computation, as well as devise new classes of communication- and computation-efficient stochastic gradient Markov chain Monte Carlo algorithms.
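As a toy illustration of the stochastic gradient MCMC family mentioned above, here is a minimal sketch of Stochastic Gradient Langevin Dynamics (SGLD) targeting a standard normal posterior. The target, step size, and iteration counts are illustrative assumptions only:

```python
import numpy as np

def sgld(grad_log_post, theta0, step, n_iter, rng):
    """Stochastic Gradient Langevin Dynamics: a gradient ascent step on
    the log-posterior plus injected Gaussian noise, turning optimization
    into (approximate) posterior sampling."""
    theta = float(theta0)
    samples = []
    for _ in range(n_iter):
        noise = rng.normal(0.0, np.sqrt(step))
        theta = theta + 0.5 * step * grad_log_post(theta) + noise
        samples.append(theta)
    return np.array(samples)

# Toy target: standard normal, so grad log p(theta) = -theta.
rng = np.random.default_rng(0)
draws = sgld(lambda t: -t, theta0=3.0, step=0.1, n_iter=20000, rng=rng)
burned = draws[5000:]  # discard burn-in before summarizing
```

After burn-in, the empirical mean and standard deviation of the draws approximate those of the N(0, 1) target, up to discretization bias from the fixed step size.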


Federated learning

Most federated learning methods focus on predictive approaches. Our objective is to extend their scope to embrace statistical inference on complex models. Our specific objectives are (a) communication efficiency beyond convex risk minimization, with new compression strategies and novel aggregation rules; (b) FL beyond stochastic gradient descent, to address complex inference problems; and (c) Bayesian FL methods to provide a complete inferential toolbox.
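One way to picture the compression and aggregation ideas under (a): a minimal sketch of a single federated round in which each client sends a top-k sparsified gradient and the server averages them. All names, values, and the choice of top-k sparsification are illustrative assumptions:

```python
import numpy as np

def top_k(vec, k):
    """Keep only the k largest-magnitude entries: a simple compressor
    that cuts client-to-server communication."""
    out = np.zeros_like(vec)
    idx = np.argsort(np.abs(vec))[-k:]
    out[idx] = vec[idx]
    return out

def fed_round(global_w, client_grads, lr, k):
    """One federated round: clients send compressed gradients, the
    server averages them and updates the shared model."""
    compressed = [top_k(g, k) for g in client_grads]
    avg = np.mean(compressed, axis=0)
    return global_w - lr * avg

w = np.zeros(10)
grads = [np.arange(10.0), -np.arange(10.0) * 0.5]  # two heterogeneous clients
w = fed_round(w, grads, lr=0.1, k=3)
```

With k=3 each client transmits 3 coordinates instead of 10; the resulting global update is sparse, which is the communication/accuracy trade-off the objective refers to.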


Privacy

Designing methods that provide strong privacy guarantees is key in many real-world inference problems. This requires developing a framework for inference that unifies cryptographic and statistical concepts, to maximize the learning potential of data without compromising privacy.
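A standard statistical building block for such guarantees is differential privacy; here is a minimal sketch of the Laplace mechanism releasing a private mean of bounded data. The dataset and epsilon are illustrative assumptions:

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Epsilon-differential privacy via the Laplace mechanism: add noise
    with scale sensitivity/epsilon to the query answer."""
    scale = sensitivity / epsilon
    return true_value + rng.laplace(0.0, scale)

# Private mean of values bounded in [0, 1]: changing one of the n records
# moves the mean by at most 1/n, so the sensitivity is 1/n.
rng = np.random.default_rng(42)
data = rng.uniform(0.0, 1.0, size=1000)
private_mean = laplace_mechanism(data.mean(), sensitivity=1.0 / len(data),
                                 epsilon=1.0, rng=rng)
```

With n = 1000 the noise scale is 0.001, so the released mean stays close to the true mean while masking any single individual's contribution.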


Economic value of data and incentives

This is an emerging domain that poses many exciting research challenges blending Bayesian statistics and economic concepts. A first challenge is to formalize the concept of "economic" value of data, in terms of a particular inference or prediction problem, given the data already available. From there, we aim to design and investigate structures of data-sharing markets, and to promote long-term stability in data federations as well as social welfare.
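One simple formalization of a data point's value for a given prediction problem is its leave-one-out contribution to model utility. The sketch below uses a hypothetical utility (negative in-sample squared error of a least-squares fit) purely for illustration; it is not the valuation the project commits to:

```python
import numpy as np

def leave_one_out_values(X, y, utility):
    """Leave-one-out valuation: a point's value is the change in model
    utility when that point is removed from the training set."""
    full = utility(X, y)
    values = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        values.append(full - utility(X[mask], y[mask]))
    return np.array(values)

def utility(X, y):
    # Hypothetical utility: negative mean squared error of a
    # least-squares fit (illustration only).
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return -np.mean((X @ coef - y) ** 2)

rng = np.random.default_rng(7)
X = rng.normal(size=(30, 2))
y = X @ np.array([1.0, -2.0]) + rng.normal(scale=0.1, size=30)
vals = leave_one_out_values(X, y, utility)  # one value per data point
```

Richer notions (e.g. Shapley-style averages over coalitions) refine the same idea, which is what makes market design around data value tractable to study.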


Strategic experimentation

We want to address a learning problem where agents collect data relevant to decision-making and learn from others' experiments. Specific objectives include devising strategies for multi-agent multi-armed bandits in which the agents' action rewards are interdependent due to scarcity and congestion, and for Markov games, with special emphasis on scenarios in which data and control are decentralized and multiple, possibly conflicting, objectives must be met.
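The single-agent building block behind these objectives is the multi-armed bandit. Here is a minimal sketch of the classical UCB1 strategy on a toy problem with Gaussian rewards; arm means, horizon, and noise level are illustrative assumptions:

```python
import numpy as np

def ucb1(arm_means, horizon, rng):
    """UCB1: pull the arm maximizing empirical mean plus an exploration
    bonus that shrinks as an arm is sampled more often."""
    n_arms = len(arm_means)
    counts = np.zeros(n_arms)
    sums = np.zeros(n_arms)
    for t in range(horizon):
        if t < n_arms:
            arm = t                      # play each arm once to initialize
        else:
            bonus = np.sqrt(2.0 * np.log(t) / counts)
            arm = int(np.argmax(sums / counts + bonus))
        reward = rng.normal(arm_means[arm], 0.1)  # noisy reward draw
        counts[arm] += 1
        sums[arm] += reward
    return counts

rng = np.random.default_rng(1)
counts = ucb1([0.2, 0.5, 0.8], horizon=2000, rng=rng)  # pulls per arm
```

The multi-agent versions the project targets replace the fixed arm means with rewards that depend on how many agents pull the same arm (scarcity and congestion), which is what makes the game-theoretic analysis necessary.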


Online matching

We intend to devise matching processes for agents within a dynamic exchange network. The core challenges are integrating relevant local structures, improving algorithmic performance with ML-driven oracles, and building private and fair matchings.
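As a baseline for the online setting, here is a minimal sketch of greedy online bipartite matching, the classical starting point that ML-driven oracles aim to improve on; the node names are illustrative:

```python
def greedy_online_matching(offline_nodes, arrivals):
    """Greedy online bipartite matching: each arriving node is matched
    to the first still-free compatible offline node. This simple rule is
    1/2-competitive against the offline optimum."""
    free = set(offline_nodes)
    matching = {}
    for node, neighbours in arrivals:
        for u in neighbours:
            if u in free:
                matching[node] = u
                free.remove(u)
                break
    return matching

# Three servers; clients arrive one by one with their compatible servers.
arrivals = [("c1", ["s1", "s2"]), ("c2", ["s1"]), ("c3", ["s2", "s3"])]
m = greedy_online_matching(["s1", "s2", "s3"], arrivals)
```

Here greedy matches c1 to s1, leaving c2 unmatched, which illustrates the irrevocability that makes the online problem hard and leaves room for learned oracles to do better.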
