Social foundations for statistics and machine learning

My research agenda is to reexamine the foundations of statistics and artificial intelligence (AI), drawing on economic theory and mechanism design. Such revised foundations will inform efforts to regulate AI, to use AI in socially beneficial ways, and to reform empirical research methods and institutions.

The current decision-theoretic foundations of statistics and machine learning are insufficient for addressing some of the key challenges facing science and society today. First, there are pressing concerns about the social impact of artificial intelligence and machine learning, regarding issues such as fairness, inequality, and value alignment. Single-agent decision theory cannot conceptualize the underlying conflicts of interest between different agents, or the value alignment issues that result from divergent objectives. Second, there is a perceived replication crisis in empirical research, which might be due to p-hacking or publication bias. This crisis has motivated proposed solutions such as pre-registration of statistical analyses and reforms of the publication system. Single-agent statistical decision theory again cannot make sense of these problems and solutions, as it does not allow for conflicts of interest between different parties, private information, or dynamic inconsistency.

The research I will undertake aims to address these issues by providing new foundational frameworks for AI and machine learning (as well as statistics and econometrics) that go beyond single-agent decision theory. The new frameworks will explicitly allow for multiple parties with divergent interests and private information, drawing on the techniques of mechanism design. They will take into account a social environment that is characterized by conflicting interests and values, unobservables, and inequality. They will formalize insights from the history and sociology of science and technology. The proposed projects augment the framework of optimal statistical decision-making with constraints of implementability in social settings. With such augmented frameworks, we can then give normative methodological recommendations for statistics (conceived as loss function minimization) and artificial intelligence (conceived as autonomous maximization of a stream of rewards).

This research project is supported by the Alfred P. Sloan Foundation (grant G-2022-19434).