Executive Summary

The project aims to design efficient algorithms for learning probabilistic graphical models (PGMs), specifically Bayesian networks, Ising models, and Gaussian graphical models, from independent random samples. These models are widely used across scientific disciplines, including medicine, psychology, biology, genomics, computer science, and social science. Although heuristic algorithms exist, algorithms grounded in rigorous theoretical analysis are lacking. The project aims to fill this gap by designing such algorithms, proving theoretical guarantees for them, and implementing and applying them in real-life settings.

The sample space grows exponentially in the number of dimensions, and in general a PGM can encode any distribution over this large space, so learning such a model would require a prohibitive number of samples. In many applications, however, the dimensions depend on one another only in a limited manner, and information-theoretic arguments have shown that this limited dependence yields an exponential reduction in the sample complexity of learning these models, making the problem statistically tractable. The computational aspects of learning these models have been largely overlooked, and computationally efficient procedures for learning sparse models have only recently been investigated.

The main goal of the project is to design efficient algorithms for learning PGMs under sparsity assumptions. The project will be conducted over the next three years, with the aim of showing that the algorithms recover the underlying dependency graph from random samples and learn its parameters, demonstrating good empirical performance, and resolving certain related mathematical problems.
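To make the structure-recovery goal concrete, here is a minimal toy sketch, not the project's proposed algorithm: for a sparse Gaussian graphical model, the dependency graph appears in the support of the precision (inverse covariance) matrix, so one naive estimator thresholds the entries of the inverted sample covariance. The chain graph, sample size (5000), and threshold (0.15) are all hypothetical choices for illustration.

```python
# Illustrative sketch (hypothetical parameters, not the project's algorithm):
# recover the edges of a sparse Gaussian graphical model by thresholding
# the estimated precision matrix.
import numpy as np

rng = np.random.default_rng(0)
d = 6

# True model: a chain graph 0-1-2-3-4-5 encoded in a sparse precision matrix.
theta = np.eye(d)
for i in range(d - 1):
    theta[i, i + 1] = theta[i + 1, i] = -0.3  # adjacent variables interact

# Draw i.i.d. samples from N(0, theta^{-1}).
cov = np.linalg.inv(theta)
X = rng.multivariate_normal(np.zeros(d), cov, size=5000)

# Estimate the precision matrix and keep off-diagonal entries above a threshold.
theta_hat = np.linalg.inv(np.cov(X, rowvar=False))
edges = {(i, j) for i in range(d) for j in range(i + 1, d)
         if abs(theta_hat[i, j]) > 0.15}

print(sorted(edges))  # expected to recover the chain's edges
```

This naive estimator already illustrates why sparsity helps, though it scales poorly; methods with provable guarantees under sparsity, such as neighborhood regression or the graphical lasso, are the kind of procedure the project studies.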