Title: Identifying and Characterizing Adaptive Regulatory Variation in Diverse Human Populations
Abstract: Recently expanded efforts to catalogue genomic diversity within and among human populations offer new opportunities for understanding human history and biology. Adaptive evolution provides a powerful framework for interpreting these data, because natural selection acts on phenotypes that affect long-term reproduction and survival. In this thesis, we present a survey of the 1000 Genomes Phase 3 dataset (1KG) for signals of positive natural selection among diverse human populations within the past hundred thousand years. To enable this analysis, we first characterize background patterns of variation in the dataset and fit neutral demographic models to our query populations. Using these demographic models, we implement a generative procedure to run coalescent simulations under a broad range of selection scenarios realistic for humans. This simulated dataset grounds a Bayesian analysis that implicates 678 genomic regions in 1KG as likely targets of positive selection (median length: 94.5 kb). Next, we use our simulated data to train a convolutional neural network to distinguish the adaptive causal variant from linked neighbors. The resulting method, DeepSweep, localizes selection signals to tractable sets of candidate variants for functional scrutiny, outperforms available methods, and is robust to demographic misspecification. Finally, we apply DeepSweep to our putative selected regions and functionally characterize top-scoring variants through (i) enrichment analyses, (ii) variant effect prediction, and (iii) direct tests of enhancer activity using a massively parallel reporter assay. Taken together, our results support a strong role for regulatory variation in driving local human adaptation. Moreover, the recurrent association of candidate selected regions and variants with autoimmune disease phenotypes suggests that natural selection in response to infectious disease plays an ongoing role in shaping susceptibility to chronic conditions.
Committee: Pardis Sabeti (Advisor), Terence Capellini (HEB), Hopi Hoekstra, John Wakeley