PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Common Sequence Polymorphisms Shaping Genetic Diversity in Arabidopsis thaliana
R Clark, Gabriele Schweikert, C Toomajian, S Ossowski, G Zeller, P Shinn, N Warthmann, T Hu, G Fu, D Hinds, K Frazer, D Huson, Bernhard Schölkopf, M Nordborg, Gunnar Raetsch, J Ecker and D Weigel
Science Volume 317, Number 5836, pp. 338-342, 2007.


The genomes of individuals from the same species vary in sequence as a result of different evolutionary processes. To examine the patterns of, and the forces shaping, sequence variation in Arabidopsis thaliana, we performed high-density array resequencing of 20 diverse strains (accessions). More than 1 million nonredundant single-nucleotide polymorphisms (SNPs) were identified at moderate false discovery rates (FDRs), and 4% of the genome was identified as being highly dissimilar or deleted relative to the reference genome sequence. Patterns of polymorphism are highly nonrandom among gene families, with genes mediating interaction with the biotic environment having exceptional polymorphism levels. At the chromosomal scale, regional variation in polymorphism was readily apparent. A scan for recent selective sweeps revealed several candidate regions, including a notable example in which almost all variation was removed in a 500-kilobase window. Analyzing the polymorphisms we describe in larger sets of accessions will enable a detailed understanding of forces shaping population-wide sequence variation in A. thaliana.

EPrint Type: Article
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Multimodal Integration
ID Code: 3030
Deposited By: Gunnar Raetsch
Deposited On: 02 September 2007