The genetic diversity of plants has traditionally been used to improve crops to suit human needs, and in the future it will be needed to feed a growing population and to protect crops from environmental stresses and climate change. Genome-wide sequencing is now a reality, and sequence variation can be associated with crop traits for use in high-throughput marker-based selection. This study describes a strategy that compares next-generation sequencing (NGS) data from the rice genome to the high-quality reference genome and identifies functional polymorphisms within genes that might alter gene function; these polymorphisms can then be correlated with traits and employed in genetic mapping.
We analyzed NGS data of Oryza sativa ssp. indica cv. G4, covering 241 Mb at ~20X coverage, and compared them to the reference genome of Oryza sativa ssp. japonica to describe the genome-wide distribution of gene-based single nucleotide polymorphisms (SNPs). The analysis shows that the covered 63% of the genome contains 1.6 million SNPs (6.9 SNPs/kb) and includes 80,146 insertions and 92,655 deletions (INDELs). Of a total of 535,537 SNPs in genic regions (including intragenic regions), 295,136 were in intronic/noncoding regions, 195,098 in coding regions, and 45,303 within intragenic regions (between exons). SNP variation was found in 40,761 gene loci and comprised 75,262 synonymous and 119,836 non-synonymous changes, 22,686 SNPs in three-prime (3') UTRs and 23,242 in five-prime (5') UTRs, and functional reading-frame changes through 3,886 STOP-codon-inducing SNPs (isSNPs) and 729 STOP-codon-preventing SNPs (psSNPs). There are 194 rapidly evolving high-SNP hotspot genes (>100 SNPs/gene), and 1,513 of 2,458 transcription factors carry a total of 2,294 non-synonymous SNPs, which can be a major source of phenotypic diversity within the species. We created a SNP2GENE database from this analysis at UofA. We envision that this strategy will be useful for identifying genes underlying crop traits and for molecular breeding of rice cultivars.
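The coding-SNP categories reported above (synonymous, non-synonymous, isSNP, psSNP) can be illustrated with a minimal sketch. This is not the pipeline used in the study: the function name `classify_coding_snp` and its arguments are hypothetical, the standard genetic code is assumed, and the sketch treats stop-codon changes as their own classes, whereas the text does not state whether the study counts them within the non-synonymous total.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = STOP).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_coding_snp(ref_codon: str, pos: int, alt_base: str) -> str:
    """Classify a single-base change inside one codon (hypothetical helper).

    ref_codon: reference codon on the coding strand, e.g. "TAC"
    pos:       0-based position of the SNP within the codon (0, 1 or 2)
    alt_base:  alternative base observed in the resequenced cultivar
    """
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:pos] + alt_base.upper() + ref_codon[pos + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]

    if ref_aa == alt_aa:
        return "synonymous"      # encoded residue (or STOP) is unchanged
    if alt_aa == "*":
        return "isSNP"           # SNP induces a premature STOP codon
    if ref_aa == "*":
        return "psSNP"           # SNP removes (prevents) a STOP codon
    return "non-synonymous"      # amino acid substitution


if __name__ == "__main__":
    print(classify_coding_snp("CTT", 2, "C"))  # Leu -> Leu
    print(classify_coding_snp("GAA", 1, "T"))  # Glu -> Val
    print(classify_coding_snp("TAC", 2, "A"))  # Tyr -> STOP
    print(classify_coding_snp("TAA", 2, "C"))  # STOP -> Tyr
```

In practice each SNP would first have to be mapped to a codon using the gene model annotation (strand, exon boundaries, reading frame); that step is omitted here.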