Journal of Plant Biotechnology 2016; 43(1): 1-11
Published online March 31, 2016
https://doi.org/10.5010/JPB.2016.43.1.1
© The Korean Society of Plant Biotechnology
Correspondence to: e-mail: kykang@hknu.ac.kr, yuyu1216@hknu.ac.kr
Single nucleotide polymorphisms (SNPs) are an abundant form of genetic variation among individuals of a species. DNA polymorphisms can arise throughout the whole genome, at different frequencies in different species. SNPs may cause phenotypic diversity among individuals, such as differences in plant or fruit color, fruit size, ripening, flowering time adaptation, crop quality, grain yield, or tolerance to various abiotic and biotic factors. A SNP may change an amino acid in the exon of a gene (nonsynonymous), or it may be silent (present in the coding region but synonymous). It may also occur in a noncoding region without having any effect. SNPs may also influence promoter activity and gene expression, and thereby the production of functional protein. Therefore, identifying functional SNPs in genes and analyzing their effects on phenotype may lead to a better understanding of their impact on gene function for varietal improvement. In this mini-review, we focus on evidence revealing the role of functional SNPs in genes and their phenotypic effects for the purpose of crop improvement.
Keywords: Functional SNPs, Genetic diversity, Phenotypic variation, Biotic and abiotic stresses
Crop plants are very important for human beings; therefore, different strategies are used to improve them in accordance with current demands. Among these strategies, plant breeding is a natural way of developing varieties. During breeding programs, many genetic variations arise that correspond to phenotypes such as crop quality, grain yield, plant or fruit color, fruit size, and tolerance to various biotic and abiotic stresses (Vidal et al. 2012; Jang et al. 2015). Genetic diversity is also generated in crop species through domestication of the same species in different geographical regions. The most common form of genomic variation among individuals is single nucleotide variation. Analyzing DNA variation by sequencing a target gene that regulates a phenotype is a good way to identify causal genes for traits. Recent advances in sequencing technology give plant breeders a great opportunity to survey genetic diversity in different breeding populations, especially to discover functional single nucleotide polymorphisms (SNPs) in causal genes and to develop SNP markers associated with diverse agronomic traits in crops (Vidal et al. 2012). Most crop plants have high nutritional value and provide particular nutrients that are important for maintaining a healthy human body. These nutrients may vary widely depending on growing conditions, varieties, and mutations in functional genes (Schreiber et al. 2014).
The genomes of many crop plants have already been sequenced, which was a major milestone for plant research (Huq et al. 2016). A reference genome sequence is essential for measuring genetic polymorphisms among individuals of the same species. A large amount of resequencing data is now available for identifying sequence diversity within crop species such as rice, potato, tomato, and maize (Causse et al. 2013; Chen et al. 2014; Xu et al. 2014; Chung et al. 2014). These data suggest that during domestication, mutation, multiplication, selective breeding, and exchange of cultivars, a huge number of polymorphisms were spontaneously or artificially generated in the genomes of different individuals of the same species. These changes can alter the functions of important genes and ultimately produce phenotypic variation in plants (Vidal et al. 2012; Shi et al. 2015; Shirasawa et al. 2016). SNPs are the most abundant DNA polymorphisms in genome sequences and are thought to play a major role in the induction of phenotypic variation. There are many reports of gene-specific or genome-wide discovery of functional SNPs associated with different phenotypic changes in different breeding varieties or lines (Kharabian-Masouleh et al. 2012; Kumar et al. 2014; Jang et al. 2015). In this paper, we focus on evidence revealing the role of functional SNPs in genes and their phenotypic effects for crop improvement.
A SNP is a variation at a single position in the DNA sequence among individuals of the same species. In short, a SNP is a polymorphism in which DNA samples differ at a single base. SNPs are the most common DNA polymorphisms in the genome sequences of humans, animals, and plants, and they are thought to play a major role in the induction of phenotypic variation. According to the International SNP Map Working Group, the human genome sequence contains 1.42 million SNPs, an average of one SNP per 1.9 kb (Sachidanandam et al. 2001). In plants, SNPs are also found at high density across the genome (Ching et al. 2002). In the Nipponbare rice genome, 0.64 SNPs were found per kb (Jeong et al. 2013), while in tomato an average of 6.1 SNPs per kb was observed across the whole genome (Kim et al. 2014).
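The densities quoted above are reported in two different forms ("one SNP per N kb" and "N SNPs per kb"), which makes them awkward to compare at a glance. The following is a minimal sketch of the conversion; the function names are illustrative and do not come from any of the cited studies, and the figures come from different studies with different sampling, so the comparison is only indicative.

```python
def spacing_kb_from_density(snps_per_kb: float) -> float:
    """Average distance between SNPs (in kb) for a given density in SNPs per kb."""
    return 1.0 / snps_per_kb


def density_from_spacing_kb(kb_per_snp: float) -> float:
    """Density in SNPs per kb for a given average spacing of one SNP per N kb."""
    return 1.0 / kb_per_snp


def implied_span_kb(snp_count: int, kb_per_snp: float) -> float:
    """Sequence length (in kb) implied by a SNP count and an average spacing."""
    return snp_count * kb_per_snp


if __name__ == "__main__":
    # Values as quoted in the text above.
    print("rice spacing (kb):", spacing_kb_from_density(0.64))
    print("tomato spacing (kb):", spacing_kb_from_density(6.1))
    print("human density (SNPs/kb):", density_from_spacing_kb(1.9))
    print("human implied span (kb):", implied_span_kb(1_420_000, 1.9))
```

On these figures, 0.64 SNPs per kb corresponds to roughly one SNP every 1.56 kb, and one SNP per 1.9 kb corresponds to roughly 0.53 SNPs per kb.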
Different DNA markers are widely used for analysis of plant genetic diversity, evolutionary studies, and association mapping, as well as diagnostics, fingerprinting, and breeding applications. Among all DNA markers, SNPs are the most abundant and robust, are amenable to automated high-throughput genotyping, and can be assayed on multiple technology platforms to meet the demands of genetic studies and molecular breeding in crop plants (Steemers and Gunderson 2007; Alkan and Eichler 2011). In recent years, SNPs have gained much interest in the scientific and breeding communities as potential genetic markers that may be identified in virtually every gene (Rafalski 2002). SNPs can also be used to characterize the genomic diversity of species, to study speciation and evolution, and to associate genomic variation with phenotypic traits (McNally et al. 2009). The major applications of SNPs are described briefly below.
A genetic map describes the arrangement of genes, the locus of each gene, and the distances between genes. Genetic maps are essential tools in plant breeding for genetic improvement because they make it possible to locate genes and quantitative trait loci (QTL), and they are also crucial for genome sequence assembly, comparative genomic analysis, and map-based cloning. The biallelic nature of SNPs, their high abundance and uniform distribution in the genome, and their cost effectiveness (Ganal et al. 2009) make them ideal markers for constructing new genetic maps compared with other genetic markers, which are often multiallelic (Kruglyak 1997). Therefore, SNP-based genetic maps have been developed in many economically important agricultural species such as cucumber (Wei et al. 2014), rice (Xie et al. 2010), maize (Buckler et al. 2009), apple (Sun et al. 2015), soybean (Akond et al. 2013), cotton (Byers et al. 2012),
SNPs can be used for evolutionary studies of genomes, which can reveal population history and how breeding systems and selection affect variation at the genetic level. SNPs are generally used to study sequence variation among species; because such variation is present at all levels of evolution, SNPs can provide an understanding of how modern genomes have evolved. The markers commonly used for evolutionary studies are simple sequence repeats (SSRs) and mitochondrial DNA, which may be misinterpreted because of homoplasy (Morin et al. 2004). This problem can be avoided by using SNP markers, which represent single-base substitutions (Vignal et al. 2002). SNPs have already been used successfully to study the evolution of genes such as WAG-2 (wheat AG-2) in wheat (Wei et al. 2011).
A large number of techniques have been developed for identifying SNPs in plants. The choice of technique depends on cost, time, availability, and reliability. Many reports describe the different methodologies for SNP genotyping (Gut 2001; Kumar et al. 2012). Of these methodologies, direct DNA sequencing is considered the most widely used and the most useful for SNP identification.
Sequencing-based techniques were first introduced in 1977 with the Sanger method, which depends on a combination of deoxynucleotides and labeled dideoxynucleotide chain terminators (Sanger et al. 1977a). In the same year, the first complete genome, that of bacteriophage phi X174, was sequenced by this method (Sanger et al. 1977b). In the last decade, however, several next-generation sequencing (NGS) technologies (Roche/454, Illumina, SOLiD) have outperformed Sanger-based sequencing in throughput and overall cost (Kircher and Kelso 2010). With a throughput of hundreds of millions to several billion bases per run, NGS can identify many SNPs in a species at much lower cost and in a short time (Mardis 2007). Identification of SNPs using NGS has been reported in different plants such as
More recently, genotyping by sequencing (GBS), a method for SNP genotyping on the Illumina NGS platform, was developed in 2011 to reduce the cost of DNA sequencing (Elshire et al. 2011). GBS is a sequencing-by-synthesis strategy. It is becoming an increasingly important, effective, and unique tool for SNP identification in plant species because of its low cost, reduced sample handling, lack of size fractionation, fewer PCR and purification steps, independence from a reference sequence, efficient barcoding, and ease of scale-up (Davey et al. 2011). A schematic representation of GBS technology for SNP discovery in plants is shown in Figure 1. GBS is an ideal method for SNP genotyping in plants, from single-gene markers to whole-genome profiling (Poland and Rife 2012). A GBS experiment requires isolation of genomic DNA from plant material, quantification and normalization, digestion with an appropriate restriction enzyme, and ligation of adapters to both ends of the digested DNA, with a barcode (BC) region in adapter 1, followed by PCR amplification and sequencing. Finally, bioinformatic analysis of the sequencing data is carried out to identify SNPs (Fig. 2). Compared with other methods, GBS is considerably less complicated: fragmentation and adapter ligation are more straightforward, genomic DNA is digested in a single well, and fewer DNA purification steps are needed. Moreover, GBS avoids size separation of fragments, which reduces sample handling and ultimately makes the method cost effective. The low cost of GBS makes it a powerful tool for SNP genotyping in a variety of crop species and populations as well as other plants. GBS has been shown to be a valid tool for genomic diversity studies (Fu and Peterson 2011; Lu et al. 2013; Fu et al. 2014) and has proved itself an excellent system for SNP identification in plant breeding programs, even in the absence of a reference genome sequence or any previous information about DNA polymorphism. An available reference genome makes data analysis and SNP identification easier, but it is not essential for GBS, which is a great advantage for plant breeders in crop improvement programs. Many reports have already been published on the use of GBS for genetic analysis, marker development, and high-throughput SNP genotyping of various crops such as rice, wheat, yellow mustard, rapeseed, lupin, lettuce, switchgrass, soybean, and maize (Poland et al. 2012; Fu et al. 2014; Spindel et al. 2013; Truong et al. 2012; Lu et al. 2013; Sonah et al. 2013).
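Because GBS pools many barcoded samples in a single sequencing run, the first bioinformatic step is to sort the reads back into samples by the barcode carried in adapter 1. The sketch below illustrates that step only, under simplifying assumptions: the barcodes and sample names are hypothetical, barcodes must match exactly, and no barcode is a prefix of another. It is not the pipeline used in any of the cited studies.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Assign each read to a sample by its leading barcode and trim the barcode.

    reads    -- iterable of read sequences (strings)
    barcodes -- dict mapping barcode sequence -> sample name (hypothetical values)
    Reads whose start matches no barcode are collected under "unassigned".
    """
    by_sample = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                # Keep only the genomic part of the read.
                by_sample[sample].append(read[len(barcode):])
                break
        else:
            by_sample["unassigned"].append(read)
    return dict(by_sample)


if __name__ == "__main__":
    example_barcodes = {"ACGT": "line_A", "TTGCA": "line_B"}  # illustrative only
    example_reads = ["ACGTCAGCAAA", "TTGCACAGCTTT", "GGGGCAGC"]
    for sample, seqs in demultiplex(example_reads, example_barcodes).items():
        print(sample, seqs)
```

Real GBS pipelines typically also tolerate sequencing errors in the barcode and check for the restriction-site remnant after it; both are omitted here for brevity.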
Fig. 1. Overview of SNP discovery in plants through the genotyping by sequencing (GBS) system
Fig. 2. Data analysis for SNP identification. Reads are aligned to a reference sequence to find differences between the reference genome and the newly sequenced genome. This concept is adapted from Kumar et al. (2014) with modification.
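The comparison of aligned reads against a reference can be illustrated with a toy pileup. This is a minimal sketch, assuming reads are already aligned without gaps and supplied as (0-based start position, sequence) pairs; the function name and the depth and allele-fraction thresholds are arbitrary illustrations, not values from the cited work, and production SNP callers use far more elaborate statistical models.

```python
from collections import Counter


def call_snps(reference, aligned_reads, min_depth=2, min_fraction=0.8):
    """Report positions where the majority base in the reads differs from the reference.

    reference     -- reference sequence (string)
    aligned_reads -- iterable of (start, sequence) pairs, ungapped, 0-based start
    Returns a list of (position, reference_base, alternate_base, depth) tuples.
    """
    # One base counter per reference position.
    pileup = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    snps = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to make a call
        allele, support = counts.most_common(1)[0]
        if allele != reference[pos] and support / depth >= min_fraction:
            snps.append((pos, reference[pos], allele, depth))
    return snps


if __name__ == "__main__":
    ref = "ACGTACGTAC"
    reads = [(0, "ACGTTCGT"), (2, "GTTCGTAC"), (4, "TCGTAC")]
    print(call_snps(ref, reads))
```

In this example all three reads carry a T at position 4, where the reference has an A, so that single position is reported.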
Rice is the staple food for more than half of the world’s population. The complete sequencing of the rice genome in 2002 using a bacterial artificial chromosome (BAC)-based approach was a major milestone for rice genomic research. The genome size was 389 Mb, approximately three times larger than the model plant
Wheat, whose genome size is around 17 Gb, is one of the top three staple grains in the world, along with rice and maize. The International Wheat Genome Sequencing Consortium released a chromosome-based draft genome sequence of hexaploid bread wheat in 2014 (The International Wheat Genome Sequencing Consortium 2014). Modern cultivated wheat, also known as bread wheat (
Maize is the most produced cereal crop in the world, and its whole genome was first sequenced in 2009. The maize genome is 2.3 Gb in size, with more than 32,000 predicted genes (Schnable et al. 2009). DNA sequence diversity in maize populations is higher than in humans. Tenaillon et al. (2001) measured sequence diversity at 21 loci distributed along chromosome 1 of maize. They obtained sequences from 25 inbred lines, and the data indicated that maize has an average of one SNP per 104 bases between two randomly sampled sequences, which was higher than in humans or
The entire genome of barley was first sequenced in 2012; the total genome size was around 5.1 Gb, containing 79,379 transcript clusters, including 26,159 high-confidence genes (Mayer et al. 2012). Xia et al. (2013) investigated SNPs in small heat shock protein 17.8 (
The reference genome sequence of soybean has been available since 2010, which makes it easy to identify DNA polymorphisms among soybean populations. The genome size is approximately 1.1 Gb, with 46,430 protein-coding genes (Schmutz et al. 2010). Lee et al. (2015) identified more than four million high-quality SNPs by resequencing 16 soybean accessions. Chung et al. (2014) obtained 3,871,469 high-quality SNPs by resequencing 10 cultivated and 6 wild soybean accessions and mapping the reads for each accession to the reference genome sequence. Genic regions contained 20.4% of these SNPs (788,809 SNPs), and the rest were located in intergenic regions. Jang et al. (2015) discovered a single nucleotide polymorphism in an
The Potato Genome Sequencing Consortium first released the entire genome sequence of potato, 850 Mb in size, in 2011. Hamilton et al. (2011) discovered 575,340 SNPs by sequencing normalized cDNA prepared from three commercial potato cultivars (Atlantic, Premier Russet, and Snowden). 230 SNPs were found in
The complete genome of tomato was sequenced and assembled by the Tomato Genome Consortium in 2012, which enables the identification of genome-wide SNPs; tomato is considered a model for genomic research in
The full genome sequences of many other crop plants have also been completed, such as grape (Velasco et al. 2007), cucumber (Huang et al. 2009), apple (Velasco et al. 2010), banana (Hont et al. 2012), oil palm (Singh et al. 2013), and eggplant (Hirakawa et al. 2014). These reference genome sequences help plant breeders discover SNPs among different cultivars or breeding lines, which facilitates the development and selection of improved crop varieties.
SNPs may influence promoter activity for gene expression, as well as transcriptional and translational efficiency (LeVan et al. 2001). Therefore, they may be responsible for phenotypic variation among individuals and can be exploited to improve agronomic traits. A gene consists of exons and introns. Introns are removed during post-transcriptional processing, whereas exons are translated into the amino acid sequence of a protein such as an enzyme. SNPs in exons (the coding region) are therefore the most important, because they can affect gene function. SNPs in the coding region are of two types, synonymous and nonsynonymous. Synonymous SNPs do not affect the amino acid sequence, whereas nonsynonymous SNPs change the amino acid sequence of the protein and may influence enzyme activity (Fig. 3). There are many reports on the effect of SNPs on gene function in different crop plants. Schreiber et al. (2014) identified SNPs in plastidic starch phosphorylase
Fig. 3. A schematic representation of the role of SNPs in gene function: a SNP can influence enzyme activity by changing an amino acid. Met, methionine; Ala, alanine; Ser, serine; Ile, isoleucine; Leu, leucine; Val, valine; Tyr, tyrosine; Arg, arginine; Gly, glycine; Glu, glutamic acid; Thr, threonine. This concept is adapted from Jang et al. (2015) with modification.
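The distinction between a silent coding SNP and one that changes an amino acid can be made concrete by translating the affected codon before and after the substitution. The sketch below assumes the standard genetic code, an in-frame coding sequence, and a 0-based SNP position; the function name and the example sequence are illustrative and are not taken from Jang et al. (2015).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(cds, position, alt_base):
    """Classify a single-base substitution in an in-frame coding sequence.

    cds      -- coding sequence (DNA, sense strand, starting at the first codon)
    position -- 0-based position of the SNP within cds
    alt_base -- the alternate base at that position
    Returns (kind, reference_amino_acid, alternate_amino_acid) using one-letter codes.
    """
    codon_start = position - position % 3
    ref_codon = cds[codon_start:codon_start + 3]
    offset = position - codon_start
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return kind, ref_aa, alt_aa


if __name__ == "__main__":
    cds = "ATGGCTTCA"  # Met-Ala-Ser, an illustrative sequence
    print(classify_coding_snp(cds, 5, "C"))  # GCT -> GCC, third codon position
    print(classify_coding_snp(cds, 3, "A"))  # GCT -> ACT, first codon position
```

Here the change at the third codon position leaves alanine unchanged, while the change at the first codon position replaces alanine with threonine. Whether such a replacement actually alters enzyme activity depends on the protein and the residue involved, which the code does not attempt to predict.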
Because SNPs can change amino acids and thereby affect enzyme activity, the study of functional SNPs is very important for crop improvement. It is important to know the location of a SNP in the genome, because a SNP in the coding region can strongly affect the activity and thermostability of the enzyme. The effect also depends on the position of the substituted amino acid, because some amino acids control enzyme activity. Recent technological advances make it easy to find functional SNPs in various breeding lines, and these could be used for crop improvement. The success stories indicate that SNPs in the functional parts of genes may control the response to biotic and abiotic stresses, and that they may be used to develop crop varieties tolerant to various abiotic and biotic stresses through modified enzyme activity.
This research was supported by the Golden Seed Project (Center for Horticultural Seed Development, No. 213003-04-4-SBC10), by a research grant of the iPET, Ministry of Food, Agriculture, Forestry and Fisheries, Republic of Korea.
Md. Amdadul Huq1, Shahina Akter1, Ill Sup Nou2, Hoy Taek Kim2, Yu Jin Jung3,*, and Kwon Kyoo Kang3,*
1Department of Horticulture, Hankyong National University, Ansung City, Gyeonggi-do, 17579, Republic of Korea,
2Department of Horticulture, Sunchon National University, 255, Jungang-ro, Suncheon, Jeonam-do, 57922, Korea,
3Department of Horticulture, Hankyong National University, Ansung City, Gyeonggi-do, 17579, Republic of Korea
Correspondence to:e-mail: kykang@hknu.ac.kr, yuyu1216@hknu.ac.kr
Single nucleotide polymorphism (SNP) is an abundant form of genetic variation within individuals of species. DNA polymorphism can arise throughout the whole genome at different frequencies in different species. SNP may cause phenotypic diversity among individuals, such as individuals with different color of plants or fruits, fruit size, ripening, flowering time adaptation, quality of crops, grain yields, or tolerance to various abiotic and biotic factors. SNP may result in changes in amino acids in the exon of a gene (asynonymous). SNP can also be silent (present in coding region but synonymous). It may simply occur in the noncoding regions without having any effect. SNP may influence the promoter activity for gene expression and finally produce functional protein through transcription. Therefore, the identification of functional SNP in genes and analysis of their effects on phenotype may lead to better understanding of their impact on gene function for varietal improvement. In this mini-review, we focused on evidences revealing the role of functional SNPs in genes and their phenotypic effects for the purpose of crop improvements.
Keywords: Functional SNPs, Genetic diversity, Phenotypic variation, Biotic and abiotic stresses
Crop plants are very important for human being, therefore different strategies are using for their improvement accordance to current demands. Among these strategies, plant breeding program is a natural way of variety development. During breeding programs, a lot of genetic variations are arisen, which are corresponding to the phenotypes; such as quality of crops, grain yields, different colors of plants or fruits, size of fruits, and tolerance to various biotic and abiotic stresses (Vidal et al. 2012; Jang et al. 2015). Genetic diversity is also generated in different crop species through domestication of the same species in different geographical regions. The most common form of genomic variation is single nucleotide variation in the genome within the individuals. Analysis of DNA variation through DNA sequencing of a target gene regulating phenotypes is a good way to identify causal genes for the traits. The recent advances in sequencing technology are giving great opportunity for plant breeders to find out genetic diversity in different breeding populations, especially for the discovery of functional SNP (single nucleotide polymorphism) in causal genes and development of SNP markers, which are associated with diverse agronomic traits in crops (Vidal et al. 2012). Most of the crop plants contain high nutritional value, which provides some particular nutrients that have high impact to maintain healthy human body. These nutrients may vary largely depending on growing conditions, varieties and mutations in functional genes (Schreiber et al. 2014).
Sequencing of many crop plant genomes is already completed, which was a major milestone for plant research (Huq et al. 2016). Reference genome sequence is essential for measuring genetic polymorphisms among individuals of same species. In order to identify the sequence diversity within crop species like rice, potato, tomato, maize, etc., a lot of resequencing data are now available (Causse et al. 2013; Chen et al. 2014; Xu et al. 2014; Chung et al. 2014). These data contributed to evidence suggesting that during process of domestication, mutation, multiplication, selection breeding and exchange of cultivars, a huge number of polymorphisms were spontaneously or artificially generated in the genome of different individuals of same species. These changes in genome can alter the functions of important genes and ultimately make the phenotypic variations in plants (Vidal et al. 2012; Shi et al. 2015; Shirasawa et al. 2016). The most abundant DNA polymorphisms in the genome sequences are SNPs and are thought to play a major role in the induction of phenotypic variations. There are many reports about the gene specific or genome-wide functional SNP discovery in different breeding varieties or lines, which are associated with different phenotypic changes (Kharabian-Masouleh et al. 2012; Kumar et al. 2014; Jang et al. 2015). In this paper, we focused on the evidences revealing the role of functional SNPs of genes and their phenotype effects for crop improvements.
SNP is a variation at a single position in DNA sequence among individuals of same species. In short, SNP is the polymorphism occurring within DNA samples with difference at single base. SNPs are the most common DNA polymorphisms in genome sequences of human, animals, and plants and they are thought to play a major role in the induction of phenotypic variations. According to international SNP map working group, human genome sequence contains 1.42 million SNPs and average one SNP per 1.9 kb (Sachidanandam et al. 2001). Also in plants, SNP polymorphisms are found in high density across the genome (Ching et al. 2002). In Nipponbare rice genome, 0.64 SNP was found per one kb (Jeong et al. 2013), while in tomato average 6.1 SNP per one kb was observed in the whole genome (Kim et al. 2014).
Different DNA markers are widely used for analysis of genetic diversity of plants, their evolutionary studies, association mapping as well as diagnostics, fingerprinting, and breeding applications. Among all DNA markers, SNPs are the most abundant and robust, feasible for automated high-throughput genotyping, and available for multiple assay options using different technology platforms to meet the demand for genetic studies and molecular breeding in crop plants (Steemers and Gunderson 2007; Alkan and Eichler 2011). In recent years, SNPs have gained much interest in the scientific and breeding community that could be used as potential genetic markers, which may be identified effectively in every gene (Rafalski 2002). SNPs also can identify the genomic diversity of species to demonstrate the speciation and evolution, and associate genomic variations with phenotypic traits (McNally et al. 2009). The major applications of SNP are described shortly.
Genetic map refers to the arrangement of genes, identification of the locus of a gene and measurement of distances between genes. Construction of genetic maps are essential tools in plant breeding for genetic improvement as they are able to identify the gene location and quantitative trait loci (QTL), as well as crucial tools for genome sequence assembly and comparative genomic analysis and map based cloning. Biallelic nature of SNP, their high abundance in genome, uniform genome distribution and cost effectiveness (Ganal et al. 2009) make them an ideal marker for constructing new genetic maps compared to other genetic markers, which are often multiallelic (Kruglyak 1997). Therefore, SNP-based genetic maps have been developed in many economically important agricultural species such as cucumber (Wei et al. 2014), rice (Xie et al. 2010), maize (Buckler et al. 2009), apple (Sun et al. 2015), soybean (Akond et al. 2013), cotton (Byers et al. 2012),
SNPs can be used for evolutionary studies of genome that can reveal about population history, how breeding system and selection affect variation at genetic level. Because, generally SNP is used for study of sequence variation among species and such type of variations are present at all levels of evolution and ultimately SNP can provide an understanding of how modern genome has evolved. The commonly used markers for evolutionary studies are SSRs (simple sequence repeats) and mitochondrial DNA which may be misinterpreted due to homoplasy (Morin et al. 2004). It is possible to avoid this problem by using SNP markers that represent single base nucleotide substitutions (Vignal et al. 2002). Many successful reports are already published about the use of SNPs to study the evolution of genes such as WAG-2 (wheat AG-2) in wheat (Wei et al. 2011).
A large number of techniques have been developed for the identification of SNP polymorphisms in plants. Selection of the technique depends on the cost, time, availability, reliability factors. There are many reports that described the different methodologies of SNP genotyping (Gut 2001; Kumar et al. 2012). From all of these methodologies, direct DNA sequencing technologies are considered as the most used and benefited for SNP identification.
Sequencing-based techniques were first invented at 1977 through Sanger method which depends on a combination of deoxy- and dideoxy-labeled chain terminator nucleotides (Sanger et al. 1977a). In the same year, the first complete genome of bacteriophage phi X174 was sequenced by this method (Sanger et al. 1977b). But in the last decade, several NGS (next generation sequencing) technologies (Roche/454, Illumina, SOLiD) have outperformed Sanger-based sequencing in throughput and overall cost (Kircher and Kelso 2010). With a throughput of hundreds of millions to several billions of bases per run, NGS are able to identify many SNPs in a species at much lower cost in a short time (Mardis 2007). Identification of SNP using NGS is reported in different plants such as
Most recently a new method has been derived for SNP genotyping using illumina NGS platform to reduce the cost for DNA sequencing, is known as GBS which was developed in 2011 (Elshire et al. 2011). GBS is a sequencing by synthesis strategy. GBS system is becoming increasingly important, effective and unique tool for SNP identification in plant species because of its low cost, reduced sample handling, no size fractionation, fewer PCR and purification steps, no reference sequence limits, efficient barcoding and easiness to scale up (Davey et al. 2011). A schematic representation of GBS technology for SNP discovery from plants was shown in Figure 1. GBS is an ideal method for SNP genotyping in plants from single gene markers to whole genome profiling (Poland and Rife 2012). GBS experiments were needed to do isolation of genomic DNA from plant materials, then quantification and normalization, digestion with appropriate restriction enzyme, then ligate the adapter at both end of digested DNA with a bar coding (BC) region in adapter 1, following PCR amplification and sequencing. Finally, bioinformatic analysis of sequencing data is carried out and find out the SNPs (Fig. 2). Compared to other methods, GBS is a considerably less complicated, fragmentation and ligation of appropriate adapters are more straightforward, single-well digestion of genomic DNA, and fewer DNA purification steps make it easy. Moreover, GBS method avoids the separation step of fragments by size resulting in reduced sample handling and ultimately become cost effective. The low cost of GBS system makes it a powerful tool for SNP genotyping in a variety of crop species and populations as well as other plants. GBS has been shown as a valid tool for genomic diversity studies (Fu and Peterson 2011; Lu et al. 2013; Fu et al. 
2014), which is already able to prove itself as an excellent system for SNP identification in plant breeding programs even in the absence of reference genome sequences or without any previous information about DNA polymorphism. Available reference genome makes easy to data analysis and identification of SNPs, but it is not essential in GBS system, which is a great advantage to plant breeders for crop improvement programs. Many reports already published about the use of GBS system for genetic analysis, marker development and high throughput SNP genotyping of various crops such as rice, wheat, yellow mustard, rapeseed, lupin, lettuce, switchgrass, soybean, maize, etc. (Poland et al. 2012; Fu et al. 2014; Spindel et al. 2013; Truong et al. 2012; Lu et al. 2013; Sonah et al. 2013).
Overview of SNP discovery in plants through genotyping by sequencing (GBS) system
Data analysis for SNP identification. Reads are aligned to reference sequence to find differences between the reference genome and newly sequenced genome. This concept is taken from Kumar et al. (2014) with modification
Rice is the main food for more than half of the world’s population. The complete genome sequencing of rice in 2002 using bacterial artificial chromosomes (BAC) based approach was a major milestone for rice genomic research. In which genome size was 389 Mb, approximately three times larger than the model plant
Wheat is one of the top three staple grains in the world, along with rice and maize whose genome size is around 17 Gb. The international wheat genome sequencing consortium revealed a chromosome-based draft genome sequence of hexaploid bread wheat in 2014 (The International Wheat Genome Sequencing Consortium 2014). The modern cultivated wheat also known as bread wheat (
Maize is the most produced cereal crop in the world which whole genome was first sequenced at 2009. The genome size of maize is 2.3 Gb with more than predicted 32,000 genes (Schnable et al. 2009). DNA sequence diversity in maize populations is more than human. Tenaillon et al. (2001) measured the sequence diversity in 21 loci distributed along chromosome 1 of maize. They sequenced from 25 inbred lines and data indicated that the maize has an average one SNP per 104 bases between two randomly sampled sequences that was higher than human or
The entire genome of barley was first sequenced at 2012 and the total genome size was around 5.1 Gb, containing 79,379 transcript clusters, including 26,159 high-confidence genes (Mayer et al. 2012). Xia et al. (2013) investigated SNPs in small heat shock protein 17.8 (
The reference genome sequence of soybean is available from 2010 which make it easy to identify the DNA polymorphisms among soybean populations. The genome size is approximately 1.1 Gb with 46,430 protein coding genes (Schmutz et al. 2010). Lee et al. (2015) identified more than four millions high quality SNPs by resequencing 16 soybean accessions. Chung et al. (2014) obtained 3,871,469 high quality SNPs by resequencing of 10 cultivated and 6 wild soybean accessions after mapping reads for each accession to the reference genome sequence. Genic regions contain 20.4% (788,809 SNPs) SNPs and rest of the SNPs were located in the intergenic regions. Jang et al. (2015) discovered a single nucleotide polymorphism in an
The Potato Genome Sequencing Consortium first reported the entire genome sequence of potato, 850 Mb in size, in 2011. Hamilton et al. (2011) discovered 575,340 SNPs by sequencing normalized cDNA prepared from three commercial potato cultivars (Atlantic, Premier Russet, and Snowden). A total of 230 SNPs were found in
The complete genome of tomato was sequenced and assembled by the Tomato Genome Consortium in 2012, which enables the identification of genome-wide SNPs. Tomato is considered a model for genomic research in
The full genome sequences of many other crop plants have also been completed, including grape (Velasco et al. 2007), cucumber (Huang et al. 2009), apple (Velasco et al. 2010), banana (Hont et al. 2012), oil palm (Singh et al. 2013), and eggplant (Hirakawa et al. 2014). These reference genome sequences help plant breeders to discover SNPs among different cultivars or breeding lines, which facilitates the development and selection of improved crop varieties.
SNPs may influence promoter activity, and thus gene expression, as well as transcriptional and translational efficiency (LeVan et al. 2001). Therefore, they may be responsible for phenotypic variation among individuals, which can be exploited to improve agronomic traits. A gene contains two kinds of regions, exons and introns. Introns are removed during post-transcriptional processing, whereas the coding portions of exons are translated into the amino acid sequence of a protein, such as an enzyme. SNPs in the coding region are therefore especially important because they can directly affect gene function. SNPs in the coding region are of two types, synonymous and nonsynonymous. Synonymous SNPs do not change the amino acid sequence, whereas nonsynonymous SNPs change the amino acid sequence of the protein and may influence enzyme activity (Fig. 3). There are many reports on the effect of SNPs on gene function in different crop plants. Schreiber et al. (2014) identified SNPs in plastidic starch phosphorylase
A schematic representation of the role of SNPs in gene function, showing how an SNP can influence enzyme activity by changing an amino acid. Met, Methionine; Ala, Alanine; Ser, Serine; Ile, Isoleucine; Leu, Leucine; Val, Valine; Tyr, Tyrosine; Arg, Arginine; Gly, Glycine; Glu, Glutamic acid; Thr, Threonine. Adapted from Jang et al. (2015) with modification.
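As a minimal illustration of the distinction between synonymous and nonsynonymous SNPs, the following Python sketch classifies a single-base substitution in a coding sequence using the standard genetic code. The function name classify_snp, the toy sequence, and the 0-based position convention are assumptions made for this example and are not taken from the studies cited here. Substitutions that create a stop codon are simply counted as nonsynonymous.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds, pos, alt):
    """Classify a substitution at 0-based position `pos` of an in-frame CDS.

    Returns (reference amino acid, altered amino acid, effect), using
    one-letter amino acid codes and "*" for a stop codon.
    Hypothetical helper: assumes the sequence contains only A/C/G/T.
    """
    cds = cds.upper()
    alt = alt.upper()
    usable_length = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    if alt not in BASES:
        raise ValueError("alt must be one of A, C, G, T")
    if not 0 <= pos < usable_length:
        raise ValueError("pos must fall inside a complete codon")
    if cds[pos] == alt:
        raise ValueError("alt is identical to the reference base")

    offset = pos % 3                  # position of the SNP within its codon
    start = pos - offset              # first base of the affected codon
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]

    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    effect = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return ref_aa, alt_aa, effect


# Toy coding sequence: ATG GCT TCA encodes Met-Ala-Ser.
toy_cds = "ATGGCTTCA"
print(classify_snp(toy_cds, 5, "C"))  # GCT -> GCC: ('A', 'A', 'synonymous')
print(classify_snp(toy_cds, 4, "T"))  # GCT -> GTT: ('A', 'V', 'nonsynonymous')
```

In the toy sequence, a change at the third codon position leaves alanine unchanged, whereas a change at the second position replaces alanine with valine. Whether such a replacement actually alters enzyme activity has to be established experimentally.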
Because SNPs can change an amino acid and thereby affect enzyme activity, the study of functional SNPs is very important for crop improvement. It is important to know the location of an SNP in the genome, because an SNP in the coding region can strongly affect the activity and thermostability of the enzyme. The effect also depends on the position of the substituted amino acid, because some residues are critical for enzyme activity. Recent technological advances make it easier to identify functional SNPs in various breeding lines, which could be used for crop improvement. The success stories indicate that SNPs in the functional parts of a gene may influence the response to biotic and abiotic stresses, and that such SNPs may be used to develop stress-tolerant crop varieties by modifying enzyme activity.
This research was supported by the Golden Seed Project (Center for Horticultural Seed Development, No. 213003-04-4-SBC10), by a research grant of the iPET, Ministry of Food, Agriculture, Forestry and Fisheries, Republic of Korea.
Overview of SNP discovery in plants through the genotyping-by-sequencing (GBS) system.
Data analysis for SNP identification. Reads are aligned to the reference sequence to find differences between the reference genome and the newly sequenced genome. Adapted from Kumar et al. (2014) with modification.