J Plant Biotechnol 2020; 47(3): 209-217
Published online September 30, 2020
https://doi.org/10.5010/JPB.2020.47.3.209
© The Korean Society of Plant Biotechnology
Correspondence to: e-mail: stkim71@pusan.ac.kr
This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
The presence of high amounts of seed storage proteins (SSPs) improves the overall quality of soybean seeds. However, these SSPs pose a major limitation for proteome analysis owing to their high abundance in soybean seeds. Although various technical advancements, including mass spectrometry and bioinformatics resources, have been reported, only limited information has been obtained to date on soybean seeds at the proteome level. Here, we applied a tandem mass tags (TMT)-based quantitative proteomic analysis to identify the significantly modulated proteins in the seeds of two soybean cultivars with different protein contents. This approach led to the identification of 5,678 proteins, of which 13 and 1,133 proteins showed significant changes in Daewon (low-protein content cultivar) and Saedanbaek (high-protein content cultivar), respectively. Functional annotation revealed that proteins with increased abundance in Saedanbaek were mainly associated with amino acid and protein metabolism, including protein synthesis, folding, targeting, and degradation. Taken together, the results presented here provide a pipeline for soybean seed proteome analysis and contribute to a better understanding of the proteomic changes that may lead to alterations in the protein content of soybean seeds.
Keywords Glycine max, LC-MS/MS, Low-abundance proteins, Protamine sulfate precipitation, Seed storage proteins, Tandem mass tags
Soybean seeds (
The classical workflow of soybean seed proteomics, including two-dimensional gel electrophoresis (2-DGE), allowed the identification of only a few hundred significantly modulated proteins because of the presence of high-abundance proteins (HAPs), which account for 75% of total proteins in soybean seeds (Min et al. 2019). Recently, this limitation has been overcome by the development of a variety of methods for the pre-fractionation of total soybean proteins (Gupta et al. 2016) using protamine sulfate (PS) (Kim et al. 2013, 2015), calcium (Krishnan et al. 2009), and polyethylene glycol (PEG) (Kim et al. 2001). Moreover, advancements in liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based methodologies and in analytical software for downstream data processing, such as MaxQuant (Tyanova et al. 2016a), Perseus (Tyanova et al. 2016b), Proteome Discoverer (Thermo Fisher Scientific, Waltham, MA, USA), and Skyline (Maclean et al. 2010), have improved the sensitivity, reliability, and coverage of proteome analysis.
Although these methodological developments have contributed to the identification of thousands of proteins per MS run with complete proteome coverage (Boersema et al. 2015; Niu et al. 2018), soybean seed proteomics remains poorly explored because of the presence of SSPs and the predominant use of 2-DGE-based proteomics approaches (Min et al. 2019). Previously, Kim’s group carried out high-throughput proteome analysis of soybean seeds by shotgun proteomic approaches, including label-free (Min et al. 2017) and tandem mass tag (TMT) labeling quantitative analysis (Min et al. 2020b). In particular, a TMT-based quantitative analysis of the filling stages of soybean seeds identified 5,918 proteins, the highest number of proteins reported to date in soybean seeds (Min et al. 2020b). Moreover, the PS precipitation method was combined with a shotgun proteomics pipeline, in particular the TMT labeling approach, and comparison of the total, PS-supernatant (PS-S), and PS-pellet (PS-P) proteins revealed enrichment of various low-abundance proteins (LAPs) related to diverse seed metabolism (Min et al. 2019).
Here, we report a comparative seed proteome profiling of two soybean cultivars differing in protein content. Altogether, this study identified 1,146 differentially modulated proteins (13 and 1,133 proteins showed different abundance profiles), providing a list of potential protein candidates from two soybean cultivars differing in protein and oil contents.
Soybean seeds (Daewon and Saedanbaek) were sown in the experimental fields of the National Institute of Crop Science (NICS), Rural Development Administration (RDA), in Miryang, South Korea, in June. The soil was supplemented with a standard RDA N-P-K fertilizer (N-P-K=3-3-3.3 kg/10 acre). Seeds were harvested in October 2018 (average temperature, 23.5±3.5°C; average day length, 12 hours 17 min) (Min et al. 2016).
Total proteins from the seeds of the two soybean cultivars were isolated using the PS precipitation method combined with trichloroacetic acid (TCA)/acetone precipitation (Gupta et al. 2015; Kim et al. 2015). Briefly, for the PS precipitation method, one gram of each seed powder was homogenized with 10 mL of ice-cold Tris-Mg/NP-40 extraction buffer (0.5 M Tris-HCl, pH 8.3, 2% (v/v) NP-40, 20 mM MgCl₂) and centrifuged at 15,922
The obtained peptides were dissolved in solvent A (water/acetonitrile (ACN), 98:2 v/v; 0.1% formic acid) and separated by reversed-phase chromatography using a Dionex UltiMate® 3000 UHPLC instrument (Thermo Fisher Scientific, USA) (Pajarillo et al. 2015). For sample trapping, the UHPLC was equipped with an Acclaim PepMap 100 trap column (100 μm × 2 cm, nanoViper C18, 5 μm, 100 Å), which was washed with 98% solvent A for 6 min at a flow rate of 6 μL/min. The sample was then separated on an Acclaim PepMap 100 capillary column (75 μm × 15 cm, nanoViper C18, 3 μm, 100 Å) at a flow rate of 400 nL/min. The LC analytical gradient was run from 2% to 35% solvent B (100% ACN and 0.1% formic acid) over 90 min, then from 35% to 95% over 10 min, followed by 90% solvent B for 5 min, and finally 5% solvent B for 15 min. The LC system was coupled through an electrospray ionization source to a quadrupole-based QExactive™ Orbitrap High-Resolution Mass Spectrometer (Thermo Fisher Scientific, Waltham, MA, USA). The peptides were electrosprayed through a coated silica emitter tip (Scientific Instrument Service, Amwell Township, NJ, USA) at an ion spray voltage of 2,000 V. The MS spectra were acquired at a resolution of 70,000 (at 200 m/z) in a mass range of 350-1650 m/z. The automatic gain control (AGC) target value was 3 × 10⁶ and the isolation window for MS/MS was 1.2 m/z. MS/MS events (resolution of 35,000) were measured in a data-dependent mode for the 15 most abundant peaks (Top15 method) in the high mass accuracy Orbitrap after ion activation/dissociation with Higher Energy C-trap Dissociation (HCD) at a collision energy of 32 in a 100-1650 m/z mass range (Pajarillo et al. 2015). The AGC target value for MS/MS was 2 × 10⁵. The maximum ion injection times for the survey scan and MS/MS scan were 30 ms and 120 ms, respectively.
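The gradient program described above can be written down as a small table of segments, which makes the total run time and the solvent B percentage at any time point easy to check. The following is only an illustrative sketch: the names GRADIENT, total_minutes, and percent_b_at are hypothetical, the segment values are copied from the text as written (including the 90% hold), and linear interpolation within each segment is an assumption.

```python
# Gradient segments as stated in the text: (start %B, end %B, duration in min).
GRADIENT = [
    (2.0, 35.0, 90.0),   # shallow analytical ramp
    (35.0, 95.0, 10.0),  # steep ramp
    (90.0, 90.0, 5.0),   # hold, value taken as written in the text
    (5.0, 5.0, 15.0),    # final low-%B step
]


def total_minutes(segments):
    """Sum the durations of all gradient segments."""
    return sum(duration for _, _, duration in segments)


def percent_b_at(t, segments):
    """Return %B at time t (min), interpolating linearly inside a segment."""
    if t < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for start, end, duration in segments:
        if t <= elapsed + duration:
            fraction = (t - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    raise ValueError("time is beyond the end of the gradient")


if __name__ == "__main__":
    print(total_minutes(GRADIENT))       # 120.0
    print(percent_b_at(45.0, GRADIENT))  # 18.5, halfway through the first ramp
```

Under these assumptions the four segments add up to a 120 min method per injection.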
The acquired raw data were analyzed with the MaxQuant software (version 1.5.3.30) as described previously (Tyanova et al. 2016a; Gupta et al. 2018; Min et al. 2020b). All three technical replicates were cross-referenced against the Uniprot
To investigate the differential modulation of the soybean seed proteome in high- and low-protein cultivars, seed proteins were isolated from Daewon and Saedanbaek and subjected to the protamine sulfate precipitation method for depletion of major seed storage proteins (SSPs) (Kim et al. 2015). The SSP-depleted fractions, referred to as PS-S fractions, from the two cultivars (DS, Daewon PS-S fraction; SS, Saedanbaek PS-S fraction) were subjected to trypsin digestion by the filter-aided sample preparation (FASP) method and then to TMT-6plex labeling as reported previously (Min et al. 2020a, 2020b) (Fig. 1A). Subsequently, pre-fractionation by basic-pH reversed-phase (BPRP) chromatography using an in-house developed stage-tip was carried out to decrease the complexity of the multiplexed labeled sample mixtures (Han et al. 2014). This approach led to the identification of 51,278 peptides and 22,483 unique peptides matching 5,678 protein groups from three technical replicates of the TMT-labeled sample sets (Fig. 1A). In particular, TMT labeling combined with pre-fractionation improved the resolution and identification of proteins, as shown by 4,892 proteins (84.3%) in the present study, whereas a previous label-free study (Min et al. 2017) using the PS-S fraction of soybean seed proteins contributed a comparatively lower number of proteins (247 unique proteins, 0.4%) (Fig. 1B).
For normalization and removal of batch effects within the TMT data sets, we applied an internal reference scaling (IRS) method to 4,610 proteins showing more than 70% valid intensity values (Fig. 2A). First, the TMT data sets were normalized at the peptide spectrum match (PSM) level in the MaxQuant software (Yu et al. 2020). Subsequently, the PSM-level normalized reporter ion intensities of each TMT data set were further normalized by the IRS method (Plubell et al. 2017). These multi-step normalization procedures corrected the batch effects arising from the TMT-6plex reagents (Fig. 3A). IRS normalization improved the median coefficient of variation (CV) of each sample from 19.63% to 6.06% (Fig. 3B). In addition, Pearson correlation coefficients showed a high degree of correlation among the replicates of each sample, with an average R² value of 0.996 (Fig. 2B). Of these 4,610 proteins, the sequential application of fold change (FC) calculation and Student’s
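The IRS step and the CV check mentioned in this paragraph can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names irs_normalize and cv_percent are hypothetical, the per-plex reference is assumed to be the sum of all channels of a protein within one TMT experiment (a pooled reference channel could be used instead), and all intensities are assumed to be positive.

```python
import statistics


def irs_normalize(plexes):
    """Scale each TMT experiment so that per-protein references agree.

    plexes: list of dicts (one per TMT experiment), protein -> channel intensities.
    Assumption: the reference of a protein in a plex is the sum of its channels.
    """
    shared = sorted(set.intersection(*(set(p) for p in plexes)))
    normalized = [{} for _ in plexes]
    for protein in shared:
        references = [sum(plex[protein]) for plex in plexes]
        # The geometric mean of the references is the common target level.
        target = statistics.geometric_mean(references)
        for i, plex in enumerate(plexes):
            factor = target / references[i]
            normalized[i][protein] = [x * factor for x in plex[protein]]
    return normalized


def cv_percent(values):
    """Coefficient of variation (sample standard deviation / mean) in percent."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


if __name__ == "__main__":
    # Toy data: the second experiment reads twice as high as the first.
    toy = [{"P1": [100.0, 200.0]}, {"P1": [200.0, 400.0]}]
    scaled = irs_normalize(toy)
    before = cv_percent([toy[0]["P1"][0], toy[1]["P1"][0]])
    after = cv_percent([scaled[0]["P1"][0], scaled[1]["P1"][0]])
    print(before, after)
```

In the toy data the same sample differs twofold between experiments before scaling and agrees afterwards, which is the kind of CV reduction the paragraph reports for the real data.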
MapMan analysis of the 1,146 differential proteins showed up- and down-regulation of various proteins in the metabolism and cell function overview categories. Proteins with increased abundance in Saedanbaek, grouped in cluster_1, were mainly related to CHO metabolism (9.3%), photosynthesis (9.3%), secondary metabolism (8.5%), lipid metabolism (15.1%), and amino acid metabolism (13.2%) (Table S2). In the cell function overview category, the majority of these proteins were associated with protein degradation (11.9%), stress-related proteins (10.8%), signaling (9.1%), transport (8.2%), RNA regulation (7.6%), protein targeting (7.2%), and protein synthesis (5.9%) (Table S2). In particular, in the protein degradation category, various types of proteases, including subtilases and serine, cysteine, and aspartate proteases, among others, showed increased abundance in Saedanbaek (Fig. 4B). Furthermore, 32 proteins related to protein synthesis, including various isoforms of ribosomal proteins and initiation and elongation factors, also showed increased abundance in Saedanbaek (Fig. 4B). In addition to protein synthesis, increased abundance of 4, 9, 39, and 20 proteins related to amino acid activation, protein folding, protein targeting, and post-translational modifications, respectively, was observed in the Saedanbaek cultivar (Fig. 4B).
GO enrichment analysis of the identified proteins showed an increased abundance of proteins associated with major metabolic pathways. In particular, cluster_2 contained proteins with increased abundance that were associated with protein metabolic process (GO:0019538), protein localization (GO:0008104), protein transport (GO:0015031), protein folding (GO:0006457), and protein catabolic process (GO:0030163), among others, in the biological process category (Table S3). To gain further functional insight into the proteins in cluster_2, KEGG pathway analysis was carried out using the DAVID web-based functional annotation software (Jiao et al. 2012). KEGG pathway analysis showed that proteins with increased abundance in Saedanbaek were mainly associated with various metabolic pathways, including biosynthesis of secondary metabolites, biosynthesis of amino acids, carbon metabolism, and protein processing in the endoplasmic reticulum (Table S4).
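The idea behind such over-representation analyses can be sketched with a plain hypergeometric test. This is an illustration only and not DAVID's implementation (DAVID reports its own modified Fisher exact statistic); the function name hypergeom_upper_tail and the toy counts in the example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: proteins in the background, K: background proteins carrying the term,
    n: proteins in the cluster, k: cluster proteins carrying the term.
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


if __name__ == "__main__":
    # Toy numbers: 3 of 10 background proteins carry a term, and 2 of the
    # 4 proteins in a cluster carry it.
    print(hypergeom_upper_tail(2, 4, 3, 10))
```

A small p-value means the term occurs in the cluster more often than expected from the background. When many terms are tested, the p-values would still need a multiple-testing correction.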
Recently, next-generation proteomics approaches, including label-free and isotope labeling-based quantitative analysis, have shown significant improvements in protein quantification and thus in the identification of differential proteins (Boersema et al. 2015; Min et al. 2019). However, soybean seed proteomics remains challenging because of several limitations, including a narrow range of detection, low reproducibility, and difficulty in detecting LAPs in the presence of high-abundance proteins (HAPs) (Gygi et al. 2000; Thompson et al. 2003). Therefore, a number of HAP depletion methods have been developed specifically for the enrichment of LAPs from soybean seeds using protamine sulfate (Kim et al. 2015), calcium (Krishnan et al. 2009), and PEG (Kim et al. 2001). Our previous study showed a broad application of PS for the enrichment of LAPs from different plant samples, including seeds and leaves of rice, soybean, pea, and peanut (Kim et al. 2015). Moreover, a previous report revealed that LAPs related to various major metabolic processes in the filling and mature stages of soybean seeds were successfully enriched and identified in the PS-S fraction using TMT-based quantitative analysis (Min et al. 2020b). Therefore, here we used the PS precipitation method in combination with TMT-based quantification to identify the differential proteins from the seeds of Daewon and Saedanbaek, which differ in total protein content (Gupta et al. 2020; Min et al. 2020a, 2020b). This approach led to the identification of 1,146 significantly modulated proteins (FDR < 0.05, FC > 1.5) in the comparison between the Daewon and Saedanbaek cultivars. Moreover, further functional classification of the proteins with increased abundance, particularly in the Saedanbaek cultivar, showed accumulation of various LAPs associated with major seed metabolic pathways, including photosynthesis, major/minor CHO metabolism, amino acid metabolism, lipid metabolism, and secondary metabolism, among others.
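The selection criteria quoted above (FDR < 0.05, FC > 1.5) can be illustrated with a short sketch. The FDR procedure is not named in the text available here, so the Benjamini-Hochberg adjustment below is an assumption, and the function names and the toy records are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def select_differential(records, fc_cutoff=1.5, fdr_cutoff=0.05):
    """Keep proteins passing both cutoffs.

    records: list of (protein_id, mean_a, mean_b, pvalue) with positive means.
    The fold change is treated symmetrically (increase or decrease).
    """
    qvalues = benjamini_hochberg([record[3] for record in records])
    hits = []
    for (protein_id, mean_a, mean_b, _), q in zip(records, qvalues):
        ratio = mean_b / mean_a
        fold = max(ratio, 1.0 / ratio)
        if fold > fc_cutoff and q < fdr_cutoff:
            hits.append((protein_id, ratio, q))
    return hits


if __name__ == "__main__":
    toy = [
        ("A", 100.0, 200.0, 0.005),  # twofold increase, small p-value
        ("B", 100.0, 110.0, 0.01),   # significant but below the FC cutoff
        ("C", 300.0, 100.0, 0.03),   # threefold decrease
        ("D", 100.0, 400.0, 0.5),    # large change, not significant
    ]
    print([protein_id for protein_id, _, _ in select_differential(toy)])
```

With the toy records only proteins A and C pass both filters, which shows why a large fold change alone (D) or a small p-value alone (B) is not sufficient.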
For the accumulation of storage compounds such as proteins and lipids, an enormous amount of energy is required, and the diffusion of oxygen into plant tissues is a prerequisite for energy production in mitochondria (Krishnan and Coe 2001; Galili et al. 2014). Therefore, energy production through photosynthetic activity is critical for the accumulation of reserve metabolites during seed desiccation (Fait et al. 2006). Here, we identified 22 proteins associated with photosynthesis, including psbP and psb28 subunits, ATP synthase, plastocyanin, ferredoxin, NADH-ubiquinone oxidoreductase chain 1, ribulose bisphosphate carboxylase, and glyceraldehyde-3-phosphate dehydrogenase, among others, that showed increased abundance in Saedanbaek, similar to previous findings (Table S1) (Min et al. 2020b).
In addition, enrichment of LAPs led to the identification of 24 and 39 proteins related to major/minor CHO and lipid metabolism, respectively, which showed increased abundance in the Saedanbaek cultivar (Table S2). Of these, six and ten proteins related to starch synthesis/degradation and lipid degradation, respectively, showed increased abundance, along with two raffinose synthase proteins, in the Saedanbaek cultivar (Table S1 and S2). Moreover, during seed maturation stages, 10 to 15% of lipids are converted to raffinose family oligosaccharides (RFOs) when the supply of exogenous resources from the maternal plant is limited (Kambhampati et al. 2020). These RFOs are produced by carbon remobilization from lipids along with sucrose during seed development (Kambhampati et al. 2020).
In addition, we observed the accumulation of 34 proteins mainly associated with amino acid metabolism, including GABA, glutamate, aspartate, branched-chain amino acid, tryptophan, serine, glycine, cysteine, and histidine synthesis (Table S2). Furthermore, MapMan functional classification of the metabolism overview revealed the increased abundance of three proteins (with more than 1.5- and 2.0-fold increases) involved in nitrogen metabolism, which has an important role in determining the total amount of storage proteins. The ammonia derived from nitrogen uptake by maternal vegetative tissues is the primary source for supplying nitrogen, predominantly as amino acids such as glutamine and asparagine, to the seeds (Ohyama et al. 2017). In addition, amino acids participate in the synthesis of storage proteins and thereby contribute to carbon remobilization through proteolytic activity during the late seed developmental stages (Galili et al. 2014; Kambhampati et al. 2020). Here, 64 proteases showed increased abundance in Saedanbaek and might play a crucial role in the remobilization of endogenous nitrogenous products such as amino acids or proteins into storage proteins (Gallardo et al. 2006, 2007). Taken together, our results suggest a positive correlation between the higher protein content of soybean seeds and various metabolism-related proteins involved in major/minor CHO metabolism, photosynthesis, nitrogen metabolism, and amino acid metabolism, among others.
This work was supported by a 2-Year Research Grant of Pusan National University.
Cheol Woo Min ・Ravi Gupta ・Nguyen Van Truong ・Jin Woo Bae ・Jong Min Ko ・Byong Won Lee ・Sun Tae Kim
Department of Plant Bioscience, Life and Industry Convergence Research Institute, Pusan National University, Miryang, 50463, Republic of Korea
Department of Botany, School of Chemical and Life Sciences, Jamia Hamdard, New Delhi, 110062, India
Department National Institute of Crop Science, Rural Development Administration, Wanju, 55365, Republic of Korea
Department of Functional Crops, National Institute of Crop Science, Rural Development Administration, Miryang, 50424, Republic of Korea
Department of Central Area Crop Science, National Institute of Crop Science, Rural Development Administration, Suwon, 16429, Republic of Korea
Journal of Plant Biotechnology