J Plant Biotechnol (2024) 51:377-386
Published online December 3, 2024
https://doi.org/10.5010/JPB.2024.51.037.377
© The Korean Society of Plant Biotechnology
Correspondence to : J.-K. Na (✉)
e-mail: jongkook@kangwon.ac.kr
S.-J. Oh (✉)
e-mail: ohsejin@gdif.or.kr
This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Seawater is an economical and eco-friendly alternative to chemical fertilizers because it contains various minerals essential for plant growth. Seawater application has various effects on crops at the physiological and transcriptional levels. In this study, transcriptional changes in Ligularia stenocephala, a vegetable crop known as “Gondalbi”, in response to deep seawater (DSW) treatment were examined using RNA sequencing. L. stenocephala was treated with 5% and 10% DSW (designated as DSW5 and DSW10) or 500X and 1000X of a fertilizer (designated as SWF500 and SWF1000) comprising filtered DSW and additional minerals. RNA sequencing generated 152 million clean sequence reads in total, from which de novo assembly generated 147,406 unigenes with an average length of 566.6 bp. The GC content of the five transcriptomes was 42.95-43.65%, and the N50 was 776 bp. Annotation of all identified unigenes was performed using seven different databases, and 67,592 unigenes (45.8%) were annotated. KEGG analysis annotated a total of 7,009 unigenes (22.9%) into 421 pathways. Because L. stenocephala is known for its anti-oxidative properties, we focused on genes associated with natural antioxidant biosynthesis and identified several unigenes involved in the biosynthesis of glutathione, tocopherol, beta-carotenoids, flavonoids, and ascorbic acid. Furthermore, we mined simple sequence repeats (SSRs) and identified 35,280 SSRs from the L. stenocephala transcriptome. The present data should be valuable for an enhanced understanding of the transcriptional effects of seawater application in other crops and for the investigation of the functional properties and therapeutic potential of L. stenocephala.
Keywords Gondalbi, Rocket, RNA sequencing, Differentially expressed genes, Antioxidant
Ligularia stenocephala, popularly known as “The Rocket” and in Korea as “Gondalbi”, is a perennial plant indigenous to temperate East Asia. L. stenocephala is one of the main species of the genus Ligularia. It is a leafy herb that grows strictly on well-drained wet sites with full or partial shade and adequate fertilization. The leaves of L. stenocephala have been used for medicinal purposes and are also consumed as a leafy vegetable in South Korea (Debnath et al. 2017). Many studies have reported that, among the edible and herbal plants examined, the leaf extract of L. stenocephala shows high antithrombotic activity and antibacterial effects (Debnath et al. 2017; Lee et al. 2013; Nugroho et al. 2010; Yoon et al. 2008).
Soil conditions affect plant productivity. Existing plant cultivation methods have raised concerns about sustainability because they rely on the continuous use of chemical fertilizers, which are a source of environmental pollution (Rahman and Zhang 2018) and are also expensive because of their high energy cost (Christiansen et al. 2012). Plants need essential nutrients for their growth and development, and a lack of these nutrients ultimately reduces plant growth and productivity (Chele et al. 2021). Based on current management practices and levels of production intensity, the world demand for the main fertilizer nutrients, including nitrogen, phosphorus, and potassium, is predicted to rise by 2% annually (FAO 2017). To ensure adequate food production and agricultural sustainability as the world population grows, there is a need to embrace locally adapted sustainable agricultural practices (Rashid et al. 2016). Recently, the development of natural fertilizers to enhance organic farming has increased globally, and the sea has long been a source of organic fertilizers in coastal areas (Emadodin et al. 2020). Deep seawater (DSW) generally refers to low-temperature, high-purity, nutrient-rich seawater pumped from a depth of over 200 m (Hwang et al. 2009; Mohd Nani et al. 2016). Compared to other sources of water, DSW is rich in plant-beneficial minerals such as potassium, magnesium, calcium, and zinc (Emadodin et al. 2020).
Many studies have examined the effects of seawater on the physiological properties of crops, focusing on yield and quality, but few have investigated how seawater affects transcription. Physiological changes in plants involve transcriptional regulation, so analyzing the transcriptome of seawater-treated crops could help predict potential physiological responses more precisely. In this study, RNA sequencing was carried out to investigate the transcriptome of L. stenocephala treated with different concentrations of DSW or a fertilizer made of a filtered concentrate of DSW and additional nutrients (designated as SWF). The data generated from RNA sequencing were subjected to gene ontology (GO) analysis, EggNOG analysis, KEGG metabolic pathway analysis, and SSR mining. We also identified the genes involved in the antioxidant and salt stress-related pathways of L. stenocephala and compared the expression of those genes across samples. The transcriptome reported here is the first reference transcriptome of L. stenocephala; it enriches our understanding of the genetic framework of this plant and lays the groundwork for future studies of this underexplored species.
Seedlings of L. stenocephala were obtained from one-year-old roots. Six seedlings were transplanted to each pot (60 cm × 20 cm × 15 cm) filled with silt loam soil on March 20, 2023, and the pots were placed under shade for two weeks before DSW treatment. The experiment comprised five treatment groups: water with 5% or 10% DSW, 500X or 1000X of SWF, and control (NT). DSW and SWF treatments were applied to the seedlings every other week, and tap water was applied in between. Control seedlings were treated with tap water once a week during the experimental period. The experiment was conducted from April 3 to May 22, 2023, at Goseong Deep Sea Water Industry Foundation, Gangwon, Republic of Korea.
L. stenocephala leaves were collected, immediately snap-frozen in liquid nitrogen, and stored at -70°C. Following the manufacturer’s instructions, total RNA was extracted from the leaves of NT, DSW5, DSW10, SWF500, and SWF1000 using GeneAll Ribospin plant reagent (GeneAll Biotechnology Co., Ltd., Seoul, South Korea). RNA quality was checked by 1% RNase-free agarose gel electrophoresis, and purity was determined using the NanoPhotometer® spectrophotometer (IMPLEN, CA, USA). RNA integrity was assessed using the TapeStation RNA ScreenTape (Agilent, #5067-5576), and RNA concentration was determined using the Quant-iT™ RiboGreen RNA Assay (Invitrogen, cat. #R11490).
Messenger RNA was isolated from 1 µg of total RNA and sheared for library construction. First-strand cDNA was generated from the sheared mRNA fragments using SuperScript II reverse transcriptase and random primers (Invitrogen, #18064014). Adapters were ligated onto both ends of the cDNA fragments, and fragments of 200-400 bp were selected for paired-end sequencing on the Illumina NovaSeq 6000 system (Illumina, Inc., San Diego, CA, USA). For assembly, the raw reads were quality-checked with FastQC v0.11.7, and adapter sequences, low-quality reads (< Q20), and reads shorter than 36 bp were removed using Trimmomatic v0.38 (Bolger et al. 2014). To obtain a reference transcriptome of L. stenocephala, all sequencing data were merged and assembled using Trinity (version trinityrnaseq_r20140717, bowtie 1.1.2). CD-HIT v4.6 was used to cluster transcripts into unigenes.
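The read-filtering thresholds above (Q20 and 36 bp) can be illustrated with a minimal Python sketch. It is not Trimmomatic's algorithm: the exact trimming steps are not given in the text, so the sketch assumes that a read is judged by its mean Phred score and that qualities use Phred+33 encoding. The function names are hypothetical.

```python
from statistics import mean

MIN_QUALITY = 20  # Phred threshold stated in the text (< Q20 removed)
MIN_LENGTH = 36   # minimum read length in bp stated in the text


def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string (Phred+33 encoding is an assumption)."""
    return [ord(ch) - offset for ch in quality_string]


def passes_filter(sequence, quality_string):
    """Keep a read that is long enough and whose mean Phred score is >= Q20.

    Using the mean score is a simplification chosen for illustration.
    """
    if len(sequence) < MIN_LENGTH:
        return False
    return mean(phred_scores(quality_string)) >= MIN_QUALITY


def filter_reads(records):
    """records: iterable of (name, sequence, quality_string) tuples."""
    return [rec for rec in records if passes_filter(rec[1], rec[2])]
```

In this sketch a 40-bp read with uniformly high quality is kept, while a 20-bp read or a 40-bp read with uniformly low quality is dropped.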
Unigene annotation was carried out using seven different databases: GO (v20180319), UniProt (2022_05), NCBI NR protein and NT nucleotide databases (20230102), Pfam (20160316), EggNOG (e5.proteomes), and KO_EUK (20230102). The reference transcriptome was used for differentially expressed gene analysis based on read counts generated by RSEM (version 1.2.31). Similarity searches against the seven databases were performed using BLASTN of NCBI BLAST and BLASTX of DIAMOND with the default E-value cutoff of 1.0E-5.
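The E-value cutoff can be applied to similarity-search results as in the following sketch. It assumes the hits are available in the standard 12-column tabular format produced by both BLAST and DIAMOND (query, subject, percent identity, alignment length, mismatches, gap opens, query start and end, subject start and end, E-value, bit score). Keeping only the best-scoring hit per unigene is an illustrative choice, not a step described in the text, and the function name is hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1.0e-5  # cutoff stated in the text


def best_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return {query: (subject, evalue, bitscore)} for hits passing the cutoff.

    When a query has several passing hits, the one with the highest
    bit score is kept (an assumption made for this sketch).
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > cutoff:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

A unigene whose only hit has an E-value above 1.0E-5 does not appear in the result and would count as unannotated for that database.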
To evaluate the SSR composition of the L. stenocephala transcriptome, SSRs were identified using MISA (pgrc.ipk-gatersleben.de/misa), a tool for microsatellite identification. The criteria were a minimum of four motif repeats and a minimum length of 12 bp. RepeatMasker (v. 4.0.7) was run in default mode to examine the repetitive sequences in the transcriptome of L. stenocephala, with the reference library RepBaseRepeatMaskerEdition-20170127 (www.girinst.org).
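The two SSR criteria (at least four motif repeats and at least 12 bp) can be sketched with a regular expression. This is an illustration of the criteria only, not MISA's implementation. The motif length range of 2 to 6 bp and the exclusion of homopolymer runs are assumptions, and the function name is hypothetical.

```python
import re

MIN_REPEATS = 4  # minimum number of motif repeats stated in the text
MIN_LENGTH = 12  # minimum SSR length in bp stated in the text

# A motif of 2-6 bases (assumed range) followed by at least
# MIN_REPEATS - 1 further copies of the same motif.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (MIN_REPEATS - 1))


def find_ssrs(sequence):
    """Return (start, motif, repeat_count) for each perfect SSR found."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer run matched as a longer motif; skipped here
        length = match.end() - match.start()
        if length >= MIN_LENGTH:
            hits.append((match.start(), motif, length // len(motif)))
    return hits
```

Under these criteria a dinucleotide motif needs at least six copies to reach 12 bp, whereas a trinucleotide motif qualifies with four copies.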
RNA sequencing of L. stenocephala produced a total of 155.8 million raw reads with a total length of about 15.7 Gb, from which 152.7 million clean reads (about 15.3 Gb) were obtained after filtering out low-quality reads. The number of clean reads in the five transcriptomes ranged from 27.7 million (SWF1000) to 37.4 million (NT), and their total length ranged from 2.79 to 3.75 Gb (Table 1). The percentage of clean reads was higher than 97.9% in every sample. Because no reference transcriptome of L. stenocephala was available, the clean reads of all five transcriptomes were merged and used for de novo assembly to generate a reference transcriptome, resulting in 147,406 unigenes with a total length of 83,519,007 bp. The GC content of the reference transcriptome was 38.1% and the N50 was 776 bp (Table 2). Unigene length ranged from 201 to 14,728 bp with an average of 566.6 bp, and 13,787 unigenes were full length. Based on the seven databases, a total of 67,592 (45.85%) unigenes were annotated, with 42.52% of the annotated unigenes showing similarity to proteins in the NCBI Non-Redundant Protein (NR) database (Fig. 1).
Table 1 Summary of RNA sequencing data of Ligularia stenocephala treated with deep seawater or a fertilizer derived from deep seawater
Description of sequenced data | NT | DSW5 | DSW10 | SWF500 | SWF1000 |
---|---|---|---|---|---|
Total number of raw reads | 38,129,414 | 29,150,308 | 30,525,785 | 29,675,962 | 28,269,621 |
Total length of raw reads (bp) | 3,851,070,814 | 2,944,181,108 | 3,083,104,285 | 2,997,272,162 | 2,855,231,721 |
Total number of clean reads | 37,350,843 | 28,621,397 | 29,901,820 | 29,087,042 | 27,695,022 |
Total length of clean reads (bp) | 3,752,226,264 | 2,879,166,872 | 3,007,585,370 | 2,925,642,225 | 2,785,479,823 |
GC content of clean reads (%) | 43.12 | 42.95 | 43.58 | 43.6 | 43.65 |
Percentage of clean reads | 97.96% | 98.19% | 97.96% | 98.02% | 97.97% |
Number of mapped reads | 19,767,956 | 15,400,751 | 16,005,022 | 15,430,469 | 14,485,027 |
Overall mapping ratio (%) | 52.93% | 53.81% | 53.53% | 53.05% | 52.30% |
Q20 (%) | 98.64 | 98.66 | 98.63 | 98.62 | 98.63 |
Q30 (%) | 95.21 | 95.26 | 95.2 | 95.18 | 95.22 |
*SWF500 and SWF1000 indicate RNA sequencing results from Ligularia stenocephala treated with 500X and 1000X of a fertilizer comprising deep seawater and additional nutrients. DSW5 and DSW10 indicate treatments of water containing 5% and 10% of deep seawater.
Table 2 Summary of de novo assembly of Ligularia stenocephala reference transcriptome
Assembly | No. of unigenes | GC content (%) | N50 | Longest contig (bp) | Shortest contig (bp) | Average contig length (bp) | Total assembled bases (bp) |
---|---|---|---|---|---|---|---|
Ligularia | 147,406 | 38.1 | 776 | 14,728 | 201 | 566.59 | 83,519,007 |
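For readers unfamiliar with the assembly statistics in Table 2, the following sketch shows how N50 and GC content are commonly computed from a set of contigs. It uses the usual definition of N50 (the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembled bases). The function names are hypothetical and the sketch is not the assembler's own reporting code.

```python
def n50(lengths):
    """Contig length at which the longest contigs reach half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths supplied")


def gc_content(sequences):
    """Percentage of G and C bases across all sequences."""
    total = sum(len(seq) for seq in sequences)
    gc = sum(seq.upper().count("G") + seq.upper().count("C") for seq in sequences)
    return 100 * gc / total


def average_length(total_bases, n_contigs):
    """Mean contig length in bp."""
    return total_bases / n_contigs
```

With the values in Table 2, `average_length(83_519_007, 147_406)` gives about 566.59 bp, matching the reported average contig length.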
The clean reads of each sample were mapped to the reference transcriptome of L. stenocephala. The number of mapped reads per sample ranged from 14.5 million (SWF1000) to 19.8 million (NT), and the average mapping ratio across the five samples was 53.1% (Table 1).
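The mapping ratios can be reproduced from the read counts in Table 1, assuming the ratio is defined as mapped reads divided by clean reads (the definition that matches the tabulated percentages). The variable and function names below are for illustration only.

```python
# Read counts copied from Table 1.
clean_reads = {
    "NT": 37_350_843,
    "DSW5": 28_621_397,
    "DSW10": 29_901_820,
    "SWF500": 29_087_042,
    "SWF1000": 27_695_022,
}
mapped_reads = {
    "NT": 19_767_956,
    "DSW5": 15_400_751,
    "DSW10": 16_005_022,
    "SWF500": 15_430_469,
    "SWF1000": 14_485_027,
}


def mapping_ratio(sample):
    """Percentage of clean reads that mapped to the reference transcriptome."""
    return 100 * mapped_reads[sample] / clean_reads[sample]


ratios = {sample: mapping_ratio(sample) for sample in clean_reads}
mean_ratio = sum(ratios.values()) / len(ratios)
```

Here `mean_ratio` rounds to 53.1, consistent with the average reported in the text.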
GO functional analysis of L. stenocephala unigenes classified 108,046 unigenes into cellular components, 97,851 unigenes into biological processes, and 57,494 unigenes into molecular functions (Fig. 2). However, the number of unique unigenes, excluding unigenes assigned to more than two sub-functional categories, was 39,599 (37%) for cellular components, 25,607 (26%) for biological processes, and 22,612 (39%) for molecular functions. In the molecular function category, most unigenes were classified into two subcategories: catalytic activity with 22,612 (39%) unigenes and binding with 20,083 (35%) unigenes (Fig. 2). In the biological process category, the top three subcategories were cellular process with 25,607 (26%) unigenes, metabolic process with 21,876 (22%), and biological regulation with 11,689 (12%). In the cellular component category, most unigenes were classified into four subcategories: cell part with 39,599 (37%) unigenes, organelle with 25,640 (24%), membrane with 11,042 (10%), and organelle part with 10,163 (9%) (Fig. 2).
EggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups) analysis classified 98,785 unigenes into 25 functional categories (Fig. 3), with the largest being “replication, recombination and repair” (12,238, 27.04%), followed by “transcription” (8,202, 8.30%), “posttranslational modification, protein turnover, chaperones” (6,386, 6.46%), “signal transduction mechanisms” (6,050, 6.12%), and “amino acid transport and metabolism” (4,262, 4.31%). The smallest category was “cell motility”, with 31 unigenes (0.03%).
All unigenes were submitted to the BlastKOALA tool (http://www.genome.jp/kegg) to assign KEGG orthology (KO) numbers, and 7,009 unigenes were assigned KO numbers. The largest number of unigenes was categorized into metabolism, with 7,040 unigenes in 12 subcategories and 145 pathways, while environmental information processing contained the fewest associated unigenes, with 1,390 in 36 pathways (Table 3). The five pathways with the highest number of entry enzyme hits were metabolic pathways (map01100) with 886 entry enzymes, biosynthesis of secondary metabolites (map01110) with 460, microbial metabolism in diverse environments (map01120) with 142, amyotrophic lateral sclerosis (map05014) with 127, and pathways of neurodegeneration (map05022) with 121 (Fig. 4).
Table 3 Categories of KEGG metabolic pathways and number of unigenes of Ligularia stenocephala associated with each category
Category | No. of subcategories | No. of associated pathways | No. of associated unigene entries | No. of associated genes |
---|---|---|---|---|
Metabolism | 12 | 145 | 3554 | 7040 |
Genetic information processing | 6 | 27 | 928 | 1529 |
Environmental information processing | 3 | 36 | 433 | 1390 |
Cellular processes | 5 | 35 | 668 | 1394 |
Organismal systems | 10 | 88 | 642 | 1926 |
Human Diseases | 12 | 89 | 1739 | 3697 |
The major naturally occurring antioxidants include ascorbic acid, α-tocopherol, glutathione, carotenoids, and flavonoids (Huchzermeyer et al. 2022). Several metabolic pathways, including the metabolism of terpenoids, biosynthesis of other secondary metabolites, and metabolism of cofactors and vitamins, are connected to the biosynthesis of antioxidants in plants. Therefore, it is essential to examine the genes involved in such pathways to understand the mechanisms underlying antioxidant biosynthesis in a specific plant species. In L. stenocephala, 22 unigenes were found to be involved in the carotenoid biosynthesis pathway (map 00906) with 16 entries. For flavonoid metabolism, 29 unigenes were associated with the flavonoid biosynthesis pathway (map 00941) with 14 entries. For ascorbic acid metabolism, the associated pathways include galactose metabolism (map 00052), for which we identified 39 unigenes with 15 entries, and ascorbate and aldarate metabolism (map 00053), with 41 associated unigenes and 20 entries (Table 4; Supplementary Table S1).
Table 4 KEGG metabolic pathways related to antioxidants and the number of associated unigenes in Ligularia stenocephala transcriptome
Metabolic pathway | KEGG map ID | No. of entries | No. of associated unigenes |
---|---|---|---|
Metabolism of terpenoids | |||
Carotenoid biosynthesis | 00906 | 16 | 22 |
Monoterpenoid biosynthesis | 00902 | 2 | 9 |
Biosynthesis of other secondary metabolites | |||
Phenylpropanoid biosynthesis | 00940 | 16 | 76 |
Flavonoid biosynthesis | 00941 | 14 | 29 |
Flavone and flavonol biosynthesis | 00944 | 3 | 3 |
Metabolism of cofactors and vitamins | |||
Ubiquinone and other terpenoid-quinone biosynthesis | 00130 | 20 | 40 |
Metabolism of other amino acids | |||
Glutathione metabolism | 00480 | 19 | 65 |
Carbohydrate metabolism (ascorbic acid as an end product) | |||
Galactose metabolism | 00052 | 15 | 39 |
Ascorbate and aldarate metabolism | 00053 | 20 | 41 |
RepeatMasker (http://www.repeatmasker.org) was used to analyze the repetitive sequences in the transcriptome of L. stenocephala. It was found that repetitive and low-complexity sequences occupied 1,316,959 bp (1.58%) of the L. stenocephala transcriptome (Table 5). Most of the repetitive sequences were simple sequence repeats, with 27,627 elements occupying 1,111,801 bp (1.33%) (Table 5).
Table 5 Repetitive sequences in Ligularia stenocephala transcriptome (83,519,007 bp)
Type of repeats | No. of elements | Sequence length occupied (bp) | Percentage of sequence |
---|---|---|---|
Retroelements | 82 | 5,606 | 0.01 |
SINEs | 10 | 1,038 | 0.00 |
LINEs | 46 | 2,535 | 0.00 |
LTR elements | 26 | 2,033 | 0.00 |
DNA transposons | 61 | 8,157 | 0.01 |
hobo-Activator | 58 | 7,996 | 0.01 |
Tc1-IS630-Pogo | 2 | 115 | 0.00 |
Unclassified | 32 | 4,490 | 0.01 |
Total interspersed repeats | | 18,253 | 0.02 |
Small RNA | 146 | 24,311 | 0.03 |
Satellite | 6 | 586 | |
Simple repeats | 27,627 | 1,111,801 | 1.33 |
Low complexity | 3,369 | 162,008 | 0.19 |
Total number of bases masked | | 1,316,959 | 1.58 |
The SSR search identified 35,280 perfect SSRs in 26,793 SSR-containing unigenes of L. stenocephala (Table 6). Among the SSR-containing unigenes, 6,134 had more than one SSR, of which 4,061 had compound SSRs (Table 6). The overall SSR frequency in the L. stenocephala transcriptome was 422.4 per one million base pairs (Mbp). Trinucleotide SSRs were the most abundant, with 12,636 occurrences (35.82%), followed by dinucleotide SSRs with 11,607 occurrences (32.90%) (Fig. 5a). The frequency of trinucleotide SSRs was 151.29 per Mbp, while pentanucleotide SSRs had the lowest frequency, 26.35 per Mbp (Fig. 5b). Among individual motifs, the number of SSR occurrences ranged from 4,239 for AC/GT, followed by 3,819 for AG/CT, down to 6 for CG/CG (Fig. 6a). The highest SSR frequency by motif was that of AC/GT with 50.75 occurrences per Mbp, followed by AG/CT with 45.73 occurrences per Mbp (Fig. 6b; Table 6).
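Motif labels such as AC/GT group together all rotations of a motif and of its reverse complement, since these describe the same repeat read from a different start position or strand. The sketch below shows one way to derive such a label; it is an illustration of the grouping convention, with hypothetical function names, rather than the code used by the SSR tool.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rotations(motif):
    """All cyclic rotations of a motif, e.g. ATC -> ATC, TCA, CAT."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def reverse_complement(motif):
    """Reverse complement of a DNA motif."""
    return motif.translate(COMPLEMENT)[::-1]


def motif_class(motif):
    """Label of the form 'AC/GT' shared by all equivalent motifs."""
    motif = motif.upper()
    forward = min(rotations(motif))
    reverse = min(rotations(reverse_complement(motif)))
    first, second = sorted((forward, reverse))
    return f"{first}/{second}"
```

For example, the motifs AC, CA, GT, and TG all receive the label AC/GT, and CAT is grouped under ATC/ATG.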
Table 6 Distribution and comparison of simple sequence repeats from Ligularia stenocephala transcriptome by repeat motifs (columns 3 to ≥12 give the number of motif repeats)
Motif | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | ≥12 | Total | Occurrence per 1 Mbp |
---|---|---|---|---|---|---|---|---|---|---|---|---|
AC/GT | 0 | 0 | 2425 | 794 | 372 | 249 | 172 | 147 | 77 | 3 | 4239 | 50.8 |
AG/CT | 0 | 0 | 1862 | 675 | 391 | 285 | 246 | 250 | 106 | 4 | 3819 | 45.7 |
AT/AT | 0 | 0 | 1719 | 568 | 300 | 241 | 305 | 312 | 95 | 3 | 3543 | 42.4 |
CG/CG | 0 | 0 | 4 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 6 | 0.1 |
AAC/GTT | 0 | 1737 | 639 | 304 | 125 | 5 | 0 | 0 | 1 | 0 | 2811 | 33.7 |
AAG/CTT | 0 | 1374 | 345 | 161 | 80 | 5 | 0 | 0 | 0 | 0 | 1965 | 23.5 |
AAT/ATT | 0 | 950 | 243 | 129 | 103 | 10 | 0 | 1 | 0 | 0 | 1436 | 17.2 |
ACC/GGT | 0 | 1446 | 409 | 186 | 69 | 9 | 0 | 0 | 0 | 1 | 2120 | 25.4 |
ACG/CGT | 0 | 85 | 18 | 9 | 1 | 0 | 0 | 1 | 0 | 0 | 114 | 1.4 |
ACT/AGT | 0 | 141 | 38 | 16 | 7 | 0 | 0 | 0 | 0 | 0 | 202 | 2.4 |
AGC/CTG | 0 | 475 | 128 | 67 | 42 | 6 | 1 | 0 | 0 | 0 | 719 | 8.6 |
AGG/CCT | 0 | 259 | 81 | 24 | 14 | 5 | 0 | 1 | 0 | 0 | 384 | 4.6 |
ATC/ATG | 0 | 1541 | 492 | 314 | 178 | 7 | 0 | 0 | 0 | 0 | 2532 | 30.3 |
CCG/CGG | 0 | 286 | 48 | 13 | 6 | 0 | 0 | 0 | 0 | 0 | 353 | 4.2 |
Tetra- | 5381 | 628 | 161 | 35 | 0 | 0 | 1 | 1 | 0 | 0 | 6207 | 74.3 |
≥ penta- | 3966 | 778 | 60 | 13 | 6 | 4 | 1 | 1 | 1 | 0 | 4830 | 57.8 |
Total | 9347 | 9700 | 8672 | 3308 | 1696 | 826 | 726 | 714 | 280 | 11 | 35280 | 422.4 |
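The frequencies in Table 6 follow from dividing each SSR count by the transcriptome size expressed in Mbp. A minimal sketch of this arithmetic, using the total assembled bases from Table 2 (function names are illustrative):

```python
TRANSCRIPTOME_BP = 83_519_007  # total assembled bases (Table 2)
TOTAL_SSRS = 35_280            # total perfect SSRs (Table 6)


def per_mbp(count, total_bp=TRANSCRIPTOME_BP):
    """Occurrences per one million base pairs."""
    return count / (total_bp / 1_000_000)


def share(count, total=TOTAL_SSRS):
    """Percentage of all SSRs represented by this count."""
    return 100 * count / total


overall_frequency = per_mbp(TOTAL_SSRS)  # about 422.4 per Mbp
trinucleotide_frequency = per_mbp(12_636)  # about 151.29 per Mbp
trinucleotide_share = share(12_636)  # about 35.82%
```

The same function gives about 50.75 per Mbp for the 4,239 AC/GT repeats, in line with the values reported above.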
Medicinal plants have long been recognized for their health benefits, particularly because of their bioactive compounds with antioxidant properties. These compounds play a crucial role in neutralizing free radicals and reducing oxidative stress, a key factor in the development of chronic diseases such as cardiovascular diseases, cancer, and neurodegenerative disorders (Sharifi-Rad et al. 2020). Several medicinal plants have been extensively studied for their potential antioxidant properties, and a variety of phytochemicals have been found to contribute to their overall antioxidant capacity. Antioxidants derived from medicinal plants are increasingly being explored for their potential applications in nutraceuticals, functional foods, and pharmaceuticals (Sorrenti et al. 2023). However, without detailed genomic research, it can be difficult to accurately elucidate the genes associated with plant antioxidant properties and to identify the metabolic pathways involved.
Ligularia is a highly diversified genus, and several of its species, such as Ligularia dentata Hara, have gained interest for their numerous beneficial natural products (Yaoita et al. 2012). Different organs and extracts of L. stenocephala and several other species of the genus have been explored for their biological properties and chemical components (Debnath et al. 2017; Nam and Lee 2013). However, none of these studies focused on the genomics of L. stenocephala, and hence very limited genomic resources are available for this species.
Seawater application can promote the production of bioactive compounds in medicinal plants, including antioxidants, phenolic compounds, flavonoids, and other secondary metabolites. In this study, RNA sequencing was performed on leaves of L. stenocephala from five treatment groups (NT, DSW5, DSW10, SWF500, and SWF1000). The five merged sequencing datasets contained 155.8 million reads with a total length of 15.7 Gbp (Table 1). De novo assembly of the merged sequencing data generated a total of 147,406 unigenes, of which 67,592 (45.85%) were annotated. The GC content of the sequenced samples ranged from 42.95 to 43.65% (Table 1), which is higher than that of the transcripts of the chloroplast genome of L. stenocephala (Chen et al. 2018).
Among the identified unigenes of L. stenocephala, 7,009 unigenes were assigned KO numbers, and several genes of L. stenocephala that are associated with the five major natural antioxidant pathways were identified in this study (Table 3; Supplementary Table S1; Supplementary Fig. S1, S2, S3, S4, S5). Considering that numerous beneficial metabolites can be produced via interconnected pathways for the biosynthesis of secondary metabolites in plants, and that a number of these secondary metabolites possess noteworthy medicinal properties (Choi et al. 2024), it is necessary to explore the metabolic pathways in L. stenocephala.
In plants, SSRs are highly useful for developing DNA markers for genetic studies. These markers have been widely used to study genetic relationships by cross-amplification within closely related species (Choi et al. 2024). In the genus Ligularia, SSR markers have been used to understand genetic variation and hybridization (Chen et al. 2018; Zhang et al. 2017). Other DNA markers have also been applied to examine genetic differences between L. fischeri and L. stenocephala, and barcode markers successfully distinguished the two Ligularia species (Choi et al. 2017). In this study, 35,280 perfect SSRs were identified in L. stenocephala, and the SSR frequency was 422.4 per Mbp (Table 6). Among the five types of SSRs identified, trinucleotide SSRs were the most prevalent, with 12,636 occurrences and a frequency of 151.29 per Mbp (Fig. 5a, b). This finding is consistent with previous reports that trinucleotide SSRs are the most abundant (Eum et al. 2019; Kotwal et al. 2016).
The transcriptome data of L. stenocephala contribute to the understanding of its genetic makeup and highlight the genes and pathways related to the antioxidant properties of this plant. The data can be used to develop new SSR markers and, more broadly, to identify the gene regulatory networks of this species for a detailed understanding of its genomic features. Therefore, the transcriptome data presented in this study will be a valuable resource for genomic and genetic studies of L. stenocephala and will also serve as a basis for other genomic comparisons involving this species.
This work was supported by Goseong Deep Sea Water Industry Foundation, Gangwon, ROK.
The authors declare that there is no conflict of interest.
Raw sequencing data for the 5% and 10% DSW treatments can be accessed under BioProject accession number PRJNA1149490.
Bimpe Suliyat Azeez ・Se-Jin Oh ・Jong-Kuk Na
Department of Agriculture and Industries, Graduate School, Kangwon National University, Chuncheon, Gangwon, 24341 Republic of Korea
Goseong Deep Sea Water Industry Foundation, Goseong, Gangwon 24747, Republic of Korea
Department of SmartFarm and Agricultural Industries, Kangwon National University, Chuncheon, Gangwon 24341, Republic of Korea
Correspondence to:J.-K. Na (✉)
e-mail: jongkook@kangwon.ac.kr
S.-J. Oh (✉)
e-mail: ohsejin@gdif.or.kr
This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Seawater is an economical and eco-friendly alternative to chemical fertilizers because it contains various plant essential minerals required for plant growth. Seawater application has various effects on crops at the physiological and transcriptional levels. In this study, transcriptional changes in Ligularia stenocephala, a vegetable crop known as “Gondalbi”, in response to deep seawater (DSW) treatment were examined using RNA sequencing. L. stenocephala was treated with 5% and 10% DSW (designated to DSW5 and DSW10) or 500X and 1000X of a fertilizer (designated to SWF500 and SWF1000) comprising filtered DSW and additional minerals. RNA sequencing generated 152 million clean sequence reads in total, of which de novo assembly generated 147,406 unigenes with an average length of 566.6 bp. The GC content of five transcriptomes was 42.95-43.65%, and the N50 was 776 bp. Annotation of all identified unigenes was performed using seven different databases, and 67,592 unigenes (45.8%) were annotated. KEGG analysis annotated total of 7009 unigenes (22.9%) into 421 pathways. Because L. stenocephala is known for its anti-oxidative properties, we focused on genes associated with natural antioxidant biosynthesis and identified several unigenes involved in the biosynthesis of glutathione, tocopherol, beta-carotenoids, flavonoids, and ascorbic acid. Furthermore, we carried out the mining of simple sequence repeat (SSR) and identified 35,280 from the L. stenocpehala transcriptome. Present data would be valuable for an enhanced understanding of the transcriptional properties of seawater application in other crops and for the investigation of the functional properties and therapeutic potential of L. stenocephala.
Keywords: Gondalbi, Rocket, RNA sequencing, Differentially expressed genes, Antioxidant
Ligularia stenocephala, popularly known as “The Rocket” and in Korea as “Gondalbi”, is a perennial plant indigenous to temperate East Asia. L. stenocephala is one of the main species of the genus Ligularia and it is a leafy herb that grows strictly on well-drained wet sites with full or partial shade and adequate fertilization. The leaves of L. stenocephala has been used for medicinal purposes and also consumed as a leafy vegetable in South Korea (Debnath et al. 2017). Many studies have reported the high level of antithrombotic activity and antibacterial effects of the leaf extract of L. stenocephala among many edible and herbal plants studied (Debnath et al. 2017; Lee et al. 2013; Nugroho et al. 2010; Yoon et al. 2008).
Soil conditions affect plant productivity. Existing plant cultivation methods have raised concerns about sustainability due to the continuous use of chemical fertilizers which is a source of environmental pollution (Rahman and Zhang 2018) and also an expensive method of fertilization due to high energy cost (Christiansen et al. 2012). Plants need essential nutrients for their growth and development and if lacking, this ultimately reduces plant growth and productivity (Chele et al. 2021). Based on current management practices and levels of production intensity, the world demand for the main fertilizer nutrients, including nitrogen, phosphorus, and potassium, is predicted to rise by 2% annually (FAO 2017). To ensure adequate food production and agricultural sustainability with the increase in world population growth, there is a need to embrace locally adapted sustainable agricultural practices (Rashid et al. 2016). Recently, there has been an increase in the development of natural fertilizers to enhance organic farming globally and the sea has long been a source of organic fertilizers in coastal areas (Emadodin et al. 2020). Deep seawater (DSW) generally refers to a low temperature, high-purity, nutrient-rich seawater pumped from a depth of over 200 m (Hwang et al. 2009; Mohd Nani et al. 2016). Compared to other sources of water, DSW is rich in plant-beneficial minerals such as potassium, magnesium, calcium, and zinc (Emadodin et al. 2020).
Many studies have examined the effects of seawater on the physiological properties of crops, focusing on yield and quality, but few have investigated how seawater affects transcriptional change. Physiological changes in plants involve transcriptional regulation, so analyzing the transcriptome of seawater-treated crops could help predict potential physiological responses more precisely. In this study, RNA sequencing was carried out to investigate the transcriptome of L. stenocephala treated with different concentrations of DSW or a fertilizer made of filtered concentrate of DSW and additional nutrients (designated to SWF). The data generated from the RNA sequencing was run for gene ontology (GO) analysis, EggNOG analysis, KEGG metabolic pathway analysis, and SSR mining. Also, we identified the genes involved in the antioxidant and salt stress-related pathway of L. stenocephala and compared the expression of those genes across samples. The transcriptome of L. stenocephala is the first reference transcriptome, which not only enriches our understanding of the genetic framework of this plant but also lays the background for future studies. This paper serves as a vital stepping stone toward more comprehensive research and the eventual unlocking of the full potential of this underexplored species.
Seedlings of L. stenocephala were obtained from one-year-old roots. Six seedlings were transplanted to each pot (60 cm × 20 cm × 15 cm) filled with silt loam soil on March 20, 2023, and the pots were placed under shade for two weeks before DSW treatment. The experiment comprised five treatment groups: water with 5% or 10% DSW, 500X or 1000X of SWF, and control (NT). DSW and SWF treatments were applied to the seedlings every other week, and tap water was applied in between. Control seedlings were treated with tap water once a week during the experimental period. The experiment was conducted from April 3 to May 22, 2023, at Goseong Deep Sea Water Industry Foundation, Gangwon, Republic of Korea.
L. stenocephala leaves were collected, immediately snap-frozen in liquid nitrogen, and then kept at -70°C. Following the manufacturer’s instructions, total RNA was extracted from the leaves of NT, DSW5, DSW10, SWF500, and SWF1000 using GeneAll Ribospin plant reagent (GeneAll Biotechnology Co., Ltd., Seoul, South Korea). RNA quality was checked by 1% RNase-free agarose gel electrophoresis, and the purity was determined using the NanoPhotometer® spectrophotometer (IMPLEN, CA, USA). RNA integrity was assessed by running samples through the TapeStation RNA ScreenTape (Agilent, #5067-5576), and RNA concentration was determined using the Quant-iT™ RiboGreen RNA Assay (Invitrogen, cat. #R11490).
Messenger RNA was isolated from 1 µg of total RNA and sheared for library construction. First-strand cDNA was generated from the sheared mRNA fragments using SuperScript II reverse transcriptase and random primers (Invitrogen, #18064014). Adapters were ligated onto both ends of the cDNA fragments, among which 200-400 bp fragments were selected and used for paired-end sequencing on the Illumina NovaSeq 6000 system (Illumina, Inc., San Diego, CA, USA). For assembly, after quality control of the raw reads using FastQC v0.11.7, the adapter sequences, low-quality reads (< Q20), and reads shorter than 36 bp were removed using Trimmomatic v0.38 (Bolger et al. 2014). To obtain a reference transcriptome of L. stenocephala, all sequencing data were merged and assembled using the Trinity program (Trinity version trinityrnaseq_r20140717, bowtie 1.1.2). The CD-HIT program v4.6 was used for clustering transcripts into unigenes.
Unigene annotation was carried out using seven different databases: GO (v20180319), UniProt (2022_05), NCBI NR protein and NT nucleotide databases (20230102), Pfam (20160316), EggNOG (e5.proteomes), and KO_EUK (20230102). The reference transcriptome was used for differentially expressed gene analysis based on read counts generated by the RSEM program (RSEM version 1.2.31). Similarity searches against the seven databases were performed using BLASTN of NCBI BLAST and BLASTX of DIAMOND software with an E-value cutoff of 1.0E-5.
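The E-value cutoff applied to the similarity searches can be illustrated with a minimal sketch. It assumes the hits were exported in the 12-column tabular format (BLAST `-outfmt 6`, which DIAMOND also writes by default), where the 11th column holds the E-value. The function name `best_hits` and the example rows are hypothetical and are not taken from the pipeline used in this study.

```python
import io

EVALUE_CUTOFF = 1.0e-5  # cutoff stated in the text


def best_hits(tabular_lines, cutoff=EVALUE_CUTOFF):
    """Return {query_id: (subject_id, evalue)}, keeping the lowest E-value hit per query.

    Assumes 12 tab-separated columns: qseqid, sseqid, pident, length, mismatch,
    gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > cutoff:
            continue  # hit is weaker than the cutoff
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical example rows (identifiers are made up for illustration).
example = io.StringIO(
    "unigene_1\tprotA\t85.0\t120\t18\t0\t1\t360\t1\t120\t2e-40\t150\n"
    "unigene_1\tprotB\t60.0\t110\t44\t0\t1\t330\t5\t114\t1e-12\t70\n"
    "unigene_2\tprotC\t35.0\t40\t26\t0\t1\t120\t1\t40\t0.5\t25\n"
)
hits = best_hits(example)
print(hits)
```

In this toy input, `unigene_2` is left unannotated because its only hit exceeds the cutoff, which mirrors how unigenes without a sufficiently strong match remain unannotated.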
To evaluate the SSR composition of the L. stenocephala transcriptome, SSRs were identified using MISA (pgrc.ipk-gatersleben.de/misa), a tool for microsatellite identification. The criteria included a minimum motif repeat of four and a minimum length of 12 bp. RepeatMasker (v. 4.0.7) was run in default mode to examine the repetitive sequences in the transcriptome of L. stenocephala, with the reference library RepBaseRepeatMaskerEdition-20170127 (www.girinst.org).
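The SSR criteria stated above (at least four motif repeats and at least 12 bp in total) can be sketched with a regular expression. This is not MISA itself: it is a simplified illustration that only reports perfect repeats, assumes motif sizes of two to six nucleotides, and applies both thresholds literally. The function names and the test sequence are hypothetical.

```python
import re

MIN_REPEATS = 4            # minimum number of motif repeats (from the text)
MIN_LENGTH = 12            # minimum SSR length in bp (from the text)
MOTIF_SIZES = range(2, 7)  # assumption: di- to hexa-nucleotide motifs


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    for k in range(1, n):
        if n % k == 0 and motif[:k] * (n // k) == motif:
            return False
    return True


def find_perfect_ssrs(seq):
    """Return sorted (start, motif, repeats) tuples for perfect SSRs (0-based start)."""
    seq = seq.upper()
    found = []
    for size in MOTIF_SIZES:
        # Repeats needed to satisfy both thresholds (ceiling division for length).
        min_rep = max(MIN_REPEATS, -(-MIN_LENGTH // size))
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # avoid reporting 'ATAT' as a tetra-nucleotide SSR
                found.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(found)


# Hypothetical sequence: (AC)6 and (AAG)4 qualify, (AT)4 is too short (8 bp).
demo = "GG" + "AC" * 6 + "TTGCA" + "AAG" * 4 + "C" + "AT" * 4
ssrs = find_perfect_ssrs(demo)
print(ssrs)
```

Scanning each motif size separately and rejecting non-primitive motifs keeps a dinucleotide run from being counted again as a tetra- or hexa-nucleotide SSR.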
RNA sequencing of L. stenocephala produced a total of 155.8 million raw reads with a total length of about 15.7 Gb, from which 152.7 million clean reads (equivalent to about 15.3 Gb) were obtained after filtering out low-quality reads. The number of clean reads per transcriptome ranged from 27.7 (SWF1000) to 37.4 (NT) million, and the total length of clean reads per transcriptome was between 2.79 and 3.75 Gb (Table 1). The percentage of clean reads was higher than 97.9% in every sample. Because no reference transcriptome of L. stenocephala was available, the clean reads of all five transcriptomes were merged and used for de novo assembly to generate a reference transcriptome, resulting in 147,406 unigenes with a total length of 83,519,007 bp. The GC content of the reference transcriptome was 38.1% and the N50 was 776 bp (Table 2). Unigene length ranged from 201 to 14,728 bp with an average of 566.6 bp, and 13,787 unigenes were full length. Across the seven databases, a total of 67,592 (45.85%) unigenes were annotated, with 42.52% of the annotated unigenes showing similarity to proteins in the NCBI Non-Redundant Protein (NR) database (Fig. 1).
Table 1. Summary of RNA sequencing data of Ligularia stenocephala treated with deep seawater or a fertilizer derived from deep seawater.
Description of sequenced data | NT | DSW5 | DSW10 | SWF500 | SWF1000 |
---|---|---|---|---|---|
Total number of raw reads | 38,129,414 | 29,150,308 | 30,525,785 | 29,675,962 | 28,269,621 |
Total length of raw reads (bp) | 3,851,070,814 | 2,944,181,108 | 3,083,104,285 | 2,997,272,162 | 2,855,231,721 |
Total number of clean reads | 37,350,843 | 28,621,397 | 29,901,820 | 29,087,042 | 27,695,022 |
Total length of clean reads (bp) | 3,752,226,264 | 2,879,166,872 | 3,007,585,370 | 2,925,642,225 | 2,785,479,823 |
GC content of clean reads (%) | 43.12 | 42.95 | 43.58 | 43.6 | 43.65 |
Percentage of clean reads | 97.96% | 98.19% | 97.96% | 98.02% | 97.97% |
Number of mapped reads | 19,767,956 | 15,400,751 | 16,005,022 | 15,430,469 | 14,485,027 |
Overall mapping ratio (%) | 52.93% | 53.81% | 53.53% | 53.05% | 52.30% |
Q20 (%) | 98.64 | 98.66 | 98.63 | 98.62 | 98.63 |
Q30 (%) | 95.21 | 95.26 | 95.2 | 95.18 | 95.22 |
*SWF500 and SWF1000 indicate RNA sequencing results from Ligularia stenocephala treated with 500X and 1000X of a fertilizer comprising deep seawater and additional nutrients. DSW5 and DSW10 indicate treatments of water containing 5% and 10% of deep seawater.
Table 2. Summary of de novo assembly of Ligularia stenocephala reference transcriptome.
Assembly | No. of unigenes | GC content (%) | N50 | Longest contig (bp) | Shortest contig (bp) | Average contig length (bp) | Total assembled bases (bp) |
---|---|---|---|---|---|---|---|
Ligularia | 147,406 | 38.1 | 776 | 14,728 | 201 | 566.59 | 83,519,007 |
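For readers unfamiliar with the assembly statistics in Table 2, the following minimal sketch shows how N50 and average contig length are conventionally computed from a list of contig lengths. The toy lengths are hypothetical; only the totals used in the last lines (147,406 unigenes and 83,519,007 assembled bases) come from Table 2.

```python
def n50(lengths):
    """Length L of the contig at which the cumulative sum, taken from the
    longest contig downward, first reaches at least half of the total bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must not be empty")


def mean_length(lengths):
    """Average contig length in bp."""
    return sum(lengths) / len(lengths)


# Hypothetical toy assembly, for illustration only.
toy_lengths = [201, 300, 500, 776, 1200]
toy_n50 = n50(toy_lengths)
toy_mean = mean_length(toy_lengths)

# Average contig length implied by the totals reported in Table 2.
table2_mean = 83_519_007 / 147_406
print(toy_n50, round(toy_mean, 1), round(table2_mean, 2))
```

Dividing the total assembled bases by the number of unigenes reproduces the reported average contig length to two decimal places.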
All clean reads from each RNA sequencing dataset were mapped to the reference transcriptome of L. stenocephala. The number of mapped reads per dataset ranged from 14.5 (SWF1000) to 19.8 (NT) million, and the average mapping ratio across the five datasets was 53.1% (Table 1).
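The percentages in Table 1 can be rechecked from the read counts with a short sketch. It assumes that the percentage of clean reads is clean reads divided by raw reads and that the mapping ratio is mapped reads divided by clean reads; these definitions are not stated explicitly in the text but reproduce the tabulated values.

```python
# Read counts copied from Table 1: (raw reads, clean reads, mapped reads).
TABLE1 = {
    "NT": (38_129_414, 37_350_843, 19_767_956),
    "DSW5": (29_150_308, 28_621_397, 15_400_751),
    "DSW10": (30_525_785, 29_901_820, 16_005_022),
    "SWF500": (29_675_962, 29_087_042, 15_430_469),
    "SWF1000": (28_269_621, 27_695_022, 14_485_027),
}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


summary = {
    sample: {
        "clean_pct": percent(clean, raw),      # assumed: clean / raw
        "mapped_pct": percent(mapped, clean),  # assumed: mapped / clean
    }
    for sample, (raw, clean, mapped) in TABLE1.items()
}
mean_mapped_pct = sum(s["mapped_pct"] for s in summary.values()) / len(summary)

for sample, s in summary.items():
    print(f"{sample}: clean {s['clean_pct']:.2f}%, mapped {s['mapped_pct']:.2f}%")
print(f"mean mapping ratio: {mean_mapped_pct:.1f}%")
```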
GO functional analysis of L. stenocephala unigenes classified 108,046 unigenes into cellular components, 97,851 unigenes into biological processes, and 57,494 unigenes into molecular functions (Fig. 2). However, the number of unique unigenes, excluding unigenes involved in more than two sub-functional categories, was 39,599 (37%) for cellular components, 25,607 (26%) for biological processes, and 22,612 (39%) for molecular functions. In the molecular function category, most unigenes were classified into two subcategories: catalytic activity with 22,612 (39%) unigenes and binding with 20,083 (35%) (Fig. 2). In the biological process category, the top three subcategories were cellular process with 25,607 (26%) unigenes, metabolic process with 21,876 (22%), and biological regulation with 11,689 (12%). In the cellular component category, most unigenes were classified into four subcategories: cell part with 39,599 (37%) unigenes, organelle with 25,640 (24%), membrane with 11,042 (10%), and organelle part with 10,163 (9%) (Fig. 2).
EggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups) analysis classified 98,785 unigenes into 25 functional categories (Fig. 3), with the largest classification being “replication, recombination and repair” (12,238, 27.04%), followed by “transcription” (8,202, 8.30%), “posttranslational modification, protein turnover, chaperones” (6,386, 6.46%), “signal transduction mechanisms” (6,050, 6.12%), and “amino acid transport and metabolism” (4,262, 4.31%). The cell motility category was the smallest EggNOG classification, with 31 unigenes (0.03%).
All unigenes were submitted to the BlastKOALA tool (http://www.genome.jp/kegg) to assign KEGG orthology (KO) numbers, and 7,009 unigenes received KO numbers. The largest number of unigenes was categorized into metabolism, with 7,040 unigenes in 12 subcategories and 145 pathways, while environmental information processing contained the fewest associated unigenes, with 1,390 in 36 pathways (Table 3). The five pathways with the highest number of entry enzyme hits were metabolic pathways (map01100) with 886 entry enzymes, biosynthesis of secondary metabolites (map01110) with 460, microbial metabolism in diverse environments (map01120) with 142, amyotrophic lateral sclerosis (map05014) with 127, and pathways of neurodegeneration (map05022) with 121 (Fig. 4).
Table 3. Categories of KEGG metabolic pathways and number of unigenes of Ligularia stenocephala associated with each category.
Category | No. of subcategories | No. of associated pathways | No. of associated unigene entries | No. of associated genes |
---|---|---|---|---|
Metabolism | 12 | 145 | 3554 | 7040 |
Genetic information processing | 6 | 27 | 928 | 1529 |
Environmental information processing | 3 | 36 | 433 | 1390 |
Cellular processes | 5 | 35 | 668 | 1394 |
Organismal systems | 10 | 88 | 642 | 1926 |
Human Diseases | 12 | 89 | 1739 | 3697 |
The major naturally occurring antioxidants include ascorbic acid, α-tocopherol, glutathione, carotenoids, and flavonoids (Huchzermeyer et al. 2022). Several metabolic pathways, including the metabolism of terpenoids, biosynthesis of other secondary metabolites, and metabolism of cofactors and vitamins, are connected to the biosynthesis of antioxidants in plants. Examining the genes involved in these pathways is therefore essential for understanding the mechanisms underlying antioxidant biosynthesis in a specific plant species. In L. stenocephala, 22 unigenes were found to be involved in the carotenoid biosynthesis pathway (map 00906) with 16 entries. For flavonoid metabolism, 29 unigenes were associated with the flavonoid biosynthesis pathway (map 00941) with 14 entries. For ascorbic acid metabolism, the associated pathways include galactose metabolism (map 00052), in which we identified 39 unigenes with 15 entries, and ascorbate and aldarate metabolism (map 00053), with 41 associated unigenes and 20 entries (Table 4; Supplementary Table S1).
Table 4. KEGG metabolic pathways related to antioxidants and the number of associated unigenes in Ligularia stenocephala transcriptome.
Metabolic pathway | KEGG map ID | No. of entries | No. of associated unigenes |
---|---|---|---|
Metabolism of terpenoids | |||
Carotenoid biosynthesis | 00906 | 16 | 22 |
Monoterpenoid biosynthesis | 00902 | 2 | 9 |
Biosynthesis of other secondary metabolites | |||
Phenylpropanoid biosynthesis | 00940 | 16 | 76 |
Flavonoid biosynthesis | 00941 | 14 | 29 |
Flavone and flavonol biosynthesis | 00944 | 3 | 3 |
Metabolism of cofactors and vitamins | |||
Ubiquinone and other terpenoid-quinone biosynthesis | 00130 | 20 | 40 |
Metabolism of other amino acids | |||
Glutathione metabolism | 00480 | 19 | 65 |
Carbohydrate metabolism (ascorbic acid as an end product) | |||
Galactose metabolism | 00052 | 15 | 39 |
Ascorbate and aldarate metabolism | 00053 | 20 | 41 |
RepeatMasker (http://www.repeatmasker.org) was used to analyze the repetitive sequences in the transcriptome of L. stenocephala. It was found that repetitive and low-complexity sequences occupied 1,316,959 bp (1.58%) of the L. stenocephala transcriptome (Table 5). Most of the repetitive sequences were simple sequence repeats, with 27,627 elements occupying 1,111,801 bp (1.33%) (Table 5).
Table 5. Repetitive sequences in Ligularia stenocephala transcriptome (83,519,007 bp).
Type of repeats | No. of elements | Sequence length occupied (bp) | Percentage of sequence |
---|---|---|---|
Retroelements | 82 | 5606 | 0.01 |
SINEs | 10 | 1038 | 0.00 |
LINEs | 46 | 2535 | 0.00 |
LTR elements | 26 | 2033 | 0.00 |
DNA transposons | 61 | 8157 | 0.01 |
hobo-Activator | 58 | 7996 | 0.01 |
Tc1-IS630-Pogo | 2 | 115 | 0.00 |
Unclassified | 32 | 4490 | 0.01 |
Total interspersed repeats | | 18,253 | 0.02 |
Small RNA | 146 | 24,311 | 0.03 |
Satellite | 6 | 586 | |
Simple repeats | 27,627 | 1,111,801 | 1.33 |
Low complexity | 3,369 | 162,008 | 0.19 |
Total number of bases masked | | 1,316,959 | 1.58 |
The SSR search identified 35,280 perfect SSRs in 26,793 SSR-containing unigenes of L. stenocephala (Table 6). Among the SSR-containing unigenes, 6,134 had more than one SSR, of which 4,061 had compound SSRs (Table 6). The overall SSR frequency was 422.4 per one million base pairs (Mbp) of the L. stenocephala transcriptome. Trinucleotide SSRs were the most abundant, with 12,636 (35.82%) occurrences, followed by dinucleotide SSRs with 11,607 (32.90%) (Fig. 5a). The frequency of trinucleotide SSRs was 151.29 per Mbp, whereas pentanucleotide SSRs had the lowest frequency, 26.35 per Mbp (Fig. 5b). By individual motif, the number of SSR occurrences ranged from 4,239 for AC/GT, followed by 3,819 for AG/CT, down to 6 for CG/CG (Fig. 6a). Accordingly, the AC/GT motif had the highest frequency, with 50.75 occurrences per Mbp, followed by the AG/CT motif with 45.73 occurrences per Mbp (Fig. 6b; Table 6).
Table 6. Distribution and comparison of simple sequence repeats from Ligularia stenocephala transcriptome by repeat motifs. Columns 3 to ≥12 give the number of motif repeats.
Motif | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | ≥12 | Total | Occurrence per 1 Mbp |
---|---|---|---|---|---|---|---|---|---|---|---|---|
AC/GT | 0 | 0 | 2425 | 794 | 372 | 249 | 172 | 147 | 77 | 3 | 4239 | 50.8 |
AG/CT | 0 | 0 | 1862 | 675 | 391 | 285 | 246 | 250 | 106 | 4 | 3819 | 45.7 |
AT/AT | 0 | 0 | 1719 | 568 | 300 | 241 | 305 | 312 | 95 | 3 | 3543 | 42.4 |
CG/CG | 0 | 0 | 4 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 6 | 0.1 |
AAC/GTT | 0 | 1737 | 639 | 304 | 125 | 5 | 0 | 0 | 1 | 0 | 2811 | 33.7 |
AAG/CTT | 0 | 1374 | 345 | 161 | 80 | 5 | 0 | 0 | 0 | 0 | 1965 | 23.5 |
AAT/ATT | 0 | 950 | 243 | 129 | 103 | 10 | 0 | 1 | 0 | 0 | 1436 | 17.2 |
ACC/GGT | 0 | 1446 | 409 | 186 | 69 | 9 | 0 | 0 | 0 | 1 | 2120 | 25.4 |
ACG/CGT | 0 | 85 | 18 | 9 | 1 | 0 | 0 | 1 | 0 | 0 | 114 | 1.4 |
ACT/AGT | 0 | 141 | 38 | 16 | 7 | 0 | 0 | 0 | 0 | 0 | 202 | 2.4 |
AGC/CTG | 0 | 475 | 128 | 67 | 42 | 6 | 1 | 0 | 0 | 0 | 719 | 8.6 |
AGG/CCT | 0 | 259 | 81 | 24 | 14 | 5 | 0 | 1 | 0 | 0 | 384 | 4.6 |
ATC/ATG | 0 | 1541 | 492 | 314 | 178 | 7 | 0 | 0 | 0 | 0 | 2532 | 30.3 |
CCG/CGG | 0 | 286 | 48 | 13 | 6 | 0 | 0 | 0 | 0 | 0 | 353 | 4.2 |
Tetra- | 5381 | 628 | 161 | 35 | 0 | 0 | 1 | 1 | 0 | 0 | 6207 | 74.3 |
≥ penta- | 3966 | 778 | 60 | 13 | 6 | 4 | 1 | 1 | 1 | 0 | 4830 | 57.8 |
Total | 9347 | 9700 | 8672 | 3308 | 1696 | 826 | 726 | 714 | 280 | 11 | 35280 | 422.4 |
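The per-Mbp frequencies reported for the SSRs follow from dividing each count by the transcriptome size in Mbp. The sketch below redoes that arithmetic for a few rows, using counts from Table 6 and the total assembled length from Table 2; the function name `per_mbp` is only illustrative.

```python
TOTAL_BASES = 83_519_007  # total assembled bases (Table 2)


def per_mbp(count, total_bases=TOTAL_BASES):
    """Occurrences per one million base pairs of transcriptome."""
    return count / (total_bases / 1_000_000)


# SSR counts taken from Table 6 and the text.
counts = {
    "all SSRs": 35_280,
    "trinucleotide": 12_636,
    "AC/GT": 4_239,
    "AG/CT": 3_819,
}
frequencies = {name: per_mbp(n) for name, n in counts.items()}

for name, freq in frequencies.items():
    print(f"{name}: {freq:.2f} per Mbp")

# Share of trinucleotide SSRs among all SSRs, in percent.
tri_share = 100.0 * counts["trinucleotide"] / counts["all SSRs"]
print(f"trinucleotide share: {tri_share:.2f}%")
```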
Medicinal plants have long been recognized for their health benefits, particularly because of their bioactive compounds with antioxidant properties. These compounds play a crucial role in neutralizing free radicals and reducing oxidative stress, a key factor in the development of chronic diseases such as cardiovascular diseases, cancer, and neurodegenerative disorders (Sharifi-Rad et al. 2020). Several medicinal plants have been extensively studied for their potential antioxidant properties, and a variety of phytochemicals have been found to contribute to their overall antioxidant capacity. Antioxidants derived from medicinal plants are increasingly being explored for potential applications in nutraceuticals, functional foods, and pharmaceuticals (Sorrenti et al. 2023). However, without detailed genomic research, it is difficult to accurately identify the genes associated with plant antioxidant properties and the metabolic pathways involved.
Ligularia is a highly diverse genus, and species such as Ligularia dentata Hara have gained interest for their numerous beneficial natural products (Yaoita et al. 2012). Different organs and extracts of L. stenocephala and several other species of the genus have been explored for their biological properties and chemical components (Debnath et al. 2017; Nam and Lee 2013). However, none of these studies addressed the genomics of L. stenocephala, so very limited genomic resources are available for this species.
Seawater application can increase the production of bioactive compounds in medicinal plants, including antioxidants, phenolic compounds, flavonoids, and other secondary metabolites. In this study, RNA sequencing was performed on leaves of L. stenocephala from five treatment groups (NT, DSW5, DSW10, SWF500, and SWF1000). The five merged sequencing datasets comprised 155.8 million raw reads with a total length of 15.7 Gbp (Table 1). De novo assembly of the merged sequencing data generated a total of 147,406 unigenes, of which 67,592 (45.85%) were annotated. The GC content of the sequenced samples ranged from 42.95% to 43.65% (Table 1), which is higher than that of the transcripts of the chloroplast genome of L. stenocephala (Chen et al. 2018).
Among the identified unigenes of L. stenocephala, 7,009 were assigned KO numbers, and several genes associated with the five major natural antioxidant pathways were identified in this study (Tables 3 and 4; Supplementary Table S1; Supplementary Fig. S1, S2, S3, S4, S5). Numerous beneficial metabolites are produced via interconnected pathways for the biosynthesis of secondary metabolites in plants, and a number of these secondary metabolites possess noteworthy medicinal properties (Choi et al. 2024). It is therefore necessary to explore these metabolic pathways in L. stenocephala.
In plants, SSRs are highly useful for developing DNA markers for genetic studies. These markers have been widely used to study genetic relationships by cross-amplification within closely related species (Choi et al. 2024). In the genus Ligularia, SSR markers have been used to understand genetic variation and hybridization (Chen et al. 2018; Zhang et al. 2017). Other DNA markers have also been applied to examine genetic differences between L. fischeri and L. stenocephala, and barcode markers successfully distinguished the two Ligularia species (Choi et al. 2017). In this study, 35,280 perfect SSRs were identified in L. stenocephala, and the SSR frequency was 422.4 per Mbp (Table 6). Among the five types of SSRs identified, trinucleotide SSRs were the most prevalent, with 12,636 occurrences and a frequency of 151.29 per Mbp (Fig. 5a, b). This finding is consistent with previous reports that trinucleotide SSRs are the most abundant (Eum et al. 2019; Kotwal et al. 2016).
The transcriptome data of L. stenocephala contribute to the understanding of its genetic makeup and highlight the genes and pathways related to the antioxidant properties of this plant. The data can be used to develop new SSR markers and, more broadly, to identify the gene regulatory network of this species for a detailed understanding of its genomic features. The transcriptome data presented in this study will therefore be a valuable resource for genomic and genetic studies of L. stenocephala and will also serve as a basis for further genomic comparisons involving this species.
This work was supported by Goseong Deep Sea Water Industry Foundation, Gangwon, ROK.
The authors declare that there is no conflict of interest.
Raw sequencing data of the 5% and 10% DSW treatments can be accessed under BioProject accession number PRJNA1149490.