J Plant Biotechnol 2022; 49(1): 15-29
Published online March 31, 2022
https://doi.org/10.5010/JPB.2022.49.1.015
© The Korean Society of Plant Biotechnology
Correspondence to: e-mail: ckkim@knu.ac.kr, queen@sunchon.ac.kr
This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Cyclophilins (CYPs) are highly conserved, ubiquitous proteins that belong to the peptidyl prolyl cis/trans isomerase (PPIase) superfamily. They are present in a wide range of organisms and contain a highly conserved PPIase domain. A comprehensive database survey identified a total of 35 CYP genes in Solanum lycopersicum L.; the encoded proteins are localized in all cellular compartments, but largely in the cytosol. Sequence alignment and conserved motif analyses of the SlCYP proteins revealed a highly conserved cyclophilin-like domain (CLD). Evolutionary analysis predicted the clustering of a large number of gene pairs with high sequence similarity. Expression analysis using RNA-Seq data showed that the majority of the SlCYP genes were more highly expressed in mature leaves and blooming flowers than in other organs. This study provides a basis for the future functional characterization of individual CYP genes to elucidate their role(s) in protein refolding and long-distance signaling in tomato and in plant biology in general.
Keywords: Genome-wide analysis, cyclophilins, Solanum lycopersicum L., PPIase domain, expression profiles, evolutionary relation
Cyclophilins (CYPs) are highly conserved, ubiquitous proteins found in all types of living organisms, including bacteria, fungi, mammals, plants, and insects (Galat 1999). Cyclophilin was first identified in 1984 as a receptor for the drug cyclosporin A in mammalian cells (Handschumacher et al. 1984). In plants, the
CYPs are a prominent and abundant class of proteins involved in various fundamental biological processes, including signalling, protein folding, trafficking, transcription, RNA binding, apoptosis, and pathogen and stress responses (Allain et al. 1994; Anderson et al. 2002; Aumüller et al. 2010; Baker et al. 1994; Brazin et al. 2002; Bukrinsky 2002; Dubourg et al. 2004; Jing et al. 2015; Kern et al. 1995; Klappa et al. 1995; Krzywicka et al. 2001; Li et al. 2007; Lin and Lechleiter 2002; Pogorelko et al. 2014; Schiene-Fischer and Yu 2001; Zander et al. 2003). Rice OsCYP2 participates in auxin signaling pathways by interacting with a zinc finger protein (OsZEP) that controls lateral root development (Cui et al. 2017). Soybean GmCYP1 interacts with GmMYB176, an isoflavonoid regulator that is stimulated by several abiotic stresses (Mainali et al. 2017). Two
Genome-wide studies of
Tomato
Protein sequences were aligned using the Genedoc multiple sequence alignment tool (http://www.nrbsc.org/gfx/genedoc/ebinet.htm). The phylogenetic tree was calculated from the Clustal Omega multiple alignment and subsequently constructed in MEGA 6.0 using the Neighbor-Joining (NJ) method (Tamura et al. 2013).
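Neighbor-Joining works from a matrix of pairwise distances between the aligned sequences. The following is a minimal sketch of how such a matrix could be computed with a simple p-distance (the fraction of differing residues over ungapped columns). The distance model actually used in MEGA is not stated here, so the p-distance, the function names, and the toy sequences are all assumptions made for illustration only.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Fraction of differing residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(alignment: dict) -> dict:
    """Pairwise p-distances keyed by (name_1, name_2), names in sorted order."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment, not real SlCYP sequences.
toy_alignment = {
    "seqA": "MVNPKVFFDM",
    "seqB": "MVNPRVFFDM",
    "seqC": "MANP-VYFDL",
}

matrix = distance_matrix(toy_alignment)
for pair, dist in matrix.items():
    print(pair, round(dist, 3))
```

A tree-building program would take a matrix like this as input; the sketch stops at the distances and does not implement the joining step itself.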
Start and end positions of
Synteny analysis of
The expression of 35 SlCYP (species
String web-based software (https://string-db.org/) was used to predict the protein interactions of tomato cyclophilin proteins and the
A total of 37 tomato cyclophilin sequences syntenic to
Table 1. All the identified CYP family members in the tomato plant and their nomenclature, locus name, molecular weight, protein sequence length, chromosomal location, subcellular localization, theoretical isoelectric point, and predicted exons.
Locus name | ORF (bp) | Chromosome location | Exons | Protein length (aa) | MW (kDa) | pI | Domain information | CLD position | Subcellular localization
---|---|---|---|---|---|---|---|---|---
Solyc12g038110 | 186 | SL4.0ch12:48323666..48323851 | 1 | 61 | 7.169 | 9.30 | SD | 2-61 | Extracellular
Solyc12g089200 | 228 | SL4.0ch12:63881747..63881974 | 1 | 75 | 8.149 | 5.39 | SD | 1-75 | Nuclear
Solyc12g038030 | 282 | SL4.0ch12:48064472..48064862 | 2 | 93 | 10.549 | 9.12 | SD | 2-52 | Mitochondrial
Solyc11g006070 | 462 | SL4.0ch11:904408..904869 | 1 | 153 | 16.489 | 6.06 | SD | 1-149 | Cytoplasmic
Solyc08g006090 | 483 | SL4.0ch08:848486..854332 | 4 | 160 | 17.530 | 7.01 | SD | 2-153 | Cytoplasmic
Solyc10g054910 | 519 | SL4.0ch10:55122558..55123076 | 1 | 172 | 17.873 | 8.59 | SD | 7-170 | Cytoplasmic
Solyc01g111170 | 516 | SL4.0ch01:89876816..89877826 | 1 | 171 | 17.910 | 8.83 | SD | 1-164 | Cytoplasmic
Solyc09g010190 | 495 | SL4.0ch09:3614647..3619878 | 6 | 164 | 18.175 | 8.58 | SD | 11-162 | Cytoplasmic
Solyc12g038070 | 477 | SL4.0ch12:48149484..48150037 | 2 | 158 | 18.403 | 6.72 | SD | 2-74 | Nuclear
Solyc12g038150 | 498 | SL4.0ch12:48601797..48602468 | 2 | 165 | 18.894 | 4.72 | SD | 1-97 | Nuclear
Solyc01g096520 | 573 | SL4.0ch01:79859565..79864679 | 7 | 190 | 20.510 | 8.42 | SD | 26-189 | Cytoplasmic
Solyc06g076970 | 624 | SL2.50ch06:47833488..47837300 | 7 | 207 | 22.249 | 9.19 | SD | 41-204 | Cytoplasmic
Solyc12g038010 | 594 | SL4.0ch12:48040757..48041494 | 3 | 197 | 22.340 | 5.17 | SD | 35-125 | Cytoplasmic
Solyc06g051650 | 678 | SL4.0ch06:32962272..32966959 | 8 | 225 | 24.469 | 8.91 | SD | 59-222 | Cytoplasmic
Solyc12g038000 | 645 | SL2.50ch12:49574555..49575486 | 4 | 214 | 24.485 | 5.15 | MD | 156-211 | Nuclear
Solyc01g111360 | 687 | SL4.0ch01:89991208..89995745 | 7 | 228 | 24.933 | 6.65 | SD | 49-213 | Mitochondrial
Solyc01g010590 | 687 | SL4.0ch01:5632492..5638269 | 8 | 228 | 25.763 | 8.68 | SD | 35-192 | Cytoplasmic
Solyc10g083930 | 693 | SL4.0ch10:62791267..62795709 | 7 | 230 | 26.055 | 9.30 | SD | 76-226 | Mitochondrial
Solyc01g009990 | 747 | SL4.0ch01:4610206..4614220 | 6 | 248 | 26.535 | 9.20 | SD | 84-244 | Chloroplast
Solyc09g008410 | 711 | SL4.0ch09:1895141..1903823 | 7 | 236 | 26.881 | 6.76 | SD | 82-232 | Cytoplasmic
Solyc07g007110 | 894 | SL4.0ch07:1829661..1833568 | 2 | 297 | 32.380 | 8.68 | SD | 89-252 | Chloroplast
Solyc02g061800 | 882 | SL4.0ch02:31311295..31312755 | 2 | 293 | 33.831 | 4.76 | MD | 2-165 | Cytoplasmic
Solyc03g119860 | 954 | SL4.0ch03:62855114..62856604 | 2 | 317 | 34.575 | 8.74 | SD | 97-288 | Chloroplast
Solyc08g077790 | 1032 | SL4.0ch08:59811302..59816547 | 5 | 343 | 37.275 | 5.15 | MD | 167-323 | Extracellular
Solyc02g090480 | 1086 | SL4.0ch02:50052680..50058430 | 8 | 362 | 40.293 | 5.66 | MD | 7-172 | Cytoplasmic
Solyc01g108340 | 1089 | SL4.0ch01:87996043..88000791 | 8 | 362 | 40.347 | 6.05 | MD | 7-172 | Cytoplasmic
Solyc12g049430 | 1149 | SL4.0ch12:60709814..60710962 | 1 | 382 | 43.879 | 5.59 | MD | 2-164 | Nuclear
Solyc12g013580 | 1356 | SL4.0ch12:4454376..4461071 | 12 | 451 | 49.078 | 5.95 | SD | 277-443 | Chloroplast
Solyc02g086910 | 1356 | SL4.0ch02:47512351..47516135 | 7 | 451 | 49.287 | 5.00 | SD | 151-308 | Chloroplast
Solyc08g062700 | 1479 | SL4.0ch08:49890458..49910307 | 10 | 492 | 54.985 | 8.40 | SD | 14-168 | Nuclear
Solyc02g092380 | 1791 | SL4.0ch02:51497813..51503293 | 11 | 596 | 65.980 | 7.29 | MD | 260-443 | Cytoplasmic
Solyc07g066420 | 1764 | SL4.0ch07:67698634..67708435 | 14 | 587 | 68.095 | 5.87 | MD | 2-161 | Nuclear
Solyc11g067090 | 1869 | SL4.0ch11:50864889..50872734 | 13 | 622 | 70.038 | 6.62 | MD | 468-619 | Cytoplasmic
Solyc09g065720 | 1983 | SL4.0ch09:60106675..60114638 | 13 | 660 | 73.006 | 10.69 | SD | 10-174 | Nuclear
Solyc08g067090 | 2430 | SL4.0ch08:54099820..54111045 | 13 | 809 | 91.028 | 11.59 | SD | 9-175 | Nuclear
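The ORF and protein length columns of Table 1 can be cross-checked against each other. The sketch below assumes that the ORF column counts nucleotides and includes the stop codon, so that the expected protein length is ORF/3 - 1; this reading of the columns is an assumption, as is the helper name. Only four rows of the table are copied in as sample input.

```python
def expected_protein_length(orf_bp: int) -> int:
    """Protein length implied by an ORF that includes its stop codon (assumption)."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_bp // 3 - 1


# (locus name, ORF length, protein length in aa), copied from Table 1.
rows = [
    ("Solyc12g038110", 186, 61),
    ("Solyc11g006070", 462, 153),
    ("Solyc02g090480", 1086, 362),
    ("Solyc01g108340", 1089, 362),
]

# Collect rows whose listed protein length differs from the implied one.
mismatches = [
    (locus, orf, aa, expected_protein_length(orf))
    for locus, orf, aa in rows
    if expected_protein_length(orf) != aa
]

for locus, orf, aa, expected in mismatches:
    print(f"{locus}: ORF {orf} bp implies {expected} aa, table lists {aa} aa")
```

Rows flagged by a check like this are not necessarily wrong, since the assumption about the stop codon may not hold for every entry, but they may merit a second look against the source database.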
Cyclophilin-like domains (CLDs) are highly conserved among the members of the cyclophilin family. The majority of the identified sequences had full-length CLDs, but some of the CYPs (e.g., SlCYP1, SlCYP2, and SlCYP3) contained partial CLDs that were missing one or two essential residues or a complete secondary structure element, and thus might lack PPIase activity (Fig. 1).
Figure 1 shows the conserved sequences of all identified tomato CLDs. The CLD sequences of the SlCYPs were aligned with the secondary structure of human cyclophilin A (hCYPA) as an external reference (Fig. 1). hCYPA, often referred to as the “archetypal” CYP, consists of a β-barrel (eight antiparallel strands, β1 - β8) and two α-helices, one at the top and one at the bottom (Fig. 1).
Amino acid residues present in the structure are highly conserved and important for CYP function, whereas gaps in the conserved regions denote insertions in individual members of this family. Exceptional insertions and deletions of amino acids were observed in plant CYPs in this study (Fig. 1, Supplementary Fig. 1).
Insertions of 8 to 11 amino acids were observed between α-helix-I and β-sheet-III in SlCYP8-1, SlCYP8-1, SlCYP9-1, SlCYP16-1, SlCYP16-2, SlCYP23, and SlCYP24 (Fig. 1). Within the β-V and β-VI junction, SlCYP12, SlCYP14, SlCYP15, and SlCYP18-1 possessed insertions of 3-10 amino acids. Romano et al. (2004) described a similar insertion of 8 to 11 amino acids between α-helix-I and β-sheet-III in several AtCYPs. Deletions of amino acids were also observed in some cases, for example in SlCYP12, SlCYP14, SlCYP15, and SlCYP24 (Fig. 1).
The amino acid residues that are essential for the structure and function of CYPs are highly conserved in hCYPA and other species. Among the conserved essential amino acids, W121 (W158 in Fig. 1) was present in 11 of the 35 SlCYP proteins. R55 (R67 in Fig. 1), F60 (F72 in Fig. 1), and H126 (H117 in Fig. 1), the other highly conserved fundamental amino acids, were present in 18, 22, and 14 SlCYP proteins, respectively (Fig. 1). Another motif, VXGXV, has been reported to be highly conserved in all AtCYP proteins; 17 of the 35 SlCYP proteins contained this motif (Fig. 1).
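The presence of a short motif such as VXGXV, where X stands for any residue, can be checked with a regular expression. The following is a minimal sketch of such a scan; the sequences and names are hypothetical placeholders rather than real SlCYP sequences, and the study's own motif analysis may have been done differently.

```python
import re

# V, any residue, G, any residue, V
VXGXV = re.compile(r"V.G.V")


def motif_positions(sequence: str) -> list:
    """1-based start positions of non-overlapping VXGXV matches."""
    return [match.start() + 1 for match in VXGXV.finditer(sequence.upper())]


# Hypothetical toy sequences used only to exercise the function.
toy_proteins = {
    "toy_with_motif": "MKVAGQVLT",
    "toy_without_motif": "MKLAGQILT",
}

hits = {name: motif_positions(seq) for name, seq in toy_proteins.items()}
n_with_motif = sum(1 for positions in hits.values() if positions)
print(hits, n_with_motif)
```

Counting the proteins with at least one hit gives the kind of tally reported above (for example, the number of proteins that contain the motif out of the total examined).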
An alignment of the full-length SlCYP protein sequences with previously characterized CYPs from several plants and from human revealed considerable sequence identity among CYP proteins (Supplementary Fig. 1). The arginine (R), phenylalanine (F), and histidine (H) residues that are critical for PPIase activity, corresponding to positions 62, 67, and 133 in AtCyp19-3, were present in most of the SlCYP proteins (Supplementary Fig. 1). The tryptophan (W121) required for CsA binding was present in ten of the thirty-five SlCYPs (Supplementary Fig. 1). Our observations are consistent with previous reports in other species (Romano et al. 2004).
Most of the SlCYPs (26 of 35), which had comparatively low molecular weights, encoded proteins with a single cyclophilin-like domain (SD) (Fig. 2; Table 1, Supplementary Table 1). The remaining 9 SlCYPs, with comparatively high molecular weights, were multi-domain (MD) proteins, i.e., they contained the CLD together with other functional domains such as TPR, RRM_SF, PAN_A, RING, RIN, WD40, and zinc finger domains (Fig. 2; Table 1, Supplementary Table 1). In most cases, two SlCYP homologs with high sequence identity (80-86%) correspond to each AtCYP; in only a few cases is there a single corresponding SlCYP homolog (Supplementary Table 2). For example, SlCYP5-1 which is one of the
As shown in Figure 2, the MD CYPs SlCYP16-1 and SlCYP16-2 are characterized by a TPR motif in addition to the CLD. SlCYP16-1 and SlCYP16-2, the homologs of AtCYP40, showed 68.3% and 76.4% identity to it, respectively (Supplementary Table 2). SlCYP22 contained a tryptophan-aspartic acid (WD40) domain at the N-terminus and showed 84.3% sequence identity to its
A variable exon-intron distribution was observed among the members of
Phylogenetic analysis of tomato,
However, the bootstrap value of the parent nodes as well as the nodes of the individual clades was found to be very low for some members (Fig. 4). According to the phylogenetic analysis all proteins from tomato,
As previously described for
Genomic distribution of
The duplication analysis of
Microsynteny analysis of 95
The expression data represent RNA-seq reads from various tissues across several stages: seed development (cotyledon, hypocotyl), vegetative (roots, young leaves, mature leaves, vegetative meristem), reproductive (young floral bud, anthesis flower), and fruit tissues {10 days post anthesis fruit 1 (10 DPA1), 10 days post anthesis fruit 2 (10 DPA2), 20 days post anthesis fruit (20 DPA), ripening fruit (33 DPA)}. The expression pattern indicates tissue-specific expression of most of the
Protein interaction network of tomato cyclophilin proteins was constructed with the
Cyclophilins are ubiquitous proteins that are found in a wide range of organisms and contain the conserved peptidyl prolyl cis-trans isomerase (PPIase) domain (Gasser et al. 1990). In this study, all 35 identified tomato cyclophilin proteins had the conserved PPIase domain, but considerable variation was observed in their molecular weight, isoelectric point, and sequence at the N- and C-terminal regions (Wang and Heitman 2005). The amino acid residues that are essential for the structure and function of CYPs are highly conserved in hCYPA. It has been demonstrated previously that W121 (W158 in Fig. 1) of hCYPA is essential for cyclosporin A (CsA) binding but not for PPIase activity (Liu et al. 1991; Zydowsky et al. 1992). Further, R55 (R67 in Fig. 1), F60 (F72 in Fig. 1), and H126 (H117 in Fig. 1) are highly conserved amino acids that are fundamental to the PPIase activity of hCYPA (Zydowsky et al. 1992).
Because some of the SlCYP proteins did not have all three of these essential amino acids, whether and how they retain PPIase activity is a subject for further investigation. The cyclophilin protein sequences were most divergent at the N- and C-terminal ends (Supplementary Fig. 1). The most conserved sequence was found in the central region, which contains the highly conserved PPIase domain, supporting a previous observation (Wang and Heitman 2005).
Cyclophilins are grouped into single-domain (SD) and multi-domain (MD) forms based on the domains present within the protein (He et al. 2004). Other than the cyclophilin domain, several tomato cyclophilins possessed additional domains that are involved in mRNA splicing, RNA stabilization and processing, and cell morphogenesis. AtCYP18-2 has been shown to be involved in the regulation of pre-mRNA splicing, and because of the high homology, its tomato homolog SlCYP6-1 might play a similar role in the nucleus of tomato cells (Romano et al. 2004). TPR motifs mediate protein-protein interactions and help in the assembly of multi-protein complexes (Blatch and Lässle 1999). AtCYP40, with its TPR motif, forms a complex with small RNA duplex-bound AGO1 (ARGONAUTE family protein 1) and HSP90 (heat shock protein 90) (Earley and Poethig 2011; Iki et al. 2012). The complex of AtCYP40 and HSP90-bound AGO1 plays a unique and important role in the assembly of the plant RISC (RNA-induced silencing complex), and the activity of the CYP40-HSP90 complex in facilitating RISC assembly is conserved between different species. Therefore, a similar function can be expected for the AtCYP40 homologs SlCYP16-1 and SlCYP16-2. The WD40 repeat usually forms a four-stranded anti-parallel β-sheet, and multiple copies of these sheets build a circular β-propeller structure that promotes protein-protein interactions. SlCYP22 showed 84.3% sequence identity to its
Proteins that contain an RRM in addition to the CLD are called cyclophilin-RNA interacting proteins (CRIPs) (Krzywicka et al. 2001). SlCYP9-2, SlCYP13, SlCYP17, and SlCYP21 contain an RNA recognition motif (RRM) in addition to the CLD. Furthermore, AtCYP59 in
Introns are a major component of eukaryotic genomes, and intron size correlates with genome size, suggesting that part of the evolution of genome size occurs within genes (McLysaght et al. 2000). Intron size also varies substantially between species, within species, among different genes, and even within a single gene, which may reflect different functional properties relevant to the evolution of genomic and phenotypic traits (Zhang and Edwards 2012). Intron size varied considerably among the different
The putative motif distribution of
The phylogenetic analysis showed that the three clades included genes from both monocotyledons and dicotyledons, indicating that the
Microsynteny analysis can be used to infer the location of both orthologous and paralogous genes based on whole-genome data from different species (Cao et al. 2016; Lin et al. 2014). These results suggest that
RNAseq expression analysis of the tomato
The role of cyclophilins in the regulation of different aspects of plant growth and development has been demonstrated by various recent studies. AtCYP19-1 (ROC3) was implicated in seed development, AtCYP71, resulted in compromised lateral organ formation and apical meristem activity, AtCYP40 was identified as a regulator of vegetative growth in
The present study performed a genome-wide identification of CYPs in an important vegetable crop
This work was carried out with the support of Sunchon National University Research Fund in 2021 (Grant number: 2021-0293) and “Cooperative Research Program for Agriculture Science & Technology Development (Project No. PJ01485801)” Rural Development Administration, Republic of Korea.
Khadiza Khatun ·Arif Hasan Khan Robin ·Md. Rafiqul Islam·Subroto Das Jyoti ·Do-Jin Lee · Chang Kil Kim·Mi-Young Chung
Department of Biotechnology, Patuakhali Science and Technology University, Dumki, Patuakhali, Bangladesh-8602
Department of Genetics and Plant Breeding, Bangladesh Agricultural University, Mymensingh, Bangladesh-2202
Department of Biotechnology, Sher-e-Bangla Agricultural University, Dhaka, Bangladesh
Department of Horticultural Science, Kyungpook National University, Daegu, South Korea
Department of Agricultural Education, Sunchon National University, Suncheon, South Korea
Correspondence to:e-mail: ckkim@knu.ac.kr, queen@sunchon.ac.kr
This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Cyclophilins (CYPs) are highly conserved ubiquitous proteins belong to the peptidyl prolyl cis/trans isomerase (PPIase) superfamily. These proteins are present in a wide range of organisms; they contain a highly conserved peptidylprolyl cis/trans isomerase domain. A comprehensive database survey identified a total of 35 genes localized in all cellular compartments of Solanum lycopersicum L., but largely in the cytosol. Sequence alignment and conserved motif analyses of the SlCYP proteins revealed a highly conserved CLD motif. Evolutionary analysis predicted the clustering of a large number of gene pairs with high sequence similarity. Expression analysis using the RNA-Seq data showed that the majority of the SlCYP genes were highly expressed in mature leaves and blooming flowers, compared with their expression in other organs. This study provides a basis for the functional characterization of individual CYP genes in the future to elucidate their role(s) in protein refolding and long-distance signaling in tomatoes and in plant biology, in general.
Keywords: Genome-wide analysis, cyclophilins, Solanum lycopersicum L., PPIase domain, expression profiles, evolutionary relation
Cyclophilins (CYPs), are highly conserved ubiquitous proteins found in all types of living organism including bacteria, fungi, mammals, plants and insects (Galat 1999). Cyclophilin was first identified in 1984 as a receptor of drug cyclosporin A in mammalian cells (Handschumacher et al. 1984). In plant, the
CYPs are a prominent and abundant class of proteins involved in various fundamental biological process including signalling, protein folding, trafficking, transcription, RNA-binding, apoptosis, pathogen and plant stress responses (Allain et al. 1994; Anderson et al. 2002; Aumüller et al. 2010; Baker et al. 1994; Brazin et al. 2002; Bukrinsky 2002; Dubourg et al. 2004; Jing et al. 2015; Kern et al. 1995; Klappa et al. 1995; Krzywicka et al. 2001; Li et al. 2007; Lin and Lechleiter 2002; Pogorelko et al. 2014; Schiene-Fischer and Yu 2001; Zander et al. 2003). Rice OsCYP2 participate in auxin signaling pathways by interacting with a zinc finger protein (OsZEP) that controls lateral root development (Cui et al. 2017). Soybean GmCYP1 interacts with GmMYB176, an isoflavonoid regulator which is stimulated by several abiotic stresses (Mainali et al. 2017). Two
Genome-wide studies of
Tomato
Protein sequences were aligned using Genedoc (http://www.nrbsc.org/gfx/genedoc/ebinet.htm) multiple sequence alignment tool. The phylogenetic tree was calculated by using the multiple alignment from ClustalOmega, and subsequently processed with MEGA 6.0 in the Neighbor-Joining (NJ) algorithm method (Tamura et al. 2013).
Start and end positions of
Synteny analysis of
The expression of 35 SlCYP (species
String web-based software (https://string-db.org/) was used to predict the protein interactions of tomato cyclophilin proteins and the
A total of 37 tomato cyclophilin sequences syntenic to
Table 1 . All the identified CYP family members in the tomato plant and their nomenclature, locus name, molecular weight, protein sequence length, chromosomal location, subcellular localization, theoretical isoelectric point, and predicted exons.
Gene name | Locus name | ORF | Chromosome location | Exon | Protein | |||||
---|---|---|---|---|---|---|---|---|---|---|
Length (aa) | MW (kDa) | pI | Domain information | CLD position | Subcellular localization | |||||
>Solyc12g038110 | 186 | SL4.0ch12:48323666..48323851 | 1 | 61 | 7.169 | 9.30 | SD | 2-61 | Extracellular | |
>Solyc12g089200 | 228 | SL4.0ch12:63881747..63881974 | 1 | 75 | 8.149 | 5.39 | SD | 1-75 | Nuclear | |
>Solyc12g038030 | 282 | SL4.0ch12:48064472..48064862 | 2 | 93 | 10.549 | 9.12 | SD | 2-52 | Mitochondrial | |
>Solyc11g006070 | 462 | SL4.0ch11:904408..904869 | 1 | 153 | 16.489 | 6.06 | SD | 1-149 | Cytoplasmic | |
>Solyc08g006090 | 483 | SL4.0ch08:848486..854332 | 4 | 160 | 17.530 | 7.01 | SD | 2-153 | Cytoplasmic | |
>Solyc10g054910 | 519 | SL4.0ch10:55122558..55123076 | 1 | 172 | 17.873 | 8.59 | SD | 7-170 | Cytoplasmic | |
>Solyc01g111170 | 516 | SL4.0ch01:89876816..89877826 | 1 | 171 | 17.910 | 8.83 | SD | 1-164 | Cytoplasmic | |
>Solyc09g010190 | 495 | SL4.0ch09:3614647..3619878 | 6 | 164 | 18.175 | 8.58 | SD | 11-162 | Cytoplasmic | |
>Solyc12g038070 | 477 | SL4.0ch12:48149484..48150037 | 2 | 158 | 18.403 | 6.72 | SD | 2-74 | Nuclear | |
>Solyc12g038150 | 498 | SL4.0ch12:48601797..48602468 | 2 | 165 | 18.894 | 4.72 | SD | 1-97 | Nuclear | |
>Solyc01g096520 | 573 | SL4.0ch01:79859565..79864679 | 7 | 190 | 20.510 | 8.42 | SD | 26-189 | Cytoplasmic | |
>Solyc06g076970 | 624 | SL2.50ch06:47833488..47837300 | 7 | 207 | 22.249 | 9.19 | SD | 41-204 | Cytoplasmic | |
>Solyc12g038010 | 594 | SL4.0ch12:48040757..48041494 | 3 | 197 | 22.340 | 5.17 | SD | 35-125 | Cytoplasmic | |
>Solyc06g051650 | 678 | SL4.0ch06:32962272..32966959 | 8 | 225 | 24.469 | 8.91 | SD | 59-222 | Cytoplasmic | |
>Solyc12g038000 | 645 | SL2.50ch12:49574555..49575486 | 4 | 214 | 24.485 | 5.15 | MD | 156-211 | Nuclear | |
>Solyc01g111360 | 687 | SL4.0ch01:89991208..89995745 | 7 | 228 | 24.933 | 6.65 | SD | 49-213 | Mitochondrial | |
>Solyc01g010590 | 687 | SL4.0ch01:5632492..5638269 | 8 | 228 | 25.763 | 8.68 | SD | 35-192 | Cytoplasmic | |
>Solyc10g083930 | 693 | SL4.0ch10:62791267..62795709 | 7 | 230 | 26.055 | 9.30 | SD | 76-226 | Mitochondrial | |
>Solyc01g009990 | 747 | SL4.0ch01:4610206..4614220 | 6 | 248 | 26.535 | 9.20 | SD | 84-244 | Chloroplast | |
>Solyc09g008410 | 711 | SL4.0ch09:1895141..1903823 | 7 | 236 | 26.881 | 6.76 | SD | 82-232 | Cytoplasmic | |
>Solyc07g007110 | 894 | SL4.0ch07:1829661..1833568 | 2 | 297 | 32.380 | 8.68 | SD | 89-252 | Chloroplast | |
>Solyc02g061800 | 882 | SL4.0ch02:31311295..31312755 | 2 | 293 | 33.831 | 4.76 | MD | 2-165 | Cytoplasmic | |
>Solyc03g119860 | 954 | SL4.0ch03:62855114..62856604 | 2 | 317 | 34.575 | 8.74 | SD | 97-288 | Chloroplast | |
>Solyc08g077790 | 1032 | SL4.0ch08:59811302..59816547 | 5 | 343 | 37.275 | 5.15 | MD | 167-323 | Extracellular | |
>Solyc02g090480 | 1086 | SL4.0ch02:50052680..50058430 | 8 | 362 | 40.293 | 5.66 | MD | 7-172 | Cytoplasmic | |
>Solyc01g108340 | 1089 | SL4.0ch01:87996043..88000791 | 8 | 362 | 40.347 | 6.05 | MD | 7-172 | Cytoplasmic | |
>Solyc12g049430 | 1149 | SL4.0ch12:60709814..60710962 | 1 | 382 | 43.879 | 5.59 | MD | 2-164 | Nuclear | |
>Solyc12g013580 | 1356 | SL4.0ch12:4454376..4461071 | 12 | 451 | 49.078 | 5.95 | SD | 277-443 | Chloroplast | |
>Solyc02g086910 | 1356 | SL4.0ch02:47512351..47516135 | 7 | 451 | 49.287 | 5.00 | SD | 151-308 | Chloroplast | |
>Solyc08g062700 | 1479 | SL4.0ch08:49890458..49910307 | 10 | 492 | 54.985 | 8.40 | SD | 14-168 | Nuclear | |
>Solyc02g092380 | 1791 | SL4.0ch02:51497813..51503293 | 11 | 596 | 65.980 | 7.29 | MD | 260-443 | Cytoplasmic | |
>Solyc07g066420 | 1764 | SL4.0ch07:67698634..67708435 | 14 | 587 | 68.095 | 5.87 | MD | 2-161 | Nuclear | |
>Solyc11g067090 | 1869 | SL4.0ch11:50864889..50872734 | 13 | 622 | 70.038 | 6.62 | MD | 468-619 | Cytoplasmic | |
>Solyc09g065720 | 1983 | SL4.0ch09:60106675..60114638 | 13 | 660 | 73.006 | 10.69 | SD | 10-174 | Nuclear | |
>Solyc08g067090 | 2430 | SL4.0ch08:54099820..54111045 | 13 | 809 | 91.028 | 11.59 | SD | 9-175 | Nuclear |
CLDs are highly conserved among the members of Cyclophilin. Majority of the identified sequences have full length CLDs but some of the CYPs (e.g. SlCYP1, SlCYP2, and SlCYP3) contained partial CLDs, missing one or two essential residues or complete secondary structure and thus might be lacking PPIase activity (Fig. 1).
Figure 1 represents the conserved sequence of all identified tomato CLDs. CLD sequences of SlCYP aligned with the secondary structure of human cyclophilin A (hCYPA) as an external reference (Fig. 1). The hCYPA often referred to as the “archetypal” CYP consists of a β-barrel (eight antiparallel strands, β1 - β8) and two α-helices (on the top and bottom), respectively (Fig. 1).
Amino acid residues present in the structure are highly conserved and important for CYP function whereas gaps in the conserved regions denote insertions in individual residues of this family Exceptional insertion and deletion of amino acids were observed in plant CYPs in this study (Fig. 1, Supplementary Fig. 1).
Amino acid insertions from 8 to 11 molecules were observed in SlCYP8-1, SlCYP8-1, SlCYP9-1, SlCYP16-1, SlCYP16-2, SlCYP23, SlCYP24 in between α-helix-I and β-sheet-III (Fig. 1). Within the β-V and β-VI junction, SlCYP12, SlCYP14, SlCYP15, and SlCYP18-1 possessed 3-10 amino acids insertion. Romano et al. (2004) described the additional insertion of amino acids between α-helix-I and β-sheet-III, where 8 to 11 amino acids were inserted in several AtCYPs (Romano et al. 2004). Deletion of amino acids was also observed in some cases, for example-SlCYP12, SlCYP14, SlCYP15, and SlCYP24 showed this deletion (Fig. 1).
The amino acid residues that are essential for structure and function of CYP are highly conserved in hCYPA and other species. Among the conserved essential amino acids W121 (W158 in Fig. 1) was present in the 11 out of 35 SlCYP proteins. Further, R55 (R67 in Fig. 1), F60 (F72 in Fig. 1), H126 (H117 in Fig. 1) are the other highly conserved fundamental amino acids were present in 18, 22, 14 SLCYP proteins, respectively (Fig. 1). Another conserved motif VXGXV reported to be highly conserved in all AtCYP proteins. A total of 17 SlCYP proteins out of 35 contained this important VXGXV motif (Fig. 1).
An alignment of full-length protein sequences of SlCYPs with previously characterized CYPs from several different plants including human revealed considerable sequence identity among CYP proteins (Supplementary Fig. 1). The amino acid residues (arginine (R), phenylalanine (F) and histidine (H), which correspond to positions-62, -67 and -133 in AtCyp19-3 that critically essential for PPIase activity were present in most of the SlCYP proteins (Supplementary Fig. 1). The tryptophan (W121) is required for CsA binding activity was present in ten SlCYPs out of thirty five (Supplementary Fig. 1). Our observations are consistent with with previous reports in other species (Romano et al. 2004).
Most of the SlCYPs (24 out of 35) with a comparatively lower molecular weight encoded a protein with a single cyclophilin-like domain (SD) (Fig. 2; Table 1, Supplementary Table 1). The remaining 9 SlCYPs with comparatively high molecular weight possessed MD i.e., they contained CLD together with other functional domains, such as-TPR, RRM_SF, PAN_A, RING, RIN, WD40 and zinc finger domains (Fig. 2; Table 1, Supplementary Table 1). While there are mostly two SlCYPs homologs corresponding to each AtCYP with high sequence identity (80-86%), there are only a few exceptions with only one corresponding SlCYP homolog (Supplementary Table 2). As for example, SlCYP5-1 which is one of the
As shown in Figure 2, the MD CYPs, SlCYP16-1 and SlCYP16-2 are characterized by a TPR motif with CLD (Fig. 2). SlCYP16-1 and SlCYP16-2 being the homologs of AtCYP40 showed 68.3 and 76.4 identity, respectively (Supplementary Table 2). SlCYP22 contained a tryptophan-aspartic acid at the N-terminus and SlCYP22 showed 84.3% sequence identity to its
Variable number of exon-intron distribution were observed among the members of
Phylogenetic analysis of tomato,
However, the bootstrap value of the parent nodes as well as the nodes of the individual clades was found to be very low for some members (Fig. 4). According to the phylogenetic analysis all proteins from tomato,
As previously described for
Genomic distribution of
The duplication analysis of
Microsynteny analysis of 95
The expression data represents the RNA-seq reads from various tissues across several stages such as seed development (cotyledon, hypocotyl), vegetative (roots, young leaves, mature leaves, vegetative meristem), reproductive (young floral bud, anthesis flower) and fruit tissues 10 days post anthesis fruit1 (10 DPA1), 10 days post anthesis fruit2 (10 DPA2), 20 days post anthesis fruit (20 DPA), ripening fruit (33 DPA)}. The expression pattern indicates tissue-specific expression of most of the
Protein interaction network of tomato cyclophilin proteins was constructed with the
Cyclophilins are ubiquitous proteins found in a wide range of organisms those contain the conserved PPIase (peptidyl prolyl cis-trans isomerase) domain (Gasser et al. 1990). In this study, all of the 35 identified tomato cyclophilin proteins have the conserved PPIase domain but considerable variation was observed in their molecular weight, isoelectric point as well as sequence at the N and the C-terminal regions (Wang and Heitman 2005). The amino acid residues that are essential for structure and function of CYP are highly conserved in hCYPA. It has been demonstrated previously that W121 (W158 in Fig. 1) of hCYP is essential for CsA (Cyclosporin A) binding but that is not essential for PPIase activity (Peptidyl-prolyl isomerases) (Liu et al. 1991; Zydowsky et al. 1992). Further, R55 (R67 in Fig. 1), F60 (F72 in Fig. 1), H126 (H117 in Fig. 1) are the highly conserved fundamental amino acids for the PPIase activity of hCYPA (Zydowsky et al. 1992).
Because some of the SlCYP proteins did not have all three of these essential amino acids, their mechanism of PPIase activity requires further investigation. The cyclophilin proteins were most divergent at the N- and C-terminal ends of the protein sequences (Supplementary Fig. 1). The most conserved sequence was in the central region, which contains the highly conserved PPIase domain, supporting a previous observation (Wang and Heitman 2005).
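The two numbering schemes used above (for example, W121 of hCYPA appearing as W158 in Fig. 1) arise because gaps in a multiple sequence alignment shift residue positions. The following is a minimal sketch of that bookkeeping, not the procedure used in this study. The function names and the short toy sequences are illustrative assumptions and do not represent real cyclophilin data.

```python
def residue_to_column(aligned_ref, residue_number):
    """Return the 1-based alignment column holding the given 1-based
    residue of a gapped reference sequence ('-' marks a gap)."""
    seen = 0
    for column, char in enumerate(aligned_ref, start=1):
        if char != "-":
            seen += 1
            if seen == residue_number:
                return column
    raise ValueError("residue number exceeds the ungapped sequence length")


def residues_at(alignment, ref_name, residue_number):
    """Report what every aligned sequence carries at the column that
    corresponds to a given residue of the reference sequence."""
    column = residue_to_column(alignment[ref_name], residue_number)
    return {name: seq[column - 1] for name, seq in alignment.items()}


# Toy alignment: hypothetical sequences, NOT real cyclophilin data.
toy_alignment = {
    "ref":  "MR--FHW",
    "seqA": "MRAAFHW",
    "seqB": "MK--F-W",
}

# Residue 3 of the reference (F) is pushed to column 5 by two gap columns.
print(residue_to_column(toy_alignment["ref"], 3))

# Residue 4 of the reference (H) is conserved in seqA but is a gap in seqB.
print(residues_at(toy_alignment, "ref", 4))
```

Applied to a real alignment, the same lookup would show which SlCYP proteins retain the residues corresponding to R55, F60, and H126 of hCYPA and which do not.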
Cyclophilins are grouped into single-domain (SD) and multi-domain (MD) forms based on the domains present within the protein (He et al. 2004). In addition to the cyclophilin domain, several tomato cyclophilins (MD) possessed additional domains that are involved in mRNA splicing, RNA stabilization and processing, and cell morphogenesis. It has been demonstrated that AtCYP18-2 is involved in the regulation of pre-mRNA splicing and, because of the high homology, its tomato homolog SlCYP6-1 might play a similar role in the nucleus of tomato cells (Romano et al. 2004). TPR motifs mediate protein-protein interactions and help in the assembly of multi-protein complexes (Blatch and Lässle 1999). AtCYP40, with its TPR motif, forms a complex with small RNA duplex-bound AGO1 (ARGONAUTE family protein 1) and HSP90 (heat shock protein 90) (Earley and Poethig 2011; Iki et al. 2012). The complex of AtCYP40 and HSP90-bound AGO1 plays a unique and important role in the assembly of the plant RISC (RNA-induced silencing complex), and the activity of the CYP40-HSP90 complex that facilitates RISC assembly is conserved between different species. Therefore, a similar function can be expected for the AtCYP40 homologs SlCYP16-1 and SlCYP16-2. WD40 usually forms a four-stranded anti-parallel β-sheet, and multiple copies of those sheets build a circular β-propeller structure that promotes protein-protein interactions. SlCYP22 showed 84.3% sequence identity to its
Proteins that contain an RNA recognition motif (RRM) in addition to the CLD are called cyclophilin-RNA interacting proteins (CRIPs) (Krzywicka et al. 2001). SlCYP9-2, SlCYP13, SlCYP17 and SlCYP21 contain an RRM in addition to the CLD. Furthermore, AtCYP59 in
Introns are a major component of eukaryotic genomes, and intron size correlates with genome size, suggesting that part of the evolution of genome size occurs within genes (McLysaght et al. 2000). Intron size also varies substantially between species, within species, among different genes, and even within a single gene, which may reflect different functional properties relevant to the evolution of genomic and phenotypic traits (Zhang and Edwards 2012). Intron size varied considerably among the different
The putative motif distribution of
The phylogenetic analysis showed that the three clades included genes from both monocotyledons and dicotyledons, indicating that the
Microsynteny analysis can be used to infer the location of both orthologous and paralogous genes based on whole-genome data from different species (Cao et al. 2016; Lin et al. 2014). These results suggest that
RNAseq expression analysis of the tomato
The role of cyclophilins in the regulation of different aspects of plant growth and development has been demonstrated by various recent studies. AtCYP19-1 (ROC3) was implicated in seed development; loss of AtCYP71 function resulted in compromised lateral organ formation and apical meristem activity; AtCYP40 was identified as a regulator of vegetative growth in
The present study performed a genome-wide identification of CYPs in an important vegetable crop
This work was carried out with the support of the Sunchon National University Research Fund in 2021 (Grant number: 2021-0293) and the “Cooperative Research Program for Agriculture Science & Technology Development (Project No. PJ01485801)”, Rural Development Administration, Republic of Korea.
Table 1. All the identified CYP family members in the tomato plant and their nomenclature, locus name, molecular weight, protein sequence length, chromosomal location, subcellular localization, theoretical isoelectric point, and predicted exons.
Locus name | ORF (bp) | Chromosome location | Exons | Protein length (aa) | MW (kDa) | pI | Domain information | CLD position | Subcellular localization |
---|---|---|---|---|---|---|---|---|---|
Solyc12g038110 | 186 | SL4.0ch12:48323666..48323851 | 1 | 61 | 7.169 | 9.30 | SD | 2-61 | Extracellular |
Solyc12g089200 | 228 | SL4.0ch12:63881747..63881974 | 1 | 75 | 8.149 | 5.39 | SD | 1-75 | Nuclear |
Solyc12g038030 | 282 | SL4.0ch12:48064472..48064862 | 2 | 93 | 10.549 | 9.12 | SD | 2-52 | Mitochondrial |
Solyc11g006070 | 462 | SL4.0ch11:904408..904869 | 1 | 153 | 16.489 | 6.06 | SD | 1-149 | Cytoplasmic |
Solyc08g006090 | 483 | SL4.0ch08:848486..854332 | 4 | 160 | 17.530 | 7.01 | SD | 2-153 | Cytoplasmic |
Solyc10g054910 | 519 | SL4.0ch10:55122558..55123076 | 1 | 172 | 17.873 | 8.59 | SD | 7-170 | Cytoplasmic |
Solyc01g111170 | 516 | SL4.0ch01:89876816..89877826 | 1 | 171 | 17.910 | 8.83 | SD | 1-164 | Cytoplasmic |
Solyc09g010190 | 495 | SL4.0ch09:3614647..3619878 | 6 | 164 | 18.175 | 8.58 | SD | 11-162 | Cytoplasmic |
Solyc12g038070 | 477 | SL4.0ch12:48149484..48150037 | 2 | 158 | 18.403 | 6.72 | SD | 2-74 | Nuclear |
Solyc12g038150 | 498 | SL4.0ch12:48601797..48602468 | 2 | 165 | 18.894 | 4.72 | SD | 1-97 | Nuclear |
Solyc01g096520 | 573 | SL4.0ch01:79859565..79864679 | 7 | 190 | 20.510 | 8.42 | SD | 26-189 | Cytoplasmic |
Solyc06g076970 | 624 | SL2.50ch06:47833488..47837300 | 7 | 207 | 22.249 | 9.19 | SD | 41-204 | Cytoplasmic |
Solyc12g038010 | 594 | SL4.0ch12:48040757..48041494 | 3 | 197 | 22.340 | 5.17 | SD | 35-125 | Cytoplasmic |
Solyc06g051650 | 678 | SL4.0ch06:32962272..32966959 | 8 | 225 | 24.469 | 8.91 | SD | 59-222 | Cytoplasmic |
Solyc12g038000 | 645 | SL2.50ch12:49574555..49575486 | 4 | 214 | 24.485 | 5.15 | MD | 156-211 | Nuclear |
Solyc01g111360 | 687 | SL4.0ch01:89991208..89995745 | 7 | 228 | 24.933 | 6.65 | SD | 49-213 | Mitochondrial |
Solyc01g010590 | 687 | SL4.0ch01:5632492..5638269 | 8 | 228 | 25.763 | 8.68 | SD | 35-192 | Cytoplasmic |
Solyc10g083930 | 693 | SL4.0ch10:62791267..62795709 | 7 | 230 | 26.055 | 9.30 | SD | 76-226 | Mitochondrial |
Solyc01g009990 | 747 | SL4.0ch01:4610206..4614220 | 6 | 248 | 26.535 | 9.20 | SD | 84-244 | Chloroplast |
Solyc09g008410 | 711 | SL4.0ch09:1895141..1903823 | 7 | 236 | 26.881 | 6.76 | SD | 82-232 | Cytoplasmic |
Solyc07g007110 | 894 | SL4.0ch07:1829661..1833568 | 2 | 297 | 32.380 | 8.68 | SD | 89-252 | Chloroplast |
Solyc02g061800 | 882 | SL4.0ch02:31311295..31312755 | 2 | 293 | 33.831 | 4.76 | MD | 2-165 | Cytoplasmic |
Solyc03g119860 | 954 | SL4.0ch03:62855114..62856604 | 2 | 317 | 34.575 | 8.74 | SD | 97-288 | Chloroplast |
Solyc08g077790 | 1032 | SL4.0ch08:59811302..59816547 | 5 | 343 | 37.275 | 5.15 | MD | 167-323 | Extracellular |
Solyc02g090480 | 1086 | SL4.0ch02:50052680..50058430 | 8 | 362 | 40.293 | 5.66 | MD | 7-172 | Cytoplasmic |
Solyc01g108340 | 1089 | SL4.0ch01:87996043..88000791 | 8 | 362 | 40.347 | 6.05 | MD | 7-172 | Cytoplasmic |
Solyc12g049430 | 1149 | SL4.0ch12:60709814..60710962 | 1 | 382 | 43.879 | 5.59 | MD | 2-164 | Nuclear |
Solyc12g013580 | 1356 | SL4.0ch12:4454376..4461071 | 12 | 451 | 49.078 | 5.95 | SD | 277-443 | Chloroplast |
Solyc02g086910 | 1356 | SL4.0ch02:47512351..47516135 | 7 | 451 | 49.287 | 5.00 | SD | 151-308 | Chloroplast |
Solyc08g062700 | 1479 | SL4.0ch08:49890458..49910307 | 10 | 492 | 54.985 | 8.40 | SD | 14-168 | Nuclear |
Solyc02g092380 | 1791 | SL4.0ch02:51497813..51503293 | 11 | 596 | 65.980 | 7.29 | MD | 260-443 | Cytoplasmic |
Solyc07g066420 | 1764 | SL4.0ch07:67698634..67708435 | 14 | 587 | 68.095 | 5.87 | MD | 2-161 | Nuclear |
Solyc11g067090 | 1869 | SL4.0ch11:50864889..50872734 | 13 | 622 | 70.038 | 6.62 | MD | 468-619 | Cytoplasmic |
Solyc09g065720 | 1983 | SL4.0ch09:60106675..60114638 | 13 | 660 | 73.006 | 10.69 | SD | 10-174 | Nuclear |
Solyc08g067090 | 2430 | SL4.0ch08:54099820..54111045 | 13 | 809 | 91.028 | 11.59 | SD | 9-175 | Nuclear |
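The ORF and protein length columns of Table 1 can be cross-checked with simple arithmetic. If the reported ORF length includes the stop codon (an assumption made here, not something the table states), then ORF (bp) = 3 × (protein length in aa + 1). The sketch below applies this check to four rows copied from the table. The function name and the `include_stop` parameter are illustrative.

```python
def expected_orf_length(protein_length_aa, include_stop=True):
    """Expected ORF length in bp for a protein of the given length.
    ASSUMPTION: the ORF is counted with its stop codon unless
    include_stop is set to False."""
    codons = protein_length_aa + (1 if include_stop else 0)
    return 3 * codons


# (locus name, ORF in bp, protein length in aa), copied from Table 1.
rows = [
    ("Solyc12g038110", 186, 61),
    ("Solyc11g006070", 462, 153),
    ("Solyc08g067090", 2430, 809),
    ("Solyc02g090480", 1086, 362),
]

for locus, orf_bp, length_aa in rows:
    expected = expected_orf_length(length_aa)
    status = "consistent" if expected == orf_bp else "differs"
    print(f"{locus}: reported {orf_bp} bp, expected {expected} bp ({status})")
```

Under this assumption the first three rows agree, whereas Solyc02g090480 (1086 bp, 362 aa) differs by one codon. Such a difference does not by itself indicate an error, since the assumption about the stop codon may not hold for every annotation, but it marks a row worth rechecking against the source annotation.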