Research Article

J Plant Biotechnol 2022; 49(1): 39-45

Published online March 31, 2022

© The Korean Society of Plant Biotechnology

Comparative analysis of AGPase proteins and conserved domains in sweetpotato (Ipomoea batatas (L.) Lam.) and its two wild relatives

Hualin Nie · Sujung Kim · Jongbo Kim · Suk-Yoon Kwon · Sun-Hyung Kim

Department of Environmental Horticulture, University of Seoul, Seoul 02504, Korea
Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon 34141, Republic of Korea
Bioenergy Crop Research Institute, National Institute of Crop Science, Rural Development Administration, Muan 58545, Republic of Korea
Department of Biotechnology, College of Biomedical & Health Sciences, Global Campus, Konkuk University, ChoongJu 27478, Korea

Received: 9 February 2022; Revised: 22 February 2022; Accepted: 22 February 2022

This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Conserved domains are defined as recurring units in molecular evolution and are commonly used to interpret the molecular function and biochemical structure of proteins. Herein, the ADP-glucose pyrophosphorylase (AGPase) amino acid sequences of three species of the Ipomoea genus [Ipomoea trifida, I. triloba, and I. batatas (L.) Lam. (sweetpotato)] were identified to investigate their physicochemical and biochemical characteristics. The molecular weight, isoelectric point, instability index, and grand average of hydropathy markedly differed among the three species. The aliphatic index values of sweetpotato AGPase proteins were higher in the small subunit than in the large subunit. The AGPase proteins from sweetpotato were found to contain an LbH_G1P_AT_C domain in the C-terminal region and various domains (NTP_transferase, ADP_Glucose_PP, or Glyco_tranf_GTA) in the N-terminal region. Conversely, most of the AGPase proteins of its two relatives (I. trifida and I. triloba) were found to contain only the NTP_transferase domain in the N-terminal region. These findings suggest that these conserved domains are species-specific and related to the subunit types of AGPase proteins. The study may enable research on AGPase-related characteristics specific to sweetpotato that do not exist in the other two species, such as starch metabolism and the tuberization mechanism.

Keywords ADP-glucose pyrophosphorylase, conserved domain, AGPase small subunit, AGPase large subunit, tuberization, sweetpotato

ADP-glucose pyrophosphorylase (AGPase) is a regulatory enzyme that catalyzes the biosynthesis of α-1,4-glucans (glycogen or starch) in photosynthetic bacteria and plants (Smith-White and Preiss 1992). In higher plants, it is a heterotetramer (α2β2) composed of two different but closely related subunits, named according to their size: the “small” subunit (α subunit, 50-54 kDa) and the “large” subunit (β subunit, 51-60 kDa) (Ballicora et al. 2004; Smith-White and Preiss 1992). The small subunit is responsible for the catalytic activity, whereas the large subunit plays regulatory roles (Ballicora et al. 2004; Crevillén et al. 2003). Both subunits are necessary for the optimal activity of the native enzyme in plants; the lack of either subunit reduces AGPase activity and affects starch synthesis (Li and Preiss 1992). In sweetpotato, AGPase is a key enzyme controlling starch synthesis and is considered an important determinant of the sink activity of the roots (Tsubone et al. 2000; Yatomi et al. 1996). Many AGPase genes have been cloned and studied in sweetpotato (Lee et al. 2000; Seo et al. 2015; Zhou et al. 2016).

Protein domains can be considered distinct functional and structural units of proteins and are usually identified as recurring (sequence or structural) units (Ingolfsson and Yona 2008; Li et al. 2012). During molecular evolution, these domains may have been reorganized into different arrangements, and they are used in protein function annotation (Ingolfsson and Yona 2008), protein structure determination (Marchler-Bauer et al. 2012), and protein engineering (Guerois and Serrano 2001). The Conserved Domain Database (CDD) defines conserved domains as recurring units in molecular evolution, the extent of which can be determined by sequence and structural analysis (Marchler-Bauer et al. 2012).

Sweetpotato (Ipomoea batatas (L.) Lam.) is a hexaploid (2n = 6x = 90) perennial tuberous crop belonging to the family Convolvulaceae (Welbaum 2015). Two non-tuberizing diploid Ipomoea species, I. trifida (H.B.K.) G. Don (2n = 2x = 30) and I. triloba L. (2n = 2x = 30), have been reported to be the putative progenitors of sweetpotato and are commonly considered model species for sweetpotato research (Roullier et al. 2013; Wu et al. 2018). In this study, we aimed to screen the AGPase genes from sweetpotato and its two related species to investigate the conserved domains of the encoded proteins. The differences in these domains can be used to confirm the molecular functions of the AGPase proteins in sweetpotato and its two relatives.

Identification of AGPase amino acid sequences

The Sweetpotato Genomics Resource and NCBI databases were used to identify the AGPase domain-containing proteins in the three species. The amino acid sequence of the AGPase protein IbAGPa1 (BAF47744.2) was used as the query sequence for the BLAST search.

The ProtParam tool of ExPASy (Expert Protein Analysis System) was used to compute the physicochemical characteristics of the AGPase proteins in the three species, including the number of amino acids, molecular weight, theoretical isoelectric point (pI), instability index (II), aliphatic index (AI), and grand average of hydropathy (GRAVY) (Gasteiger et al. 2005).
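Two of these indices have simple closed-form definitions, and the sketch below illustrates them. It is not the ProtParam implementation; it assumes the standard definitions: GRAVY as the mean Kyte-Doolittle hydropathy value per residue, and the aliphatic index as X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], where X is the mole percentage of each residue. The function names are hypothetical, and the input is assumed to contain only the 20 standard amino acid letters.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from the mole percentages of Ala, Val, Ile, and Leu."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # Mole percentage of each aliphatic residue in the sequence.
    pct = {aa: 100.0 * seq.count(aa) / len(seq) for aa in "AVIL"}
    return pct["A"] + 2.9 * pct["V"] + 3.9 * (pct["I"] + pct["L"])
```

A negative GRAVY value indicates a protein that is hydrophilic on average, and a higher aliphatic index indicates a larger relative volume of aliphatic side chains.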

Multiple-sequence alignment and phylogenetic tree construction

The amino acid sequences of the AGPase proteins in FASTA format were aligned using the CLC Sequence Viewer 7.6 software (CLC bio, Aarhus, Denmark). A neighbor-joining phylogenetic tree was constructed using MEGA X 10.1 software (Pennsylvania State University, USA) with the following parameters: bootstrap analysis of 1,000 replicates, Poisson correction method, and pairwise deletion (Kumar et al. 2018).
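To make the distance settings concrete, the following is a minimal sketch of a Poisson-corrected distance between two aligned protein sequences with pairwise deletion. It is an illustration, not MEGA's code. It assumes the usual definition d = -ln(1 - p), where p is the proportion of differing sites, and it assumes that gaps are written as "-" and missing data as "?". The function name is hypothetical.

```python
import math

# Characters treated as gaps or missing data (an assumption of this sketch).
GAP_CHARS = {"-", "?"}


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance between two aligned amino acid sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Pairwise deletion: skip a site only if this pair has a gap there.
        if a in GAP_CHARS or b in GAP_CHARS:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    p = differing / compared  # proportion of differing sites (p-distance)
    if p == 0.0:
        return 0.0
    if p >= 1.0:
        raise ValueError("distance is undefined when all compared sites differ")
    return -math.log(1.0 - p)
```

With pairwise deletion, a gapped site is dropped only for the pairs in which it is gapped, so partial sequences still contribute all of their aligned positions to the distance matrix that the neighbor-joining step consumes.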

Conserved domain analysis

Pfam, SMART, and CDD were used to explore the conserved domains of the AGPase proteins. The selected conserved domains were drawn using DOG 2.0.1 software (Ren et al. 2009).

Identification of AGPase proteins

Forty-five AGPase domain-containing proteins from I. batatas (26 accessions), I. trifida (10 accessions), and I. triloba (9 accessions) were identified and used for various analyses (Table 1). The sizes of these proteins differed considerably: the number of amino acids ranged from 165 to 525, and the molecular weights (MW) ranged from 18.35 to 58.19 kDa.

Table 1 Biochemical and physicochemical characteristics of AGPase proteins in the three species

| Species | Accession No. | Subunit | Amino acids | Molecular weight (MW, Da) | Isoelectric point (pI) | Instability index (II) | Aliphatic index (AI) | Grand average of hydropathy (GRAVY) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| I. batatas | BAF47744.2 | Small | 522 | 57155.24 | 6.74 | 39.79 | 91.24 | -0.178 |
| I. batatas | AFL55400.1 | Small | 522 | 57143.19 | 6.74 | 39.50 | 90.48 | -0.188 |
| I. batatas | AAS66988.1 | Small | 522 | 57188.32 | 6.74 | 39.42 | 91.23 | -0.166 |
| I. batatas | AAA19648.1 | Small | 303 | 33530.51 | 5.52 | 35.06 | 96.30 | -0.129 |
| I. batatas | CAA86726.1 | Small | 302 | 33374.32 | 5.39 | 35.14 | 96.62 | -0.115 |
| I. batatas | CAA58473.1 | Small | 427 | 47300.22 | 6.13 | 36.29 | 97.12 | -0.119 |
| I. batatas | AFL55401.1 | Small | 523 | 57164.19 | 8.02 | 37.38 | 90.15 | -0.194 |
| I. batatas | BAF47745.1 | Small | 523 | 57178.21 | 8.02 | 37.38 | 90.34 | -0.190 |
| I. batatas | AAS66987.1 | Small | 523 | 57179.24 | 8.02 | 36.64 | 90.52 | -0.183 |
| I. batatas | AFL55399.1 | Large | 525 | 58055.43 | 8.92 | 34.29 | 88.44 | -0.164 |
| I. batatas | AGB85112.1 | Large | 525 | 57990.31 | 8.82 | 33.14 | 87.80 | -0.158 |
| I. batatas | BAF47749.1 | Large | 525 | 58117.46 | 8.93 | 35.26 | 87.50 | -0.164 |
| I. batatas | AFL55398.1 | Large | 518 | 57269.40 | 6.37 | 29.97 | 85.08 | -0.178 |
| I. batatas | BAF47748.1 | Large | 518 | 57269.36 | 6.25 | 29.73 | 85.08 | -0.177 |
| I. batatas | AGB85111.1 | Large | 517 | 57376.52 | 6.41 | 28.99 | 84.29 | -0.190 |
| I. batatas | AFL55396.1 | Unknown | 517 | 57577.74 | 7.01 | 35.32 | 86.36 | -0.245 |
| I. batatas | BAF47746.1 | Large | 517 | 57616.78 | 6.69 | 36.61 | 87.31 | -0.234 |
| I. batatas | CAB52196.1 | Unknown | 450 | 50090.21 | 5.38 | 35.94 | 89.04 | -0.168 |
| I. batatas | BAF47747.1 | Large | 515 | 57562.13 | 7.08 | 31.74 | 88.99 | -0.204 |
| I. batatas | AFL55397.1 | Large | 515 | 57485.94 | 6.44 | 32.78 | 88.80 | -0.194 |
| I. batatas | AGB85109.1 | Large | 517 | 57527.64 | 6.44 | 37.97 | 87.50 | -0.237 |
| I. batatas | CAB55495.1 | Unknown | 490 | 54707.53 | 7.14 | 36.97 | 89.33 | -0.227 |
| I. batatas | AGB85110.1 | Large | 515 | 57559.03 | 6.31 | 31.13 | 89.55 | -0.212 |
| I. batatas | AAC21562.1 | Large | 517 | 57686.94 | 7.55 | 38.55 | 86.92 | -0.234 |
| I. batatas | CAB55496.1 | Large | 385 | 43443.49 | 5.35 | 32.30 | 85.82 | -0.224 |
| I. batatas | CAB51610.1 | Large | 306 | 34636.48 | 5.13 | 37.96 | 86.63 | -0.300 |
| I. trifida | itf11g03360.t1 | Unknown | 522 | 57155.24 | 6.74 | 39.79 | 91.23 | -0.178 |
| I. trifida | itf13g19620.t1 | Large | 525 | 58186.57 | 9.01 | 34.65 | 87.89 | -0.170 |
| I. trifida | itf02g13930.t1 | Unknown | 523 | 57178.21 | 8.02 | 37.40 | 90.15 | -0.194 |
| I. trifida | itf01g13780.t1 | Unknown | 351 | 39640.79 | 9.53 | 65.48 | 93.02 | -0.191 |
| I. trifida | itf00g32520.t1 | Unknown | 351 | 39204.50 | 5.40 | 46.38 | 99.46 | 0.111 |
| I. trifida | itf09g27040.t1 | Small | 474 | 52547.38 | 6.15 | 47.76 | 85.99 | -0.240 |
| I. trifida | itf06g21950.t1 | Large | 517 | 57244.40 | 6.37 | 28.90 | 84.87 | -0.174 |
| I. trifida | itf08g03850.t1 | Large | 517 | 57594.29 | 8.50 | 28.36 | 85.98 | -0.201 |
| I. trifida | itf05g24300.t1 | Unknown | 416 | 46019.99 | 5.76 | 33.92 | 99.81 | 0.057 |
| I. trifida | itf10g06320.t1 | Unknown | 427 | 48406.64 | 5.64 | 37.09 | 99.53 | 0.111 |
| I. triloba | itb02g09380.t1 | Unknown | 523 | 57164.19 | 8.02 | 37.38 | 90.15 | -0.194 |
| I. triloba | itb11g03360.t1 | Unknown | 522 | 57155.24 | 6.74 | 39.79 | 91.23 | -0.178 |
| I. triloba | itb13g23180.t1 | Large | 266 | 29618.76 | 5.68 | 32.92 | 92.74 | -0.106 |
| I. triloba | itb09g31010.t1 | Small | 475 | 52687.57 | 6.16 | 48.56 | 86.63 | -0.236 |
| I. triloba | itb06g20570.t1 | Large | 517 | 57203.30 | 6.51 | 29.78 | 83.73 | -0.185 |
| I. triloba | itb08g03970.t1 | Large | 517 | 57626.35 | 8.50 | 28.36 | 85.42 | -0.206 |
| I. triloba | itb09g17690.t1 | Unknown | 165 | 18349.10 | 4.71 | 32.45 | 92.24 | 0.049 |
| I. triloba | itb05g25020.t1 | Unknown | 416 | 46032.99 | 5.76 | 33.46 | 99.57 | 0.050 |
| I. triloba | itb11g22920.t4 | Unknown | 415 | 45485.48 | 6.23 | 41.54 | 100.48 | 0.045 |

The isoelectric point (pI), the pH at which a molecule carries no net electrical charge, ranged from 4.71 to 9.53 across all proteins. The average pI values of the I. batatas, I. trifida, and I. triloba AGPases were 6.83, 7.11, and 6.47, respectively. The instability index (II), which classifies a polypeptide as stable at ≤ 40 and unstable at > 40, was below 40 for all I. batatas AGPases. In contrast, some AGPases of I. trifida and I. triloba had II values above 40. The aliphatic index (AI), which represents the relative volume occupied by the aliphatic side chains of a polypeptide, was similar in the three species but differed between the subunits of I. batatas AGPase: higher AI values were observed for the small subunits than for the large subunits. The grand average of hydropathy (GRAVY), which was analyzed to determine the hydropathy of AGPase, showed that I. batatas had different characteristics from the other two species. All I. batatas AGPases showed negative values, whereas some of the I. trifida and I. triloba AGPases had positive values.
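The per-species comparison above can be re-derived from the Table 1 values. The sketch below is an illustration rather than the authors' procedure. It transcribes the pI, II, and GRAVY columns of Table 1 in row order and reports, for each species, the mean pI, the number of proteins with II > 40, and the number with positive GRAVY. The names TABLE1, summarize, and SUMMARY are hypothetical.

```python
from statistics import mean

# (pI, II, GRAVY) for each protein, transcribed from Table 1 in row order.
TABLE1 = {
    "I. batatas": [
        (6.74, 39.79, -0.178), (6.74, 39.50, -0.188), (6.74, 39.42, -0.166),
        (5.52, 35.06, -0.129), (5.39, 35.14, -0.115), (6.13, 36.29, -0.119),
        (8.02, 37.38, -0.194), (8.02, 37.38, -0.190), (8.02, 36.64, -0.183),
        (8.92, 34.29, -0.164), (8.82, 33.14, -0.158), (8.93, 35.26, -0.164),
        (6.37, 29.97, -0.178), (6.25, 29.73, -0.177), (6.41, 28.99, -0.190),
        (7.01, 35.32, -0.245), (6.69, 36.61, -0.234), (5.38, 35.94, -0.168),
        (7.08, 31.74, -0.204), (6.44, 32.78, -0.194), (6.44, 37.97, -0.237),
        (7.14, 36.97, -0.227), (6.31, 31.13, -0.212), (7.55, 38.55, -0.234),
        (5.35, 32.30, -0.224), (5.13, 37.96, -0.300),
    ],
    "I. trifida": [
        (6.74, 39.79, -0.178), (9.01, 34.65, -0.170), (8.02, 37.40, -0.194),
        (9.53, 65.48, -0.191), (5.40, 46.38, 0.111), (6.15, 47.76, -0.240),
        (6.37, 28.90, -0.174), (8.50, 28.36, -0.201), (5.76, 33.92, 0.057),
        (5.64, 37.09, 0.111),
    ],
    "I. triloba": [
        (8.02, 37.38, -0.194), (6.74, 39.79, -0.178), (5.68, 32.92, -0.106),
        (6.16, 48.56, -0.236), (6.51, 29.78, -0.185), (8.50, 28.36, -0.206),
        (4.71, 32.45, 0.049), (5.76, 33.46, 0.050), (6.23, 41.54, 0.045),
    ],
}


def summarize(rows):
    """Summarize one species: count, mean pI, unstable and hydrophobic proteins."""
    return {
        "n": len(rows),
        "mean_pI": mean(pi for pi, _, _ in rows),
        "unstable": sum(1 for _, ii, _ in rows if ii > 40),  # II > 40
        "positive_gravy": sum(1 for _, _, g in rows if g > 0),
    }


SUMMARY = {species: summarize(rows) for species, rows in TABLE1.items()}

if __name__ == "__main__":
    for species, s in SUMMARY.items():
        print(f"{species}: n={s['n']}, mean pI={s['mean_pI']:.3f}, "
              f"II>40: {s['unstable']}, GRAVY>0: {s['positive_gravy']}")
```

Keeping the three values of each protein together in one tuple makes it easy to check individual rows against Table 1 before trusting the aggregates.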

Conserved domain analysis

Six types of conserved domains with different distributions were found in the AGPase proteins of the three species (Fig. 1b, Table 2). Most of the I. trifida and I. triloba AGPases had only the NTP_transferase domain, and some had two conserved domains: NTP_transferase at the N-terminal and Hexapep or Cpn60_TCP1 at the C-terminal. In contrast, the I. batatas AGPase proteins had four types of conserved domains (NTP_transferase, LbH_G1P_AT_C, ADP_Glucose_PP, and Glyco_tranf_GTA_type), and each protein had two conserved domains. All of the I. batatas AGPase proteins had the LbH_G1P_AT_C domain at the C-terminal, but the N-terminal differed according to the subunit. The N-terminal of all large subunits of the I. batatas AGPase proteins had only the NTP_transferase domain, except for CAB51610.1, whereas all small subunits had the ADP_Glucose_PP domain, except for CAB55496.1, AAA19648.1, and CAA86726.1. These exceptions were all partial sequences and had the Glyco_tranf_GTA_type domain at the N-terminal.

Table 2 Conserved domain prediction of the AGPase in the three species

| Species | Accession No. | Amino acid | Conserved domain 1 | Conserved domain 2 |
| --- | --- | --- | --- | --- |


Fig. 1. Phylogenetic tree (a) and domain structure (b) of the AGPase proteins in Ipomoea batatas (black circles), I. trifida (red quadrangles), and I. triloba (green triangles). The numbers at the nodes indicate the bootstrap values. The conserved domains are indicated by colored blocks on the right. Gray, NTP_transferase; green, LbH_G1P_AT_C; blue, Glyco_tranf_GTA_type; purple, Hexapep; red, Cpn60_TCP1; orange, ADP_Glucose_PP

Phylogenetic analysis

The evolutionary history was inferred using the neighbor-joining method (Saitou and Nei 1987). Fig. 1a presents the optimal tree, with a sum of branch lengths of 29.09. This analysis involved 45 amino acid sequences and 512 positions. The conserved domains were mapped onto the amino acid sequences (Fig. 1b). The length and type of the domains differed among the species. In the phylogenetic tree, the AGPase proteins from the three species grouped according to large and small subunit type.

AGPase is an important factor in the tuberous root of sweetpotato because it is a vital enzyme in starch synthesis (Tsubone et al. 2000; Yatomi et al. 1996). Although it is also present in I. trifida and I. triloba, as well as in other plants of the genus Ipomoea, these species have physiological properties different from those of sweetpotato, such as non-tuberization. Therefore, AGPase is believed to have different structures or different functions among plants of the genus Ipomoea. The identification of AGPases in sweetpotato and two non-tuberous Ipomoea species performed in this study is therefore important for understanding the relationships among plants of the genus Ipomoea and the functions of AGPase in each species. Sweetpotato is a polyploid crop derived from I. trifida, but it is unclear whether it is an autopolyploid or an allopolyploid (Roullier et al. 2013; Wu et al. 2018). The number of AGPases in sweetpotato has increased relative to its relatives through whole-genome duplication. This result is consistent with a study showing that the number of rboh genes in the polyploid plant Gossypium hirsutum was higher than in its progenitor plants G. raimondii and G. arboreum (Wang et al. 2020). Moreover, some AGPases in I. trifida and I. triloba exhibited an II value ≥ 40, which indicates an unstable state, whereas no AGPase in I. batatas had an II value ≥ 40 (Table 1). This suggests that some genes encoding unstable proteins may have been deleted during the evolution of I. batatas.

A difference in the domain composition of AGPase was observed between sweetpotato and the other Ipomoea plants; I. batatas has a more complex composition (Fig. 1b). The N-terminal of the small subunit and the C-terminal in sweetpotato were composed of domains different from those of the other two species. These results suggest that LbH_G1P_AT_C at the C-terminal and ADP_Glucose_PP and Glyco_tranf_GTA_type at the N-terminal of the small subunit contribute to functions and regulation that differ from those of the non-tuberous relatives. Many studies have shown that orthologs and paralogs can diverge in domain architecture, for example through the insertion and deletion of domains during evolution (Björklund et al. 2005; Forslund et al. 2011). Although this study cannot confirm the homologous genes of each AGPase in plants of the genus Ipomoea, the evolutionary process of the genome among these plants, including AGPase, is expected to be revealed through further studies.

Sweetpotato AGPases have relatively conserved domains compared to those of I. trifida and I. triloba. The small subunit of AGPase showed a more complex structure in sweetpotato than in the other two species. Sweetpotato AGPase had the LbH_G1P_AT_C domain in the C-terminal region, which was not present in I. trifida and I. triloba. This suggests that the structure of AGPase in sweetpotato, which differs from that in the other two species, plays important roles in certain functions of sweetpotato, such as starch biosynthesis and tuber formation. More isolation studies and further examination of gene expression will be needed to clarify the functional role of sweetpotato-specific domains in tuberization.

