1. Data statistics of the experimental and potential TM proteins in topPTM
TM proteins can be classified by structure into alpha-helical and beta-barrel proteins. Alpha-helical TM proteins are a major class of membrane proteins; an estimated 27% of all human proteins are alpha-helical membrane proteins (Almen, Nordstrom et al. 2009). Beta-barrel TM proteins, which are found in the outer membranes of Gram-negative bacteria, in the cell walls of Gram-positive bacteria, and in the outer membranes of mitochondria and chloroplasts, participate in essential cellular functions by acting as porins, transporters, enzymes, virulence factors and receptors. Experimentally verified TM proteins annotated with membrane topology information were collected mainly from PDBTM (Tusnady, Dosztanyi et al. 2005), OPM (Lomize, Lomize et al. 2006), TOPDB (Tusnady, Kalmar et al. 2008), and TMPad (Lo, Cheng et al. 2011). After redundant protein entries were removed, a total of 5394 TM proteins with experimentally curated annotations of transmembrane topology remained. A set of candidate TM proteins was also extracted from UniProtKB by selecting protein entries that contain the keyword "TRANSMEM" in the feature ("FT") line, a localization of "membrane", and transmembrane topology information. The candidate TM proteins were further filtered using HMMTOP (Tusnady and Simon 2001) and MEMSAT (Nugent and Jones 2009) to determine their transmembrane topologies. The following table shows that the filtering process yielded 69402 potential TM proteins with annotated topologies.
Resource | Experimentally verified TM proteins (all) | Experimentally verified (α-helical) | Experimentally verified (β-barrel) | Potential TM proteins (all) | Potential (α-helical) | Potential (β-barrel) |
TMPad | 379 | 379 | 0 | N/A | N/A | N/A |
OPM | 651 | 1435 | 44 | N/A | N/A | N/A |
TOPDB | 1479 | 667 | 91 | N/A | N/A | N/A |
PDBTM | 785 | 556 | 96 | N/A | N/A | N/A |
UniProtKB | 4964 | 4920 | 139 | 69402 | 68545 | 856 |
Total | 5394 | 4991 | 170 | 69402 | 68545 | 856 |
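The UniProtKB selection step described above (an "FT" line with the key "TRANSMEM" plus a membrane localization) can be sketched roughly as follows. This is a minimal illustration and not the topPTM pipeline itself: it assumes the UniProtKB flat-file text format, the function names `iter_entries` and `candidate_tm_accessions` are hypothetical, and the localization test is a simplification that only looks for the word "membrane" inside the SUBCELLULAR LOCATION comment. The later HMMTOP and MEMSAT filtering is not covered.

```python
from typing import Iterator, List


def iter_entries(flat_text: str) -> Iterator[List[str]]:
    """Yield each flat-file entry as a list of lines; "//" terminates an entry."""
    entry: List[str] = []
    for line in flat_text.splitlines():
        if line.startswith("//"):
            if entry:
                yield entry
            entry = []
        else:
            entry.append(line)
    if entry:
        yield entry


def candidate_tm_accessions(flat_text: str) -> List[str]:
    """Return primary accessions of entries that look like candidate TM proteins.

    Assumed criteria (simplified): at least one FT line whose feature key is
    TRANSMEM, and a SUBCELLULAR LOCATION comment that mentions "membrane".
    """
    selected: List[str] = []
    for entry in iter_entries(flat_text):
        # Feature key is the first token after the "FT" line code.
        has_transmem = any(
            line.startswith("FT") and line[2:].split()[:1] == ["TRANSMEM"]
            for line in entry
        )
        # Join wrapped CC lines, then split into comment topics on "-!-".
        comments = " ".join(line[5:] for line in entry if line.startswith("CC"))
        topics = [topic.strip() for topic in comments.split("-!-")]
        in_membrane = any(
            topic.upper().startswith("SUBCELLULAR LOCATION")
            and "membrane" in topic.lower()
            for topic in topics
        )
        # The first identifier on the first AC line is the primary accession.
        accession = next(
            (line[5:].split(";")[0].strip() for line in entry if line.startswith("AC")),
            None,
        )
        if has_transmem and in_membrane and accession:
            selected.append(accession)
    return selected
```

Entries that pass this check would still be candidates only; the topology assignment itself comes from the prediction tools named in the text.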
2. Data statistics of the substrate sites according to PTM type
Given the growing use of MS/MS-based proteomics to identify post-translational modifications, site-specific modified peptides were manually extracted from approximately 500 MS/MS-associated research articles using a text mining approach. After the redundant PTM instances collected from a variety of public resources were removed, a total of 4747 and 47358 experimental PTM sites were annotated on 1049 experimental and 8674 potential TM proteins, respectively. According to the statistics for each PTM type shown in the table below, phosphorylation accounts for the most abundant substrate sites on experimental TM proteins, including 2108 phosphoserines on 603 TM proteins, 645 phosphothreonines on 333 TM proteins, and 585 phosphotyrosines on 268 TM proteins. In addition, there are 25789 phosphoserines, 7510 phosphothreonines and 5939 phosphotyrosines on potential TM proteins.
PTM Instance Type | Number of PTM sites on TM proteins (experimental) | Number of PTM sites on TM proteins (potential) |
Phosphoserine | 2108 | 25789 |
Phosphothreonine | 645 | 7510 |
Phosphotyrosine | 585 | 5939 |
N-linked (GlcNAc...) | 593 | 2519 |
N6-acetyllysine | 114 | 1214 |
S-nitrosocysteine | 70 | 655 |
N-linked (Glc...) | 128 | 570 |
O-linked (GalNAc...) | 63 | 497 |
S-cysteinyl 3-(oxidosulfanyl)alanine (Cys-Cys) | 110 | 222 |
S-palmitoyl cysteine | 43 | 210 |
N-acetylalanine | 13 | 155 |
N-palmitoyl cysteine | 20 | 121 |
N-myristoyl glycine | 6 | 129 |
O-linked (GlcNAc) | 8 | 122 |
N-acetylserine | 20 | 93 |
N-acetylmethionine | 12 | 90 |
S-farnesyl cysteine | 0 | 93 |
Caspase cleavage aspartic acid | 6 | 81 |
Methionine sulfone | 4 | 78 |
N2,N2-dimethylarginine | 9 | 69 |
N6-(retinylidene)lysine | 57 | 18 |
5-methylarginine | 8 | 66 |
DePhosphotyrosine | 12 | 55 |
S-geranylgeranyl cysteine | 1 | 47 |
O-linked (GlcNAc...) | 1 | 46 |
Cysteine methyl ester | 1 | 45 |
4-hydroxyproline | 2 | 43 |
O-linked (Man) | 4 | 37 |
Asymmetric dimethylarginine | 0 | 39 |
Pyrrolidone carboxylic acid | 6 | 32 |
S-diacylglycerol cysteine | 2 | 36 |
N-acetylthreonine | 2 | 34 |
Sulfotyrosine | 11 | 25 |
N-formylmethionine | 12 | 21 |
DePhosphoserine | 4 | 27 |
N6,N6-dimethyllysine | 0 | 31 |
Glutamate methyl ester (Glu) | 2 | 27 |
Nitrated | 5 | 22 |
Omega-N-methylarginine | 3 | 23 |
Deamidated asparagine | 1 | 23 |
O-linked (Man...) | 0 | 23 |
N4-methylasparagine | 0 | 22 |
(3S)-3-hydroxyasparagine | 0 | 20 |
GPI-anchor amidated serine | 2 | 18 |
N6-methyllysine | 2 | 17 |
Omega-N-methylated arginine | 3 | 15 |
Glutamate methyl ester (Gln) | 2 | 14 |
N6-succinyllysine | 0 | 15 |
N-linked (Glc) | 0 | 15 |
C-linked (Man) | 2 | 12 |
Phosphohistidine | 2 | 12 |
Citrulline | 0 | 13 |
Deamidated glutamine | 0 | 13 |
Nitrated tyrosine | 4 | 9 |
O-linked (Xyl...) | 3 | 10 |
O-linked (Xyl...) (glycosaminoglycan) | 1 | 11 |
N-acetylglycine | 0 | 11 |
Hydroxyproline | 1 | 8 |
Blocked amino end (Met) | 3 | 5 |
GPI-anchor amidated asparagine | 0 | 8 |
N2-acetylarginine | 1 | 7 |
N6-malonyllysine | 0 | 8 |
O-linked (HexNAc) | 1 | 7 |
Symmetric dimethylarginine | 0 | 8 |
Dimethylated arginine | 3 | 4 |
Leucine amide | 0 | 7 |
N6-(pyridoxal phosphate)lysine | 0 | 7 |
Neddyllysine | 7 | 0 |
O-linked (Fuc) | 0 | 7 |
ADP-ribosylarginine | 0 | 6 |
N6,N6,N6-trimethyllysine | 1 | 5 |
O-AMP-tyrosine | 0 | 6 |
DePhosphothreonine | 0 | 5 |
GPI-anchor amidated aspartate | 1 | 4 |
Methylhistidine | 0 | 5 |
O-linked (Hex...) | 0 | 5 |
S-methylcysteine | 0 | 5 |
Hypusine | 0 | 4 |
Lysine amide | 0 | 4 |
N6-palmitoyl lysine | 0 | 4 |
none | 0 | 4 |
Phosphatidylethanolamine amidated glycine | 0 | 4 |
4-aspartylphosphate | 0 | 3 |
Alkylcysteine | 0 | 3 |
Blocked amino end (Gln) | 0 | 3 |
Carbamidation cysteine | 0 | 3 |
Glutamine amide | 0 | 3 |
GPI-anchor amidated glycine | 0 | 3 |
N6-myristoyl lysine | 2 | 1 |
N-acetyltyrosine | 0 | 3 |
O-AMP-threonine | 0 | 3 |
O-linked (Xyl...) (keratan sulfate) | 0 | 3 |
S-8alpha-FAD cysteine | 3 | 0 |
S-glutathionyl cysteine | 1 | 2 |
S-stearoyl cysteine | 0 | 3 |
Sulfoserine | 1 | 2 |
Tele-8alpha-FAD histidine | 3 | 0 |
(3S)-3-hydroxyhistidine | 0 | 2 |
5-hydroxylysine | 0 | 2 |
Arginine amide | 0 | 2 |
Blocked amino end (Ser) | 0 | 2 |
Cholesterol glycine ester | 0 | 2 |
FMN phosphoryl threonine | 0 | 2 |
Glycine amide | 0 | 2 |
Glycosylation alanine | 1 | 1 |
Glycosylation methionine | 0 | 2 |
GPI-anchor amidated alanine | 0 | 2 |
GPI-anchor amidated cysteine | 0 | 2 |
N6-carboxylysine | 0 | 2 |
N-acetylcysteine | 0 | 2 |
N-acetylproline | 0 | 2 |
N-acetylvaline | 0 | 2 |
O-(5'-phospho-RNA)-serine | 0 | 2 |
O-linked (Fuc...) | 0 | 2 |
O-linked (P-Man...) | 0 | 2 |
Oxidation arginine | 0 | 2 |
Phenylalanine amide | 0 | 2 |
Pros-methylhistidine | 0 | 2 |
S-12-hydroxyfarnesyl cysteine | 0 | 2 |
S-4a-FMN cysteine | 0 | 2 |
(3S)-3-hydroxyaspartate | 0 | 1 |
2',4',5'-topaquinone | 1 | 0 |
3',4'-dihydroxyphenylalanine | 0 | 1 |
3'-nitrotyrosine | 0 | 1 |
3-oxoalanine (Cys) | 1 | 0 |
4-carboxycysteine | 0 | 1 |
4-carboxytyrosine | 0 | 1 |
ADP-ribosylasparagine | 0 | 1 |
ADP-ribosylcysteine | 0 | 1 |
Alanine amide | 1 | 0 |
Alkyllysine | 0 | 1 |
Aspartic acid 1-[(3-aminopropyl)(5'-adenosyl)phosphono]amide | 0 | 1 |
Aspartyl aldehyde | 0 | 1 |
Blocked amino end (Ala) | 0 | 1 |
Blocked amino end (Thr) | 0 | 1 |
Blocked amino end (Xaa) | 0 | 1 |
Cysteine persulfide | 0 | 1 |
Glutamic acid 1-amide | 0 | 1 |
Glycosylation glutamine | 0 | 1 |
GPI-like-anchor amidated asparagine | 0 | 1 |
GPI-like-anchor amidated serine | 0 | 1 |
N6-murein peptidoglycan lysine | 0 | 1 |
N-acetylaspartate | 0 | 1 |
N-acetylglutamate | 1 | 0 |
N-D-glucuronoyl asparagine | 0 | 1 |
N-formylglycine | 0 | 1 |
N-linked (GalNAc...) | 0 | 1 |
N-linked (Glc) (glycation) | 0 | 1 |
N-palmitoyl glycine | 1 | 0 |
O-(2-cholinephosphoryl)serine | 0 | 1 |
O-(5'-phospho-RNA)-tyrosine | 0 | 1 |
O-acetylthreonine | 0 | 1 |
O-linked (HexNAc...) | 0 | 1 |
O-linked (Man6P...) | 0 | 1 |
O-linked (Xyl...) (heparan sulfate) | 0 | 1 |
O-palmitoyl threonine | 1 | 0 |
Phosphocysteine | 0 | 1 |
Phosphorproline | 0 | 1 |
Pyruvic acid (Ser) | 0 | 1 |
S-(15-deoxy-Delta12,14-prostaglandin J2-9-yl)cysteine | 0 | 1 |
S-archaeol cysteine | 0 | 1 |
S-farnesyl serine | 0 | 1 |
Tryptophan amide | 0 | 1 |
Valine amide | 0 | 1 |
Total | 4747 | 47358 |
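The per-type counts above (for example, a number of phosphoserines on a number of TM proteins) follow from removing redundant PTM instances and then tallying what remains. A minimal sketch of that bookkeeping is given below. It assumes that two records are redundant when they share the same protein, residue position and PTM type; the source does not state the exact redundancy criterion, and the `PtmSite` record layout and `summarize_sites` function are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class PtmSite(NamedTuple):
    accession: str  # protein identifier
    position: int   # 1-based residue position
    ptm_type: str   # e.g. "Phosphoserine"
    source: str     # resource the record was collected from


def summarize_sites(records: Iterable[PtmSite]) -> Dict[str, Tuple[int, int]]:
    """Return {ptm_type: (unique site count, number of proteins carrying it)}."""
    # Assumed redundancy rule: same protein, same position, same PTM type.
    unique = {(r.accession, r.position, r.ptm_type) for r in records}

    site_counts = Counter(ptm_type for _, _, ptm_type in unique)
    proteins = defaultdict(set)
    for accession, _, ptm_type in unique:
        proteins[ptm_type].add(accession)

    return {t: (site_counts[t], len(proteins[t])) for t in site_counts}
```

Sorting the returned dictionary by site count in descending order would give rows in the same order as the table.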
3. The structural distribution of PTMs containing more than ten substrate sites on experimental transmembrane proteins
Based on the experimentally verified PTMs collected in the topPTM database, the table below presents the structural distribution of PTMs containing more than ten substrate sites on experimental TM proteins. The structural topology of a TM protein is divided into five region types: Extracellular, Intracellular (Cytoplasmic), Transmembrane, Other and Unknown.
PTM Type | Number of substrate sites: Extracellular | Cytoplasmic | Transmembrane | Other | Unknown |
Phosphoserine | 72 | 1603 | 24 | 210 | 199 |
Phosphothreonine | 52 | 416 | 12 | 66 | 99 |
Phosphotyrosine | 53 | 374 | 21 | 88 | 49 |
N-linked (GlcNAc...) | 417 | 0 | 0 | 146 | 30 |
N6-acetyllysine | 4 | 48 | 8 | 41 | 13 |
S-nitrosocysteine | 8 | 26 | 6 | 12 | 18 |
N-linked (Glc...) | 101 | 0 | 1 | 21 | 5 |
O-linked (GalNAc...) | 57 | 0 | 0 | 6 | 0 |
S-cysteinyl 3-(oxidosulfanyl)alanine (Cys-Cys) | 92 | 0 | 0 | 16 | 2 |
S-palmitoyl cysteine | 0 | 32 | 4 | 1 | 6 |
N-acetylalanine | 0 | 4 | 0 | 1 | 8 |
N-palmitoyl cysteine | 0 | 17 | 1 | 0 | 2 |
N-myristoyl glycine | 0 | 1 | 0 | 5 | 0 |
O-linked (GlcNAc) | 3 | 4 | 0 | 0 | 1 |
N-acetylserine | 0 | 12 | 0 | 4 | 4 |
N-acetylmethionine | 1 | 5 | 1 | 1 | 4 |
S-farnesyl cysteine | 0 | 0 | 0 | 0 | 0 |
Caspase cleavage aspartic acid | 0 | 6 | 0 | 0 | 0 |
Methionine sulfone | 0 | 4 | 0 | 0 | 0 |
N2,N2-dimethylarginine | 1 | 4 | 0 | 4 | 0 |
N6-(retinylidene)lysine | 0 | 0 | 57 | 0 | 0 |
5-methylarginine | 1 | 3 | 0 | 4 | 0 |
DePhosphotyrosine | 0 | 12 | 0 | 0 | 0 |
S-geranylgeranyl cysteine | 0 | 0 | 0 | 0 | 1 |
O-linked (GlcNAc...) | 1 | 0 | 0 | 0 | 0 |
Cysteine methyl ester | 0 | 0 | 0 | 0 | 1 |
4-hydroxyproline | 0 | 2 | 0 | 0 | 0 |
O-linked (Man) | 4 | 0 | 0 | 0 | 0 |
Asymmetric dimethylarginine | 0 | 0 | 0 | 0 | 0 |
Pyrrolidone carboxylic acid | 3 | 0 | 0 | 1 | 2 |
S-diacylglycerol cysteine | 0 | 0 | 0 | 0 | 2 |
N-acetylthreonine | 0 | 0 | 0 | 0 | 2 |
Sulfotyrosine | 11 | 0 | 0 | 0 | 0 |
N-formylmethionine | 0 | 2 | 1 | 5 | 4 |
DePhosphoserine | 0 | 4 | 0 | 0 | 0 |
N6,N6-dimethyllysine | 0 | 0 | 0 | 0 | 0 |
Glutamate methyl ester (Glu) | 0 | 2 | 0 | 0 | 0 |
Nitrated | 0 | 3 | 1 | 0 | 1 |
Omega-N-methylarginine | 0 | 0 | 0 | 0 | 3 |
Deamidated asparagine | 0 | 1 | 0 | 0 | 0 |
O-linked (Man...) | 0 | 0 | 0 | 0 | 0 |
N4-methylasparagine | 0 | 0 | 0 | 0 | 0 |
(3S)-3-hydroxyasparagine | 0 | 0 | 0 | 0 | 0 |
GPI-anchor amidated serine | 1 | 0 | 0 | 0 | 1 |
N6-methyllysine | 0 | 2 | 0 | 0 | 0 |
Omega-N-methylated arginine | 0 | 3 | 0 | 0 | 0 |
Glutamate methyl ester (Gln) | 0 | 2 | 0 | 0 | 0 |
N6-succinyllysine | 0 | 0 | 0 | 0 | 0 |
N-linked (Glc) | 0 | 0 | 0 | 0 | 0 |
C-linked (Man) | 2 | 0 | 0 | 0 | 0 |
Phosphohistidine | 0 | 1 | 0 | 0 | 1 |
Citrulline | 0 | 0 | 0 | 0 | 0 |
Deamidated glutamine | 0 | 0 | 0 | 0 | 0 |
Nitrated tyrosine | 0 | 2 | 0 | 0 | 2 |
O-linked (Xyl...) | 3 | 0 | 0 | 0 | 0 |
O-linked (Xyl...) (glycosaminoglycan) | 0 | 0 | 0 | 1 | 0 |
N-acetylglycine | 0 | 0 | 0 | 0 | 0 |
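A distribution like the one tabulated above can be obtained by looking up each PTM site position in the topology annotation of its protein and counting by region. The sketch below illustrates the idea under stated assumptions: topology is represented as a list of inclusive residue ranges, a position that falls in no annotated range (or on a protein without topology) is counted as "Unknown", and the names `Segment`, `region_of` and `tally_by_region` are hypothetical. How topPTM itself assigns the Other and Unknown categories is not described in the source.

```python
from collections import Counter
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Segment(NamedTuple):
    start: int   # first residue of the segment (1-based, inclusive)
    end: int     # last residue of the segment (inclusive)
    region: str  # "Extracellular", "Cytoplasmic", "Transmembrane" or "Other"


def region_of(position: int, topology: List[Segment]) -> str:
    """Return the region containing a residue, or "Unknown" if none does."""
    for segment in topology:
        if segment.start <= position <= segment.end:
            return segment.region
    return "Unknown"


def tally_by_region(
    sites: Iterable[Tuple[str, int, str]],
    topologies: Dict[str, List[Segment]],
) -> Dict[str, Counter]:
    """Count sites per PTM type and region.

    `sites` holds (accession, position, ptm_type) tuples; proteins missing
    from `topologies` contribute to the "Unknown" column.
    """
    table: Dict[str, Counter] = {}
    for accession, position, ptm_type in sites:
        region = region_of(position, topologies.get(accession, []))
        table.setdefault(ptm_type, Counter())[region] += 1
    return table
```

Each `Counter` in the result corresponds to one row of the table, with missing regions reading as zero.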