Determination of O-glycan Sites by OpeRATOR® Digestion and LC-MS

The growing interest in understanding the biological role of O-glycosylation has been held back by the analytical challenges caused by inherent heterogeneity and the lack of a consensus motif. The O-glycans are often situated in serine- and threonine-rich clusters, making specific proteolysis difficult. The peptides generated from a standard trypsin digestion of a highly O-glycosylated protein tend to be large and carry several O-glycans. O-glycopeptides often exhibit poor ionization efficiency, and the larger peptides are difficult to analyze in detail.
 
OpeRATOR is a unique enzyme that catalyses the hydrolysis of peptide bonds on O-glycosylated proteins N-terminally of the serine and threonine glycosylation sites. This specificity has made it possible to develop analytical strategies that combine mapping of O-linked glycosylation sites with definition of the glycans at those sites, using the more widely applied fragmentation method Collision Induced Dissociation (CID).
 
Here, we present a robust workflow for O-glycan site determination using the biopharmaceutical human C1-inhibitor as an example. This protein has a highly O-glycosylated N-terminal region, carrying up to 26 O-glycans. The sialic acids and N-glycans of the glycoprotein were removed with SialEXO® and PNGaseF, respectively. In the same overnight reaction, O-glycopeptides were formed by OpeRATOR digestion. The O-glycopeptide sample was cleaned up on graphite spin columns and analysed by HILIC-MS. The workflow is also suitable for the analysis of other O-glycoproteins but might need minor modifications depending on the nature of the protein.

Summary

  • The O-glycosylated sites of plasma-derived human C1-inhibitor were determined and compared to those published previously
  • OpeRATOR enables core 1 O-glycan site determination without the need for ETD mass spectrometry
  • OpeRATOR opens up a variety of new applications in the glycoproteomic field
  • Detailed methods for sample preparation, separation and analysis are described

Introduction

The most common type of mammalian O-glycosylation is GalNAcylation or mucin-type O-glycosylation. This type exists as eight different core structures, all of which have an α-linked GalNAc residue at the reducing end as a common feature. Core 1-4 structures may be considered common, of which core 1 and core 2 glycans are the most frequently occurring ones. Core 5-8 are rare (1).
 
An O-glycosylated protein consists of a population of variably glycosylated isoforms with heterogeneity in terms of number, sites of attachment, occupancy and composition. There is no known consensus motif for O-glycosylation, although it has been shown that besides serine and threonine also proline, alanine and glycine residues are overrepresented around the modification sites. Threonine residues are modified to a higher extent than serine (2). Since the O-glycans are often situated in clusters at serine- and threonine-rich stretches, site-specific O-glycan analysis is challenging.
 
Traditionally, several different specific and unspecific proteases and peptidases have been needed to digest a protein, and yet the peptide fragments often contain more than one potential O-glycosylation site. In addition, automated data interpretation of peptides generated from unspecific digestion is usually quite unreliable. Several complementary fragmentation techniques might be necessary to accurately determine the specific O-glycosylated site. A combination of Collision Induced Dissociation (CID) or Higher-energy Collisional Dissociation (HCD), resulting in glycan sequence information, and Electron Transfer Dissociation (ETD), enabling amino acid sequence identification, has proven successful in some cases. However, depending on the amino acid and glycan sequences, glycan/peptide size ratio and charge, there are still many challenges to overcome even with these techniques (3,4).
 
An alternative method for site determination is now possible with the OpeRATOR enzyme. This endo-O-protease specifically digests peptides and proteins N-terminally of O-glycosylated serine or threonine residues. It is derived from the bacterium Akkermansia muciniphila, which inhabits the large intestine. The bacterium uses mucin as a nutrient source and benefits from harbouring a mucin-degrading protease. OpeRATOR is recombinantly expressed in E. coli.
 
The ability to specifically digest O-glycoproteins and generate glycopeptides in which the N-terminal amino acid carries an O-glycan makes site determination considerably easier and enables site-specific analysis of O-glycosylation without the need for ETD fragmentation. As O-glycans are often exposed on the protein structure, OpeRATOR digestion is possible under native conditions. In some cases, the generated peptides contain more than one O-glycan site, i.e. OpeRATOR does not digest at all sites in every molecule. The missed digestion sites differ from molecule to molecule, leading to a population of overlapping O-glycopeptides. Taken together, the summarized data provide information about the identity of the O-glycosylated sites, including consecutive sites.
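The site-assignment logic described above can be sketched in a few lines of Python. This is only an illustration, not part of the published workflow: the function name `infer_sites` and the second input peptide are hypothetical, while the first peptide (residues 27-47) is the one shown in Fig. 3. The sketch assumes that every identified O-glycopeptide starting with S or T reports a glycosylation site at its first residue.

```python
def infer_sites(peptides):
    """Collect O-glycosylation sites from OpeRATOR glycopeptides.

    peptides: iterable of (start_position, sequence) tuples, where
    start_position is the 1-based protein position of the first residue.
    Returns a dict {position: residue}, sorted by position.
    """
    sites = {}
    for start, sequence in peptides:
        if not sequence:
            continue  # skip empty entries
        first = sequence[0]
        # OpeRATOR cleaves N-terminally of a glycosylated S or T, so the
        # first residue of a released glycopeptide marks a modified site.
        if first in "ST":
            sites[start] = first
    return dict(sorted(sites.items()))


identified = [
    (27, "TSSSSQDPESLQDRGEGKVAT"),  # peptide from Fig. 3
    (5, "AGPK"),                    # hypothetical toy peptide, not from C1-inhibitor
]
print(infer_sites(identified))
```

Peptides that do not begin with S or T (for example a protein N-terminal fragment) are ignored, which is why the toy entry contributes no site.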
 
Here, we present a simple workflow for mapping glycosylation sites in an O-glycoprotein using OpeRATOR digestion (illustrated in Fig. 1). We demonstrate its usefulness by analyzing the O-glycosylation sites of the biopharmaceutical protein C1-inhibitor purified from human plasma, and compare the results with previously published O-glycosylated sites (5).

 

Figure 1. Schematic illustration of OpeRATOR activity. The sialic acids and N-glycans of the glycoprotein are enzymatically removed and OpeRATOR digests the protein N-terminally of the O-glycosylated sites. Missed digestion sites differ from molecule to molecule and the summarized data provide information on the modified sites.

Materials & Methods

O-glycoprotein Digestion using OpeRATOR

All reagents were reconstituted in MQ water according to Table 1. Digestion reactions of 100 μg were prepared by adding 2.5 μl C1-inhibitor (40 mg/ml), 2.5 μl SialEXO (40 U/μl), 2.5 μl OpeRATOR (40 U/μl), 5 μl PNGaseF (2 U/μl) and 37.5 μl 20 mM Tris buffer pH 7.5, giving final concentrations of 2 mg/ml C1-inhibitor, 1 U/μg each of SialEXO and OpeRATOR, and 0.1 U/μg of PNGaseF. The samples were incubated at 37°C overnight.
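The reaction set-up can be checked with a short calculation. The sketch below is illustrative only; the helper name `reaction_summary` is an assumption, and the volumes and stock concentrations are those stated in the text and in Table 1.

```python
# Volumes pipetted into one digestion reaction (μl), as stated in the text.
VOLUMES_UL = {
    "C1-inhibitor": 2.5,
    "SialEXO": 2.5,
    "OpeRATOR": 2.5,
    "PNGaseF": 5.0,
    "Tris buffer": 37.5,
}

# Stock concentrations: protein in μg/μl (= mg/ml), enzymes in U/μl.
STOCKS = {
    "C1-inhibitor": 40.0,
    "SialEXO": 40.0,
    "OpeRATOR": 40.0,
    "PNGaseF": 2.0,
}


def reaction_summary(volumes_ul, stocks, protein="C1-inhibitor"):
    """Return total volume, protein amount/concentration and enzyme ratios."""
    total_ul = sum(volumes_ul.values())
    protein_ug = volumes_ul[protein] * stocks[protein]
    summary = {
        "total_volume_ul": total_ul,
        "protein_ug": protein_ug,
        "protein_mg_per_ml": protein_ug / total_ul,  # μg/μl equals mg/ml
    }
    for name, conc in stocks.items():
        if name != protein:
            # Enzyme units added per μg of substrate protein.
            summary[name + "_U_per_ug"] = volumes_ul[name] * conc / protein_ug
    return summary


for key, value in reaction_summary(VOLUMES_UL, STOCKS).items():
    print(f"{key}: {value:g}")
```

For a different amount of substrate, the same function can be reused with scaled volumes to confirm that the enzyme-to-substrate ratios stay at 1 U/μg and 0.1 U/μg.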

 

Table 1. Reconstitution of Reagents

| Reagent | Supplier / Cat. No. | Units | MQ reconstitution volume (μl) | Concentration |
| --- | --- | --- | --- | --- |
| OpeRATOR® | Genovis / G2-OP1-020 | 2000 | 50 | 40 units/μl |
| SialEXO® | Genovis / G1-SM1-020 | 2000 | 50 | 40 units/μl |
| PNGaseF | Sigma / F8435 | 300 | 150 | 2 units/μl |
| C1-inhibitor | commercially available drug | - | - | 40 mg/ml |

 

Note: There are no cysteine bridges in the O-glycosylated sequence of the C1-inhibitor. Depending on the nature of the protein, denaturation and reduction may be needed prior to analysis or even prior to digestion if the O-glycans are not accessible. We suggest a 1 h reaction at 37°C in 4-6 M urea or GdHCl and up to 100 mM DTT or TCEP. Make sure to dilute the sample below 1 M of the chaotropic agent of choice before adding the enzymes. In some cases, additional proteolytic enzymes might be necessary to get O-glycopeptides of suitable sizes.
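As a quick aid for the dilution step in the note above, the following sketch computes the final volume at which the chaotrope reaches a target concentration. The function name and the 10 μl example volume are assumptions for illustration; the 6 M starting concentration and the 1 M limit come from the note.

```python
def final_volume_for_target(sample_ul, chaotrope_molar, target_molar=1.0):
    """Final volume (μl) at which the chaotrope is diluted to target_molar.

    Uses C1 * V1 = C2 * V2. To get *below* the target, dilute to a
    volume larger than the returned value.
    """
    if target_molar <= 0:
        raise ValueError("target_molar must be positive")
    return sample_ul * chaotrope_molar / target_molar


# Example: 10 μl of sample denatured in 6 M urea (volume is an assumed example).
threshold_ul = final_volume_for_target(10.0, 6.0)
print(f"Dilute to more than {threshold_ul:g} μl before adding the enzymes")
```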

 

Sample Clean-up on Graphite Spin Columns

To remove buffer salts and increase O-glycopeptide purity, clean-up was performed on Graphite Spin columns with a binding capacity of up to 100 μg (Pierce™ cat. # 88302). Follow the steps described in Table 2 below. Just before applying the digested protein sample in step 5, the entire reaction volume (50 μl) was diluted with 50 μl 2.5% TFA.

 

Table 2. O-glycopeptide Clean-up on Graphite Spin Columns

| Step | Description | Solution | Speed | Time | Repeat step |
| --- | --- | --- | --- | --- | --- |
| 1 | Remove storage solution | - | 2000 x g | 1 min | - |
| 2 | Prepare graphite | 100 μl 1 M NH4OH | 2000 x g | 1 min | 1 |
| 3 | Activate graphite | 100 μl acetonitrile (ACN) | 2000 x g | 1 min | - |
| 4 | Equilibrate column | 100 μl 1.0% TFA | 2000 x g | 1 min | 1 |
| 5 | Apply sample (50 μl digest + 50 μl 2.5% TFA) & incubate for 10 min with periodic vortex mixing | - | - | 10 min | - |
| 6 | Discard flow-through | - | 1000 x g | 3 min | - |
| 7 | Wash column | 100 μl 1.0% TFA | 2000 x g | 1 min | 1 |
| 8 | Elute O-glycopeptides. Re-apply the same volume | 50 μl 0.1% FA in 50% ACN | 2000 x g | 1 min | 3 |

Option 1:
Dry the eluted O-glycopeptides in a vacuum evaporator and dissolve the peptides in 10 μl MQ, then add 40 μl 100% ACN + 0.25 μl 100% FA to reach a high enough percentage of ACN for binding to the HILIC column [80% ACN, 0.5% FA].

Option 2:
Adjust the eluted sample to approximately the starting conditions of the HILIC LC. Add 30 μl 100% ACN and 0.25 μl 100% FA to 20 μl of the eluted sample [80% ACN, 0.5% FA].
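The bracketed target composition for both options can be verified with a small mixing calculation. This is a sketch under two assumptions that are not stated in the protocol: volumes are treated as additive, and in Option 2 the eluate is taken to have the composition of the elution solvent (50% ACN, 0.1% FA). The function name `mix` is hypothetical.

```python
def mix(parts):
    """Combine liquids and return (total μl, % ACN, % FA).

    parts: list of (volume_ul, acn_fraction, fa_fraction) tuples,
    with fractions given as v/v values between 0 and 1.
    """
    total = sum(volume for volume, _, _ in parts)
    acn = sum(volume * acn_frac for volume, acn_frac, _ in parts)
    fa = sum(volume * fa_frac for volume, _, fa_frac in parts)
    return total, 100.0 * acn / total, 100.0 * fa / total


# Option 1: dried peptides in 10 μl MQ + 40 μl ACN + 0.25 μl FA.
option1 = mix([(10.0, 0.0, 0.0), (40.0, 1.0, 0.0), (0.25, 0.0, 1.0)])

# Option 2: 20 μl eluate (assumed 50% ACN, 0.1% FA) + 30 μl ACN + 0.25 μl FA.
option2 = mix([(20.0, 0.5, 0.001), (30.0, 1.0, 0.0), (0.25, 0.0, 1.0)])

for name, (total, acn, fa) in (("Option 1", option1), ("Option 2", option2)):
    print(f"{name}: {total:.2f} μl, {acn:.1f}% ACN, {fa:.2f}% FA")
```

Both routes land close to 80% ACN and 0.5% FA, slightly below the 85.5% ACN that 90% Solvent B corresponds to at the start of the gradient.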

 

Analysis by HILIC and Mass Spectrometry

The intact O-glycopeptides were analyzed by HPLC-MS/MS using a 1260/1290 Infinity high performance liquid chromatography system (Agilent), and an Impact II Q-TOF mass spectrometer (Bruker) equipped with an electrospray ionization source. Acquiring data at different collision energies provided information on both the glycan structure based on the precursor ion mass and the oxonium ions present as well as the amino acid sequence (6). Instrument settings and the chromatography gradient are defined in the tables below.

 

Table 3. HPLC Setup

LC System: Agilent 1260/1290 Infinity
LC software: HyStar
Sample temperature: 8 °C
Flow rate: 0.2 ml/min
Injection volume: 20 μl
Column: Acquity UPLC Glycoprotein Amide Column, 300 Å, 1.7 μm, 2.1 mm x 150 mm
Column temperature: 40 °C
Method time: 85 minutes

 

Table 4. HILIC Solvent Gradient

| Time (minutes) | Solvent A (%): 0.5% formic acid in MQ | Solvent B (%): 0.5% formic acid in 95% ACN:5% MQ |
| --- | --- | --- |
| 0 | 10 | 90 |
| 0.5 | 10 | 90 |
| 71 | 50 | 50 |
| 72 | 95 | 5 |
| 74 | 95 | 5 |
| 75 | 10 | 90 |
| 85 | 10 | 90 |
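To see what solvent composition the column experiences at a given retention time, the gradient in Table 4 can be interpolated. The sketch below assumes linear ramps between the listed time points, which the table does not state explicitly, and the function names are hypothetical. Because Solvent B is 95% ACN, the effective ACN content is 0.95 times the percentage of B.

```python
# (time in minutes, % Solvent B) taken from Table 4.
GRADIENT = [(0.0, 90.0), (0.5, 90.0), (71.0, 50.0), (72.0, 5.0),
            (74.0, 5.0), (75.0, 90.0), (85.0, 90.0)]


def percent_b(t_min):
    """% Solvent B at time t_min, assuming linear ramps between points."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-85 min method")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_acn(t_min):
    """Effective % ACN, given that Solvent B contains 95% ACN."""
    return 0.95 * percent_b(t_min)


for t in (0.0, 35.75, 71.0, 73.0):
    print(f"{t:6.2f} min: {percent_b(t):5.1f}% B, {percent_acn(t):5.1f}% ACN")
```

The separating part of the method is the ramp from 90% to 50% B between 0.5 and 71 minutes, a slope of roughly 0.57% B per minute.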

 

Table 5. Mass Spectrometry Setup

Instrument: Bruker Impact II
MS software: Otof Control
Resolving power: 50,000
Mass range: 100-3,000 m/z
ESI source voltage: 4.5 kV
ESI source temperature: 220 °C
Nebulization gas pressure: 1.8 Bar
Nebulization gas flow rate: 8.0 l/min
MS spectra rate: 2 Hz
Precursor selection range: 300-3000
Precursor width: ± 0.5 Da
Total cycle time range: 3.5 s
Collision cell: 7.0 eV
Pre-pulse storage: 10 μs
Stepping: RF 800 Vpp, 100 μs; RF 2000 Vpp, 140 μs; 50 and 100% of collision energy
Data analysis software: Compass DataAnalysis 3.0

 

The MS/MS data were converted to .xml files in DataAnalysis v.3.0 and imported into Biopharma Compass, where the Spectra Classifier, based on diagnostic MS/MS spectra features, detected glycopeptide spectra and calculated the mass of the peptide moiety. Using GlycoQuest, the classified spectra were searched against the CarbBank database for glycan composition and accession number. A theoretical digest of the known protein sequence was performed, taking into account the peptide mass from the glycopeptide classification. If not already available, a new enzyme with N-terminal digestion at S and T was added to the theoretical digest parameters. The number of missed digestion sites must be considered, as not every S or T carries a glycan and O-glycans often occur in clusters rich in S and T. Methionine oxidation was set as a dynamic modification. The theoretical digest and glycan search results were combined, and data were assessed based on protein and glycan scores > 20, absolute intensities ≥ 1 × 10⁴ and RSM90 < 30 (RSM90 = deviation from predicted mass, root mean square value / root mean square 90% confidence value). An example mass spectrum is presented in Fig. 2.
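The custom enzyme rule used for the theoretical digest (cleavage N-terminal of every S and T, with missed sites allowed) can be illustrated as follows. This is a minimal sketch of the rule only and does not reproduce how Biopharma Compass implements it; the function name, the `max_missed` and `offset` parameters are assumptions. The peptide from Fig. 3 (residues 27-47) is used as a toy input sequence.

```python
def st_nterm_digest(sequence, max_missed=2, offset=0):
    """Enumerate peptides for an enzyme cutting N-terminally of S and T.

    max_missed: maximum number of uncleaved S/T sites inside a peptide.
    offset: added to positions so they match protein numbering.
    Returns a list of (start, end, peptide) with 1-based inclusive positions.
    """
    # A cut at index i means the bond between residues i-1 and i is cleaved.
    cuts = [0]
    cuts += [i for i, aa in enumerate(sequence) if aa in "ST" and i > 0]
    cuts.append(len(sequence))

    peptides = []
    for i, start in enumerate(cuts[:-1]):
        last = min(i + 1 + max_missed, len(cuts) - 1)
        for j in range(i + 1, last + 1):
            end = cuts[j]
            peptides.append((start + 1 + offset, end + offset, sequence[start:end]))
    return peptides


toy = "TSSSSQDPESLQDRGEGKVAT"  # peptide 27-47 from Fig. 3, used as toy input
for start, end, pep in st_nterm_digest(toy, max_missed=0, offset=26):
    print(start, end, pep)
```

With no missed sites the S/T-rich stretch collapses into single residues, which shows why allowing several missed digestion sites is essential when matching observed peptide masses in clustered regions.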

Results & Discussion

The specific digestion by OpeRATOR and the subsequent graphite clean-up ensured that the remaining peptides separated by HILIC were O-glycopeptides. In contrast to a standard trypsin digestion, where the O-glycopeptides would be few and each would carry many O-glycans, we obtained many overlapping O-glycopeptides carrying one, and at most three, O-glycans.

The total ion chromatogram trace of the MS analysis of the O-glycopeptides confirmed a good separation on the HILIC column. The similarity of the MS/MS trace to the extracted ion chromatogram (XIC) trace of the 366.1395 m/z oxonium ion (HexNAcHex) highlights the fact that most of the peptides carry O-glycans (Fig. 2). The MS trace could potentially be used as a simple O-glycopeptide fingerprinting assay for batch-to-batch or biosimilar comparison. Peptide spectrum matches and glycan search matches were combined to generate a glycopeptide result; see the example in Fig. 3.
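The m/z value used for the XIC follows directly from the elemental composition of the glycan fragment. The sketch below recomputes the singly protonated oxonium ions for HexNAc and HexNAcHex from standard monoisotopic atomic masses; the helper names are hypothetical.

```python
# Monoisotopic atomic masses (Da) and the proton mass.
ATOM = {"C": 12.0, "H": 1.00782503223, "N": 14.00307400443, "O": 15.99491461957}
PROTON = 1.007276466

# Elemental compositions of the glycan residues (monosaccharide minus water).
RESIDUE = {
    "HexNAc": {"C": 8, "H": 13, "N": 1, "O": 5},
    "Hex": {"C": 6, "H": 10, "O": 5},
}


def residue_mass(name):
    """Monoisotopic mass of one glycan residue."""
    return sum(ATOM[element] * count for element, count in RESIDUE[name].items())


def oxonium_mz(residues):
    """m/z of the singly charged oxonium ion built from the given residues."""
    return sum(residue_mass(name) for name in residues) + PROTON


print(f"HexNAc:    {oxonium_mz(['HexNAc']):.4f}")
print(f"HexNAcHex: {oxonium_mz(['HexNAc', 'Hex']):.4f}")
```

The HexNAcHex value agrees with the 366.1395 m/z used for the extracted ion chromatogram, and the HexNAc ion falls at about 204.087 m/z.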

The theoretical digest match of the O-glycopeptide classified spectra against the known protein structure resulted in peptides with overlapping sequence coverage. Fig. 4 and Table 6 summarize the various identified peptides, some of which are detected with different degrees of glycosylation.

 

Figure 2. Total ion chromatogram traces from MS (top) and MS/MS (middle) analysis of the peptides from OpeRATOR digestion of the C1-inhibitor protein. The XIC of the HexNAcHex oxonium ion at 366.1395 m/z (bottom) confirms that most of the peaks are O-glycopeptides.

 

Figure 3. Example of combined results from theoretical digest peptide annotation and GlycoQuest glycan annotation. From the acquired data, the O-glycosylated peptide TSSSSQDPESLQDRGEGKVAT (27-47) could be identified as carrying HexNAcHex, and as a result of OpeRATOR specificity, the N-terminal threonine was defined as the site of modification. The identified oxonium ions are those of HexNAcHex (m/z 366.138) and HexNAc (m/z 204.085).

 

 

Figure 4. Schematic illustration of the O-glycan rich N-terminal amino acid region (aa 25-120) of the C1-inhibitor protein and the various O-glycopeptides generated by OpeRATOR digestion. The O-glycosylated serine (S) and threonine (T) sites are presented with bold letters. As no additional enzyme was used, the T118 and T119 sites are only verified by the C-terminus of the preceding peptides.

Table 6. A selection of the identified peptide sequences and glycoforms verified by the corresponding MS/MS data

In a recent publication on O-glycosylation of the C1-inhibitor, a trypsin/pronase/proteinase K-based approach was used (5). Ten out of twenty-six O-glycosylation sites could be determined, but the specific sites of the 12-16 O-glycans situated between residues 81 and 122 could not be identified. Using the OpeRATOR enzyme, we were able to map all of the O-glycosylation sites within this sequence.

We have demonstrated that O-glycan site determination is possible using the OpeRATOR enzyme. We used the heavily O-glycosylated C1-inhibitor in a workflow including desialylation, N-deglycosylation, OpeRATOR digestion and sample clean-up, followed by HILIC separation and MS analysis. This protocol is applicable to other glycoproteins, with optional adjustments such as denaturation, reduction or addition of other proteases that might be needed to achieve O-glycopeptides of suitable sizes for MS analysis.

Two recent publications have demonstrated further developments of OpeRATOR-based O-glycan analysis. They showed workflows where tryptic peptides of O-glycoproteins were immobilized covalently onto a solid support and O-glycopeptides were specifically released by digestion with OpeRATOR (7,8). This enables analysis of many different O-glycoproteins and even complex samples like cell or tissue lysates.

References

  1. Varki A., Cummings R.D., Esko J.D., Freeze H.H., Stanley P., Bertozzi C.R., Hart G.W., and Etzler M.E. (2009). Essentials of Glycobiology, 2nd edition. Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press.
  2. Gill D.J., Clausen H., and Bard F. (2011). Location, location, location: new insights into O-GalNAc protein glycosylation. Trends in Cell Biology 21, 149-158.
  3. Zsuzsanna Darula, Katalin F. Medzihradszky. (2018). Analysis of mammalian O-glycopeptides - We have made a good start but there is a long way to go. Molecular and Cellular Proteomics 17.1.
  4. Kay-Hooi Khoo. (2019). Advances toward mapping the full extent of protein site-specific O-GalNAc glycosylation that better reflects underlying glycomic complexity. Current Opinion in Structural Biology, 56:146–154.
  5. Kathrin Stavenhagen, H. Mehmet Kayili, Stephanie Holst, Carolien A.M. Koeleman, Ruchira Engel, Diana Wouters, Sacha Zeerleder, Bekir Salih and Manfred Wuhrer. (2018). N- and O-glycosylation Analysis of Human C1-inhibitor Reveals Extensive Mucin-type O-glycosylation. Molecular and Cellular Proteomics 17.6.
  6. Hinneburg H, Stavenhagen K, Schweiger-Hufnagel U, Pengelley S, Jabs W, Seeberger PH, Silva DV, Wuhrer M, Kolarich D. (2016). The Art of Destruction: Optimizing Collision Energies in Quadrupole-Time of Flight (Q-TOF) Instruments for Glycopeptide-Based Glycoproteomics. J Am Soc Mass Spectrom. Mar;27(3):507-19. doi: 10.1007/s13361-015-1308-6. Epub 2016 Jan 4.
  7. Shuang Yang, Philip Onigman, Wells W. Wu, Jonathan Sjögren, Helén Nyhlén, Rong-Fong Shen, and John Cipollo. (2018). Deciphering Protein O-Glycosylation: Solid-Phase Chemoenzymatic Cleavage and Enrichment. Analytical Chem., 90, 13, 8261-8269.
  8. Weiming Yang, Minghui Ao, Yingwei Hu, Qing Kay Li, Hui Zhang. (2018). Mapping the O-glycoproteome using site-specific extraction of O-linked glycopeptides (EXoO). Molecular Systems Biology 14, e8486. DOI 10.15252/msb.20188486. Published online 20.11.2018.
