Oral Presentation HUPO 2019 - 18th Human Proteome Organization World Congress

Combining Mass Spectrometry with Machine Learning Algorithms to Classify Core and Outer Fucosylation in N-Glycoproteins (64013)

Heeyoun Hwang 1, Hoi Keun Jeong 1,2, Hyun Kyoung Lee 1,2, GunWook Park 1, Ju Yeon Lee 1, Soo Youn Lee 1, Hyun Joo An 2, Jin Young Kim 1, Jong Shin YOO 1,2
  1. Korea Basic Science Institute, Cheongju, CHUNGBUK, Rep. of Korea
  2. Graduate School of Analytical Science and Technology, Chungnam National University, Daejeon, Rep. of Korea

Classification of fucosylated N-glycoproteins, including their structural core- and outer-fucosylated isoforms, remains a challenge in mass spectrometry (MS). Here, we report for the first time the classification of N-glycopeptides as core- or outer-fucosylated types using MS/MS spectra and machine learning algorithms, namely a deep neural network (DNN) and a support vector machine (SVM). Training and test sets of more than 800 MS/MS spectra of glycopeptides from immunoglobulin gamma and alpha-1-acid glycoprotein standards were selected for classification of fucosylation types using supervised learning models. The best-performing model was selected on the basis of an accuracy above 99% against manual characterization and area under the curve (AUC) values above 0.99, calculated from probability scores of the target and decoy datasets. Finally, this model was applied to classify fucosylated N-glycoproteins from human plasma. A total of 82 N-glycopeptides (54 core, 24 outer, and 4 dual fucosylation types) derived from 54 glycoproteins were assigned the same type by both DNN and SVM. Notably, outer fucosylation was dominant in tri- and tetra-antennary N-glycopeptides, while core fucosylation was dominant in mono-antennary, bi-antennary, and hybrid types of glycoproteins in human plasma. Thus, combining DNN and SVM machine learning methods with MS/MS can be used to distinguish between different isoforms of fucosylated N-glycopeptides.
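
As an illustration only, the two evaluation ideas mentioned above (an area under the curve computed from target and decoy probability scores, and keeping only the assignments on which two classifiers agree) can be sketched in plain Python. The abstract does not specify how these values were computed, so the pairwise-ranking AUC formula, the function names, and all scores and glycopeptide identifiers below are assumptions made for the example, not the authors' implementation or data.

```python
def auc_from_scores(target_scores, decoy_scores):
    """Area under the ROC curve from two lists of probability scores.

    Assumption: AUC is taken as the probability that a randomly chosen
    target spectrum scores higher than a randomly chosen decoy spectrum,
    with ties counted as one half.
    """
    if not target_scores or not decoy_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for t in target_scores:
        for d in decoy_scores:
            if t > d:
                wins += 1.0
            elif t == d:
                wins += 0.5
    return wins / (len(target_scores) * len(decoy_scores))


def consensus(labels_a, labels_b):
    """Keep only the items that two classifiers assign to the same type."""
    return {key: label for key, label in labels_a.items()
            if labels_b.get(key) == label}


# Hypothetical scores and labels, invented purely for illustration.
example_targets = [0.98, 0.95, 0.91, 0.60]
example_decoys = [0.05, 0.10, 0.40, 0.70]
example_auc = auc_from_scores(example_targets, example_decoys)

dnn_labels = {"gp1": "core", "gp2": "outer", "gp3": "core"}
svm_labels = {"gp1": "core", "gp2": "outer", "gp3": "outer"}
agreed = consensus(dnn_labels, svm_labels)  # gp3 is dropped (disagreement)
```

In this toy setup, an AUC close to 1 would mean that target spectra almost always receive higher probability scores than decoys, and `agreed` plays the role of the set of glycopeptides classified as the same type by both models.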