Inequality relations for NMR-based polymer homoblock analysis and extended application: Reanalysis of historical data on alginates, chitosans, homogalacturonans, and galactomannans

Xiaohui Xing; Kanglin Xing; Yves S Y Hsieh; D Wade Abbott

doi:10.1016/j.carres.2024.109189

Carbohydr Res. 2024 Aug:542:109189. doi: 10.1016/j.carres.2024.109189. Epub 2024 Jun 11.

Authors

Xiaohui Xing¹, Kanglin Xing², Yves S Y Hsieh³, D Wade Abbott⁴

Affiliations

¹ Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, 5403 1st Avenue South, Lethbridge, Alberta T1J 4B1, Canada.
² Department of Mechanical Engineering, École de technologie Supérieure, 1100 Notre-Dame Street West, Montreal, Quebec H3C 1K3, Canada.
³ Division of Glycoscience, Department of Chemistry, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, AlbaNova University Centre, Stockholm SE10691, Sweden; School of Pharmacy, College of Pharmacy, Taipei Medical University, 250 Wuxing Street, Taipei 11031, Taiwan.
⁴ Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, 5403 1st Avenue South, Lethbridge, Alberta T1J 4B1, Canada.

PMID: 38971003

Abstract

Quantitative analysis of the frequencies of homoblock polyads beyond triads by ¹H and ¹³C NMR has long been a bottleneck for linear polysaccharides. This is primarily because monosaccharides within a long homoblock have identical neighboring units and therefore similar chemical environments, which results in indistinct NMR peaks. In this study, inequality relations were established through rigorous mathematical induction that enable the calculation of frequency ranges of homoblock polyads from historically reported NMR-derived diad and/or triad frequencies of alginates, chitosans, homogalacturonans, and galactomannans. The calculated homoblock frequency ranges were then used to evaluate three statistical chain-growth models (the Bernoulli chain, the first-order Markov chain, and the second-order Markov chain) for predicting homoblock frequencies in these polysaccharides. Furthermore, based on the derived inequality relations, a novel 2D array was constructed that enables graphical visualization of homoblock features in polysaccharides. As a proof of concept, the 2D array, together with a 1D code generated from it, was shown to serve as an effective feature engineering tool for polymer classification using machine learning algorithms.
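To make the ideas in the abstract concrete, the following is a minimal Python sketch and not the authors' method. It assumes a long chain with negligible end effects and writes F(A^n) for the frequency of a run of n consecutive A units. The Bernoulli and first-order Markov formulas are the textbook forms. The bounds function uses a generic stationarity argument (F(A^n) is non-increasing and convex in n), which may differ from the inequality relations derived in the paper. All function names and the numerical inputs are hypothetical.

```python
from typing import Tuple


def bernoulli_homoblock(f_a: float, n: int) -> float:
    """F(A^n) if units are placed independently: F_A ** n."""
    return f_a ** n


def markov1_homoblock(f_a: float, f_aa: float, n: int) -> float:
    """F(A^n) for a first-order Markov chain.

    Uses P(A|A) = F_AA / F_A, so F(A^n) = F_A * P(A|A) ** (n - 1).
    """
    if n == 1:
        return f_a
    p_aa = f_aa / f_a
    return f_a * p_aa ** (n - 1)


def homoblock_bounds(f_prev: float, f_curr: float) -> Tuple[float, float]:
    """Generic (assumed) bounds on F(A^(n+1)) from F(A^(n-1)) and F(A^n).

    Upper bound: a longer run cannot be more frequent than a shorter one.
    Lower bound: F(A^n) - F(A^(n+1)) <= F(A^(n-1)) - F(A^n), clipped at 0.
    """
    lower = max(0.0, 2.0 * f_curr - f_prev)
    return lower, f_curr


# Hypothetical monad and diad frequencies, not taken from the paper.
f_a, f_aa = 0.6, 0.42

triad_bernoulli = bernoulli_homoblock(f_a, 3)
triad_markov1 = markov1_homoblock(f_a, f_aa, 3)
triad_low, triad_high = homoblock_bounds(f_a, f_aa)

print(f"Bernoulli triad:    {triad_bernoulli:.3f}")
print(f"Markov-1 triad:     {triad_markov1:.3f}")
print(f"Allowed triad range: [{triad_low:.3f}, {triad_high:.3f}]")
```

A model prediction that falls outside the range computed from measured diad or triad frequencies is inconsistent with those measurements, which is the kind of check the abstract describes for evaluating the chain-growth models.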

Keywords: Feature engineering; Homoblock; Machine learning; Markov chain; NMR; Polysaccharide.

MeSH terms

Alginates* / chemistry
Galactose / analogs & derivatives
Galactose / chemistry
Magnetic Resonance Spectroscopy*
Mannans* / chemistry
Pectins

Substances

Mannans
Alginates
galactomannan
polygalacturonic acid
Galactose
Pectins