Research Insights

U24 – GlyGen growth and evolution into a central resource for glycans and glycoconjugates, Michael Tiemeyer

Post author By orandall
Post date March 22, 2023

GlyGen is a maturing (five years old) knowledgebase that accumulates data in the glycobiology domain and connects it with other data types. GlyGen is unique; no other present or prior informatics resource has undertaken such an integrative mission. The growth of accessible knowledge in the glycobiology domain has lagged behind that of other -omics and biomedical research fields, in part because of the inherent complexity of glycan structures. Unlike genes and proteins, glycans exist in branched and isomeric forms, and their biosynthesis is not attributable to well-characterized, template-driven processes such as transcription or translation. Rather, glycan biosynthesis is mediated by the regulated expression of ensembles of glycosyltransferases, substrate transporters, and secretory pathway regulatory mechanisms that together generate dynamic cell- and tissue-specific patterns of protein and lipid glycosylation. In addition, each glycosylation site on a glycoprotein may routinely be modified by any one of an ensemble of glycan structures, a glycoprotein feature called microheterogeneity. Importantly, microheterogeneity is not random; it reflects the intrinsic biosynthetic capacity of specific cells and tissues and may be modified by disease. These structural and biosynthetic complexities are essential contributors to the tissue- and disease-specific functions of glycans and glycosylation and therefore need to be captured and represented in knowledgebases in a way that allows them to be queried and linked to other types of data.

GlyGen aims to expand its underlying data model to accommodate new and more complex data types, to augment and integrate new data types, and to implement robust modeling, unified procedures, and tools that improve discovery and exploration of glycan and glycoconjugate data. The overall functionality of the resource will be enhanced through front-end improvements that accommodate user preferences and ensure exceptional data communication and visualization. Improving the interconnectivity of GlyGen and its partner databases, as well as enhancing data sharing across resources, will continue to be core principles of the GlyGen project. All resulting harmonized data will be available through highly permissive licenses for easy integration into other resources, such as NCBI, EBI, SIB, and other international efforts, as well as for easy repurposing by independent researchers, educators, bioinformaticians, and commercial entities. By the end of the next project period, GlyGen expects to become the go-to, well-integrated resource for glycoscience data, similar to existing protein and genomic resources and serving the same broad community of biomedical researchers.

Funder: NIH
Amount: $5,397,508
PI: Michael Tiemeyer