The Ritz Herald
University of California, Riverside

New Data Science Tool Speeds Up Molecular Analysis of Environment


UC Riverside-led team developed the tool through an international virtual research group

Published on September 21, 2024

A research team led by scientists at the University of California, Riverside, has developed a computational workflow for analyzing large data sets in the field of metabolomics, the study of small molecules found within cells, biofluids, tissues, and entire ecosystems.

Most recently, the team applied this new computational tool to analyze pollutants in seawater in Southern California. The team swiftly captured the chemical profiles of coastal environments and highlighted potential sources of pollution.

“We are interested in understanding how such pollutants get introduced in the ecosystem,” said Daniel Petras, an assistant professor of biochemistry at UC Riverside, who led the research team. “Figuring out which molecules in the ocean are important for environmental health is not straightforward because of the ocean’s sheer chemical diversity. The protocol we developed greatly speeds up this process. More efficient sorting of the data means we can understand problems related to ocean pollution faster.”

Petras and his colleagues report today in the journal Nature Protocols that their protocol is designed not only for experienced researchers but also for educational purposes, making it an ideal resource for students and early-career scientists. The computational workflow is accompanied by a web application with a graphical user interface that makes metabolomics data analysis accessible to non-experts and lets them gain statistical insights into their data within minutes.

“This tool is accessible to a broad range of researchers, from absolute beginners to experts, and is tailored for use in conjunction with the molecular networking software my group is developing,” said coauthor Mingxun Wang, an assistant professor of computer science and engineering at UCR. “For beginners, the guidelines and code we provide make it easier to understand common data processing and analysis steps. For experts, it accelerates reproducible data analysis, enabling them to share their statistical data analysis workflows and results.”

Petras explained that the research paper is unique in serving as a large educational resource organized through a virtual research group called Virtual Multiomics Lab, or VMOL. With more than 50 participating scientists from around the world, VMOL is a community-driven, open-access initiative. It aims to simplify and democratize the chemical analysis process, making it accessible to researchers worldwide, regardless of their background or resources.

“I’m incredibly proud to see how this project evolved into something impactful, involving experts and students from across the globe,” said Abzer Pakkir Shah, a doctoral student in Petras’ group and the first author of the paper. “By removing physical and economic barriers, VMOL provides training in computational mass spectrometry and data science and aims to launch virtual research projects as a new form of collaborative science.”

Staff Writer