By combining the genetic sequencing and analysis of the microbes in a milk sample with artificial intelligence (AI), researchers were able to detect anomalies in milk production, such as contamination or unauthorized additives. The new approach could help improve dairy safety, according to the study authors from Penn State, Cornell University and IBM Research.
In findings published in mSystems, a journal of the American Society for Microbiology, the researchers reported that using shotgun metagenomics data and AI, they were able to detect antibiotic-treated milk that had been experimentally and randomly added to the bulk tank milk samples they collected. To validate their findings, the researchers also applied their explainable AI tool to publicly available, genetically sequenced datasets from bulk milk samples, further demonstrating the untargeted approach’s robustness.
“This was a proof-of-concept study,” said study lead Erika Ganda, assistant professor of food animal microbiomes in the Penn State College of Agricultural Sciences. “We can look at the data from the microbes in the raw milk and, using artificial intelligence, see if the microbes that are present reveal characteristics such as whether it is pre-pasteurization, post-pasteurization, or is from a cow that has been treated with antibiotics.”
The researchers collected 58 bulk tank milk samples and applied various AI algorithms to differentiate between baseline samples and those representing potential anomalies, such as milk from an outside farm or milk containing antibiotics. The study characterized raw milk metagenomes (collections of genomes from many individual microbes within a sample) at greater sequencing depth than any other published work to date and demonstrated that a set of consensus microbes remains stable across samples.
The study’s findings suggest that AI has the potential to significantly enhance the detection of anomalies in food production, providing a more comprehensive method that can be added to scientists’ toolkit for ensuring food safety, Ganda explained.
“Traditional analysis of microbial sequencing data, such as alpha and beta diversity metrics and clustering, were not as effective in differentiating between baseline and anomalous samples,” she said. “However, the integration of AI allowed for accurate classification and identification of microbial drivers associated with anomalies.”
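For readers unfamiliar with the traditional metrics Ganda mentions, the sketch below shows two common choices: the Shannon index as an alpha diversity measure (diversity within one sample) and Bray-Curtis dissimilarity as a beta diversity measure (difference between two samples). The article does not say which specific metrics the team computed, so these two are assumptions chosen for illustration, and the taxon counts are invented rather than taken from the study.

```python
import math

def shannon_index(counts):
    """Alpha diversity: Shannon index (natural log) for one sample's taxon counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)

def bray_curtis(a, b):
    """Beta diversity: Bray-Curtis dissimilarity between two samples.

    Both lists must hold counts for the same taxa in the same order.
    Returns 0.0 for identical samples and 1.0 for samples sharing no taxa.
    """
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator

# Hypothetical taxon counts, not data from the study.
baseline = [50, 30, 20]
suspect = [48, 30, 22]

print(shannon_index(baseline), shannon_index(suspect))
print(bray_curtis(baseline, suspect))
```

Each metric compresses a whole microbial community into a single number, so two samples can score almost identically even when a handful of taxa have shifted.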
Microbial systems and the food supply chain are an ideal application for AI since the interactions between microbes are complex and dynamic, according to the study’s first author Kristen Beck, senior research scientist from IBM Research.
“There are also a multitude of variables in the food supply chain that affect the signal we’re seeking to observe,” she said. “AI can help us untangle the signal from the noise.”
While focused on dairy production, this research has implications for the wider food industry, Ganda noted. She added that milk was selected as a model because it is the sole ingredient used to produce fluid milk, a high-volume food with considerable concern for fraud, particularly in developing countries.
Issues in food quality and safety can ripple through the supply chain, causing substantial health and economic damage, Ganda explained. As a result, there is strong interest in applying both targeted and untargeted methods to identify ingredients or food products that show an increased risk of food fraud or of food quality and safety issues.
“Untargeted methods characterize all molecules that can be identified to identify ingredients or products that deviate from a ‘baseline state’ that would be considered normal or under control,” she said. “Importantly, these untargeted methods are screening methods that do not define an ingredient or product as unsafe or adulterated, rather they suggest an aberration from the normal state that should trigger follow-up actions or investigations.”
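The screening idea Ganda describes, flagging a sample that deviates from a baseline state without declaring it unsafe, can be illustrated with a very simple rule. The sketch below is not the method used in the study, which relied on AI classifiers. It is a hypothetical z-score check in which the function names, the three-standard-deviation threshold and the relative abundance values are all assumptions made for illustration.

```python
from statistics import mean, stdev

def fit_baseline(samples):
    """Compute per-feature (mean, standard deviation) across baseline samples."""
    columns = list(zip(*samples))
    return [(mean(col), stdev(col)) for col in columns]

def screen(sample, baseline, threshold=3.0):
    """Return indices of features more than `threshold` SDs from the baseline mean.

    An empty list means nothing was flagged. A non-empty list only signals
    that follow-up is warranted; it does not show the sample is unsafe.
    """
    flagged = []
    for i, (value, (mu, sd)) in enumerate(zip(sample, baseline)):
        if sd > 0 and abs(value - mu) / sd > threshold:
            flagged.append(i)
    return flagged

# Hypothetical relative abundances for three taxa; not data from the study.
baseline_samples = [
    [0.50, 0.30, 0.20],
    [0.52, 0.29, 0.19],
    [0.49, 0.31, 0.20],
    [0.51, 0.30, 0.19],
]
baseline = fit_baseline(baseline_samples)

print(screen([0.50, 0.30, 0.20], baseline))  # typical sample
print(screen([0.50, 0.10, 0.40], baseline))  # two taxa far from baseline
```

A rule like this checks each feature in isolation, whereas the classifiers described in the article can weigh many microbial features together.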
The unique research collaboration leveraged each partner’s strength, Ganda pointed out. It featured IBM’s open-source AI technology, Automated Explainable AI for Omics, to process vast amounts of metagenomic data, meaning all the nucleotide sequences isolated and analyzed from all the microbes in bulk milk samples, enabling the identification of microbial signatures that traditional methods can often miss. The Cornell researchers’ expertise in dairy science elevated the practical relevance of the research and its applicability to the dairy industry, while Penn State’s One Health Microbiome Center in the Huck Institutes of the Life Sciences played a critical role in integrating microbial data for broader health and safety applications.
Also contributing to the research were Niina Haiminen, Akshay Agarwal, Anna Paola Carrieri, Matthew Madgwick, Jennifer Kelly and Ban Kawas, IBM Research; Victor Pylro, Federal University of Lavras, Brazil; and Martin Wiedmann, Cornell University.
The U.S. Department of Agriculture supported this work through Penn State.