Source: UNIVERSITY OF MISSOURI submitted to NRP
DSFAS: EXPLORING A MACHINE LEARNING-DRIVEN APPROACH FOR MULTIPLEX DETECTION OF FOOD CONTAMINANTS BY SERS
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
ACTIVE
Funding Source
Reporting Frequency
Annual
Accession No.
1030590
Grant No.
2023-67017-40165
Cumulative Award Amt.
$649,483.00
Proposal No.
2022-11567
Multistate No.
(N/A)
Project Start Date
Jul 1, 2023
Project End Date
Jun 30, 2026
Grant Year
2023
Program Code
[A1541]- Food and Agriculture Cyberinformatics and Tools
Recipient Organization
UNIVERSITY OF MISSOURI
(N/A)
COLUMBIA,MO 65211
Performing Department
(N/A)
Non Technical Summary
Better data analysis means better food safety. In recent years, artificial intelligence (AI)-powered solutions have been emerging in agriculture and food science due to their superior pattern recognition and prediction capability. Integration of food science and AI can help address new challenges such as the increasing volume of data acquired from mixed samples and complex food matrices, which is difficult to analyze using traditional methods. The overall goal is to integrate novel machine learning (ML) methods, such as attention-based deep networks, with a surface-enhanced Raman spectroscopy (SERS) platform for multiplex detection and quantification of food contaminants with high accuracy. Specific objectives are to synthesize nanosubstrates and acquire SERS spectral data of different types and quantities of pesticides by SERS measurement; develop attention-based deep learning prediction methods for qualitative and quantitative analysis of single food contaminants and multiple/mixed food contaminants; validate and assess SERS-ML techniques for multiplex detection of chemical contaminants in fresh produce; and establish protocols/databases for measuring food contaminants. This study will be the first systematic investigation of pesticides in fresh produce by SERS coupled with machine learning algorithms, which will be more accurate, sensitive, and reliable than current methods. The project will broaden the applications of data science in maintaining the sustainability of U.S. agriculture and food systems.
Animal Health Component
40%
Research Effort Categories
Basic
60%
Applied
40%
Developmental
(N/A)
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
711 | 5220 | 2000 | 100%
Goals / Objectives
The overall objectives of this multi-disciplinary project are to develop and integrate novel machine learning algorithms with a surface-enhanced Raman spectroscopy (SERS) platform for multiplex detection and quantification of food contaminants in fresh produce with high accuracy. Specific objectives of this project are to synthesize SERS substrates; acquire SERS spectral data of different types and quantities of pesticides by SERS measurement; develop attention-based deep networks for qualitative and quantitative analysis of food contaminants; and validate SERS coupled with deep learning methods for multiplex detection of pesticides in fresh produce.
Project Methods
The methods for this project include: (1) synthesize novel SERS-active substrates, and acquire and tag SERS spectral data of different types and concentrations of pesticides by SERS measurement and mapping; (2) conduct pre-processing of SERS spectral data (denoising, baseline correction, and normalization); (3) develop attention-based deep networks for qualitative and quantitative analysis of SERS spectral data of single food contaminants; (4) develop deep learning prediction methods for qualitative and quantitative analysis of SERS spectral data of multiple/mixed food contaminants; (5) validate and assess SERS coupled with deep learning for multiplex detection of chemical contaminants in fresh produce, and establish protocols/databases for measuring food contaminants.
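The pre-processing step in method (2) can be illustrated with a minimal sketch. The report names the three stages (denoising, baseline correction, normalization) but not the algorithms used, so the choices here (moving-average smoothing, a straight-line baseline between the endpoints, and min-max scaling) are assumptions made for illustration only, as are all function names and the toy spectrum.

```python
def moving_average(signal, window=3):
    """Denoise by averaging each point with its neighbours (assumed method)."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        start = max(0, i - half)
        stop = min(len(signal), i + half + 1)
        chunk = signal[start:stop]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def subtract_linear_baseline(signal):
    """Remove a straight line drawn between the first and last points (assumed method)."""
    n = len(signal)
    if n < 2:
        return [0.0] * n
    step = (signal[-1] - signal[0]) / (n - 1)
    return [y - (signal[0] + step * i) for i, y in enumerate(signal)]


def min_max_normalize(signal):
    """Rescale intensities to the range [0, 1]."""
    low, high = min(signal), max(signal)
    if high == low:
        return [0.0] * len(signal)
    return [(y - low) / (high - low) for y in signal]


def preprocess(signal, window=3):
    """Apply denoising, baseline correction, and normalization in that order."""
    return min_max_normalize(subtract_linear_baseline(moving_average(signal, window)))


# Hypothetical toy spectrum: a sloped background with one peak centred at index 10.
raw = [10.0 + 0.5 * i for i in range(21)]
raw[9] += 20.0
raw[10] += 40.0
raw[11] += 20.0
clean = preprocess(raw)
```

In practice, spectral pipelines often use more elaborate smoothing and baseline-fitting methods; the three-stage order is the only part taken from the report.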

Progress 07/01/24 to 06/30/25

Outputs
Target Audience: The target audience includes students and faculty from academia, food scientists, professionals from the food industry, and regulators from government agencies.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? This project provided training for two doctoral students, one in food science and the other in computer science.
How have the results been disseminated to communities of interest? The results of this project have been disseminated in two top-tier peer-reviewed journals: ACS Applied Materials & Interfaces, and Journal of Hazardous Materials. They will also be disseminated at professional conferences, including the IFT conference, USDA PD meeting, and USDA Multistate NC-1194 annual meeting.
What do you plan to do during the next reporting period to accomplish the goals? During the next reporting period, we will continue to test food samples contaminated with multiple mixed pesticides by SERS coupled with the data-processing capabilities of machine learning algorithms. The combination of SERS and machine learning will allow for the extraction of intricate spectral variances with high accuracy, facilitating the precise identification of pesticide compounds even in complex matrices.

Impacts
What was accomplished under these goals? The widespread use of pesticides in agriculture poses food safety and environmental risks, highlighting the need for rapid detection techniques to mitigate contamination. Surface-enhanced Raman spectroscopy (SERS) coupled with machine learning provides a powerful approach for the detection and quantification of multiple pesticides in agricultural products. This study introduced the SERSFormer-2.0 model, a transformer-based machine learning architecture that performs both multi-label classification and multi-regression tasks for pesticide analysis. Au-core Ag-shell nanoparticles were synthesized, and Raman spectral data of the samples were collected using a DXR2 Raman spectrometer. SERSFormer-2.0 processes the SERS data from spinach and strawberry samples contaminated with five different pesticides: carbophenothion, coumaphos, oxamyl, phosmet, and thiabendazole. Every food sample contains one or two of these pesticides in equal ratios. The dataset has varied concentration levels ranging from 0.5 to 10 ppm for each mixture. SERSFormer-2.0 employs a novel multi-task learning approach with task-specific feature representation layers, a shared multi-head attention transformer encoder, and task-specific output layers to detect pesticides and estimate the concentration of each pesticide simultaneously. Using spectra acquired with the core-shell gold-silver nanoparticles, the model achieves near-perfect performance in identifying pesticide residues, as shown by the multi-label metrics (accuracy = 0.999; F1 score = 0.992; precision = 0.990; recall = 0.996). For concentration estimation, the model achieved a Mean Absolute Square Error of 0.136 and an R² score of 0.804, indicating strong predictive power. A detailed examination of the Raman spectra reveals the predominant influence of certain pesticides, and the mechanisms behind spectral dominance were elucidated.
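To make the multi-task setup concrete, the sketch below shows one plausible way to encode a sample as a pair of targets: a multi-hot label vector for the multi-label classification task and a concentration vector for the multi-regression task. This encoding is an assumption for illustration and is not taken from the published model; the pesticide names are spelled as in the 07/01/23 to 06/30/24 progress section of this report, and `encode_targets` and the example sample are hypothetical.

```python
# Five pesticides named in this report, in a fixed order that defines
# the positions of the target vectors.
PESTICIDES = ["carbophenothion", "coumaphos", "oxamyl", "phosmet", "thiabendazole"]


def encode_targets(sample):
    """Map {pesticide: ppm} to (multi-hot labels, per-pesticide concentrations)."""
    unknown = set(sample) - set(PESTICIDES)
    if unknown:
        raise ValueError(f"unknown pesticide(s): {sorted(unknown)}")
    labels = [1 if name in sample else 0 for name in PESTICIDES]
    concentrations = [float(sample.get(name, 0.0)) for name in PESTICIDES]
    return labels, concentrations


# Hypothetical sample: two pesticides at equal concentration (values in ppm).
labels, ppm = encode_targets({"coumaphos": 2.0, "oxamyl": 2.0})
```

With targets in this form, a classification head can be trained against `labels` and a regression head against `ppm`, both reading from the same shared encoder.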
In this work, we have demonstrated the efficacy of the SERSFormer-2.0 model in both multi-label classification and multi-regression tasks related to pesticide detection and concentration estimation. The model's performance, characterized by near-perfect multi-label metrics and strong predictive accuracy in concentration estimation, highlights its potential as a reliable tool for real-world applications in agriculture and food safety. Our findings underscore the importance of accurate pesticide detection, particularly in complex scenarios where multiple pesticides may be present simultaneously. The characterization of the synthesized core-shell nanoparticles, coupled with the nuanced analysis of Raman spectra for pesticide pairs, provides insights into the underlying mechanisms driving spectral interactions and dominance. The implications of this work extend beyond the immediate scope of pesticide detection, offering a framework for improving analytical techniques in various fields where accurate and sensitive detection of multiple analytes is crucial.
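The multi-label metrics reported for this period (accuracy, F1 score, precision, recall) can be computed as in the sketch below. The report does not state the averaging scheme, so micro-averaging over all sample-label pairs is an assumption here, and the example predictions are made up for illustration.

```python
def multilabel_metrics(y_true, y_pred):
    """Micro-averaged precision, recall, F1, and element-wise accuracy."""
    tp = fp = fn = tn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for truth, guess in zip(true_row, pred_row):
            if truth and guess:
                tp += 1
            elif guess:
                fp += 1
            elif truth:
                fn += 1
            else:
                tn += 1
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for three samples over five pesticides.
y_true = [[0, 1, 1, 0, 0], [1, 0, 0, 0, 0], [0, 0, 0, 1, 1]]
y_pred = [[0, 1, 1, 0, 0], [1, 0, 0, 0, 1], [0, 0, 0, 1, 0]]
scores = multilabel_metrics(y_true, y_pred)
```

Other conventions, such as macro-averaging per pesticide or exact-match (subset) accuracy, would give different numbers on the same predictions.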

Publications

  • Type: Peer Reviewed Journal Articles Status: Published Year Published: 2025 Citation: Hegde, A., Hajikhani, M., Snyder, J., Cheng, J., Lin, M. 2025. Leveraging SERS and Transformer Models for Simultaneous Detection of Multiple Pesticides in Fresh Produce. ACS Applied Materials & Interfaces. 17, 2018-2031.
  • Type: Peer Reviewed Journal Articles Status: Published Year Published: 2025 Citation: Kousheh, S., Lin, M. 2025. Recent advancements in SERS-based detection of micro- and nanoplastics in food and beverages: techniques, instruments, and machine learning integration. Trends in Food Science & Technology. 159, 104940.
  • Type: Conference Papers and Presentations Status: Awaiting Publication Year Published: 2025 Citation: Lin, M. 2025. Leveraging Machine Learning Models for Sensitive Detection of Toxic Pesticides in Fresh Produce by SERS. 2025 IFT Annual Meeting, Chicago, IL.


Progress 07/01/23 to 06/30/24

Outputs
Target Audience: The target audience includes students and faculty from academia, food scientists, professionals from the food industry, and regulators from government agencies.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? This project provided training for two doctoral students, one in food science and the other in computer science.
How have the results been disseminated to communities of interest? News release (Feb. 26, 2024): Interdisciplinary research team uses AI to create revolutionary food safety technology (https://cafnr.missouri.edu/stories/interdisciplinary-research-team-uses-ai-to-create-revolutionary-food-safety-technology/). The results of this project have been disseminated in a top-tier peer-reviewed journal, Journal of Hazardous Materials (Impact factor = 13.6). They will also be disseminated at professional conferences, including the 2024 IFT conference, USDA PD meeting, and USDA Multistate NC-1194 annual meeting.
What do you plan to do during the next reporting period to accomplish the goals? Next, we will focus on testing food samples contaminated with mixed pesticides by harnessing the enhanced sensitivity and specificity of SERS, coupled with the data-processing capabilities of machine learning algorithms. The combination of SERS and machine learning allows for the extraction of intricate spectral variances with high accuracy, facilitating the precise identification of pesticide compounds even in complex matrices.

Impacts
What was accomplished under these goals? Machine learning, a key subfield of artificial intelligence (AI), develops algorithms that automatically learn hidden patterns and relationships from data without explicit programming. During the past reporting year, we introduced an innovative strategy for the rapid and accurate identification of pesticide residues in agricultural products by combining surface-enhanced Raman spectroscopy (SERS) with a state-of-the-art transformer model, termed SERSFormer. SERSFormer is based on cutting-edge transformer technology that has achieved great success in large language models (e.g., ChatGPT) for natural language processing, providing a powerful synergy for accurate and efficient identification of pesticide residues. We developed SERSFormer with three main components: a task-specific embedding layer, a multi-tasking weight-sharing transformer encoder, and dedicated Multilayer Perceptron (MLP) heads preceding the output layers. Notably, SERSFormer encompasses two distinct branches for classification and regression, each equipped with an individual convolutional neural network (CNN) embedder of a similar design. The Transformer Encoder comprises multiple Multi-head Attention Encoder layers strategically shared by both the regression and classification branches. The classification branch extends into a two-layered MLP head, predicting outcomes across six distinct classes. Gold-silver core-shell nanoparticles were synthesized and served as high-performance SERS substrates, which possess well-defined structures, uniform dispersion, and a core-shell composition with an average diameter of 21.44 ± 4.02 nm, as characterized by TEM-EDS. The SERSFormer model demonstrated exceptional proficiency in qualitative analysis, successfully classifying six categories, including five pesticides (coumaphos, oxamyl, carbophenothion, thiabendazole, and phosmet) and a control group of spinach data, with 98.4% accuracy.
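The operation at the core of each multi-head attention encoder layer is scaled dot-product attention. The sketch below shows a single head with the learned query, key, and value projections replaced by the identity, which is a simplifying assumption; it is not the SERSFormer implementation, and the token values are made up. In a full model, each token would be an embedding of a spectral segment produced by the CNN embedder.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity projections."""
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # Output is the attention-weighted average of all token vectors.
        outputs.append(
            [sum(w * value[j] for w, value in zip(row, tokens)) for j in range(dim)]
        )
    return outputs, weights


# Hypothetical two-dimensional embeddings for three tokens.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

A multi-head layer runs several such heads in parallel on different learned projections and concatenates the results, which is normally done with a deep learning framework rather than by hand.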
For quantitative analysis, the model accurately predicted pesticide concentrations (mean absolute error = 0.966, mean squared error = 1.826, and R² score = 0.849). This study synergizes cutting-edge machine learning models and advanced SERS techniques to rapidly and accurately detect pesticide residues in agricultural products. Integrating SERS and the SERSFormer model demonstrates the potential to transform pesticide analysis with high sensitivity, specificity, and efficiency. SERSFormer serves as a versatile tool for both qualitative and quantitative analysis. Qualitatively, it accurately identified six categories (five pesticides and a control), benefiting significantly from preprocessing techniques like noise reduction, baseline correction, and normalization. Quantitatively, the model excelled in predicting pesticide concentrations. The study underscores the significance of data normalization techniques, including log-min-max normalization and log-convolution, and of confusion matrix analysis in pesticide classification and quantification tasks. This integrated approach, combining SERS with machine learning, offers a promising route for rapid, reliable pesticide detection, with significant implications for monitoring food safety in the agriculture and food sectors.
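The three regression metrics quoted above follow standard definitions, sketched below for reference. The function name and the example concentrations are hypothetical and are not data from the project.

```python
def regression_metrics(y_true, y_pred):
    """Return mean absolute error, mean squared error, and R² score."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [pred - truth for truth, pred in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "mse": mse, "r2": r2}


# Hypothetical true and predicted concentrations in ppm.
y_true = [0.5, 2.0, 5.0, 10.0]
y_pred = [1.0, 2.0, 4.0, 9.5]
result = regression_metrics(y_true, y_pred)
```

Because the mean squared error squares each residual, it weights large errors more heavily than the mean absolute error, which is why the two values can differ noticeably on the same predictions.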

Publications

  • Type: Journal Articles Status: Published Year Published: 2024 Citation: Hajikhani, M., Hegde, A., Snyder, J., Cheng, J., Lin, M. 2024. Integrating transformer-based machine learning with SERS technology for the analysis of hazardous pesticides in spinach. Journal of Hazardous Materials. 134208. (https://doi.org/10.1016/j.jhazmat.2024.134208) (Impact factor = 13.6)