Recipient Organization
WESLIE KHOO
1014 S GREENWOOD AVE
BLOOMINGTON, IN 47401
Performing Department
(N/A)
Non Technical Summary
Despite the existence of well-established methods (e.g., 24-h dietary recalls) for collecting individual-level dietary data to identify dietary patterns (DP), their use is impractical in large observational studies because of feasibility issues, costs, training requirements, and the burden of codifying data. New technologies, such as supplementing food records with images of food, have emerged, but more research is still needed in dietary assessment to model the complexity of diet appropriately and to understand dietary patterns. In this Postdoctoral Fellowship application, I will investigate how machine learning (ML) techniques can improve the objectivity of current methods of dietary assessment. I will examine photos and biomarkers as objective markers of dietary patterns. The specific aims are to (1) develop ML techniques to identify dietary patterns from photos of foods for the adoption of photo-based dietary assessment in nutrition research, and (2) compare the identification of biomarkers associated with various DP via conventional statistical models with identification via machine learning approaches. This project will develop the academic and professional skills of the Project Director under the direction of the primary mentor, Dr. David Crandall, and collaborative mentors, Dr. Danielle Lemay and Dr. Lauren O'Connor. This project falls under the 'Food safety, nutrition, and health' AFRI Farm Bill priority area and will address the objectives of the Program Area "Diet, Nutrition, and the Prevention of Chronic Diseases". I will develop, implement, and evaluate innovative research by integrating ML into dietary assessment to assess dietary patterns, which could then guide policies that prevent and control diet-related chronic diseases.
Animal Health Component
30%
Research Effort Categories
Basic
70%
Applied
30%
Developmental
(N/A)
Goals / Objectives
The evaluation of food intake is important in scientific research and clinical practice to understand the relationship between diet and the health conditions of an individual or a population. The overarching goal of this research proposal is to develop machine learning techniques that improve the accuracy of subjective dietary intake assessment using images and that discover objective markers of dietary intake using biomarkers. Both approaches are error-reduction strategies that can help mitigate concerns about highly error-prone traditional dietary assessment techniques. Aim 1: Develop computer vision and machine learning techniques to identify dietary patterns from photos of foods for the adoption of photo-based dietary assessment in nutrition research. Aim 2: Compare the identification of biomarkers associated with various dietary patterns via conventional statistical models with identification via machine learning approaches.
Project Methods
For aim 1, to improve the consistency of photo quality across users, I will first examine several pre-processing techniques, such as adaptive white balance and contrast equalization, to account for differences in the lighting environment. Once the images are transformed, we will use and improve existing segmentation algorithms to identify whether the food appears with food packaging. If food packaging is present, we will exclude these images from further analysis. We recognize that people eat packaged foods, and our preliminary study shows that 12% of food images collected from a dietary app contain food packaging. Given that analyzing food without food packaging presents a greater challenge, we will focus more on that. Next, segmentation algorithms will be tested to identify the food and non-food regions in the image. We will use deep learning-based models to extract image features as predictors for dietary patterns. Traditional ML modeling approaches that create a generic catch-all model for analyzing dietary patterns are insufficient because dietary patterns are influenced by culture, environment, and personal preferences. Hence, we will approach this problem using personalization models built on machine learning principles that learn from an individual's eating patterns.
For aim 2, I will pool the datasets from controlled feeding studies, process continuous predictors with min-max scaling, and transform categorical variables into binary indicator variables using one-hot encoding. Exploratory data analysis using tools such as Missingno will be used to generate matrices and plots of missing data and to identify patterns of missingness in the training set. Then, feature-selection algorithms such as BoostARoota will be used to exclude superfluous variables. Conventional statistical models (e.g., linear, logistic, Cox regression) will be fitted to investigate the associations between biomarkers and dietary patterns.
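The Aim 2 preprocessing steps named above (min-max scaling, one-hot encoding, and a check for missing values) can be illustrated with a minimal sketch. It uses only the Python standard library rather than any specific toolkit the project may adopt; the function names and the toy values are hypothetical and are not project data.

```python
from typing import Dict, List, Optional, Sequence


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every entry to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(labels: Sequence[str]) -> Dict[str, List[int]]:
    """Return one 0/1 indicator column per distinct category."""
    categories = sorted(set(labels))
    return {c: [1 if lab == c else 0 for lab in labels] for c in categories}


def missing_fraction(column: Sequence[Optional[float]]) -> float:
    """Fraction of entries recorded as None (treated here as missing)."""
    return sum(v is None for v in column) / len(column)


# Hypothetical toy records (assumed for illustration only).
biomarker = [2.0, 4.0, 6.0, 10.0]
diet_group = ["vegetarian", "omnivore", "omnivore", "vegan"]
partly_observed = [1.2, None, 0.7, None]

scaled = min_max_scale(biomarker)
encoded = one_hot_encode(diet_group)
missing = missing_fraction(partly_observed)

print(scaled)
print(encoded)
print(missing)
```

In this sketch each category becomes its own 0/1 column, so no artificial ordering is imposed on the categories.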
Then, a series of machine learning techniques will be deployed for analysis, for example, stepwise logistic regression, LASSO, K-nearest neighbors (KNN), gradient tree boosting (GTB), support vector machines (SVM), random forest, and neural networks. The performance of the conventional statistical models and the machine learning models will be compared using metrics such as the Greenwood-Nam-D'Agostino test and the C-statistic.
Major milestones of the project will be assessed with mentors after each quarter of the fellowship period and include progress toward and completion of the proposed objectives, communication of research findings through publications and presentations, and the planned training/career development initiatives.
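As an illustration of the C-statistic mentioned for the Aim 2 model comparison, the sketch below computes it for a binary outcome. In that case it is the proportion of (event, non-event) pairs in which the event receives the higher predicted score, with ties counted as one half. The outcome vector and the two sets of model scores are hypothetical values chosen only to show the calculation; they are not results from this project.

```python
from itertools import product
from typing import Sequence


def c_statistic(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Concordance statistic for a binary outcome (1 = event, 0 = non-event)."""
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    non_events = [s for y, s in zip(y_true, y_score) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e, n in product(events, non_events):
        if e > n:
            concordant += 1.0  # event ranked above non-event
        elif e == n:
            concordant += 0.5  # tie counts as half
    return concordant / (len(events) * len(non_events))


# Hypothetical outcomes and predicted scores from two models (assumed).
outcome = [0, 0, 1, 1, 1, 0]
model_a_scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
model_b_scores = [0.3, 0.6, 0.4, 0.5, 0.9, 0.2]

c_a = c_statistic(outcome, model_a_scores)
c_b = c_statistic(outcome, model_b_scores)
print(f"model A: {c_a:.3f}, model B: {c_b:.3f}")
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect discrimination. The C-statistic measures discrimination only, so calibration would still need a separate check such as the Greenwood-Nam-D'Agostino test named above.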