Recipient Organization
UNIV OF CONNECTICUT
438 WHITNEY RD EXTENSION UNIT 1133
STORRS,CT 06269
Performing Department
Allied Health Sciences
Non Technical Summary
America's food culture has undergone a drastic transformation: more Americans now choose to eat at restaurants than to cook at home. The generally higher calorie density of restaurant foods has been criticized as a contributor to the growing obesity problem. However, current measures of the nutritional quality of restaurant foods are either inaccurate or time-consuming and cost-ineffective. To address this gap, we propose to develop an innovative deep-learning-based food image recognition technique for nutrition assessment of restaurant foods. The resulting model, linked to a nutrient composition database, will identify the nutrition facts of the food within a food image. We will also cross-validate the derived nutrition facts against calorie information released by restaurants. In addition to the model development, we will apply the technology to evaluate the nutritional levels of all restaurants in Hartford, using food images crawled from Yelp. The nutrition information of these food images will be mapped in a Geographic Information System (GIS) to better understand the nutritional landscape of restaurants and how it affects obesity-related health outcomes at the community scale. Furthermore, we will employ structured interviews and surveys at local restaurants to understand how food consumers perceive the nutrition information of restaurant foods. This combined approach will provide an overarching view of the connections and differences between the computer-aided and the human-perceived community food environments. The goal of the project is to develop a reliable and cost-effective tool to evaluate nutrition information from food images, which can facilitate the assessment of the community food environment and achieve public health impact on a larger scale by improving consumers' nutrition awareness and informing the design of health policy interventions for neighborhoods at high risk of obesity and obesity-related comorbidities.
Animal Health Component
60%
Research Effort Categories
Basic
20%
Applied
60%
Developmental
20%
Goals / Objectives
The goal of this project is to develop a food image recognition technique for assessing the nutrition information of restaurant food, and to test the feasibility of using this tool to assess the community food environment. The objectives are to (1) develop a deep learning model that can derive nutrition information from food images, (2) apply this tool to assess the community food environment using food images on social media from all restaurants in Hartford, CT, and (3) validate the deep-learning results through structured interviews and surveys at restaurants in socioeconomically stratified communities in Hartford.
Project Methods
To develop the deep-learning model, we will use the transfer learning technique in two steps. In the first step, we will use the most recent ResNet-152 (RESNEXT WSL 2019) pre-trained on the ImageNet dataset (Deng et al. 2009), which is the most accurate open-access image classification model to date. In the second step, we will develop a proprietary deep learning model and concatenate it to the pre-trained ResNet-152 model for transfer learning, to achieve both high accuracy and short training time. To accelerate the deep learning algorithm and achieve real-time food image processing, we will develop a novel optimization technique for weight pruning using a reweighted method, which builds upon previous work by one of the Co-Is (Ding et al., 2019). Using this model, we will be able to derive the name of a food item (i.e., "food tag") from the food image. Next, we will link the food tag to a nutrient composition database (i.e., Food and Nutrient Database for Dietary Studies, USDA 2019c) to derive its nutrition facts, including calories, macronutrients (e.g., carbohydrates), and micronutrients (e.g., sodium). Thus, the end product of the deep learning model will be the textualized food item along with its nutrition facts for a given food image. Furthermore, we will cross-validate the learned nutrition facts against (1) other food image recognition APIs (e.g., Calorie Mama), and (2) nutrition information released by a list of major chain restaurants, which has been validated by previous studies (Urban et al., 2011). After the model is built, we will apply it to a case study of all restaurants in Hartford. The food images for these restaurants will be crawled from Yelp, as well as Google Places and Tripadvisor, to avoid possible selection bias from any particular website.
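The step of linking a predicted food tag to a nutrient composition table, and of consolidating the results by restaurant, could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the table contents are placeholder numbers rather than values from the Food and Nutrient Database for Dietary Studies, and the names NUTRIENT_TABLE, normalize_tag, nutrition_for_tag, and restaurant_mean are hypothetical.

```python
from typing import Dict, List, Optional

# Placeholder per-100 g values for illustration only.
# These numbers are NOT taken from FNDDS or any real database.
NUTRIENT_TABLE: Dict[str, Dict[str, float]] = {
    "cheeseburger": {"kcal": 250.0, "carbohydrate_g": 20.0, "sodium_mg": 500.0},
    "garden salad": {"kcal": 20.0, "carbohydrate_g": 4.0, "sodium_mg": 30.0},
}


def normalize_tag(tag: str) -> str:
    """Lower-case a food tag and collapse runs of whitespace."""
    return " ".join(tag.lower().split())


def nutrition_for_tag(
    tag: str,
    portion_g: float = 100.0,
    table: Dict[str, Dict[str, float]] = NUTRIENT_TABLE,
) -> Optional[Dict[str, float]]:
    """Return nutrition facts scaled to the portion, or None if the tag is unknown."""
    row = table.get(normalize_tag(tag))
    if row is None:
        return None
    scale = portion_g / 100.0  # table values are assumed to be per 100 g
    return {name: value * scale for name, value in row.items()}


def restaurant_mean(
    tags: List[str],
    nutrient: str = "kcal",
    table: Dict[str, Dict[str, float]] = NUTRIENT_TABLE,
) -> Optional[float]:
    """Average one nutrient over the recognized food tags of a single restaurant."""
    values = []
    for tag in tags:
        facts = nutrition_for_tag(tag, table=table)
        if facts is not None:  # skip tags that have no match in the table
            values.append(facts[nutrient])
    if not values:
        return None
    return sum(values) / len(values)
```

In this sketch, unmatched tags are skipped rather than guessed, so a restaurant whose images yield no recognized tags gets no score instead of a misleading one. Portion size is an explicit parameter because a food tag alone does not determine how much food is on the plate.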
We will use custom application programming interfaces (APIs) to obtain the food images. The nutrition information of the food images will be derived by the deep-learning model and consolidated by restaurant for nutrition assessment. To account for potential bias in the data collection and the modeling process (e.g., not every consumer posts their diet on social media), we will validate our study from a human perspective, employing structured interviews and surveys at restaurants in socioeconomically stratified communities in Hartford. Specifically, we will obtain a census-tract stratified sample of 80 subjects and conduct a one-hour structured interview and survey with each. We will ask about their perceptions of the local food environment, the healthiness of the local restaurants, and the key factors determining their food and restaurant selection. This triangulation (i.e., deep-learning results, released restaurant nutrition facts, and structured interviews and surveys) can further complement the deep-learning model and unveil the connections and differences between the deep-learned and the human-perceived food environments. Finally, as an exploratory step, we will assess how well the deep-learning results predict regional obesity rates. We will derive consumer nutrition variables from the deep-learning results (controlling for socio-economic and physical exercise variables from existing databases) and assess their associations with community obesity rates at the census tract level in Hartford. Our results and policy recommendations will be communicated through scientific publications, conference presentations, policy briefings, and community outreach. The validity of the deep-learning model and our expected outcomes will be evaluated by (1) the model's ability (e.g., F1 score, ROC) to correctly predict nutritional information from food images, (2) the consistency between the deep-learning results and the interview/survey results assessing the community food environment in Hartford, and (3) the association between the deep-learning results and regional obesity rates.
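Evaluation criteria (1) and (3) above could be computed along the following lines. This is a hedged sketch in plain Python: macro-averaging the F1 score over food tags is an assumption (the text does not say how F1 would be averaged), the correlation shown is an unadjusted Pearson coefficient that does not include the socio-economic and physical exercise controls described above, and the function names are hypothetical.

```python
from math import sqrt
from typing import Dict, List


def per_class_f1(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """F1 score for each food tag, computed as 2*TP / (2*TP + FP + FN)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    scores: Dict[str, float] = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = (2 * tp / denom) if denom else 0.0
    return scores


def macro_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of the per-tag F1 scores (an assumed averaging choice)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


def pearson_r(x: List[float], y: List[float]) -> float:
    """Unadjusted Pearson correlation, e.g. tract-level mean calories vs. obesity rate."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / sqrt(var_x * var_y)
```

Macro-averaging gives rare food tags the same weight as common ones, which may or may not match the project's priorities; a frequency-weighted average would be the usual alternative.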