Source: UNIV OF CONNECTICUT submitted to
DSFAS-CIN: HARNESSING MOBILITY BIG DATA AND ARTIFICIAL INTELLIGENCE THROUGH A TRANSDISCIPLINARY RESEARCH NETWORK IN FOOD PRODUCTION, PROCESSING, AND CONSUMPTION SYSTEMS
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
ACTIVE
Funding Source
Reporting Frequency
Annual
Accession No.
1028264
Grant No.
2022-67022-36603
Cumulative Award Amt.
$1,000,000.00
Proposal No.
2021-11528
Multistate No.
(N/A)
Project Start Date
Jan 15, 2022
Project End Date
Jan 14, 2026
Grant Year
2022
Program Code
[A1541]- Food and Agriculture Cyberinformatics and Tools
Project Director
Connolly, C.
Recipient Organization
UNIV OF CONNECTICUT
438 WHITNEY RD EXTENSION UNIT 1133
STORRS, CT 06269
Performing Department
Agricultural & Resource Eco
Non Technical Summary
The Coordinated Innovation Network (CIN) in big data and modern artificial intelligence (AI) will promote multidisciplinary research into food production, processing, and consumption systems in collaboration with several institutions, including academic, governmental, and international partners. The CIN will use big data to address food systems questions, exploit a unique mobility dataset of 200 million cell phone users, and complete three immediate research applications. Most agricultural AI innovations are limited to production data analysis, indicating a clear need for new research that relies on big data beyond the farmgate and on AI systems to answer research questions of national scope focusing on producer marketing, food retailing, and consumer choices. We will design a unique geospatial cyberinfrastructure that combines large-scale heterogeneous mobility data with multiple other sources. Spatial and temporal data filtering and indexing will facilitate frontier food systems research. We will construct location-based heterogeneous graphs and use modern AI systems to develop new insights into producer, retailer, and consumer interactions and behaviors. Trajectory mining will enable us to leverage data correlations and construct user visitation models. These novel data science applications will support the multidisciplinary frontier research of CIN members. The CIN will be established and sustained through transdisciplinary training of undergraduate and graduate students, virtual networking, and an annual research symposium. Stakeholder engagement will be maintained in all project aspects. A user-friendly database that integrates visualization and AI tools will be made available through a dedicated web interface and GitHub.
Animal Health Component
35%
Research Effort Categories
Basic
30%
Applied
35%
Developmental
35%
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
603                    6050                              3010                      25%
604                    6110                              3030                      30%
607                    6010                              3030                      30%
610                    6230                              3010                      15%
Goals / Objectives
The goals of this project fall under four main categories. The first involves network building. We aim to establish a Coordinated Innovation Network (CIN) that promotes the use of big data and Artificial Intelligence (AI) systems in food systems research. This CIN will build a network of collaborators interested in using big data to research questions related to food production, processing, and consumption systems, exploiting a unique dataset of 200 million cell phone users with the help of AI systems, and completing three immediate research applications. We will bring together experts from multiple disciplines, including agricultural economics, geography, and computer science, to identify a synergistic solution to siloing in the agricultural and food sciences. The network of collaborators will include internal researchers from UConn's College of Agriculture, Health and Natural Resources (CAHNR), the School of Engineering, the College of Liberal Arts and Sciences, and the School of Business, as well as external collaborators from academic institutions and governmental agencies in the U.S. and abroad.
The second goal involves research. We will strategically build on the unique data science collaborations at the University of Connecticut (UConn) to create transformative research that improves academic and policymaker understanding of the U.S. agriculture and food system. Specifically, we will use mobility big data to conduct food systems research, combining multiple datasets to answer questions related to producer marketing decisions, retailer networks, and consumer behaviors. This innovative research will be conducted through the CIN by designing, integrating, analyzing, and interpreting AI and data mining applications to the food system. We will leverage AI to address critical questions in food and agriculture by automating processes that create scalable insights from big geospatial and mobility data. This includes addressing topics ranging from agriculture and food retailing to consumer choices and health.
The third goal involves data science. We will design multiple novel AI systems to acquire knowledge, iteratively extract information, and learn from the mesmerizing patterns in our high-dimensional input data. We will develop new deep learning approaches, formulating a flexible and powerful way to learn from the training data as a nested hierarchy of concepts, where more complicated and high-level concepts will build on simpler ones. This includes designing data analysis techniques that lead to methodological advances, enabling a more efficient and scalable platform for researchers and stakeholders to discover new producer, processor, and consumer patterns from large mobility datasets, such as identifying frequent visitations from large-scale datasets for food market decision support.
The fourth goal involves stakeholders and students. We will combine transdisciplinary network building and large-scale cyberinfrastructure development for novel insights in both data and agricultural sciences, improve food system collaborations, and provide policy-relevant research findings. We will also incorporate a training program that engages graduate and undergraduate students in multiple disciplines, ensuring that graduates have workforce-ready skills. Finally, we will ensure strong stakeholder involvement in all project elements through collaborations with UConn's Department of Extension.
Project Methods
A systems-based approach is necessary when studying food and agriculture due to the myriad of complex factors interacting throughout the food system. We will develop a Coordinated Innovation Network (CIN), through which we will pursue innovative techniques and create new knowledge in both data and agricultural sciences by applying big data and AI techniques to questions concerning food and agriculture. We will emphasize post-farmgate activities that have been understudied despite the increasing use of big data in production agriculture, with a primary focus on analyzing agricultural data by connecting multi-scale, multi-domain, and multi-format datasets. While we are building on pre-existing collaborations between ARE and CSE, which demonstrate the collaborative abilities of the Co-PIs, we will add internal and external collaborators and grow the network through annual research symposia, monthly virtual seminars, and a new multi-state Hatch project. We will collaborate with stakeholders to establish research priorities, and both undergraduate and graduate students will receive training on interdisciplinary research processes.
As part of this project, we will complete three research applications in food retailing, the restaurant industry, and agricultural direct marketing (detailed below), with additional research projects forthcoming as collaboration increases. Using mobility big data, we can empirically investigate how retailers structure their supply chains by identifying storage and transportation patterns in the food supply chain network, thereby deriving deeper insights into how firms are managing the supply process of food products. Using an equity and social justice framework, we will employ AI systems to compare the neighborhood demographic and socioeconomic characteristics of warehouse vs. retail locations and assess whether neighborhood-level racial/ethnic segregation and poverty are associated with individual consumer food shopping behavior and choice of grocery stores. Another highly innovative aspect of this research is the ability to define market size, assessing competition effects while also linking to questions concerning equity and access to SNAP and other food assistance resources.
We will also take an interdisciplinary approach to analyze the restaurant industry. AI systems and mobility big data will be used to identify the effect of distance on choices, assess whether neighborhood-level racial/ethnic segregation and poverty are associated with consumer restaurant choice, and study the impact of COVID-19 on delivery apps (through driver identification) and changes in consumer patterns during and after the crisis. We will use machine learning to construct location-based heterogeneous graphs based on different types of business entities (supermarkets, grocery stores, fast-food restaurants, etc.), and by learning a generative representation of each node (business entities, mobile users) in these heterogeneous graphs, we can discover the hidden relationships among them. We will then use the dynamics of these heterogeneous graphs to predict the future status of each node (business) or edge (relationship).
We will also use AI systems and mobility big data to answer questions concerning agricultural direct marketing that could not previously be answered due to data constraints. Using trajectory mining, we can leverage the correlations between users, points of interest (POIs), and activities by constructing the semantic relationships between user profiles, POI categories, and activity types. Using the correlations, given any two of the three components, we can build a market recommendation model predicting consumer interest in potential POI visitations. We will assess consumer behavior regarding distinct marketing arrangements on a national level, including attributes that impact consumer market choice and potential spillover effects. From a supply chain perspective, we will identify where producers are coming from and how many establishments they supply, allowing for the creation of local food network maps.
As part of our dissemination and network-building strategy, we will publish our data through a dedicated webpage, ensuring stakeholder use and increased visibility for network growth. This web-based interface will be designed to interactively display and visualize the mobility data analysis results and engage relevant stakeholders (such as researchers, government officials, policymakers, and NGOs) in the decision-making process relevant to producer, processor, and consumer markets, helping them understand and interpret essential considerations like supplier networks, retailer inventory, and the neighborhood impacts of store location choice. We will use the web-based interface to interact with the location data and the deep learning modeling results to capture, manage, analyze, and display geographically referenced information.
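The location-based heterogeneous graph described in the methods can be illustrated with a minimal sketch. It assumes visit records of the form (user, business, business type); all identifiers and values are hypothetical, and the code covers only graph construction and a simple co-visitation projection, not the representation learning or prediction steps the project proposes.

```python
from collections import defaultdict

# Hypothetical visit records: (user_id, business_id, business_type).
# All identifiers and values are made up for illustration.
VISITS = [
    ("u1", "b1", "supermarket"),
    ("u1", "b2", "fast_food"),
    ("u2", "b1", "supermarket"),
    ("u2", "b3", "farmers_market"),
    ("u3", "b3", "farmers_market"),
    ("u3", "b1", "supermarket"),
]


def build_hetero_graph(visits):
    """Build a graph with typed nodes and weighted user-to-business edges."""
    node_type = {}                  # node id -> node type
    edge_weight = defaultdict(int)  # (user, business) -> number of visits
    for user, business, business_type in visits:
        node_type[user] = "user"
        node_type[business] = business_type
        edge_weight[(user, business)] += 1
    return node_type, dict(edge_weight)


def shared_visitors(edge_weight):
    """Count the users shared by each pair of businesses."""
    visitors = defaultdict(set)
    for user, business in edge_weight:
        visitors[business].add(user)
    shared = {}
    businesses = sorted(visitors)
    for i, first in enumerate(businesses):
        for second in businesses[i + 1:]:
            common = len(visitors[first] & visitors[second])
            if common:
                shared[(first, second)] = common
    return shared


node_type, edge_weight = build_hetero_graph(VISITS)
shared = shared_visitors(edge_weight)
```

In this toy data, the supermarket b1 and the farmers' market b3 share two visitors, which is the kind of business-to-business relationship a learned node representation would be expected to capture at scale.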

Progress 01/15/23 to 01/14/24

Outputs
Target Audience: As part of the training component of this project, one undergraduate student from CSE continued the work of last year's senior design team to develop the cyberinfrastructure. Additionally, two graduate students worked on improving the processing network for the database (including the backend and front end of the platform).
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? As noted above, a key objective of the project involves student training. Undergraduate and graduate students received training on mobility big data as part of this project. Additionally, professional development opportunities occurred through a Mobility Big Data Workshop organized on campus.
How have the results been disseminated to communities of interest? Initial results were disseminated through the big data workshop we hosted. Additionally, three publications in computer science have discussed the data component of our work.
What do you plan to do during the next reporting period to accomplish the goals? Our primary work for the next year includes:
Goal 1: Host the first annual meeting of our new multistate Hatch project (NE2204). Present research results at academic conferences.
Goal 2: Continue progress on our current research projects, with the end goal of academic publications and corresponding publications for stakeholders through the Zwick Center. Begin new research projects based on new collaborations developed through our external outreach.
Goal 3: Develop deep learning and neural network techniques that use the mobility data to enhance our understanding of consumer behavior. Develop a public-facing component through collaboration with CSE and geography students.
Goal 4: Continue working with students on training opportunities, and develop an interdisciplinary team of undergraduates to work on this big data. Develop collaborations through our external meetings. Create publicly digestible versions of any social science research we publish.

Impacts
What was accomplished under these goals?
Goal 1 (network building): We created a multistate Hatch project (NE2204: A regional network of social, behavioral, and economic food systems research) to facilitate external networking on this project. We will be holding our first annual meeting in July 2024. We also hosted a workshop at the University of Connecticut to develop collaboration locally.
Goal 2 (research): There are several ongoing research projects. The first is an overview paper on the implications of big data for applied economics research. The second looks at consumer choices for farmers' markets. The third considers consumer behavior around food retailing. The fourth looks at the impact of public policy on access to, and utilization of, health clinics. This work includes collaboration between faculty, postdoctoral researchers, graduate students, and undergraduate students.
Goal 3 (data science): While the raw data consists of hundreds of terabytes of individual-level observations, we have worked to develop cyberinfrastructure to process and store the data. This allows for querying based on research study needs. Work is ongoing to develop deep learning programs that will allow for the analysis of consumer network behavior.
Goal 4 (stakeholders and students): Several students have been trained through this project. Last year, five students worked on developing a database of the mobility data as part of a senior design project. This year, an undergraduate CSE student has improved on their initial results, developing a fully functioning database that can be queried. Two graduate students worked on improving the efficiency of this database as part of their Master's capstone project, and one will be continuing on the project this summer. Additionally, two postdoctoral researchers have been working on using the processed data in the four social science projects discussed above (in collaboration with faculty members). Initial results were disseminated at a workshop in February that was open to students and faculty at the University of Connecticut, as well as colleagues at other institutions. Dissemination and collaboration will continue at academic conferences this summer.

Publications

  • Type: Conference Papers and Presentations Status: Published Year Published: 2022 Citation: Zhang, Xikun, Dongjin Song, and Dacheng Tao. "CGLB: Benchmark tasks for continual graph learning." Advances in Neural Information Processing Systems 35 (2022): 13006-13021.
  • Type: Journal Articles Status: Published Year Published: 2022 Citation: Zhang, Xikun, Dongjin Song, and Dacheng Tao. "Hierarchical prototype networks for continual graph representation learning." IEEE Transactions on Pattern Analysis and Machine Intelligence 45.4 (2022): 4622-4636.
  • Type: Journal Articles Status: Published Year Published: 2023 Citation: Zhang, Xikun, Dongjin Song, and Dacheng Tao. "Ricci curvature-based graph sparsification for continual graph representation learning." IEEE Transactions on Neural Networks and Learning Systems (2023).


Progress 01/15/22 to 01/14/23

Outputs
Target Audience: As part of the training component of this project, five undergraduate students from CSE participated in the cyberinfrastructure development (including the backend and front end of the platform). Additionally, one graduate student in CSE was mentored through this project while participating in the spatio-temporal mobility modeling research.
Changes/Problems: This project experienced a major delay due to burdensome contract processes. We were not able to receive an analyzable data extract until August 2022 and did not receive the full dataset until October 2022. This has pushed our timeline back by about 6 months from the original start date of January 2022.
What opportunities for training and professional development has the project provided? A team of five undergraduate students received training and experience on mobility data through a senior project collaboration with our researchers.
How have the results been disseminated to communities of interest? Nothing Reported
What do you plan to do during the next reporting period to accomplish the goals? We plan to expand the mobility data work to social scientists for applied research and to develop case studies that can be disseminated.

Impacts
What was accomplished under these goals?
As part of network building, we initiated a multistate Hatch project and proposed an academic article to discuss applications for mobility data in economics and promote the CIN.
As part of cyberinfrastructure development, the development team has started work on (1) analyzing the spatial distributions of large-scale POI categories, visit counts, user dwell time, and other related statistical measures; (2) constructing a backend platform for transmitting, storing, and managing the large-scale location data using virtual machines (VMs) and open-source software such as PostgreSQL, Node.js, and PostgREST; and (3) developing a web interface prototype to visualize the imported human mobility data. The goal was to gain initial insights from the large-scale location data, identify the potential technical, developmental, and resource-related challenges, and develop a conceptual framework for the needed cyberinfrastructure in collaboration with UConn Information Technology Services (ITS). The development team has provided a proof-of-concept design with an Application Programming Interface (API) that can be extended in the next stage of cyberinfrastructure consolidation.
As part of spatio-temporal mobility modeling research, the data science team has initiated spatio-temporal modeling studies on (1) multi-head attention and transformer modeling to understand mobility patterns across time periods and locations; and (2) human mobility time series modeling in response to neighborhood events and other external factors. The goal of this research was to understand how spatio-temporal and deep learning models can process and retrieve potentially useful features from the location data to support mobility prediction (such as visit frequency, crowd flows, and transitions across locations). The modeling code and software will be documented for next-stage implementation on the processed and retrieved Veraset location data.
As part of the stakeholders and students goal, we mentored a team of five undergraduates in cyberinfrastructure development. One CSE graduate student was also involved in the spatio-temporal mobility modeling research.
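The descriptive measures mentioned above (visit counts and dwell time by POI category) can be sketched as a simple aggregation. This is an illustrative example only: the record layout, category names, and dwell values are hypothetical and do not reflect the actual Veraset schema or the project's code.

```python
from collections import defaultdict

# Hypothetical visit records: (poi_category, dwell_minutes).
# Values are made up for illustration.
RECORDS = [
    ("grocery", 30.0),
    ("grocery", 50.0),
    ("restaurant", 60.0),
    ("farmers_market", 20.0),
    ("restaurant", 40.0),
    ("grocery", 40.0),
]


def summarize_by_category(records):
    """Return the visit count and mean dwell time for each POI category."""
    counts = defaultdict(int)
    total_dwell = defaultdict(float)
    for category, dwell_minutes in records:
        counts[category] += 1
        total_dwell[category] += dwell_minutes
    return {
        category: {
            "visits": counts[category],
            "mean_dwell_min": total_dwell[category] / counts[category],
        }
        for category in counts
    }


summary = summarize_by_category(RECORDS)
```

At the scale of the full dataset, the same grouping would more plausibly run inside the database (for example, as a grouped query in PostgreSQL) than in application code.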

Publications