Source: CORNELL UNIVERSITY submitted to NRP
COLLECTIVE INDICATORS OF COMMUNITY: COMMUNITY SUPPORTED AGRICULTURE AND SOCIAL MEDIA
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
COMPLETE
Funding Source
Reporting Frequency
Annual
Accession No.
1004600
Grant No.
(N/A)
Cumulative Award Amt.
(N/A)
Proposal No.
(N/A)
Multistate No.
(N/A)
Project Start Date
Oct 24, 2014
Project End Date
Sep 30, 2016
Grant Year
(N/A)
Program Code
(N/A)
Recipient Organization
CORNELL UNIVERSITY
(N/A)
ITHACA, NY 14853
Performing Department
Communication
Non Technical Summary
The Problem

Community Supported Agriculture (CSA) is growing as a local strategy to produce safe food in an environmentally sustainable and economically vitalizing manner. Unfortunately, individual CSAs have highly variable rates of success. Income varies substantially across CSAs, with a few highly successful enterprises but a large number earning too little to cover farmers' costs. One explanation for this varied success is the limited availability of information about the size of CSA "markets" in local areas. CSAs require customers to purchase produce in ways to which they are not accustomed, in particular by paying in advance for produce of varieties they do not choose. While many CSA members eventually prefer this practice to buying produce at supermarkets, the novelty of the process both discourages many potentially satisfied customers and confuses many who sign up, leaving them to become quickly dissatisfied. Thus, a key factor in CSA success is whether the local population has the requisite knowledge, routines, and attitudes to support CSAs, i.e., whether it has the appropriate "appetite" for these novel arrangements. Measuring these properties is difficult, however, because they are functions of the norms and behavior of the community as a whole rather than simply individual views and attitudes.

A Solution

This research takes a novel approach to estimating this communal "appetite" for CSAs by examining the extent to which public dialogue about food in a particular locale indicates a strong, normative, communal commitment to CSA principles: the production of fresh, safe food by local producers who are compensated at a fair price. Using data from public dialogue in social media about food and food practices, this research will compare quantifiable features of the public dialogue in counties with many CSAs, or with highly successful CSAs, to counties with fewer or less successful CSAs. The result of these statistical comparisons will be a community CSA receptivity or "appetite" index that indicates the extent to which a locale in New York State is likely to be interested in additional investment in CSAs. This index can be used by farmers considering the formation of new CSAs, or by those operating existing CSAs, to determine appropriate pricing and investment strategies. Local officials interested in building and encouraging CSAs may also find the index useful in determining whether their community is ready for a CSA.
Animal Health Component
50%
Research Effort Categories
Basic
20%
Applied
50%
Developmental
30%
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
601 | 6210 | 3100 | 50%
603 | 6230 | 3080 | 50%
Goals / Objectives
The goal of this project is to create a tool for assessing New York State counties regarding their receptivity to supporting Community Supported Agriculture cooperatives (CSAs). The study relies on data from social media, in particular the website Yelp, to develop metrics based on the public dialogue about food and food-related businesses in different geographic locales and to determine where communities have the norms and expectations in place to support CSAs. This broad goal has both research and extension components. Findings from the study will be communicated to CSA operators and local officials throughout NYS via a map of CSA receptivity or "appetite" that will be distributed on the web as well as through direct contact. This research will also enhance understanding of how CSAs are understood by communities in terms of customer expectations and demands. Finally, this research will extend theoretical work in organizational ecology by analyzing data related to processes of audience evaluation of social-movement-related organizational forms.

Objectives of this research are:
1. To assess the variation in features of food-related public dialogue related to food preferences, practices, and evaluation within each local area (county) in New York State.
2. To examine the correlation between features of consumer public dialogue and the number and success of CSAs in the local area (county).
3. To use these correlations to produce a simple, easily updated index of CSA receptivity that will be communicated to farmers and local officials to guide investment or policy decisions (a minimal illustrative sketch follows below).
4. To use quantitative variations across communities to determine areas where follow-up research might be performed to understand why CSAs thrive in some areas but not others.
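As a concrete illustration of Objective 3, the sketch below shows one simple way such an index could be assembled in R once county-level discourse metrics exist. This is a minimal sketch under stated assumptions, not the project's implementation: the data frame county_metrics and its columns (local_share, posemo_resid, local_food_kw) are hypothetical placeholders for the kinds of discourse features described in this proposal.

# A minimal sketch, assuming a hypothetical data frame `county_metrics` with
# one row per county; the column names are placeholders, not project variables.
county_metrics <- data.frame(
  county        = c("Tompkins", "Albany", "Erie"),
  local_share   = c(0.62, 0.48, 0.51),   # share of reviews written by local reviewers
  posemo_resid  = c(0.37, 0.15, 0.22),   # residual positive emotion after controlling for rating
  local_food_kw = c(0.041, 0.018, 0.025) # rate of local-food keywords per review
)

# Standardize each feature and average the z-scores into a single,
# easily re-computed receptivity score per county.
z <- scale(county_metrics[, -1])
county_metrics$receptivity_index <- rowMeans(z)
county_metrics[order(-county_metrics$receptivity_index), ]

Averaging standardized features keeps the index simple and updatable: as new review data arrive, the metrics are recomputed and the same combination rule is reapplied.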
Project Methods
Scientific Methods

The study focuses on finding statistically significant relationships between two kinds of data: food-related public dialogue and CSA density and competition.

A county's food-related public dialogue refers to ratings and comments made on social media about farms, grocery stores, farmer's markets, CSAs, and restaurants in that county. Social media data will be drawn from Yelp as well as from the social networking sites Facebook and Google+. Yelp is a social media website that permits users to rate and discuss local businesses, particularly those related to food such as restaurants and grocery stores. Data can be gathered at the level of each business (location, number of ratings, average rating) as well as at the level of individual user comments (home location of the user, text of the comment, individual score given to the business). These data will be collected from the Yelp academic dataset (https://www.yelp.com/academic_dataset), through queries to the Yelp API, and from the Yelp website using the scrapeR package. When CSAs and other local food-related businesses have public Facebook or Google+ pages, commentary on these pages will be scraped or downloaded through the site's API. When available on the web, dialogue regarding CSA or food-related issues published in the minutes of county legislature or committee meetings will also be obtained. All data drawn from social media and related web pages will be stored on a Cornell virtual server to ensure maximum security and back-up. These data will then be compiled into metrics of experience with, interest in, and diversity of views of CSAs, and of practices and attitudes consistent with the CSA mission (such as a preference for local food), within each county.

CSA density and competition within a county refers to the number of CSAs, number of farmer's markets, and number of grocery stores in the county relative to its population. Using sources such as localharvest.org, the Cornell Small Farms database, and pridofny.com, a county-by-county count of CSAs and farmer's markets will be compiled for New York State as well as for the 30 cities covered by the Yelp academic dataset. NYS farm-related agencies, trade associations, and organizations such as the NYS Department of Agriculture and Markets and the New York Farm Bureau will also be contacted to obtain data on existing CSAs and farmer's markets, such as location, size, and membership.

The aim of this research is to uncover the features of public dialogue that correspond to the success of CSAs. The process of discovering which features are the best predictors will be conducted using regression analysis controlling for spatial auto-correlation. The output of this analysis is an assessment of how a community discusses food in relation to the success of CSAs in its local area. (An illustrative sketch of this regression step appears at the end of this section.)

Outreach Efforts

To make the results of practical use to CSA operators and other stakeholders, they will be converted to a CSA receptivity index or "appetite" assessment for each county. Specifically, each county will be scored on the following criteria:
1) Does its public dialogue possess the features associated with the legitimation of CSAs (or of successful CSAs, if these are distinct)?
2) Does the current number of CSAs in the county appear to pose a competitive threat to the creation of new CSAs or their expansion?
These results can be plotted on a map of New York State that will be distributed to relevant stakeholders and posted on the web.
Each county in NYS will be assessed in terms of its current appetite for CSAs. Specifically: does its public dialogue meet the criteria associated with successful CSA operation? If so, does its market already appear to be crowded with CSAs relative to the socio-economic resources of its citizens? Outreach will take place through direct contact with the trade associations and farm bureaus from which the data were gathered, as well as via social media.

Evaluation

The benefits of this research will be realized over time as farmers and community leaders re-allocate resources toward CSAs in areas where they are most likely to be successful. Nonetheless, the receptivity index can be roughly evaluated in the short term through dialogue with CSA operators and local officials regarding its predictions, to determine whether they are plausible and fit with local experience of these markets. Over time, the extent to which the index is consulted by these stakeholders would also signal its success.
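The sketch below illustrates the regression step referenced above, in R. All names are hypothetical assumptions (the file county_metrics.csv and the columns csa_per_100k, posemo_resid, local_share, local_food_kw, median_income do not come from the project), and the spatial auto-correlation control described in the methods (for example, a spatial-lag model from the spdep/spatialreg packages) is omitted here; only the plain least-squares step is shown.

# A minimal sketch of the county-level regression, under the assumptions noted above.
counties <- read.csv("county_metrics.csv")  # hypothetical file: one row per county

# Ordinary least squares relating CSA density to discourse and demographic features;
# the real analysis would add spatial auto-correlation controls on top of this.
fit <- lm(csa_per_100k ~ posemo_resid + local_share + local_food_kw + median_income,
          data = counties)
summary(fit)  # coefficients indicate which discourse features track CSA density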

Progress 10/24/14 to 09/30/16

Outputs
Target Audience: Communication social scientists specializing in text-analytic methods; undergraduate students interested in doing research on social media and food-related topics; members of the public interested in understanding attitudes about local food in their communities.

Changes/Problems: Obtaining the full Yelp data for all businesses in New York State took substantially longer than anticipated. As described in the previous section, the data were both more poorly organized (with overlapping town features) and more restrictively provided (given Yelp's limits on web scraping) than initially anticipated. As a result, developing an appropriate script took longer than expected and required more sophisticated programming skills than initially thought. The script was not complete and tested until fall 2016, and the scraping process itself took several weeks.

What opportunities for training and professional development has the project provided? During the project we have engaged:
1) Two master's students in information science, to obtain the Yelp data. I worked with each of them to improve their understanding of how the data they gathered feed into social scientific inquiry. One student worked with me to develop the basis for measuring whether a reviewer was "local" to the business they were reviewing (an illustrative sketch of this classification appears at the end of this section).
2) One doctoral student specializing in linguistic analysis/text analysis, who worked in summer 2015. He had previous experience with text/linguistic analysis but no prior experience with social media data or the R programming language; he developed skills in both areas during the project.
3) One doctoral student specializing in intergroup communication, who worked in summer 2016. He had relevant theoretical experience; on the project he learned R programming as well as web scraping for meta-data (such as community zip codes).
4) One undergraduate student who undertook the data visualization project. As part of the project he learned D3.js and produced an interactive web visualization of our data.
5) Two undergraduate students, one in communication and one in computer science, who became interested in the project and have been conducting their own investigation of how people talk about food on social media, looking at Instagram posts about Ithaca food establishments, including Cornell dining halls. Their work was conducted as an independent study and did not require funds; however, their analyses may be used as part of the project depending on what they find. During the fall semester (they began work in November 2015), they learned skills related to the collection and social scientific analysis of social media data.

How have the results been disseminated to communities of interest? Scholarly communities: Results of our analyses of local vs. visiting reviewers were presented at the International Communication Association (ICA) conference in Fukuoka, Japan, in June 2016. A follow-up to this work is under review at Communication Research, a leading journal in the field of communication. Results of our analysis of expectation and socio-economic status have been submitted to this year's ICA conference. Public: We have created a data visualization of Yelp review text by town/city in New York State; it is available on PI Margolin's website.

What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported
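Illustrative sketch of the local/visiting classification mentioned above: one simple rule is to treat each reviewer's home city as the city where most of their reviews were written. This is a minimal sketch in R under hypothetical assumptions (the reviews data frame and its columns reviewer_id and business_city are placeholders), and it is one possible operationalization rather than the project's exact rule.

# A minimal sketch, assuming a hypothetical `reviews` data frame with
# columns reviewer_id and business_city (one row per review).
infer_home_city <- function(cities) {
  # The reviewer's modal review city is taken as their home city.
  names(sort(table(cities), decreasing = TRUE))[1]
}

home <- tapply(reviews$business_city, reviews$reviewer_id, infer_home_city)
reviews$home_city <- home[as.character(reviews$reviewer_id)]
reviews$is_local  <- reviews$business_city == reviews$home_city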

Impacts
What was accomplished under these goals?

Goal #1: An assumption of our approach is that the audience for Yelp reviews influences the reviewer in terms of the text they choose to write. It is through this influence that we hope to infer community properties from the way that people write about objects (e.g., restaurants) in a community. Our research does uncover such a relationship. Consistent with our theoretical expectation, reviewers modify their language choices based on the extremity of their review. Specifically, more extremely positive and negative reviews are presented in rhetorically more defensible language, using more abstractions and first-person constructions. Importantly, however, we find that these differences depend on who is writing the review. Inspection of Yelp data indicates that a substantial number of reviews are written by individuals from "out of town." These reviews still reflect a normative sense of that community (what its identity is and how it relates to appropriate behaviors within and toward it) but from a different perspective. We find that local reviewers tend to write reviews in a more dynamic, storytelling style than "visiting" reviewers; the signature of this style in text is the use of fewer definite articles and more abstract phrases. This work was submitted to Communication Research ("A Multi-Theoretical Approach to Big Text Data: Comparing Expressive and Rhetorical Logics in Yelp Reviews").

Goal #2: We ran several analyses comparing CSA density (the number of CSAs within 10 and 25 miles of the community where a restaurant is located) to Yelp review data for that restaurant parsed into simple discourse metrics (a brief illustrative sketch of this distance-based count appears at the end of this section). These analyses did not reveal any strong correlations. We suspect this relates to our findings from Goal #1, having to do with the heterogeneity of the users participating. In particular, our findings indicate that local reviewers are likely to respond differently than visiting reviewers. In addition, we observed some evidence that the socio-economic status of the community may play a role; we have thus expanded our analysis of this factor (see Goal #4 below). Overall, the Yelp discourse data are more heterogeneous and complex than we anticipated, which requires us to understand their properties more fully before we can conclusively test ideas about their relationship to CSA density.

Goal #3: While we have not yet discovered a meaningful index of receptivity, we have produced a data visualization of simple review metrics by community in New York State. The visualization is available to the public at http://margolin.cac.cornell.edu/Yelp_Projection_Porter/Yelp_Projection.html

Goal #4: One feature that was important in our CSA analysis was the socio-economic status of the community. Three competing social psychological theories make predictions about how the group status (in this case socio-economic) of an object (in this case a restaurant) shapes reviewers' evaluations of it: expectancy-violation theory, the black sheep effect, and the extremity-complexity model. Each of these models relates a reviewer's expectation of an experience to their actual experience. Currently we are investigating which of these theories is dominant in the context of Yelp restaurant reviews, and in which contexts. The goal of this research is to better understand how written discourse reflects the expectations that a reviewer had for a restaurant before experiencing it. Inferring these expectations is key to inferring the normative identity of the community.
For example, when reviewers expect food to be "locally sourced," they will be disappointed when it is not; by contrast, when they have no such expectation, they may be pleasantly surprised when it is. These expectations influence the reviewer's experience and how they describe it in their review, but they also reflect a social understanding of that restaurant in its community. That is, in locations where no one expects the food to be local, it is unlikely that local food will be strongly valued (even if it is lauded when provided). Rather, local food should be prized in cases where reviewers write as though localness is expected, even if it is not delivered in a particular experience. Our preliminary findings on this question, "A Taste of Discrimination," have been submitted to the International Communication Association Annual Conference (San Diego, California, 25-29 May 2017).
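Illustrative sketch of the CSA-density count referenced in Goal #2, in R. The data frames communities and csas (each with lat and lon columns in decimal degrees) are hypothetical placeholders, and the haversine formula used here is a standard great-circle approximation rather than the project's exact procedure.

# A minimal sketch: count CSAs within 10 and 25 miles of each community,
# assuming hypothetical `communities` and `csas` data frames with lat/lon columns.
haversine_miles <- function(lat1, lon1, lat2, lon2) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 + cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  3959 * 2 * asin(sqrt(pmin(1, a)))  # Earth radius of roughly 3959 miles
}

count_within <- function(i, radius) {
  d <- haversine_miles(communities$lat[i], communities$lon[i], csas$lat, csas$lon)
  sum(d <= radius)
}

communities$csa_within_10 <- sapply(seq_len(nrow(communities)), count_within, radius = 10)
communities$csa_within_25 <- sapply(seq_len(nrow(communities)), count_within, radius = 25)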

Publications

  • Type: Journal Articles Status: Under Review Year Published: 2017 Citation: A Multi-Theoretical Approach to Big Text Data: Comparing Expressive and Rhetorical Logics in Yelp Reviews (revision requested for resubmission). Communication Research.
  • Type: Conference Papers and Presentations Status: Submitted Year Published: 2017 Citation: A Taste of Discrimination (2017). Submitted to International Communication Association Annual Conference (San Diego, California 25-29 May 2017)
  • Type: Conference Papers and Presentations Status: Submitted Year Published: 2017 Citation: Tweeting Climate Change: Who or What Motivates Politicians to Address The Topic? (2017). Submitted to International Communication Association Annual Conference (San Diego, California 25-29 May 2017)
  • Type: Conference Papers and Presentations Status: Accepted Year Published: 2016 Citation: Margolin, D., & Markowitz, D. (2016). You Write What You Eat: Linguistic Style, Ratings, and Locale of Yelp Reviews. Presented to International Communication Association Annual Conference (Fukuoka Japan 9-13 June 2016).


Progress 10/24/14 to 09/30/15

Outputs
Target Audience: Communication social scientists specializing in text-analytic methods; undergraduate students interested in doing research on social media and food-related topics.

Efforts: Our goal is to identify meaningful indicators of cohesive local communities that value local products using social media data. Since social media data are primarily text-based, our first effort was to develop a means of detecting local community orientation with text-analytic methods. Thus our first-year efforts focused on getting feedback on our work from social scientists who specialize in these techniques. To address this audience we prepared and submitted a manuscript, "You Write What You Eat: Linguistic Style, Ratings, and Locale of Yelp Reviews," to the International Communication Association annual conference. We also engaged two undergraduate students who are interested in using social media data to better understand the relationship between social media communication and attitudes about food and food-providing establishments, including restaurants and dining halls. They work with the PI as part of an independent study gathering data about Ithaca-area restaurants and markets on Instagram, and they are receiving training in scraping and analyzing social media data as part of this project.

Changes/Problems: Obtaining the full Yelp data for all businesses in New York State took substantially longer than anticipated. As described in the previous section, the data were both more poorly organized (with overlapping town features) and more restrictively provided (given Yelp's limits on web scraping) than initially anticipated. As a result, developing an appropriate script took longer than expected and required more sophisticated programming skills than initially thought. The script was not complete and tested until fall 2016, and the scraping process itself took several weeks. The data have now been obtained, however.

What opportunities for training and professional development has the project provided? During the project we have engaged:
1) Two master's students in information science, to obtain the Yelp data. I have worked with each of them to improve their understanding of how the data they gathered feed into social scientific inquiry. One student worked with me to develop the basis for measuring whether a reviewer was "local" to the business they were reviewing.
2) One doctoral student specializing in linguistic analysis/text analysis, who worked in summer 2015. He had previous experience with text/linguistic analysis but no prior experience with social media data or the R programming language; he developed skills in both areas during the project.
3) Two undergraduate students, one in communication and one in computer science, who became interested in the project and have been conducting their own investigation of how people talk about food on social media, looking at Instagram posts about Ithaca food establishments, including Cornell dining halls. Their work was conducted as an independent study and did not require funds; however, their analyses may be used as part of the project depending on what they find. During the fall semester (they began work in November 2015), they learned skills related to the collection and social scientific analysis of social media data.
How have the results been disseminated to communities of interest? Results of the analyses of self-presentation and CLT have been disseminated to the communication scholarly field through submission of a paper to the International Communication Association annual conference.

What do you plan to do during the next reporting period to accomplish the goals? The next phase of the project entails three main steps: (1) applying the text-analytic techniques developed on the academic dataset to the (now complete) New York State dataset; (2) comparing the text-analytic features of community to the population of CSAs in different local areas in New York State; and (3) creating the receptivity index.

Impacts
What was accomplished under these goals?

Related to Goal #1 and Goal #4: Our first analysis was to look at variations in food-related dialogue across different communities. To do this we began with the Yelp academic dataset, which contains reviews of businesses, primarily restaurants, across 26 communities around the country. Our goal was to identify the textual signature of allegiance to local establishments, as this should signal a preference for local food. We found significant effects in line with both self-presentation theory and construal level theory.

According to self-presentation theory (Bazarova et al., 2012), people are more concerned with how they will be judged by audiences who are socially close to them. One important signature of self-presentation is the injection of positive emotional words or "pro-social" words into communication. We analyzed the use of positive emotion words using the Linguistic Inquiry and Word Count (LIWC) dictionary. We first control for review rating (the score that reviewers give to a restaurant or other establishment), as positive emotion is strongly correlated with experience. This allows us to see whether certain reviews contain more positive emotion than would be expected based on their rating alone. Such reviews are indicative of self-presentation processes; that is, the reviewers are adding "extra" positive emotion in order to appear a certain way to their audience. We then compare the behavior of local reviewers (reviewers whose review history indicates they live in the same community as the business being reviewed) to visiting reviewers in terms of this self-presentation, that is, the use of this "extra" positive emotion. The effect was significant: in each of the 26 communities, local reviewers tended to add more "extra" positive emotion to their reviews than visiting reviewers. This serves as our first linguistic-feature indicator of the influence of "localness." (A brief illustrative sketch of this residual-based comparison appears at the end of this section.)

We also found the influence of localness through Construal Level Theory (CLT) (Trope & Liberman, 2010). According to CLT, individuals use more concrete, immediate language when talking about objects that are socially close to them. We find support for this in that reviewers tend to use more present-tense verbs when talking about local businesses than when talking about businesses that are not in their primary community. This serves as our second linguistic-feature indicator.

Importantly, for each of these indicators we find significant variation across communities. That is, self-presentation and construal operate more strongly in some communities than in others. For example, self-presentation (the injection of extra positive emotion) is significantly weaker in Albany than in Ithaca: reviewers local to Albany inject an extra .15% positive emotion words into their reviews of Albany businesses over non-Albany businesses, while Ithaca reviewers inject an extra .37% positive emotion words. This suggests that Ithacans are more concerned with how they will be judged by other Ithacans as it relates to their opinions. These findings set the stage for our next set of tests, in which we will compare these community-level variations in self-presentation and CLT to observable CSA data.

We have also collected the full, publicly available set of Yelp reviews written about businesses in New York State. This was a challenge because Yelp does not serve data directly by state.
A custom web-scraping script was written to query Yelp for the reviews written for businesses in each town within New York State. The script then had to check for duplications, since some businesses are listed under adjacent towns or hierarchies of areas (e.g., New York City, Bronx). The script also had to avoid overwhelming Yelp's servers by pausing between queries. The development of a feasible script and the completion of the scraping took substantial effort. Collection of the complete New York State Yelp dataset was completed in November 2015.

References

Bazarova, N. N., Taft, J. G., Choi, Y. H., & Cosley, D. (2012). Managing impressions and relationships on Facebook: Self-presentational and relational concerns revealed through the analysis of language style. Journal of Language and Social Psychology, 0261927X12456384.

Trope, Y., & Liberman, N. (2010). Construal-level theory of psychological distance. Psychological Review, 117(2), 440-463. http://doi.org/10.1037/a0018963
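Illustrative sketch (referenced in the self-presentation discussion above): a minimal R example of the residual-based "extra" positive emotion comparison. The reviews data frame and its columns (posemo, rating, is_local, community) are hypothetical placeholders, and the LIWC scoring that would produce the posemo column is assumed to have been done separately.

# A minimal sketch, assuming a hypothetical `reviews` data frame with columns:
#   posemo    - LIWC percentage of positive emotion words in the review text
#   rating    - the star rating the reviewer gave (1-5)
#   is_local  - TRUE if the reviewer is local to the business's community
#   community - the community in which the business is located
fit <- lm(posemo ~ rating, data = reviews)
reviews$extra_posemo <- resid(fit)  # positive emotion beyond what the rating alone predicts

# Compare the mean "extra" positive emotion of local vs. visiting reviewers,
# overall and within each community.
t.test(extra_posemo ~ factor(is_local), data = reviews)
aggregate(extra_posemo ~ community + is_local, data = reviews, FUN = mean)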

Publications

  • Type: Conference Papers and Presentations Status: Under Review Year Published: 2016 Citation: Margolin, D., & Markowitz, D. (2016). You Write What You Eat: Linguistic Style, Ratings, and Locale of Yelp Reviews. Submitted to International Communication Association Annual Conference (Fukuoka Japan 9-13 June 2016).