PARTNERSHIP: Integrating Locally-Weighted Meta-Regression and Machine Learning to Capture Spatial Complexity in Multi-Scale Benefit Transfer

Recipient Organization
Clark University
950 Main St.
Worcester, MA 01610

Performing Department
(N/A)

Non Technical Summary
The USDA spends billions of dollars per year on conservation to enhance environmental quality, ecosystem services and agricultural sustainability. The biophysical impacts of these programs (e.g., on soil retention and water quality) are relatively well understood and can be estimated using standard modeling approaches. Yet the economic benefits of these programs remain largely unknown, and credible information on non-market benefits is particularly lacking. Despite an extensive literature on non-market valuation, the methods from this literature are often impractical to use for the estimation of values provided by USDA and other conservation programs. Large-scale, applied valuation of this type almost universally requires benefit transfer, or BT. BT uses existing economic value estimates from prior studies at one or more locations to predict economic value estimates such as willingness to pay (WTP) at other, typically unstudied locations. Benefit transfer can produce economic value estimates for areas and ecosystem service improvements for which original economic valuation studies have not been conducted, thereby quantifying the economic value of large-scale agricultural conservation to the public. Yet BT methods to support reliable large-scale valuation are inadequately developed, particularly for applications such as resource conservation and water quality improvements with widespread, diffuse and patchy impacts. Due to the lack of sufficiently accurate BT methods, USDA and its partners struggle to produce credible estimates of non-market conservation benefits. Addressing this unresolved research and policy need requires a set of flexible, standardized BT approaches that are able to predict benefits for the large spatial scales over which conservation occurs, while simultaneously accounting for the important effects of localized, place-specific spatial and other dimensions on values for environmental improvements.
Responding to this important gap in knowledge and methods for economic analysis, the present project will develop and evaluate BT procedures with a previously unattainable capacity to account for localized spatial heterogeneity of conservation-related environmental improvements (such as water quality improvements) over large spatial scales, while identifying areas wherein improvements are most valued by target populations, including disadvantaged communities. Although applicable to any conservation outcome, methods will be illustrated for BTs that predict WTP for spatially diffuse water quality improvements. The project addresses USDA AFRI Environmental and Natural Resource Economics (A1651) program priorities, which call for "benefit transfer to inform benefit-cost calculations for conservation and natural resource policy design and implementation." These novel methods will integrate (1) locally weighted meta-regression models (LW MRMs) for WTP metadata that produce unique benefit functions for each site, (2) interactive map-based survey architecture that identifies highly valued ("salient") areas nationwide for specific environmental improvements, (3) a machine-learning spatial salience classification model (SCM) that uses these survey data to provide generalizable predictions, for any Census Block Group (CBG) nationwide, on the degree to which any potential watershed area is salient (or particularly important) to residents for water quality improvements, and (4) validated metadata on household WTP for water quality improvements in US waterbodies drawn from prior studies in the valuation literature, augmented with SCM results to support LW MRMs with enhanced WTP-prediction accuracy.
The project will provide transformative yet standardized BT methods able to predict values due to diffuse conservation over large scales, together with a set of nationwide spatial salience data layers that can be used independently to identify areas wherein water quality improvements are most valued by residents of any CBG. These methods will enhance the accuracy of large-scale BT, by incorporating systematic information on the extent to which patchy environmental improvements occur in local (or non-local) areas that are important (or salient) to households. In doing so, these approaches will increase the capacity of USDA and others to quantify the economic values generated by agricultural conservation programs.

Animal Health Component

70%

Research Effort Categories

Basic

(N/A)

Applied

70%

Developmental

30%

Classification

Knowledge Area (KA)	Subject of Investigation (SOI)	Field of Science (FOS)	Percent
605	0399	3010	45%
605	0399	2090	40%
605	0399	2050	15%

Knowledge Area
605 - Natural Resource and Environmental Economics;

Subject Of Investigation
0399 - Watersheds and river basins, general;

Field Of Science
2090 - Statistics, econometrics, and biometrics; 2050 - Hydrology; 3010 - Economics;

Keywords

Goals / Objectives
This project will develop and evaluate benefit transfer (BT) procedures able to account for localized spatial heterogeneity of conservation-related environmental improvements over large spatial scales, while identifying areas wherein improvements are most valued by target populations, including disadvantaged communities. These novel methods will enhance the accuracy of large-scale BT by introducing systematic information on the extent to which environmental improvements occur in local (or non-local) areas that are important (or salient) to households living in specific Census Block Groups (CBGs). Although applicable to any conservation outcome, methods will be illustrated for BTs that predict willingness to pay (WTP) for diffuse water quality improvements. These new BT methods will integrate (1) locally weighted meta-regression models (LW MRMs) for WTP metadata that produce unique benefit functions for each site across the landscape, (2) interactive map-based survey architecture that identifies highly valued ("salient") areas nationwide for specific types of environmental improvements, (3) a machine-learning (ML) spatial salience classification model (SCM) that uses these survey data to provide generalizable predictions, for any CBG nationwide, on the degree to which any potential watershed area is salient to residents for water quality improvements, and (4) validated metadata on household WTP for water quality improvements in US waterbodies, augmented with SCM results to support LW MRMs with enhanced WTP-prediction accuracy. The project will provide standardized BT methods able to predict values due to diffuse conservation over large scales, together with a set of nationwide spatial salience data layers that can be used independently to identify areas wherein water quality improvements are most valued by residents of any CBG.
The project is organized around eight primary objectives:
1. Finalize the theory and procedures necessary to integrate newly developed methods for Bayesian LW MRMs with novel ML algorithms for spatial salience classification, with the integrated approach designed to support spatially explicit BTs of water quality benefits from heterogeneous and patchy resource conservation over large spatial scales.
2. Develop and implement a large-sample spatial salience survey to elicit data on the degree to which US households value, or view as "salient," potential water quality improvements in different HUC 8 (8-digit Hydrologic Unit Code) watersheds regionally and nationwide. Integrate survey data with supporting information from publicly available, nationwide geospatial and socioeconomic data layers.
3. Using the integrated survey, geospatial and socioeconomic data, implement the ML spatial SCM to predict, for every CBG nationwide, the degree to which any nationwide HUC 8 watershed is considered to be relatively salient (or is prioritized) by CBG residents for water quality improvements.
4. Using spatial salience predictions, retrospectively update the PDs' existing metadata on per household WTP for water quality improvements to incorporate information on the extent to which the improvements that were valued by each underlying, primary study in the metadata likely occurred in salient or non-salient watersheds, for the originally sampled households.
5. Using the extended metadata, specify and implement a novel LW MRM of per household WTP for water quality improvements, incorporating previously unavailable information on the localized spatial salience of the affected waterbodies.
6. Using both out-of-sample cross validation (CV) and an applied, large-scale BT case study in the Mid-Atlantic US, evaluate the BT performance of the integrated SCM / LW MRM for predicting per household and aggregate WTP, and compare the accuracy of the new approach to that of prior MRM BTs that are unable to accommodate localized spatial salience.
7. Using SCM results, further develop novel, nationwide spatial salience geospatial data layers that can be used to inform BTs and other valuation modeling, identifying the specific HUC 8 watersheds for which water quality improvements are predicted to be salient for households in any selected CBG across the continental US, including CBGs associated with disadvantaged communities or those with environmental justice concerns.
8. Coordinate results to provide methods, guidance and protocols for meta-function BTs able to predict WTP over large spatial scales while capturing the effects of local heterogeneity in quality changes and benefits to marginalized communities.
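The locally weighted meta-regression in Objective 5 and the out-of-sample comparison in Objective 6 can be illustrated with a minimal sketch. It is not the project's Bayesian LW MRM: it substitutes a frequentist, kernel-weighted least squares fit with a single covariate, and the Gaussian kernel, the 200 km bandwidth, the field names, and all metadata values are hypothetical assumptions chosen only to show the mechanics. Each held-out study site gets its own benefit function, fitted with weights that decline with distance from that site, and leave-one-out transfer error is compared against a single pooled function.

```python
import math

# Hypothetical metadata: one row per primary-study observation.
# x, y are made-up site coordinates (km); dq is a made-up water quality
# change; wtp is a made-up per household WTP estimate.
STUDIES = [
    {"x": 0.0, "y": 0.0, "dq": 1.0, "wtp": 21.0},
    {"x": 30.0, "y": 10.0, "dq": 2.0, "wtp": 29.0},
    {"x": 60.0, "y": 0.0, "dq": 3.0, "wtp": 41.0},
    {"x": 20.0, "y": 40.0, "dq": 4.0, "wtp": 50.0},
    {"x": 1000.0, "y": 0.0, "dq": 1.0, "wtp": 71.0},
    {"x": 1030.0, "y": 20.0, "dq": 2.0, "wtp": 99.0},
    {"x": 1060.0, "y": 0.0, "dq": 3.0, "wtp": 131.0},
    {"x": 1010.0, "y": 50.0, "dq": 4.0, "wtp": 160.0},
]


def gaussian_weight(dist_km, bandwidth_km):
    """Kernel weight that declines smoothly with distance (assumed form)."""
    return math.exp(-0.5 * (dist_km / bandwidth_km) ** 2)


def weighted_fit(rows, weights):
    """Weighted least squares for wtp = a + b * dq; returns (a, b)."""
    sw = sum(weights)
    mx = sum(w * r["dq"] for w, r in zip(weights, rows)) / sw
    my = sum(w * r["wtp"] for w, r in zip(weights, rows)) / sw
    sxx = sum(w * (r["dq"] - mx) ** 2 for w, r in zip(weights, rows))
    sxy = sum(w * (r["dq"] - mx) * (r["wtp"] - my)
              for w, r in zip(weights, rows))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def predict_local(rows, site, bandwidth_km):
    """Site-specific benefit function: nearby studies get more weight."""
    weights = [
        gaussian_weight(math.hypot(r["x"] - site["x"], r["y"] - site["y"]),
                        bandwidth_km)
        for r in rows
    ]
    a, b = weighted_fit(rows, weights)
    return a + b * site["dq"]


def predict_global(rows, site):
    """Single pooled benefit function: every study weighted equally."""
    a, b = weighted_fit(rows, [1.0] * len(rows))
    return a + b * site["dq"]


def loo_error(rows, predictor):
    """Leave-one-out mean absolute percentage transfer error."""
    errors = []
    for i, held_out in enumerate(rows):
        train = rows[:i] + rows[i + 1:]
        predicted = predictor(train, held_out)
        errors.append(abs(predicted - held_out["wtp"]) / held_out["wtp"])
    return sum(errors) / len(errors)


global_err = loo_error(STUDIES, predict_global)
local_err = loo_error(
    STUDIES, lambda train, site: predict_local(train, site, bandwidth_km=200.0)
)
```

Because the made-up data contain two regions with different benefit functions, `local_err` comes out lower than `global_err` here. That outcome is a property of the toy data, not evidence about the project's methods.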

Project Methods
Project methods are organized around eight primary tasks.
1. We will first finalize the theory and procedures to integrate newly developed methods for Bayesian locally weighted meta-regression models (LW MRMs) with machine learning (ML) algorithms for spatial salience classification, with the integrated approach designed to support spatially explicit benefit transfers (BTs) of water quality benefits from heterogeneous and patchy resource conservation over large spatial scales. Our methodological point of departure is the LW MRM developed by the PDs for a prior USDA AFRI supported project. This LW MRM will be enhanced to account for the extent to which spatially heterogeneous water quality improvements occur in watershed areas (here, HUC 8s) that are salient or non-salient to households residing in each census block group (CBG), where the salience of particular watersheds can (and likely will) vary across households in different CBGs.
2. Extending the spatially interactive survey architecture under development by the PDs, we will then develop and implement a large-sample spatial salience survey to elicit data on the degree to which US households value, or view as "salient," potential water quality improvements in different HUC 8 watersheds regionally and nationwide. These methods augment a traditional online survey instrument with integrated, dynamic GIS maps that offer a visually intuitive alternative to traditional survey questions aimed at eliciting features of interest in geographic space. Adapting these methods, we will develop, pretest, and implement a novel, nationwide, push-to-web spatial salience survey of randomly selected households across the continental US. Respondents will be asked, in a guided fashion and via map interactions, to identify HUCs of special importance across the country for water body uses and water quality improvements, e.g., due to recreational preferences, family ties, or for other use or non-use reasons. The interface will present each respondent with an interactive map of the contiguous US, overlaid with HUC 8 boundaries of affected watersheds (showing water bodies, key places, etc.). The map will allow respondents to pan and zoom to any desired location, search for features using a natural language search tool, and then indicate the salience of the watershed containing these features. The survey data will be linked to supporting information from publicly available, nationwide geospatial and socioeconomic data layers.
3. Using the integrated survey, geospatial and socioeconomic data, we will design and implement an ML spatial salience classification model (SCM) to predict, for every CBG nationwide, the degree to which any nationwide HUC 8 watershed is considered to be relatively salient (or is prioritized) by CBG residents for water quality improvements. For any CBG, we hypothesize that the salience indicators elicited in Task 2 will be related to multiple attributes of each HUC 8, each of which can ex ante be envisioned as affecting salience status, such as total river miles, area of lakes and reservoirs, special recreation areas (day use areas, state parks, etc.), total population residing within the HUC, and distance from the respondent's home CBG, among many others. Other explanatory variables will include Census (demographic) information from the respondent's home CBG, such as racial/ethnic category composition, age structure, educational attainment, and income. Grounded in these hypothesized relationships, among the core methodological contributions of the proposed research is development of an entirely new SCM, using ML methods. The ultimate goal of the SCM is to determine, for any CBG / HUC 8 combination in the contiguous US, whether the considered HUC is deemed salient from the perspective of the specific CBG. The SCM will be estimated using an adaptation of Random Forests (RFs).
4. The next task will retrospectively update the PDs' extant metadata on per household WTP for water quality improvements to incorporate information on the extent to which the improvements that were valued by each underlying, primary study observation in the metadata occurred in salient or non-salient watersheds, for the originally sampled households, as predicted by the SCM. We will also review the literature for new studies that can potentially be added, thereby extending metadata size and coverage.
5. The new set of spatial salience variables, calculated for each metadata source observation, will support an augmented specification of the baseline LW MRM. This extended MRM will produce a uniquely tailored benefit function for any desired "policy site" and water quality scenario for which benefit estimates are required. It will predict per household benefits as a function of the degree to which spatially heterogeneous water quality improvements in the considered policy scenario occur in home, salient or non-salient areas for households living in each CBG within the BT market area, in addition to effects of the wide array of additional MRM variables (e.g., on distance, the aggregate scale of affected water bodies, affected landscapes, the scope of water quality change, income, affected uses, affected water body types, etc.). The integrated model will enable the first large-scale BT able to accommodate both continuous spatial effects (e.g., distance, affected area size, spatial substitutes) and discrete preference discontinuities and WTP hot/cold spots due to localized resource salience and heterogeneous environmental effects.
6. We will evaluate the performance of the new approach as applied to implement BT for a multi-state water quality improvement scenario. We anticipate an illustrative application with heterogeneous and diffuse changes in water quality over the multi-state Mid-Atlantic region. These evaluations will compare the accuracy of the new approach to that of prior MRM BTs that are unable to accommodate localized spatial salience, using out-of-sample cross validation. We will consider the extent to which applied results from the proposed approach differ from those produced by current methods, including the extent to which each type of BT is sensitive to the spatial distribution of water quality changes across the landscape. We will consider implications not only for per household and aggregate WTP estimates, but also for how these values are distributed across CBGs within the case study area. We will further consider the capacity of the proposed BTs to identify water quality changes that are most valued by disadvantaged communities, and to estimate the benefits realized by these groups.
7. This task will reformat SCM results into a set of nationwide spatial-salience data layers. These layers will identify the specific HUC 8 watersheds for which water quality improvements are predicted to be salient for households in any selected CBG across the contiguous US. These geospatial data layers will be designed for "plug and play" use within other, future BTs and policy studies, including those that do not rely on LW MRMs. Among other uses, this new data resource can be used by researchers and practitioners to draw insight into the relevant extent of the market for water quality changes, by showing the specific areas for which water quality changes are deemed to be most salient for residents of any given CBG nationwide (focusing on the greatest distance for which watersheds are predicted to be salient). It can also be used to identify areas wherein water quality changes will be most important to households in CBGs with disadvantaged communities or those with environmental justice concerns.
8. The final task will coordinate project results to provide generalizable insight into practical methods, guidance and protocols for BT procedures using the proposed methods, along with the accuracy to be expected when applying these methods.
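The salience classification step in Task 3 can be sketched as follows. This is a minimal illustration using the off-the-shelf `RandomForestClassifier` from scikit-learn, not the adaptation of Random Forests the project will develop. The feature set (distance, river miles, lake area, median income) loosely follows the attributes named in Task 3, but the variable names, value ranges, sample size, and the rule that generates the synthetic salience labels are all assumptions made for demonstration; real labels would come from the Task 2 survey.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_pairs = 600  # hypothetical number of surveyed CBG / HUC 8 pairs

# Hypothetical features for each CBG / HUC 8 pair.
distance_km = rng.uniform(0, 2000, n_pairs)          # CBG to HUC 8 distance
river_miles = rng.uniform(50, 3000, n_pairs)         # total river miles in HUC
lake_area_km2 = rng.uniform(0, 500, n_pairs)         # lake and reservoir area
median_income = rng.uniform(30_000, 120_000, n_pairs)  # home CBG income
X = np.column_stack([distance_km, river_miles, lake_area_km2, median_income])

# Synthetic stand-in for survey responses (an assumption, not a finding):
# nearby watersheds, and those with more lake area, are marked salient.
salient = (200 + 0.5 * lake_area_km2 - distance_km > 0).astype(int)

# Standard Random Forest; fixed seed so the sketch is reproducible.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X, salient)


def salience_probability(distance, rivers, lakes, income):
    """Predicted probability that a HUC 8 is salient for a given CBG."""
    row = np.array([[distance, rivers, lakes, income]])
    # Column 1 corresponds to class label 1 (salient).
    return float(model.predict_proba(row)[0, 1])


# Compare a nearby, lake-rich watershed with a distant, lake-poor one.
p_near = salience_probability(20, 1500, 300, 60_000)
p_far = salience_probability(1500, 1500, 10, 60_000)
```

Evaluating `salience_probability` for every CBG / HUC 8 combination would yield the kind of table that Task 7 proposes to publish as data layers. A binary salient / non-salient flag would require choosing a probability cutoff, which the text above does not specify.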