Progress 07/01/21 to 02/28/22
Outputs
Target Audience: Public sector individuals who, under the authority of federal, state, local, and tribal agencies, act as stakeholders in the wildland fire community and have responsibility for, or a critical role in, managing wildfire mitigation and response. These include Incident Commanders, Operations Section Chiefs (and their subordinates), Planning Section Chiefs (and their subordinates), Fire Behavior Analysts, and Long Term Fire Analysts.

At the national level, five federal land management agencies are target users for Cornea's products: the USDA Forest Service (Department of Agriculture); and the Bureau of Land Management, US Fish and Wildlife Service, National Park Service, and Bureau of Indian Affairs (Department of the Interior). Each of these five agencies has subordinate regional or state offices organized geographically to cover the entire country. Those serving the western and southern regions of the United States are the priority potential customers, as those regions have the largest, most numerous, and most damaging wildfires. They also have the largest concentrations of federal land and federal firefighting responsibility, the most fire-prone climate, and the largest number and acreage of prescribed fires each year.

At the state level, potential users are the state forestry agencies. Specifically, the target audience includes agency administrators (e.g., state foresters, fire management officers, fuels managers) as well as the state-level counterparts to the roles identified at the national level (incident commanders, operations and planning section chiefs, and incident management team fire behavior analysts). At the county and local level, our target audience includes county officials, fire department chief officers, and emergency management office directors.

The private sector target audience includes utilities, insurers/reinsurers, and "insurtech" companies (defined as technology innovations to improve efficiency and reduce costs in the insurance industry). The utility industry in the western US is a target user because these companies need wildfire risk models and assessments to mitigate their exposure to the potential liability of starting or exacerbating a wildfire. As an indicator of the importance of this market, utility companies in California spent $11B on wildfire prevention in 2021, and PG&E alone will spend $6B in 2022 to reduce wildfire risk related to its infrastructure.

Property and casualty insurance carriers need wildfire analytics for underwriting, risk pricing for policies, mitigation value (home hardening, vegetation removal), catastrophe response (evacuations), and post-catastrophe damage assessments and claims. Insurance carriers notoriously lack sophisticated technology and integrated systems, and with increasing claims due to wildfires, they would greatly benefit from risk analytics to understand their exposure, price accurately to avoid losses, and respond to wildfires and assess damages. Insured losses from wildfires between 2010 and 2020 totaled $45B, and the cost to insurers in 2020 from western US fires alone was $13B. Reinsurers invest heavily in limiting losses in the case of a catastrophe and employ more scientists than other types of insurers to understand risk models and ensure accuracy. They focus on specific events and use stochastic models to understand and predict the random occurrence of an event.
In our initial research, we have learned that some reinsurers may already understand and use fire behavior models; because they prioritize accuracy, they would be ideal users both for the more accurate models we produce and for the specific data sets that will be incorporated into larger ensemble models. Insurtech companies provide online services that enable users to find and/or compare insurance policies, and technology based on artificial intelligence or machine learning is often used to process claims digitally. Insurtech is a target audience for our product, as these companies are technologically advanced and capable of integrating advanced analytics quickly. There are up to 3,475 insurtech companies globally, with 1,500 having emerged in the last five years. Our target audience in this market focuses on newer insurtech companies tackling property and casualty insurance in catastrophe-prone areas of the US.

Changes/Problems: During the course of our Phase I work, we faced unforeseen challenges that required a realignment of our time and personnel resources. To this end, we made the decision to deprioritize Technical Objective #2 in favor of increased focus on PCL accuracy improvements through automated hyperparameter tuning, model selection, and data cleaning. This difficult but ultimately correct decision enabled numerous significant successes in our efforts under Technical Objectives #1, #3, and #4, as detailed in the Impacts section below. We still believe in the value of including dynamic weather as an input to our calculations and will continue to pursue this goal following the completion of Phase I.

What opportunities for training and professional development has the project provided?
Nothing Reported
How have the results been disseminated to communities of interest?
Nothing Reported
What do you plan to do during the next reporting period to accomplish the goals?
Nothing Reported
Impacts
What was accomplished under these goals?
Objective 1: Improve "unsupervised" fire-behavior modeling

1) Major activities completed / experiments conducted
- Creation of a catalog of vetted resources to support fire modeling. The vetting and categorization process has proven to be vital for the work completed against all four objectives.
- Incorporation of more efficient algorithms and tools for PCL/SDI generation and PCL model training, including: Euclidean distance calculation optimization; distance calculation efficiencies for sparse matrices; migration from Pandas dataframes to numpy/scipy; bitmap rasterizing efficiencies; multi-core (initially 2-core) feature generation; distributed worker nodes via Ray.io; and an upgrade from 2-core to 8-core feature generation.
- Improvements to the security, scalability, and parameterization of our baseline systems that produce the fire behavior models: SDI generation architecture; PCL generation architecture and iteration for improved accuracy; optimal ML model selection and fire behavior tooling; train/test data review; inclusion of additional fire behavior features; and automated hyperparameter tuning for subsample rate, learning rate, and max depth (illustrated in the sketch following this section).

2) Data collected
Literature review of relevant data sources in the space for further investigation.

3) Summary statistics & discussion of results
- Implementation of a scalable infrastructure for calculating PCLs, SDIs, and other indicators of fire behavior and control, with generation times for average-sized pyromes of roughly one hour (approximately a 50x improvement) and per-run cloud cost reduced from nearly $100 per pyrome to almost zero.
- Implementation of SDIs at parity with currently published SDIs (via the Risk Management Assistance Dashboard), with 98% pixel-wise Pearson correlation.
- An increase in PCL accuracy (AUC) from 69% to 80.8%, a 17% relative improvement.

4) Key outcomes or other accomplishments realized
Successfully improved the accuracy and computational efficiency of PCLs and SDIs.

Impact Statement
Increasing the accuracy and speed of PCL and SDI calculations has the potential to have a meaningful impact on our target audience and the communities they serve. By increasing accuracy in both calculations, we enable wildfire managers and decision makers to make more informed, timely, and data-driven decisions during pre-planning exercises, fuels treatment projects, and active incident response scenarios. Currently, these products are available only as static data, with customization limited by resource constraints. Our ability to produce PCL and SDI products rapidly and at scale is a paradigm shift.
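To make the automated tuning step concrete, the following is a minimal sketch of a cross-validated grid search over the three hyperparameters named above (subsample rate, learning rate, and max depth), scored by AUC to match the accuracy metric reported for the PCL model. The report does not specify the model library or search strategy; scikit-learn's GradientBoostingClassifier and GridSearchCV are assumptions used purely for illustration, and the synthetic data stands in for real fire-behavior features and PCL labels.

    # Hedged sketch: library and search strategy are assumptions, not the
    # production pipeline. Synthetic data stands in for real PCL features.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import GridSearchCV

    # Synthetic stand-in for per-pixel fire-behavior features and PCL labels.
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

    param_grid = {
        "subsample": [0.5, 0.8, 1.0],       # subsample rate
        "learning_rate": [0.01, 0.1, 0.3],  # learning rate
        "max_depth": [3, 5, 8],             # max tree depth
    }

    search = GridSearchCV(
        GradientBoostingClassifier(random_state=0),
        param_grid,
        scoring="roc_auc",  # AUC, the accuracy metric reported above
        cv=5,
        n_jobs=-1,          # parallel folds, analogous to multi-core feature work
    )
    search.fit(X, y)
    print(search.best_params_, round(search.best_score_, 3))

Scoring the search with roc_auc keeps model selection aligned with the reported AUC improvement (69% to 80.8%), and the grid can be swapped for a randomized or Bayesian search without changing the surrounding code.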
Objective 2: Inclusion of dynamic weather

1) Major activities completed / experiments conducted
Identification of viable dynamic weather data sources in both federal and state data repositories, as well as several sources of weather forecast simulations that can be used for model training and validation.

2) Data collected
N/A

3) Summary statistics & discussion of results
Despite the data sources identified, each is unique in its spatiotemporal resolution, availability, output format, and quality. This has two impacts that we addressed. First, integrating these data as effective sources of model input required a detailed assessment of these issues, resulting in the need to coregister the data, perform further quality control, and/or renormalize dataset formats. We determined that these tasks must be performed prior to ingestion into our data management system. Second, as model inputs may require spatial and temporal coverage not otherwise available from the datasets themselves, we investigated how to effectively downsample the data to provide hyperlocal estimates of relevant weather variables, bounded by the observed values. We determined that these specialized estimates must also be ingested into separate database tables.

4) Key outcomes or other accomplishments realized
As a result of our evaluation of both the weather data sources and the technical feasibility of including them in our models, we determined during Phase I that it was not in the best interest of our scope of work to utilize these data sources at this time. We consider the evaluation of this technical objective successful in that our understanding of our options, the technical requirements, and the limitations to account for is substantially enhanced compared to our initial understanding.

Impact Statement
N/A

Objective 3: Experiment with other machine-learning approaches

1) Major activities completed / experiments conducted
- Review of relevant deep learning architectures, including ResNet and various forms of convolutional neural networks (CNNs), focusing on methodologies for image segmentation. These are relevant for this task because they are designed to break an image down into its component objects and then classify those objects.
- Use of a U-Net architecture, a state-of-the-art CNN that has been shown to outperform standard sliding-window CNNs while also allowing for smaller training data sets through data augmentation.
Note: Model selection across more standard machine learning models was conducted as part of Objective 1.

2) Data collected
We created 351 images (each 400 x 400 px) from our augmented model development region and performed image transformations to introduce more heterogeneity into the model data, in line with the U-Net approach. This yielded 1,755 images for training and validating the model (see the sketch following this section).

3) Summary statistics & discussion of results
Our work here remains preliminary and ongoing. To date, the AUC has been 70%, representing little improvement over the baseline. Neural networks are notoriously difficult to train and tune, so we attribute this to the small sample size and the time needed to perform the requisite hyperparameter tuning.

4) Key outcomes or other accomplishments realized
- Expertise in the study and science of neural net architectures.
- Expertise in the development, implementation, and evaluation of neural net architectures.
- Assessment of the possible impact and the work necessary to create a scalable architecture and generate accurate results.

Impact Statement
While this work did not yield accuracy gains, we established expertise and a baseline from which to continue improvement if we choose to do so.
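As a rough illustration of the augmentation step above: the expansion from 351 source tiles to 1,755 training images is exactly a 5x increase, consistent with retaining each original tile plus four label-preserving transforms. The specific transformations were not reported; the rotations and horizontal flip below are common choices for U-Net-style segmentation training and are assumed here purely for illustration.

    # Hedged sketch: the actual transforms used in Phase I were not reported;
    # rotations and a flip are assumed as typical label-preserving choices.
    import numpy as np

    def augment(tile: np.ndarray) -> list[np.ndarray]:
        """Return the original tile plus four transformed copies."""
        return [
            tile,
            np.rot90(tile, k=1),  # 90-degree rotation
            np.rot90(tile, k=2),  # 180-degree rotation
            np.rot90(tile, k=3),  # 270-degree rotation
            np.fliplr(tile),      # horizontal flip
        ]

    # Stand-in for the 351 source tiles (400 x 400 px) described above.
    tiles = [np.random.rand(400, 400) for _ in range(351)]
    augmented = [a for t in tiles for a in augment(t)]
    print(len(augmented))  # 1755, matching the reported training set size

For segmentation, the same transform must be applied to each image and its label mask in lockstep so that per-pixel labels remain aligned with the transformed imagery.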
Objective 4: Continue development & validation of a user interface

1) Major activities completed / experiments conducted
We conducted more than two months of interviews and generative research with our target audience. These provided direction for our design work as well as validation of product priorities. We executed rapid prototype iterations on low-fidelity wireframes to ensure user satisfaction and conducted three sessions of user testing.

2) Data collected
Through target audience interviews, we collected a robust dataset of requirements, pain points, tooling needs, and UI design ideas that helped shape and refine our UI design. These data points were systematically tracked, cataloged, and organized.

3) Summary statistics & discussion of results
Achievement of a 10% improvement in task-success rate, a 10% reduction in time on task for the tool's primary tasks, and a 10% improvement in comprehension of core features.

4) Key outcomes or other accomplishments realized
- Development of a user-friendly application to view and understand SDIs/PCLs across data overlays.
- Ability to generate custom SDIs/PCLs via parameterized inputs (see the sketch following this section).
- Development of a flexible microservices architecture for iterative and efficient continued development.
- Creation of a standard authentication and authorization framework.

Impact Statement
The science underlying PCL and SDI calculations is complex and has historically been inconsistently understood and applied by wildfire managers in the field. The development of a simple software tool that allows non-academic wildfire stakeholders to leverage these data and apply them to strategic, operational, and tactical decision making represents a significant advancement in the mission of driving increased adoption of wildfire science by field-based stakeholders.
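As a hypothetical sketch of how a microservice might expose the parameterized custom SDI/PCL generation described above: the report documents the architecture and the authentication/authorization framework only at a high level, so the endpoint path, parameter names, and the FastAPI/OAuth2 choices below are illustrative assumptions, not the production interface.

    # Hedged sketch: a hypothetical generation endpoint, not the actual API.
    from fastapi import Depends, FastAPI
    from fastapi.security import OAuth2PasswordBearer
    from pydantic import BaseModel

    app = FastAPI()
    oauth2 = OAuth2PasswordBearer(tokenUrl="token")  # stand-in auth scheme

    class GenerationRequest(BaseModel):
        pyrome_id: str          # hypothetical region identifier
        product: str            # "SDI" or "PCL"
        resolution_m: int = 30  # hypothetical output resolution parameter

    @app.post("/generate")
    def generate(req: GenerationRequest, token: str = Depends(oauth2)):
        # In a real service this would enqueue a distributed generation job
        # (e.g., on the Ray.io workers noted under Objective 1) and return an
        # identifier the UI could poll for the finished raster overlay.
        return {"status": "queued", "product": req.product, "pyrome": req.pyrome_id}

Separating the UI from a job-queuing generation service of this kind is one way a microservices design can keep hour-scale pyrome runs from blocking interactive use.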
Publications