Research
Over the years, I have worked in several research areas, including firm production analysis, China's forest tenure reforms, recreation models, land use models, power market dispatch models, yield prediction, climate change and yield impacts, patent analysis, and integrated assessment models of water quality. For a complete list of my research output, see my Google Scholar profile.
Non-market Valuation
My non-market valuation work focuses on using discrete choice models, especially random utility maximization models, to model people's recreation site choices. In this line of work, I have published several studies with collaborators and produced several unpublished pieces.
Journal Publication
- Modeling recreation demand when the access point is unknown with Joseph Herriges and Catherine Kling, American Journal of Agricultural Economics, 2016 98(3) pp 860 - 880
Defining a recreation site is difficult, especially for river sites. A river segment can extend more than 100 miles, and a household recreation survey cannot record the detailed access points used by visitors. Relying on the aggregation method, we proposed treating one river segment as the combination of a set of pre-defined access points. If our assumption holds, the series of Monte Carlo simulations we conducted shows that the aggregation method recovers the underlying preference parameters better than previous methods.
In recent years, smartphone-based movement data may allow researchers to bypass the issue completely since, in theory, smartphone location tracking data can reveal the exact points or areas where people visit river sites. However, due to privacy concerns, it may not be feasible for researchers to build individual-level choice models. At the least, such tracking data give researchers more options than conducting a household survey to learn about and model people's recreation choices.
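The aggregation idea can be illustrated with a minimal sketch, assuming a simple conditional logit over access points. The function name, the utilities, and the point-to-segment mapping below are hypothetical and are not taken from the paper: the probability of visiting a river segment is the sum of the choice probabilities of its access points.

```python
import numpy as np

def segment_probabilities(utilities, segment_of_point):
    """Logit probabilities for access points, summed up to river segments.

    utilities: 1-D array, one deterministic utility per access point.
    segment_of_point: 1-D int array, segment index of each access point.
    """
    utilities = np.asarray(utilities, dtype=float)
    segment_of_point = np.asarray(segment_of_point)
    # Softmax over access points (subtract the max for numerical stability).
    expu = np.exp(utilities - utilities.max())
    point_probs = expu / expu.sum()
    # A segment is visited whenever any of its access points is chosen.
    n_segments = int(segment_of_point.max()) + 1
    return np.bincount(segment_of_point, weights=point_probs,
                       minlength=n_segments)

# Hypothetical example: 5 access points located on 2 river segments.
v = np.array([0.2, -0.1, 0.4, 0.0, -0.3])
seg = np.array([0, 0, 0, 1, 1])
seg_probs = segment_probabilities(v, seg)
```

In an application, the utilities would depend on travel cost and site attributes at each access point, and only the segment-level probabilities would enter the likelihood because the survey records segments rather than access points.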
- Temporal reliability of welfare estimates from revealed preferences with David A. Keiser and Catherine L. Kling, Journal of the Association of Environmental and Resource Economists, 2020 7(4) pp 659 - 686
Times change, and people change. In the past, little revealed preference work, especially on recreation models, examined this question, in contrast to the stated preference literature. We relied on the Iowa Lake Project data to examine whether the preferences recovered from annual household surveys are temporally stable. The results indicate that we need to pay attention to this temporal reliability issue, especially when drawing inferences from data and models from long ago.
After the pandemic, it is even more likely that people have changed their attitudes toward social life and outdoor resources. What is the implication? If we want to use results from studies conducted in the 1990s, the early 2000s, or even the 2010s, we may need to ask ourselves how much we believe people will behave as they did before. If we believe people have changed, do we need to conduct more research to capture these changes in order to provide more forward-looking advice for policy making? Personally, I believe we need to conduct more studies on the same topics to get an updated description of the population if we cannot easily adjust previous results.
Once physicists have measured Newton's gravitational constant, we may bet that the value will remain accurate to several decimal places for decades or centuries to come. However, we cannot assume that the income elasticity of food demand will remain accurate over the same time period.
- Revenue and Distributional Consequences of Alternative Outdoor Recreation Pricing Mechanisms: Evidence from a Micropanel Data Set with David A. Keiser, Catherine L. Kling, and Daniel J. Phaneuf, Land Economics, 2022 98(2) pp 478 - 494
This paper investigates how pricing schemes, essentially entrance fees, affect revenue and what their distributional impacts are, using the multi-year household survey data collected in the Iowa Lake Project over the past two decades. The results suggest that pricing schemes could be used to raise revenue; however, the distributional impacts are regressive, even with an income-differentiated pricing scheme.
Unpublished work
- Modeling recreation with partial trip information with Joseph Herriges and Catherine L. Kling (presented at 2011 1st AERE annual meeting)
This is an extension of the river work. Due to resource constraints and feasibility, household surveys about recreational activities can typically focus only on a set of sites, usually the policy-relevant sites. In reality, however, the set of close substitute recreation sites includes more sites than those in the survey questionnaire. In this work, we proposed a method that can be used when: 1. we know the full set of possible recreational sites; 2. sites are excluded from the survey due to resource constraints, such as budget constraints; 3. preferences have a simple nested structure; 4. there are no omitted site attributes (although the performance with omitted variables seems quite good).
In recent years, methods have emerged that can deal with massive choice sets through random sampling of alternatives, an approach originally proposed in McFadden (1979) for fixed parameter models. In Guevara et al. (2016), the method is proved to be applicable to random parameter models as well. Together with cell-phone tracking data, a model that includes thousands or more sites in the choice set can, in theory, be estimated.
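The sampling idea can be sketched as follows for a fixed-parameter conditional logit. This is an illustration under stated assumptions, not the estimator from the cited papers: the function names, the number of sites, and the utilities are hypothetical. The sketch keeps the chosen site and draws the remaining alternatives uniformly without replacement, in which case the sampling correction term is the same for every alternative in the sampled set and cancels out of the logit formula.

```python
import numpy as np

def sampled_choice_set(n_sites, chosen, n_draws, rng):
    """Return the chosen site plus n_draws other sites drawn uniformly."""
    others = np.delete(np.arange(n_sites), chosen)
    drawn = rng.choice(others, size=n_draws, replace=False)
    return np.concatenate(([chosen], drawn))

def sampled_logit_loglik(utilities, choice_set):
    """Log-likelihood contribution of one trip on the sampled choice set.

    The chosen site is assumed to be the first entry of choice_set.
    With uniform sampling, no explicit correction term is needed.
    """
    v = utilities[choice_set]
    v = v - v.max()  # numerical stability
    return v[0] - np.log(np.exp(v).sum())

rng = np.random.default_rng(0)
n_sites = 5000                          # hypothetical universe of sites
utilities = rng.normal(size=n_sites)    # hypothetical deterministic utilities
chosen = 42                             # hypothetical observed choice
subset = sampled_choice_set(n_sites, chosen, n_draws=99, rng=rng)
ll = sampled_logit_loglik(utilities, subset)
```

With non-uniform sampling protocols, the log of the sampling probability would have to be subtracted from each utility before forming the logit ratio.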
- Water-based Recreation and Water Quality Indices: A Revealed Preference Approach with David A. Keiser. (presented at 2016 AAEA annual meeting)
Water quality indices (WQIs) were proposed to measure water quality comprehensively. Recreation models typically include one or several individual physical measures that could enter the National Sanitation Foundation's WQI, such as dissolved oxygen. In many studies, water clarity measures such as Secchi depth are found to be highly significant. However, many benefit transfer studies rely on WQIs. In this paper, we compared model fit and welfare evaluations when a WQI is included in the model. The WQIs (calculated with different aggregation schemes) are sometimes statistically significant in the model. However, the welfare measures are quite different. In some extreme cases, they can point in different directions.
This study inspired the temporal reliability work. Even now, WQIs are seldom used in recreation modeling, especially with revealed preference approaches. Some water quality programs target a specific set of physical measures, such as nutrients. To better address benefit transfer questions, researchers may need to balance statistical performance against policy evaluation needs, especially when there is no well-accepted conversion formula between the variables used directly in the model and the variables directly targeted by the programs. If there is a discrepancy, we may introduce more noise when we use the estimation results to infer benefit transfer values.
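To show how the aggregation scheme alone can move an index, here is a minimal sketch that compares a weighted arithmetic mean with a weighted geometric mean of sub-index scores. The scores and weights are hypothetical and are not the official NSF values or the schemes used in the paper.

```python
import numpy as np

def wqi_arithmetic(scores, weights):
    """Weighted arithmetic mean of sub-index scores (0-100 scale)."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * scores) / np.sum(weights))

def wqi_geometric(scores, weights):
    """Weighted geometric mean of sub-index scores; scores must be positive."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    w = weights / weights.sum()
    return float(np.exp(np.sum(w * np.log(scores))))

# Hypothetical sub-index scores for three measures at one lake,
# e.g. dissolved oxygen, clarity, and nutrients, with hypothetical weights.
scores = [90.0, 40.0, 70.0]
weights = [0.5, 0.3, 0.2]
wqi_a = wqi_arithmetic(scores, weights)
wqi_g = wqi_geometric(scores, weights)
```

The geometric form penalizes a single poor sub-index more heavily, so the two indices can rank the same set of lakes differently.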
Land Use Models
My land use work focuses on crop choice in US Midwestern states, especially on using discrete choice models to estimate farmers' crop choices with remote sensing data such as the Cropland Data Layer (CDL). The core of the model is based on Lubowski et al. (2006), with an emphasis on the dynamic effects naturally embedded in the corn and soybean rotation system. The estimation is based on the conditional choice probability (CCP) method proposed in Hotz and Miller (1993).
Unpublished work
- Crop choice and rotational effects: A dynamic model of land use in Iowa in recent years with Sergey Rabotyagov and Catherine L. Kling (presented at 2014 AAEA annual meeting)
In the typical corn-soybean row crop system in US Corn Belt states, there are natural benefits to planting corn after soybeans. This creates a dynamic link between crop choices across growing seasons. Discrete choice models have been used in micro-level land use modeling since the early 2000s, typically as static choice models or partially dynamic models, i.e., models that include the lagged crop choice. In this paper, we proposed a dynamic discrete choice land use model based on newly released remote sensing data sets such as the Cropland Data Layer and on the CCP estimation method, which is computationally efficient compared with the nested estimation method (Rust, 1997). When we applied the dynamic land use model to recent Iowa land use data, we found two things. First, much higher payments are needed for farmers to forgo the corn-soybean crop option. Second, the dynamic model produces a significantly different arc elasticity than the static model in a policy scenario in which the corn price increases by 10 percent.
- Crop Choice, Rotational Effects and Water Quality Consequence in Up-Mississippi River Basin: Connecting SWAT Model with Dynamic Land Use Model with Sergey Rabotyagov and Catherine L. Kling (Poster presentation at 2015 AAEA annual meeting)
This is an extension of the previous work and a connection to Soil and Water Assessment Tool (SWAT) models. Land use changes have implications beyond economic issues; specifically, they have critical water quality implications. For example, the Corn Belt contributes most of the nutrients reaching the dead zone in the Gulf of Mexico. This work investigates what we can learn if we allow for dynamic links within the corn-soybean system. The results suggest that the dynamic land use model does produce different land use patterns; however, the impacts on water quality seem similar to those from a static land use model. This suggests that management practices and weather conditions play a larger role than land use patterns.
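For reference, an arc elasticity for a price scenario can be computed with the standard midpoint formula. The sketch below assumes that definition, which may differ from the exact one used in the paper, and the acreage shares and prices are hypothetical numbers chosen only to mimic a 10 percent corn price increase.

```python
def arc_elasticity(q0, q1, p0, p1):
    """Midpoint (arc) elasticity of quantity q with respect to price p."""
    pct_change_q = (q1 - q0) / ((q0 + q1) / 2.0)
    pct_change_p = (p1 - p0) / ((p0 + p1) / 2.0)
    return pct_change_q / pct_change_p

# Hypothetical scenario: the corn price rises from 4.00 to 4.40 (10 percent)
# and the predicted corn acreage share rises from 0.50 to 0.53.
elasticity = arc_elasticity(q0=0.50, q1=0.53, p0=4.00, p1=4.40)
```

Running the same scenario through a static and a dynamic model gives two predicted shares after the price change, and therefore two arc elasticities that can be compared directly.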
- Estimating adoption of cover crops using preferences revealed by a dynamic crop choice model with Adriana Valcu-Lisman and Sergey Rabotyagov (Presented at 2015 AAEA annual meeting)
This is another extension of the dynamic land use model, this time to consider cover crop adoption scenarios. With calibrated alternative-specific constants, the model projected low levels of adoption under certain cost-share programs. The calibration relied on county-level adoption information from the 2012 Census of Agriculture. There are now some field data sets on the locations of adoption. With these new data sets, we can better understand the factors behind farmers' decisions to adopt cover crops.
Climate Change and Crop Yields
In this line of work, I have recently become interested in the implications of machine learning models for studies of climate change and crop yields. When I use traditional econometric models in yield analysis, I rely on the model specification used in the Roberts et al. (2013) AJAE paper. Within machine learning frameworks, I currently use LSTM, attention-LSTM, and Transformer (attention) models.
Journal Publication
- Agricultural innovation and adaptation to climate change: Insights from US maize with Seungki Lee and GianCarlo Moschini, Journal of the Agricultural and Applied Economics Association, 2022 1(2) pp 165 - 179
In this paper, we combined state-level adoption data for GE maize seeds with county-level maize yields in the US from 1981 to 2020 to understand how much research is needed to compensate for the yield effects of climate change, with research measured by the yield contribution of GE maize seed. The results show that much larger research efforts are needed to fully compensate for the negative impacts of climate change.
Unpublished work
- What A Deep Learning Approach Say about Future US Soybean Yields with Tao Xiong, and Darren Ficklin. (Poster Presentation at 2020 annual meeting)
In this study, we used a Long Short-Term Memory (LSTM) model to predict soybean yields in 11 US Midwestern states. We then used the best model specification to project future yields under 19 global climate models. Our model's prediction performance is close to that of similar yield prediction models. In terms of future yield impacts, we found quite moderate impacts compared with results from econometric models. Similar results were also found in a corn yield prediction model that used artificial neural networks, another machine learning technique.
Machine learning models consistently show better yield prediction performance than econometric models. How should we interpret the climate change results from these better-predicting models?
- Spatial aggregation of weather variables and its implication in climate change analysis: The case of U.S. Corn and Soybean with Ruiqing Miao (Poster Presentation at 2021 annual meeting)
This study investigates how the spatial aggregation of weather variables affects yield analysis results. Typically, yield information is aggregated at the county level, whereas weather variables come as station data or as rasterized spatial cells. This makes some aggregation of weather variables unavoidable. We compared three aggregation schemes. First, average all raster cells within a certain distance of the county center. Second, average all raster cells whose centers are located within the county boundary. Third, average all cells that overlap the county, using the overlapping areas as weights. Our results show substantial differences across schemes in monthly weather variables and smaller discrepancies in seasonal weather variables. However, these differences between aggregation schemes did not manifest themselves in the yield regression results. Namely, larger differences in the X variables had negligible impacts on yield analysis tasks such as quantifying future yields under different climate scenarios. When we compared models with monthly weather variables against models with seasonal weather variables, we found substantial discrepancies in the climate change scenarios. This suggests that the temporal aggregation of weather variables (monthly versus seasonal) is critical in climate change yield analysis, as several recent papers have also reported.
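The three aggregation schemes can be sketched on a toy grid. Everything below is a hypothetical setup for illustration (unit raster cells, a rectangular county, random temperatures, and the function names); real applications would use county polygons and a GIS library.

```python
import numpy as np

# Hypothetical raster: 6 x 6 unit cells; cell (i, j) covers [j, j+1] x [i, i+1].
ny, nx = 6, 6
rng = np.random.default_rng(0)
temp = rng.normal(25.0, 2.0, size=(ny, nx))  # hypothetical monthly temperature
ys, xs = np.mgrid[0:ny, 0:nx]                # lower-left corner of each cell
cx, cy = xs + 0.5, ys + 0.5                  # cell centers

# Hypothetical county: an axis-aligned rectangle inside the grid.
x0, x1, y0, y1 = 1.3, 4.2, 0.8, 3.6
county_cx, county_cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0

def mean_within_radius(values, radius):
    """Scheme 1: average cells whose centers lie within a distance of the county center."""
    mask = np.hypot(cx - county_cx, cy - county_cy) <= radius
    return values[mask].mean()

def mean_centers_inside(values):
    """Scheme 2: average cells whose centers fall inside the county boundary."""
    mask = (cx >= x0) & (cx <= x1) & (cy >= y0) & (cy <= y1)
    return values[mask].mean()

def overlap_weights():
    """Area of overlap between each cell and the county rectangle."""
    wx = np.clip(np.minimum(xs + 1, x1) - np.maximum(xs, x0), 0, None)
    wy = np.clip(np.minimum(ys + 1, y1) - np.maximum(ys, y0), 0, None)
    return wx * wy

def mean_area_weighted(values):
    """Scheme 3: average overlapping cells, weighted by overlapping area."""
    w = overlap_weights()
    return (w * values).sum() / w.sum()

scheme1 = mean_within_radius(temp, radius=1.5)
scheme2 = mean_centers_inside(temp)
scheme3 = mean_area_weighted(temp)
```

The three values generally differ for a spatially varying field and coincide for a constant field, which is a quick sanity check on any implementation.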
Integrated Assessment Model - Water Quality
Integrated assessment models for water pollution, water quality, and nutrient problems are less studied than those for air pollution. I am currently part of a project led by David Keiser, Catherine Kling, and Daniel Phaneuf. In the project, we are trying to quantify several impacts of nutrients in surface water: recreational benefits, neighborhood housing benefits, drinking water treatment costs (revealed preference), willingness to pay for changes in biological condition gradients (BCG), etc.
At the same time, I am working with my colleagues at the Center for Agricultural and Rural Development (CARD), Iowa State University, to build a regional integrated assessment model. Unlike the national work, we will rely on the household survey data collected in the Iowa Lake Project to calculate the recreational benefits of water quality changes. Instead of relying on a meta-analysis of hedonic studies, we will use Zillow ZTRAX data for a primary hedonic analysis of the impacts of water quality on Iowa neighborhood housing markets.
Innovation (Patent) Analysis
Patent databases are widely and publicly available, such as the USPTO and the Google public patent database. Currently, I am mainly working with patent metadata to map the landscape of the patent universe, such as who holds what type of patents and how this landscape has changed over time. Two groups of patents, green patents and ag/env-related patents (precision ag patents and water patents), are my current interests.
Published work
- The roots of agricultural innovation: Patent evidence of knowledge spillovers with Matthew S Clancy, Paul Heisey, and GianCarlo Moschini
This study investigates the extent to which agricultural innovations draw on ideas originating outside of agriculture. We identify a large set of US patents for agricultural technologies granted between 1976 and 2018. To measure knowledge spillovers to these patents, we rely on three proxies: patent citations to other patents, patent citations to the scientific literature, and a novel text analysis to identify and track new ideas in the patent text. We find that more than half of knowledge flows originate outside of agriculture. The majority of these knowledge inflows, however, still originate in domains that are close to agriculture.
In this study, I mainly worked on the data management part. Through this work, I learned about several public sources of patent information and found the Google public patent database to be a great source if you do not have the financial resources to secure proprietary data sets.
Ongoing work
- Landscape analysis of low-carbon patents from 1990 to 2020 with Jingbo Cui, Shen Lin, Junjie Zhang
- Landscape analysis of Precision Agriculture patents and contribution of land-grant universities.