## Abstract

Trees are known to provide various ecosystem services and disservices to urban communities, which can be quantified using models based on field and environmental data. It is often uncertain how tree structure and environmental variables impact model output. Here we perform a sensitivity analysis (SA) of i-Tree Eco, a common urban forest model, to analyze the relative impact of different model inputs on three module outputs: biogenic volatile organic compound (BVOC)(isoprene and monoterpenes) emissions, carbon storage and sequestration, and dry deposition of nitrogen dioxide, sulfur dioxide, and ozone. The SA methods included novel applications of the Morris one-at-a-time method and a variance-based decomposition method, which integrates Monte Carlo simulation with Latin hypercube sampling and Iman Conover analysis. A case study was performed in New York City, New York, USA, with field plot data collected in 2013. Genus has the largest influence on BVOC emissions by determining base emission rates and its high interactions with other input factors, and BVOC emissions are sensitive to leaf biomass in a concave manner and temperature in a convex manner, while isoprene emissions show a strong linear relationship with photosynthetically active radiation (PAR). Diameter at breast height plays the most important role for both carbon storage and sequestration estimators; crown light exposure and tree condition are also important for carbon sequestration. Dry deposition velocity is sensitive to leaf area index and relative humidity in a nearly linear way, while sensitive to temperature and PAR in a concave manner. The results provide guidance to facilitate future field plot campaigns and model development. The knowledge revealed by the SA is also beneficial for model uncertainty reduction, which in turn facilitates more effective urban forest management and decision-making.

## INTRODUCTION

As a demographic trend and land transformation process, urbanization creates many environmental problems (e.g., increased runoff and nutrient export, increased temperatures, and increased material consumption and energy use)(Duh et al. 2008; Poumanyvong and Kaneko 2010). One way to alleviate these impacts is through urban greening, and many cities have launched large urban tree planting initiatives (McPherson et al. 2011). Models of urban tree impacts on the environment can help develop more efficient and effective planting schemes (Morani et al. 2011), identify areas where existing forests should be maintained (Lin 2020), improve the overall management of urban forests, and better quantify the benefits of these forest resources (Nowak et al. 2008a). One popular model is i-Tree Eco (hereafter referred to as “Eco”)(https://www.itreetools.org), which can quantify the structure, function, and ecosystem benefits trees provide (Nowak and Crane 2000), and has been used by thousands of researchers, urban planners, and others around the world to advocate for the benefits of urban trees (Lin et al. 2019).

While this tool has been extremely beneficial to the planning and management of urban trees, this model makes assumptions that simplify the relationships between the structure and function of urban forests and the representation of urban landscapes (Nowak and Crane 2000). While such assumptions are often necessary to model these complex systems, they can increase the uncertainty of model output and hinder the efficient and effective management of urban forests. Several studies point out that the uncertainty regarding various aspects of urban forest models (e.g., mortality rates of trees, transpiration rates, and meteorological conditions) should be better addressed (Yang et al. 2005; Morani et al. 2011; Selmi et al. 2016). Current uncertainty estimation within Eco is limited to the standard error of the total number of trees within the study area (Nowak et al. 2008b) and its impact on selected model outputs (e.g., carbon storage)(Nowak et al. 2013). In addition, those standard errors usually come from sampling error instead of error of estimation (e.g., allometric equations) and therefore tend to underestimate the overall uncertainty (Nowak et al. 2013). This study focuses on developing an advanced sensitivity analysis (SA) of Eco, including assessing how changes in model outputs can be apportioned to different model inputs, differentiating the relative importance of different model inputs, and identifying specific input-output relationships.

SA can be used in different settings (e.g., variance cutting, factors prioritization and fixing)(Saltelli et al. 2004), and serves a variety of purposes (e.g., model development, scenario analyses, and comparative studies)(Saltelli et al. 2004; Hirabayashi et al. 2011; Steffens et al. 2012). A wide range of SA techniques have been developed, ranging from classical frequentist analyses to complex Bayesian inference, and from local (e.g., Morris one-at-a-time) to global (e.g., variance-based) methods (Marino et al. 2008; Saltelli et al. 2008). Different SA methods can also be integrated together to achieve a more complete understanding of the relationship between input and output variables (Van Griensven et al. 2006).

This study employs a Morris one-at-a-time method (MOAT) and a variance-based decomposition method (VD) which integrates Monte Carlo simulation with Latin hypercube sampling (LHS-MC) and the Iman Conover method to address the correlation structure of the input variables (Iman and Conover 1982). The novel applications of SA are illustrated with a case study in New York City (NYC) for 2013. The specific goals of this analysis are to: (1) determine the ranking and relative importance of input variables on Eco outputs; (2) improve the understanding of the input-output variable relationships in Eco (e.g., linear and additive effects, nonlinear and interaction effects, and threshold effects); and (3) explore the implications of the results for future research and urban forest management (e.g., areas where additional data collection and analyses may be needed).

## METHODS

### Model Description

Eco consists of modules which can quantify urban forest structure, function, and value using field plots, air pollution, and meteorological data as input variables (Nowak and Crane 2000; Nowak et al. 2008a) (Figure 1). The input module, Eco-A, characterizes urban forest “Anatomy,” or the structure and composition based on data from field plots. Other modules examined here include one ecosystem disservice and two ecosystem services of urban trees. The Eco-B module estimates biogenic volatile organic compounds (BVOCs) from trees based on tree species, leaf biomass, air temperature, and other environmental factors. Trees emit hundreds of species of BVOCs, but the two major BVOCs are isoprene and monoterpenes (Bonan 2015). The Eco-C module estimates total carbon storage and annual carbon sequestration based on allometric equations, tree growth, mortality, and decomposition rates. The Eco-D module estimates the hourly air pollution removal by urban forests based on dry deposition processes (Nowak et al. 2006). The interactions between input variables and these modules are outlined in Figure 1, and the outputs from Eco that will be examined in this study are listed in Table 1.

Eco-B estimates BVOC emissions (E) as: 1

where *B _{E}* is the base genus emission rate, which is defined as the emission level standardized to

*T*= 30 °C and PAR = 1000 μmol/m

^{−2}s

^{−1}(Nowak et al. 2008a),

*B*is species leaf dry weight biomass, and γ is the temperature and light correction factor. The i-Tree species database contains data for more than 6,500 tree and shrub species and their corresponding attributes; among these attributes,

*B*is compiled from the literature (Nowak et al. 2002). Users can also upload local site-specific

_{E}*B*values to the i-Tree database (this needs to be validated by the i-Tree team). For isoprene, γ is estimated as: 2

_{E}while for monoterpenes it is: 3

where *c _{T}*

_{1}= 95000 J × mol

^{−1},

*c*

_{T}_{2}= 230000 J × mol

^{−1},

*T*= 314 K, α = 0.0027 μmol

_{M}^{−1}/m

^{2}s,

*c*

_{L}_{1}= 1.066 (dimensionless), β = 0.09 K

^{−1}(empirical coefficient),

*R*is the ideal gas constant (8.314 J × K

^{−1}× mol

^{−1}),

*T*is a standard temperature (303 °K), and

_{S}*T*is leaf temperature (°K), which is assumed to be equal to air temperature (Guenther et al. 1995; Guenther 1997).

Eco-C estimates forest biomass (Bio) using allometric equations from the literature (Nowak et al. 2013). The two most commonly used equations have the forms: 4

where *A* and *B* are coefficients whose values vary based on species information. Species also determine the selection of equation forms. *X* is the predictor variable. Two forms of *X* employed in Eco-C are:
5

For Eco-D, detailed equations used to calculate aerodynamic resistance, boundary layer resistance, canopy resistance, and dry deposition velocity can be found in Hirabayashi et al. (2011).

### Study Area and Data Employed

A sensitivity analysis case study of three Eco modules was performed for New York City (NYC). Tree species, diameter at breast height (DBH), tree height, crown height and width, tree condition, crown light exposure (CLE), percent crown missing, and land use type associated with 1075 trees were obtained from 296 randomly sampled field plots measured in 2013, which is more than the 200 plots used by many i-Tree studies (Nowak et al. 2008b). Although some rare species are likely omitted by random sampling, random sampling produces statistically accurate estimators of urban forest structure (e.g., number of trees and tree sizes) with known bounds of error. Tree condition is estimated as the percent of the crown that is composed of dead branches (to nearest 5%), which has a total of 7 categories ranging from dead to excellent; CLE is the number of sides (4 cardinal directions and 1 top side) of the tree receiving sunlight from above (ranging from 0 to 5), which is employed to estimate light environment and consequently growth rates (Nowak et al. 2008a). Eight land use types were identified across the field plots; land use affects carbon storage and gross carbon sequestration, which is addressed by assigning land use biomass adjustment factors (Nowak et al. 2008a). Leaf area index (LAI) statistics for this study area were taken from Breuer et al. (2003)(Table 2), which summarize LAI for 26 temperate regions. Model meteorological inputs (e.g., temperature, relative humidity [RH], wind speed, air pressure, and photosynthetically active radiation [PAR]) were obtained or derived from data from the nearest airport (NCDC 2018). PAR, which is the visible part (400 to 700 nm) of the solar spectrum, is calculated as 46% of total solar irradiance (Hirabayashi et al. 2011). The summary statistics for each input parameter are presented in Table 2.

### Sensitivity Analyses

#### Morris One-at-a-Time Analysis

Two SA methods were employed in this study: MOAT and a variance-based decomposition (VD) method. MOAT and VD, along with regression analyses to support the SA, were conducted using the R statistical software package. MOAT is a local sensitivity method which is computationally inexpensive and can differentiate input variables as negligible, linear/nonlinear, or having interaction effects (Saltelli et al. 2004; 2008). The method starts by random sampling *k* parameters, *X*_{1}…, *X*_{i},…*X*_{k}, from predefined levels of all input variables and calculating the subsequent model output *Y*(*X*_{1}…, *X*_{i},…*X*_{k}). For different Eco modules, the number of model parameters for (*k*) varies (*k* = 3 for isoprene, 2 for monoterpenes, 3 for carbon storage, 5 for carbon sequestration, and 6 for dry deposition). In the second run, only one parameter, *X*_{i}, is increased by step size Δ with all other parameters remaining at their starting values, and model output is again calculated *Y*(*X*_{1}…, *X*_{i} + Δ,…*X*_{k}). From this, the elementary effect (EE)(Van Griensven et al. 2006; Saltelli et al. 2008) of the *i*^{th} input variable is estimated as:
6

*i* is an elasticity, i.e., the percent change of the output divided by the percent change of the input. For each random sample of parameters, every parameter is subsequently increased with all other parameters back at their starting values, and the EE for that parameter is calculated. The levels of the parameters are usually identified by the midpoint of 4, 6 or 8 divisions of the parameter range, with each division of equal probability (Saltelli et al. 2004). In this study, 8 divisions were used (so Δ = 1/8 × parameter range), and the above procedure was repeated 20 times, which leads to a total of 20(1 + *k*) runs. The mean (*μ*) and standard deviation (*σ*) of the EE for each parameter across all runs is then calculated and plotted. *μ* measures the overall influence individual parameters have on the output, while *σ* indicates whether the interaction between input and output are linear across the parameter space (low *σ*) or if nonlinear or interaction effects are present (high *σ*)(Saltelli et al. 2008). When positive and negative values of EE_{i} cancel each other out, a low *μ* may be obtained for parameters which have a large impact on the output. Campolongo et al. (2007) proposed to instead calculate the mean of the absolute value of the EE_{i} (*μ**) to avoid this problem; this approach is implemented in this study.

When applying MOAT to Eco-B and C, three things should be noted. First, we dropped the cases when the original model output *Y*(*X*_{1}…, *X*_{i},…*X*_{k}) = 0, because it will lead to an error (division by zero) when calculating EE_{i}. For example, when a tree is dead, the annual carbon sequestration is equal to zero; this also occurs when the BVOC emissions are estimated as zero for some species. Second, for the categorical input variables “land use,” “tree condition,” and “CLE,” which have limited categories and can be effectively represented using levels (each category is one level), was always set as 1 when changing individual input parameters from one level to the next in the EE_{i} calculation. Third, the unordered categorical input variables “species” was not examined because its levels cannot be effectively determined and chosen from hundreds of species in the study site; species were randomly chosen and fixed for each simulation.

#### Variance-Based Decomposition (VD) Analysis

A flowchart to perform VD by integrating Monte Carlo simulation with Latin hypercube sampling (LHS) and Iman Conover is presented in Figure 2. Stage 1 is to determine appropriate probability distribution functions (PDFs) to describe input variables, and then to perform LHS on the PDFs. Stage 2 is a Monte Carlo simulation where model outputs are estimated using input variables obtained from the LHS in Stage 1. Stage 3 is a variance decomposition analysis, which decomposes the variance of the output into different fractions that can be attributed to different inputs, as well as a quantile regression analysis to explore the general input-output relationships. Each of these stages is briefly described below.

Stage 1 has two components. First, probability distributions were fit to input variables. Specific probability distributions were identified based on either empirical data sets from the study site or literature recommendations (Table 2). The use of the Weibull distribution for wind speed and the beta distribution for relative humidity were suggested from the literature (Yao 1974; Celik et al. 2010), and the PDFs of other identified input variables were empirically identified and estimated based on observational data sets from the study site. The assessment of PDF fit was performed by employing a Kolmogorov-Smirnov goodness-of-fit test and quantile-quantile plots (Wasserman 2013). For LAI, a uniform distribution was assumed with the minimum and maximum values taken from Breuer et al. (2003). Ordered categorical input variables were assumed to follow a uniform distribution with an equal probability of being within each category. Table 2 contains the distribution used for each input variable and the *P*-values for the Kolmogorov-Smirnov tests, which were all less than 0.2 and generally less than 0.1.

Second, LHS, which combines the advantages of random sampling and stratified sampling (Helton and Davis 2003), was employed to sample probability distributions from Stage 1. In LHS, the parameter space of each input variable is divided into *N* (*N* = 1500 in this study) intervals of equal marginal probability, 1/*N*, and one sample of each variable is made randomly within each interval. Thus, *N* non-overlapping values for each input parameter are generated (Hirabayashi et al. 2011). The LHS method typically assumes that the sampling is performed independently for each parameter (Marino et al. 2008), though many of the input variables are correlated. To avoid this assumption, the Iman Conover method is used to enforce a dependence structure on input variables (Iman and Conover 1982). The Iman Conover method is based on rearranging the values of the parameters so that a desired correlation structure, which is derived from Spearman’s rank correlation coefficients of empirical data sets (Table 3), is imposed on the *k* parameters. This technique has advantages of being distribution free and can be used with any type of sampling scheme (Helton and Davis 2003). At Stage 2, Eco-B, C, and D were batch run using input variables from Stage 1 to generate model outputs.

Stage 3 has two components. First, VD was employed to characterize and quantify the relative importance of input variables, and then for those input variables which were identified as important, quantile regression models were fit to binned data to capture specific input-output relationships.

VD is a global SA which can decompose the variance of model output into fractions attributed to each model input and to input interactions (Saltelli et al. 2004; 2008). The effects of VD are commonly measured by two sensitivity indices: a first order index and a total effect index. Model output, *Y*, is a function of a vector of *k* model inputs, *X*_{1},…,*X*_{k}. The variance decomposition of *Y* can be expressed as:
7

where *V*_{i} = Var[∑(*Y*|*X*_{i})] is the contribution of *X*_{i} to the variance of *Y*, and *V*_{ij} = Var[∑ (*Y*|*X*_{i}, *X*_{j})] − *V*_{i} − *V*_{j} is a part of the total variance caused by the interaction between *X*_{i} and all other *X*_{j}, namely the joint impact of *X*_{i} and all *X*_{j} on the variance of the output minus their first-order effects (Saltelli et al. 2008). The first-order index (*S*_{i}) of *X*_{i} on *Y* is:
8

and represents the percentage of the total variance accounted by the first-order effect of *X*_{i}. The total effect index (*S*_{Ti}) is:
9

where *V*_{~i} is the total contribution of all parameters except for *X*_{i}. *S*_{Ti} accounts for the total contribution of *X*_{i} to the variance of model output (e.g., its first-order effect plus all higher-order effects due to interactions) (Saltelli et al. 2008). By definition, *S*_{i} ≤ *S*_{Ti}, ∑ *S*_{i} ≤ 1, and ∑ *S*_{Ti} ≥ 1.

Following Saltelli et al. (2010), two base matrices (*A* and *B*) with dimension (*N, k*) were generated by LHS, where *N* is the number of simulation (*N* = 1500) and *k* is the number of input variables. *S*_{i} and *S*_{Ti} are calculated as:
10
11
12

where represents all columns from *A*, except the *i*^{th} column which is from *B, f*() is a function to estimate a corresponding ecosystem service, and *f*(*A*)* _{j}* (or

*f*(

*B*)

*) is the estimate an ecosystem service using the*

_{j}*j*

^{th}row of base matrix

*A*(or

*B*) as input variables.

*f*

_{o}represents the expected value of the ecosystem service across all parameter simulations represented in

*A*, while the denominator of Equations 11 and 12 represents the total variance of the ecosystem service across all parameters in

*A*. The main disadvantage of VD is the computational cost (here

*N*(

*k*+ 2) simulations).

General input-output variable relationships were explored and identified using a binned quantile regression approach (Jucker et al. 2017). We divided the scatterplot points into ten bins with an equal number of points by calculating corresponding quantile points (10%, 20%, ……, 90%) and using these quantile points to separate scatterplot points; we then estimated the median values of output and input for each bin, and performed a regression exploring both linear and nonlinear relationships (Figure 3). Three possible forms of quantile regression for the medians (e.g., linear [*c* = 0], convex [*c* > 0], concave [*c* < 0]) were employed to differentiate different input-output relationships using:
13

Note that the binned quantile regression approach provides us with a “median” response which is less impacted by outliers.

The best form for each input-output relationship was selected based on the Akaike information criterion (Wasserman 2013), with the constraint that all the parameters should also be statistically significant. To compare the degree of concavity or convexity among different input-output relationships, regression models were built by standardizing the response and explanatory variables by subtracting the mean from each value and dividing by the standard deviation. A larger |*c*| value indicates greater concavity or convexity. This approach does more than simply explore the input-output relationships from the equations within Eco, but allows us to discover general relationships between correlated variables and model output. An example of the binned regression for the deposition velocity of NO_{2} as a function of PAR is shown in Figure 3.

## RESULTS

### MOAT Analysis

The MOAT analysis produced a mean absolute value (*μ**) and standard deviation (*σ*) of the output elementary effect for each input variable across the 20 simulations. For isoprene (Figure 4a), leaf biomass had the greatest *μ**, indicating its large impact on model output; temperature and PAR had around the same magnitude of *μ**, indicating their moderate effects on model output. For monoterpenes (Figure 4b), leaf biomass (*μ** = 6.74) was slightly more important than temperature (*μ** = 4.09), and both had high values of *σ*.

Among the three input variables examined for carbon storage (Figure 4c), the effect of DBH on the outputs (high *μ** and *σ*) was much larger than height and land use. The high *σ* for DBH indicated nonlinear effects of this input variable on carbon storage. For carbon sequestration (Figure 4d), two additional input variables related to tree health (condition) and site condition (CLE) were also examined. The five input variables can be grouped into three categories according to *μ** and *σ*: most important (DBH), moderate importance (condition, height, CLE), and negligible importance (land use).

Eco-D calculates dry deposition of air pollution to trees based on the dry deposition velocity and the air pollutant concentration and assumes that the pollutants do not damage plant functions. For dry deposition velocity (*V*_{d}) of NO_{2} (Figure 4e), temperature and LAI were the most important, as indicated by a high *μ**. While they had similar *μ**, temperature had a higher *σ* than LAI, indicating that its relationship to *V*_{d} of NO_{2} was much less linear than for LAI. RH, wind speed, and PAR all had smaller *μ** than temperature and LAI, indicating their more moderate and similar effects on *V*_{d} of NO_{2}; these variables all had a higher *σ* than LAI, indicating a less linear relationship to *V*_{d} of NO_{2}. Pressure had a negligible effect, as indicated by low *μ** and *σ*. For *V*_{d} of SO_{2} (Figure 4f) and O_{3} (Figure 4g), a similar pattern of ranking based on *μ** was displayed: temperature > RH > LAI > wind speed > PAR > pressure. Among these, temperature had the greatest *μ** and *σ*, indicating its large impact on model output and a highly nonlinear relationship with output; RH, LAI, wind speed, and PAR clustered in the middle of the (*μ**, *σ*) plane, while pressure had negligible effect.

### Variance-Based Decomposition (VD) Analysis

While MOAT is computationally cheap, it cannot fully explore the input space, has difficulty with categorical input variables, and assumes input variables are uncorrelated. To overcome these disadvantages, VD was employed in the study. In addition to four direct input variables (genus, leaf biomass, temperature, and PAR), the sensitivity of one indirect input variable (CLE) was also examined, because previous studies have demonstrated that altering CLE values could greatly affect BVOC emissions (Pace et al. 2018). For isoprene (Figure 5a), the order of importance measured by *S*_{Ti} was genus > leaf biomass > temperature > CLE > PAR. Genus played a dominant role, since base emission rates were assumed to be based on genera. The differences between *S*_{i} and *S*_{Ti} for all five input variables were large, indicating significant interaction effects between input variables. For monoterpenes (Figure 5b), the order of importance measured by *S*_{Ti} was genus > leaf biomass > temperature > CLE; large interaction effects also existed for all input variables as indicated by the large differences between *S*_{i} and *S*_{Ti}.

Among the four input variables that determine carbon storage (Figure 5c), DBH played the most dominant role, species had a smaller role, while height and land use had negligible effects. For carbon storage, the model interaction effects were small, as indicated by the small difference between *S*_{i} and *S*_{Ti}. For carbon sequestration (Figure 5d), DBH again had the largest effect, while condition and CLE had moderate effects; the other three input variables (species, height, and land use) all had negligible effects. The differences between *S*_{i} and *S*_{Ti} for DBH, condition, and CLE indicated the existence of important interaction effects between these variables and carbon sequestration.

For *V*_{d} of NO_{2} (Figure 5e), LAI and PAR had the most dominant effect, while RH, temperature, and wind speed had smaller effects, and pressure had a negligible effect. The small differences between *S*_{i} and *S*_{Ti} for these variables indicated minimal interaction effects between these input variables and the output. For *V*_{d} of SO_{2} (Figure 5f) and O_{3} (Figure 5g), the pattern of rankings of importance were similar: PAR had the largest effect, followed by RH, LAI, temperature, and wind speed with moderate effects, and pressure with a negligible effect. For PAR and RH, interaction effects were detected by the differences in S_{i} and *S*_{Ti}.

### Bin Regression Analysis

While VD can quantify the effects of individual input variables and their interaction impacts on output variability, it does not identify general input-output relationships. For those input variables identified as important in VD (unordered categorical input variables were excluded), a bin quantile regression analysis was employed, and the general input-output relationships at the aggregated level were described using regression form, regression coefficients, and the adjusted *R*^{2} (Table 4). Note that by binning the data we do inflate the adjusted *R*^{2} of the models; our goal here is to generally describe the median model response.

Leaf biomass showed a concave relationship with both emissions of isoprene and monoterpenes (Eco-B outputs), as indicated by a significant negative parameter on *X*^{2} (a significant positive parameter would be convex). With an increase in leaf biomass, the emissions increased at a faster rate initially, then at a slower rate, and finally became relatively insensitive to the change in leaf biomass. The concavity of the monoterpenes vs. leaf biomass relationship was greater than that of the isoprene vs. leaf biomass relationship. Although leaf biomass of an individual genus at a specific temperature affected BVOC emissions in a linear manner (Equation 1), with different base rates from different genera and the nonlinear relationship between temperature and the temperature correction factors (Equations 2 and 3), the combined effect across all genera and temperature in the simulation produced a concave relationship. CLE showed a linear relationship with both isoprene and monoterpenes emissions, and their adjusted *R*^{2} values were lower than that of leaf biomass, indicating a weaker relationship with BVOC emissions when compared with the leaf biomass and BVOC relationship. By contrast, temperature showed a convex relationship with both BVOC emissions, indicating that the output responded relatively insensitively at low and moderate values of temperature, but then increased strongly as the temperature increased; the degrees of convexity, measured by | *c* | were 0.49 and 0.39 for isoprene and monoterpenes, respectively. This result makes sense, since emission increased nearly logarithmically at higher temperatures. Unlike leaf biomass and temperature, PAR showed a strong linear relationship with isoprene emissions. The relationships between binned median inputs and median Eco-B output variables were generally strong, with adjusted *R*^{2} values ranging from 0.87 to 1.00.

DBH showed a convex relationship with carbon storage, while it had a linear relationship with gross carbon sequestration; these relationships had a high adjusted *R*^{2}. By contrast, the linear form used to describe the relationships between tree condition and CLE to gross carbon sequestration had a relatively low adjusted *R*^{2}, indicating either a weaker input-output relationship, or the model we proposed to represent the binned quantile regression (Equation 8) was inadequate to capture this relationship.

PAR showed concave relationships with *V*_{d} of NO_{2}, SO_{2}, and O_{3}, with the degree of concavity of NO_{2} > O_{3} > SO_{2}; *V*_{d} of all three gases also showed concave relationships with temperature with similar degrees of concavity. For all other input variables, linear relationships were detected. LAI showed the strongest linear relationship with *V*_{d} (high adjusted *R*^{2}), while RH showed the weakest linear relationship (low adjusted *R*^{2}). This may indicate *V*_{d} responds to RH and interacts with other input variables in more complex ways. The dry deposition processes of all three gases showed similar regression patterns with input variables.

## DISCUSSION

### The Emissions of Isoprene and Monoterpenes

The amount of BVOCs emitted was affected by tree physiology (e.g., genus, leaf biomass) and environmental factors (e.g., temperature). The VD analysis indicated that genus had the largest influence on the emissions of isoprene and monoterpenes, and there were strong interaction effects among input variables, as indicated by the large differences between *S*_{i} and *S*_{Ti} and the high *σ* in the MOAT analysis. In Eco-B, each genus has a specific *B _{E}* value, varying from 0 (e.g.,

*Pyrus*) to 70 (e.g.,

*Liquidambar*) ug/g

^{−1}/hr

^{−1}for isoprene emissions, and from 0 (e.g.,

*Pyrus*) to 7.9 (e.g.,

*Pistacia*) ug/g

^{−1}/hr

^{−1}for monoterpenes emissions. Misidentifications of tree species may or may not result in large changes in BVOC emissions; however, misidentifications of tree genera could produce large deviations in emissions, as demonstrated in one study in Munich, Germany, that showed isoprene emissions decreased and monoterpene emissions increased primarily due to misclassifications of tree species (Pace et al. 2018). Genus-level BVOC emission rates are used. If species-specific emissions are demonstrated to provide more accurate estimates, the model can be updated to use species-specific BVOC emission rates. Note that in the MOAT analysis,

*μ** for both Eco-B outputs were much larger in magnitude than for Eco-C and Eco-D outputs. This difference may be because a strong interaction existed among these input variables (high

*σ*). High VOC-emitting genera can increase the magnitude of EE.

The roles of other factors (e.g., leaf biomass, temperature, PAR, and CLE) were highly affected by the selection of a specific genus. As expected, the effects of CLE were smaller than leaf biomass, because CLE’s role is mainly through its effect on leaf biomass. Eco-A calculates leaf biomass using different approaches when CLE changes between classes (e.g., 0~1, 2~3, and 4~5). The same environmental factors may have a smaller influence if low-emitting genera are selected, and a larger influence when high-emitting genera are selected. Leaf biomass affected BVOC emissions in a concave manner (Table 4), which contradicts the linear relationship expressed in Equation 1. This is probably because in the SA, we resampled values of genus, leaf biomass, PAR, and temperature across the entire parameter space, and therefore these other parameters, in addition to leaf biomass, also varied across different simulations. The concave relationships may be due to the simultaneous changes of all the variables and the interactions between the variables across the sample space. The effect of tree physiology (e.g., genus and leaf biomass) is greater than that of environmental factors (e.g., temperature, sunlight), which indicates planting of low-emitting genus is a good strategy to help prevent BVOC emissions (Nowak and Crane 2000). Temperature showed a convex relationship with both BVOC emissions, and extreme temperature can strongly increase BVOC emissions (Sharkey et al. 1991; Calfapietra et al. 2013). This may be due to the physiological functions of plants to protect against heat stress (Bonan 2016), and cooler environments can also reduce BVOC emissions. While genus and temperature both impact BVOC emissions, an urban planner generally has limited control over temperature; thus, maximizing the use of low VOC-emitting trees is an efficient strategy to prevent and reduce BVOC emissions.

### Carbon Storage and Sequestration

DBH played a dominant role for both carbon storage and sequestration (high *μ** and *S*_{Ti}) while land use and tree height had negligible effects (low *μ** and *S*_{Ti}). These findings indicate that in terms of data collection, increasing the accuracy of DBH measurements is an effective way to reduce model output uncertainty. DBH is important because allometric equations typically have a nonlinear relationship (e.g., exponential or log-log) between DBH and carbon storage, which means that even small errors in DBH measurements can lead to inaccurate carbon estimators (Jucker et al. 2017). These results suggest large trees play a more important role in estimators of carbon than small trees; large trees (DBH > 77 cm) store approximately 1,000 times more carbon than small trees (DBH < 7 cm)(Nowak 1994). Sustaining the health of large trees, and accurately measuring the DBH of large trees, is critical for carbon management and estimation.

Given the high correlation between DBH and height, it is counterintuitive that height had a small influence on carbon estimation (low *μ** and *S*_{Ti}). This difference is likely because some allometric equations do not use height as an explanatory variable (Equation 5). To compete for sunlight, trees often initially allocate resources to maximize tree height growth and approach their maximum height rapidly and then invest resources in diameter growth (Jucker et al. 2017). This makes carbon estimation based on height alone problematic, because trees of similar height can have very different woody biomass. In addition, urban trees are more open-grown than trees in forested stands. However, McPherson et al. (2016) state that allometric equations using both DBH and height as predictor variables typically have higher accuracy than corresponding allometric equations employing only DBH. Although not as large as previously thought, species also affected the estimations of carbon storage (*S*_{Ti} = 5.4%) and sequestration (*S*_{Ti} = 6.5%) mainly through the selection of allometric equations of different forms and coefficients, which reflects differences among wood density and the water content of species (McPherson et al. 2013). This finding indicates that employing species-specific allometric equations could improve the accuracy of carbon estimators. Some studies also indicate that developing urban-specific allometric relationships is necessary (McHale et al. 2009). However, urban-specific allometric equations are scarce and location-specific, and the very few that exist are usually developed based on street trees (McPherson et al. 2016). Employing allometric equations developed for street trees for other urban trees may be problematic, as growing conditions vary widely across the urban environment. The current approach adopted in Eco-C, namely to use forest-based allometric equations and apply a biomass correction factor if it is an open-grown environment (Nowak et al. 2013), is popular in urban forestry, although some studies criticize that this simple correction results in conservative estimates of biomass (Aguaron and McPherson 2012). Future studies to develop urban-specific allometric equations and to perform uncertainty analyses of the biomass correction factor are needed.

For carbon sequestration, two additional input factors (CLE and condition) were examined. CLE represent site characteristics that impact sunlight to the tree crown, while condition is a measure of tree health. These two factors affect carbon estimators by adjusting tree growth rates. Both MOAT and VD showed that CLE and condition played moderate roles in model output (Figures 4 and 5), and the bin quantile regression analysis showed that carbon sequestration tended to increase linearly with the increase of CLE and condition (Table 4). These linear relationships may inadequately capture input-output relationships, as indicated by the low values of adjusted *R*^{2} (0.59 and 0.78). Consistent with the results from the bin quantile regression, VD also showed that the effects of CLE and condition may depend on the selection of other input factors, as indicated by the large difference between *S*_{i} and *S*_{Ti}. This difference may be due to the responses of different species, which may have different tolerances to abiotic and biotic stressors. This finding suggests that adaptive management approaches are needed to enhance and sustain forest health to maximize carbon sequestration and storage.

### Air Pollution Removal by Dry Deposition Processes

Both MOAT and VD analyses revealed that LAI was the most important factor for *V*_{d} of NO_{2}, and an important factor for *V*_{d} of SO_{2} and O_{3} (Figures 4 and 5). Bin quantile regression analysis showed that dry deposition of all three gases tended to respond linearly to LAI, which is consistent with the analysis performed by Hirabayashi et al. (2011). However, this linear relationship is unlikely to be maintained with an increase of LAI. As LAI increases, there is more chance of overlay among the leaf distribution, as well as resource (e.g., nutrients) and microenvironment (e.g., light) limitations (Chapin et al. 2011). Although trees adapt to these limitations, the leaves tend to behave at a lower efficiency as LAI increases, and thus are unlikely to maintain a linear relationship with *V*_{d} at higher LAI values (Van der Zande et al. 2009).

RH was an important factor for *V*_{d} of NO_{2}, SO_{2}, and O_{3} (Figures 4 and 5). Eco calculates stomatal conductance by employing the Ball-Berry model, which is based on empirical relationships between stomatal conduction and photosynthesis from numerous leaf gas exchange measurements (Medlyn et al. 2011). During the process, Eco assumes RH is equivalent to the relative humidity at the leaf surface (Hirabayashi et al. 2011). Therefore, RH affects the stomatal conduction directly as an input to the Ball-Berry model, as well as indirectly through the influence of net leaf photosynthesis. Dry deposition of all three gases showed linear behavior with the change of RH (Table 4), which is consistent with the result obtained by Hirabayashi et al. (2011). However, a linear model may be inadequate to fully capture the response of *V*_{d} to RH, as indicated by relatively low adjusted *R*^{2} values (Table 4). Bonan (2015) shows that stomatal conductance responds to vapor pressure deficit linearly but at different rates with changing temperature, which indicates the linear model is not completely justified to represent the response of *V*_{d} to RH. The large differences between *S*_{i} and *S*_{Ti} also indicate that the response of *V*_{d} to RH may depend on other factors. Our conclusion regarding the relationship between RH and *V*_{d} is based on applying SA to the Ball-Berry model. However, the Ball-Berry model has been criticized for employing leaf surface RH as a direct input, because stomata sense and respond directly to transpiration water fluxes (Medlyn et al. 2011). Therefore, it is the leaf-to-air vapor pressure deficit rather than RH that actually drives this process. Future model development should directly incorporate vapor pressure deficit into the calculation of the stomatal conductance.

Wind speed and pressure were among the least important factors for dry deposition (Figures 4 and 5). Wind speed is important for determining aerodynamic and quasi-laminar boundary layer resistances, because high winds reduce the thickness of the boundary layer of still air around each leaf and produce steeper gradients for the exchange of elements (e.g., water vapor, carbon dioxide) between the leaf surface and the atmosphere (Chapin et al. 2011). However, canopy resistance, which is the main driver of dry deposition, does not appear to be affected by wind speed (Hirabayashi et al. 2011); as a result, wind speed plays a minor role in determining dry deposition. Pressure also appeared to have a limited relationship with *V*_{d} (Figures 4 and 5). Pressure affects stomatal conduction indirectly through the computation of the direct and diffuse PAR (Hirabayashi et al. 2011).

Temperature and PAR affect the opening and closing of stomata during photosynthesis and transpiration (Bonan 2015). For the photosynthetic process, PAR mainly affects light-harvesting reactions, while temperature mainly influences carbon-fixing reactions through the control of enzymes (Bernacchi et al. 2013). For transpiration processes, water is taken up by the tree’s roots, transported vertically by the xylem, and evaporates from the leaf surface to the ambient atmosphere through a water potential gradient. The entire process is driven by vapor pressure deficit and radiative forcing, which is closely correlated to temperature and light intensity (Will et al. 2013). The MOAT and VD analyses gave different conclusions about the *V*_{d} response to temperature and PAR. MOAT indicated that temperature was the most important factor, while VD indicated that PAR had the largest effect. The differences may be due to the different approaches of the two methods, as well as the nonlinear response curves of *V*_{d} to temperature and PAR. MOAT is based on changing one factor at a time across its entire range while fixing all other factors at certain values, and therefore may not fully explore the parameter space (a 6-dimensional space in this case). VD is a global method based on the simultaneous change of all input factors among their PDFs. The typical response curve of stomatal conduction to PAR for many species is that conduction increases with increasing PAR up to about 300 to 400 W/m^{2}, and is relatively insensitive to further increases in PAR (Jones 2013). When the randomly selected levels used to calculate EE lie outside the range of 300 to 400 W/m^{2}, a conclusion that conduction is insensitive to PAR may be obtained. The response curve of *V*_{d} to PAR was concave in the bin quantile regression analysis (Table 4). *V*_{d} also responded to temperature in a concave way but with less concavity than with PAR (Table 4). Bonan (2015) shows that there exists an optimal temperature for the response of stomatal conductance, with the values being 15 to 25 °C for most C_{3} plants (plants that don’t have photosynthetic adaptations to reduce photorespiration). The different response mechanisms of *V*_{d} to PAR and temperature indicate potential model inadequacies to differentiate these response curves. More systematic modeling practices, which can capture different *V*_{d} response mechanisms (e.g., threshold and optimal points), are needed in the future.

PAR, LAI, RH, and temperature all play important roles in dry deposition processes. The VD analysis showed that PAR and RH were the most sensitive variables for *V*_{d} of SO_{2} and O_{3}, while they were less sensitive for *V*_{d} of NO_{2} (Figure 5). This may be because there were greater changes in *V*_{d} for SO_{2} and O_{3} than for NO_{2} when input variables were altered similarly. For example, when increasing PAR from 50 to 600 W/m^{−2}, *V*_{d} of SO_{2} and O_{3} increased by 0.31 and 0.33 cm/s^{−1}, respectively, which is larger than that of NO_{2} (0.18 cm/s^{−1}). The variations of *V*_{d} with respect to PAR for these three gases are detailed in Figure 6. Hirabayashi et al. (2011) also observed similar patterns in their sensitivity analysis of Eco-D in Baltimore. These different degrees of sensitivity may be due to different parametrization schemes for the calculation of the quasi-laminar boundary layer and canopy resistances, where NO_{2} shows higher values for mesophyll and cuticular resistances compared with SO_{2} and O_{3} (Hirabayashi et al. 2015). For PAR and LAI, Eco employs one of the best available processes in the literature to scale up from the leaf to canopy, and the interaction between PAR and LAI is fully captured by different components of PAR (e.g., direct, diffuse) and LAI (e.g., sunlit, shaded)(Hirabayashi et al. 2011). For RH and temperature, single values, instead of vertical profiles around canopy height, are employed in Eco, which may constrain the model performance. In addition, the model assumes that leaf temperature is equal to air temperature, while leaf traits (e.g., hair) and properties (e.g., latent heat exchange) may make leaf temperature differ from air temperature (Yu et al. 2015). Future model development may focus on the improved representation of RH and leaf temperature to reduce the uncertainties of model outputs.

## CONCLUSIONS

In this study, sensitivity analyses (SAs) were performed to investigate how the characteristics of the Eco inputs impact the ecosystem services and disservices of urban trees predicted by this model. Here the focus is on the inputs to three Eco modules: BVOC emissions (Eco-B), carbon storage and sequestration (Eco-C), and dry deposition velocity of air pollutants (Eco-D). Two SAs with different theoretical foundations are employed. Morris one-at-a-time (MOAT) is based on changing one factor at a time across its entire range to see what effect it has on the output, while variance decomposition (VD) is based on decomposing the variance of the output into different fractions which can be attributed to different inputs or their interactions. The results provide useful information for future urban Forest Inventory and Analysis (FIA) data collections (https://www.nrs.fs.fed.us/fia/urban), model uncertainty analyses, and urban forest management.

Genus has the largest influence on BVOC emissions by determining base emission rates and the interaction between genus and other input factors. Temperature shows a convex relationship with both BVOC emissions, indicating that BVOCs increase at a greater rate with temperature as temperature increases (Sharkey et al. 1991). High temperatures can strongly increase BVOC emissions. Leaf biomass has a concave relationship with BVOCs, indicating that the emissions increase at a faster rate initially, then at a slower rate, and finally become relatively insensitive to the change in leaf biomass; understanding these inflection points from sensitive to insensitive is important to control BVOC emissions while maximizing other leaf related ecosystem services. PAR has a linear relationship with isoprene emissions. These findings indicate that maximizing the use of low VOC-emitting trees is an efficient strategy to prevent and reduce BVOC emissions, and maintaining cooler environments (e.g., through tree transpiration) can also help to reduce BVOC emissions.

DBH has the greatest influence on carbon storage and sequestration provided by urban trees, and carbon storage tends to increase in a convex manner as DBH increases. Unlike relationships among input variables and BVOC emissions, which show strong interactions, the combined interactions between DBH and the other input variables and their influence on carbon storage and sequestration appears minimal. These results indicate that increasing the accuracy of DBH measurements, especially for larger trees, is critical for accurate carbon estimators. Employing species-specific allometric equations can also improve the accuracy of carbon estimators. Effort should be spent on improving current or developing new biomass equations in urban environments whenever possible. By contrast, tree height and land use appear to play minimal roles in carbon storage and sequestration. For carbon sequestration, tree condition and CLE are also important. Maintaining a good site environment and tree health are critical to maximizing carbon storage and sequestration provided by trees.

For the dry deposition processes, PAR, LAI, RH, and temperature all play important roles, with PAR and LAI generally having the largest influence. Dry deposition velocity is sensitive to LAI and RH in a nearly linear manner, while it is sensitive to temperature and PAR in a concave manner. Air pressure has almost no influence on dry deposition, while wind speed has a minimal influence. The interaction between the input variables and their influence on dry deposition velocity is also minimal. There exists an optimal temperature for maximum dry deposition velocity, while PAR affects dry deposition velocity up to a certain threshold value. Representation of RH and temperature around the entire canopy space as a single value may constrain model performance. Future model development should focus on the improved representative of RH and leaf temperature.

Trees provide important ecosystem services and disservices to urban communities. This analysis provided insight into the relationships between tree characteristics, environmental factors, and model parameters on the estimated ecosystem services and disservices of urban trees. Results identified the most important input variables to i-Tree Eco, and future field plot campaigns should emphasize accurate measurements for these variables. In addition, several gaps regarding the representativeness of input variables and model parameters in Eco were identified, which may help to assist future model development. The knowledge gained from this analysis is beneficial to subsequent model uncertainty analyses, which in turn contributes to improved urban forest planning and management.

## ACKNOWLEDGMENTS

This research was supported by the United States Department of Agriculture (USDA) Forest Service with grant 16-DG-11132544-034 recommended by the National Urban and Community Forestry Advisory Council, grant 11-JV-11242308-112 from the USDA Forest Service Northern Research Station, and additional support from the SUNY ESF Department of Environmental Resources Engineering. Special thanks to Robert Hoehn for helping with i-Tree Eco runs.

## Footnotes

**Conflicts of Interest:**David J. Nowak reports originally developing and overseeing the current development of i-Tree Eco, a public domain model that is distributed for free. The remaining authors made no disclosures.

- © 2020, International Society of Arboriculture. All rights reserved.