Abstract
Austrian pine (Pinus nigra) is a valuable component of the urban landscape in the Midwestern USA. In this area, it is impacted by the fungal pathogen Diplodia sapinea, which causes a tip blight and canker on infected trees. While the disease can be managed through the application of fungicides and/or by preventing environmental conditions that are favorable for the pathogen, these practices only temporarily alleviate the problem. A more sustainable solution is to use resistant trees. The objective of this study was to evaluate whether Fourier-transform infrared (FT-IR) spectroscopy combined with chemometric analysis can distinguish between trees that vary in susceptibility to D. sapinea. Trees were phenotyped for resistance to D. sapinea by artificially inoculating shoots and measuring ensuing lesions seven days following inoculation. Then, three different chemometric approaches, including a type of machine learning called support vector machine (SVM), were used to evaluate whether or not trees that varied in susceptibility could be distinguished. Trees that varied in susceptibility could be discriminated based on FT-IR spectra collected prior to pathogen infection using the three chemometric approaches: soft independent modeling of class analogy, partial least squares regression, and SVM. While further validation of the predictive models is needed, the results suggest that the approach may be useful as a tool for screening and breeding Austrian pine for resistance to D. sapinea. Furthermore, this approach may have wide applicability in other tree/plant pathosystems of concern and economic value to the nursery and ornamental industries.
INTRODUCTION
Austrian pine (Pinus nigra) is an ecologically and economically important conifer tree species with a natural distribution around the Mediterranean basin. It is separated into six different subspecies spanning from western North Africa and the Iberian peninsula to the west, to Turkey and all the way into the Crimean peninsula to the east (Richardson and Rundel 1998). The type P. nigra subsp. nigra is mainly present down the Italian peninsula and spans the region that includes the Italian/Austrian Alps and the Dalmatian region (Richardson and Rundel 1998). Pinus nigra subsp. nigra and P. nigra subsp. pallasiana (mainly from Turkey) are widely planted in North America, particularly in the Midwestern USA, both as ornamental trees due to their deep green foliage and as windbreaks along major roadways due to desirable traits like deicing salt tolerance.
Like all trees, Austrian pine is susceptible to several fungal diseases, particularly under drought stress, and in the Midwestern USA the most significant disease is Diplodia tip blight and canker, caused by the ascomycete Diplodia sapinea (Fr.) Fuckel 1870 (syn. Diplodia pinea [Desm.] Kickx., Sphaeropsis sapinea [Fr.: Fr.][Dyko & Sutton]). In greenhouse experiments, drought-stressed Austrian pine were more susceptible to D. sapinea following artificial inoculation of the shoots (Sherwood et al. 2015), and the effects of water stress on the severity of symptoms associated with D. sapinea infection have also been demonstrated in red pine (P. resinosa) (Blodgett et al. 1997a). Diplodia tip blight and canker rarely kills trees in ornamental settings, although it is known to do so in both nursery and plantation settings, for example in red pine in the North Central states (Stanosz and Carlson 1996; Haugen and Ostry 2013). However, this pathogen often causes severe disfigurement that makes the ornamental use of Austrian and other susceptible pine species, such as Scots pine (P. sylvestris), problematic or impossible. Oftentimes, such disfigured trees must be removed from the landscape well before the end of their expected service life.
Integrated management for Diplodia tip blight and canker in ornamental settings includes the use of fungicides during the growing season and the alleviation of environmental conditions that favor the disease, such as insufficient water availability (Blodgett et al. 1997b; Stanosz et al. 2001). Both of these measures are only temporary and must be repeated on a regular basis, which can be costly and impractical, especially when dealing with larger size trees. A much better option would be to plant resistant trees (Boyd et al. 2013), i.e., those that have the ability to limit initial infection as well as tissue invasion by the pathogen once infected and hence express disease symptoms at much lower levels.
In other canker diseases, e.g., those produced by the pathogenic ascomycete Fusarium circinatum (cause of pitch canker on many pine species) and oomycete Phytophthora ramorum (cause of sudden oak death, a canker disease on oak and tanoak species), pathogen resistance can be assessed by measuring the length of lesions produced by artificial inoculation of the respective tree species. For example, it has been determined that if lesions produced by F. circinatum in Monterey pine (P. radiata) are limited by the host, the branches never become girdled and therefore the trees can be considered resistant (Gordon et al. 1998). Similarly, our group has shown that resistant coast live oak (Quercus agrifolia) can be identified by measuring the extent of lesions produced by P. ramorum (McPherson et al. 2014; Conrad et al. 2019).
Access to Diplodia-resistant Austrian pine would be a major advantage for the ornamental industry, where both producers and end users would benefit significantly. While inoculation-based approaches could be used to phenotype Austrian pine for resistance to D. sapinea, a nondestructive approach would be preferable, especially if it could be carried out considerably more rapidly.
Our group has recently demonstrated that trees can be successfully phenotyped for resistance before they become infected by using vibrational spectroscopic techniques, such as Fourier-transform infrared (FT-IR) spectroscopy, followed by chemometric analysis (i.e., multivariate statistical analysis of spectral data), in at least two pathosystems of high significance: sudden oak death of coast live oak in California and ash dieback of European ash (Fraxinus excelsior) in Europe (Conrad et al. 2014; Villari et al. 2018).
FT-IR spectroscopy is a chemical fingerprinting technique that measures the absorption of infrared radiation deriving from the vibration of molecular bonds in different chemical functional groups. The resulting spectra reflect the total biochemical composition of analyzed samples in a given tissue and at a given time and can be used in multivariate statistical models to identify qualitative and quantitative differences among different groups of samples. In particular, soft independent modeling of class analogy (SIMCA) can be used to classify qualitative phenotypic traits, while partial least squares regression (PLSR) can be used to predict quantitative phenotypic traits (Conrad and Bonello 2016). Support vector machine (SVM), a type of machine learning, provides another option for classifying trees based on spectral differences (Abdel-Rahman et al. 2014).
The main goal of this study was to test if this technique can be used to phenotype ornamental Austrian pine for resistance to D. sapinea.
MATERIALS AND METHODS
Plant Material
One hundred and twenty, four-year-old, open-pollinated Austrian pine trees growing in one-gallon (3.7-L) plastic pots (plants kindly donated by Willoway Nursery, Madison, OH, USA) were moved from a gravel lot located on The Ohio State University (OSU) campus in Columbus, OH, to the OSU Department of Plant Pathology greenhouse on March 11, 2015. Trees were given approximately 2 L of water daily at 8:00 am and received supplemental lighting on a 14-hour regimen beginning at 6:00 am. Trees were fertilized to pot capacity with 200 ppm of Jack’s Professional 20-10-20 N:P:K water-soluble fertilizer (JR Peters Inc., Allentown, PA, USA) on March 12, March 23, and April 6, 2015. Trees began to break dormancy around April 6, 2015, as noted by budbreak. On April 29, 2015, 103 trees were selected for further experimentation based on plant uniformity, appearance, and bud phenological stage (i.e., selecting those in the elongation phase at that time). Trees showing poor growth, trees with only a few viable shoots, and trees with shoot damage (e.g., from previous D. sapinea infections) were not selected.
Candles were inoculated on April 30, 2015. Using a sterile scalpel, a small wound was created at a fascicle scar (around 1 mm in size) approximately 8 cm from the shoot tip (or at the midway point on the shoot for smaller shoots). Then, a 3-mm plug of D. sapinea growing on potato dextrose agar, taken from the margin of an actively growing colony, was placed mycelium-side down on the wound, and the inoculation site was sealed with parafilm M (Structure Probe, Inc., West Chester, PA, USA) to minimize contamination and desiccation. The strain of D. sapinea used in this experiment was the same used in previous, published work (Sherwood and Bonello 2013) and was originally isolated from symptomatic pine cones of an Austrian pine tree growing on the OSU campus in Columbus, OH. Each tree was inoculated on two separate shoots, even if only one shoot was eventually used for analyses. The dominant apical leader shoot was never selected, but shoots were otherwise chosen randomly. At the same time, two other noninoculated shoots were removed for spectroscopy and chemometric analysis of the constitutive chemical composition, excluding again the dominant apical shoot from the selection process. Harvested tissue was immediately frozen in liquid nitrogen and stored at −80 °C. Of the 103 trees that were artificially inoculated with D. sapinea, tissue from only 79 trees was analyzed with FT-IR spectroscopy due to an insufficient amount of available tissue from the excluded trees.
At the time of inoculation, stem diameter at 15 cm above the soil line and total plant height (not including the dominant apical leader shoot) were measured. The length of all inoculated and harvested shoots was measured, as was the length of the dominant apical leader shoot.
Assessment of Relative Resistance
Seven days post inoculation, one of the inoculated shoots selected at random was excised, and the epidermis was removed to measure lesion lengths as a proxy of resistance. At this time, approximately 50% of the trees were beginning to show symptoms of infection (needle chlorosis and visible necrosis around the inoculation site).
To build the SIMCA chemometric model and for the SVM analysis, we selected only those trees corresponding to the quartiles of the lesion length distribution that represented the most susceptible and most resistant trees, i.e., the 25% with the longest lesions (quartile four; hereafter referred to as susceptible) and the 25% with the shortest lesions (quartile one; hereafter referred to as resistant), corresponding to 42 samples. To build the PLSR chemometric model, all 79 samples, including those showing an intermediate phenotype (i.e., second and third quartile in the lesion length distribution; hereafter referred to as intermediate), were included in the analysis. In addition, a Welch’s two-sample t-test, which does not assume equal variances, was performed to compare average lesion lengths between susceptible and resistant trees. Finally, a one-way ANOVA with Type III sum of squares was used to confirm that there were no significant differences in stem diameter or tree height between resistant, susceptible, and intermediate groups using the R package “car” (Fox and Weisberg 2011). Assumptions of normality and homogeneity of variance were tested with the Shapiro-Wilk and Levene’s tests, respectively. All statistical tests were performed in R version 3.5.2 (R Core Team 2018).
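The quartile selection and the Welch comparison described above can be illustrated with a minimal sketch. It is written in Python rather than the R used in the study, the lesion lengths are hypothetical, and the rule of taking the floor of n/4 trees per tail is an assumption; the study's own quartile assignment yielded 21 trees per group. The function names are illustrative only.

```python
import math
import statistics


def quartile_groups(lesions_mm):
    """Return (resistant, susceptible): the shortest and longest quarter of lesions."""
    ordered = sorted(lesions_mm)
    k = len(ordered) // 4  # assumption: floor(n / 4) trees in each tail
    return ordered[:k], ordered[-k:]


def welch_t_test(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    vb = statistics.variance(b) / len(b)  # squared standard error of mean(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical lesion lengths (mm) for 16 trees; these are not the study's data.
lesions = [2, 3, 4, 5, 8, 10, 12, 15, 18, 22, 27, 33, 41, 52, 64, 78]
resistant, susceptible = quartile_groups(lesions)
t_stat, dof = welch_t_test(resistant, susceptible)
print(resistant, susceptible, round(t_stat, 2), round(dof, 2))
```

Because the resistant group is passed first, a negative t statistic indicates shorter lesions in resistant trees. The fractional degrees of freedom are a normal consequence of the Welch-Satterthwaite approximation.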
Tissue Extraction for Chemical Fingerprinting
Shoot tissue corresponding to the position of the inoculations on inoculated shoots was excised from non-inoculated shoots harvested at the time of inoculation and ground in liquid nitrogen using a mortar and pestle to a fine and homogeneous powder. Aliquots of the ground tissue (200 ± 1 mg) were transferred to individual 2-mL microcentrifuge tubes and stored at −80 °C until extraction, which was carried out according to Wrolstad (2005) with modifications (see Villari et al. 2018 for details). This extract was then used for chemical fingerprinting analysis.
Chemical Fingerprinting
Seven μL of each purified extract were analyzed on an Excalibur 3500GX FT-IR benchtop spectrometer (Digilab, Randolph, MA, USA), equipped with a potassium bromide beamsplitter and a MIRacle triple-bounce zinc selenide crystal (Pike Technologies, Madison, WI, USA) attenuated total reflectance (ATR) accessory. Extracts were vacuum-dried on the surface of the ATR crystal, and spectra were collected over a wavenumber range of 700 to 4000 cm−1, which corresponds to the mid-infrared (mid-IR) region. Instrumental settings were as follows: resolution, 4 cm−1; number of scans co-added per interferogram, 64. Spectra were visualized using Win-IR Pro 288 Software (Agilent Technologies, Santa Clara, CA, USA) (Villari et al. 2018). For SIMCA and PLSR, chemometric analysis was carried out using the modeling software Pirouette (v. 4.5, Infometrix Inc., Bothell, WA, USA), and spectral data were normalized, smoothed, and transformed into their second derivative prior to analysis to remove multiplicative scatter and particle size interference, resolve overlapping peaks, and increase the signal-to-noise ratio (Savitzky and Golay 1964; Barnes et al. 1989; Conrad and Bonello 2016). Support vector machine (SVM) analysis was carried out using the R package “e1071” (Meyer et al. 2019).
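As an illustration of the spectral pretreatment, the following sketch normalizes a spectrum and applies a Savitzky-Golay second derivative. It is not the Pirouette implementation used in the study. The vector-length normalization, the quadratic polynomial order, the combination of smoothing and differentiation in a single filter, and the synthetic spectrum are all assumptions made for the example.

```python
import numpy as np


def savgol_second_derivative(y, window=35, polyorder=2, spacing=1.0):
    """Savitzky-Golay second derivative via a local least-squares polynomial fit."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Columns are offsets**0, offsets**1, ..., offsets**polyorder.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 2 of the pseudo-inverse gives the fitted quadratic coefficient a2;
    # the second derivative at the window centre is 2 * a2 / spacing**2.
    weights = 2.0 * np.linalg.pinv(design)[2] / spacing ** 2
    # The weights are symmetric, so convolution and correlation coincide.
    return np.convolve(y, weights, mode="valid")


def preprocess(spectrum, window=35):
    """Scale a spectrum to unit vector length, then take its SG second derivative."""
    spectrum = np.asarray(spectrum, dtype=float)
    normalized = spectrum / np.linalg.norm(spectrum)
    return savgol_second_derivative(normalized, window=window)


# Synthetic spectrum (hypothetical): one Gaussian band on a sloped baseline.
x = np.linspace(700.0, 4000.0, 1000)
spectrum = np.exp(-((x - 1600.0) / 40.0) ** 2) + 0.0001 * x
derivative = preprocess(spectrum)
print(derivative.shape)
```

The second derivative removes the linear baseline and turns the absorbance band into a negative lobe at the band centre, which is why this transformation helps to resolve overlapping peaks. The output is shorter than the input because window positions that would extend past either end of the spectrum are dropped.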
The SIMCA model, built only with the 21 most susceptible and 21 most resistant trees, was optimized by initially including all data collected within the mid-IR region and by progressively excluding those regions with lower discriminating power between the most susceptible and most resistant trees, so that only those regions with high discriminating power were eventually included in the model. Incrementally refined models were each visualized in the SIMCA 3D class projection and Coomans plots (Coomans and Broeckaert 1986) and evaluated by observing clustering patterns of the different phenotypes. At the same time, outliers identified visually and using the outliers diagnostics plot were excluded. This included spectra with lower absorbance, which may have resulted from technical variation associated with the collection of individual FT-IR spectra and/or the extract itself. The total number of samples included in the final model was 29 (18 resistant and 11 susceptible).
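The principle behind SIMCA can be sketched as follows: each class is described by its own principal component model, and a sample is compared with every class model through its residual distance. This is a simplified illustration, not Pirouette's algorithm. It assigns each sample to the single nearest class, whereas full SIMCA uses statistical thresholds and may assign a sample to both classes or to neither. The class name `ClassModel`, the function `simca_predict`, the number of components, and the synthetic data are all hypothetical.

```python
import numpy as np


class ClassModel:
    """PCA model of one class, as used in SIMCA-style classification."""

    def __init__(self, X, n_components):
        self.mean = X.mean(axis=0)
        centered = X - self.mean
        # Rows of vt are the principal axes of the centered class data.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        self.axes = vt[:n_components]
        residuals = centered - centered @ self.axes.T @ self.axes
        dof = max(X.shape[0] - n_components - 1, 1)
        # Typical residual size of the training samples, used to scale distances.
        self.scale = np.sqrt((residuals ** 2).sum() / dof)

    def distance(self, x):
        """Residual distance of x from the class subspace, in units of self.scale."""
        centered = x - self.mean
        residual = centered - centered @ self.axes.T @ self.axes
        return np.linalg.norm(residual) / self.scale


def simca_predict(models, x):
    """Assign x to the class whose PCA model leaves the smallest scaled residual."""
    return min(models, key=lambda label: models[label].distance(x))


# Synthetic "spectra" (hypothetical): 21 trees per class, 50 spectral variables.
rng = np.random.default_rng(0)
n_vars = 50
shift = np.zeros(n_vars)
shift[10:20] = 5.0  # hypothetical band that differs between classes
resistant = rng.normal(0.0, 1.0, size=(21, n_vars))
susceptible = rng.normal(0.0, 1.0, size=(21, n_vars)) + shift
models = {
    "resistant": ClassModel(resistant, n_components=2),
    "susceptible": ClassModel(susceptible, n_components=2),
}
new_sample = rng.normal(0.0, 1.0, size=n_vars) + shift
print(simca_predict(models, new_sample))
```

Restricting the model to spectral regions with high discriminating power, as described above, corresponds in this sketch to selecting a subset of columns before the class models are fitted.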
The PLSR model, built on all 79 samples, was optimized with a similar strategy: visualization of the spectra and evaluation of the incrementally refined models was performed on the loading and scores plots, and outliers were identified as in Wilkerson et al. (2013). Model performance was also evaluated based on the outlier diagnostics, number of factors included, and leave-one-out cross-validation (Wilkerson et al. 2013; Conrad et al. 2014). The final model included 56 spectra (16 resistant, 30 intermediate, and 10 susceptible).
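For readers unfamiliar with PLSR, the sketch below fits a single-response PLS model with the NIPALS algorithm and obtains leave-one-out predictions. It is an illustration under stated assumptions rather than the Pirouette procedure: the function names, the synthetic low-rank "spectra", and the simulated lesion lengths are hypothetical, and no outlier removal or variable selection is performed.

```python
import numpy as np


def pls1_fit(X, y, n_factors):
    """Fit PLS1 by NIPALS; return regression coefficients and intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = E.T @ f
        w /= np.linalg.norm(w)       # weight vector
        t = E @ w                    # scores
        tt = t @ t
        p = E.T @ t / tt             # X loadings
        c = f @ t / tt               # y loading
        E = E - np.outer(t, p)       # deflate X
        f = f - c * t                # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef


def loo_predictions(X, y, n_factors):
    """Predict each sample from a model fitted to all other samples."""
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef, intercept = pls1_fit(X[keep], y[keep], n_factors)
        preds[i] = X[i] @ coef + intercept
    return preds


# Synthetic data (hypothetical): 56 samples, 100 variables, 3 latent factors.
rng = np.random.default_rng(1)
scores = rng.normal(size=(56, 3))
X = scores @ rng.normal(size=(3, 100)) + 0.1 * rng.normal(size=(56, 100))
y = 30.0 + 10.0 * scores[:, 0] + rng.normal(0.0, 2.0, size=56)
preds = loo_predictions(X, y, n_factors=4)
print(round(float(np.corrcoef(y, preds)[0, 1]), 2))
```

Each leave-one-out prediction comes from a model that never saw the predicted tree, so the agreement between `y` and `preds` gives a less optimistic picture than the fit to the training data.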
The SVM model, built using scaled raw spectral data collected across the mid-IR spectrum from the most resistant and most susceptible trees (N = 42), was optimized by using 10-fold cross-validation to identify the cost parameter that minimized the error rate. The cost parameter controls the width of the margin separating resistant from susceptible trees in the SVM model; in “e1071,” larger cost values penalize margin violations more heavily, so that the margin is narrower and fewer misclassified samples are tolerated in the training data set (James et al. 2014). Due to the large number of spectral variables relative to the number of trees (biological replicates), a linear kernel was used (James et al. 2014). To assess model performance, 10-fold and 5-fold cross-validation were performed, and the accuracy and area under the receiver operating characteristic (ROC) curve from prediction scores were calculated using the R packages “e1071,” “MLmetrics,” and “ROCR,” respectively (Sing et al. 2005; Yan 2016; Meyer et al. 2019).
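The tuning and assessment steps can be sketched as below. The example uses Python with scikit-learn as a stand-in for the R packages named above, so it is not the code used in the study. In scikit-learn's `SVC`, the `C` argument plays the role of the cost parameter. The synthetic spectra, the grid of candidate cost values, and the random seeds are assumptions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data (hypothetical): 21 trees per class, 200 spectral variables.
rng = np.random.default_rng(0)
n_per_class, n_vars = 21, 200
shift = np.zeros(n_vars)
shift[40:60] = 2.0  # hypothetical band that differs between classes
X = np.vstack([
    rng.normal(size=(n_per_class, n_vars)),
    rng.normal(size=(n_per_class, n_vars)) + shift,
])
y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = resistant, 1 = susceptible

# Scale each variable, then fit a linear-kernel SVM.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="linear"))
folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Choose the cost value with the best 10-fold cross-validated accuracy.
grid = {"svc__C": [0.001, 0.01, 0.1, 1, 10]}
search = GridSearchCV(pipeline, grid, cv=folds, scoring="accuracy")
search.fit(X, y)

# Out-of-fold decision scores for the selected model; positive means susceptible.
scores = cross_val_predict(
    search.best_estimator_, X, y, cv=folds, method="decision_function"
)
accuracy = accuracy_score(y, (scores > 0).astype(int))
auc = roc_auc_score(y, scores)
print(search.best_params_, round(accuracy, 3), round(auc, 3))
```

This sketch reuses the same folds for tuning and for assessment, which can make the accuracy estimate somewhat optimistic; nested cross-validation or an independent test set would avoid that.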
RESULTS
Relative Resistance
Lesion lengths varied quantitatively, ranging from virtually no lesion to lesions that were almost 80 mm in length (Figure 1). Trees belonging to the bottom and the top quartiles of lesion lengths (first and fourth quartiles, i.e., resistant and susceptible, respectively) were selected to build qualitative chemometric models. The average lesion lengths of resistant and susceptible trees differed significantly, with susceptible trees having mean lesions approximately six times as long as those of resistant trees (Figure 2; Welch’s two-sample t-test, t = −16.0, P < 0.0001, DF = 21.8). No significant differences in stem diameter or tree height were detected between resistant, susceptible, or intermediate trees (one-way ANOVA, P > 0.05).
Chemometrics
Soft Independent Modeling of Class Analogy (SIMCA)
SIMCA analysis of FT-IR spectra that had been normalized, smoothed (35 points), and transformed into their second derivative (35 points), restricted to the range of 698.6 to 1868.0 cm−1, could be used to distinguish between resistant trees, those with the smallest lesion lengths (first quartile), and susceptible trees, those with the largest lesion lengths (fourth quartile) (Figures 3 and 4). FT-IR spectra were collected from extracts of noninoculated shoots and thus reflect constitutive or preinfection composition and levels of plant chemicals. The model correctly classified 89.7% of trees in the trimmed data set (N = 29), with 100% (N = 18) of resistant trees correctly classified, and 72.7% (N = 11) of susceptible trees correctly classified, with an interclass distance of 2.07. The larger the interclass distance, the less likely the model is to classify a sample as both resistant and susceptible.
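The reported percentages can be traced back to tree counts with a short calculation. The count of 8 correctly classified susceptible trees is inferred here from the reported 72.7% of 11 and is not stated explicitly in the text; the helper name `percent` is illustrative.

```python
def percent(correct, total):
    """Percentage of correctly classified trees, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


resistant_total, susceptible_total = 18, 11
resistant_correct = 18   # 100% of the 18 resistant trees
susceptible_correct = 8  # inferred: 8 of 11 is the count that rounds to 72.7%

overall = percent(
    resistant_correct + susceptible_correct,
    resistant_total + susceptible_total,
)
print(percent(susceptible_correct, susceptible_total), overall)
```

Under this reading, 26 of the 29 trees retained in the model were classified correctly, and all three misclassified trees belonged to the susceptible group.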
Partial Least Squares Regression (PLSR)
PLSR analysis with four factors and leave-one-out cross-validation of FT-IR spectra that had been normalized, smoothed (25 points), and transformed into their second derivative (25 points), restricted to the range of 802.8 to 1852.6 cm−1, could be used to predict the length of D. sapinea lesions on Austrian pine (N = 56) (Figure 5). The standard error of cross-validation (SECV) was 12.23 mm, and the correlation coefficient of cross-validation (r_val) was 0.60, supporting a positive relationship between measured lesion lengths and those predicted from FT-IR spectra.
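The two summary statistics can be computed from paired measured and cross-validated predicted values as in the sketch below. It assumes the common definition of SECV as the root mean squared cross-validation residual with n in the denominator; the exact formula applied by the modeling software is not given in the text, and the lesion lengths shown are hypothetical.

```python
import math


def secv(measured, predicted):
    """Root mean squared difference between measured and cross-validated values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def pearson_r(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical measured and leave-one-out predicted lesion lengths (mm).
measured = [5.0, 12.0, 20.0, 33.0, 41.0, 60.0]
predicted = [14.0, 10.0, 27.0, 25.0, 48.0, 44.0]
print(round(secv(measured, predicted), 2), round(pearson_r(measured, predicted), 2))
```

SECV is expressed in the units of the response, here millimetres of lesion length, so it can be read directly against the range of measured lesions.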
DISCUSSION
Fourier-transform infrared spectroscopy combined with chemometrics shows great promise as an alternative tool for phenotyping trees for disease resistance and could provide a more rapid and high-throughput method for use in tree breeding programs, particularly in the ornamental and nursery industries. In this study, we show that the tool can be used to distinguish between Austrian pine trees that varied in susceptibility to D. sapinea, the causal agent of Diplodia tip blight, using three different chemometric approaches: SIMCA, PLSR, and SVM. The latter approach uses machine learning, a form of artificial intelligence, to identify the optimal model parameters for distinguishing between groups, specifically resistant and susceptible trees (Cortes and Vapnik 1995; Singh et al. 2016).
Inoculated Austrian pine trees showed varying levels of susceptibility (i.e., varying lesion lengths) to D. sapinea, indicative of a quantitative resistance response. Since susceptibility varied quantitatively, trees were separated into resistant (quartile one, smallest lesion lengths), intermediate (quartiles two and three, intermediate lesion lengths), and susceptible (quartile four, largest lesion lengths) groups. To develop classification-based predictive models for disease resistance, we focused on the tail ends of the lesion length distribution—resistant and susceptible trees. We have used this approach previously, with success, in other forest pathosystems (Conrad et al. 2014; Villari et al. 2018). Using SIMCA and SVM we were able to distinguish between these two groups. However, a non-negligible number of trees had to be trimmed from the final, optimized SIMCA model (13 out of a total N = 42), most of which belonged to the susceptible group. This suggests that there was more variation in the FT-IR spectra of trees classified as susceptible compared to resistant, which may have impacted the classification performance, particularly that of susceptible trees, and level of bias (due to outlier removal) of the SIMCA model. In contrast, with SVM, 100% of trees (N = 42) were included in the final, optimized model. Model accuracy was comparable, in particular for resistant trees, in both cases: 89.7% (N = 29) and 92.9% (N = 42) of trees in total were correctly classified in the SIMCA and SVM models, respectively. Taken together, these results suggest that SVM is a more desirable approach for the development of classification-based disease predictive models using FT-IR spectra, although the analysis of additional biological samples may help to improve the classification performance of both models. 
Ultimately, the level of allowable misclassification or the classification threshold should be based on the goals of the disease management and/or resistance screening program.
In this study, PLSR was also used to predict susceptibility of trees, via lesion length, based on preinfection FT-IR spectra. This method is an alternative to classification-based predictive approaches and has been used extensively in the literature, for example to predict the susceptibility of eucalyptus (Eucalyptus grandis) to Leptocybe invasa, a gall wasp (Naidoo et al. 2018), and to predict decay resistance in Scots pine (P. sylvestris) to Poria placenta, a brown rot fungus (Flæte and Haartveit 2004). As with SIMCA, some trees had to be trimmed from the final, optimized PLSR model (29% of N = 79), although in contrast to SIMCA and SVM, intermediate trees were also included. This suggests that the technique is also capable of detecting minor quantitative differences in susceptibility and not only differences between extreme groups (in our case, resistant versus susceptible).
Since the number of trees included in this study was limited, further evaluation and validation of predictive models is needed before the approach can be implemented widely and on a production scale. Nonetheless, the combination of FT-IR spectroscopy with chemometrics shows great promise as a tool for resistance screening efforts and may be widely applicable in the ornamental and nursery industries in the future. In support of this statement, FT-IR spectroscopy has been used to differentiate between elms that differed in susceptibility to Dutch elm disease (Martin et al. 2005), and in our own work it could be used to distinguish between coast live oak resistant and susceptible to sudden oak death (Conrad et al. 2014) and between European ash resistant and susceptible to ash dieback (Villari et al. 2018).
This approach provides an alternative to traditional, inoculation-based methods for screening Austrian pine for resistance to D. sapinea and has the potential to allow for more rapid and high-throughput phenotyping in the future. Furthermore, the approach may be useful not only for trees, but for other woody perennials, such as boxwood (Buxus spp.), which is greatly impacted by boxwood blight, a disease caused by the fungus Calonectria pseudonaviculata. In this case, an FT-IR and chemometrics approach could be used to supplement efforts aimed at assessing the susceptibility of boxwood cultivars (e.g., Guo et al. 2016) and for future resistance breeding programs. Therefore, the approach may have wide applicability in the ornamental and nursery industries, since identifying disease resistant trees and plants for use in the urban landscape is critical to the economic success of those industries.
ACKNOWLEDGMENTS
The authors wish to thank Stephen Opiyo for providing feedback on the statistical analysis, Bethany Kyre for laboratory assistance, and Dr. Luis Rodriguez-Saona for allowing us access to the FT-IR spectrometer. Funding for this project was provided by state and federal funds appropriated to The Ohio State University, College of Food, Agricultural, and Environmental Sciences, Ohio Agricultural Research and Development Center.
Conflicts of Interest:
The authors reported no conflicts of interest.
© 2020, International Society of Arboriculture. All rights reserved.