Potential use of hyperspectral data to monitor sugarcane nitrogen status

Nitrogen management in crops is a key activity for agricultural production. Methods that can determine the levels of this element in plants quickly and non-invasively are extremely important for improving production systems. Among the several lines of study on this subject, proximal and remote sensing methods are promising techniques. In this regard, this research sought to demonstrate the relationships between variations in leaf nitrogen content (LNC) and sugarcane spectral behaviour. The work was carried out in three experimental areas in São Paulo State, Brazil, with different soils, varieties and nitrogen rates during the 2012/13 and 2013/14 seasons. A significant correlation was observed between the LNC and variations in the sugarcane spectra. The green and red-edge spectral bands were the most consistent and stable predictors of LNC among the evaluated harvests. Stepwise multiple linear regression (SMLR) analysis generated better models for LNC estimation when calibrated by experimental area, independent of the variety. The present research demonstrates that specific wavelengths are associated with the variation in LNC in sugarcane, and these are reported in the green region (near 550 nm) and in the red-edge wavelengths (680 to 720 nm). These results may help in future research on the direct in situ application of nitrogen fertilizers.


Introduction
The application of nitrogen (N) fertilizers to soil or plants during plant development is an important technique for agricultural production. This form of chemical management is commonly used for most crops, including sugarcane. Typically, the application of this element occurs in uniform and pre-fixed doses, which aim to supply the nutrient demand to obtain a predetermined productivity, disregarding the existing amount of the element in the plant or how much N will be available in the soil during the crop cycle.
In addition to the possibility of environmental liabilities, this methodology may under- or overestimate the N application because N is a very dynamic element. Nitrogen also has a high cost and great demand in agricultural production. It was estimated that in 2017, more than 113 million tons of nitrogen were used worldwide, and demand is projected to increase in the coming years (FAO, 2017). Therefore, it is imperative to optimize nitrogen application to ensure the long-term viability of agricultural activities.
Sugarcane is a semi-perennial crop that is highly responsive to N fertilization, in which the inadequate administration of this nutrient results in productivity losses in the current season and has perceptible long-term effects (Vitti et al., 2007). In addition, an excess of this nutrient causes negative effects, reducing the sucrose concentration and delaying maturation (Thorburn, Meier, & Probert, 2003).
Existing methods for nitrogen monitoring in crops, such as a chlorophyll meter or soil and plant tissue analysis, are difficult to use in large areas. The use of chlorophyll meters is limited by the slowness of the process, while soil or plant tissue analysis is time-consuming, invasive and expensive because large numbers of samples become necessary to represent the spatial variability (Ranjan, Chopra, Sahoo, Singh, & Pradhan, 2012).
In the search for alternatives that accelerate the process of N analysis in plants, remote and proximal sensing methods have been studied. Spectroscopy is widely cited as supporting the improved efficiency of agricultural production systems because it is an agile, efficient and low-cost tool in crop monitoring. Studies on the use of these sensors for N monitoring have gained prominence in the last two or three decades (Cammarano, Fitzgerald, Casa, & Basso, 2014; Pradhan et al., 2014).
Several studies have used remote sensing (RS) and proximal sensing (PS) to evaluate productive fields through active and passive sensors at terrestrial (Mahajan, Sahoo, Pandey, Gupta, & Kumar, 2014), aerial (Lebourgeois, Bégué, Labbé, Houlès, & Martiné, 2012) and orbital levels (Herrmann et al., 2011). Some studies have been carried out in productive fields of sugarcane in Brazil, with an emphasis on the administration of nitrogen fertilization (Amaral, Molin, Portz, Finazzi, & Cortinove, 2014; Amaral, Molin, & Schepers, 2015; Rosa, Amaral, Molin, & Cantarella, 2015). In fact, terrestrial analyses may be used not only as a method for N analysis in plants but also as a basis for calibrating sensors embedded in aerial or orbital platforms.
Research conducted under Brazilian production conditions has basically adopted the active sensor technologies previously developed for other agricultural crops and regions of the globe, using only a few bands in the visible and near-infrared spectra. This research seeks to answer the following question: how well do hyperspectral data monitor changes in the N nutritional state of sugarcane?
To answer this question, the present research objectives were to i) explore the relationships between the sugarcane spectral response and variations in LNC under different N application rates and ii) define regression models using crop reflectance data to estimate the nutritional status of N. Knowing that energy interacts with the existing elements in plants, it was expected that the leaf reflectance for plots that received different doses of N would express differentiated vigour responses that could be quantified.

Material and methods
The present work was conducted in three locations in São Paulo State, Brazil (Table 1). According to the Köppen climatic classification, all areas have a humid subtropical climate (Cwa), with annual mean rainfall below 1,400 mm, wet summers and dry winters. All experimental areas were planted in 2010 using a randomized complete block design with split plots, where plots were differentiated by variety and subplots by four N doses (Table 2). The study areas had received the specified nitrogen rates since the first cycle; however, the evaluations were carried out in the 2012/13 and 2013/14 seasons during the third and fourth productive cycles (Table 2). The SP 81-3250 variety was common to the three experimental fields, which made it possible to compare the effect of the environment on the same genetic material. All other varieties were selected according to their adaptability to different soil fertility conditions. Each subplot was composed of five 10-m long rows of sugarcane spaced at 1.5 m, where the three central rows were considered as the evaluation area, and 1 m was discarded from each end to avoid border effects.
The initial and annual soil corrections were performed according to the recommendations for the crop, based on routine soil analyses. The nitrogen doses were applied using ammonium nitrate, distributed on the sugarcane straw in a single dose at the beginning of each cycle. All other phytosanitary treatments followed the standards of the regional production system adopted by sugarcane producers.
For the spectral and leaf nitrogen analysis, 10 leaves per plot were collected near the maximum vegetative development stage for the crop, which, for sugarcane, occurs approximately four months after ratoon sprouting (120 to 140 days after harvest). The collected leaves are described in the literature as "+1" leaves, as these are the first leaves in which the separation point between the leaf blade and the sheath is completely visible, according to Kuijper's system (Casagrande, 1991). After collection, the leaves were immediately stored in plastic bags and kept in thermal iceboxes (without direct contact with the ice) to minimize moisture losses and then sent to the laboratory for spectral analyses.
The sugarcane spectral curves were obtained using the FieldSpec 3 spectroradiometer (ASD - Analytical Spectral Devices Inc., Boulder, CO, USA), which operates in the spectral region from 350 to 2,500 nm, with a spectral resolution of 1.4 nm between 350 and 1,050 nm and 2 nm between 1,050 and 2,500 nm, coupled to an integrating sphere, RTS-3ZC (ASD - Analytical Spectral Devices Inc., Boulder, CO, USA), configured to perform reflectance readings of leaves. All the data were automatically interpolated to a spectral resolution of 1 nm.
The spectral readings were performed in the middle third of the collected leaves. After the readings, the average spectral curves were calculated by subplot using all 10 collected leaves. These samples were then dried in a forced circulation oven at 65 °C, ground in a Wiley mill and analysed for the macronutrient content (N, P, K, Ca, Mg, and S), according to Malavolta, Vitti, and Oliveira (1997).
To explore the effects of nitrogen application on the spectral response of sugarcane, the Pearson correlation coefficients between each wavelength and leaf nitrogen content (LNC) were calculated. The correlation significance was evaluated using the t-test (p ≤ 0.05).
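The band-by-band correlation analysis described above can be illustrated with a minimal Python sketch. It is not the code used in this study: the function name `band_correlations`, the array layout (one row per sample, one column per wavelength) and the `t_crit` argument are assumptions made for illustration. The critical t value is expected to come from a t table for n - 2 degrees of freedom at p = 0.05.

```python
import numpy as np

def band_correlations(spectra, lnc, t_crit):
    """Pearson r between every wavelength column and LNC, plus a two-sided t-test.

    spectra: (n_samples, n_bands) reflectance matrix (hypothetical layout)
    lnc:     (n_samples,) leaf nitrogen content, e.g. in g/kg
    t_crit:  critical t value for n_samples - 2 degrees of freedom
    """
    spectra = np.asarray(spectra, dtype=float)
    lnc = np.asarray(lnc, dtype=float)
    n = lnc.size
    # Centre both variables so that r is a normalised cross product.
    xc = spectra - spectra.mean(axis=0)
    yc = lnc - lnc.mean()
    r = (xc * yc[:, None]).sum(axis=0) / np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())
    r = np.clip(r, -1.0, 1.0)  # guard against rounding slightly beyond +/-1
    # t statistic of a correlation coefficient: t = r * sqrt((n - 2) / (1 - r^2)).
    with np.errstate(divide="ignore"):
        t = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    significant = np.abs(t) >= t_crit
    return r, t, significant
```

Plotting `r` against wavelength gives a correlogram of the kind used to identify the spectral regions most related to LNC.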
Later, in the phase of multivariate analysis and generation of regression models, the mean spectra of each subplot were used for the first derivative spectra (FDS) calculation using Equation 1.
$$FDS_{(j)} = \frac{R_{(j+1)} - R_{(j)}}{\Delta\lambda} \qquad (1)$$
where: $R_{(j)}$ is the reflectance at wavelength j, $R_{(j+1)}$ is the reflectance at wavelength j + 1, and $\Delta\lambda$ is the difference (nm) between wavelengths j and j + 1.
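As an illustration, the first derivative spectrum defined above (the difference in reflectance between adjacent wavelengths divided by the wavelength difference) can be computed as in the following sketch. The function name and the array arguments are hypothetical and are not taken from the original processing chain.

```python
import numpy as np

def first_derivative_spectrum(reflectance, wavelengths):
    """Finite-difference first derivative: (R(j+1) - R(j)) / (wavelength(j+1) - wavelength(j)).

    reflectance: (..., n_bands) array, one spectrum per row (assumed layout)
    wavelengths: (n_bands,) band centres in nm
    Returns an array with n_bands - 1 values along the last axis.
    """
    reflectance = np.asarray(reflectance, dtype=float)
    wavelengths = np.asarray(wavelengths, dtype=float)
    return np.diff(reflectance, axis=-1) / np.diff(wavelengths)
```

With spectra interpolated to 1 nm, the denominator equals 1 and the FDS reduces to the difference between consecutive reflectance values.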
A major challenge in working with multivariate data is that the number of observations is often smaller than the number of predictor variables, and the predictors are generally highly correlated. The central principle of the sparse partial least squares (sPLS) methodology is to impose sparsity on the predictor variables within the partial least squares framework, allowing efficient variable reduction and selection (Chun & Keles, 2010). This methodology was implemented using the sPLS analysis package developed for the statistical software R (Chung & Chun, 2012).
All the results of the variable selection phase for model generation were obtained using data from the 2012/13 season. Coefficients that represented the importance of each predictor variable (wavelengths) were generated for estimation of LNC. The most important wavelengths were selected and submitted to a stepwise multiple linear regression (SMLR) analysis, resulting in models that could be used to estimate the sugarcane LNC. The prediction models were generated for the reflectance and FDS with spectral data obtained from the 2012/13 season and then validated with independent data collected in the 2013/14 harvest.
The stepwise technique started with a model containing all variables, and variables were gradually removed according to the statistical significance of each one. This process occurred until the remaining variables were all important (statistically relevant), that is, until there was no improvement in the model significance or there were no additional variables to be withdrawn. This technique assumes that some variables do not contribute significantly to the response of the whole dataset (Darvishzadeh et al., 2008).
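A backward elimination of this kind can be sketched as follows. This is a simplified illustration, not the procedure implemented in this study: it uses the Akaike information criterion as the removal rule instead of individual significance tests, fits the models by ordinary least squares with NumPy, and the function names and band labels are hypothetical.

```python
import numpy as np

def ols_aic(X, y):
    """Ordinary least squares with intercept; returns (coefficients, AIC)."""
    n = y.size
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    k = A.shape[1]  # number of fitted coefficients, including the intercept
    # Gaussian-likelihood AIC up to an additive constant.
    return beta, n * np.log(rss / n) + 2 * k

def backward_stepwise(X, y, names):
    """Start from all predictors and drop one at a time while AIC decreases."""
    keep = list(range(X.shape[1]))
    _, best_aic = ols_aic(X[:, keep], y)
    while len(keep) > 1:
        # AIC of every candidate model with exactly one predictor removed.
        trials = [(ols_aic(X[:, [c for c in keep if c != d]], y)[1], d) for d in keep]
        trial_aic, drop = min(trials)
        if trial_aic >= best_aic:
            break  # no removal improves the model
        keep.remove(drop)
        best_aic = trial_aic
    return [names[c] for c in keep], best_aic
```

Here `X` would hold the reflectance (or FDS) values at the wavelengths retained by the variable selection step and `y` the measured LNC.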
The best models for each area were selected based on the lowest values for the Akaike information criterion (AIC), which considers the penalization of the regression statistical parameters in relation to the insertion or exclusion of one more predictive variable (Estes, Okin, Mwangi, & Shugart, 2008). The AIC analysis does not directly represent the quality of the final model but enables choosing the best model from a range of possible combinations of variables. The accuracy of all generated models was evaluated based on the coefficient of determination (R²), root mean square error (RMSE) and relative prediction error (RE), of which the latter two were calculated by Equations 2 and 3, respectively.
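The accuracy statistics can be computed as in the sketch below. Because the exact expressions used for these statistics are not reproduced in this text, the definitions are assumptions: R² is taken as the squared Pearson correlation between observed and predicted values, RMSE follows its standard definition, and RE is assumed to be the RMSE expressed as a percentage of the mean observed value. The function name is hypothetical.

```python
import numpy as np

def validation_metrics(observed, predicted):
    """Return (R2, RMSE, RE %) for observed versus predicted LNC."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = predicted - observed
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    # Assumption: R2 as the squared correlation of observed and predicted values.
    r2 = float(np.corrcoef(observed, predicted)[0, 1] ** 2)
    # Assumption: relative error as RMSE relative to the observed mean, in percent.
    re = 100.0 * rmse / float(observed.mean())
    return r2, rmse, re
```

Other definitions of the relative error exist (for example, the mean absolute percentage deviation), so values obtained with this sketch may not match published figures exactly.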

Results and discussion
The results for the nitrogen analyses as a function of the applied ammonium nitrate rates (Figure 1) show that the data variation was lower in the 2012/13 season than in the 2013/14 season. However, there was a similar pattern of variation in LNC between the two seasons, and the levels of N increased as the doses increased. The SP 81-3250 variety was the only one showing a tendency of LNC saturation, both in field 1 in the 2012/13 season and in field 3 for both seasons, presenting lower LNC values at doses of 150 kg ha⁻¹ than at 100 kg ha⁻¹. Field 2 presented the smallest variations in LNC among the three cultivated varieties for both evaluated seasons (Figure 1).
According to Raij, Cantarella, Quaggio, and Furlani (1997), the normal values of sugarcane LNC are between 18 and 25 g kg⁻¹. Following these parameters, the observed results indicate that the crop was subjected to a certain level of stress, mainly in the treatments with doses of 50 kg ha⁻¹ or lower. In addition, it is interesting to note that the LNC observed in the present study indicates differences in the mean values between the varieties. Thus, the critical levels require more detailed studies to identify the accurate stress ranges that can be proposed for different varieties or environmental conditions (Santos et al., 2013).
To better understand the climatic conditions for crop development during the seasons under evaluation and their possible influence on the results obtained, the water balance was calculated (Figure 2) based on the methodology proposed by the FAO (Allen et al., 1998). The horizontal axis represents the time (on a daily scale), and the vertical axis shows the daily values of rainfall, soil water depletion and the depletion limit up to which water is readily available for the crop (RAW). Depletion values above the red line indicate that the crop was subjected to some level of stress.
For the three areas under study, the amount of rainfall from the harvest to the date of leaf sampling was higher in the 2012/13 season than in the 2013/14 season, and the accumulated values were 890.0 and 541.6 mm for field 1, 565.6 and 466.1 mm for field 2, and 682.4 and 555.4 mm for field 3, respectively. According to the water balance results, none of the areas were in a water stress condition at the time of sampling in the 2012/13 season; however, in the 2013/14 season, all of the areas presented some level of stress (Figure 2). Considering that almost all nitrogen uptake by plants occurs through mass flow, the occurrence of water stress can explain the greater variability of the observed LNC in the 2013/14 season when compared to the 2012/13 season. At the same time, water scarcity in the soil can cause changes in the sugarcane nitrogen uptake dynamics, and water stress significantly changes the spectral behaviour of the vegetation throughout the range of the spectrum evaluated in the present study (Zygielbaum, Gitelson, Arkebauer, & Rundquist, 2009; Bandyopadhyay et al., 2014).
Water scarcity can cause changes in the arrangements of the photosynthetically active structures (Zygielbaum, Arkebauer, Walter-Shea, & Scoby, 2012), which govern the reflectance, mainly of the visible and red-edge region, that are closely related to the plant nitrogen concentration (Cammarano et al., 2014).
In Figure 3, the average spectral curves for the nitrogen rates of 0 and 100 kg ha⁻¹ are presented. Differences in the spectral behaviour of the sugarcane leaves can be observed when the average spectra of the treatments with and without nitrogen application are compared. For the visible spectral bands (400 to 700 nm), the leaves of fertilized plants showed lower reflectance values, with an inverse behaviour for the near-infrared bands (750 to 950 nm). Although this pattern was consistent, the curves differed in intensity between the areas and seasons analysed. For both years, the experimental area of Piracicaba (field 2), which contained the most clayey soil of all the experimental fields (Table 1), presented the smallest visible changes in reflectance between the two treatments. The largest differences were observed at Santa Maria da Serra (field 3), which had sandy soil, low natural fertility and low water retention potential.
The visible spectral region is sensitive to changes in the leaf pigment concentration, where chlorophyll is the most abundant in green plants (Xue & Yang, 2009). Healthy leaves have a typical positive reflectance feature centred at 550 nm (green region). Green light is absorbed with lower intensity than blue and red light; therefore, green light is predominant in relation to plant reflectance for visible spectra under normal conditions. However, green light absorption by plants is considerable and increases proportionally to leaf chlorophyll concentration, which in turn, is directly sensitive to plant nitrogen uptake (Terashima et al., 2009).
Thus, plants stressed due to nitrogen insufficiency tend to present leaves with light green to yellowish shades, while well-supplemented plants tend to present leaves with dark green shades. The concentrations of other pigments, such as carotenoids, xanthophylls and anthocyanins, are also sources of spectral behaviour changes in the visible region.
The leaf pigments are almost "transparent" to electromagnetic radiation in the near-infrared region, and the spectral behaviour of this region is highly correlated to the structural arrangement of the spongy mesophyll within the cells (Knipling, 1970). Several nutrients can influence the structures and arrangements of these cells, with nitrogen being one of these elements.
Much of the plant nitrogen is allocated to the chloroplasts. Another important part of this element is found in free amino acids, proteins and other nitrogenous structures in the plant, such as nitrogenous bases (purines and pyrimidines) and nucleic acids (DNA and RNA), which account for 10% of the total plant nitrogen. Other soluble amino forms can represent up to 5% of the plant nitrogen (Conn, Stumpf, Bruening, & Doi, 1987; Mengel & Kirkby, 1987).
Because nitrogen is related to the most important physiological processes that occur in plants, the stress caused by this nutrient can affect the cellular structure and, consequently, the near-infrared reflectance. In this work, it was observed that an increase in the nitrogen doses caused an increase in the reflectance in the near-infrared region (Figure 3); however, no systematic changes were observed in the shortwave infrared region (950 to 2,200 nm), as this spectral region is described in the literature as primarily responsive to variations in the leaf water content (Ceccato, Flasse, Tarantola, Jacquemoud, & Grégoire, 2001).
The observed results of the Pearson correlation coefficients between each wavelength and the LNC are presented in Figure 4. This analysis showed that there was a significant and consistent correlation between the LNC and the leaf reflectance for the visible region, especially in the green light bands centred at 550 nm, where the correlation coefficients ranged from -0.40 to -0.48 in the 2012/13 season and from -0.31 to -0.64 in the 2013/14 season. The red edge is characterized by the rapid increase in the reflectance observed in the vegetation spectra, and it is located in the transition between visible and infrared bands, which can also be observed in Figure 3. The red edge also presented a consistent and significant correlation with the LNC, with correlation coefficient values between -0.36 and -0.52 in the 2012/13 season and -0.35 and -0.70 in the 2013/14 season. Both the green and red-edge spectral bands have been consistently cited as being sensitive to plant pigment variation, especially chlorophyll. Therefore, spectral indices and models have been developed using these spectral regions (Yao et al., 2013; Cammarano et al., 2014).
The near-infrared region for the 2012/13 season presented a significant correlation with the LNC only for field 1; however, the coefficients were low, always above 0.30 (Figure 4). For the 2013/14 crop, both fields 1 and 2 presented significant correlations with infrared bands but also with low values that were close to those previously mentioned.
The shortwave infrared region (SWIR) presented inconsistent results regarding the correlation between LNC and reflectance (Figure 4). These results indicate that SWIR bands are more sensitive to other phenomena or types of stresses in plants. Even so, they may correlate with LNC when other factors of production, such as other nutrient deficiencies, diseases, pests and water stress, do not reach limiting levels. The most likely stressor in this case was the variation in soil water availability because irrigation was not used in any of the experimental areas, which is the more common situation for the sugarcane production system in Brazil.
Based on the results observed in the correlation analysis (Figure 4), only the sugarcane spectral data from 500 to 1,000 nm were used in the regression models. Wavelengths shorter than 500 nm obtained with the RTS-3ZC integrating sphere presented a high noise level, and their use negatively influenced the stability of the final models. Wavelengths greater than 1,000 nm, in turn, did not significantly improve the final models. Figure 5 shows the average reflectance and FDS curves used in the present work. The results of the sPLS coefficients are presented in Figure 6. Although the calibration phase generated models by single area and for all areas together, in the variable selection phase, the complete dataset for the 2012/13 season was used (216 samples). This action aimed to select wavelengths that showed global relationships with the LNC variations, based on the practical idea that it is feasible to develop sensors with a limited set of bands that can be combined differently for each cultivation condition, whereas developing sensors for specific conditions is not reasonable because of their limited applicability.
Figure 6. Results for sparse partial least squares (sPLS) coefficients, size reduction and variable selection for prediction of sugarcane leaf nitrogen content using reflectance and first derivative spectrum (FDS).
Figure 6 shows that the visible and red-edge regions are the most important for the reflectance coefficient result, and the wavelength centred at 690 nm presented the highest value of the sPLS coefficient. The red region and the wavelengths ranging from 840 to 960 nm were not significant, while the reflectance at 985 nm seemed to be a significant variable for the prediction of sugarcane LNC (Figure 6).

Acta Scientiarum. Agronomy, v. 43, e47632, 2021
The coefficients calculated for the FDS data were smaller than those calculated for the reflectance (Figure 6). However, the most important spectral regions were the same, with green bands at 525 and 565 nm and red-edge bands at 680 and 715 nm. In addition, other features at 790, 800, 855, and 980 nm showed high values for the sPLS coefficients (Figure 6).
Few studies in the literature have used the sPLS methodology to select variables from hyperspectral data, even fewer have addressed vegetation, and no studies were found on the selection of variables for nitrogen estimation in tropical crops.
Among the limited research on the application of sPLS to the spectral response of vegetation is the study by Peerbhay, Mutanga, and Ismail (2014), who compared the traditional partial least squares (PLS) method with sPLS in the selection of variables for discriminant analysis of pine varieties in South Africa. The efficiency of the discriminant analysis increased from 71.88% using PLS to 80.21% using sPLS. Similar results were observed by Abdel-Rahman et al. (2014), who compared the two methodologies for variable selection and model generation to predict the yield of an oleraceous crop, also in South Africa.
The highest values observed for the sPLS coefficients were coherent with the spectral regions already mentioned in this work and in the literature regarding the sensitivity to variations in the vegetation pigment content and to the leaf and canopy nitrogen content (Miphokasap, Honda, Vaiphasa, Souris, & Nagai, 2012; Ramoelo et al., 2013). The efficiency of the best variables presented in Figure 6 for the prediction of the LNC in sugarcane can be observed in the analyses presented below.
Data from the 2012/13 season were used in the calibration of the stepwise multiple linear regression (SMLR) for the prediction of the sugarcane leaf nitrogen content. The models for reflectance and FDS data presented very similar performances, with adjusted R² values of approximately 0.70 when analysing individual fields and 0.60 for the model that incorporated the entire dataset (Table 3). The models calibrated by area (Table 3) varied in the bands used, and between five and six bands were needed to calibrate the best model, both for the reflectance and for the FDS data. The general model, which incorporated data from all fields, required a larger number of bands (10 for the reflectance data and nine for the FDS data), suggesting that a more complex dataset requires a greater number of variables for model stability. Even so, there was a significant worsening in the R², RMSE, and RE values for the general model.
The most sensitive wavelengths to variations in LNC were in the green and red-edge regions (Table 3). These wavelengths have been consistently cited as sensitive to variations in pigments in plants, especially chlorophyll, and a number of studies have focused exclusively on the use of red-edge information to monitor vegetation photosynthetic efficiency and plant nitrogen nutritional status (Cho & Skidmore, 2006).
The validation of the models by area and combined fields for reflectance and FDS curves was performed using independent data collected in the 2013/14 season (Figure 7). The spectral models for individual areas showed satisfactory performance, especially for fields 2 and 3. In field 2, the R², RMSE, and RE values were 0.72, 1.39 g kg⁻¹, and 7.49%, respectively, for the reflectance data and 0.73, 1.29 g kg⁻¹, and 7.26%, respectively, for the FDS data. These results are consistent with those observed by Miphokasap et al. (2012), who studied sugarcane cultivation in Thailand, representing one of the only studies in which hyperspectral data were used to monitor nitrogen in this crop.
The general models for both reflectance and FDS data showed a significant reduction in the accuracy of estimation when compared to the results of the calibration phase, with R², RMSE, and RE values of 0.51, 1.58 g kg⁻¹, and 9.20%, respectively, for reflectance data and 0.52, 1.48 g kg⁻¹, and 8.81%, respectively, for the FDS data (Figure 7). The individual model generated for field 1, despite presenting a high R² value in the validation phase (0.61 for both reflectance and FDS), also had elevated RMSE and RE values of 2.74 g kg⁻¹ and 17.74%, respectively, for the reflectance models and 2.32 g kg⁻¹ and 15.06%, respectively, for the FDS models (Figure 7). Much of this disparity was due to the variations in climatic conditions and soil moisture between the evaluated seasons, as discussed above.
Some research already carried out in Brazil using commercial canopy sensors in sugarcane has observed limitations related to the calibration of an efficient algorithm for the estimation of LNC. The results were variable between productive areas and varieties or even in the establishment of sensor response ranges that are related to productivity losses or efficient fertilizer use (Rosa et al., 2015).
The relationship between the sugarcane reflectance and LNC occurred in specific ranges of the spectrum, particularly in narrow windows of the green and red-edge bands. The spectral models of the sugarcane leaf can be applied to predict the foliar nitrogen content, potentially complementing conventional chemical analysis of plant tissue, with the advantages of being non-invasive and low cost. In addition, this basic study aims to guide future research in this area to enhance and consolidate this technology for in situ and remote analysis.

Conclusion
Hyperspectral data from sugarcane leaves are strongly influenced by several factors, including the environment, genetic variation among varieties, and temporal variability of climatic factors.
The spectral regions were strongly influenced by variations in leaf nitrogen content, with an emphasis on green light and red-edge bands, as higher nitrogen concentrations in sugarcane leaves caused a significant reduction in reflectance in these spectral regions, which supports the application of these spectral bands to predict the sugarcane leaf nitrogen content.
It was possible to calibrate a general model for predicting the nitrogen content in sugarcane with a coefficient of determination of 0.52 and a root mean square error of 1.48 g kg⁻¹ using data from seven varieties and three experimental fields, validating the models with data from an independent season. However, better results were obtained when the model was calibrated for a single experimental area.