Behavior of strawberry production with growth models: a multivariate approach

Strawberry is an economically and socially important crop in several regions worldwide. Thus, studies that provide information on strawberry growth are important and must be constantly updated. The aims of this study were to fit a logistic growth model to describe strawberry fruit production, to estimate the partial derivatives of the fitted model in order to estimate and interpret its critical points, and to summarize the results with multivariate analyses. To do this, data from an experiment with 16 treatments [combinations of two cultivars (Albion and Camarosa), two origins (national and imported), and four mixed organic substrates (70% crushed sugar cane residue + 30% organic compost, 70% crushed sugar cane residue + 30% commercial substrate, 70% burnt rice husk + 30% organic compost, and 70% burnt rice husk + 30% commercial substrate)] conducted in a randomized complete block design (RCBD) with four replicates were used. A logistic model was fitted to the accumulated fruit production of each treatment and replicate. Partial derivatives with respect to the accumulated thermal sum were estimated in order to quantify the critical points of the model. Subsequently, a principal component analysis was performed. The results show that the use of growth models substantially increases the inferences that can be made about crop growth, and the multivariate analysis summarizes this information, simplifying its interpretation. Approaches such as those carried out in this study are still rarely used, but, compared to simpler models, they increase the number of inferences that can be made and provide greater elucidation of the results.


Introduction
The strawberry is a multiple-harvest crop, and the statistical analysis of data for these crops is sometimes complicated. The excess of zeros when a plant has no harvestable fruit, the heteroscedasticity among harvests, the correlation among harvests from the same plant and/or experimental unit and the inability to randomize the harvest as a factor in subplots (Lúcio et al., 2016) are factors that must be considered when analyzing multiple-harvest crop datasets. These characteristics often violate the analysis of variance (ANOVA) model assumptions.
To meet the ANOVA assumptions, the statistical analysis of strawberry trials has been carried out considering the total production only (Diel et al., 2017b; Mérelle et al., 2017; Morris et al., 2017). This approach is effective in solving the problem of the violation of ANOVA assumptions, but it discards the information about production behavior over time, such as precocity and fruit production rate. These characteristics can be measured through the biological interpretation of the parameters and critical points of nonlinear growth models (Diel et al., 2019; Sari, Olivoto, Diel, Krysczun, & Lúcio, 2018). Sari et al. (2018) concluded that, for multiple-harvest crops, the logistic model has the best performance in describing crop production because the estimated parameters are close to being unbiased, the assumptions about the residuals are met, and the model shows a close linear approximation (i.e., low nonlinearity measures). In addition, the logistic model has already been established for modeling fruit production in strawberry (Diel et al., 2019).
Growth models increase the inferences that can be made about productive behavior, but they also increase the number of variables. When the number of variables increases, a univariate analysis may not be conclusive. Therefore, multivariate analyses should be used together with nonlinear model analysis. Multivariate analyses are important for extracting the maximum amount of information from a set of variables, and they are an effective tool for establishing the variability among treatments (Vargas et al., 2015).
In this context, the aims of this study were: i) to fit a logistic nonlinear growth model to describe strawberry fruit production; ii) to estimate the partial derivatives of the fitted model in order to estimate and interpret the critical points of the model; and iii) to use multivariate analyses to characterize the treatments through the parameters and critical points of the fitted model.

Plant material, site description, and experimental design
The experiment was carried out at the Federal University of Santa Maria (UFSM), in Frederico Westphalen, Rio Grande do Sul State, Brazil, located at 27°23' S, 53°25' W and an altitude of 493 m. According to the Köppen classification, the region's climate is Cfa, humid subtropical, presenting temperate rainy characteristics, with an annual mean precipitation of 1,800 mm uniformly distributed throughout the year and subtropical temperatures (Alvares, Stape, Sentelhas, Gonçalves, & Sparovek, 2013). A soilless cultivation system was installed in a galvanized steel frame greenhouse with a semicircular roof oriented in a north-south direction. The greenhouse is 20 m long and 10 m wide, with 3.5-m-high side posts. Strawberry seedlings were transplanted to 150 µm tubular white plastic bags, which were kept on wooden benches 0.8 m above the ground.
A nutrient solution was supplied by a drip system placed inside the bags, consisting of a drip line with drippers spaced every 0.10 m. Fertigation followed the formula and frequency developed by Gonçalves, Vignolo, Antunes, and Reisser Junior (2016), with the frequency adjusted according to the developmental stage of the crop. For the vegetative and reproductive phases, fertigation was provided every 2 and 5 minutes, totaling approximately 600 and 1,200 mL per bag, respectively.
The substrates used in the plastic bags were a mixture of 70% crushed sugarcane residue or burnt rice husk and 30% commercial substrate or organic compost (Table 1). Before transplanting, the substrate was rinsed until an electrical conductivity of less than 1 mS cm⁻¹ was reached, in order to make the substrate chemically inactive. Seedlings of two cultivars (Albion and Camarosa) from two origins (national and imported) were used.
Seedlings considered national were taken from a seedbed in Agudo, located in the basaltic slope in Rio Grande do Sul State, Brazil, between the central depression and the mid-uplands, whose geographic coordinates are 29°62' S, 53°22' W at 83 m of altitude. Imported seedlings grown in Argentina were produced at a seedbed called Patagônia Agrícola S.A., located in El Maitén, whose geographic coordinates are 42°3' S, 71°10' W at 720 m of altitude.
The experiment was conducted in a randomized complete block design with four replicates. Each experimental unit was composed of eight plants. The 16 treatments tested are listed in Table 1.

Assessments Completed
The air temperature inside the greenhouse was recorded with a thermohygrometer installed 1.5 m above the ground surface. The mean air temperature was estimated by the following equation:
$$T_{ave} = \frac{T_{max} + T_{min}}{2} \tag{1}$$
where: $T_{ave}$ is the mean air temperature; $T_{max}$ is the maximum air temperature; and $T_{min}$ is the minimum air temperature.
The daily thermal sum ($TS_d$), in °C day⁻¹, was calculated according to the following equation (Arnold, 1960):
$$TS_d = T_{ave} - T_b \tag{2}$$
where: $TS_d$ is the daily thermal sum (°C day⁻¹); $T_{ave}$ is the mean air temperature; and $T_b$ is the base temperature.
The base temperature (T b ) is set as the temperature below which the plant cannot develop, or its development is so slow that it can be ignored (Rosa et al., 2011). Strawberries have a base temperature of 7°C (Mendonça et al., 2012).
The daily thermal sum was calculated from the date on which the seedlings were transplanted to the plastic bags, and the accumulated thermal sum ($TS_a$, in °C day⁻¹) up to the ith day was calculated by:
$$TS_a = \sum_{d=1}^{i} TS_d \tag{3}$$
Harvests were carried out twice a week at the complete maturity stage, for a total of 37 harvests, separating commercial from noncommercial fruits. The commercial fruits harvested in each experimental unit were weighed on a scale. Afterward, the fruit mass per plant was calculated by dividing the total mass of harvested fruits by the number of plants in the experimental unit.
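The thermal-sum calculations described above can be sketched in a few lines. This is only an illustration: the function names and the three days of temperatures are hypothetical, and the sketch applies the equations literally, without truncating negative daily values at zero (some implementations do, which the text does not specify).

```python
from itertools import accumulate

def daily_thermal_sum(t_max, t_min, t_base=7.0):
    """Daily thermal sum: mean air temperature minus the base temperature.

    t_base defaults to 7 degrees C, the strawberry base temperature cited in the text.
    """
    t_ave = (t_max + t_min) / 2.0
    return t_ave - t_base

def accumulated_thermal_sum(t_max_series, t_min_series, t_base=7.0):
    """Running total of the daily thermal sums, one value per day since transplant."""
    daily = [daily_thermal_sum(hi, lo, t_base)
             for hi, lo in zip(t_max_series, t_min_series)]
    return list(accumulate(daily))

# Hypothetical maximum and minimum temperatures for three days after transplant
t_max = [24.0, 26.0, 22.0]
t_min = [12.0, 14.0, 10.0]
ts_a = accumulated_thermal_sum(t_max, t_min)
print(ts_a)
```

Each harvest date is then mapped to the accumulated value on that day, which becomes the independent variable of the growth model.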

Fitting of the growth model
The mean mass of fruits per plant (g plant⁻¹) obtained in each harvest was successively accumulated for each experimental unit. Afterward, a logistic model was fitted to each experimental unit according to the following equation:
$$Y_i = \frac{\beta_1}{1 + e^{\beta_2 - \beta_3 X_i}}$$
where: $Y_i$ is the accumulated mean mass of fruits per plant (dependent variable); $X_i$ is the accumulated thermal sum ($TS_a$), in degree days, from the seedling transplant up to the ith harvest (independent variable); $\beta_1$ is the asymptotic value, which represents the total production of the treatments; $\beta_2$ is a parameter that reflects the distance between the initial value (observation) and the asymptote; and $\beta_3$ is the parameter associated with the growth rate.
The parameter estimates were obtained using the ordinary least squares method with a Gauss-Newton algorithm. This procedure was performed using the nls() function in R software (R Core Team, 2018). Later, the coefficient of determination (R²) and the intrinsic ($c^{I}$) and parametric ($c^{\theta}$) nonlinearity were calculated by the curvature method suggested by Bates and Watts (1988). Afterward, the $c^{I}\sqrt{F_{(\alpha,p,n-p)}}$ and $c^{\theta}\sqrt{F_{(\alpha,p,n-p)}}$ values were estimated, where $F_{(\alpha,p,n-p)}$ is the tabulated quantile of the F distribution, in which α is 0.05, p is the number of parameters in the model, and n is the number of observations. When these values are under 0.3 and 1.0, respectively, the parameters are close to being unbiased. The normality and homogeneity of the residuals were tested by the Shapiro-Wilk and Bartlett tests, respectively.
Due to the violation of the model's assumptions, the confidence intervals were obtained by a bootstrap approach. Using the nlsBoot() function of the nlstools package in R (Baty et al., 2015), 10,000 estimates of each parameter were obtained for each treatment. The limits of the confidence intervals were the 2.5th and 97.5th percentiles of the bootstrap parameter estimates. When the confidence intervals did not overlap, the treatments were considered different.
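To make the fitting step concrete, the following is a minimal, self-contained sketch of a Gauss-Newton least-squares fit of a three-parameter logistic curve. It is not the authors' code (they used nls() in R). It assumes the parameterization Y = β1 / (1 + exp(β2 − β3 X)); the data are synthetic and noise-free, the starting values are hypothetical, and the step-halving safeguard is an implementation choice of this sketch.

```python
import math

def logistic(x, b1, b2, b3):
    """Assumed parameterization: b1 / (1 + exp(b2 - b3 * x))."""
    z = min(b2 - b3 * x, 300.0)  # clamp so a wild trial step cannot overflow exp()
    return b1 / (1.0 + math.exp(z))

def gradient(x, b1, b2, b3):
    """Partial derivatives of the model with respect to b1, b2 and b3."""
    e = math.exp(min(b2 - b3 * x, 300.0))
    d = 1.0 + e
    return [1.0 / d, -b1 * e / (d * d), b1 * x * e / (d * d)]

def rss(xs, ys, beta):
    """Residual sum of squares for a parameter vector."""
    return sum((y - logistic(x, *beta)) ** 2 for x, y in zip(xs, ys))

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def gauss_newton(xs, ys, start, max_iter=100, tol=1e-8):
    """Gauss-Newton iterations; the step is halved until the RSS does not increase."""
    beta = list(start)
    for _ in range(max_iter):
        resid = [y - logistic(x, *beta) for x, y in zip(xs, ys)]
        jac = [gradient(x, *beta) for x in xs]
        jtj = [[sum(r[i] * r[j] for r in jac) for j in range(3)] for i in range(3)]
        jtr = [sum(r[i] * e for r, e in zip(jac, resid)) for i in range(3)]
        step = solve(jtj, jtr)  # normal equations: (J'J) step = J'r
        current, factor = rss(xs, ys, beta), 1.0
        while factor > 1e-6 and rss(xs, ys, [b + factor * s for b, s in zip(beta, step)]) > current:
            factor /= 2.0
        if factor <= 1e-6:
            break  # no improving step found
        beta = [b + factor * s for b, s in zip(beta, step)]
        if all(abs(factor * s) <= tol * (abs(b) + tol) for s, b in zip(step, beta)):
            break
    return beta

# Synthetic, noise-free example: accumulated thermal sum (x) and accumulated yield (y)
xs = [600.0 + 200.0 * k for k in range(13)]
ys = [logistic(x, 500.0, 6.0, 0.004) for x in xs]
estimates = gauss_newton(xs, ys, start=(450.0, 5.5, 0.0037))
print(estimates)
```

With real harvest data the residuals are not zero, convergence depends strongly on the starting values, and a dedicated routine such as nls() also provides standard errors and diagnostics that this sketch omits.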
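The percentile interval and the overlap rule described above can be illustrated as follows. This is a sketch only: the "bootstrap estimates" are simulated from a normal distribution as a stand-in for the output of a bootstrap routine, the treatment labels and values are hypothetical, and the linear-interpolation percentile is one of several common definitions.

```python
import random

def percentile(values, q):
    """Percentile (q between 0 and 100) with linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def percentile_interval(boot_estimates, level=0.95):
    """Lower and upper limits taken from the tails of the bootstrap distribution."""
    tail = (1.0 - level) / 2.0 * 100.0
    return percentile(boot_estimates, tail), percentile(boot_estimates, 100.0 - tail)

def intervals_overlap(a, b):
    """True when two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Simulated stand-ins for 10,000 bootstrap estimates of the asymptote in two
# hypothetical treatments (values are invented for illustration only)
rng = random.Random(1)
boot_a = [rng.gauss(600.0, 20.0) for _ in range(10000)]
boot_b = [rng.gauss(280.0, 20.0) for _ in range(10000)]

ci_a = percentile_interval(boot_a)
ci_b = percentile_interval(boot_b)
print(ci_a, ci_b, intervals_overlap(ci_a, ci_b))
```

Comparing intervals by overlap is a conservative rule: two intervals can overlap slightly even when a formal test of the difference would reject equality.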

Precocity and concentration of the production
The coordinates (X and Y) of the critical points of the logistic model, known as the maximum acceleration point (XMAP), inflection point (XIP), maximum deceleration point (XMDP), and asymptotic deceleration point (XADP), were obtained by setting the following derivatives equal to zero, according to the methodology described by Mischan, Pinho, and Carvalho (2011): inflection point (XIP): $\partial^2 Y / \partial X^2 = 0$; point of maximum acceleration (XMAP) and point of maximum deceleration (XMDP): $\partial^3 Y / \partial X^3 = 0$; and point of asymptotic deceleration (XADP): $\partial^4 Y / \partial X^4 = 0$. Precocity was defined by the accumulated thermal sum at which XIP was reached (the moment at which the fruit production rate was maximal). The concentration of production was defined by the difference between XMAP and XMDP, corresponding to the period during which the production increased exponentially (Sari et al., 2018).
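For a logistic curve, the critical points have closed forms, which the sketch below evaluates. It assumes the parameterization Y = β1 / (1 + exp(β2 − β3 X)); under that assumption the second, third, and fourth derivatives vanish at β3 X − β2 equal to 0, ±ln(2 + √3), and ln(5 + 2√6), respectively. The function name and the parameter values in the example are hypothetical.

```python
import math

K_ACC = math.log(2.0 + math.sqrt(3.0))        # about 1.3170, zeros of the third derivative
K_ASY = math.log(5.0 + 2.0 * math.sqrt(6.0))  # about 2.2924, outer zero of the fourth derivative

def critical_points(b1, b2, b3):
    """Return {name: (X, Y)} for the critical points of the assumed logistic curve."""
    x_values = {
        "XMAP": (b2 - K_ACC) / b3,  # maximum acceleration
        "XIP": b2 / b3,             # inflection (maximum production rate)
        "XMDP": (b2 + K_ACC) / b3,  # maximum deceleration
        "XADP": (b2 + K_ASY) / b3,  # asymptotic deceleration
    }
    return {name: (x, b1 / (1.0 + math.exp(b2 - b3 * x))) for name, x in x_values.items()}

# Hypothetical parameter values, for illustration only
points = critical_points(500.0, 6.0, 0.004)
concentration = points["XMDP"][0] - points["XMAP"][0]
for name, (x, y) in points.items():
    print(name, round(x, 1), round(y, 1))
print("concentration of production:", round(concentration, 1))
```

Under this parameterization the Y coordinates are fixed fractions of the asymptote (about 21%, 50%, 79%, and 91% of β1), so differences among treatments show up in the X coordinates.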

Multivariate analysis
The variables analyzed were i) asymptote (representing the total production), ii) XIP (representing the precocity of production), iii) concentration of production (difference between XMAP and XMDP), and iv) XADP (indicating the moment of harvest at which increases in production become insignificant). These variables were estimated for the 16 treatments; afterwards, a principal component analysis (PCA) using the Pearson correlation matrix between the variables was carried out. The PCA was performed to reduce the dimensionality of the data to a few components, allowing the interpretation of the relationships both among the variables and between the variables and the treatments. The PCs were obtained using the PCA() function of the FactoMineR package (Le, Josse, & Husson, 2008), and the biplots were constructed using the fviz_pca_biplot() function of the factoextra package, both in R.
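A PCA on the Pearson correlation matrix amounts to an eigendecomposition of that matrix after standardizing the variables. The sketch below shows the idea with the standard library only, using a cyclic Jacobi eigenvalue routine; it is not the FactoMineR implementation, and the six rows of data (asymptote, XIP, concentration, XADP) are invented for illustration.

```python
import math

def correlation_matrix(rows):
    """Standardize the columns and return (z-scores, Pearson correlation matrix)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    corr = [[sum(zr[i] * zr[j] for zr in z) / (n - 1) for j in range(p)] for i in range(p)]
    return z, corr

def jacobi_eigen(matrix, sweeps=100, tol=1e-20):
    """Eigenvalues and eigenvectors of a symmetric matrix by cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q], kept within [-pi/4, pi/4]
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                if theta > math.pi / 4:
                    theta -= math.pi / 2
                elif theta < -math.pi / 4:
                    theta += math.pi / 2
                c, s = math.cos(theta), math.sin(theta)
                for m in (a, v):  # rotate columns p and q of A and of V
                    for k in range(n):
                        mkp, mkq = m[k][p], m[k][q]
                        m[k][p] = c * mkp - s * mkq
                        m[k][q] = s * mkp + c * mkq
                for k in range(n):  # rotate rows p and q of A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    order = sorted(range(n), key=lambda i: a[i][i], reverse=True)
    values = [a[i][i] for i in order]
    vectors = [[v[k][i] for k in range(n)] for i in order]
    return values, vectors

# Invented values for six hypothetical treatments: asymptote, XIP, concentration, XADP
data = [
    [600.0, 1500.0, 700.0, 2300.0],
    [580.0, 1450.0, 680.0, 2250.0],
    [280.0, 1550.0, 420.0, 2000.0],
    [300.0, 1600.0, 450.0, 2050.0],
    [400.0, 1480.0, 550.0, 2100.0],
    [430.0, 1520.0, 560.0, 2150.0],
]
z, corr = correlation_matrix(data)
eigvals, eigvecs = jacobi_eigen(corr)
explained = [val / sum(eigvals) for val in eigvals]
# Scores of each treatment on the first two principal components (biplot coordinates)
scores = [[sum(zr[j] * vec[j] for j in range(len(zr))) for vec in eigvecs[:2]] for zr in z]
print([round(e, 3) for e in explained])
```

The proportions in `explained` correspond to the contribution of each component, and the first two columns of scores are what a biplot displays alongside the variable loadings.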

Model fit
The logistic model fit all analyzed treatments. All R² values were higher than 97%, and the $c^{I}$ and $c^{\theta}$ values were considered low (Figures 1A and 2B). The low intrinsic and parametric nonlinearity measures indicate that the parameter estimates were close to being unbiased. Low nonlinearity is the most important criterion to be used when selecting a model. As the parameters have a biological interpretation, biased estimates can lead to misinterpretations of the characteristics described by the parameters and critical points, such as the precocity and concentration of the production (Sari et al., 2018; Sari, Lúcio, Santana, & Savian, 2019a; Sari et al., 2019b). These aspects (high R² and low nonlinearity) show that the model is a good predictor and that the parameters can be used as explanatory variables (Bates and Watts, 1988). Thus, this model can be used for modeling fruit production in strawberry crops.
Treatments T9, T10, T11, T12, and T16 had significantly higher $\beta_1$ (asymptote), as they were the treatments with higher production. Production in these treatments ranged from 550 to 625 g plant⁻¹. The less productive treatments were T2, T3, T5, T6, and T7, which produced from 250 to 300 g plant⁻¹. The treatments T1, T4, T8, T13, T14, and T15 had intermediate production, between 328 and 439 g plant⁻¹ (Figure 3).
The parameter $\beta_2$ is related to the degree of maturation of the crop at the beginning of the harvest. Low $\beta_2$ values indicate higher fruit ripeness at the beginning of the harvest (Figures 1 and 2), which is related to the productive precocity. In this study, the $\beta_2$ values were significantly lower for treatments T9, T10, and T11, which tended to produce more early in the harvest period. In treatments T2, T4, T6, and T8, the values of $\beta_2$ were higher, indicating later production.
These results indicate that a higher proportion of the yield was obtained at the initial harvests in the T9, T10, and T11 treatments than at the initial harvests in the T2, T4, T6, and T8 treatments (Figure 3).
The value of the parameter $\beta_3$ is related to both the productive precocity and the concentration of production. Higher values of $\beta_3$ indicate higher productive precocity and higher harvest concentration. The T9, T10, and T11 treatments had the lowest values of $\beta_3$. The values were significantly higher for T2, T3, T4, T6, and T8, indicating that the crop production was concentrated in a shorter period in these treatments (Figure 3).
The parameters of the growth models should be analyzed jointly (Sari et al., 2019a). In our case, lower values of $\beta_2$ are associated with higher values of $\beta_1$ (total final production). This indicates that, in relative terms, more of the fruits are harvested earlier in the more productive treatments than in the less productive treatments. The consequence of this behavior can be observed by comparing treatments T6 (less productive, with later production) and T10 (more productive, with an earlier production start). Production in the first harvests was lower in T6, due to its lower productive precocity, and higher in T10, due to its higher productive precocity (Figure 4). Although the lower value of $\beta_2$ indicates earlier production in T10, treatments T6 and T10 reached the inflection point (XIP) at the same time because $\beta_3$ was lower in T10 than in T6. The low value of $\beta_3$ indicates that peak production is delayed in T10, even though this treatment produces more fruits initially, because its rate of fruit production increases more slowly (Figure 4). The lower value of $\beta_3$ in T10 also indicates that the period of exponential increase in production lasts longer (Figure 4).

Critical points interpretation
The time at which the plants reach the inflection point (XIP), which represents the time of maximum production rate, differs between some treatments. For example, treatments T5, T7, T11, and T15 reach the inflection point earlier than most treatments (Figure 3). The maximum acceleration point (XMAP) indicates the moment at which the exponential increase in production begins in the treatments. A high value of XMAP is related to slow increases in production at the beginning of the harvest. That is, the initial harvests produce few fruits, which is associated with the degree of maturation of the plants when the harvests begin. It is important to note that the harvests started at the same time in all treatments, and therefore, XMAP can be used as an indication of the maturation of the plants at the beginning of the harvest. The treatments T2, T4, T6, and T8 are characterized by a very slow start to production, determined by the later occurrence of the maximum acceleration point (XMAP). The maximum deceleration point (XMDP) indicates the end of the period of exponential growth of fruit production and does not differ much between treatments.
The production concentration is obtained from the time difference between the maximum acceleration point (XMAP) and the maximum deceleration point (XMDP). Using this difference as a variable, it is possible to determine for how long the treatments showed exponential production growth. The shorter this period is (the smaller the difference between XMAP and XMDP), the more concentrated the production is. Conversely, the longer the period is (the greater the difference between XMAP and XMDP), the less concentrated the production is. In our case, the most productive treatments (T9, T10, T11, and T12) were those that had the lowest production concentration in relation to the others. In these treatments, the onset of the exponential production period (XMAP) is early; however, due to the slow increase in the rate of fruit production (low $\beta_3$), the production concentration period is longer.
The asymptotic deceleration point (XADP) shows for how long the production has significant growth during the harvests. Significant differences were observed among a few treatments. The T9 and T10 treatments are characterized by producing fruits for a longer time than the other treatments (although the difference is often not significant).
Compared with a single production variable, the parameters and critical points of the logistic model substantially increase the inferences that can be made about the productive behavior of the treatments. This is a clear advantage over a simple comparison of means between treatments, which is what many researchers would do in this case. However, univariate analysis of the resulting variables is often inconclusive and may be confusing. Therefore, using growth models and multivariate techniques together brings advantages to the researcher: i) growth models substantially increase the number of inferences that can be made, and ii) multivariate analysis summarizes this information, simplifying its interpretation.
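The link between $\beta_3$ and the length of the exponential period can be written out explicitly. The derivation below is a sketch that assumes the logistic parameterization $Y = \beta_1 / (1 + e^{\beta_2 - \beta_3 X})$, for which the third derivative vanishes at $\beta_3 X - \beta_2 = \pm\ln(2+\sqrt{3})$; the subscripted symbols $X_{MAP}$ and $X_{MDP}$ stand for XMAP and XMDP.

```latex
\begin{aligned}
X_{MAP} &= \frac{\beta_2 - \ln(2+\sqrt{3})}{\beta_3}, \qquad
X_{MDP} = \frac{\beta_2 + \ln(2+\sqrt{3})}{\beta_3}, \\
X_{MDP} - X_{MAP} &= \frac{2\ln(2+\sqrt{3})}{\beta_3} \approx \frac{2.634}{\beta_3}.
\end{aligned}
```

Under this assumption the length of the exponential period depends only on $\beta_3$: halving $\beta_3$ doubles the period, which is consistent with the less concentrated production observed in the treatments with low $\beta_3$. For a hypothetical $\beta_3 = 0.004$, the period would be about 658 degree days.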

Principal component analysis (PCA)
The contributions of the first two principal components, PC1 and PC2, to the variation in the results were 56.5 and 31.9%, respectively, giving a cumulative contribution of 88.4%, which is a large share of the variability captured by the first two axes (Figure 5). The relationship between the concentration of production and the final production, as discussed above, is confirmed in the PCA. The vectors of these variables point in the same direction in the biplot, showing the direct relationship between the total production and the length of the period of exponential production (the difference between XMAP and XMDP). That is, the more productive treatments have a lower concentration of production. The positions of treatments T9, T10, T11, and T12 in the biplot indicate that they are the most productive and have a lower concentration of production. Treatments T4, T6, and T8 show the opposite results: they are characterized by lower and more concentrated production.
In general, the response of the treatments is related to the substrate used. Therefore, the treatments were grouped according to the substrate in the biplot (Figure 5). The treatments on the burnt rice husk + organic compost (S3) substrate were the most productive and had the lowest production concentration; that is, the production increased exponentially for a longer period than in the other treatments. The XADP also shows that the production was not very concentrated in these treatments; that is, the treatments produced fruits for a longer period. In contrast, the treatments whose substrate contained sugarcane bagasse had the opposite behavior: they produced few fruits, and their production was concentrated in a short period.

Discussion
Growth models have been widely used to evaluate experiments with multiple-harvest crops. Sari et al. (2018, 2019b) report that the use of ANOVA implies the loss of information about the productive behavior of crops over time. This information loss occurs because researchers pool the observations across harvests so that the assumptions of the ANOVA are fulfilled. The same authors demonstrate that growth models are an alternative to ANOVA for statistical analysis in these cases and increase the inferences that can be made about the productive behavior of these crops.
Due to the recent use of growth models as an alternative for statistical analysis in multiple-harvest crops, the productive behavior of different cultivars of strawberries cultivated on different substrates has never been detailed. Most of the inferences were made based on total production using ANOVA as a statistical analysis tool. The loss of information about productive behavior over time has not allowed researchers to verify, for example, that the substrates interfere with the productive precocity, the concentration of harvests over time and the period in which the production of strawberry fruits is significant. With the use of growth models, it was possible to verify that the productive behavior is related more to the substrate used than to the genotype or origin.
Lower production was observed in treatments with sugar cane residue as the main component of the substrate. A substrate used in soilless strawberry cultivation systems should present physical characteristics that provide appropriate conditions for plant development, mainly regarding the ability to hold water and nutrients. In the case of treatments with lower production, the physical characteristics of the substrate led to this difference. The water saturation content in the substrate can often be up to 50%; however, it can be reduced quickly to less than 10% under a slight water tension increase of only 20 to 40 cm in the water column (Wang, Gabriel, Legard, & Sjulin, 2016). Differences in the parameters estimated by the logistic model can also be verified in work done with Dianthus chinensis L., in which differences between the evaluated substrates were observed (Milani et al., 2016). The same reasoning applies to the $\beta_2$ parameter, for which higher values were found in the treatments with sugarcane bagasse. This indicates that the substrate can influence the start time of fruit production, providing gains or losses in the early production period, when the price per kilogram of fruit can be higher.
In the most productive treatments, that is, those with higher values of $\beta_1$ (T9, T10, T11, T12, and T16), the substrate was a mixture of burnt rice husk and organic substrate; this combination provides an environment that is more conducive to the growth and development of the plants, as observed by (Diel et al., 2017a;. Higher strawberry production due to the use of burnt rice husk + organic compost as a substrate had already been reported by , but the behavior of production over time had not been studied. The growth models showed that production in these treatments started earlier (lower XMAP), ended later (higher XADP) and had exponential growth occurring over a longer period of time (i.e., lower production concentration). On the other hand, the substrates that contained the sugarcane bagasse in the mixture led to lower production (lower values of $\beta_1$) concentrated in shorter periods (higher value of $\beta_3$).
The high production in treatments with burnt rice husk and organic compost is due to the physical characteristics of the substrates; the culture medium influences the yield of strawberry fruits (Sønsteby, Opstad, & Heide, 2013; Wang, Zhu, & Xia, 2012). It can also be observed in this case that, among the most productive treatments, T9 and T10 had significant increases in production for a longer period (evidenced by the greater XADP) (Figure 5). These treatments both used a day-neutral cultivar that does not respond to the photoperiod; therefore, the production was extended for a longer time, which was shown by the results of our analysis. This production dynamic is interesting for fruit producers, since they can offer the fruit for more days of the productive cycle, allowing greater gains and profitability from the productive system.
Through the interpretation of the parameters and the critical points of the models, it is possible to increase the number of inferences that can be made regarding strawberry production. The parameters and critical points become variables that explain strawberry behavior. On the other hand, the increase in the number of variables makes it difficult for the researcher to interpret the results. The use of multivariate techniques allows the researcher not only to verify the relationships among variables but also to discriminate among treatments. Rinaldi, De Lucia, Salvati, and Rea (2014) emphasize that multivariate analysis is a reliable tool for selecting substrates when comparing a large number of criteria; their PCA diagram discriminated four groups of treatments associated with the concentration of compounds in the substrate based on the observed physiological response of rosemary.
The use of PCA allowed us to verify that the production and the concentration of production are related and that the variability in these aspects is more associated with the use of different substrates than with the use of different genotypes and origins, as observed in the biplot. The advantage of using PCA is that it considers the variables that are most informative in the data set, so that the individuals are grouped by similarity (Hongyu, Jorge, & Junior, 2015). Approaches such as those carried out in this study are still rarely used, but they increase the number of inferences that can be made and provide greater elucidation of the results of scientific studies that seek broader and clearer results.
In our case, the use of two methodologies allowed us to make inferences of great interest to the producer. We verified, for example, that strawberry cultivated with substrates containing burnt rice husk and organic compost, regardless of cultivar and origin, produces higher quantities of fruits at the beginning of the harvest. In addition, the increase in production extends over a longer period of time, which is evidenced by the higher value of XADP. Thus, the producer is not held hostage to price oscillations, which are characteristic of this type of fruit production. In contrast, the use of other substrates, in addition to producing less fruit, concentrates fruit production in a shorter period. We could not have drawn these conclusions if we had analyzed the variable mass of fruit per plant with a simple analysis of variance; this shows the power of adequate statistical analysis for drawing conclusions from an experiment.

Conclusion
The use of growth models increases the inferences that can be made about the productive behavior of strawberry, and the critical points are effective for interpreting the precocity, rate and concentration of the harvests. Additionally, multivariate analysis made it possible to verify that crop production and concentration are related and that the variability among treatments occurs due to the substrate. Thus, the joint use of nonlinear models and multivariate analysis allowed the identification of the behavior of strawberry production through the crop cycle in different treatments. This approach can be extended to other multiple-harvest crops in order to understand both treatment effects and production behavior throughout the crop cycle.