Repeatability coefficients and number of measurements for evaluating traits in strawberry

The objective of this study was to estimate the coefficient of repeatability and the number of measurements required for production and quality variables in a strawberry crop. An experiment was conducted with two strawberry cultivars from two origins grown in four substrate mixtures, totaling 16 treatments, evaluated in a randomized block design with four replications. Mass (MF) and number (NF) of fruits per plant were evaluated as measures of production, and total soluble solids (SST), titratable acidity (AT) and firmness (FIR) of fruits during the crop cycle were evaluated as measures of quality. Subsequently, the repeatability coefficient (r) was estimated by the following methods: analysis of variance (ANOVA), principal component analysis using a correlation matrix (PCcor), principal component analysis using a variance-covariance matrix (PCcov) and structural analysis (SA). The number of measurements was estimated for each studied variable based on determination coefficients of 0.80, 0.85, 0.90, and 0.95. The repeatability coefficients ranged from low to medium. The ANOVA method gave the lowest r values, while the PCcov method gave the highest. When using the PCcov method, 3.6, 2.9, 6.2, 3.2, and 3.8 measurements were needed to reach 80% confidence for the variables MF, NF, SST, AT, and FIR, respectively; these values increased to 17.3, 14.0, 29.6, 15.4, and 18.1 for 95% confidence.


Introduction
With its appreciated taste and organoleptic properties (Šamec et al., 2016), the strawberry (Fragaria × ananassa Duch.) is a preferred crop among rural producers who seek to increase production in various parts of the world, generating employment and income for family farmers. The high consumption of the fruits has also been attributed to their nutritional compounds, which contribute to the reduction of heart disease and chronic diseases, such as cancer (Zhang, Seeram, Lee, Feng, & Heber, 2008).
Performing experiments requires the use of physical, human and financial resources, which are, in most cases, scarce. Generally, when conducting an experiment, researchers use areas with dimensions defined by the budget of their institution, without paying due attention to the minimum plot size required to prevent an increase in the variability of the experiments (Lúcio, Haesbaert, Santos, & Benz, 2011). For the strawberry crop, Cocco, Boligon, Andriolo, Oliveira, and Lorentz (2009) showed that experimental variability is reduced with an increase in the number of plants per plot and that each plot should contain six plants for hydroponic cultivation and ten plants for cultivation in the soil.
Generally, measurements of experimental variables are carried out with the goal of measuring the largest possible number of characteristics that influence the plant in question (Peralta-Zamora, Morais, & Nagata, 2005). In the same way, several factors are evaluated to obtain satisfactory results for a crop; this is often one of the major obstacles in agricultural experimentation because labor is required for such evaluations as well as laboratory and/or field materials, resulting in high expenditures (Lúcio et al., 2011). Measurements of the variables are performed only once for each characteristic, which can reduce experimental accuracy.
On the other hand, when evaluating production in multiple-harvest crops such as strawberries, evaluations are carried out at each harvest, which would theoretically increase the accuracy and reliability of the results (Lúcio & Benz, 2017).
For strawberries, there are no studies in the literature in which the coefficient of repeatability was determined, so the number of measurements needed for each variable to obtain reliable results has not been defined. To study the effect of treatment with melatonin on postharvest strawberry fruits, Liu, Zheng, Sheng, Liu, and Zheng (2018) used three replicates. The same authors used 20 fruits from each replicate for measures of disease incidence, weight loss, color, firmness and total soluble solids content. Yu et al. (2015) carried out 70 harvests during the cycle for the evaluation of disease incidence to determine the yield of strawberry fruits grown in a soilless system. Claire et al. (2018) evaluated a total of 15 plants per treatment four times during plant growth. Generally, fruit production is evaluated by accounting for all harvests during the cycle (Kumar et al., 2011; Morris et al., 2017).
In genetic breeding, the repeatability coefficient is a parameter widely used to quantify whether multiple measurements of a characteristic are the same, expressing the total variance in terms of the contribution of the genotype and environment (Cruz, Regazzi, & Carneiro, 2012). Knowledge of the coefficients of repeatability of the variables allows for the reduction of time and manpower in experiments (Della Bruna, Moreto, & Dalbó, 2012). The coefficient of repeatability can be measured by several methodologies, including analysis of variance (ANOVA), structural analysis, and principal component analysis, the latter being the most precise (Cruz et al., 2012). The coefficient of repeatability can be classified as high (r ≥ 0.60), medium (0.30 < r < 0.60), or low (r ≤ 0.30) (Resende, 2002). Based on the repeatability coefficient, the number of measurements required for sufficient precision can be estimated; the smaller the repeatability coefficient is, the greater the number of measurements required for high precision, and vice versa (Cruz et al., 2012; Resende, 2002).
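The classification thresholds quoted above from Resende (2002) can be written as a small helper. This is only an illustrative sketch; the function name `classify_repeatability` is an assumption introduced here and does not come from the cited sources.

```python
def classify_repeatability(r):
    """Classify a repeatability coefficient with the thresholds quoted
    from Resende (2002): high (r >= 0.60), medium (0.30 < r < 0.60),
    low (r <= 0.30). The function name is hypothetical."""
    if r >= 0.60:
        return "high"
    if r > 0.30:
        return "medium"
    return "low"


# Illustrative values only; they are not results of the experiment.
for value in (0.05, 0.30, 0.45, 0.60):
    print(value, classify_repeatability(value))
```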
Considering the high cost of implementing experiments with strawberry cultures grown on a substrate and the high labor and cost demands associated with experimental measurements and analysis of variables, the objective of this study was to estimate the coefficient of repeatability via the following methods: analysis of variance, principal component analysis using a correlation matrix, principal component analysis using a variance-covariance matrix and structural analysis. Another objective of this study was to determine the number of measurements necessary for estimating characteristics of a strawberry crop grown on a substrate.

Material and methods
The experiment was carried out at the Federal University of Santa Maria (UFSM), Frederico Westphalen campus, located at 27° 23' S, 53° 25' W, at 493 m above sea level. According to the Köppen classification, the climate of the region is Cfa, humid subtropical, with rainy temperate characteristics and an average annual precipitation of 1,800 mm (Alvares, Stape, Sentelhas, De Moraes Gonçalves, & Sparovek, 2013).
The experiment was conducted in a substrate crop system under protected cultivation. The strawberry transplants were planted in 150 μm white tubular plastic bags and kept on wooden benches 0.8 m above the ground. Irrigation was performed by a drip system located inside the bags, which was composed of drip tubes with emitters spaced 0.10 m apart. Fertigation was carried out according to the formula developed by Gonçalves, Vignolo, Antunes, and Reisser Junior (2016), with frequency determined by the requirements of the crop.
Before planting, the substrate mixtures (Table 1) were washed until an electrical conductivity of less than 1 mS cm⁻¹ was reached to render the substrates chemically inert. Two cultivars (Albion and Camarosa) from two origins (domestic and imported) were used.
The experiment was conducted under a randomized block design of 16 treatments (Table 1) with four replications each. The experimental unit was composed of eight plants.
Acta Scientiarum. Agronomy, v. 42, e43357, 2020
At the stage of complete maturation of the fruits, two harvests were performed per week, totaling 42 harvests, with fruits classified as commercial or noncommercial (deformed or less than 6 grams). The commercial fruits harvested in each plot were counted and weighed. Subsequently, the mass and number of fruits per plant were calculated by dividing the total mass and the total number of fruits harvested by the number of plants in the plot.
The fruit quality variables evaluated were total titratable acidity (AT), total soluble solids (SST) and firmness (FIR). These variables were measured during the production cycle to eliminate specific characteristics of the harvest season. Determination of total titratable acidity was performed by titration with a standardized solution of NaOH (0.1 mol L⁻¹), and determination of the total soluble solids was performed with the use of a manual refractometer (± 2% accuracy), with results expressed in °Brix. Firmness was determined with a bench penetrometer with a 6.0 mm tip.
The coefficient of repeatability (r) was obtained by different methodologies. For the method of analysis of variance (ANOVA), the statistical model used was defined by:
$$Y_{ij} = \mu + g_i + a_j + e_{ij}$$
where: $Y_{ij}$ is the experimental observation of the i-th genotype in the j-th environment, $\mu$ is the grand mean, $g_i$ is the effect of the i-th genotype associated with permanent environmental influences, $a_j$ is the effect of the j-th environment, and $e_{ij}$ is the experimental error associated with the i-th genotype in the j-th environment. The coefficient of repeatability is given by:
$$r = \frac{\hat{\sigma}_g^2}{\hat{\sigma}_g^2 + \hat{\sigma}^2}$$
where: $\hat{\sigma}_g^2$ is the variance attributed to the confounding effects of genotype and permanent environment and $\hat{\sigma}^2$ is the residual variance (Cruz et al., 2012).
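The ANOVA estimator described above can be sketched in a few lines. This is a minimal illustration, not the routine implemented in Genes: it assumes a complete table with one value per treatment (rows) and measurement (columns), a two-way layout without replication, and the usual expectation of mean squares in which the variance among treatments is (MS_treatments − MS_residual) divided by the number of measurements. The function name and the demonstration data are hypothetical.

```python
import numpy as np


def repeatability_anova(y):
    """Repeatability from a two-way ANOVA without replication.

    y: array of shape (p, n), p treatments/genotypes by n measurements.
    Assumes complete data; names are illustrative only.
    """
    y = np.asarray(y, dtype=float)
    p, n = y.shape
    grand = y.mean()
    # Sums of squares for treatments (rows) and measurements (columns).
    ss_g = n * np.sum((y.mean(axis=1) - grand) ** 2)
    ss_a = p * np.sum((y.mean(axis=0) - grand) ** 2)
    ss_e = np.sum((y - grand) ** 2) - ss_g - ss_a
    ms_g = ss_g / (p - 1)
    ms_e = ss_e / ((p - 1) * (n - 1))
    # Variance among treatments from the expected mean squares.
    var_g = (ms_g - ms_e) / n
    return var_g / (var_g + ms_e)


# Hypothetical data: 4 treatments measured on 5 occasions.
rng = np.random.default_rng(0)
demo = rng.normal(size=(4, 1)) + rng.normal(scale=0.5, size=(4, 5))
print(repeatability_anova(demo))
```

With purely additive data (no residual term) the residual mean square is zero and the estimate equals 1, which is a convenient sanity check for the implementation.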
The principal component method using the correlation matrix (PCcor) was also used. This method consists of obtaining a correlation matrix between the genotypes in each measurement pair and determining its eigenvalues and normalized eigenvectors. The repeatability coefficient is then estimated using the following formula:
$$r = \frac{\hat{\lambda}_1 - 1}{\eta - 1}$$
where: $\eta$ is the number of periods evaluated and $\hat{\lambda}_1$ is the eigenvalue associated with the eigenvector whose elements have the same sign and similar magnitudes (Abeywardena, 1972).
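A minimal sketch of the correlation-matrix estimator follows. It assumes that the eigenvalue of interest is the largest one, which is the usual case when all measurements are positively correlated; the eigenvector check described by Abeywardena (1972) is not implemented. The function name is hypothetical.

```python
import numpy as np


def repeatability_pc_cor(corr):
    """Repeatability from the correlation matrix among measurements.

    corr: (n, n) symmetric correlation matrix between the n measurements.
    Assumes the relevant eigenvalue is the largest one.
    """
    corr = np.asarray(corr, dtype=float)
    n = corr.shape[0]
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    lam1 = np.linalg.eigvalsh(corr)[-1]
    return (lam1 - 1.0) / (n - 1)


# Check on an equicorrelated matrix, where the estimate should equal
# the common correlation (0.3 here, an arbitrary illustrative value).
n, rho = 5, 0.3
equicorr = (1 - rho) * np.eye(n) + rho * np.ones((n, n))
print(repeatability_pc_cor(equicorr))
```

For field data, the matrix could be obtained with `np.corrcoef(y, rowvar=False)` from a treatments-by-measurements array `y`.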
The principal component method using a covariance matrix (PCcov) is an alternative technique that is applied to the matrix of phenotypic covariances:
$$\Gamma = \sigma_Y^2 \begin{bmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{bmatrix}$$
where: $\sigma_Y^2 = \sigma_g^2 + \sigma^2$ and $\rho = \sigma_g^2 / \sigma_Y^2$. The first eigenvalue is given by $\lambda_1 = \sigma_Y^2 [1 + (\eta - 1)\rho]$. Thus, the repeatability coefficient is given by:
$$r = \frac{\hat{\lambda}_1 - \hat{\sigma}_Y^2}{\hat{\sigma}_Y^2 (\eta - 1)}$$
with $\hat{\lambda}_1$ representing the eigenvalue of $\hat{\Gamma}$ associated with the eigenvector whose elements have the same sign and similar magnitudes.
The method of structural analysis using the correlation matrix (SA) was proposed by Mansour, Nordheim, and Rutledge (1981). In this method, $R$ is the parametric matrix of correlation among the genotypes in each harvest pair, with $\hat{R}$ as its estimator. Thus, the repeatability coefficient was estimated by:
$$r = \frac{\alpha' \hat{R} \alpha - 1}{\eta - 1}$$
where: $\alpha$ is the eigenvector with parametric elements associated with the largest eigenvalue of $R$.
For all estimates, the coefficient of determination (R²) was obtained, together with the minimum number of measurements (m) necessary to predict the actual genotypic value at pre-established magnitudes of R².
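The covariance-based principal component estimator and the structural analysis estimator described above can be sketched as follows. Two assumptions are made here and are not stated in the text: the phenotypic variance is estimated as the mean of the diagonal of the covariance matrix, and the parametric eigenvector used in structural analysis has all elements equal to 1/√η (which is the leading eigenvector of an equicorrelated matrix). Function names are hypothetical.

```python
import numpy as np


def repeatability_pc_cov(cov):
    """Repeatability from the covariance matrix among measurements.

    Assumption: the phenotypic variance is taken as the mean of the
    diagonal, and the relevant eigenvalue is the largest one.
    """
    cov = np.asarray(cov, dtype=float)
    n = cov.shape[0]
    lam1 = np.linalg.eigvalsh(cov)[-1]
    var_y = np.trace(cov) / n
    return (lam1 - var_y) / (var_y * (n - 1))


def repeatability_sa(corr):
    """Structural analysis estimate from the correlation matrix.

    Assumption: the parametric eigenvector has all elements 1/sqrt(n),
    so the estimate equals the mean off-diagonal correlation.
    """
    corr = np.asarray(corr, dtype=float)
    n = corr.shape[0]
    alpha = np.full(n, 1.0 / np.sqrt(n))
    return (alpha @ corr @ alpha - 1.0) / (n - 1)


# Check on an equicorrelated structure (illustrative values only).
n, rho, var_y = 6, 0.4, 2.0
corr = (1 - rho) * np.eye(n) + rho * np.ones((n, n))
print(repeatability_pc_cov(var_y * corr), repeatability_sa(corr))
```

On an exactly equicorrelated matrix both estimators return the common correlation; they diverge on real data, where variances and correlations differ among measurement occasions.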
The coefficient of determination was obtained from the number of evaluations performed ($\eta$) and the estimated coefficient of repeatability (r) according to the following expression:
$$R^2 = \frac{\eta r}{1 + r(\eta - 1)}$$
The minimum number of measurements was calculated as follows:
$$m = \frac{R^2 (1 - r)}{(1 - R^2)\, r}$$
based on coefficients of determination of 0.80, 0.85, 0.90, and 0.95. All analyses were performed using the statistical software Genes (Cruz, 2013).
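The two expressions used in this step can be illustrated with a short sketch. It assumes the standard forms given by Cruz et al. (2012), R² = ηr / [1 + r(η − 1)] and m = R²(1 − r) / [(1 − R²) r]; the function names and the value r = 0.5 are hypothetical and chosen only for illustration.

```python
def determination(r, n):
    """Coefficient of determination reached with n measurements."""
    return n * r / (1.0 + r * (n - 1))


def measurements_needed(r, r2):
    """Number of measurements needed to reach a target R² (as a fraction)."""
    return r2 * (1.0 - r) / ((1.0 - r2) * r)


# Illustrative repeatability, not a result of the experiment.
r_example = 0.5
for target in (0.80, 0.85, 0.90, 0.95):
    m = measurements_needed(r_example, target)
    # Feeding m back into determination() recovers the target R².
    print(target, round(m, 2), round(determination(r_example, m), 2))
```

The two functions are inverses of each other for a fixed r, which is why the last column printed by the loop reproduces the target values.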
Results and discussion
The coefficient of variation (CV) of the variables was low, with the exception of the number of fruits (NF) and fruit mass per plant (MF), which presented a CV above 40%. In the present study, the repeatability coefficients ranged in magnitude from low to medium in the different estimation methods (Resende, 2002). When the repeatability coefficient is high for a particular variable, only one or a few observations are necessary for selection (Turner & Young, 1969).
The analysis of the data revealed a lower value of r for the ANOVA method for all variables, especially the total soluble solids (SST), for which r = 0.05 and the coefficient of determination R² = 15.5%, showing the low capacity of this method in predicting the treatment characteristic. The highest coefficient of repeatability obtained by the ANOVA method (r = 0.49) was estimated for the variable titratable acidity (AT), and a high number of measurements was necessary to select the best treatment with the highest accuracy (Table 2). Therefore, the average number of measurements needed to predict the actual value of each treatment was high. Shimoya et al. (2002) considered reasonable repeatability results to be above 0.5. The higher the repeatability coefficient is, the lower the number of measurements required, since the genetic parameters are repeatable with constant results in different environments (Kumar et al., 1998); the lower the coefficient of repeatability for certain characteristics, the higher the variability in the same cultivar or treatment (Yuan et al., 2016). The ANOVA method, with low repeatability coefficients, provides evidence for low explanatory power when using few measurements. The AT variable presented the highest repeatability coefficient (r = 0.53) when compared to the other variables (Table 2).
Table 2. Coefficient of variation (CV), coefficients of repeatability (r) and coefficients of determination (R²) estimated by the methods of analysis of variance (ANOVA), principal components using the correlation matrix (PCcor), principal components using the covariance matrix (PCcov) and structural analysis (SA) for production and quality variables of strawberry cultivated in substrate.
The low value found for r in the different treatments evaluated may be related to the effect of the crop cycle, biotic and abiotic factors capable of delaying or accelerating the crop cycle, and the effects on the number and weight of the fruits (Tenkouano et al., 2012). Studying the repeatability of Carthamus tinctorius, Mohammadi and Pourdad (2009) observed low repeatability for the parameters of genetic variability.

For the principal component method using the covariance matrix (PCcov), the coefficient of repeatability increased for all variables compared to those found using ANOVA, revealing a higher predictive capacity of the treatments used. The coefficient of determination also increased in all treatments, indicating an increase in precision of the procedure associated with this method. The method performed by structural analysis based on the correlation matrix (SA) showed low values of r, indicating that the number of measurements should be higher to have greater precision of results in different treatments (Table 2). Even though the results of r obtained from the PCcov method were below 0.6 for the strawberry crop, these can be considered reasonable since the coefficient of determination that expresses the prediction of the real value was high (Shimoya et al., 2002). Yuan et al. (2016) observed repeatability values that were considered moderate, and in this case, the contributions of genetic factors and environmental factors were almost equal.
In the present study, the low coefficients of repeatability may be related to the treatments used, since the substrate mixtures have different water retention capacities, and in some treatments (particularly the mixtures with sugar cane bagasse) the beginning of the cycle was delayed because the retained water was not sufficient for the adequate development of the crop. As reported by Yuan et al. (2016), repeatability values between 0 and 0.11 showed the influence of the environment, revealing that, in addition to genetic factors, environmental factors should also be considered. Cruz et al. (2012) explained that, beyond its genetic nature, the repeatability coefficient varies with the environmental conditions under which individuals are maintained.
For low or medium repeatability values, it is understood that there is greater variability in a certain characteristic, which may be caused by climatic variations during certain evaluation periods (Martuscello et al., 2015). In this case, seasons of high temperature or a change in the evaluation period between morning and afternoon can influence such results as the thermal amplitude is high. For this reason, evaluation of the variables at the same times during each evaluation period should be prioritized.
The highest values of the repeatability coefficient were estimated by the principal component method via the covariance matrix (PCcov). Similarly, Lessa et al. (2014) found higher values with the PCcov method when estimating the coefficients of repeatability in banana diploid hybrids. Martuscello et al. (2015) observed higher values of r for the principal component method using the correlation matrix when working with accessions of Panicum maximum. This occurred because the repeatability coefficient estimated by the principal component method has a greater reliability because this method takes into account the entire cycle of the culture (Abeywardena, 1972).
The number of measurements required (m) to achieve different determination coefficients (R 2 ) differed substantially among the evaluation methods for all variables (Figure 1). It was also revealed that an increase in accuracy beyond 95% would increase the number of measurements, and the accuracy increase would be small, not justifying the additional measurements (Martuscello et al., 2015). For the ANOVA method, as expected, because of low values found for the repeatability coefficient, the values necessary for the (m) were high in different R2, ranging from 10.2 to 48.7 for the variable fruit mass. Higher values of (m) were observed for the variable SST, with 65 measurements required for an R2 of 80% and approximately 309 measurements required to achieve 95% accuracy, which would make this type of analysis impracticable (Figure 1). These results are directly related to the value of r = 0.05 expressed for this variable.
The (m) values estimated by the principal component methods were smaller than the values estimated by the ANOVA method. Nevertheless, within this method, the values of (m) when estimated by PCcor were higher than when using PCcov. This result is a reflection of the values found for r in that the lower the r value is, the greater the (m) value that is needed to achieve greater accuracy. For the SA method, the values of (m) were elevated relative to those for PCcov.
Adopting the PCcov method, approximately four measurements were necessary for the variable mass of fruits (MF) at R² = 80%, while 17 measurements were necessary to obtain 95% confidence. Likewise, for the number of fruits (NF), three and 14 measurements were required to obtain 80% and 95% confidence, respectively. For a minimum of 80% confidence, it would be necessary to perform six measurements for the SST variable, three for the AT variable and four for the FIR variable. Martuscello et al. (2015) studied the repeatability and phenotypic stabilization of Panicum maximum accessions and found that total and stem dry matter yields need only five harvests for selection. For plant height, the same authors found that five harvests provided 90% efficiency in plant selection. For dead/senescent forage dry matter yield, between 7 and 25 harvests are necessary. For diploid hybrids of banana, Lessa et al. (2014) determined that approximately 1 to 5 measurements are necessary for the prediction of the values of characteristics related to production, cluster mass, number of leaves, number of fruits and mass of fruits. Rondinelli, Carvalho, Neto, and Miqueloni (2014) determined that 15 evaluations are required to determine the soluble solids trait of sweet orange fruits with 90% certainty, and that 11, 6, 3, 2, and 1 evaluations are required for average fruit mass, total acidity, technological index, juice yield, and SST/AT ratio, respectively.
The productivity of strawberry fruits, quantified in almost all research works with the most diverse cultivars and treatments, is generally carried out during the entire production cycle of the crop. When the objective of the work is to determine the total production of the crop and make comparisons between treatments, it is important to harvest the whole cycle. However, when the objective of the work is to differentiate only treatments, a minimum number of measurements would be necessary to determine this difference, and several evaluations during the cycle would not be necessary (Cruz et al., 2012).
In studies aimed at differentiating treatments, fewer measurements of the production variables would be sufficient to determine which treatment would be the best. In a study of the effect of artificial vernalization of strawberry transplants cv. Albion and Camarosa, fruit production was evaluated throughout the growing cycle, although fewer measurements would likely have been sufficient to determine the influence of vernalization. However, when not all harvests are performed, information on the total production of the crop can be lost. Evaluating the behavior of 11 strawberry cultivars, Diel et al. (2018) measured production throughout the cycle. In this case, when the objective of the study is to evaluate the performance of the cultivar during the cycle, fruits must be harvested over the whole cycle. However, if the objective of the study is only to differentiate cultivars, fewer measurements can be performed, which reduces the manpower, costs and duration of the experiment.

Conclusion
Some important considerations can be made in relation to the repeatability analysis and the number of measurements in strawberry: 1) For the strawberry, the values of r should be estimated by the principal component method, the most efficient being the one that uses the covariance matrix; and 2) The number of measurements defined for the variables evaluated in this study should take into account the objective of the study and the feasibility of the tests. Further studies should be carried out to estimate the coefficient of repeatability and the number of measurements required in other environments and for a greater number of variables measured in the strawberry crop.