Optimum environment number for the national sunflower trials network

This work aimed to present the optimum environment number methodology and to propose an optimization of the National Sunflower Trials Network through the exclusion of environments whose removal does not reduce the environmental variability already established. Grain and oil yield data from 16 genotypes evaluated in 16 environments of the National Sunflower Trials Network, obtained from out-of-season trials conducted in 2012 and 2013, were used. An analysis was proposed to establish the optimum number of environments for genotype evaluation, based on genotype performance across the various environmental combinations. Whether an environment is removed from or kept in the experimental network is a dynamic decision, since different environmental combinations affect the representativeness of the complete network differently. The analysis also provides a graphical view of the impact of removing environments from the network. Once points below the established correlation are detected, the researcher can infer the minimum number of environments for the network and, supported by consistent information from several testing years, suggest which environments to exclude.


Introduction
Sunflower (Helianthus annuus L.) is an annual, diploid (2n = 34), allogamous plant native to North America (Kaya, Jocic, & Miladinovic, 2012). It is considered one of the most important oil crops in the world, and more than 90% of its oil production is destined for human consumption (Jocic, Miladinovic, & Kaya, 2015). The crop has important agronomic attributes, such as tolerance to a broad range of drought, cold and heat conditions (Leite, Brighenti, & Castro, 2005), and seeds with about 40% high-quality oil and 25% protein, besides being rich in fiber and low in calories (Paniego et al., 2007).
The main objectives of sunflower breeding are to obtain stable cultivars with higher grain and oil yield for use in feed production and in oil extraction for human consumption (Nobre, Resende, Brandão Junior, Costa, & Morais, 2012). In Brazil, sunflower hybrids and varieties developed in different programs have been evaluated for grain and oil yield by the National Sunflower Trials Network, coordinated by Embrapa. The network, established in different Brazilian states, investigates the interaction between genotypes and environments (G x E interaction) and enables the recommendation of genotypes adapted to the different regions. Selecting and recommending genotypes adapted to the producing regions tends to increase the success of the crop and provide greater economic return (Porto, Carvalho, Pinto, Oliveira, & Oliveira, 2009).
According to Ribeiro, Daros, Caires, and Vasconcellos (2011), establishing experimental networks at different stations throughout the country has been fundamental for cultivar recommendation and agricultural zoning of the crop, as well as for providing information about yield potential in the different regions. Because it gathers information from many trials, the experimental network makes it possible to identify the G x E interaction. A significant interaction can be studied from two perspectives. From the genotype point of view, adaptability and stability analyses describe genotype behavior across the different environments (Grunvald et al., 2014). From the environment point of view, the interaction can be decomposed into its simple and complex fractions, as proposed by Cruz and Castoldi (1991), and the environments then stratified into groups within which the interaction is, at most, simple.
However, when the environments are quite heterogeneous, stratification is difficult and offers no way to reduce expenses in breeding programs. Thus, a complementary and inverse methodology to environment stratification, based on the optimal environment number, could be applied to an experimental network to reduce the number of trials performed. The initial idea is to consider all environments as a single group and remove environments one by one to form smaller groups, which involves a number of analyses that remains feasible with appropriate computational resources. The different environmental combinations would be evaluated by means of a statistic, such as the mean or the Pi statistic of Lin and Binns (1988), and plotted against a pre-established quality value. This value would be a correlation with the complete experimental network, set a priori. From this optimal number, it would be possible to infer which environments could be excluded from or maintained in an experimental network, what the implications of reducing the number of environments would be, and how consistent the genotype recommendation remains.
This work aimed to present the optimum environment number methodology and to propose an optimization of the National Sunflower Trials Network through the exclusion of environments whose removal does not reduce the environmental variability already established.

Material and methods
Grain and oil yield (kg ha⁻¹) data from 16 genotypes (Table 1) evaluated in 16 environments of the National Sunflower Trials Network (Table 2), coordinated by Embrapa, obtained from out-of-season trials conducted in 2012 and 2013, were used. The experiments were carried out in randomized blocks with four replicates. The plots consisted of four rows, each six meters long, with a useful area corresponding to the two central rows, excluding 50 cm at each end. Crop management followed the recommendations for the crop.
Joint analyses of variance were performed, considering the effect of genotypes as fixed and that of environments as random, according to Equation 1:

$$Y_{ijk} = \mu + B/E_{jk} + G_{i} + E_{j} + GE_{ij} + e_{ijk} \qquad (1)$$

where: $Y_{ijk}$ is the value observed for the i-th genotype in the k-th block of the j-th environment; $\mu$ is the overall average of the trials; $B/E_{jk}$ is the effect of block k within environment j; $G_{i}$ is the effect of the i-th genotype; $E_{j}$ is the effect of the j-th environment; $GE_{ij}$ is the effect of the interaction of genotype i with environment j; and $e_{ijk}$ is the experimental error associated with observation $Y_{ijk}$, with $e_{ijk} \sim N(0, \sigma^{2})$.
Since G x E interaction was detected, it was decomposed according to the method proposed by Cruz and Castoldi (1991), which estimates the simple and complex fractions of the G x E interaction (Cruz, Regazzi, & Carneiro, 2012). Pairs of environments in which the complex fraction did not exceed 50% of the interaction were considered similar.
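The pairwise decomposition can be illustrated with a minimal Python sketch. The source does not print the formula, so the expressions below are an assumption: they follow the commonly cited form of the Cruz and Castoldi (1991) partition, in which the complex part is sqrt((1 - r)^3 Q1 Q2), with Q1 and Q2 the genotype mean squares in the two environments and r the correlation between genotype means. The function names are hypothetical and are not part of the Genes software.

```python
import math


def cruz_castoldi(q1, q2, r):
    """Split the pairwise G x E mean square into simple and complex parts.

    q1, q2: genotype mean squares in environments 1 and 2 (both > 0).
    r: correlation between genotype means in the two environments.
    Returns (simple_part, complex_part, complex_percentage).
    Assumed form of the Cruz and Castoldi (1991) partition.
    """
    root = math.sqrt(q1 * q2)
    complex_part = math.sqrt((1.0 - r) ** 3 * q1 * q2)
    k = 1.0 - r - math.sqrt((1.0 - r) ** 3)
    simple_part = 0.5 * (math.sqrt(q1) - math.sqrt(q2)) ** 2 + k * root
    total = simple_part + complex_part
    if total == 0.0:
        # Identical mean squares and r = 1: there is no interaction to split.
        return simple_part, complex_part, 0.0
    return simple_part, complex_part, 100.0 * complex_part / total


def are_similar(q1, q2, r, threshold=50.0):
    """Apply the 50% criterion stated in the text to one environment pair."""
    return cruz_castoldi(q1, q2, r)[2] <= threshold


if __name__ == "__main__":
    # Illustrative values only, not data from the trials.
    for r in (0.9, 0.5, -0.2):
        simple, cplx, pct = cruz_castoldi(100.0, 100.0, r)
        print(f"r={r:+.1f}  complex={pct:6.2f}%  similar={are_similar(100.0, 100.0, r)}")
```

Under this form, a negative correlation between genotype means makes the complex part exceed the total interaction, which is how percentages above 100% can arise.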
In addition to the environmental stratification, we proposed an analysis to establish the optimum environment number, aiming at a better use of resources in an experimental network. The procedure consisted of establishing a classification criterion for genotype performance considering all the environments of the network, based on the Pi statistic of Lin and Binns (1988), and then evaluating this same performance with a smaller number of environments, in all possible combinations (exhaustive search), given by $k = 2^{n} - n - 1$, in which n is the number of environments evaluated. In the present study, with n = 16, k = 65,519 analyses were performed.
The optimal environment number would be the one whose genotype performance agreed, at a minimum correlation level of 90%, with the performance obtained with all the environments of the network. All analyses were performed using the software Genes (Cruz, 2013).
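The procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the routine implemented in Genes: the Pi statistic is written in its usual form, P_i = sum_j (Y_ij - M_j)^2 / (2n), with M_j the best genotype mean in environment j; the agreement measure is assumed to be the Pearson correlation, which the text does not specify; and the count of subsets assumes that combinations of at least two environments are analysed. All function names and the simulated data are hypothetical.

```python
from itertools import combinations

import numpy as np


def lin_binns_pi(y):
    """Pi of each genotype; y has genotypes in rows and environments in columns."""
    y = np.asarray(y, dtype=float)
    gaps = y - y.max(axis=0)  # distance to the best genotype in each environment
    return (gaps ** 2).sum(axis=1) / (2 * y.shape[1])


def n_subsets(n_env):
    """Number of environment subsets with at least two environments."""
    return 2 ** n_env - n_env - 1


def exhaustive_search(y, min_size=2):
    """Correlate Pi from every environment subset with Pi from the full network."""
    y = np.asarray(y, dtype=float)
    n_env = y.shape[1]
    full = lin_binns_pi(y)
    results = {}
    for size in range(min_size, n_env + 1):
        for envs in combinations(range(n_env), size):
            reduced = lin_binns_pi(y[:, list(envs)])
            # Pearson correlation is an assumption of this sketch.
            results[envs] = float(np.corrcoef(full, reduced)[0, 1])
    return results


# Simulated genotype-by-environment means (16 genotypes, 6 environments),
# kept small so the search finishes instantly; these are not the trial data.
rng = np.random.default_rng(0)
y_demo = rng.normal(2000.0, 300.0, size=(16, 6))
results = exhaustive_search(y_demo)
worst = min(results, key=results.get)
print(len(results), "subsets; lowest correlation for environments", worst)
```

With 16 environments, `n_subsets(16)` returns 65,519, which matches the number of analyses reported in the text.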

Results and discussion
Individual analyses of variance revealed a significant effect of genotypes (p < 0.01) for both traits in most environments, showing that genotypic variation was available (Table 3). The coefficients of variation were lower than 20%, compatible with those found for the crop (Porto, Carvalho, Pinto, Oliveira, & Oliveira, 2008; Grunvald et al., 2014). The municipalities of Planaltina and Muzambinho stood out for their high average values for both traits, whereas the lowest averages occurred in Palmas. These results indicate that the different edaphoclimatic conditions affect crop yield. The environments that returned the highest averages are those where sunflower is grown commercially, mainly in the out-of-season period (Leite & Borba Filho, 2014).
Joint analyses of variance showed significant effects of genotypes, environments and G x E interaction (p < 0.01) for both traits studied (Table 4), indicating genotypic variation and environmental diversity, as well as differential behavior of genotypes across the environments, as observed by Grunvald et al. (2014) and Porto et al. (2009). These facts justify the environmental stratification. The coefficients of variation were also compatible with those found for the crop (Porto et al., 2008). Stratification has usually been performed for predictable environmental variations (Allard & Bradshaw, 1964), usually only for the site effect. The effect of years is an unpredictable factor and, therefore, cannot be used for zoning. In this study, half and three quarters of the environment combinations presented a predominantly complex interaction for grain and oil yield, respectively (Table 5), so environment clustering was difficult. Palmas was the only environment that presented a predominantly simple interaction when compared with all others, regardless of the trait evaluated, indicating that it could be excluded from the network without affecting the quality of the network; in addition, the trial conducted in this environment presented low values that contrasted strongly with the others. In some cases, values of the complex part above 100% were found, for example for the pair of environments two and seven (Planaltina 2012 and 2013) for grain yield, suggesting that the correlation between the genotype averages in these environments is negative (Cruz et al., 2012). In this case, non-controllable factors influenced the response in these two trials.
The optimum environment number analysis results from an exhaustive search, so that the environments with greater or lesser impact on network quality are identified at the end of the analysis. In this study the established cutoff correlation was 90%. The evaluation for grain yield showed that removing any single environment did not compromise the representativeness of the network, since the minimum correlation between the complete and the reduced network was 94.07%, obtained when the Muzambinho environment was withdrawn (Table 6).
When two environments were removed from the analysis, the estimated correlation values remained above the established cutoff point. However, the withdrawal of three environments should be judicious to characterize the grain yield, since the combination of the Canarana, Muzambinho, and Vilhena (2013)
For oil yield (Table 7), the removal of trial A of Vilhena 2012 could affect the quality of the environmental network, leading to a correlation of 86.92%, which demonstrates its importance to the network. The analysis of the optimum number of environments allows the selection of the most distinct environments, whose removal would reduce the environmental variability of the network. In this case, the environments Uberlândia, Vilhena A 2012 and Muzambinho are fundamental, as maintaining them resulted in high correlation values even when all others were eliminated. The definition of the most important environments is closely related to the geographical and climatic conditions of each one, since these particularities define the genotype responses.
An important difference between the two traits was that, for grain yield, the removal of any single environment did not reduce the correlation to values below the cutoff point. For oil yield, however, removing the environment represented by trial A of Vilhena 2012, for example, would lead to a drop in correlation. It is therefore important, before deciding to remove sites from an experimental network, to evaluate the behavior of each important trait and to identify which environments have been determinant for it over time. It is also important to evaluate which environmental combination provides the greatest correlation with the data obtained in the complete experimental network.
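The comparison of removal scenarios against the cutoff can be illustrated with a short Python sketch. It is only one possible reading of the procedure: here the genotype mean across environments is used as the performance statistic (the mean is mentioned in the Introduction as an alternative to Pi), agreement is assumed to be the Pearson correlation, and the function name, the summary layout and the simulated data are hypothetical.

```python
from itertools import combinations

import numpy as np


def removal_summary(y, max_removed, cutoff=0.90):
    """For each number of removed environments, report the worst and best subsets.

    y: genotype means, genotypes in rows and environments in columns.
    Returns {n_removed: {"worst": (r, removed), "best": (r, removed),
                         "all_above_cutoff": bool}}.
    """
    y = np.asarray(y, dtype=float)
    n_env = y.shape[1]
    if not 1 <= max_removed <= n_env - 2:
        raise ValueError("keep at least two environments in the reduced network")
    full = y.mean(axis=1)  # genotype mean over the complete network
    summary = {}
    for n_removed in range(1, max_removed + 1):
        scored = []
        for removed in combinations(range(n_env), n_removed):
            kept = [j for j in range(n_env) if j not in removed]
            r = float(np.corrcoef(full, y[:, kept].mean(axis=1))[0, 1])
            scored.append((r, removed))
        worst, best = min(scored), max(scored)
        summary[n_removed] = {
            "worst": worst,
            "best": best,
            "all_above_cutoff": worst[0] >= cutoff,
        }
    return summary


# Simulated means for 16 genotypes in 8 environments (not the trial data).
rng = np.random.default_rng(1)
y_demo = rng.normal(2000.0, 300.0, size=(16, 8))
for n_removed, info in removal_summary(y_demo, max_removed=3).items():
    print(n_removed, round(info["worst"][0], 3), info["worst"][1], info["all_above_cutoff"])
```

The "worst" entry for each removal count corresponds to the kind of point that would fall lowest on the graphical view, while the "best" entry indicates which environments could be dropped with the least loss of agreement. Running the same summary separately for each trait makes differences between traits visible before any site is removed.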

Conclusion
Optimal environment number analysis is a useful tool in breeding programs, since it allows redundant environments to be excluded from their experimental networks.
The environments Palmas, Vilhena, Canarana and Muzambinho must be maintained in the National Sunflower Trials Network, so that grain and oil yield can be well characterized.