UFYMA   27844
UNIDAD DE FITOPATOLOGIA Y MODELIZACION AGRICOLA
Executing Unit - UE
Conferences and scientific meetings
Title:
Comparison of variable selection methods in crop disease from climate
Author(s):
SUAREZ, F.; BALZARINI, M.; GIANNINI KURINA, F.; BRUNO, C.
Location:
Riga
Meeting:
Conference; 31st International Biometric Conference (IBC 2022); 2022
Organizing institution:
International Biometric Society
Abstract:
The high dimensionality and multicollinearity of the variables that characterize climate, which are potentially useful for predicting disease risk in plant pathosystems, require variable selection before a predictive model is fitted. This work presents an evaluation of different variable selection methods for building predictive models from climatic data. To model binary data of disease presence, we implemented two types of models: LR (a likelihood-based linear logistic model) and RF (random forest). Residuals from both strategies were assumed to be independent. We compared the performance of variable selection methods at dismissing irrelevant and redundant predictor variables before fitting each model type. The variable selection methods compared were Filter-based (F), Genetic Algorithm (GA), and Boruta (B). In addition, a stepwise variable selection procedure was applied with LR. Each predictive model was fitted with the covariates selected by the corresponding selection method. The combinations of variable selection method and model type were applied to three datasets, each with more than 1500 sites holding records of disease presence/absence distributed across a region of Argentina, for different pathosystems (Mal de Rio Cuarto virus in maize, Begomovirus in soybean and bean). More than 300 biometeorological variables were built from temperature, precipitation, radiation, and other climatic variables, expressed as averages or sums over different time periods associated with the crop cycles. Results were evaluated in terms of the accuracy, sensitivity, and specificity of site classifications based on the predicted disease probability. The stepwise procedure for variable selection showed the best performance for LR fitting. However, when an RF was fitted, no statistically significant differences were observed among the variable selection methods. We conclude that the optimal variable selection method depends on the type of predictive model to be fitted.
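
The three evaluation criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function name `classification_metrics`, the 0.5 probability cutoff, and the toy data are assumptions made only for illustration, since the abstract does not state how probabilities were turned into presence/absence classes.

```python
def classification_metrics(y_true, prob, threshold=0.5):
    """Accuracy, sensitivity and specificity for binary presence/absence data.

    y_true: observed classes (1 = disease present, 0 = absent).
    prob: predicted disease probabilities, one per site.
    threshold: assumed cutoff for classifying a site as diseased.
    """
    y_pred = [1 if p >= threshold else 0 for p in prob]
    pairs = list(zip(y_true, y_pred))

    # Cells of the 2x2 confusion matrix.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    nan = float("nan")
    return {
        # Share of all sites classified correctly.
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
        # Share of diseased sites that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Share of disease-free sites classified as disease-free.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Hypothetical toy data: 10 sites, 4 with disease present.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
prob = [0.9, 0.6, 0.4, 0.2, 0.7, 0.1, 0.3, 0.05, 0.8, 0.45]

metrics = classification_metrics(y_true, prob)
print(metrics)
```

In a comparison like the one described, such a function would be applied to held-out predictions from each combination of selection method and model type, so that all combinations are scored on the same criteria.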