bioRxiv | 2021

Predicting Moisture Content During Maize Nixtamalization Using Machine Learning with NIR Spectroscopy


Abstract


The lack of high-throughput phenotyping systems for determining moisture content during the maize nixtamalization cooking process has made it difficult to breed for this trait. This study provides a high-throughput, quantitative measure of kernel moisture content during nixtamalization based on NIR scanning of uncooked maize kernels. Machine learning was used to develop models that combine NIR spectra with moisture content determined by a scaled-down benchtop cook method. A linear support vector machine (SVM) model with a Spearman’s rank correlation coefficient of 0.852 between wet lab and predicted values was developed from 100 diverse temperate genotypes grown in replicate across two environments. This model was applied to NIR data from 501 diverse temperate genotypes grown in replicate in five environments. Analysis of variance revealed that environment explained the highest percentage of the variation (51.5%), followed by genotype (15.6%) and genotype-by-environment interaction (11.2%). A genome-wide association study identified 26 significant loci across five environments that explained between 5.04% and 16.01% of the variation (average = 10.41%). However, genome-wide markers explained 10.54% to 45.99% (average = 31.68%) of the variation, indicating that the genetic architecture of this trait is likely complex and controlled by many loci of small effect. This study provides a high-throughput method for evaluating moisture content during nixtamalization that is feasible at the scale of a breeding program, and it gives breeders and food companies information about the factors contributing to variation in this trait that can inform future strategies for improving it.

Key Message

Moisture content during nixtamalization can be accurately predicted from NIR spectroscopy when coupled with a support vector machine (SVM) model, is strongly modulated by the environment, and has a complex genetic architecture.
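The model above is scored by Spearman’s rank correlation between wet lab and predicted moisture values. As a minimal sketch of how that statistic is computed, the following uses only the Python standard library. The function names and the six sample values are hypothetical illustrations, not data or code from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical moisture fractions, invented purely for illustration.
measured = [0.412, 0.455, 0.431, 0.470, 0.398, 0.446]   # "wet lab" values
predicted = [0.420, 0.449, 0.441, 0.462, 0.405, 0.452]  # "model" values

rho = spearman(measured, predicted)
print(round(rho, 3))  # 0.943
```

Because the statistic depends only on rank order, it rewards a model that orders genotypes correctly even when the predicted values are offset or rescaled, which suits a setting where lines are ranked for selection. The sketch does not guard against a constant input, for which the correlation is undefined.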

DOI 10.1101/2021.05.19.444884
Language English
Journal bioRxiv

Full Text