As genotyping technologies advance, so do their applications. Genomic selection through genotyping by sequencing can be used for breeding: marker data is produced, missing data is filled in across the genome, and models are run on the data to identify genotypes that express desirable traits and thus should go forward in breeding programs. Combined with field observations, genomic selection can provide a powerful lens for choosing good breeding lines.
Many CIMMYT staff from different areas are working on genomic selection, in partnership with scientists from Cornell University, Diversity Arrays Technology Pty Ltd (DArT P/L; Australia), the International Center for Agricultural Research in the Dry Areas (ICARDA), and Kansas State University. On 24 October 2011, a meeting coordinated by Ky Mathews, CIMMYT Biometrician, with assistance from Geneticist/Molecular Breeder Susanne Dreisigacker, brought 24 specialists from CIMMYT and partner institutes to El Batán to enhance communication, share experiences, and identify challenges associated with genomic selection.
A key concern is managing and sharing the huge volumes of data that the approach is expected to generate. “The datasets will grow and grow as the technologies progress,” says Mathews. “CIMMYT and other organizations will need infrastructure and resources to store, analyze, and interpret results. Communication will also be vital, with maize and wheat researchers receiving data for analysis at slightly different times, and with a turn-around time shorter and faster than anything we’ve dealt with before.”
Genotyping by sequencing can produce many markers across the genome (on the order of thousands to millions), but as much as 70% of marker scores may still be missing, so scientists are applying a technique known as “imputation” to fill in the gaps. The technique estimates what the missing values might have been, using the information that is available in the dataset. José Crossa and his team have been developing imputation methods for genotyping by sequencing. He warns that the methods are still in development, and that their accuracy and feasibility for imputing missing biological data are as yet unknown.
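As a rough illustration of the general idea behind imputation, the sketch below fills each missing marker score with the average of the scores observed for that marker in other lines. This is the simplest possible approach and is shown only to make the concept concrete; it is not the method under development at CIMMYT, which the article does not describe. The function name `impute_marker_means`, the 0/1/2 coding of marker scores, and the toy data are all assumptions made for the example.

```python
from statistics import mean


def impute_marker_means(genotypes):
    """Fill missing scores (None) with the mean observed score for that marker.

    `genotypes` is a list of rows, one per breeding line; each row holds one
    score per marker. This is a naive baseline, used here for illustration.
    """
    n_markers = len(genotypes[0])
    marker_means = []
    for j in range(n_markers):
        observed = [row[j] for row in genotypes if row[j] is not None]
        # A marker with no observed scores cannot be estimated; leave it as None.
        marker_means.append(mean(observed) if observed else None)
    return [
        [marker_means[j] if score is None else score for j, score in enumerate(row)]
        for row in genotypes
    ]


# Hypothetical toy data: 3 lines (rows) x 3 markers (columns), coded 0/1/2.
toy = [
    [0, None, 2],
    [2, 1, None],
    [None, 1, 2],
]
filled = impute_marker_means(toy)
```

In this toy case every gap is filled from the two observed scores in the same column. Real genotyping-by-sequencing data, with far more markers and far more missing scores, would call for methods that also use relationships between markers and between lines.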
For now, CIMMYT researchers are running an initial test cycle of genomic selection that should conclude in about eight weeks. Further meetings at that time will review the results, analyze mistakes, and identify lessons learned on all aspects, including imputation.