SeedQuest - Central information website for the global seed industry

New software from University of Georgia’s College of Agricultural and Environmental Sciences improves accuracy of DNA sequence analysis

Athens, Georgia, USA
March 3, 2022

UGA Assistant Professor Henk den Bakker leans in front of his computer monitor, which displays his Sepia software application.

Researchers from the University of Georgia’s Center for Food Safety have developed software that functions as an important step in improving the accuracy of DNA sequence analysis when testing for microbial contamination.

Sepia is a cutting-edge read classifier, written by College of Agricultural and Environmental Sciences Assistant Professor Henk den Bakker, that is now available as open-source software. It should make genome sequencing much faster for researchers studying bacteria.

Bacterial chromosomes typically range in length from 1.5 million to roughly 9.5 million base pairs, but researchers who want to “read” the individual bases of a genome (the genome sequencing process) can only do so in pieces of 150 to 10,000 base pairs using modern technology. These pieces are called “reads.”
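To put those numbers in perspective, the short calculation below estimates how many reads it would take to span a chromosome just once, end to end. It is a back-of-the-envelope sketch: the function name is illustrative, and real sequencing runs produce overlapping reads at many times this minimum.

```python
def min_reads_to_span(genome_bp: int, read_bp: int) -> int:
    """Smallest number of non-overlapping reads needed to cover genome_bp bases."""
    if genome_bp <= 0 or read_bp <= 0:
        raise ValueError("lengths must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (genome_bp + read_bp - 1) // read_bp


# Chromosome and read lengths quoted in the article.
for genome_bp in (1_500_000, 9_500_000):
    for read_bp in (150, 10_000):
        print(genome_bp, read_bp, min_reads_to_span(genome_bp, read_bp))
```

Even the smallest chromosome in that range needs thousands of the shortest reads, which is why sorting reads quickly matters.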

Now imagine when researchers want to determine what types of microorganisms and viruses are present in a sample — such as in a nasal swab — and sequence the DNA of all those organisms. Presented with a mixture of DNA reads of a plethora of organisms, researchers use a tool called a “read classifier” to quickly sort through the reads and determine to what microorganisms they most likely belong.

Like other read classifiers, den Bakker’s new software works by cross-referencing the information from the sample against existing databases, but it is designed to address challenges posed by potential errors in the taxonomic information available on some microorganisms, or by a switch to a new taxonomic system altogether.
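The general idea of cross-referencing a read against a database can be illustrated with a toy lookup based on short subsequences (k-mers). This is a generic teaching sketch, not Sepia's actual algorithm, which the article does not describe. The sequences, organism labels, and function names are all made up for illustration.

```python
from collections import Counter


def kmers(seq: str, k: int):
    """Yield every overlapping substring of length k."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references: dict, k: int) -> dict:
    """Map each k-mer to the set of reference labels that contain it."""
    index = {}
    for label, seq in references.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add(label)
    return index


def classify(read: str, index: dict, k: int) -> str:
    """Assign the read to the label sharing the most k-mers with it."""
    votes = Counter()
    for km in kmers(read, k):
        for label in index.get(km, ()):
            votes[label] += 1
    if not votes:
        return "unclassified"
    return votes.most_common(1)[0][0]


# Hypothetical reference "database" (not real genomes).
references = {
    "organism_A": "ATGCGTACGTTAGC",
    "organism_B": "TTGACCGGTAACCG",
}
index = build_index(references, k=4)

for read in ("CGTACGTT", "CCGGTAAC", "GGGGGGGG"):
    print(read, "->", classify(read, index, k=4))
```

A read that shares no k-mers with any reference is reported as unclassified, which is the situation an incomplete or mislabeled database creates.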

Because bacteria are often single-celled microorganisms lacking physical distinctions, they are more difficult to classify than more complex organisms, such as mammals or reptiles. Researchers have only recently begun using DNA to determine the taxonomy of microorganisms. This means that the taxonomy used by some of the databases that read classifiers draw on does not always agree with what similarities in DNA tell us.

“Only recently, in the last decade, we began sequencing these organisms and using the genetic data to build taxonomies. That’s very important because when we know things are genetically similar, a read classifier can use that information to make predictions,” den Bakker said.

Using these predictions, when the read classifier discovers an organism that is missing from the database, it can help researchers determine what that unidentified organism is most closely related to by comparing its genetic material to that of known microorganisms, he said.

When writing the software, den Bakker focused on making it user-friendly. Given its wide range of applications, he intentionally made it simple for end users to edit the taxonomy of the databases and correct any errors they find.
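What an editable taxonomy might look like can be sketched with a simple child-to-parent table, where correcting an error means pointing a taxon at a different parent. The article does not describe how Sepia stores its taxonomy, so the data structure, function names, and taxon names below are assumptions chosen only for illustration.

```python
def lineage(parents: dict, taxon: str) -> list:
    """Return the path from taxon up to the root of the taxonomy."""
    path = [taxon]
    seen = {taxon}
    while taxon in parents:
        taxon = parents[taxon]
        if taxon in seen:
            raise ValueError(f"cycle detected at {taxon!r}")
        seen.add(taxon)
        path.append(taxon)
    return path


def reassign(parents: dict, taxon: str, new_parent: str) -> dict:
    """Return a corrected copy in which taxon sits under new_parent."""
    corrected = dict(parents)
    corrected[taxon] = new_parent
    lineage(corrected, taxon)  # raises if the edit introduced a cycle
    return corrected


# Hypothetical taxonomy: species_x was filed under the wrong genus.
parents = {
    "species_x": "genus_a",
    "genus_a": "family_1",
    "genus_b": "family_1",
    "family_1": "root",
}
fixed = reassign(parents, "species_x", "genus_b")

print(lineage(parents, "species_x"))
print(lineage(fixed, "species_x"))
```

Returning a corrected copy leaves the original table untouched, so a mistaken edit is easy to discard.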

To test the software, den Bakker recruited the help of Lee Katz, a bioinformatician with the Centers for Disease Control and Prevention (CDC) and adjunct faculty member with the UGA Center for Food Safety. Katz tested the software on genome contamination checks, in which researchers confirm that they have sequenced only the organism they are interested in and not a mixture of organisms. Based on his findings, Katz has suggested its use to CDC colleagues for metagenomics analysis.

Den Bakker anticipates that the software in its current form will function as a base model onto which he will build additional features. One such upcoming feature is designed to help protect patient confidentiality by removing human DNA from test results. Researchers will then be able to share the results of their research while simultaneously complying with health information privacy laws.
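In principle, once every read carries a classification label, removing human DNA amounts to dropping the reads labeled as human before results are shared. The sketch below shows only that filtering step. The planned Sepia feature is not described in detail, so the input format, the "human" label, and the function name are assumptions.

```python
def drop_host_reads(classified_reads: list, host_label: str = "human"):
    """Split out reads assigned to the host; return kept reads and the number removed."""
    kept = [(read_id, label) for read_id, label in classified_reads
            if label != host_label]
    removed = len(classified_reads) - len(kept)
    return kept, removed


# Hypothetical classifier output: (read identifier, assigned label).
classified_reads = [
    ("read_1", "organism_A"),
    ("read_2", "human"),
    ("read_3", "unclassified"),
    ("read_4", "human"),
]
kept, removed = drop_host_reads(classified_reads)
print(kept, removed)
```

Reporting how many reads were removed gives researchers a quick check that the filter did something without exposing the removed sequences themselves.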

“For me, writing software is also exploring new data structures on a data science level — how to make these things more efficient. Writing it is more or less like starting an experiment in the lab,” den Bakker said.

The software is available now and is free to download on GitHub. More information on Sepia can be found in The Journal of Open Source Software. To hear a discussion on Sepia with Hendrik den Bakker, listen to episode 74 of the “Micro Binfie Podcast,” “Sepia With Henk – Soup Or Salad Yes”!

More news from: University of Georgia

Website: http://www.uga.edu

Published: March 7, 2022
