SeedQuest - Central information website for the global seed industry

Between Netflix and Big Data

A recent National Science Foundation grant of $3.8 million will fund development of a general-purpose data storage platform, enabled by the iPlant Collaborative’s community of scientists, developers and educators.

Arizona, USA
September 8, 2015

Creating lots of data in 2015 is rather easy.

Take, for example, a whole human genome, composed of roughly 3 billion DNA base pairs and 20,000 genes. Scientists began sequencing the first human genome in 1990. It took 13 years and $3 billion. Today, for less than $1,000 and in a matter of hours, not weeks or months, it can be sequenced and stored as a gigabyte and a half of data that would fit on a compact disc the size of NSYNC's "No Strings Attached," the best-selling album of 2000.

Scientists amass unprecedented amounts of data in very little time, but they cannot always manage the data as efficiently as they produce it. Syndicate, a four-year big data project led by University of Arizona professor of computer science Larry Peterson, addresses the problem.

Funded by a $3.8 million National Science Foundation research grant, Syndicate will be a general-purpose storage platform for data, adding to the services of the data management infrastructure developed by the UA-led iPlant Collaborative, an all-science computational platform also funded by NSF. Peterson and his team of collaborators hope to bring back a time when the scientist did not also have to be the data management expert. The iPlant Collaborative will provide both the infrastructure to integrate Syndicate and the user community to pilot test the platform in its array of potential uses.

The conversation is no longer about whether scientists can turn out big data. It's about how it can be managed.

In order to build on each other's research, scientists must be able to share their data, and this does happen. But not always fast. Sending hundreds of terabytes from far-away origin servers (say, from Tucson to Beijing) can take so much time that the data becomes stale as it's passed from one research lab to another.
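To give a rough sense of scale, the short Python sketch below estimates how long a bulk transfer takes at a sustained network rate. The dataset size and the link speeds are illustrative assumptions, not figures from the Syndicate project, and real transfers are usually slower because of protocol overhead and congestion.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(size_terabytes: float, megabits_per_second: float) -> float:
    """Days needed to move a dataset at a constant, sustained rate.

    Uses decimal units: 1 TB = 10**12 bytes, 1 Mbit/s = 10**6 bits/s.
    """
    total_bits = size_terabytes * 10**12 * 8
    seconds = total_bits / (megabits_per_second * 10**6)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical link speeds, chosen only for illustration.
    for rate in (100, 300, 1000):
        days = transfer_days(100, rate)
        print(f"100 TB at {rate} Mbit/s: about {days:.1f} days")
```

Under these assumptions, even a dedicated 1 Gbit/s link needs more than a week to move 100 terabytes, which is consistent with the concern that data can go stale in transit.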

"If you're dealing with large datasets, the data changes. Computations happen," Peterson said.

Syndicate aims to make sharing faster, so scientists will receive only the freshest version of a dataset. The ability to more easily store and manage large amounts of data with a platform such as Syndicate will in turn make collaboration among scientists easier.

"We're trying to wean scientists off having their own local hardware, and help them tap into resources that are worldwide," Peterson said.

Slow-going data transfer is only part of the problem; currently, managing a large dataset also requires significant user involvement. Syndicate will address this, as it is designed for self-management. For example, users no longer will have to manually and individually dole out passkeys.

As it stands, according to Peterson, "Privacy can sometimes be a nightmare."

The goal is to be minimally disruptive in the process, by creating a system that utilizes many of the same cloud storage services scientists already use, such as Google Drive and Dropbox.

The crown jewel of the system is the same technology Netflix and Amazon Prime use to transfer television episodes and films: content distribution networks, or CDNs. Using CDNs, Syndicate will pull large datasets from an origin server and put them all over the globe. This way, the scientist in Beijing will not have to wait a month for data from an origin server in Tucson, because it also will be hosted somewhere closer, such as Tokyo.

Essentially, CDNs don't move big data faster, but they bring it closer so it is received sooner.
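The idea of serving a reader from the closest copy can be sketched in a few lines of Python. This is only an illustration, assuming that geographic distance stands in for network proximity; production CDNs typically rely on DNS, routing and measured latency instead, and nothing here describes how Syndicate itself selects a replica. The coordinates are approximate and the function names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

# Approximate (latitude, longitude) pairs, for illustration only.
BEIJING = (39.90, 116.40)
REPLICAS = {
    "Tucson": (32.22, -110.97),  # origin server in the article's example
    "Tokyo": (35.68, 139.69),    # nearby copy in the article's example
}


def great_circle_km(a, b):
    """Haversine distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def nearest_replica(client, replicas):
    """Return the name of the replica closest to the client."""
    return min(replicas, key=lambda name: great_circle_km(client, replicas[name]))


if __name__ == "__main__":
    print(nearest_replica(BEIJING, REPLICAS))
```

The link speed is the same in both cases; the reader in Beijing simply has far less distance to cover when the copy sits in Tokyo.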

"CDNs are really common for video but they haven't been used a lot for big data," Peterson said.

Why?

"They're a challenge," he said. "Today, CDNs are typically used for (files) that don't change."

His team will have to integrate an element that allows the data to change due to computation — no small feat.
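One common way to cache data that changes is to attach a version number to each dataset and have the nearby cache confirm that number before serving its copy. The Python sketch below illustrates that general technique with hypothetical `Origin` and `EdgeCache` classes. It is not a description of Syndicate's design, which the article does not detail.

```python
class Origin:
    """Authoritative store; each write bumps the dataset's version."""

    def __init__(self):
        self._store = {}       # name -> (version, data)
        self.full_fetches = 0  # counts expensive full-data transfers

    def write(self, name, data):
        previous_version = self._store.get(name, (0, None))[0]
        self._store[name] = (previous_version + 1, data)

    def current_version(self, name):
        # Cheap metadata lookup: no dataset bytes are transferred.
        return self._store[name][0]

    def fetch(self, name):
        # Expensive: returns the version together with the full data.
        self.full_fetches += 1
        return self._store[name]


class EdgeCache:
    """Nearby cache that re-fetches only when its copy is out of date."""

    def __init__(self, origin):
        self._origin = origin
        self._cached = {}  # name -> (version, data)

    def read(self, name):
        latest = self._origin.current_version(name)
        cached = self._cached.get(name)
        if cached is None or cached[0] != latest:
            cached = self._origin.fetch(name)
            self._cached[name] = cached
        return cached[1]


if __name__ == "__main__":
    origin = Origin()
    origin.write("genome", b"first result")
    edge = EdgeCache(origin)
    edge.read("genome")                        # first read fills the cache
    edge.read("genome")                        # served from the cache
    origin.write("genome", b"updated result")  # a computation changes the data
    print(edge.read("genome"), origin.full_fetches)
```

The trade-off in this sketch is one small version check per read in exchange for never serving a stale copy; a real system would also have to handle concurrent writers and partial updates.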

Peterson is hoping to deploy a pilot version of the Syndicate platform by the end of fall. The iPlant Collaborative will provide the community of scientists, developers, and educators necessary to ensure the platform is capable of translational use. Additional pilot users will include the M-Lab Consortium, of which Peterson is a founding member, and scientists who will house data from a clinical study in a Syndicate cloud.

The project brings together collaborators from across the UA campus, including Nirav Merchant of the iPlant Collaborative and Arizona Research Laboratories; John Hartman of UA computer science; Anita Bhappu of the Norton School of Family and Consumer Sciences; and Bonnie Hurwitz of the College of Agriculture and Life Sciences, creator of the iMicrobe project, an affiliate of iPlant.

More news from: University of Arizona

Website: http://www.arizona.edu/

Published: September 9, 2015

The news item on this page is copyright by the organization where it originated