News & Events

Our blog is where you'll find all our project updates, highlights and achievements, as well as other news and events related to iMENTORS.

Posted in Related News

Preparing for tomorrow's big data

According to an article published on iSGTW, the inaugural ISC Big Data conference was held last week in Heidelberg, Germany. The event was chaired by Sverre Jarp, chief technology officer of CERN openlab, and CERN was the focus of two case studies presented during the two-day conference. Frank Würthwein, from the University of California at San Diego, US, discussed how CERN handles big data today and looked ahead to how the organization will have to adapt these processes to cope with increased peak data rates from the experiments on the Large Hadron Collider (LHC) after upgrade works are completed as part of the first long shutdown (LS1).

Until recently, the large CERN experiments, ATLAS and CMS, owned and controlled the computing infrastructure they operated on in the US, and accessed data only when it was locally available on the hardware they operated. However, Würthwein explains, with data-taking rates set to increase dramatically by the end of LS1 in 2015, the current operational model is no longer viable to satisfy peak processing needs. Instead, he argues, large-scale processing centers need to be created dynamically to cope with spikes in demand. To this end, Würthwein and colleagues carried out a successful proof-of-concept study, in which the Gordon Supercomputer at the San Diego Supercomputer Center was dynamically and seamlessly integrated into the CMS production system to process a 125-terabyte data set.

CERN’s Pierre Vande Vyvre also gave a presentation at the event in which he discussed the role of fast custom electronic devices in filtering out much of the data produced by scientific experiments such as those at CERN, so as to make the data more manageable. Currently, just 1% of data from collision events in the LHC is selected for analysis. “The big science workflows are mainly data reduction,” says Vande Vyvre. “The archiving of raw data is not the de facto standard anymore.” He predicts that next-generation science experiments will increasingly reduce the role of these custom devices and will instead entrust the processing of the complete data sets to standard computing algorithms. “One of the main problems faced by scientific experiments today is that lots of legacy software is in use that hasn’t been designed for the big data paradigm,” says Vande Vyvre. “Tighter integration between data movement, scheduling and services is needed.”
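The data-reduction workflow Vande Vyvre describes, in which only about 1% of collision events survive filtering, can be illustrated with a toy sketch. This is not CERN's actual trigger system: the event structure, the "energy" field, and the cut threshold are all invented for illustration, chosen so that roughly one event in a hundred passes.

```python
import random

# Illustrative threshold: ~99th percentile of a standard normal distribution,
# so that roughly 1% of randomly generated events survive the cut.
TRIGGER_THRESHOLD = 2.326

def passes_trigger(event):
    """Toy selection rule: keep an event only if its 'energy' exceeds
    the threshold. Field name and cut are purely illustrative."""
    return event["energy"] > TRIGGER_THRESHOLD

def reduce_stream(events):
    """Filter an event stream, keeping only events that pass the trigger."""
    return [e for e in events if passes_trigger(e)]

random.seed(42)
events = [{"energy": random.gauss(0, 1)} for _ in range(100_000)]
kept = reduce_stream(events)
print(f"kept {len(kept)} of {len(events)} events "
      f"({100 * len(kept) / len(events):.2f}%)")
```

The real systems do this at hardware speed on custom electronics before data ever reaches general-purpose storage, which is precisely the stage Vande Vyvre expects standard computing to absorb in next-generation experiments.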

Read more on iSGTW

- Andrew Purcell

This project has received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no 313203.
Copyright © 2014 iMENTORS. All rights reserved.