
News & Events

Our blog is where you'll find all our project updates, highlights and achievements, as well as other news and events related to iMENTORS.

Posted in Related News

Big data, big opportunities — for science and for enterprise

iSGTW speaks to CERN openlab’s Sverre Jarp, ahead of this month’s inaugural ISC Big Data conference in Heidelberg, Germany. Jarp, who is the general chair of the event, picks his highlights from the conference program and tells us why big data presents exciting opportunities, but also poses challenges, for enterprise and HPC alike. 

 

Perhaps you could start by telling the iSGTW readers a little about the event in your own words…

 

This will be the first-ever ‘ISC Big Data’ conference and we’re looking forward to getting delegates together under the new motto ‘where enterprise and HPC meet’. There’s a lot of potential in bringing people together from these two camps and enabling cross-fertilization between enterprise users and HPC/HTC users.

 

Why is now the right time for a new conference focusing specifically on big data?

The importance of big data is growing so rapidly. Today, you can hardly open a webpage without discovering somebody talking about big data!

Of course, what’s most important is to target the intersection between the experience that already exists in the HPC community and the interest that’s now coming from the business side. People really want to use data in a much more agile way. We’ve all heard about the three ‘V’s of big data (volume, velocity, and variety), so it’s no longer just a question of enterprises producing static data reports. They really want to find new ways of crunching the data and extracting new business-related information from it, which can then be used to improve processes, expand markets, drive customer uptake, and so on.

 

So, it’s fair to say that ‘big data’ is more than just a buzz phrase then?

It’s clearly much more. Over the years, I’ve been at conferences where people have talked about dealing with gigabytes, terabytes, and now petabytes of data. Each time, there may have been a jump in scale of three orders of magnitude, but people have always wanted this same agility. They want to be able to extract new information that they can plough into their business processes, that helps them get a return on their investment, and that gives them competitive advantages.

 

What are the highlights of the conference program for you? Are there any talks, workshops, or other sessions, which you are particularly looking forward to?

I think we’ve put together an outstanding program. I’m really excited about the sheer breadth and depth of the conference. We have speakers from academia, enterprise, and scientific research, as well as a keynote speech planned for each of the two days of the conference. One particularly interesting speaker is Michael Feindt from Blue Yonder, who has taken analytics algorithms from physics and is applying them to business. We even have a speaker from PayPal explaining how the company is now using an HPC approach for fraud detection. It’s great to see that with this approach they’re able to make a significant return on investment. So, although they have installed ‘heavy iron’, it’s actually well worth it in terms of improving early detection of attempts at fraud.

 

As well as the opportunities big data provides, it also poses challenges. Perhaps you could tell the iSGTW readers a little about what you think are the biggest challenges that need to be tackled in order to benefit fully from big data?

Big data is a new paradigm, which means you can no longer just sit in your chair and expect daily sales or transaction reports to come in. With big data, you’re really trying to use the data in a much more complex, combinatorial way, so that you can detect new things. It can be problematic because you’ve got to be sure that your data is correct on the input side, so that you don’t have a situation of ‘garbage in, garbage out’; you’ve got to have your complex algorithms ready for deployment; and you’ve got to know how you’re going to use the data to give your business an advantage.
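The pipeline Jarp outlines — check the input data first, then run the analytics — can be sketched in a few lines. This is purely illustrative (the record fields, the `validate` rule, and the `analyze` helper are all hypothetical, not anything from the interview): it shows the ‘garbage in, garbage out’ guard as a validation step that runs before any aggregation.

```python
# Illustrative sketch: reject bad records before analytics, so that
# "garbage in" never reaches the "garbage out" stage.

def validate(record):
    """A record is usable only if it has a customer id and a non-negative amount."""
    return (
        bool(record.get("customer_id"))
        and record.get("amount") is not None
        and record["amount"] >= 0
    )

def analyze(records):
    """Run the (here trivial) analytics step only on records that passed validation."""
    clean = [r for r in records if validate(r)]
    return {
        "valid": len(clean),
        "rejected": len(records) - len(clean),
        "total": sum(r["amount"] for r in clean),
    }

raw = [
    {"customer_id": "a1", "amount": 120.0},
    {"customer_id": "",   "amount": 50.0},   # missing customer id -> rejected
    {"customer_id": "b2", "amount": -5.0},   # implausible amount  -> rejected
]
print(analyze(raw))  # {'valid': 1, 'rejected': 2, 'total': 120.0}
```

In a real deployment the validation rules and the analytics would each be far more complex, but the ordering is the point: the correctness check sits on the input side, ahead of the algorithms.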

 

And what about big data specifically in a scientific context? Could you maybe expand upon how big data plays a role in research here at CERN?

CERN has always been at the forefront of what you might call ‘physical big data’. We’ve always been pushing disk storage, as well as tape storage, so that we can cope with as much data as is financially defensible. If you look back, we’ve always had huge amounts of data: we had terabytes in the days of LEP. Obviously, a terabyte can now be put on one disk and carried away, so clearly time changes things as you look back. Nevertheless, I think that, if you freeze time at any particular point, you’ll see that we’ve always been right up there with the very, very big guys.

Another aspect, of course, is the complex analytics that needs to take place. It’s not just a question of running a quick skimming exercise with the physics data and saying ‘oh look, we’ve found the Higgs boson’. There are a lot of complex events which need to be processed and the signal always needs to be separated from the background, too. It is this ‘mining’ approach which puts CERN firmly on the big data map.

Yet, CERN isn’t the only research organization dealing with big data. Members of the EIROforum, which is a partnership between eight of Europe’s largest inter-governmental scientific research organizations, have recently collaborated on a report outlining their vision for e-infrastructures in Europe. In this report, they state that ‘the era of data-intensive science has begun’.

CERN, EMBL, and ESA are also collaborating with leading IT providers, from both the public and private sectors, on the Helix Nebula initiative, which aims to create a cloud computing infrastructure to support the massive IT requirements of European scientists. Rupert Lueck will be giving a talk at ISC Big Data ’13 explaining how EMBL, as one of Helix Nebula’s three flagship use cases, has been able to implement a novel cloud service that simplifies large-scale genome analysis. By provisioning tailor-made high-performance computing and bioinformatics resources on demand, the service helps scientists, inside and outside EMBL, better meet the challenges of analyzing large amounts of genomic sequence data.

 

Read more on iSGTW

 

This project has received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no 313203.
Copyright © 2014 iMENTORS. All rights reserved.