NoSQL Vs. Hadoop: Big Data Spotlight At E2

Hadoop is the panacea, while NoSQL databases are the unsung heroes. Execs from MetLife, 10gen, Informatica and Datameer discuss platform envy at E2 in Boston.

By Doug Henschen (courtesy InformationWeek)

5 Big Wishes For Big Data Deployments

A NoSQL database helped MetLife build in 90 days the kind of consolidated customer view it had dreamed about for nearly 10 years. Similarly, Constant Contact took three months to build a social marketing services app on NoSQL that would have taken nine months to build on a conventional database, with more ongoing admin work.

Stories like these have me thinking that NoSQL databases don’t get the credit they deserve as practical workhorses of the big-data revolution. That’s just one of the topics I’ll discuss in a series of keynotes at the June 17-19 E2 Conference in Boston.

The emphasis of three “fireside chats” on June 18 will be on getting to the business value in big data. Who better to kick off the conversation than John Bungert, the MetLife executive who led the insurer’s effort to build a Facebook-Wall-style interface for customer service? The application brings together data from more than 70 disparate administrative systems, claims systems and other data sources, and it rolled out to 200 call-center and claims admin staff in April. The Wall is expected to reach 3,000 more employees this summer, and there are plans to eventually add customer-self-service capabilities.

MetLife chose MongoDB for the document-intensive Wall project, but it’s using other NoSQL databases, including Cassandra, elsewhere within the organization. I’ll discuss tradeoffs and NoSQL-style breakthrough business applications with Bungert and with 10gen VP of corporate strategy Matt Asay.

[ Want more on bold claims about Hadoop? Read Cloudera Declares End Of Data Warehousing Era. ]

In contrast to NoSQL, Hadoop seems to be getting all the credit it deserves and then some. By many accounts, it’s the be-all and end-all of big data, despite the fact that the lion’s share of deployments today are little more than digital landfills. All too many organizations have yet to find nuggets of gold in those landfills, yet outfits like Cloudera are declaring that Hadoop is the new “center of gravity” in data management, displacing (though not replacing) the enterprise data warehouse.

Informatica CTO James Markarian will be at E2 to offer one of the most reasonable, nuanced and hype-free perspectives I’ve heard on big data trends, but I do intend to challenge him on whether Hadoop can actually provide the business benefit of dramatically reducing ETL costs.

Saving money is not the sort of inspiring, never-before-possible story we’ve come to expect from big data, however, so I’ll be asking Markarian and Datameer CEO Stefan Groschupf to share their most eye-opening, real-world success stories. Datameer offers tools to find those latent nuggets of gold on Hadoop clusters, and Groschupf confirms that the IT-centric “Hadoop-is-cheaper-than-relational” yarn is no longer enough. We need to get to the business value in big data. That’s why every Hadoop distributor now has a SQL-on-Hadoop strategy, even if most of the related software has yet to be proven or made generally available.

Also on the Big Data and Analytics track at E2 is the June 19 presentation on “Fresh Approaches to Data Science.” I’m pretty excited about this one because the panelists include Will Cukierski of Kaggle and Omer Trajman of WibiData. Kaggle has helped Allstate, GE, Merck and plenty of others get to pretty incredible analytics breakthroughs by crowdsourcing their problems. Allstate, for example, gained ideas for risk models that were more than 200% more effective than the insurer’s incumbent champion risk models.

WibiData provides open-source libraries, models and tools that make it easier to store, extract and analyze data on HBase, the Hadoop framework’s increasingly important NoSQL database.

That brings us full circle to the topic of NoSQL versus Hadoop, but it’s not an either-or, one-is-better-than-the-other proposition. Both of these platforms have their place in big data and are usually complementary. It’s strictly an observation about giving credit where credit is due and a good conversation starter on the topic of big data.

Thank you, TiA



This entry was posted on June 18, 2013 in BIG DATA.
