A NoSQL database helped MetLife build in 90 days the kind of consolidated customer view it had dreamed about for nearly 10 years. Similarly, Constant Contact took three months to build a social marketing services app on NoSQL that would have required nine months to build on a conventional database and with higher levels of ongoing admin work.
Stories like these have me thinking that NoSQL databases don’t get the credit they deserve as practical workhorses of the big-data revolution. That’s just one of the topics I’ll discuss in a series of keynotes at the June 17-19 E2 Conference in Boston.
MetLife chose MongoDB for the document-intensive Wall project, but it’s using other NoSQL databases, including Cassandra, elsewhere within the organization. I’ll discuss tradeoffs and NoSQL-style breakthrough business applications with Bungert and with 10gen VP of corporate strategy Matt Asay.
In contrast to NoSQL, Hadoop seems to be getting all the credit it deserves and then some. By many accounts, it’s the be-all and end-all of big data, despite the fact that the lion’s share of deployments today are little more than digital landfills. All too many organizations have yet to find nuggets of gold in those landfills, yet outfits like Cloudera are declaring that Hadoop is the new “center of gravity” in data management, displacing (though not replacing) the enterprise data warehouse.
Informatica CTO James Markarian will be at E2 to offer one of the most reasonable, nuanced and hype-free perspectives I’ve heard on big data trends, but I do intend to challenge him on whether Hadoop can actually provide the business benefit of dramatically reducing ETL costs.
Saving money is not the sort of inspiring, never-before-possible story we’ve come to expect from big data, however, so I’ll be asking Markarian and Datameer CEO Stefan Groschupf to share their most eye-opening, real-world success stories. Datameer offers tools to find those latent nuggets of gold on Hadoop clusters, and Groschupf confirms that the IT-centric “Hadoop-is-cheaper-than-relational” yarn is no longer enough. We need to get to the business value in big data. That’s why every Hadoop distributor now has a SQL-on-Hadoop strategy, even if most of the related software has yet to be proven or even made available.
Also on the Big Data and Analytics track at E2 is the June 19 presentation on “Fresh Approaches to Data Science.” I’m pretty excited about this one because the panelists include Will Cukierski of Kaggle and Omer Trajman of WibiData. Kaggle has helped Allstate, GE, Merck and plenty of others get to pretty incredible analytics breakthroughs by crowdsourcing their problems. Allstate, for example, gained ideas for risk models that were more than 200% more effective than the insurer’s incumbent champion risk models.
WibiData provides open-source libraries, models and tools that make it easier to store, extract and analyze data on HBase, the Hadoop framework’s increasingly important NoSQL database.
That brings us full circle to the topic of NoSQL versus Hadoop, but it’s not an either-or, one-is-better-than-the-other proposition. Both of these platforms have their place in big data and are usually complementary. My point is strictly about giving credit where credit is due, and it makes a good conversation starter on the topic of big data.
Thank you, TiA