Big Data and Retailers

by Deepak Sharma on Monday, January 16, 2012

Over the last few months, there has been a lot of coverage of how retailers are using Big Data. Wal-Mart, with its recent acquisition of Kosmix, is one retailer at the forefront of this wave. Here is a collection of articles that discuss how Wal-Mart is using Big Data.

How Walmart plans to use Big Data

Kosmix stands out for its ability to search and analyze connections in real-time data streams to deliver highly personalized insights to users. The platform powers TweetBeat, a real-time social media filter for live events. By using this intelligence, Kosmix is building a giant knowledge base called the ‘Social Genome.’ This giant knowledge base captures information and relationships about entities such as people, events, topics, products, locations and organizations.

By analyzing users' social media activity, Social Genome can make recommendations about products, events or any other activity that a user is interested in. For example, by using publicly available social media data, the Walmart product store can suggest products based on recent tweets or Facebook wall posts.

While the idea sounds great, doing this in practice is a huge problem, especially since thousands of pieces of data flow in a torrent from live sources such as tweets, Facebook posts and blogs. The data flowed so fast that Kosmix could not rely on the traditional MapReduce or Hadoop framework that is typically used to solve Big Data problems.

“Social Media data is the fastest growing source of Big Data today. In addition to being Big Data, social media data such as Twitter also has a real-time nature — it’s not just Big Data, but also Fast Data. With mobile devices, location data is now a new source of both Big and Fast data,” explains Rajaraman, on the technical challenges faced by his firm while building the platform.

To address this Big Data and Fast Data problem, Kosmix developed its own in-house solution called Muppet, which processes fast streaming data at lightning speed over large clusters of machines. Today, Muppet can manage and track data streams with billions of updates a day.
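To make the Fast Data idea a little more concrete, here is a minimal, single-machine sketch of processing a stream one event at a time while keeping running state per key, instead of re-running a batch job over all the stored data. This is not Muppet's actual API, which the article does not describe; the function names (map_event, update_counts, process_stream) and the hashtag-counting task are assumptions made purely for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def map_event(post: str) -> Iterator[Tuple[str, int]]:
    """Hypothetical map step: emit a (hashtag, 1) pair for each hashtag in a post."""
    for word in post.split():
        if word.startswith("#") and len(word) > 1:
            yield word.lower(), 1


def update_counts(state: Dict[str, int], key: str, value: int) -> None:
    """Hypothetical update step: fold one emitted pair into the running per-key state."""
    state[key] += value


def process_stream(posts: Iterable[str]) -> Dict[str, int]:
    """Consume posts one at a time, updating state as each one arrives."""
    state: Dict[str, int] = defaultdict(int)
    for post in posts:
        for key, value in map_event(post):
            update_counts(state, key, value)
    return dict(state)


if __name__ == "__main__":
    sample = ["Loving the #SuperBowl ads", "#superbowl halftime! #music"]
    print(process_stream(sample))
```

In a real system the state would be partitioned by key across many machines and the input would never end; the sketch only shows the shape of the per-event update.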

Getting a Handle on Big Data with Hadoop

Wal-Mart Stores, struggling to translate its brick-and-mortar success to the Web, is using free software named after a stuffed elephant to help it gain an edge on Amazon.com in the $165.4 billion U.S. e-commerce market.

As customers flock to social media, Wal-Mart expects sites such as Facebook and Twitter to play a bigger role in online shopping. By analyzing what social network users say about products on those sites, the world’s largest retailer aims to glean insights into what consumers want.

With its online sales less than a fifth of Amazon’s last year, Wal-Mart executives have turned to software called Hadoop that helps businesses quickly and cheaply sift through terabytes or even petabytes of Twitter posts, Facebook updates, and other so-called unstructured data. Hadoop, which is customizable and available free online, was created to analyze raw information better than traditional databases like those from Oracle.

“When the amount of data in the world increases at an exponential rate, analyzing that data and producing intelligence from it becomes very important,” says Anand Rajaraman, senior vice-president of global e-commerce at Wal-Mart and head of @WalmartLabs, the retailer’s division charged with improving its use of the Web.

Big data and the disruption curve

Big data projects are aimed at revenue growth, many efforts are being funded by business units rather than the IT department, and money is increasingly being diverted from large enterprise vendors.