Big Data and Sentiment Analysis

COGSA – Big Data Sentiment Analysis Solution

How many times have you woken up in the middle of the night wondering why sales of the new product or service you launched with such fanfare have dwindled? After all, you left no stone unturned in marketing it. In this digital era, where most shopping is done on the internet, feedback is expressed on the internet too. Customers today like to read reviews before buying anything. From mobile phones to movies, cars to holiday packages, everything has a review. A quick search for your product reveals the truth. Among the negative reviews, you find one that is spewing venom at your product and causing ripples across social media.
The person who posted the review is well known on social media for blunt opinions. Right or wrong, this person has a lot of followers who read these posts and can pass the views on to their contacts.
As a business you would like to have an idea of what has been written about you in these forums. So you go about collecting data in various formats from everywhere you can. This huge collection of data is referred to as Big Data.
Big Data is a collection of complex, unstructured information from everything related to or referring to the organization or its products: emails, presentations, websites, social media, TV shows and so on. There is no fixed size above which a data collection is called Big Data. For one organization a few hundred terabytes could count as Big Data, while for another it may take a few hundred petabytes. The sheer size of the data takes it beyond the capability of commonly used software tools to capture, manage and process it in a reasonable time. Although most relational database management systems have evolved to process huge amounts of data, processing Big Data with commonly available tools is still proving to be an insurmountable task.
Typical Uses of Big Data:

  • Helps to find, visualize and understand issues, and improves decision making.
  • Helps to detect fraud and enhances security by highlighting loopholes in technologies and procedures.
  • Gives a complete spectrum of customer opinion about your services or products.
  • Operational analysis identifies bottlenecks and helps to improve business output, thus enhancing productivity.
  • Improves data warehousing capability to store more data and improves operational efficiency.
What is COGSA?

  • COGSA automatically extracts the comments, experiences, emotions and opinions that people (customers) leave on social media about your products and services.
  • People use social media to express their feelings about almost everything, on sites like Twitter or in reviews, comments, blogs and emails.
  • You can track a product or service you have launched by following such people and places, and determine how it is being received by the public on the web.
  • The process gives you the power to know what opinions your customers hold about you and your product.
  • The software also identifies the customer’s attitude towards a brand using variables such as context, tone, mood and emotion.
Benefits of Sentiment Analysis:

  • Product Perception: Gain insight into customers’ sentiments and monitor how the trend changes over time.
  • Flame Detection: Evaluate the procedures and processes that have given rise to negative sentiments.
  • Identify feedback sources to define new marketing targets and enhance product visibility.
  • Reputation Management: Helps to innovate the brand, enhance customer experience with tailor-made solutions and gain a competitive edge in the market.
  • Provides invaluable input for designing next-generation products and services.
Data Preparation: This step involves two stages

    Opinion Detection

  • Before the data can be analysed, it must be prepared so that the applications analysing it can identify key words, phrases and so on. This step has its own challenges. The hardest part of the whole task is linguistic variation across the globe: differences in usage, and different meanings for the same word or phrase in different languages and dialects, make it difficult to identify the nouns and adjectives that can be classified as sentiments.
  • Filter what matters. Remove non-textual content, markup tags and other data that is not required for the analysis.
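The filtering step above can be sketched in a few lines of Python. This is only an illustration that assumes the raw reviews arrive as HTML fragments. The names `clean_review` and `_TextExtractor` are hypothetical and are not part of COGSA.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and skips the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}
    BLOCK_TAGS = {"p", "br", "div", "li"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append(" ")  # keep words in separate blocks apart

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def clean_review(raw):
    """Strip markup tags, scripts, links and extra whitespace from one review."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()  # flush any text still buffered by the parser
    text = "".join(parser.parts)
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"\s+", " ", text)           # collapse whitespace
    return text.strip()


print(clean_review("<p>Great <b>phone</b>!</p><script>var x = 1;</script>"))
```

What counts as "not required for the analysis" depends on the source. Hashtags, emoji or user mentions may carry sentiment and may be worth keeping.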
    Polarity Classification or Sentiment Orientation:

  • Determine the sentiment polarities. Two opposing ones (Positive and Negative) are ideal. Adding more categories lowers accuracy but gives a more varied picture.
  • Each sentiment should fall under exactly one polarity.
  • Carry out the task at different levels: term, phrase, sentence or document.
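A minimal sketch of polarity classification at term, sentence and document level follows. It assumes a tiny hand-made lexicon in which every term has exactly one polarity. The lexicon, the function names and the "neutral" result for ties are all illustrative assumptions, not COGSA's actual model.

```python
import re
from collections import Counter

# Hypothetical lexicon: each term maps to exactly one polarity.
LEXICON = {
    "excellent": "positive",
    "good": "positive",
    "love": "positive",
    "poor": "negative",
    "bad": "negative",
    "slow": "negative",
}


def term_polarities(sentence):
    """Term level: look up the polarity of every known word."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [LEXICON[w] for w in words if w in LEXICON]


def dominant(counts):
    """Return the polarity with more votes; ties are reported as neutral."""
    if counts["positive"] > counts["negative"]:
        return "positive"
    if counts["negative"] > counts["positive"]:
        return "negative"
    return "neutral"


def sentence_polarity(sentence):
    """Sentence level: the dominant term polarity wins."""
    return dominant(Counter(term_polarities(sentence)))


def document_polarity(review):
    """Document level: majority vote over the sentences of the review."""
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    return dominant(Counter(sentence_polarity(s) for s in sentences))


print(sentence_polarity("Excellent screen, good sound, poor battery"))
print(document_polarity("The battery is bad. Delivery was slow. The screen is good."))
```

A word-list lookup like this ignores negation ("not good") and sarcasm, which is one reason real systems also use context and tone.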
    CogSA’s middleware then trains the model to process the data. Here CogSA uses the Hadoop architecture to process this huge volume of data. Hadoop has two major components: the Hadoop Distributed File System (HDFS), a distributed file system that provides high-throughput access to application data, and Hadoop MapReduce, a YARN-based system for parallel processing of large data sets. The data is then transformed, using Hive for queries and Pig for scripts, into a format that can be used for analysis.
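To make the MapReduce idea concrete, here is a small plain-Python simulation of the map, shuffle and reduce phases that counts sentiments per product. It does not use Hadoop's API and runs in a single process. The record layout and function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical input records: (product, polarity) pairs from earlier steps.
RECORDS = [
    ("phone", "positive"),
    ("phone", "negative"),
    ("tablet", "positive"),
    ("phone", "negative"),
]


def map_phase(record):
    """Map: emit one ((product, polarity), 1) pair per record."""
    product, polarity = record
    yield (product, polarity), 1


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one key."""
    return key, sum(values)


def run_job(records):
    """Run the three phases in order and return the totals per key."""
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


print(run_job(RECORDS))
```

On a real cluster the map and reduce functions run in parallel on many nodes over data stored in HDFS, and the framework performs the shuffle between them.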
Two major approaches to classifying reviews:

    Sentiment Orientation (SO) Approach. Involves two tasks:

  • Determine the SO of the sentiment extracted from the opinion, e.g. “Excellent” or “Poor”.
  • Determine the overall sentiment of the sentence or the review in the review analysis step. The sentence is classified as either Positive or Negative depending on the dominant SO in the sentence.

    Machine Learning Approach

  • Works by breaking a review down into words or phrases and using them to classify the review as Positive or Negative.
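As one possible illustration of the machine learning approach, the sketch below trains a small multinomial naive Bayes classifier on word counts. The source does not say which algorithm COGSA uses, so naive Bayes is an assumption chosen for brevity, and the training reviews are made up.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Break a review into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSentiment:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, reviews, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of reviews
        self.vocabulary = set()
        for text, label in zip(reviews, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training carry no information and are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        total_reviews = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_reviews in self.label_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_reviews / total_reviews)  # log prior
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data, for illustration only.
train_reviews = [
    "excellent phone, love the camera",
    "great battery and great screen",
    "poor battery, very slow",
    "terrible screen and bad sound",
]
train_labels = ["Positive", "Positive", "Negative", "Negative"]

model = NaiveBayesSentiment().fit(train_reviews, train_labels)
print(model.predict("love the great camera"))
print(model.predict("slow and terrible battery"))
```

Unlike the SO approach, nothing here is hand-labelled at word level. The classifier learns which words signal each class from reviews that are already labelled.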
Sentiment Prioritization

  • Drill down into negative sentiments and identify the source (writer) of the content.
  • Determine how far the sentiment could have spread.
  • Analyze how many potential customers could have been influenced.
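One way to estimate how far a negative post could have spread is a breadth-first walk over the follower graph. This sketch assumes such a graph is available as a dictionary. The account names, the `max_hops` limit and the function name are hypothetical.

```python
from collections import deque

# Hypothetical follower graph: author -> accounts that see the author's posts.
FOLLOWERS = {
    "critic": ["ann", "bob"],
    "ann": ["carl", "dina"],
    "bob": ["dina"],
    "carl": [],
    "dina": ["eve"],
    "eve": [],
}


def potential_reach(author, followers, max_hops=2):
    """Return the accounts a post could reach within max_hops reshares."""
    reached = set()
    queue = deque([(author, 0)])
    while queue:
        account, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for follower in followers.get(account, []):
            if follower != author and follower not in reached:
                reached.add(follower)
                queue.append((follower, hops + 1))
    return reached


print(sorted(potential_reach("critic", FOLLOWERS)))
```

The size of the returned set is an upper bound on the number of potential customers influenced, since not every follower reads or reshares every post.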
    Reputation:

  • Analyze the contributor’s social media profile to find the average sentiment index, which shows whether the writer criticizes everything as a matter of habit or has a genuine cause.
  • Counter the views politely by stating facts. If there is an issue, commit publicly to resolving it.
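The average sentiment index of a contributor could be computed along these lines. The scoring (+1 for a positive post, -1 for a negative one), the threshold and the minimum number of posts are assumptions for the sake of the example, not values taken from COGSA.

```python
from statistics import mean

# Hypothetical scoring: +1 for a positive post, -1 for a negative one.
SCORE = {"positive": 1, "negative": -1}


def average_sentiment_index(post_polarities):
    """Mean score of a contributor's past posts, from -1.0 to +1.0."""
    return mean(SCORE[p] for p in post_polarities)


def is_habitual_critic(post_polarities, threshold=-0.6, min_posts=5):
    """Flag writers whose history is negative almost across the board."""
    if len(post_polarities) < min_posts:
        return False  # too little history to judge
    return average_sentiment_index(post_polarities) <= threshold


history = ["negative"] * 6
print(average_sentiment_index(history), is_habitual_critic(history))
```

A writer with a balanced history who posts one strongly negative review is more likely to have a genuine cause, and that review deserves a faster response.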
    Intensity:

  • Identify the intensity of each sentiment.
    Upload data to the cloud with Hadoop architecture and Complex Event Processing (CEP)

  • Hadoop Distributed File System (HDFS): A distributed file system that provides high-throughput access to application data.
  • Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.
  • Transform the data into a format that can be used for analysis.
  • Transformation is carried out using MapReduce.
    CogSA mines Big Data from social platforms like Facebook and Twitter and provides data metrics and BI through dashboards. Users can customize the dashboards, and even the data mining workflows, to make better use of the sentiment analytics made possible by COGSA’s transformation of raw Big Data.

    Corporates can now analyze this data using enterprise Business Intelligence tools like PowerPivot, PowerView or Analysis Services to create dashboards that display customer views on products, services or the corporate brand in general. Armed with this information, corporates can identify areas where they need to improve and innovate to provide better services to their customers and strengthen their market presence and hence brand value.

    Please email for more details.