Sentiment Analysis 101

In this document we briefly explore the what, why, and how of sentiment analysis – let’s jump straight into it.

So what is Sentiment Analysis?

I would be surprised if you haven’t already come across the term, and you may well know or use it already. It has become increasingly popular with the advent of big data in the context of social networks, and brands frequently use it to monitor and measure ‘chatter’ about themselves and/or their market.

Essentially it offers a way to extract and measure the opinion expressed in this chatter (normally filtered by a specific topic or channel). For example, if you wanted to know the general opinion of your brand, you could analyse all the tweets that mention your brand, classify the opinion of each one, and report the tally.

For a more detailed explanation, check out Wikipedia

Why use it?

This should be fairly obvious (and has already been stated) – it provides a way of monitoring your customers’ opinion of your brand and, thanks to the likes of Twitter and Facebook, this monitoring can be near real-time, giving you up-to-date analysis 24/7 of what people think about you.

Some applications:

  • Comparison between yourself and competitor(s)
  • Monitor performance for a specific campaign
  • Allow you to be proactive and engage with your consumers
  • Identify new insight and opportunity

How does it work?

Imagine you had access to thousands of tweets (this example and our implementation focus on Twitter, but you could use any other source, or more than one), each having been painstakingly categorised as either positive or negative (thankfully, datasets on almost any topic you can think of are available somewhere on the internet, and normally free – for this example we used data from

Now that we have our data, we can harvest it. By this I mean iterating through each tweet (either in parallel, for which MapReduce gives you a nice structure, or sequentially) and throwing out words that add little value, in order to extract features (the elements used for assessment). These low-value words are normally referred to as stopwords; given that a large share of English text is redundant (a figure of 75% is sometimes quoted), this is an important step and removes a lot of noise. Another way to filter out noise is to keep only the words used most frequently within a particular category, provided they are not common across all categories.
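As a rough illustration of the stopword step, here is a minimal Python sketch. The stopword list and the function names (tokenize, remove_stopwords) are made up for this example and are not taken from our implementation; a real stopword list would be much longer.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"i", "the", "to", "on", "a", "an", "and", "is", "it", "this", "of"}

def tokenize(tweet):
    """Lowercase the tweet and split it into word tokens."""
    return re.findall(r"[a-z']+", tweet.lower())

def remove_stopwords(tokens):
    """Keep only the tokens that are not in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]

tokens = remove_stopwords(tokenize("I loved the movie, great to watch on Saturday"))
print(tokens)  # ['loved', 'movie', 'great', 'watch', 'saturday']
```

What is left after this step is the pool of candidate words from which features are chosen.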

Once we have filtered out all the noise, we use the remaining ‘dictionary’ as our feature list. With this we construct a feature set for each tweet, e.g. if our feature list was “bad ugly terrible great good loved” and the tweet was “I loved the movie, great to watch on Saturday”, we would have the feature set “0 0 0 1 0 1”, “positive” (here we also include the category the tweet belongs to).
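The example above can be reproduced with a few lines of Python. This is only a sketch: the function name to_feature_vector is hypothetical, and it assumes a simple present/absent (1/0) encoding as in the example.

```python
import re

# The feature list from the example in the text.
FEATURES = ["bad", "ugly", "terrible", "great", "good", "loved"]

def to_feature_vector(tweet, features=FEATURES):
    """Return 1 for each feature word present in the tweet, otherwise 0."""
    words = set(re.findall(r"[a-z']+", tweet.lower()))
    return [1 if feature in words else 0 for feature in features]

vector = to_feature_vector("I loved the movie, great to watch on Saturday")
example = (vector, "positive")  # pair the vector with its known category
print(example)  # ([0, 0, 0, 1, 0, 1], 'positive')
```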

We can now use our feature sets to train our model. In this implementation we’re using a Naive Bayes Classifier, which is a probabilistic classifier, i.e. given a set of features, the category (positive or negative) is the one found to be most probable based on the training data (the feature sets we just built).
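To make the idea concrete, here is a small from-scratch sketch of a Naive Bayes classifier over 0/1 feature vectors. It is not our implementation: the Bernoulli variant, the add-one (Laplace) smoothing, the function names and the four training examples are all assumptions chosen to keep the example short.

```python
import math
from collections import defaultdict

def train(examples):
    """examples: list of (feature_vector, label). Returns (priors, likelihoods)."""
    n_features = len(examples[0][0])
    label_counts = defaultdict(int)
    feature_counts = {}
    for vector, label in examples:
        label_counts[label] += 1
        counts = feature_counts.setdefault(label, [0] * n_features)
        for i, value in enumerate(vector):
            counts[i] += value
    priors = {label: n / len(examples) for label, n in label_counts.items()}
    # Add-one smoothing so a feature never seen with a label keeps a non-zero probability.
    likelihoods = {
        label: [(c + 1) / (label_counts[label] + 2) for c in feature_counts[label]]
        for label in label_counts
    }
    return priors, likelihoods

def classify(vector, priors, likelihoods):
    """Return the label with the highest log-probability for this vector."""
    scores = {}
    for label, prior in priors.items():
        score = math.log(prior)
        for value, p in zip(vector, likelihoods[label]):
            # Each feature contributes independently: this is the "naive" part.
            score += math.log(p if value else 1 - p)
        scores[label] = score
    return max(scores, key=scores.get)

# Made-up training data over the features: bad ugly terrible great good loved
training = [
    ([0, 0, 0, 1, 0, 1], "positive"),
    ([0, 0, 0, 0, 1, 0], "positive"),
    ([1, 0, 1, 0, 0, 0], "negative"),
    ([0, 1, 0, 0, 0, 0], "negative"),
]
priors, likelihoods = train(training)
print(classify([0, 0, 0, 1, 1, 0], priors, likelihoods))  # positive
print(classify([1, 0, 0, 0, 0, 0], priors, likelihoods))  # negative
```

Working in log-probabilities avoids multiplying many small numbers together, which matters once the feature list grows beyond a handful of words.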

How accurate is it?

That depends on the context you are using it in and how accurate you need it to be (as this will determine which features you extract, the sample size needed, etc.). I read once that the Naive Bayes Classifier can predict almost anything given enough data and the right features. When creating a model, you normally test it by randomly removing 10% of your labelled data before training and reserving it for testing (normally repeated a few times). In our implementation we tested 10 times, each time holding back 10% of the training data, and achieved around 84% accuracy.
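The testing procedure described above (hold back a random 10%, train on the rest, repeat) could look something like the sketch below. The function repeated_holdout and its parameters are hypothetical, and a trivial majority-class "model" stands in for the real classifier so that the example stays self-contained; the data is made up.

```python
import random
from collections import Counter

def repeated_holdout(examples, train_fn, predict_fn, rounds=10, test_fraction=0.1, seed=0):
    """Average accuracy over several random train/test splits."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(rounds):
        shuffled = examples[:]
        rng.shuffle(shuffled)
        n_test = max(1, int(len(shuffled) * test_fraction))
        test, training = shuffled[:n_test], shuffled[n_test:]
        model = train_fn(training)
        correct = sum(1 for features, label in test if predict_fn(model, features) == label)
        accuracies.append(correct / n_test)
    return sum(accuracies) / len(accuracies)

# Stand-in model (an assumption for this sketch): always predict the most common label.
def train_majority(training):
    return Counter(label for _, label in training).most_common(1)[0][0]

def predict_majority(model, features):
    return model

# Made-up labelled data: 16 positive and 4 negative examples.
data = [([1], "positive")] * 16 + [([0], "negative")] * 4
accuracy = repeated_holdout(data, train_majority, predict_majority)
print(f"mean accuracy over 10 rounds: {accuracy:.2f}")
```

Swapping in a real training and prediction function for the two stand-ins gives an accuracy estimate of the kind quoted above.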

It’s worth noting that the Naive Bayes Classifier is called Naive for a good reason: it treats each feature as independent of the others. For example, “don’t see this movie this weekend” and “don’t see any other movie this weekend” obviously mean different things, but if we treat “don’t” as a feature and give it a high weight for the negative category, then both tweets are likely to be classified the same. You could of course use a more sophisticated feature extraction technique that captures two or more words at a time, but given the brevity of tweets, independence with a large training set should suffice.
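For the curious, capturing two words at a time (bigrams) takes only a few lines. This is a sketch with a hypothetical helper name (ngrams), not something our implementation does; it shows which two-word features separate the two example tweets.

```python
def ngrams(tokens, n=2):
    """Return every run of n consecutive tokens, joined with a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

a = "don't see this movie this weekend".split()
b = "don't see any other movie this weekend".split()

only_in_a = set(ngrams(a)) - set(ngrams(b))
only_in_b = set(ngrams(b)) - set(ngrams(a))
print(sorted(only_in_a))  # ['see this', 'this movie']
print(sorted(only_in_b))  # ['any other', 'other movie', 'see any']
```

With single words, the only features that differ between the two tweets are “any” and “other”; with bigrams, each tweet has distinguishing features of its own, such as “see this” and “see any”.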

Why not try it out yourself?


You can have a play here  –

NB: The Sentiment Analysis service developed by We Make Play is an experiment/prototype – it will break and accuracy is variable.