Tuesday, 10 April 2012

The 3rd London Mayoral Election Twitter Poll

Click on pictures to enlarge.

Poll findings
1) Manifesto launches are a bit like birthdays for political candidates, except that they hand out the presents. I was really happy that the poll picked up on this in Brian Paddick's Twitter presence. Volume up, positive tweets up, negative tweets down. This is Mr Paddick's best poll by some way.

2) Perhaps the tax storm is dying down, as Ken's negative rating is down quite strongly. The fact that he's level pegging with Boris on sentiment, if not quite on volume, is something of a vindication, as the ComRes poll had them not too far apart. Boris's stats are roughly back where they were two days ago.

3) What to do about Siobhan Benita? That is the question that will be taxing the minds of many involved in the London Mayoral Election. ComRes had her nowhere. Twitter says different. We'll see who's right on election day. I don't think she has the kind of ground operation that the main parties have, and if she starts to pose a threat to the other candidates she can expect a rougher ride than she's had up to now. ComRes doesn't mention her name in the polling question, which it does for the 3 main party candidates, and that could be a source of bias. But Twitter is, well, Twitter. I'd expect her to get 3-5% at the moment, but if she keeps gaining momentum then that could increase.

4) Jenny Jones has polled a little worse today but it's not a huge variation. Her volume was up slightly but she is some way behind the others who made the cut for our poll.


5) Talking of the cut, UKIP and the BNP didn't make it today. It was good to find @UKIPWebb4London, though, and we look forward to mentioning him more often should he get over 20 proper mentions. Not a high bar by any means.

Feel free to compare with yesterday's poll, which you can find here.

How does the poll work?

Tweets are collected from Twitter and counted to give the volume figures. They are then classified by the sentiment package, an add-on to the R programming language that I use for this. They're classified based on the content of the tweet. So something like "Love @mayoroflondon he's brilliant" would end up in the positive pile while "I'm going to rip Boris Johnson's ugly evil head off if no bus in 30secs" would end up in the negative pile. If it's not quite so obvious then there's the neutral category.
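To give a feel for the three piles, here is a rough sketch in base R. It is not the sentiment package's own algorithm: it is a toy word-counting stand-in, and the word lists, the classify_tweet function and the third (neutral) tweet are all made up for illustration.

```r
# Toy word lists: hypothetical, chosen only to cover the example tweets.
positive_words <- c("love", "brilliant", "great", "good")
negative_words <- c("ugly", "evil", "rip", "awful", "bad")

classify_tweet <- function(text) {
  # Lower-case the tweet and split it on anything that is not a letter or apostrophe.
  words <- strsplit(tolower(text), "[^a-z']+")[[1]]
  # Score = number of positive words minus number of negative words.
  score <- sum(words %in% positive_words) - sum(words %in% negative_words)
  if (score > 0) "positive" else if (score < 0) "negative" else "neutral"
}

tweets <- c(
  "Love @mayoroflondon he's brilliant",
  "I'm going to rip Boris Johnson's ugly evil head off if no bus in 30secs",
  "Boris Johnson is speaking tonight"  # made-up neutral example
)

labels <- vapply(tweets, classify_tweet, character(1), USE.NAMES = FALSE)

# Count how many tweets landed in each pile.
table(factor(labels, levels = c("positive", "neutral", "negative")))
```

A real classifier would use a much larger lexicon or a trained model, but the counting step at the end is the same idea as the poll's positive, neutral and negative columns.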

Poll figures


      Candidate Pos3 Neut3 Neg3 Tot3 Pospercent3 Negpercent3
1       Ken Lab  244   648  386 1278          19          30
2     Boris Con  259   681  407 1347          19          30
3   Jenny Green   63    80   58  201          31          29
4 Brian Lib Dem  200   202   86  488          41          18
5   Siobhan Ind  166   205   55  426          39          13
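The total and percentage columns follow directly from the three counts. A minimal sketch in base R, assuming the percentages are simply each count divided by the candidate's total and rounded to a whole number (the data frame name poll3 is my own choice for the example):

```r
# Counts copied from the table above.
poll3 <- data.frame(
  Candidate = c("Ken Lab", "Boris Con", "Jenny Green", "Brian Lib Dem", "Siobhan Ind"),
  Pos3  = c(244, 259, 63, 200, 166),
  Neut3 = c(648, 681, 80, 202, 205),
  Neg3  = c(386, 407, 58, 86, 55)
)

# Total tweets per candidate, then positive and negative shares as whole percentages.
poll3$Tot3        <- poll3$Pos3 + poll3$Neut3 + poll3$Neg3
poll3$Pospercent3 <- round(poll3$Pos3 / poll3$Tot3 * 100, 0)
poll3$Negpercent3 <- round(poll3$Neg3 / poll3$Tot3 * 100, 0)

poll3
```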


Where is Boris's 100,000 strong Twitter Army?

I ask the question after reading this interesting interview on the Standard website. In the interview, given to Pippa Crerar, Boris's campaign manager Lynton Crosby states:

"This time the huge supporters’ army, which has swelled to more than 100,000, is key. Each of the “captains” runs 25 other volunteers, leafleting, door-knocking and tweeting in support of the Mayor. There are tens of thousands more supporters online."

I'd like to ask where these supporters are, because if there are tens of thousands of them they're not tweeting that often. Perhaps the definition of "supporter" is being rather stretched here: perhaps it's a Facebook like or an old email address given in 2007. I understand there's an election on but the evidence really doesn't support the claim.

Another interesting point Crosby made is that the campaign has 347 "ward" captains. Either these wards are bigger than the normal wards in London local government or the campaign is only targeting some areas, as there are a lot more wards in London than that.

In other news I've found the Twitter account of the UKIP candidate, which is handy as polling will commence shortly.


Monday, 9 April 2012

The 2nd London Mayoral Election Twitter Poll





Click on pictures to enlarge.


Poll findings
1) Boris had a good day today: he got loads more tweets than anybody else, in fact more than everyone else combined. His positive rating hasn't really changed, 22% today rather than 21% over Saturday and Sunday, but his negative rating came down from 28% to 15%. That's less than a third of his main rival's.

2) Ken had a bad day: he got more tweets but they were more negative, 50% as opposed to 40% on Sat & Sun. Numerically Ken's positive tweets have hardly changed but they form a smaller %.

3) Siobhan Benita got more tweets than either the Greens or the Lib Dems and has grown in daily volume. She's wrested the mantle of having the largest % of positive tweets from Jenny Jones. There is definitely some momentum here.

4) Jenny Jones had a few more tweets today but essentially not much has changed in the Twittersphere's positive view of her campaign.

5) Brian Paddick has both his negative and positive %'s down, while his daily volume is slightly up. Still, 5th out of 6 isn't anything to write home about.

6) Some people actually mentioned the BNP so they've made the poll cut with a whole 22 tweets. Not much should be read into the figures as the sample size is so small. Not expecting them to get anywhere.


7) Overall I think Ken's troubles mean there is some leakage of the anti-Boris vote to the Independent Siobhan Benita. Boris is sitting pretty even if he needs to find £5million for homeless charities and Chris Addison called him an idiot. Strangely I don't think that's going to hurt him too much. I think Boris is viewed along the lines of "they're all idiots, but he's funny and he's our idiot". Otherwise a relatively quiet day on the trail.

How does this poll work?

This poll is based on tweets made on the 9th April 2012 up until 10pm.



     Candidate Positive2 Neutral2 Negative2 totalpoll2
1       Ken Lab       138      284       417        839
2     Boris Con       338      969       234       1541
3   Jenny Green        63       74        46        183
4 Brian Lib Dem        16       54        31        101
5   Siobhan IND       110      113        48        271
6    Carlos BNP         2       14         6         22
  Positive1 Neutral1 Negative1 Totalpoll1 Pospercent2
1       130      210       229        569          16
2       101      252       136        489          22
3        98      146        65        309          34
4        33       71        60        164          16
5        87      179        67        333          41
6         0        0         0          0           9
  Negpercent2 Negpercent1 Pospercent1
1          50          40          23
2          15          28          21
3          25          21          32
4          31          37          20
5          18          20          26
6          27         NaN         NaN

So Who Won The Weekend #Hashtag Wars?


Click on the picture for a better view.

The numbers for the hashtags were lower than I expected but then people are on holiday. It may also be due to political hashtags being mainly used by political activists.

Boris clearly has more support amongst the hashtag users of Twitter. It'll be interesting to see if this is borne out by the main poll later. Perhaps it's a result of the spring in the step of the Boris campaign.

Ken has had a rough few days over the tax issue and I think some of his support is looking for other options, at least on the first round. Hence some of the success of Siobhan Benita. There is still a sizable anti-Boris vote but Ken needs to up his game and move on from the tax issue if he's going to pull off victory.

Brian Paddick needs a new hashtag. He was the only tweeter to use it over the weekend. Someone else used it after I scraped it, but even if I upped the count to 3 it doesn't say to me that he'll be the next Mayor of London.

Please feel free to suggest more Mayor-related hashtags, especially for the parties not represented here. Their absence is due to an inability to find their hashtags, if they have any, rather than any reluctance to cover them.

These tweets were sent from the 7th April until about 6pm on the 9th.

Sunday, 8 April 2012

The First London Mayoral Election Twitter Poll


 Click on pictures to enlarge.

Poll findings

1) Independent Siobhan Benita is polling third on Twitter slightly ahead of the Greens.

2) The three establishment parties with male candidates all have more negative tweets than positive. The Independent and Green female candidates have more positive ratings.

3) Ken Livingstone has the largest number of tweets about him and is the most polarising figure, with more positive and more negative tweets about him than anyone else. The number of his negative tweets is significantly above Boris's.

4) Boris Johnson isn't that far behind Ken Livingstone on volume but hasn't got his higher number of negative tweets. It's hard to say that Johnson is ahead but his lower negative rating than Ken will be a source of pleasure for his campaign.

5) Jenny Jones is doing better than the Lib Dems and, along with the Independent Siobhan Benita, has more positive than negative tweets.

6) Brian Paddick has fallen behind both the Green Jenny Jones and Independent Siobhan Benita and has the lowest number of positive tweets of any of the 5 candidates featured in the poll. This is not a great performance from the Lib Dems.

7) Both the BNP and UKIP clearly aren't putting enough effort in as I only found 1 tweet about each of their candidates. So they have been excluded from this poll.

How does this poll work?

I scraped Twitter for tweets containing either the name of the candidate, eg "Ken Livingstone", or their main Twitter feed, eg "@Ken4London". If anyone can find a Twitter feed for the UKIP candidate Lawrence Webb then please tell me! He seems rather elusive.

The poll was limited to tweets made on the 7th and 8th April. I then used the sentiment package in R to classify the tweets into Positive, Neutral or Negative categories.
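The filtering step can be sketched in base R as below. This only illustrates the matching and date rules described above on a tiny made-up data frame; it does not talk to Twitter, and the tweets, the mentions_candidate helper and the column names are all hypothetical.

```r
# Hypothetical stand-in for scraped tweets: text plus creation time (UTC).
tweets <- data.frame(
  text = c("Ken Livingstone was on the radio again",
           "@Ken4London any comment on fares?",
           "Lovely weather in London today",
           "Ken Livingstone manifesto thoughts"),
  created = as.POSIXct(c("2012-04-07 09:15:00", "2012-04-08 18:40:00",
                         "2012-04-08 12:00:00", "2012-04-09 08:05:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# TRUE when a tweet contains the candidate's name OR their handle (case-insensitive).
mentions_candidate <- function(text, name, handle) {
  lower <- tolower(text)
  grepl(tolower(name), lower, fixed = TRUE) | grepl(tolower(handle), lower, fixed = TRUE)
}

# Keep only tweets made on the 7th and 8th April.
window_start <- as.POSIXct("2012-04-07", tz = "UTC")
window_end   <- as.POSIXct("2012-04-09", tz = "UTC")
in_window <- tweets$created >= window_start & tweets$created < window_end

ken <- tweets[mentions_candidate(tweets$text, "Ken Livingstone", "@Ken4London") & in_window, ]
nrow(ken)
```

The kept rows would then be passed to the classifier, one candidate at a time.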

Is this poll perfect and will it reflect what will happen on election day?

The short answer is no, and we'll see. The slightly longer answer is that Twitter's user base isn't a proper sample of the entire London population, but as Twitter becomes more popular and is taken up by more sections of society the more representative it becomes. It should also enable quicker snapshots to be taken than traditional polling techniques. It'll be more able to detect sudden swings in public opinion. If one of the candidates gets on TV and says something that enrages people we should be able to measure the reaction. This is an experiment and it will evolve over time. I'll note any methodological changes here on the blog.

I'm going to do this poll daily until the election. Monday's poll should be out about Midnight.



Friday, 6 April 2012

Are any of the minor candidates in the London Mayoral election heading for a breakthrough?

When the electorate decide who shall occupy London's City Hall on 3rd May we can be as certain as you can be that it'll be either Boris Johnson or Ken Livingstone. But these are politicians from the 2 main political parties and we all know how popular they are! With the Bradford West by-election sending back to parliament some bearded bloke who goes on and on about the Middle East, is there a sign of a breakthrough for the minor candidates? Let's look at the evidence from Twitter.

Using R with the packages twitteR, plyr and ggplot2 and some code bunged together from earlier posts I've got tweets mentioning the minor candidates for London Mayor by name. It's limited to 1500 mentions so if anyone feels like buying me access to the whole Twitter hosepipe feel free!
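For anyone curious how mentions turn into a graph over time, here is a rough base R sketch of the counting step. It is not the code I used: the timestamps and the mentions data frame are made up, and aggregate stands in for what plyr would do in the real thing.

```r
# Hypothetical mentions: one row per tweet, with the candidate it names and when it was sent.
mentions <- data.frame(
  candidate = c("Siobhan Benita", "Siobhan Benita", "Lawrence Webb",
                "Siobhan Benita", "Lawrence Webb"),
  created = as.POSIXct(c("2012-04-04 22:35:00", "2012-04-04 23:10:00",
                         "2012-04-04 22:50:00", "2012-04-05 10:00:00",
                         "2012-04-06 09:30:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Reduce each timestamp to its calendar day, then count tweets per candidate per day.
mentions$day   <- as.Date(mentions$created, tz = "UTC")
mentions$count <- 1
daily <- aggregate(count ~ candidate + day, data = mentions, FUN = sum)

daily
```

A data frame shaped like daily (candidate, day, count) is the sort of thing that can be handed to ggplot2 to draw one line per candidate.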





1) The short answer to the question is no.

2) The highest point on all the graphs is when the Newsnight debate took place between Ken, Boris, Brian and Jenny. The other minor candidates got a very small vox pop at the start but were totally excluded from the debate, which accounts for their peaks.

3) Jenny Jones, the Green candidate and supposedly the smallest of the "Big" four parties that were allowed to take part in the debate, got more mentions than the Liberal Democrat Brian Paddick. Given the national poll ratings for the Lib Dems, on the basis of Twitter he should be more worried about coming 4th than 2nd.

4) The BNP candidate did get 200 mentions overall but they were practically all ones laughing at the idea that someone with such a foreign-sounding name is standing for the BNP. If the far right are going to take advantage of these economically troubled times they won't be doing it in this election for London Mayor.

5) The Independent Siobhan Benita is getting more tweets than the BNP and UKIP combined, though that is not a great claim to make. Her tweets are actually rather positive and I think she'll capitalise on the public's negativity towards party politics. She'll get a good amount of first preference votes and may not even come last, which for an Independent in an election for the London Mayoralty is a great achievement.

6) UKIP has actually been polling rather well nationally but its candidate for London Mayor, one Lawrence Webb (no, me neither), is nowhere. Getting fewer than 50 Twitter mentions is a woeful performance, especially as some of them aren't actually about this Lawrence Webb and many of the others just list his name as one of the people standing.

Sunday, 1 April 2012

How to produce normalized scores in R

I like little bits of code that make analysis simple and here's a nice one that uses the plyr package. Say you have some data. It could be sales figures from shops or exam scores from schools around the local area or, in this case, votes for political parties, and you want to compare how they've changed over time. You could look at the raw numbers, and there's nothing wrong in that, but if you normalize the data it can be easier to make comparisons both to the same case over time and between cases.

library(plyr)  # provides mutate()
data <- mutate(data, LABscale = round((Labour / 11559735) * 100, 0))


Just to go through this, start with mutate. This is the plyr function that takes your dataframe and adds new columns to it or changes existing ones; the <- then assigns the result back to data. You put your dataframe name as the first thing after the open bracket, in this case data, and follow it with a comma. LABscale is the name of the new normalized variable. Labour is the variable from the dataframe we're interested in, as it contains the number of votes Labour got at each general election from 1992 onwards. So to normalize the data with 1992 as the benchmark we divide by the number of votes Labour got in 1992, i.e. 11,559,735, and multiply by 100.

The round function, with the 0 after the comma at the end, specifies how many digits we want after the decimal point. In this case it's 0 as we want a whole number, i.e. 108 rather than 108.3467843 etc., as the extra digits don't really add anything to people's understanding.

Anyway this should get you something like this.


  LABscale
1      100
2      117
3       93
4       86
5       74
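If you don't have plyr to hand, the same idea can be sketched in base R with transform, which takes the dataframe first and the new column second just like mutate. The vote totals below are invented for the example (they are not real election results), and dividing by votes[1] is my assumption that the first row is the benchmark year, which saves typing the benchmark figure in by hand.

```r
# Hypothetical vote totals for one party across five elections (not real figures).
data <- data.frame(
  election = 1:5,
  votes    = c(2000, 2167, 1800, 1500, 1600)
)

# Scale every total against the first election (= 100) and round to whole numbers.
data <- transform(data, scaled = round(votes / votes[1] * 100, 0))

data$scaled
```

Here the second election comes out as 108 rather than 108.35, which is the job the 0 in round is doing.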