Wednesday 26 October 2011

Pick a number, any number

I was looking at GDP per person on the UN HDI website the other day and thought the data was quite interesting, not least because the 1980 figures were far less complete than the 2010 ones. This appeared to be partly because the data simply wasn't available, for whatever reason, but also because a lot of new countries have been created since 1980.

Anyway, I thought it would be good to analyze the data to see which countries had done well or not so well economically over the last 30 or so years. The results were interesting in themselves (more on that in another post), but I did notice one particularly striking finding.


> cor(Y1980, newgdp, method = "pearson")
[1] 0.7785629
> cor(Y1980, newgdp, method = "spearman")
[1] 0.9456404
> cor(Y1980, newgdp, method = "kendall")
[1] 0.8039875

The data I had was two continuous variables, so I needed to test the correlation between them. I actually had a look at all three methods, Pearson, Spearman and Kendall, as you can see above, and the difference in the results on the same data did surprise me somewhat: 0.78, 0.95 and 0.80 respectively. Pearson measures linear association, while Spearman and Kendall work on ranks, so they are not measuring quite the same thing. Getting the statistical test right is therefore important, as the differences can be large.
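To see why the three methods can disagree, here is a minimal sketch in R. It uses made-up numbers, not the UN HDI figures, and the variable names x and y are purely illustrative. The relationship is strictly increasing but curved, which is roughly the kind of shape that could pull Pearson below the rank-based measures.

```r
# Synthetic illustration only: these values are invented, not the GDP data.
x <- 1:20
y <- exp(x / 3)   # strictly increasing, but far from a straight line

# Same base R function as in the post, one call per method.
pearson  <- cor(x, y, method = "pearson")   # measures linear association
spearman <- cor(x, y, method = "spearman")  # Pearson computed on the ranks
kendall  <- cor(x, y, method = "kendall")   # based on concordant/discordant pairs

print(c(pearson = pearson, spearman = spearman, kendall = kendall))
```

Because y always goes up when x goes up, the two rank-based measures come out at 1, while Pearson comes out noticeably lower since the points do not sit on a straight line. Real data with noise will not be this clean, and Kendall will often come out lower than Spearman on the same data, much as it did above.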
