With the release of the full data from the European Broadcasting Union, we are able to perform a statistical analysis on the voting in the Eurovision Song Contest 2014. For the second ‘Voting Insight’ article, we analyse the impact of running order in this year’s Eurovision Song Contest. Previously on ESC Insight we have compared how the performance position impacts other contests with producer-led running orders, and before the final this year in Denmark gave our analysis of what we could infer from trends in the Danish-produced running order.
Now we are able to look at where songs placed on the scoreboard based on their running order. This analysis looks to see if the EBU is correct with their statement that ‘’ and furthermore compare this impact for both juries and televotes. Ben Robertson investigates.
Read This First, Here’s The Summary Of Our Findings
By its nature, this article is going to be long, densely packed, and have quite a bit of maths going on. That said, the conclusions are worth clearly highlighting. If you don’t want to be spoiled, skip over this bullet pointed summary.
- Through the use of correlation analysis on the jury votes, the public votes, and the running order, we will investigate if the running order has a statistical impact on the results of the Eurovision Song Contest 2014.
- Firstly, we will show that the running order on its own is not enough to ensure a victory for any one song at the Contest.
- Secondly, we will find evidence that shows with statistical confidence that songs do benefit from a later running order slot.
- Finally, we will examine the variability of both televote patterns and jury patterns as they change from the Semi Final through to the Grand Final, and show that juries have a wider range of opinion than the televote and are more likely to change their vote between the semi-final and the grand final than the public.
The overall conclusion is this. The running order does have an impact on the Eurovision Song Contest. In general the best songs will continue to do well, no matter their running order location, but there are circumstances where the running order can decide the winning song.
This confirms previous studies both by ESC Insight and other academic research (including Page, L. and Page, K., 2008. A Field Study Of Biases In Sequential Performance Evaluation On The Idol Series. Journal of Economic Behavior and Organization; and Bruine de Bruin, W., 2005. Save the last dance for me: unwanted serial position effects in jury evaluations. Acta Psychologica 118).
How this affects the long-term health of the Song Contest, why this is important, and what can be done to address it, will be areas for discussion throughout the summer and into the 2015 season here on ESC Insight.
Okay, summary over. Time for the science part.
What Effect Do We Expect Running Order To Have?
Before conducting our analysis, let’s go back over the Song Contest history and look at some of the prevailing thinking about the impact of the running order.
Before Conchita Wurst’s victory for Austria this year, you have to go all the way back to 2004 to find a winner that sang outside of the final ten songs in the running order; nobody ever wins from second place in the running order; and if you’re looking to qualify from the semi-finals you want to be in the second half of the draw rather than the first.
Going into a bit more detail, our previous analysis on former Eurovision Song Contests suggests a variance of around 5% from the running order on the points score of a song, with a later start resulting in a better performance. That’s not enough to change the entire Contest, but enough to offer a 20 to 30 point spread on a song’s ultimate performance.
That means ‘Euphoria‘ would have still won the 2012 Contest if Loreen had been drawn in the first half. And yes, that means that if Sweden had been singing second, the ‘curse’ would have been broken. It also means that in a close Song Contest the running order could change the result – the example here being 2003 with Turkey (singing 4th) scoring 167, Belgium (singing 22nd) scoring 165, and Russia (singing 11th) scoring 164 points. Switch Russia and Belgium, and t.A.T.u. would have beaten Sertab (but then if you switch Sertab to 11th and Russia to 22nd you probably would have got a tie-breaker).
Returning to the 2014 Contest, our analysis will firstly test the ‘5% variance’ imparted by the running order, before going further to see how songs have benefited or lost out as they qualify through from the Semi Finals to the Final and have their running order position changed.
Our Methodology and Assumptions
With the full results and rankings available from the EBU from the 2014 Song Contest, we are able to analyse the full voting patterns. This enables us to use rank correlation to look at the individual jury rankings, overall country rankings, and the final results, using Spearman’s Rank Correlation. This is a statistical calculation which looks at how strong the correlation is between two sets of data.
For example, if the result of the correlation analysis is +1, we have a perfect positive correlation in the data. Comparing the running order to the result, a +1 would occur if the winning song performed last, the second placed song sang second-last, and so on down to the last placed song opening the Contest.
If the result of the correlation is -1, then the correlation is perfectly negative. In this case that would imply that the song that won the Contest opened the show, the second placed song was performed second, all the way through the order so the song that was drawn last finished last.
If the result of the analysis is 0, this implies that there is zero mathematical correlation between the two sets of data.
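To make the three cases concrete, here is a minimal Python sketch of Spearman’s Rank Correlation using the standard formula for data without ties. The function name `spearman_rho` and the five-song show are illustrative assumptions, not data from the Contest. Following the convention used in this article, a song’s running order score rises the later it performs, and its result score rises the higher it finishes.

```python
def spearman_rho(x, y):
    """Spearman's rank correlation for two equal-length lists without ties."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")

    def ranks(values):
        # Rank 1 goes to the smallest value; assumes no tied values.
        order = sorted(range(n), key=lambda i: values[i])
        r = [0] * n
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r

    rx, ry = ranks(x), ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical five-song show: running order slots 1..5.
running_order = [1, 2, 3, 4, 5]
# Result score per slot: 5 = winner, 1 = last place.
winner_sang_last = [1, 2, 3, 4, 5]
winner_sang_first = [5, 4, 3, 2, 1]

print(spearman_rho(running_order, winner_sang_last))   # 1.0
print(spearman_rho(running_order, winner_sang_first))  # -1.0
```

Any result between these extremes indicates a partial trend, which is why the significance tables are needed to judge it.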
To determine if we believe a statistically significant correlation exists, we will consult these tables. The values in these tables are used to decide if the correlation is significant or not, and how likely it is to have occurred by chance. As you may expect, if more data is correlated together, we require a less perfect order to show a statistically significant trend.
We also use means (averages) and standard deviations (measuring the spread of data) as used in our first Voting Insights 2014 article for further comparison of the data. We also make use, where appropriate, of Pearson’s Correlation. Pearson’s produces a similar number to Spearman’s Rank, but uses the full range of data rather than just the ranking. We will come back to this later in the analysis as required.
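The difference between the two measures can be sketched in a few lines of Python. The helper names and the four-song scores below are hypothetical, chosen only to show that Pearson’s reacts to the size of the gaps between scores while Spearman’s, which is Pearson’s applied to the ranks, does not.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def to_ranks(values):
    """Rank 1 for the smallest value; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


# Hypothetical data: running order slot and points for four songs.
slots = [1, 2, 3, 4]
points = [10, 12, 14, 200]  # one runaway score

r_full = pearson_r(slots, points)
r_ranks = pearson_r(to_ranks(slots), to_ranks(points))

print(round(r_full, 3))   # below 1: the runaway score bends the straight line
print(round(r_ranks, 3))  # 1.0: the order of the songs is perfectly aligned
```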
Can The Running Order Alone Guarantee Victory?
First of all, let’s look for any strong bias between the running order and the rankings made by the jury and the public in both the Semi-Finals and the Grand Final.
We take the individual rankings from each jury member and each accumulation of televotes and convert them into a ranking from 1 to 25 (or 26 for non-qualifying countries). We give the highest points to the highest running order positions, and then perform the Spearman’s Rank Correlation on these results.
Show | Average of all televote Spearman’s Rank Correlations | Average of all jury score Spearman’s Rank Correlations | Standard deviation of televote Spearman’s Rank Correlations | Standard deviation of jury score Spearman’s Rank Correlations
First Semi Final | 0.119 | -0.089 | 0.197 | 0.244
Second Semi Final | 0.271 | -0.017 | 0.261 | 0.325
Grand Final | -0.014 | 0.028 | 0.167 | 0.201
We can note at this stage the larger standard deviation for the juries compared with the televotes. This again backs up the finding from our previous study that the juries are less reliable than televotes at following trends and patterns. The difference here between the juries and televoters (around twenty percent) is in keeping with the figures for spread we found before.
At first glance, these results show a split between juries and televoters, suggesting that in the Eurovision Semi Finals being later in the running order is better for gaining televotes. We attribute the disparity between the jury scores and televotes mostly to the styles of songs in question. Malta was a strong jury favourite compared to its televote placing, which would skew our results significantly due to the fact this opened the show and therefore as per our mathematical hypothesis had the worst running order position.
However for any result to be statistically significant we would need to see stronger correlations than are seen here. The Second Semi Final shows a positive televote correlation of 0.271, however for us to be 90% confident such a result did not occur by chance alone we would need the correlation to reach at least 0.35.
Because of the lack of correlation, we can argue that the running order alone is not a decisive factor in the Eurovision scoring, but as we are about to show, the running order does have an impact.
Let Us Now Consider The Impact Of The Producer-Led Draw
Before jumping to a conclusion about the lack of correlation above, we need to consider the running order itself. Simply put, it is not completely random, and nor is it completely chosen by the host broadcaster. The Eurovision songs are drawn at each stage into the first half or second half; and from these allocations the running order is decided by the host broadcaster. As it so happened both pre-contest favourites, as well as our eventual winner, were drawn in the first half of both their respective Semi-Finals and the Grand Final.
As we at ESC Insight commented before the final, the running order produced from DR . What happens if we split up each section of the running order, and treat it not as three running orders, but six running orders?
Under these circumstances, we can perform the same rank correlation figures below, but on the top and bottom halves of each show to test for any bias where the running order for a block is under complete producer control.
We will perform this part of the analysis using the Pearson’s Correlation technique to account for the full range of scores from the juries and the televotes in each section of the contest, rather than the relative ranks of songs in each half of the Contest.
Section of running order | Average of all televote Pearson’s Correlations | Average of all jury score Pearson’s Correlations | Standard deviation of televote Pearson’s Correlations | Standard deviation of jury score Pearson’s Correlations
1st half Semi Final 1 | -0.331 | 0.006 | 0.327 | 0.342
2nd half Semi Final 1 | 0.273 | 0.263 | 0.214 | 0.386
1st half Semi Final 2 | 0.351 | -0.083 | 0.271 | 0.332
2nd half Semi Final 2 | 0.314 | -0.009 | 0.35 | 0.388
1st half Grand Final | 0.259 | 0.076 | 0.221 | 0.289
2nd half Grand Final | 0.198 | 0.155 | 0.191 | 0.288
Excepting the anomaly of the first half of the first Semi Final, where Sweden and Armenia are powerful enough to drown out any smaller trends, we can see here a weak but visible trend of stronger running order bias on the final position of songs within each half. The televote correlation for the first half of Semi Final One is strongly negative due to the strong scoring potential that both Armenia (drawn first) and Sweden (drawn fourth) had regardless of their running order position, which strongly skews the dataset. Our juries may be more spread, but this suggests that juries are less likely to be affected by running order bias than televoters.
Overall, the running order from DR shows a slight apparent increase in running order bias, suggesting that where the producers control the running order, any running order bias increases. Note in particular that we found almost no correlation for the Grand Final as a whole (-0.014 for the televote), but splitting it into two halves shows that each has a quite weak but also quite clear running order bias based on the producers’ choices.
This is in common with other producer-led running orders, as we have previously examined on ESC Insight.
We must remember that the quality of the songs and the performances themselves, as well as any cultural links exhibited in Eurovision voting, play a major role in the outcome. However, these results still show that the running order does contribute towards the ultimate performance of a song in the Eurovision Song Contest.
How To Solve A Problem Like Removing The Songs From The Analysis?
Twenty of the twenty-six songs in the Grand Final had to qualify through a Semi Final. What we are able to do with the full spread of data is see how the voting patterns changed between these songs as they progress from the Semi Final to the Grand Final.
The question we test is simply whether or not a song drawn later in the show moves up in the relative rankings above other songs that qualified from the same Semi Final.
We are aware of the potential flaws in this, that different people could be watching and casting votes at home, there is a wider pool of potential countries to vote for, or that the performances could be delivered in differing quality on stage. We assume for the purpose of this investigation that these factors have no bearing on the result and would point out the professionalism of everyone taking to the Eurovision stage this year.
For an example of how this works, let’s assume that the televote in country X gives country Y a fifth place in the Semi Final among the ten songs that qualified. If country Y then finishes fourth in the country X televote in the Grand Final (again among the ten songs that qualified), we give country Y a score of +1, indicating it gained one place (from country X). If it instead lost two places in the Grand Final relative to the other countries from its Semi Final, it would have a score of -2.
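The scoring described above can be sketched as follows. This is a minimal illustration under our own assumptions: the function names are hypothetical, and the voter’s rankings use made-up song labels rather than real 2014 data.

```python
def relative_rank(full_ranking, qualifiers):
    """Position of each qualifier counting only the qualifiers (1 = best)."""
    kept = [song for song in full_ranking if song in qualifiers]
    return {song: pos for pos, song in enumerate(kept, start=1)}


def rank_changes(semi_ranking, final_ranking, qualifiers):
    """Places gained (+) or lost (-) by each qualifier between the two shows."""
    semi = relative_rank(semi_ranking, qualifiers)
    final = relative_rank(final_ranking, qualifiers)
    return {song: semi[song] - final[song] for song in qualifiers}


# Hypothetical voter with three qualifiers (A, B, C) among other songs.
qualifiers = {"A", "B", "C"}
semi_ranking = ["A", "x", "B", "C", "y"]        # best to worst
final_ranking = ["p", "B", "A", "q", "r", "C"]  # includes other finalists

changes = rank_changes(semi_ranking, final_ranking, qualifiers)
print(changes["A"], changes["B"], changes["C"])  # -1 1 0
```

Because one song’s gain is another’s loss, the changes from a single voter always sum to zero; only the totals across many voters show which songs moved.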
We do this for each televote ranking and each jury member’s ranking. We then compare the sums of all these changes to the difference in the running order position, based on a percentage. For example Sweden was drawn fourth in the Semi Final, and thirteenth in the Grand Final. The change in Sweden’s running order was as follows:
13/26 – 4/16 = 0.25
Meaning a 25% benefit to Sweden’s running order position assuming our running order hypothesis is correct. We ignore any possible effects of running order benefit or loss based on other songs around them in the competition, and we take the pure mathematical solution with later in the draw being better for our purpose.
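As a small worked sketch of this calculation in Python, using only the Sweden figures quoted above (the function name and its parameters are our own illustrative choices):

```python
def running_order_change(semi_slot, semi_size, final_slot, final_size=26):
    """Change in relative running order position, as a fraction of the show."""
    return final_slot / final_size - semi_slot / semi_size


# Sweden: fourth of 16 in the Semi Final, thirteenth of 26 in the Grand Final.
sweden = running_order_change(semi_slot=4, semi_size=16, final_slot=13)
print(f"{sweden:+.1%}")  # +25.0%
```

A positive value means the song performed relatively later in the Grand Final than in its Semi Final; a negative value means it performed relatively earlier.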
Here is our table of running order winners and losers as we go from each Semi Final qualifier to the Grand Final.
Name of Country (1st Semi Final) | Percentage change in running order | Rank of running order benefit/loss | Name of Country (2nd Semi Final) | Percentage change in running order | Rank of running order benefit/loss
Armenia | +20.7 % | 3 | Malta | +77.9 % | 1
Sweden | +25 % | 1 | Norway | -0.8 % | 5
Iceland | -15.9 % | 6 | Poland | +1.3 % | 4
Russia | +13.9 % | 4 | Austria | +2.3 % | 3
Azerbaijan | -38.5 % | 8 | Finland | +15.9 % | 2
Ukraine | -52.4 % | 9 | Belarus | -59.0 % | 9
San Marino | +21.2 % | 2 | Switzerland | -3.1 % | 6
The Netherlands | +4.8 % | 5 | Greece | -48.2 % | 8
Montenegro | -63.0 % | 10 | Slovenia | -27.9 % | 7
Hungary | -19.2 % | 7 | Romania | -73.1 % | 10
This table shows which countries have the biggest change in running order as they go from the Semi Final to the Grand Final. We therefore would be looking especially closely at Montenegro, Ukraine and Azerbaijan for possible drops from Semi Final One, and at Belarus, Greece and Romania for possible drops from Semi Final Two, with Malta as a strong beneficiary.
To see if these trends come true, we compare these ranks, both percentage (for Pearson’s correlation) and ranked (for Spearman’s correlation) to assess whether the running order does alter the results of each televote and jury.
Here is a table for those songs in Semi Final One and Two to show how much gain and loss they had from each televote as they progressed to the Grand Final. Note that those countries that did not meet the threshold for their televote for either the Semi Final or the Grand Final are not included.
Name of Country (1st Semi Final) | Increase or decrease in televote rank from each country | Rank of televote change | Name of Country (2nd Semi Final) | Increase or decrease in televote rank from each country | Rank of televote change
Armenia | +11 | 2 | Malta | +14 | 2
Sweden | +9 | 3 | Norway | +25 | 1
Iceland | +4 | 4 | Poland | +6 | 4
Russia | +21 | 1 | Austria | +5 | 5
Azerbaijan | +1 | 5 | Finland | -4 | 7
Ukraine | -2 | 6 | Belarus | -16 | 9
San Marino | -7 | 8 | Switzerland | +7 | 3
The Netherlands | -5 | 7 | Greece | -3 | 6
Montenegro | -14 | 9 | Slovenia | -13 | 8
Hungary | -18 | 10 | Romania | -21 | 10
When we compare the data here with the running order benefit or loss table we are able to calculate the correlation over the entire televote.
Semi Final | Pearson’s Correlation for Televote to Running Order Difference | Spearman’s Rank Correlation for Televote to Running Order Difference
First Semi Final | 0.523 | 0.479
Second Semi Final | 0.718 | 0.544
The results here show a strong positive correlation between the running order and the relative rankings of songs that qualified from the same Semi Final. The average of the Spearman’s Rank values calculated (0.512) is, for a data set of 10 values, clearly over the 0.4424 required to give us 90 % certainty that this trend is not random. It is statistically significant, and both Semi Final results individually meet this requirement. Pearson’s correlation, using the full spread of the percentage benefit/loss and of the televote changes, shows an even stronger positive correlation.
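As a check on the method, the First Semi Final figures can be recomputed from the two tables above. This is a sketch rather than the exact calculation used for the article: the function names are our own, and the Pearson’s value is computed from the rounded percentages, so it matches the published figure only to about two decimal places.

```python
from math import sqrt

# First Semi Final qualifiers, in table order: Armenia, Sweden, Iceland,
# Russia, Azerbaijan, Ukraine, San Marino, The Netherlands, Montenegro, Hungary.
order_pct = [20.7, 25.0, -15.9, 13.9, -38.5, -52.4, 21.2, 4.8, -63.0, -19.2]
order_rank = [3, 1, 6, 4, 8, 9, 2, 5, 10, 7]
televote_change = [11, 9, 4, 21, 1, -2, -7, -5, -14, -18]
televote_rank = [2, 3, 4, 1, 5, 6, 8, 7, 9, 10]

CRITICAL_90 = 0.4424  # threshold quoted in the article for 10 values


def spearman_from_ranks(rank_x, rank_y):
    """Spearman's rank correlation from two lists of untied ranks."""
    n = len(rank_x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def pearson_r(x, y):
    """Pearson's correlation coefficient on the raw values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


rho = spearman_from_ranks(order_rank, televote_rank)
r = pearson_r(order_pct, televote_change)
print(round(rho, 3), rho > CRITICAL_90)  # 0.479 True
print(round(r, 2))                       # 0.52
```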
It is not perfect, and we note that Norway’s ‘Silent Storm’ benefitted the most from televoting despite not receiving a notably different draw in the final.
However it is important to remember what we are suggesting with this conclusion. The running order does not, as we have previously investigated, give a significant bias to the contest as a whole, but it does make some measurable difference to the outcome. For Norway, their +25 increase in rankings averages out across the 16 countries with televotes in the Second Semi Final and the Grand Final to a boost in their average televote rating of about +1.6 places relative to the other countries from their Semi Final.
It is unlikely these kinds of figures alone would win or lose anybody the Song Contest, but they are enough to make a noticeable difference, and in a closely fought Contest the points margin of victory could easily be less than the points margin gained through a better running order position.
Are The Juries Affected By The Running Order?
You will recall that the EBU brought the juries back in part to diminish the impact of the public vote. Let us perform the same calculations on the jury rankings (Georgia’s results are not taken into consideration because the jury votes were disqualified in the Grand Final).
Firstly, the gain or loss of a country’s ranking in the jury results.
Name of Country (1st Semi Final) | Increase or decrease in individual jury ranks from each country | Rank of jury vote change | Name of Country (2nd Semi Final) | Increase or decrease in individual jury ranks from each country | Rank of jury vote change
Armenia | -15 | 6 | Malta | -10 | 7
Sweden | +41 | 1 | Norway | +38 | 1
Iceland | +5 | 5 | Poland | +16 | 4
Russia | -31 | 10 | Austria | +27 | 2
Azerbaijan | +10 | 4 | Finland | +7 | 5
Ukraine | -17 | 7 | Belarus | -22 | 9
San Marino | +14 | 3 | Switzerland | +3 | 6
The Netherlands | -19 | 8 | Greece | -11 | 8
Montenegro | -26 | 9 | Slovenia | +20 | 3
Hungary | +28 | 2 | Romania | -68 | 10
…and the correlation…
Semi Final | Pearson’s Correlation for Jury Rankings to Running Order Difference | Spearman’s Rank Correlation for Jury Rankings to Running Order Difference
First Semi Final | 0.270 | 0.382
Second Semi Final | 0.476 | 0.527
The average of the Spearman’s Rank totals gives us a score of 0.455. This would still give us mathematical evidence that the juries, just like the televoters, are also susceptible to the same kind of running order bias.
The net changes in this data average out to less than those for the televote. Our biggest mover is Romania, losing 68 ranking points relative to the other songs in the second Semi Final. However we assessed this for a total of eighty-five jury members, taking the average number of places lost per jury member down to only 0.8.
Nevertheless, even when we know the people are the same, and that they have listened to the songs on more than one occasion, we still see that some running order bias shines through for both juries and televoters. Even so, some of the trends in the data are polar opposites. As one example, the main televote advance in Semi Final 1 was for Russia, but Russia lost out the most with the jury groups. Armenia did a similar thing on a smaller scale as it progressed to the Friday and Saturday night performances.
The overall trend in the data though does give statistically significant data to suggest the running order does have an impact on the workings of votes in the Eurovision Song Contest. Not enough to decide if you win or lose, but enough for a small pool of points that may be vital.
A Statistical Aside About Our Juries
The fact that the juries also exhibit running order bias may be a surprise to many. It was a surprise to me. I would expect jury members to have strong pre-judgments about the songs they like and would generally see little change in their results. The change may not be as dramatic, but it is still significant.
If you look at the 180 jury members that voted in the Semi Final and the Grand Final, you will find that 179 of them changed their relative rankings of songs that qualified from the Semi Final they voted in. Something triggered them to change their minds.
The obvious answer would be that this is due to the performance quality on the night, especially with ‘vocal capacity’ and ‘the overall impression of the act’ being amongst the criteria that juries have to work on. However the trends are less clear than we might expect. Norway was our main beneficiary of jury points progressing to the Grand Final (gaining thirty-eight points), but still twenty-three of the eighty-five jurors (27.1 %) ranked it in a lower relative position in the Grand Final compared to the Semi Final. This is one example of many showing that the juries do not clearly show consensus.
The juries are also not able to follow trends from their own voting as clearly as televoters are able to. We run Spearman’s Rank Correlation on the rankings of each country’s points to the qualifiers in each Semi Final to the Grand Final. We then average for all the televoting scores and for the jury voting scores. This shows how well the placing of a song in the Semi Final predicts how it will place in the Grand Final.
Semi Final | Mean result of Spearman’s Rank Correlations of Semi Final Televotes to Grand Final Televotes | Standard Deviation of Spearman’s Rank Correlations
First Semi Final | 0.868 | 0.087
Second Semi Final | 0.860 | 0.101
Semi Final | Mean result of Spearman’s Rank Correlations of Semi Final Jury Scores to Grand Final Jury Scores | Standard Deviation of Spearman’s Rank Correlations
First Semi Final | 0.744 | 0.231
Second Semi Final | 0.673 | 0.209
As you would expect, we have very strong positive correlations that the televotes are similar between the Semi Final and the Grand Final. The effect though is much weaker for the jury groups, and the results are spread over a larger amount. The juries therefore are less predictable even based on their votes in the previous show compared to televoters who have more steady patterns in voting for their favourite entries.
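The consistency measure behind these tables can be sketched in Python as below. The three voters are hypothetical and rank only five qualifiers to keep the example short. We also assume a population standard deviation (`statistics.pstdev`); the analysis above does not state which form was used.

```python
from statistics import mean, pstdev


def spearman_from_ranks(rank_x, rank_y):
    """Spearman's rank correlation from two lists of untied ranks."""
    n = len(rank_x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical voters: each pair is (semi ranks, final ranks) of the same
# five qualifiers, ranked 1 (favourite) to 5.
voters = [
    ([1, 2, 3, 4, 5], [1, 2, 3, 4, 5]),  # unchanged between shows
    ([1, 2, 3, 4, 5], [2, 1, 3, 4, 5]),  # swapped the top two
    ([1, 2, 3, 4, 5], [2, 5, 3, 1, 4]),  # largely reshuffled
]

rhos = [spearman_from_ranks(semi, final) for semi, final in voters]
print([round(r, 2) for r in rhos])  # [1.0, 0.9, 0.0]

# Mean shows how consistent the group is; the spread shows how much
# individual voters differ from each other in that consistency.
print(round(mean(rhos), 3), round(pstdev(rhos), 3))
```

A group of steady voters gives a high mean and a small spread; a single reshuffled ranking like the third one pulls the mean down and the spread up, which is the pattern seen for the juries.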
The large standard deviation here is notable because it is much larger for the jury groups than for televoting, suggesting the data is very spread out, with some jury members having very low correlations between the two voting occasions. Four jury members, two of which are jury Chairs, showed either a correlation equal to zero or a negative correlation between their voting in the Semi Final and the Grand Final. Mathematically speaking, how they voted in the Semi Final had no bearing on how they voted in the final.
As an example, the lowest correlation recorded was -0.08. The biggest outlying vote here was one to move Greece from a twelfth place in the Semi Final (eighth overall compared to other songs that qualified in that Semi Final) to rating them eighth in the Grand Final (second highest from songs qualified from Semi Final Two). This apparent boost was something that many other jurors seemed to miss, as Greece was a net loser in jury votes heading from Semi Final Two into the Grand Final.
This article is not the place to speculate as to how and why these changes happened, but that all but one jury member changed their mind on the songs between the different performances, and that some did quite wildly, is a further sign of the increased random noise that the jury vote gives to the final score that is read out on the Saturday night show.
What will be interesting will be to see if this is true in later years, as the rules from the EBU now ensure that all jury members must have at least a three year window before being invited on the jury again, so next year’s jury members will be completely different to this year.
Drawing Conclusions About The Running Order
Our investigations with the actual voting data from the 2014 Song Contest have shown that the running order has a statistical impact on Eurovision. This backs up previous research into the topic. What we have also established directly for the first time is that both juries and televoters can be susceptible to the same kinds of bias caused by it.
The large spread of jury data and values, and their lack of consistency, adds an extra confusing dimension to this analysis.
The EBU should examine the options available to limit running order bias, perhaps through mixing the running order shown to each jury member, as we have previously discussed as an option on ESC Insight. Further ideas in the linked article, such as providing training or increasing the number of voters on each jury, should also be investigated.
Based on the findings here that show juries are less in keeping with each other, or even themselves, than we would otherwise expect and wish from our industry experts, the issues surrounding jury voting are likely to continue strongly.
Overall though, the big trends at the Song Contest are still apparent.
A good song is needed to do well, and voting traits are visible positively and negatively between different countries. Running order is a small but significant factor in the final result, and could be the decisive factor in a close Contest. It may not change our winner this year, but if as an example San Marino and Latvia had swapped starting positions in Semi Final One I would expect based on the trends above that maybe Valentina would not have delighted the microstate with her qualification, and we may have eaten cake twice on Saturday night.
The running order has an impact on the Eurovision Song Contest, and it should be a factor that is closely and carefully noted by the European Broadcasting Union for future Contests.
Keep a look out in the near future for our third ‘Voting Insight’ which will examine the trends between the ages, gender and professions of our jury members and how they have voted.
If you find any mathematics published here to be incorrect please drop an email to Ben via ben@escinsight.com.
Thank goodness for the summary points – I’ve a science background but the stats were even starting to make my head spin!
I wonder if the ‘producer-led’ running order will continue UNTIL a close final produces a situation à la 2003 and accusations fly about host/EBU bias?
Having said that, the EBU could take the opportunity to blow apart the ‘No.2 never wins’ when there is another ‘Fairytale’ or ‘Euphoria’ around…
One thing I’ve always wondered is whether televoters (or indeed juries) have any bias in the final towards songs from the semi they voted in (i.e. That they’d heard before)?