Mac
Fly

Champions League

Every year, European football hosts the UEFA Champions League, the most important competition for European clubs. The best teams from each national championship face each other to win the “cup with the big ears”. The first three months are the group stage. 32 teams compete, either top finishers of their domestic championships or winners of the preliminary rounds. The teams are split into eight groups of four by a draw that follows some rules:

• One team from each “hat”, where the hat represents the quality of the team through its coefficients over the five previous years.

• Only one team per country in the same group.

• Because of TV rights, teams from the same country must be kept apart. For example, for Spain, FC Barcelona may be in groups A-B-C-D and Real Madrid in groups E-F-G-H.

Each team then plays six games, facing every other team in its group at home and away. Teams are ranked by points (3 for a win, 1 for a draw) and, when tied, by their head-to-head record (wins, goal difference and away goals). The top two continue in the Champions League, with the group winner seeded; the third goes to the Europa League and the fourth is eliminated. After that come two-legged knockout rounds.
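To make the points rule concrete, here is a minimal Python sketch. The team records are made up for illustration (they are not real 2014-15 results), and the tie-break is simplified to overall goal difference instead of the head-to-head record described above.

```python
from dataclasses import dataclass


@dataclass
class TeamRecord:
    name: str
    wins: int
    draws: int
    losses: int
    goals_for: int
    goals_against: int

    @property
    def points(self) -> int:
        # 3 points for a win, 1 for a draw, 0 for a loss.
        return 3 * self.wins + self.draws

    @property
    def goal_difference(self) -> int:
        return self.goals_for - self.goals_against


def rank_group(teams):
    """Order teams by points, then by overall goal difference.

    Assumption: the real rules break ties on head-to-head results first;
    overall goal difference is used here only to keep the sketch short.
    """
    return sorted(teams, key=lambda t: (t.points, t.goal_difference), reverse=True)


# Hypothetical group: 12 games in total (9 decided, 3 drawn).
group = [
    TeamRecord("Team A", 0, 2, 4, 3, 13),
    TeamRecord("Team B", 4, 1, 1, 12, 4),
    TeamRecord("Team C", 2, 1, 3, 7, 8),
    TeamRecord("Team D", 3, 2, 1, 8, 5),
]

ranking = rank_group(group)
for position, team in enumerate(ranking, start=1):
    print(position, team.name, team.points, team.goal_difference)
```

With these made-up records, Team B finishes first with 13 points and Team A last with 2.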

I studied the group stage of the 2014-15 season. For that, I collected 91 variables to explain the final position in the group, and sorted them into categories to study different aspects.


Basic data

In this first part, we look at the basic group data, that is, the main figures shown in the group standings: points, wins, draws and losses, goals for, goals against and goal difference.

In the following graphs, each point is the value taken by one team. The teams on the horizontal axis are ordered by group and by rank. Four lines show the average for each position, and the colours indicate the position (black for the firsts, red for the seconds, green for the thirds and blue for the fourths).

image001

Looking at the number of points, the order within each group is respected. That is fully consistent, because the ranking is based on points. This graph tells us nothing, except that the ranking is highly correlated with the number of points.

We can see some differences between the groups: Real Madrid dominated the second group, whereas the third one was more open.

We reach a similar assessment when we look at the number of wins and losses:

image002

The more wins, or the fewer losses, the more points, and that determines the position in the group. Now let's look at the number of draws:

image004

That is surprising! We cannot predict the rank from the number of draws. Drawing several games has no visible effect on the chance of qualification. This is confirmed by a correlation coefficient of only 0.095. If we smooth the data with a density plot, the result is the same: we cannot tell the positions apart:

image005
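For readers who want to reproduce this kind of coefficient, here is a minimal sketch of the Pearson correlation between the final position and another variable. The article does not say which correlation was used, so Pearson is an assumption, and the numbers below are hypothetical values for two groups, not the real data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data for two groups of four (position 1 is the best).
positions = [1, 2, 3, 4, 1, 2, 3, 4]
points = [13, 11, 7, 2, 15, 9, 6, 4]
draws = [1, 2, 2, 1, 2, 1, 1, 2]

# Points fall as the position number rises, so the coefficient is close to -1.
print(round(pearson(positions, points), 3))
# Draws show no linear link with the position in this made-up sample.
print(round(pearson(positions, draws), 3))
```

Because position 1 is the best, a negative coefficient means that a higher value of the variable goes with a better rank.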

Now, if we analyse the goals, we can also separate the positions:

image006

To finish higher, a team needs to score more and concede fewer goals. That way it wins games and earns more points, which seems logical. This is confirmed by the goal difference, which is goals for minus goals against.

image008

To conclude this part, we can say that the logic is confirmed. We are better placed if:

• We concede fewer goals

• We score more goals

• We win more

• We lose less

• We have more points

But we have also seen that the number of draws has almost no influence on the final ranking, a conclusion that is not obvious at first sight.


Results by day

We will now study the results game by game, comparing the six games one by one. We analyse the points, the goals for and against, and the location (home or away), as well as normalised variables: the points on a given day divided by the team's total points, and likewise for the goals.

image009

The first observation is that the hierarchy is mostly respected. We can nevertheless see that the thirds score more in the first game and less in the second, whereas it is the opposite for the seconds. The second day was the most beneficial for the fourths but less so for the firsts.

image010

When we look at the ratio, these differences are easier to see (with an even split, each day would have a ratio of 1/6, about 0.167). The best day was the first for the thirds, the second for the fourths, the third for the firsts and the fourth for the seconds. Every position has a favourite game for scoring.
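The normalisation behind these ratios can be sketched as follows. The goal counts are hypothetical, and returning None for a team with no goals at all is my own choice for the sketch; the article does not say how that case was handled.

```python
def day_ratios(values_per_day):
    """Share of a team's total (goals or points) obtained on each day."""
    total = sum(values_per_day)
    if total == 0:
        # Nothing scored over the six games: the ratio is undefined.
        return [None] * len(values_per_day)
    return [value / total for value in values_per_day]


# Hypothetical goals scored by one team on days 1 to 6.
goals_per_day = [3, 1, 2, 0, 2, 4]

ratios = day_ratios(goals_per_day)
even_share = 1 / len(goals_per_day)  # about 0.167 with six days

# The "favourite" day is the one with the largest share of the total.
favourite_day = ratios.index(max(ratios)) + 1

for day, ratio in enumerate(ratios, start=1):
    marker = "above" if ratio > even_share else "at or below"
    print(f"day {day}: {ratio:.3f} ({marker} the even share)")
print("favourite day:", favourite_day)
```

A day whose ratio is above 0.167 weighs more than its fair share in the team's total.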

image011

For the goals against, it is harder to draw a conclusion. Day 3 may be the best one for telling the qualified teams (firsts and seconds together) from the others, but it does not seem significant. We can note the case of the firsts, where five of them have a favourite game for conceding goals (1, 2, 4 or 5), but that is because they conceded few goals (1 for Monaco, 2 for Real Madrid).

image012

The number of points can be deduced from the goals. The hierarchy is mostly respected, except for the first game, which shows the opposite result in the battle for second place (thirds have more points than seconds).

With the ratio view (next), we can see that the fourths take most of their points in the second game, whereas the thirds take theirs in the first and fifth games. Points are more evenly spread for the firsts and seconds (who take more points overall).

image013

Finally, if we study the location of each game, we can see clear differences. This variable is worth 1 if the team plays at home and 0 if it plays away.

image014

For example, the firsts play at home in games 1, 3 and 6. But this has no direct link with the final position. We will see later that the hat is linked to the position, and the locations of the games are decided by the hat.

To conclude this part, the ratio of goals for over just the first four games is enough to tell the positions apart. Otherwise, the results only show how goals and points are spread across the games and do not really explain the final position.


Statistics

We now look at the data that make one team better than the others. These are the variables used to measure the quality of a team.

First, we look at the goals in each half. In each half, the hierarchy is respected (as for the total goals for and against), but we can see some differences in the ratio (goals in a half divided by the total goals).

image015

Goals are mostly scored in the second half, and this holds for all teams. But for goals conceded, we can separate the qualified from the unqualified: the firsts and seconds concede more of their goals in the second half, whereas the fourths have a ratio near 50%. We can explain that by easing off: if a team leads 4-0, it is more likely to concede a goal at the end of the game.

image016

The average minute of a goal can be significant. The seconds score earliest, and the hierarchy is respected for the others. We can say that the seconds are more likely to stop playing before the end of the game.

image017

If we analyse the number of penalties, we see that the qualified teams get more, but once again this is due to their total number of goals. If we look at the ratio, it is not significant enough to draw a conclusion: the fourths' rate is distorted by APOEL Nicosia, who scored once, from a penalty.

image018

If we look at the number of shots, the result is logical: a team that shoots more is more likely to score and to win the game. We could analyse shots against the post, blocked or on target, but the UEFA website lacks the data.

image019

The number of passes attempted reflects the position, but it is linked with possession (see below). A good team attempts more passes and also completes more of them. The hierarchy is still respected, but we see that the fourth-placed teams miss more passes than the others.

image020

Ball possession is a good indicator of the position. In France, there are often games where the winner pulls off a smash-and-grab: poor possession and a single goal in the match. But in the Champions League, a team cannot afford that: it needs good possession to finish higher. “Dominating is not winning” is false in European games.

image021

The number of crosses also influences the position. We can see that crossing well can decide second place. The thirds missed more of them: their ratio is worse than that of the fourths.

image023

The result is much the same with corners: the fourths win more corners than the thirds. Still, the number of corners is a rather good indicator of the position.

On the other hand, the number of fouls committed is not representative of the position. Teams that commit many fouls finish neither first nor last. The fouls suffered do not vary enough to draw a conclusion.

image024

Finally, we can look at the number of bookings. Whereas yellow cards can indicate the position, red cards only tell us that the firsts take none and the fourths take more of them.

image025

To conclude this part, we can say that the positions are deserved: the firsts dominated in most areas, whereas the lasts were weaker.


Players

When we think of Barcelona, we think of Messi. Ronaldo for Madrid, Zlatan for Paris, Drogba for Chelsea, Gerrard for Liverpool, Sneijder for Galatasaray, Perrin for Saint-Étienne, Pirlo for Juventus, etc. Having a great player in the team can bring success, but is it the sign of a good team?

image026

Running is not a sign of a good team, nor of a bad one. In every team, there is a player who runs around 10.5 km per game. The small difference between the positions can be explained by player rotation when the team is leading.

The assessment is the same for the team's best shooter and best passer. Of course the best shooter scores more in a team that scores more. But the ratio shows that it is not a sign of a good team. The ratio changes because fourth-placed teams score little, and mostly from free kicks or individual actions (like Nicosia from one penalty).

image027

To conclude: a good team is not directly due to one good player. Individuals don't make the result. Football remains a team sport with eleven players, each as important as the others.


Club data

We have seen on-field data that can explain the quality of a team; now we look at variables that seem independent of the results.

The groups are drawn using hats based on the UEFA ranking. This reflects quality over the past five years, but does it show current quality?

image028

In fact, it does. A team with a good recent past is a better team, which is why we always see the same teams in the final phase of the Champions League. We can also see how many teams respected their initial rank:

image030

22 teams out of 32 (69%) respected their rank. In the graph on the right, we can see that five groups fully respected this rank (groups A, B, F, G and H), whereas group C (the one with Monaco and Benfica) is reversed.
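The count itself is simple: a team "respects its rank" when its final position equals its hat. Here is a small sketch with two hypothetical groups (one fully respected, one fully reversed), followed by the check of the 69% quoted above.

```python
def share_respecting_hat(hats, positions):
    """Count the teams whose final position equals their hat, and their share."""
    matches = sum(1 for hat, position in zip(hats, positions) if hat == position)
    return matches, matches / len(hats)


# Hypothetical data: first group respected, second group reversed.
hats = [1, 2, 3, 4, 1, 2, 3, 4]
final_positions = [1, 2, 3, 4, 4, 3, 2, 1]

matches, share = share_respecting_hat(hats, final_positions)
print(matches, share)

# Figure from the article: 22 teams out of 32.
article_share = 22 / 32
print(f"{article_share:.2%}")
```

The exact share for 22 out of 32 is 68.75%, which rounds to the 69% given in the text.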

That was for the recent history, but what about the old one?

image031

The number of participations can explain the rank among the two qualified teams and among the two unqualified ones, but it does not explain why a team qualifies. The year of creation can: a young club is less likely to be a good team and often finishes last. A football team is like wine, it improves with time. As a French fan, I also note that the youngest qualified club is Paris SG and the youngest group winner is Monaco. French football sees more frequent changes among its best teams: it is hard to have a glorious history and be good right now.

John Stein, a Scottish player and manager, said that “Football without fans is nothing”. Can we prove him right? Is the twelfth man as important as the other eleven?

image033

The size of the stadium explains the final position. Teams with a big stadium finish first, whereas teams with a small one finish last. If the size is medium, the crowd makes the difference. Of course there are more people in a big stadium, but the attendance percentage shows a difference of about 10% between seconds and thirds.

image035

To finish, we study the influence of the team's name, and more precisely the presence of “FC”. There will be three FCBs in the round of sixteen, with Basel, Bayern Munich and Barcelona, so this factor may matter.

image036

Unbelievable but true! A club with “FC” in its name is more likely to qualify than a club without it. We can also see that, with this name, the average position is 1.9 (instead of 2.5 as expected). Among the 15 FC teams, 12 qualified (out of 16 qualified teams). An innocent-looking name is a factor of qualification. Could you imagine that?
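The counts above are enough to compare the two qualification rates. This short sketch only uses figures given in the article (32 teams, 16 qualified, 15 FC teams, 12 of them qualified); the variable names are mine.

```python
teams_total = 32
qualified_total = 16
fc_teams = 15
fc_qualified = 12

# Everything else follows by subtraction.
other_teams = teams_total - fc_teams               # clubs without "FC"
other_qualified = qualified_total - fc_qualified   # qualified clubs without "FC"

fc_rate = fc_qualified / fc_teams
other_rate = other_qualified / other_teams

print(f"with FC:    {fc_qualified}/{fc_teams} = {fc_rate:.1%}")
print(f"without FC: {other_qualified}/{other_teams} = {other_rate:.1%}")
```

So 80% of the FC clubs qualified, against 4 out of 17 (about 24%) for the others.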

To conclude this part, we can say that history (and more specifically recent history) explains the position. We have confirmed John Stein's point of view (shared by hundreds of thousands of fans around the world) about the importance of the fans. Finally, we have highlighted the importance of “FC” in the club's name for qualification. I don't think many people know that.


Conclusion

This report aims to help us understand the Champions League data. We have seen the main factors behind the ranking. We have confirmed the role of goals and wins, but also of consistency across the days of competition. We have seen the role of each day in the final position (each position has its own day).

Regarding the statistics, the position confirms the importance of being strong in every area, whether possession or the number of corners. The importance of a star player was not confirmed by these data: the impact of the best player does not show in the best striker, the best passer or the best runner.

We have also seen surprising facts about the importance of a good (recent) history and numerous fans, and even more surprising ones: the presence of “FC” in the name, and the absence of a link between the number of draws and the position.

With these conclusions, a team can know how to be good in future years. Given that last year every group winner beat a runner-up to reach the quarter-finals, first place is really what teams should aim for.


To finish, here are the correlations between the position and the other variables, ordered by importance, followed by a graph comparing them:

Variable	Correlation
Points	-0.9185937397
Losses	0.8967631239
Goal difference	-0.8942697709
Wins	-0.8793488715
Goals for	-0.7894636493
Goals against	0.7779934020
Corners against	0.7003642842
Goals against in first half	0.6768985550
Hat	0.6750000000
Goals for in first half	-0.6518171826
Percentage of possession	-0.6469441968
Goals for in second half	-0.6465182980
UEFA rank before the group stage	-0.6408776505
Goals against in second half	0.6363372345
Points on day 3	-0.6330740706
Time of possession	-0.6283644820
Goals for on day 3	-0.6164004621
Goals against on day 6	0.6061252864
Best passer	-0.5965587590
Passes attempted	-0.5952520265
Shots	-0.5893593531
Crowd	-0.5867043545
Points on day 6	-0.5854656110
Corners for	-0.5528285026
Goals against on day 3	0.5486641476
Best striker	-0.5240601711
Stadium capacity	-0.5052886578
Goals against on day 4	0.5025189076
Crosses succeeded	-0.4879025031
"FC" in the team name	-0.4760952286
Goals for on day 5	-0.4729653291
Ratio of goals for on day 3	-0.4686198678
Crosses attempted	-0.4595133553
Points on day 5	-0.4550219882
Points on day 1	-0.4529073595
Points on day 4	-0.4448039039
Ratio of goals for on day 2	0.4342363516
Ratio of shots by the best striker	0.4334293319
Points on day 2	-0.4308143175
Year of creation	0.4151015097
Goals for on day 6	-0.4006590876
Percentage of successful passes	-0.3947214397
Goals against on day 5	0.3924605922
Goals against on day 1	0.3840844210
Goals for on day 4	-0.3819143698
Ratio of shots blocked	0.3722119977
Ratio of goals against on day 3	0.3608701287
Yellow cards	0.3524372772
Penalties	-0.3286878676
Goals for on day 1	-0.3278769448
Shots on goal post	-0.3233161507
Red cards	0.3222516933
Goals against on day 2	0.3143026903
Goals for on day 2	-0.2869720216
Ratio of points on day 6	-0.2739387659
Ratio of points on day 2	0.2691652366
Ratio of successful crosses	-0.2566649266
Attendance percentage	-0.2520724300
Played at home on day 6	-0.2520504151
Ratio of goals against during second half	-0.2325479491
Ratio of goals against during first half	0.2325479491
Played at home on day 1	-0.2236067977
Played at home on day 2	0.2236067977
Played at home on day 5	0.2236067977
Ratio of passes by best passer	-0.2191536889
Ratio of points on day 3	-0.2088698143
Number of participations	-0.1944987186
Ratio of goals on penalties	0.1839176720
Ratio of goals against on day 6	0.1665360216
Shots blocked	-0.1624419897
Ratio of goals for on day 5	-0.1492584676
Fouls committed	0.1473647101
Ratio of goals against on day 2	-0.1464973049
Ratio of goals on post	-0.1458779669
Fouls received	-0.1391157741
Average minute of goals scored	0.1327158077
Ratio of points on day 5	-0.1319646309
Ratio of goals against on day 1	-0.1291799654
Ratio of points on day 1	0.1227495572
Ratio of goals for on day 6	-0.1133369394
Ratio of goals against on day 4	-0.0960851477
Draws	0.0951063937
Maximal distance run by one player	0.0875209087
Ratio of goals scored in first half	-0.0808207353
Played at home on day 3	0.0559016994
Played at home on day 4	-0.0559016994
Ratio of goals scored in second half	0.0287101335
Ratio of goals against on day 5	-0.0163546280
Ratio of goals for on day 1	-0.0101980901
Ratio of goals for on day 4	0.0093564754
Ratio of points on day 4	0.0009535543
image037
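"Ordered by importance" here means ordered by the absolute value of the correlation, whatever its sign. A minimal sketch of that ordering, using five values copied from the table above (the dictionary and variable names are mine):

```python
# A few (variable, correlation with position) pairs taken from the table.
correlations = {
    "Draws": 0.0951063937,
    "Hat": 0.6750000000,
    "Points": -0.9185937397,
    '"FC" in the team name': -0.4760952286,
    "Wins": -0.8793488715,
}

# Importance is the strength of the link, so we sort on the absolute value.
by_importance = sorted(correlations.items(), key=lambda item: abs(item[1]), reverse=True)

for name, r in by_importance:
    # Position 1 is the best, so a negative sign means "more of this, better rank".
    direction = "better" if r < 0 else "worse"
    print(f"{name:25s} {r:+.3f}  (a higher value goes with a {direction} position)")
```

Sorting on the absolute value is what puts a strongly negative variable such as points ahead of a weakly positive one such as draws.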
