Analysing Team Attributes

Here we will analyze the team Attributes table. There are various fields which I have chosen to help me analyze the data.

In [1]:
# import Statements

import numpy as np
import pandas as pd
import sqlite3
import matplotlib.pyplot as plt

print("-------------------->>  Imports successful")
-------------------->>  Imports successful
In [2]:
#Setting up Path of the Database
path = "data/"
database = path + 'database.sqlite'

#Connection to Database
conn = sqlite3.connect(database)
print("-------------------->>Debug: Database connected")

#Command used to check tables in DB
# tables = pd.read_sql("""SELECT * FROM sqlite_sequence""",conn)
# tables
-------------------->>Debug: Database connected

Here we check what data columns are present in the Team Attributes Table

24 in total here we will try to find relevant ones

In [3]:
#Checking Relevant Columns which are present
df = pd.read_sql("SELECT * from Team_Attributes ",conn)
for i, v in enumerate(df.columns):
    print(i, v)
0 id
1 team_fifa_api_id
2 team_api_id
3 date
4 buildUpPlaySpeed
5 buildUpPlaySpeedClass
6 buildUpPlayDribbling
7 buildUpPlayDribblingClass
8 buildUpPlayPassing
9 buildUpPlayPassingClass
10 buildUpPlayPositioningClass
11 chanceCreationPassing
12 chanceCreationPassingClass
13 chanceCreationCrossing
14 chanceCreationCrossingClass
15 chanceCreationShooting
16 chanceCreationShootingClass
17 chanceCreationPositioningClass
18 defencePressure
19 defencePressureClass
20 defenceAggression
21 defenceAggressionClass
22 defenceTeamWidth
23 defenceTeamWidthClass
24 defenceDefenderLineClass
In [4]:
df.describe()
Out[4]:
id team_fifa_api_id team_api_id buildUpPlaySpeed buildUpPlayDribbling buildUpPlayPassing chanceCreationPassing chanceCreationCrossing chanceCreationShooting defencePressure defenceAggression defenceTeamWidth
count 1458.000000 1458.000000 1458.000000 1458.000000 489.000000 1458.000000 1458.000000 1458.000000 1458.000000 1458.000000 1458.000000 1458.000000
mean 729.500000 17706.982167 9995.727023 52.462277 48.607362 48.490398 52.165295 53.731824 53.969136 46.017147 49.251029 52.185871
std 421.032659 39179.857739 13264.869900 11.545869 9.678290 10.896101 10.360793 11.086796 10.327566 10.227225 9.738028 9.574712
min 1.000000 1.000000 1601.000000 20.000000 24.000000 20.000000 21.000000 20.000000 22.000000 23.000000 24.000000 29.000000
25% 365.250000 110.000000 8457.750000 45.000000 42.000000 40.000000 46.000000 47.000000 48.000000 39.000000 44.000000 47.000000
50% 729.500000 485.000000 8674.000000 52.000000 49.000000 50.000000 52.000000 53.000000 53.000000 45.000000 48.000000 52.000000
75% 1093.750000 1900.000000 9904.000000 62.000000 55.000000 55.000000 59.000000 62.000000 61.000000 51.000000 55.000000 58.000000
max 1458.000000 112513.000000 274581.000000 80.000000 77.000000 80.000000 80.000000 80.000000 80.000000 72.000000 72.000000 73.000000

Below we have chosen the following fields for analysis for games won by the team

  • Home api id
  • Winner of the match
  • Build Up play Speed
  • Build Up Dribbling Speed
  • build Up Play Passing Speed
  • Defence Pressure
  • Defence Aggression
  • DefenceTeam Width
In [5]:
## This Query Selects the following Data,
## Home API ID, winner (calulated as a diffrence of home minus away team goals),
## and a few factors like PlaySpeed and Defence Stats

attr = pd.read_sql("""
SELECT 
Match.home_team_api_id as home_id,
(home_team_goal - away_team_goal) as winner,
Team_Attributes.buildUpPlaySpeed as buildUp_PlaySpeed,
Team_Attributes.buildUpPlayDribbling as buildUp_PlayDribbling,
Team_Attributes.buildUpPlayPassing as buildUp_PlayPassing, 
Team_Attributes.defenceAggression as defence_Aggression,
Team_Attributes.defenceTeamWidth as defence_team_width
from Match
JOIN Team_Attributes on Team_Attributes.team_api_id = home_team_api_id
WHERE winner >=1
""",conn)

Here we are plotting graphs for each of the above values.

In [6]:
graph1 =attr.sort_values(by="winner", ascending=False)
graph1.hist(column="buildUp_PlaySpeed");
In [7]:
graph1.hist(column="buildUp_PlayDribbling");
In [8]:
graph1.hist(column="buildUp_PlayPassing");
In [9]:
graph1.hist(column="defence_team_width");
In [10]:
graph1.hist(column="defence_Aggression");
In [11]:
print("Mean Play Speed Value for Winning Teams is",graph1['buildUp_PlaySpeed'].mean())
print("Mean Play Dribbling Value for Winning Teams is",graph1['buildUp_PlayDribbling'].mean())
print("Mean Play Passing Value for Winning Teams is",graph1['buildUp_PlayPassing'].mean())
print("Mean Team Width for Winning Teams is",graph1['defence_team_width'].mean())
print("Mean Defence Agression for Winning Teams is",graph1['defence_Aggression'].mean())
Mean Play Speed Value for Winning Teams is 52.48631047513779
Mean Play Dribbling Value for Winning Teams is 49.06916969452294
Mean Play Passing Value for Winning Teams is 47.01571345352433
Mean Team Width for Winning Teams is 52.577364859818665
Mean Defence Agression for Winning Teams is 49.813096536169326

The above is the ideal mean values for winning teams. A team should strive to be within one standard deviation of the mean to have the chance of winning.