A scraper that loads football (not soccer) data from the FBRef website directly into a pandas DataFrame. Major teams and leagues are supported.
Project Repository | PyPI Project
Contribute to the Project so that everyone can benefit from it.
Disclaimer: This package in no way tries to take away from the work of FBRef.com. I love that website and just needed a package that makes my life easier.
To install the package (it only requires pandas):
pip install fbref2pandas
After installation, create a `MatchLogsLink` object by passing it the following arguments (all as `str`):
- `team_id`: The identifier of the team. As far as I have noticed, FBRef uses a specific `id` for each team: an 8-character string that uniquely identifies a football team. See this table for the `id` of some popular teams.
| Team ID | Team Name |
|---|---|
| '206d90db' | 'Barcelona' |
| '53a2f082' | 'Real-Madrid' |
| 'db3b9613' | 'Atletico-Madrid' |
| 'e31d1cd9' | 'Real-Sociedad' |
| '2a8183b3' | 'Villarreal' |
Just copy the `team_id` and pass it as the first argument.
- `year`: The season for which the data is required, in the format `2022-2023` (for the 2022-23 season). Pass this as the second argument to the `MatchLogsLink` object.
- `comp_id`: The `comp_id` is also one of the identifiers that FBRef maintains internally, as far as I can deduce. It has the form `cXXX` and can be found on the FBRef website. See the table below for some common `comp_id` values.
| Comp ID | Competition Name |
|---|---|
| 'c8' | 'Champions-League' |
| 'c12' | 'La-Liga' |
| 'c19' | 'Europa-League' |
| 'c122' | 'UEFA-Super-Cup' |
| 'c569' | 'Copa-del-Rey' |
| 'c646' | 'Supercopa-de-Espana' |
| 'c882' | 'Europa-Conference-League' |
- `log_type`: I love how many stats are available on the FBRef website. They are great for your next project. The `log_type` can be any of these values:
| Log Type |
|---|
| 'scores_and_fixtures' |
| 'shooting' |
| 'goalkeeping' |
| 'passing' |
| 'pass_types' |
| 'goal_and_shot_creation' |
| 'defensive_actions' |
| 'possession' |
| 'miscellaneous_stats' |
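Since a wrong argument only shows up once the link is used, it can help to sanity-check the four values locally first. Here is a minimal sketch, assuming the formats described above: an 8-character ID (lowercase hexadecimal, judging only by the IDs in the table), a `YYYY-YYYY` season spanning consecutive years, and a `comp_id` made of `c` followed by digits. `validate_args` and `LOG_TYPES` are hypothetical names for illustration and are not part of fbref2pandas.

```python
import re

# Log types copied from the table above.
LOG_TYPES = {
    "scores_and_fixtures", "shooting", "goalkeeping", "passing",
    "pass_types", "goal_and_shot_creation", "defensive_actions",
    "possession", "miscellaneous_stats",
}


def validate_args(team_id, year, comp_id, log_type):
    """Return a list of problems; an empty list means the values look plausible.

    Hypothetical helper: the patterns below are assumptions inferred from
    the example tables, not rules published by FBRef or fbref2pandas.
    """
    problems = []
    # Assumption: IDs are 8 lowercase hexadecimal characters.
    if not re.fullmatch(r"[0-9a-f]{8}", team_id):
        problems.append(f"team_id {team_id!r} is not an 8-character ID")
    # Assumption: a season spans two consecutive years, e.g. 2022-2023.
    season = re.fullmatch(r"(\d{4})-(\d{4})", year)
    if season is None or int(season.group(2)) != int(season.group(1)) + 1:
        problems.append(f"year {year!r} is not in the form 2022-2023")
    if not re.fullmatch(r"c\d+", comp_id):
        problems.append(f"comp_id {comp_id!r} is not in the form cXXX")
    if log_type not in LOG_TYPES:
        problems.append(f"log_type {log_type!r} is not a known log type")
    return problems


print(validate_args("206d90db", "2022-2023", "c12", "shooting"))
print(validate_args("barca", "2022-23", "12", "shots"))
```

The check only catches malformed values. A well-formed ID that does not exist on FBRef will still pass here and fail later.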
After passing these four parameters to the `MatchLogsLink` object, most of the work is done. Just create a new `Data` object and pass it the `MatchLogsLink` object created above. For example:
from fbref2pandas.classes import MatchLogsLink, Data
link = MatchLogsLink('206d90db', '2022-2023', 'c12', 'shooting')
# print(link)
data = Data(link)
If the link is correct, there shouldn't be a problem. To get the data as a `DataFrame`, call the `fbref2pandas()` method of the `Data` object, which returns the data as a pandas `DataFrame`. If the link is incorrect, an exception is raised; in that case, double-check the arguments against the tables above. Enjoy the data.
To get the data as a `DataFrame`:
df = data.fbref2pandas()
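When several log types are needed for the same team and season, the calls can be wrapped in a small loop that pauses between requests and records failures instead of stopping at the first one. This is a rough sketch: `collect_logs`, `fetch`, and `delay_seconds` are hypothetical names, the 5-second default is an arbitrary placeholder rather than a figure from FBRef's rules, and `Exception` is caught broadly because the exception type raised for a bad link is not specified here.

```python
import time


def collect_logs(fetch, log_types, delay_seconds=5.0):
    """Call fetch(log_type) for each log type, pausing between calls.

    Hypothetical helper, not part of fbref2pandas. Returns two dicts:
    results keyed by log type, and the exceptions raised for failures.
    """
    results = {}
    failures = {}
    for index, log_type in enumerate(log_types):
        if index > 0:
            # Space out requests to keep the load on the website low.
            time.sleep(delay_seconds)
        try:
            results[log_type] = fetch(log_type)
        except Exception as exc:  # exception type is unspecified, so catch broadly
            failures[log_type] = exc
    return results, failures
```

With the package installed, `fetch` could be `lambda log_type: Data(MatchLogsLink('206d90db', '2022-2023', 'c12', log_type)).fbref2pandas()`, so that `collect_logs(fetch, ['shooting', 'passing'])` returns one DataFrame per log type.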
Note: I comply with all the data-use rules given on the Sports Reference website, and I believe this package falls under fair use. The package makes only a small number of requests.
Note: Help expanding this package would be really appreciated. Create a PR and add whatever you can scrape from FBRef links. Great PRs will be merged on short notice.