For those of you who’ve been following me on Twitter, you’ll know that I’ve been working on an R package for AFL called fitzRoy with Rob from Analysis of AFL. Today we released a new version with a much-requested feature, so I figured a blog post was in order.
You’ll have to reinstall fitzRoy to get the latest functions. We still aren’t on CRAN but you can use devtools to get it.
# install.packages("devtools") # uncomment if you haven't installed devtools before
devtools::install_github("jimmyday12/fitzRoy")
AFL Tables player stats
Our initial version of fitzRoy included some data from a dump we got from Paul at AFLtables. This data was great, as it had all of the AFL Tables stats on a player-by-player basis for all time. While this was OK for historical analysis, it stopped at round 3, 2017, and because it was a one-off dump we couldn’t keep it up to date.
As such, we’ve written a new function to replace this internal data. It’s called get_afltables_stats. It takes two arguments, start_date and end_date. These are pretty self-explanatory: the function will return stats from all matches between start_date and end_date. The format of these inputs needs to be either dmy or ymd.
Both arguments are optional. start_date will default to the first AFL game and end_date will default to the system date.
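To make the "dmy or ymd" requirement concrete, here is a rough base R sketch of how a date string in either order could be turned into a Date. The helper parse_date_input is hypothetical and written only for illustration; it is not part of fitzRoy and is not necessarily how the package parses its inputs.

```r
# Hypothetical helper (not part of fitzRoy): accept a date string in
# year-month-day or day-month-year order and return a Date.
parse_date_input <- function(x) {
  # Treat "/" and "-" as interchangeable separators
  x <- gsub("/", "-", x, fixed = TRUE)
  if (grepl("^[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}$", x)) {
    as.Date(x, format = "%Y-%m-%d")   # ymd, e.g. "2018-01-01"
  } else if (grepl("^[0-9]{1,2}-[0-9]{1,2}-[0-9]{4}$", x)) {
    as.Date(x, format = "%d-%m-%Y")   # dmy, e.g. "15-08-2018"
  } else {
    stop("Date must be in ymd or dmy order: ", x)
  }
}

start <- parse_date_input("2018-01-01")
end <- parse_date_input("15/08/2018")

# With fitzRoy installed, the equivalent call would be something like:
# dat <- get_afltables_stats(start_date = "2018-01-01", end_date = "15/08/2018")
```

The pattern check comes before parsing because a string like "15-08-2018" would otherwise be read by "%Y-%m-%d" as a date in the year 15 rather than being rejected.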
As an example, we could just grab data from this year.
library(fitzRoy)
library(tidyverse)
## ── Attaching packages ───────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.0.0 ✔ purrr 0.2.5
## ✔ tibble 1.4.2 ✔ dplyr 0.7.6
## ✔ tidyr 0.8.1 ✔ stringr 1.3.1
## ✔ readr 1.1.1 ✔ forcats 0.3.0
## ── Conflicts ──────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
dat <- get_afltables_stats("2018-01-01")
## Returning data from 2018-01-01 to 2018-08-15
## Downloading data
##
## Finished downloading data. Processing XMLs
## Finished getting afltables data
tail(dat)
## # A tibble: 6 x 58
## Season Round Date Local.start.time Venue Attendance Home.team
## <int> <chr> <date> <int> <chr> <int> <chr>
## 1 2018 21 2018-08-12 1440 "Perth St… 40028 Fremantle
## 2 2018 21 2018-08-12 1440 "Perth St… 40028 Fremantle
## 3 2018 21 2018-08-12 1440 "Perth St… 40028 Fremantle
## 4 2018 21 2018-08-12 1440 "Perth St… 40028 Fremantle
## 5 2018 21 2018-08-12 1440 "Perth St… 40028 Fremantle
## 6 2018 21 2018-08-12 1440 "Perth St… 40028 Fremantle
## # ... with 51 more variables: HQ1G <int>, HQ1B <int>, HQ2G <int>,
## # HQ2B <int>, HQ3G <int>, HQ3B <int>, HQ4G <int>, HQ4B <int>,
## # Home.score <int>, Away.team <chr>, AQ1G <int>, AQ1B <int>, AQ2G <int>,
## # AQ2B <int>, AQ3G <int>, AQ3B <int>, AQ4G <int>, AQ4B <int>,
## # Away.score <int>, First.name <chr>, Surname <chr>, ID <dbl>,
## # Jumper.No. <int>, Playing.for <chr>, Kicks <dbl>, Marks <dbl>,
## # Handballs <dbl>, Goals <dbl>, Behinds <dbl>, Hit.Outs <dbl>,
## # Tackles <dbl>, Rebounds <dbl>, Inside.50s <dbl>, Clearances <dbl>,
## # Clangers <dbl>, Frees.For <dbl>, Frees.Against <dbl>,
## # Brownlow.Votes <dbl>, Contested.Possessions <dbl>,
## # Uncontested.Possessions <dbl>, Contested.Marks <dbl>,
## # Marks.Inside.50 <dbl>, One.Percenters <dbl>, Bounces <dbl>,
## # Goal.Assists <dbl>, Time.on.Ground.. <int>, Substitute <int>,
## # Umpire.1 <chr>, Umpire.2 <chr>, Umpire.3 <chr>, Umpire.4 <chr>
Note that each row is a ‘player match’, so the first few columns are just repeated team-level data. It is probably more interesting to look at specific columns relating to player stats.
dat %>%
  select(Date, First.name, Surname, Playing.for, Contested.Possessions,
         Uncontested.Possessions, One.Percenters, Time.on.Ground..,
         Brownlow.Votes)
## # A tibble: 7,656 x 9
## Date First.name Surname Playing.for Contested.Possessions
## <date> <chr> <chr> <chr> <dbl>
## 1 2018-03-22 David Astbury Richmond 9.
## 2 2018-03-22 Shai Bolton Richmond 3.
## 3 2018-03-22 Dan Butler Richmond 7.
## 4 2018-03-22 Josh Caddy Richmond 11.
## 5 2018-03-22 Jason Castagna Richmond 7.
## 6 2018-03-22 Reece Conca Richmond 6.
## 7 2018-03-22 Trent Cotchin Richmond 13.
## 8 2018-03-22 Shane Edwards Richmond 9.
## 9 2018-03-22 Brandon Ellis Richmond 3.
## 10 2018-03-22 Corey Ellis Richmond 7.
## # ... with 7,646 more rows, and 4 more variables:
## # Uncontested.Possessions <dbl>, One.Percenters <dbl>,
## # Time.on.Ground.. <int>, Brownlow.Votes <dbl>
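Because each row is a player match, a natural next step is to aggregate to one row per player. The sketch below uses a small made-up data frame (the players, teams and numbers are placeholders, not real AFL data) with the same column names as the output above, so it runs without downloading anything. The same pipeline should work on dat.

```r
library(dplyr)

# Toy stand-in for the output of get_afltables_stats(); all values are made up
toy <- tibble(
  First.name = c("Player", "Player", "Player", "Player"),
  Surname = c("A", "A", "B", "B"),
  Playing.for = c("Team X", "Team X", "Team Y", "Team Y"),
  Contested.Possessions = c(10, 14, 6, 8),
  Uncontested.Possessions = c(12, 10, 15, 9)
)

# Collapse the player-match rows into one row per player
player_summary <- toy %>%
  group_by(First.name, Surname, Playing.for) %>%
  summarise(
    Matches = n(),
    Avg.Contested = mean(Contested.Possessions),
    Avg.Uncontested = mean(Uncontested.Possessions)
  ) %>%
  ungroup() %>%
  arrange(desc(Avg.Contested))

player_summary
```

Grouping on Playing.for as well as the name keeps a player who changed clubs as separate rows. On the real data, the ID column is probably the safer grouping key, since two players can share a name.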
That’s about it. The rest of the changes are just bug fixes, which you can see on the NEWS page of the package’s website.
Hit us up on Twitter at plusSixOneBlog and anoafl, or over on GitHub, if you have any feedback or issues! Enjoy.