fitzRoy - 0.1.5 release

For those of you who’ve been following me on Twitter, you’ll know that I’ve been working on an R package for AFL called fitzRoy with Rob from Analysis of AFL. Today we released a new version which has a much-requested feature, so I figured a blog post was in order.

You’ll have to reinstall fitzRoy to get the latest functions. We still aren’t on CRAN but you can use devtools to get it.

# install.packages("devtools") # uncomment if you haven't installed devtools before
devtools::install_github("jimmyday12/fitzRoy")

AFL Tables player stats

Our initial version of fitzRoy included some data from a data dump we got from Paul at AFLtables. This data was great as it had all of the AFL Tables stats on a player-by-player basis for all time. While this was OK for historical analysis, it stopped at round 3, 2017, and it was a one-off dump, meaning we couldn’t keep it up to date.

As such, we’ve written a new function to replace this internal data. It’s called get_afltables_stats and takes two arguments, start_date and end_date. These are pretty self-explanatory: the function returns stats from all matches between start_date and end_date. Both inputs need to be in either dmy or ymd format.

Both arguments are optional. start_date defaults to the first AFL game and end_date defaults to the system date.
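To make the two date orders concrete, here is a small sketch using lubridate (loaded separately here; it is an assumption that you have it installed). It only illustrates what "ymd" and "dmy" mean for the same day. It is not necessarily how get_afltables_stats parses its inputs internally, and the exact separators the function accepts aren't covered here.

```r
library(lubridate)

# The same day written in the two orders mentioned above
ymd_input <- ymd("2018-03-22")  # year-month-day
dmy_input <- dmy("22-03-2018")  # day-month-year

# Both parse to the same Date value
same_day <- ymd_input == dmy_input
print(same_day)

# With fitzRoy installed and an internet connection, a bounded request
# could then look like this (not run here):
# dat <- get_afltables_stats(start_date = "2018-03-22", end_date = "2018-03-25")
```

Leaving out end_date, as in the example that follows, simply falls back to the default described above.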

As an example, we could just grab data from this year.

library(fitzRoy)
library(tidyverse)
## ── Attaching packages ───────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.0.0     ✔ purrr   0.2.5
## ✔ tibble  1.4.2     ✔ dplyr   0.7.6
## ✔ tidyr   0.8.1     ✔ stringr 1.3.1
## ✔ readr   1.1.1     ✔ forcats 0.3.0
## ── Conflicts ──────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
dat <- get_afltables_stats("2018-01-01")
## Returning data from 2018-01-01 to 2018-08-15
## Downloading data
## 
## Finished downloading data. Processing XMLs
## Finished getting afltables data
tail(dat)
## # A tibble: 6 x 58
##   Season Round Date       Local.start.time Venue      Attendance Home.team
##    <int> <chr> <date>                <int> <chr>           <int> <chr>    
## 1   2018 21    2018-08-12             1440 "Perth St…      40028 Fremantle
## 2   2018 21    2018-08-12             1440 "Perth St…      40028 Fremantle
## 3   2018 21    2018-08-12             1440 "Perth St…      40028 Fremantle
## 4   2018 21    2018-08-12             1440 "Perth St…      40028 Fremantle
## 5   2018 21    2018-08-12             1440 "Perth St…      40028 Fremantle
## 6   2018 21    2018-08-12             1440 "Perth St…      40028 Fremantle
## # ... with 51 more variables: HQ1G <int>, HQ1B <int>, HQ2G <int>,
## #   HQ2B <int>, HQ3G <int>, HQ3B <int>, HQ4G <int>, HQ4B <int>,
## #   Home.score <int>, Away.team <chr>, AQ1G <int>, AQ1B <int>, AQ2G <int>,
## #   AQ2B <int>, AQ3G <int>, AQ3B <int>, AQ4G <int>, AQ4B <int>,
## #   Away.score <int>, First.name <chr>, Surname <chr>, ID <dbl>,
## #   Jumper.No. <int>, Playing.for <chr>, Kicks <dbl>, Marks <dbl>,
## #   Handballs <dbl>, Goals <dbl>, Behinds <dbl>, Hit.Outs <dbl>,
## #   Tackles <dbl>, Rebounds <dbl>, Inside.50s <dbl>, Clearances <dbl>,
## #   Clangers <dbl>, Frees.For <dbl>, Frees.Against <dbl>,
## #   Brownlow.Votes <dbl>, Contested.Possessions <dbl>,
## #   Uncontested.Possessions <dbl>, Contested.Marks <dbl>,
## #   Marks.Inside.50 <dbl>, One.Percenters <dbl>, Bounces <dbl>,
## #   Goal.Assists <dbl>, Time.on.Ground.. <int>, Substitute <int>,
## #   Umpire.1 <chr>, Umpire.2 <chr>, Umpire.3 <chr>, Umpire.4 <chr>

Note that each row is a ‘player match’, so the first few columns are just repeated team-level data. It is probably more interesting to look at specific columns relating to player stats.

dat %>% 
  select(Date, First.name, Surname, Playing.for, Contested.Possessions, 
         Uncontested.Possessions, One.Percenters, Time.on.Ground.., 
         Brownlow.Votes)
## # A tibble: 7,656 x 9
##    Date       First.name Surname  Playing.for Contested.Possessions
##    <date>     <chr>      <chr>    <chr>                       <dbl>
##  1 2018-03-22 David      Astbury  Richmond                       9.
##  2 2018-03-22 Shai       Bolton   Richmond                       3.
##  3 2018-03-22 Dan        Butler   Richmond                       7.
##  4 2018-03-22 Josh       Caddy    Richmond                      11.
##  5 2018-03-22 Jason      Castagna Richmond                       7.
##  6 2018-03-22 Reece      Conca    Richmond                       6.
##  7 2018-03-22 Trent      Cotchin  Richmond                      13.
##  8 2018-03-22 Shane      Edwards  Richmond                       9.
##  9 2018-03-22 Brandon    Ellis    Richmond                       3.
## 10 2018-03-22 Corey      Ellis    Richmond                       7.
## # ... with 7,646 more rows, and 4 more variables:
## #   Uncontested.Possessions <dbl>, One.Percenters <dbl>,
## #   Time.on.Ground.. <int>, Brownlow.Votes <dbl>
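Because each row is a ‘player match’, the match-level columns repeat for every player in a game. If you ever want one row per match instead, a rough sketch is to keep only the match-level columns and drop the duplicates with dplyr's distinct(). The data frame below is a made-up stand-in with placeholder teams, players and scores (not real results), using a handful of the column names shown in the output above.

```r
library(dplyr)

# Toy stand-in for the real data: two matches, placeholder values only
player_rows <- tibble(
  Date       = as.Date(c("2018-03-22", "2018-03-22", "2018-03-22", "2018-03-23")),
  Home.team  = c("Team A", "Team A", "Team A", "Team C"),
  Away.team  = c("Team B", "Team B", "Team B", "Team D"),
  Home.score = c(100L, 100L, 100L, 80L),
  Away.score = c(90L, 90L, 90L, 85L),
  Surname    = c("Player1", "Player2", "Player3", "Player4"),
  Kicks      = c(10, 12, 8, 15)
)

# Keep only the match-level columns and collapse the repeated rows
match_rows <- player_rows %>%
  distinct(Date, Home.team, Away.team, Home.score, Away.score)

print(match_rows)
```

On the real data you would swap player_rows for dat and list whichever match-level columns you care about.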

That’s about it. The rest of the changes are just bug fixes, which you can see on the NEWS page of the package’s website.

Hit us up on Twitter at plusSixOneBlog and anoafl, or over on GitHub, if you have any feedback or issues! Enjoy.
