fitzRoy v1.0.0

I’m excited to announce that the official v1.0.0 release of fitzRoy has landed on CRAN!

Motivation

It’s been over 3 years since the first commit to the package, and it’s great to finally feel like it’s stable enough for a v1 release. The initial idea was to provide a set of functions that made scraping data easy (at the time, scraping was the only way to get access to historical AFL results), so that modellers and analysts could spend more time modelling and analysing.

In the subsequent years, I’ve added a bunch of patches, learned a lot about writing better code, optimised a bunch of functions and even restructured the package a few times. While this has been great, the package was only just being held together by a lot of spaghetti code.

This came to a head last year when, as in most walks of life, COVID threw a spanner in the works. Much of the logic that the package relied on was thrown out the window by things like rolling fixtures, weird round structures and various other quirks of the season that was 2020. I spent much of last year just fixing bugs and trying to keep the package functional, and it became quite apparent that it needed an overhaul.

An additional motivation has been AFLW. When I started the package, it didn’t sit great with me that we weren’t able to provide basic AFLW data but, in the early seasons of AFLW, that data generally didn’t exist. While that data is still hard to find, it is now more readily available and so I really wanted to make sure it got treated the same as the AFLM data.

As such, over the last few months I’ve completely overhauled the package and restructured everything. I think it’s now much simpler and easier to use, and it includes both a wider range of data and, importantly, a solid amount of AFLW data!

The main changes are below. For further details, have a read through the pkgdown site.

Highlights

New family of fetch_* functions

This new family of functions provides a simple and consistent API to the common types of AFL data. It also allows you to use the same interface to access data from any data source, meaning that it should require much less effort to switch between data sources or start a new analysis.

The basic structure of the fetch_* functions is that you specify the season, round_number, source and comp. Some examples are below but read through the extensive Main Fetch Functions vignette for a detailed run through.

An example with the fixture dataset.

fetch_fixture(season = 2021, source = "AFL", comp = "AFLM")
## # A tibble: 198 x 47
##       id providerId   utcStartTime       status  compSeason.id compSeason.provi…
##    <int> <chr>        <chr>              <chr>           <int> <chr>            
##  1  2991 CD_M2021014… 2021-03-18T08:25:… SCHEDU…            34 CD_S2021014      
##  2  2986 CD_M2021014… 2021-03-19T08:50:… SCHEDU…            34 CD_S2021014      
##  3  2992 CD_M2021014… 2021-03-20T02:45:… SCHEDU…            34 CD_S2021014      
##  4  2993 CD_M2021014… 2021-03-20T05:35:… SCHEDU…            34 CD_S2021014      
##  5  2994 CD_M2021014… 2021-03-20T08:25:… SCHEDU…            34 CD_S2021014      
##  6  2987 CD_M2021014… 2021-03-20T08:45:… SCHEDU…            34 CD_S2021014      
##  7  2990 CD_M2021014… 2021-03-21T02:10:… SCHEDU…            34 CD_S2021014      
##  8  2989 CD_M2021014… 2021-03-21T04:20:… SCHEDU…            34 CD_S2021014      
##  9  2988 CD_M2021014… 2021-03-21T07:10:… SCHEDU…            34 CD_S2021014      
## 10  2999 CD_M2021014… 2021-03-25T08:20:… SCHEDU…            34 CD_S2021014      
## # … with 188 more rows, and 41 more variables: compSeason.name <chr>,
## #   compSeason.shortName <chr>, compSeason.currentRoundNumber <int>,
## #   round.id <int>, round.providerId <chr>, round.abbreviation <chr>,
## #   round.name <chr>, round.roundNumber <int>, round.byes <list>,
## #   home.team.id <int>, home.team.providerId <chr>, home.team.name <chr>,
## #   home.team.abbreviation <chr>, home.team.nickname <chr>,
## #   home.team.teamType <chr>, home.team.club.id <int>,
## #   home.team.club.providerId <chr>, home.team.club.name <chr>,
## #   home.team.club.abbreviation <chr>, home.team.club.nickname <chr>,
## #   away.team.id <int>, away.team.providerId <chr>, away.team.name <chr>,
## #   away.team.abbreviation <chr>, away.team.nickname <chr>,
## #   away.team.teamType <chr>, away.team.club.id <int>,
## #   away.team.club.providerId <chr>, away.team.club.name <chr>,
## #   away.team.club.abbreviation <chr>, away.team.club.nickname <chr>,
## #   venue.id <int>, venue.providerId <chr>, venue.name <chr>,
## #   venue.abbreviation <chr>, venue.location <chr>, venue.state <chr>,
## #   venue.timezone <chr>, metadata.travel_link <chr>,
## #   metadata.ticket_link <chr>, compSeason.year <dbl>

The other functions are below.

lineup <- fetch_lineup(season = 2021, round_number = 7, comp = "AFLW")
results <- fetch_results(season = 2020, source = "afltables", comp = "AFLM")
ladder <- fetch_ladder(season = 2020, source = "squiggle")
stats <- fetch_player_stats(season = 2020, source = "fryzigg")

New data source

The official AFL website has been added as a new data source for fitzRoy. This provides access to official statistics and generally includes a lot more data than other sources. It is the default source in any of the fetch_* family of functions.

Some good examples are shown in the new Getting started vignette.

# The following two calls return the same result.
fetch_results(season = 2021, round_number = 1, source = "AFL", comp = "AFLM")
fetch_results_afl(season = 2021, round_number = 1, comp = "AFLM")
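
To illustrate why those two calls are equivalent, here is a rough sketch of the dispatch pattern: a general function that hands off to a source-specific helper based on the source argument. It is only an illustration under assumptions, not fitzRoy's actual internals. All of the `*_demo` functions are hypothetical stand-ins that return toy data frames rather than fetching real data.

```r
# Hypothetical source-specific helpers (toy data only, no network access).
fetch_results_afl_demo <- function(season) {
  data.frame(season = season, source = "AFL")
}

fetch_results_squiggle_demo <- function(season) {
  data.frame(season = season, source = "squiggle")
}

# Hypothetical general function: picks a helper based on `source`.
# "AFL" is the default here, mirroring the default described above.
fetch_results_demo <- function(season, source = "AFL") {
  switch(source,
    "AFL" = fetch_results_afl_demo(season),
    "squiggle" = fetch_results_squiggle_demo(season),
    # An unnamed final argument to switch() acts as the fallback.
    stop("Unknown source: ", source)
  )
}

# The general call and the source-specific call give the same result.
via_general <- fetch_results_demo(2021, source = "AFL")
via_helper <- fetch_results_afl_demo(2021)
identical(via_general, via_helper)
```

With this kind of structure, switching data sources in an analysis only means changing the source argument rather than learning a new function.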

AFLW data

For the first time, the “AFL” source also provides AFLW data. This includes fixtures, results, ladders, lineups and player stats. It is as simple as changing the comp argument to “AFLW” when using any of the fetch_* family of functions.

fetch_player_stats(season = 2021, round_number = 1, source = "AFL", comp = "AFLW")
## # A tibble: 294 x 67
##    providerId  utcStartTime  status compSeason.shor… round.name round.roundNumb…
##    <chr>       <chr>         <chr>  <chr>            <chr>                 <int>
##  1 CD_M202126… 2021-01-28T0… CONCL… 2021 AFL Womens  Round 1                   1
##  2 CD_M202126… 2021-01-28T0… CONCL… 2021 AFL Womens  Round 1                   1
##  3 CD_M202126… 2021-01-28T0… CONCL… 2021 AFL Womens  Round 1                   1
##  4 CD_M202126… 2021-01-28T0… CONCL… 2021 AFL Womens  Round 1                   1
##  5 CD_M202126… 2021-01-28T0… CONCL… 2021 AFL Womens  Round 1                   1
##  6 CD_M202126… 2021-01-28T0… CONCL… 2021 AFL Womens  Round 1                   1
##  7 CD_M202126… 2021-01-28T0… CONCL… 2021 AFL Womens  Round 1                   1
##  8 CD_M202126… 2021-01-28T0… CONCL… 2021 AFL Womens  Round 1                   1
##  9 CD_M202126… 2021-01-28T0… CONCL… 2021 AFL Womens  Round 1                   1
## 10 CD_M202126… 2021-01-28T0… CONCL… 2021 AFL Womens  Round 1                   1
## # … with 284 more rows, and 61 more variables: venue.name <chr>,
## #   home.team.club.name <chr>, away.team.club.name <chr>,
## #   player.jumperNumber <int>, player.photoURL <chr>,
## #   player.player.position <chr>, player.player.player.playerId <chr>,
## #   player.player.player.captain <lgl>,
## #   player.player.player.playerJumperNumber <int>,
## #   player.player.player.givenName <chr>, player.player.player.surname <chr>,
## #   teamId <chr>, gamesPlayed <lgl>, timeOnGroundPercentage <dbl>, goals <dbl>,
## #   behinds <dbl>, superGoals <lgl>, kicks <dbl>, handballs <dbl>,
## #   disposals <dbl>, marks <dbl>, bounces <dbl>, tackles <dbl>,
## #   contestedPossessions <dbl>, uncontestedPossessions <dbl>,
## #   totalPossessions <dbl>, inside50s <dbl>, marksInside50 <dbl>,
## #   contestedMarks <dbl>, hitouts <dbl>, onePercenters <dbl>,
## #   disposalEfficiency <dbl>, clangers <dbl>, freesFor <dbl>,
## #   freesAgainst <dbl>, dreamTeamPoints <dbl>, rebound50s <dbl>,
## #   goalAssists <dbl>, goalAccuracy <dbl>, ratingPoints <lgl>, ranking <lgl>,
## #   lastUpdated <chr>, turnovers <dbl>, intercepts <dbl>,
## #   tacklesInside50 <dbl>, shotsAtGoal <dbl>, goalEfficiency <lgl>,
## #   shotEfficiency <lgl>, interchangeCounts <lgl>, scoreInvolvements <dbl>,
## #   metresGained <dbl>, clearances.centreClearances <dbl>,
## #   clearances.stoppageClearances <dbl>, clearances.totalClearances <dbl>,
## #   player.playerId <chr>, player.captain <lgl>,
## #   player.playerJumperNumber <int>, player.givenName <chr>,
## #   player.surname <chr>, teamStatus <chr>, team.name <chr>

Deprecating functions

Many functions have been deprecated in favour of the new fetch_* family of functions. These are soft deprecations: the old functions will still work, but internally they just call the corresponding fetch_* function, and you will get a warning message. In the future, most of these functions will be removed. A full list can be seen in the Changelog, but the most commonly used ones are:

  • get_fixture
  • get_match_results
  • update_footywire_stats
  • return_ladder
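
For anyone curious what a soft deprecation looks like in practice, the sketch below shows the general pattern using base R's `.Deprecated()`. It is a minimal illustration under assumptions, not the package's actual code. The names `fetch_fixture_demo` and `get_fixture_demo` are hypothetical stand-ins, and the real functions may signal their warnings differently.

```r
# Hypothetical stand-in for a new-style function (returns toy data only).
fetch_fixture_demo <- function(season, comp = "AFLM") {
  data.frame(season = season, comp = comp)
}

# Hypothetical old-style name, kept as a soft-deprecated wrapper.
get_fixture_demo <- function(season) {
  # .Deprecated() is base R: it raises a warning naming the replacement.
  .Deprecated("fetch_fixture_demo")
  # The old function simply hands off to the new one.
  fetch_fixture_demo(season = season)
}

# The old name still works and returns the same data, but it warns.
old <- suppressWarnings(get_fixture_demo(2021))
new <- fetch_fixture_demo(2021)
identical(old, new)
```

If you see one of these warnings in your own scripts, swapping in the fetch_* function named in the message should be all that is needed.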

Bug fixes

There are a few small bug fixes as well. The bigger impact will probably be over on the data repository, where we’ve re-scraped a bunch of historical data. This mostly affects AFL Tables and Footywire data.
