tf2statr v0.1.0 Released
This package tries to connect much of the TF2 ecosystem to explore basic statistics and open it up to statistical models available in R. The main driving force behind this project is actually the abundance of stats available due to automatically generated logs of each match. However, the querying this information is easiest to do right after a match or on a case by case basis. For insight on a whole period of time, over certain play styles, or with a certain level of competition, the process is tedious and prone to error.
The package is structured around the main function of grabbing the actual logs from logs.tf. This uses the JSON API provided by logs.tf so all the raw values that are recorded can be examined as shown below.
library(tf2statr)
map1UBFi55 <- "999065"
log <- getLog(map1UBFi55)
names(log)
> [1] "teams" "length" "players" "names" "rounds"
> [6] "healspread" "info" "killstreaks" "table"
colnames(log$table)
> [1] "team" "kills" "deaths" "assists"
> [5] "suicides" "kapd" "kpd" "dmg"
> [9] "dmg_real" "dt" "dt_real" "hr"
> [13] "lks" "as" "dapd" "dapm"
> [17] "ubers" "drops" "medkits" "medkits_hp"
> [21] "backstabs" "headshots" "headshots_hit" "sentries"
> [25] "heal" "cpc" "ic" "daphr"
> [29] "hr_ratio" "dmg_realpdmg" "num_streaks"
log$table[,c("kills", "dmg")]
> kills dmg
> [U:1:31470213] 27 7503
> [U:1:10403381] 24 8215
> [U:1:2431135] 9 7405
> [U:1:25560164] 1 192
> [U:1:80143504] 15 6703
> [U:1:39592949] 15 8054
> [U:1:41611618] 14 7308
> [U:1:78056263] 0 266
> [U:1:25503021] 18 7212
> [U:1:40142806] 15 6786
> [U:1:47737701] 25 6692
> [U:1:47655495] 20 6738
Most of what you want to see in a log is compiled in the matrix ‘log$table’. Most of the columns are straight from the raw logs except for the dmg per amount of heals (daphr), percent of total heals received (hr_ratio), the ratio of the damage received by the opponent versus the dmg inflicted (dmg_realpdmg) and the number of kill streaks above 2 kills (num_streaks). The complicated variables “number of donks” are still found in the ‘log$player$[player_name]’. Medic stats are still in ‘medicstats’ though some of them has been integrated into the table. For a full list variables and their definitions look in ‘man/i55.Rd’
This seems dandy except for the row names being the steamID3. Its not very intuitive and I’m pretty sure no one remembers their steam IDs. One solution is to used the usernames used in the match that are still kept in “log$name”. This changes after games at the whim of the player so it would be hard to do a consistent analysis across a long period of time like a season. Players often have all sorts of extra characters, spaces or tags that could easily make R very unhappy. The current solution is to use the redirect from the steam id3 profile to get the custom profile for a player. These are relatively constant only a small number of people don’t have one.
logwnames <- getLog(map1UBFi55, useAltNames = TRUE)
log$table[,c("kills", "dmg")]
> kills dmg
> [U:1:31470213] 27 7503
> [U:1:10403381] 24 8215
> [U:1:2431135] 9 7405
> [U:1:25560164] 1 192
> [U:1:80143504] 15 6703
> [U:1:39592949] 15 8054
> [U:1:41611618] 14 7308
> [U:1:78056263] 0 266
> [U:1:25503021] 18 7212
> [U:1:40142806] 15 6786
> [U:1:47737701] 25 6692
> [U:1:47655495] 20 6738
Much better. These are stored in a csv file in ‘data/playerDict.csv’ and can be manually changed if the names aren’t pleasing. Using this, the third party utility only gets called when a new player is encountered.
How do you get the log IDs though? Logs.tf has a sequential ID scheme that on the basic level can be gathered in 2 different ways:
- Using the Logs.tf JSON search API
- Scraping comp.tf pages
The second one looks bad because it would seem to warrant needless stress to the comp.tf website. However, there is a archiving mechanism that saves the log IDs from the events in a JSON format in ‘data/eventArchive’. This means that to analyze a particular event it would only require 1 page hit for use whenever one wants the data. This was purposely made easily accessible and readable so that people can share these archives and make it so that the archives spans all TF2 events.
Here they are in action:
getLogIDsJSON(player = "[U:1:36568047]")
> Warning in parseJSONSearch(jsonSearch, num): More results may be online
> since you requested 10 matches and got 10 matches
> [1] 1025340 1025269 1025208 1025142 1025107 1024372 1024059 1023981
> [9] 1023921 1023072
getLogIDsComptf("Insomnia52", shReDownload = FALSE)[1:5]
> Upper_Round_1_match1_map1 Upper_Round_1_match2_map1
> "426348" "426353"
> Upper_Round_1_match3_map1 Upper_Round_1_match4_map1
> "426329" "426324"
> Upper_Round_1_match5_map1
> "426327"
Normally if you don’t specify if you want to redownload the data from comp.tf and its in the archive already, it will prompt you if you want to redownload. If its a new event it automatically scrapes the data and saves it into the archive.
So you now have a bunch of logs but want to do some analysis over all of them. Here is where “aggregateStats” come in. You can do per match statistics over any list of matches. Lets try this over i55 logs which are included as part of the package.
data("i55")
aggregateStats(i55)[,c("kills", "dmg", "gp")]
> kills dmg gp
> [U:1:74858077] 5.5000000 2101.5000 2
> mophead127 6.0000000 2032.5000 2
> [U:1:119224189] 6.5000000 2626.5000 2
> PsychoClarkie 0.0000000 79.5000 2
> HobNobsTF2 6.5000000 2688.0000 2
> youkn0w 6.0000000 2702.5000 2
> ashcp 15.5000000 4579.3750 17
> b4nny 18.3750000 7483.8125 17
> Duwatna 12.6875000 6224.6875 16
> sdot_ 0.5000000 131.6250 16
> -1 16.5625000 6178.7500 17
> zxzxzsxzsxzsxzzxazxaz 21.5000000 6120.8125 17
> yak404 20.8333333 8228.3333 6
> MaxPayneNG 19.0000000 6485.6667 6
> oskartfol 22.5000000 6709.3333 6
> Drackk 29.0000000 9170.0000 3
> mici- 0.8333333 229.0000 6
> henghast 18.5000000 4876.5000 3
> moursifanshawe 14.5000000 4654.0000 3
> stoyanovich 20.0000000 5728.5000 3
> [U:1:63821331] 19.0000000 6995.1667 6
> Nuseman 16.0000000 6480.5000 3
> turbo264 0.5000000 152.5000 3
> MattCV 16.1666667 4455.0000 6
> [U:1:81039113] 2.0000000 578.5000 2
> Maxi7 10.5000000 3614.5000 2
> Zafus 13.0000000 3563.0000 2
> bigwizard4 15.0000000 6332.0000 2
> DaSwift 14.5000000 5112.5000 2
> DanFromGrimsby 5.5000000 2977.0000 2
> kaidussssss 16.1666667 7251.0000 13
> [U:1:24249787] 0.5000000 122.1667 12
> vik6 16.5833333 5536.6667 13
> [U:1:3790465] 18.1666667 7861.3333 13
> kaptains 19.4166667 7179.0000 13
> Hafficool 17.0000000 5184.5000 13
> eepily 9.5000000 4832.0000 2
> CaptainGinger 1.0000000 241.0000 2
> Gubbinsss 13.0000000 4166.0000 2
> sircupcake 11.5000000 5590.5000 2
> olearry 10.0000000 3769.5000 2
> fribs 4.5000000 2307.5000 2
> fgjhkslfjhaslekfjhaleskj 15.5000000 6496.7500 12
> Bdonski 11.2500000 6382.5833 12
> TheFragile 0.6666667 199.8333 12
> xenema 16.7500000 4435.0833 12
> sorando 11.9166667 5544.5000 12
> shrugger94 14.3333333 4792.4167 12
> sweetfeetpete 11.5000000 6175.5000 2
> bagsbanijs 1.5000000 428.0000 2
> neotf2 9.5000000 4053.5000 2
> Sepulch 16.0000000 5829.0000 2
> En7ize 11.0000000 3302.5000 2
> irppa 17.5000000 4134.5000 2
> apubber 0.5625000 173.8125 13
> starkiepoo 17.9375000 5667.8750 13
> [U:1:41611618] 16.1250000 5851.5000 13
> tek36 19.1875000 7079.1250 13
> Flippy_AA 17.1875000 5828.0000 13
> ryb 11.9375000 5928.8125 13
> McLime 21.5000000 7266.5000 4
> turtletickler8 20.4000000 6241.4000 5
> Alfierulz 15.0000000 6201.0000 5
> Pheaassexynipples 18.2500000 6921.0000 4
> IceCubeFarmer 10.2500000 4029.0000 4
> Raptor00X 17.5000000 6247.0000 4
> warpzy 13.0000000 4145.2500 4
> stormingsheep 20.2000000 5680.4000 5
> [U:1:106391137] 15.4000000 6942.4000 5
> fishage 0.5000000 329.0000 4
> thatspudd 17.8000000 8414.4000 5
> kr4tos 2.8000000 558.8000 5
> leticus 14.5714286 6114.5714 7
> Smirre 12.7142857 6353.1429 7
> ondkaja 16.5714286 7416.0000 7
> vani- 0.5714286 181.1429 7
> DenniaPlz 15.4285714 4566.1429 7
> philmac 17.0000000 5326.6667 2
> danAnderson 16.0000000 6592.3333 2
> [U:1:38796065] 21.0000000 8971.6667 2
> [U:1:27767074] 19.5714286 5405.2857 7
> [U:1:114365291] 19.3333333 6799.3333 2
> craigcoleman 0.0000000 164.6667 2
> [U:1:1042154] 21.3333333 5769.0000 2
> serotone_alt_420 0.2222222 191.5556 6
> totoriffic 15.6666667 4610.1111 6
> WARHURYEAH 13.6666667 6769.8889 6
> The7alfa 14.8888889 5904.5556 6
> amsii_ 14.0000000 6404.8889 6
> [U:1:15933145] 18.2222222 5175.7778 6
Here you have a gigantic table with all the players who played with per game stats for each of the columns in log$table. There is another column include here called “games played (gp)” which each match is averaged against. Now averages are cool, but what if you want another function like standard deviation or a random user made function? Both can be done as shown below.
aggregateStats(i55, sd)[1:5,c("kills", "dmg", "gp")]
> kills dmg gp
> [U:1:74858077] 2.1213203 929.84542 2
> mophead127 2.8284271 193.04015 2
> [U:1:119224189] 0.7071068 605.99051 2
> PsychoClarkie 0.0000000 30.40559 2
> HobNobsTF2 0.7071068 1657.45830 2
bestStat <- function(x, na.rm = TRUE){ x[1] + x[2] }
aggregateStats(i55, bestStat)[1:5,c("kills", "dmg", "gp")]
> kills dmg gp
> [U:1:74858077] 11 4203 2
> mophead127 12 4065 2
> [U:1:119224189] 13 5253 2
> PsychoClarkie 0 159 2
> HobNobsTF2 13 5376 2
This opens the data set to all kinds of possibilities ranging from detecting trends in the meta, to machine learning the results for next season.
rmarkdown files for this post can be found at: https://github.com/sidjai.github.io/_drafts/tf2statr010.rmd