No that’s the K2 Krabiwe data scrape (4-7 Jan 2024). Sadly not likely to ever be repeated.
WWIILogs participants put a Java app on their desktop. If they ran it alongside their game, it captured the data log from their port :8111 feed as the game ran and, when the game ended, closed the file and shipped the result to the project server.
This gave you all significant data for that game for all players, not just the participant, in the same text file. That let you aggregate things like which vehicle killed which vehicle the most and, unlike TS, it worked across all PvP modes. It also let you see real change over time, as you had a real date stamp for each player action, which TS does not have (its only date stamp is for when you choose to update your or someone else’s profile).
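For anyone wondering what that kind of aggregation looks like, here is a rough Python sketch. The record layout (timestamp, killer vehicle, victim vehicle) and the sample rows are made up for illustration; the actual WWIILogs file format isn't described in this thread.

```python
from collections import Counter
from datetime import datetime

# Hypothetical, hand-written kill records: (ISO timestamp, killer, victim).
# These are NOT real WWIILogs data, just placeholders for the sketch.
kills = [
    ("2024-01-04T18:02:11", "T-80BVM", "Leopard 2A6"),
    ("2024-01-04T18:05:40", "Leopard 2A6", "T-80BVM"),
    ("2024-01-05T20:15:03", "T-80BVM", "Leopard 2A6"),
    ("2024-01-06T09:30:27", "M1128", "BMP-2M"),
]

def kill_matrix(records):
    """Count how often each (killer, victim) vehicle pair occurs."""
    return Counter((killer, victim) for _, killer, victim in records)

def kills_per_day(records):
    """Count kills per calendar day, using the per-event timestamp."""
    return Counter(
        datetime.fromisoformat(ts).date().isoformat() for ts, _, _ in records
    )

if __name__ == "__main__":
    print(kill_matrix(kills).most_common(3))
    print(sorted(kills_per_day(kills).items()))
```

The second function only works because every event carries its own timestamp, which is exactly the part TS lacks.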
I wonder how many of those players go to Thunderskill each day to update their player records? Because the only date it captures is the date updated, everybody else’s data is only as good as the last day they did so (or someone did it for them).
Shame, because that’s, as far as I’m concerned, probably the way to go for gathering meaningful statistics. Then again, you will rather quickly reach many, many gigabytes and terabytes of data.
Yeah, that would probably be an upgrade over TS in some ways, but it would also skew harder towards those who are very interested in their stats. That’s one of the nice things about TS: it polls independently of someone having to run a piece of software.
There’s however many hundreds of thousands of matches logged, so a fair amount.
@DiamondLag
The only thing it’s biased towards is people in squadrons, which I don’t believe can be tied to any sort of skill level beyond excluding the newer players that might not be in one.
If I were to add you to Thunderskill, that would add an experienced player, but it would also add the entirety of your squadron, which is 70 players, plus your previous squadron, another 4 players, so that’s 74 additional players. All those players have their own history of squadrons, and the players in those will add even more, to the point where you probably end up having a very large chunk of the playerbase added.
You have to be registered, and you have to be logging in and updating frequently. It’s not really well understood, because the site was abandoned by its makers years ago. Vehicle data is supposedly only for the last month of member updates.
So if you check Thunderskill regularly, your personal influence on the stats for that vehicle is up to date. But if you logged in once last August and log in again now, the vehicle stats will auto-update with all the results from all the games you’ve played since last August, as if they all just happened today. Thunderskill has zero sense of when any of its games were actually played, which is probably its biggest problem.
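A toy Python sketch of that effect, for illustration only. The monthly game counts are invented, and the rule "every game lands on the next update date" is just my reading of the behaviour described above, not anything confirmed about Thunderskill's code.

```python
from collections import Counter

# Hypothetical games actually played per month by one player (made-up numbers).
actual = {
    "2023-09": 50,
    "2023-10": 60,
    "2023-11": 45,
    "2023-12": 70,
    "2024-01": 25,
}

def attribute_by_update(actual_by_month, update_months):
    """Assign each month's games to the first update at or after that month.

    Months are "YYYY-MM" strings, so plain string comparison orders them.
    Games played after the last update are not visible at all.
    """
    updates = sorted(update_months)
    seen = Counter()
    for month, games in sorted(actual_by_month.items()):
        target = next((u for u in updates if u >= month), None)
        if target is not None:
            seen[target] += games
    return dict(seen)

if __name__ == "__main__":
    # One update last August, one now: five months collapse into one.
    print(attribute_by_update(actual, ["2023-08", "2024-01"]))
    # Updating every month reproduces the real distribution.
    print(attribute_by_update(actual, list(actual)))
```

With only two updates, all 250 games show up as if they were played in the latest month, which is the distortion being described.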
Doesn’t seem really accurate. How many actually use that website? I’m hanging around quite a bit in the Steam Community Hub and here, and maybe saw it mentioned like once, and when I checked it out I found the web design too ugly to bother with it more lol. It doesn’t even show any of the newer nations like China or Sweden from what I can see.
Maybe it would be cool if Gaijin released the data they’ve got, since they clearly must collect it to balance that stuff out.
It’s fine for some things. It’s accurate afaik in what it polls, but what it polls and how it does it give you an indication of what you can learn from it.
Dataset analysis is a pain, and I can see why people think the way they do about Thunderskill.
I’m not that familiar with the exact details of how Thunderskill gathers its data, but I will say that it doesn’t really seem to work the way you imply. At least in terms of the dates when various members of my squadron got polled compared to me.
And I don’t have an account on there, and just checking some of the names in my squadron shows it doesn’t actually update them at all; there are some that go 6+ months with no updates.
Like, Thunderskill is a cool concept, it’s just not as applicable as you think it is.
It seems valid enough to me anyway. Obviously a vehicle with 5 games played isn’t useful, but you think 17,000 games in an M1128 is not going to show a similar outcome to what Gaijin is looking at? Or at least what their algorithm is looking at, as surely no human ever looks at it.
Comes down to how the polling is conducted. The way TS does it will skew the results in some way.
What are you actually looking to know?
I think there are perfectly fine questions one could answer using the dataset TS provides, like asking “What are currently the ‘meta vehicles’ or ‘meta lineups’ for a given BR range?” Something like that I see no issue with Thunderskill answering (at least as I understand Thunderskill).
Yeah this isn’t actually true (what Miragen said). The only person it adds is you. And the only data it accumulates is the differentials from the last update when you or someone else presses your Update button.
Squadron listings are actually incredibly bad. Because it’s an abandoned zombie site, there hasn’t been a check against real squadron membership since December 2022, so anyone on there who’s changed squadrons is currently listed in, and counting towards the stats of, both.
Also you can’t add another player to Thunderskill. You can only update records of other people who have previously granted Thunderskill their Gaijin account details (which are also all sitting on an abandonware server in Russia now, btw…2FA people.)
It’s lower-effort, but it’s still only working from the data provided when you or someone else hits the Update button on your player record on their website. It has no capacity to collect anything other than the deltas in your service record from the last time the Update button was pushed.
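Roughly, a delta-only update amounts to something like the Python sketch below. The `Snapshot` fields and the numbers are hypothetical; I don't know what Thunderskill actually stores, this just shows what "the difference between two cumulative service records" gives you.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Snapshot:
    """Hypothetical cumulative totals for one vehicle at update time."""
    battles: int
    kills: int
    deaths: int

def delta(previous: Snapshot, current: Snapshot) -> Snapshot:
    """Return what changed between two presses of the Update button."""
    return Snapshot(
        battles=current.battles - previous.battles,
        kills=current.kills - previous.kills,
        deaths=current.deaths - previous.deaths,
    )

if __name__ == "__main__":
    last_update = Snapshot(battles=1000, kills=900, deaths=800)
    this_update = Snapshot(battles=1250, kills=1150, deaths=990)
    print(delta(last_update, this_update))
```

All you get out is one lump of totals per update: no per-battle rows, no opponents, and no dates other than the update itself.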
Yeah, it’s a trade-off, but so are most things. Like, scraping every battle is the way to replicate any data Gaijin may have, but it’s also prohibitively expensive, and you actually have to sort it at the end of it, which is a major project on its own.
And they’ve locked it down further since the K2 heist. I think (no proof) that’s why Gszabi’s attempt to do something similar a week ago failed out. Blocked the bulk download of all replay files over 7 min long.