128 ticks per second servers. (And lo, suddenly the article's thesis is inherently clear.)
A "tick", or an update, is a single step forward in the game's state. UPS (as I'll call it from here) or tick rate is the frequency of those. So, 128 ticks/s == 128 updates per sec.
That's a high number. For comparison, Factorio is 60 UPS, and Minecraft is 20 UPS.
At first I imagined an FPS's state would be considerably smaller, which should support a higher tick rate. But I also forgot about fog of war & visibility (Factorio for example just trusts the clients), and needing to animate for hitbox detection. (Though I was curious if they're always animating players? I assume there'd be a big single rectangular bounding box or sphere, and only once a projectile is in that range, then animations occur. I assume they've thought of this & it just isn't in there. But then there was the note about not animating the "buy" portion, too…)
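For a concrete picture of what a "tick" is mechanically: the server's main loop is usually a fixed-timestep loop, something like this minimal C++ sketch (structure and names invented for illustration, not Valorant's actual code):

```cpp
#include <chrono>
#include <thread>

// Hypothetical placeholders for the real per-tick work:
void simulate() {}   // apply queued inputs, move actors, resolve hits
void broadcast() {}  // send a state snapshot to every connected client

int main() {
    using clock = std::chrono::steady_clock;
    // 128 updates per second -> 1s / 128 = 7.8125ms budget per tick.
    constexpr auto kTickPeriod = std::chrono::nanoseconds(1'000'000'000 / 128);

    auto next = clock::now();
    for (;;) {
        simulate();
        broadcast();
        next += kTickPeriod;
        // Sleep off whatever is left of the budget. If simulate() overran,
        // 'next' is already in the past and the server starts falling behind.
        std::this_thread::sleep_until(next);
    }
}
```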
When Battlefield 4 launched it had terrible network performance. Battlefield 3 wasn't great but somehow BF4 was way worse. Turned out while clients sent updates to the server at 30 Hz, the server sent updates back only at 10 Hz[1].
This was the same as BF3, but there were also some issues with server load making things worse and high-ping compensation not working great.
After much pushback from players, including some great analysis by Battle(non)sense[2] that really got traction, the devs got the green light on improving the network code and worked on it for a long time. In the end they got high-tickrate servers[3][4], up to 144Hz (though I mostly played on 120Hz servers), along with a lot of other improvements.
The difference between a 120Hz server and a 30Hz one was night and day for anyone who could tell the difference between the mouse and the keyboard. Problem was that by then the game was half-dead... but it was great for the 15 of us or so still playing it at that time.
[1]: https://www.reddit.com/r/battlefield_4/comments/1xtq4a/battl...
[2]: https://www.youtube.com/@BattleNonSense
[3]: https://www.reddit.com/r/battlefield_4/comments/35ci2r/120hz...
[4]: https://www.reddit.com/r/battlefield_4/comments/3my0re/high_...
Also for comparison, the Runescapes (both RS3 and Oldschool Runescape) have a 0.6 tick/second system (100 ticks/minute). It works rather well for these games, which I guess highlights that some games either a) can get away with high latencies depending on their gameplay mechanics, or b) will evolve gameplay mechanics based on the inherent limitations of their engines/these latencies. RS3 initially leaned into the 0.6s tick system (which is a remnant of its transitions from DeviousMUD to Runescape Classic to RS2) and eventually developed an ability-based combat system on top of what was previously a purely point-and-click combat system, whereas OSRS has evolved new mechanics that play into this 0.6s tick system and integrate seamlessly into the point-and-click combat system.
Having played both of these games for years (literally, years of logged-in in-game time), most FPS games with faster tick systems generally feel pretty fluid to me, to the point where I don't think I've ever noticed the tick system acting strange in an FPS beyond extreme network issues. The technical challenges that go into making this so are incredible, as outlined in TFA.
100 ticks/minute is 1.666... ticks per second, not 0.6.
They're right: when they said 0.6s tick, they mean there's a tick every 0.6 seconds.
It's important to some players because you can get some odd behaviour out of the game by starting multiple actions on the same tick, or on the tick after you started a different action. It's ridiculously click-intensive, but you can get weird benefits like cutting the time to take an action short or getting XP in 2 skills at once.
Whoops, bit of a slip there. It is 100 ticks per minute, or 1 tick every 0.6 seconds. I was wrong in the first description there.
> I assume there'd be a big single rectangular bounding box or sphere, and only once a projectile is in that range, then animations occur.
Now that's a fun one to think about. Hitscan attacks are just vectors, right? So would there be some perf benefit to doing that initial intersection check with a less-detailed hitbox, then running the higher-res animated check only if the initial one reports back "yeah, this one could potentially intersect"? Or is the check itself expensive enough that it's faster to just run it once at full resolution?
Either way, this stuff is engineering catnip.
This is the basis for basically every physics engine in some form or another. Collision is divided into a "broad phase" pruning step that typically uses the bounding box of an object, and a "narrow phase" collision detection step that uses the more detailed collision primitives.
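Sketched out, those two phases for a hitscan shot could look like this (an illustrative C++ sketch; the types and the stubbed narrow phase are invented):

```cpp
struct Vec3 { float x, y, z; };

// Broad phase: cheap ray-vs-sphere test against one bounding sphere
// around the whole character (d must be normalized).
bool RayHitsSphere(Vec3 o, Vec3 d, Vec3 c, float r) {
    Vec3 m{o.x - c.x, o.y - c.y, o.z - c.z};          // sphere center -> ray origin
    float b  = m.x * d.x + m.y * d.y + m.z * d.z;
    float cc = m.x * m.x + m.y * m.y + m.z * m.z - r * r;
    if (cc > 0.0f && b > 0.0f) return false;          // outside and pointing away
    return b * b - cc >= 0.0f;                        // discriminant check
}

struct Player {
    Vec3  boundCenter;   // broad-phase bounding sphere
    float boundRadius;
    // per-bone capsule hitboxes would live here for the narrow phase
};

bool HitscanCheck(Vec3 origin, Vec3 dir, const Player& p) {
    if (!RayHitsSphere(origin, dir, p.boundCenter, p.boundRadius))
        return false;  // pruned: no need to pose the skeleton at all
    // Narrow phase: pose the skeleton for the shot's timestamp, then
    // ray-test each per-bone hitbox. Stubbed out here.
    return true;
}
```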
And Fortnite is allegedly 30 ticks per second.
Fortnite has to handle 100 players in the same process, very different from 128Hz @ 10 players max.
Nowadays it's more like 20 players and 80 bots, so a lot less networking stuff going on, and the bot AI is so basic that I doubt it has a significant impact on server performance unless it's very badly implemented.
As long as some games (e.g. tournaments, ranked unreal) need to be mostly human, the software still needs to be able to handle it.
But it wouldn’t surprise me too much if those use a higher tier of hardware.
AI is more expensive than a regular player.
In my experience you can do reasonable bots cheaper than sending network updates to a regular player. That's the straightforward version, but you can also tick their logic far less often than every update if you want, and even less often if nobody is near them or spectating them.
Also, some stuff you might want to calculate for them to use to make decisions can be shared among all bots.
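A sketch of both ideas (reduced-rate, staggered bot thinking plus a shared per-tick blackboard) in C++; all names are invented:

```cpp
#include <cstdint>
#include <vector>

struct Bot { int id = 0; /* nav state, target, ... */ };

// At 128 UPS, a divisor of 16 means each bot "thinks" 8 times a second.
constexpr uint64_t kBotThinkDivisor = 16;

void TickBots(std::vector<Bot>& bots, uint64_t tick) {
    // Work every bot's decision-making needs can be computed once per
    // tick and shared (e.g. which objectives are contested).
    // auto blackboard = ComputeSharedBlackboard();  // hypothetical

    for (Bot& b : bots) {
        // Stagger by id so the bots don't all think on the same tick.
        if ((tick + static_cast<uint64_t>(b.id)) % kBotThinkDivisor != 0)
            continue;
        // DecideActions(b, blackboard);  // hypothetical: the expensive part
    }
    // Cheap work (following an already-chosen path) can still run
    // every tick so bots move smoothly.
}
```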
An advantage to bots is that the server can trust them. It doesn't need to do any verification of input since it can trust that they aren't cheating (except they probably are...).
That's not an excuse to give up on performance. The map is also much bigger which spreads people out.
Apex Legends is at 20 IIRC (My memory is from 2-3 years back on this one tho)
CSGO was at 64 for the standard servers and 128 for Faceit (IIRC CS2 is doing some dynamic tick shenanigans unless they changed back on that)
Overwatch is I think at 60
CS2 is mostly 64 tick from what I understand. The "sub-tick" stuff is timestamping actions that happen on a local frame before the next tick. So in theory the client feels perfectly responsive and the server can adjust for the delta between your frame and the tick.
In practice it seems to have been an implementation nightmare because they've regularly shipped both bugs and fixes for the "sub-tick" system.
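Based purely on that description, the server-side half of such a scheme might look roughly like this (a sketch of the general idea with invented names, not Valve's actual code):

```cpp
#include <cstdint>
#include <vector>

// Each input is stamped with how far into the tick interval it
// happened on the client, as a 0..1 fraction.
struct TimestampedInput {
    uint64_t tick;
    float    frac;   // 0..1 within the tick
    // buttons, view angles, ...
};

// Instead of snapping all inputs to the tick boundary, advance the
// simulation in partial steps up to each input's timestamp.
void RunTickSubTick(const std::vector<TimestampedInput>& inputs, float tickDt) {
    // inputs assumed sorted by frac
    float simulated = 0.0f;  // fraction of the tick simulated so far
    for (const auto& in : inputs) {
        // AdvanceSimulation((in.frac - simulated) * tickDt);  // hypothetical
        // ApplyInput(in);                                     // hypothetical
        simulated = in.frac;
    }
    // AdvanceSimulation((1.0f - simulated) * tickDt);  // finish the tick
}
```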
The netcode in CS2 is generally much worse than in CSGO or other Source games. The game transmits way more data for each tick, and they disabled snapshot buffering by default, meaning that way more players experience jank when their network inevitably drops packets.
That's very interesting. The CS2 netcode always felt a little brittle and janky to me, but I could never pinpoint exactly what was causing the issues, especially since other games mostly run fine for me.
I also remember reading a few posts about their new subtick system but never put two and two together. Hopefully they keep refining it.
Worth noting that part of the packet size appears to be due to animation data, which they’ve begun the process of transitioning to a more efficient system. [0]
With that being said: totally agree on the netcode.
[0]: https://old.reddit.com/r/GlobalOffensive/comments/1fwgd59/an...
It's actually incredible how CSGO was such a great game and it's been replaced (not deprecated, replaced!) by CS2, which is still inferior over 2 years after launch.
CS2 is 64 tick under the hood, with interpolation between the ticks. In the beta, server operators could modify the tick rate by patching the server binary, but when that revealed inconsistencies (which the "subtick" system was meant to avoid), they hard-coded the client-side tick rate to 64 [0].
[0]: https://twitter.com/thexpaw/status/1702277004656050220
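"Interpolation between the ticks" is the standard render-behind-and-blend trick; a minimal sketch of the idea (not CS2's actual code):

```cpp
struct Vec3 { float x, y, z; };

static Vec3 Lerp(Vec3 a, Vec3 b, float t) {
    return { a.x + (b.x - a.x) * t,
             a.y + (b.y - a.y) * t,
             a.z + (b.z - a.z) * t };
}

// prev/next are the two most recent 64-tick snapshots, taken at server
// times t0 and t1 (t1 - t0 = 1/64 s). The client renders slightly in
// the past ("interp delay") so renderTime usually falls between them.
Vec3 InterpolatePosition(Vec3 prev, Vec3 next,
                         double t0, double t1, double renderTime) {
    float t = static_cast<float>((renderTime - t0) / (t1 - t0));
    if (t < 0.0f) t = 0.0f;   // clamp: don't extrapolate backwards
    if (t > 1.0f) t = 1.0f;   // or past the newest snapshot
    return Lerp(prev, next, t);
}
```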
It's strange Apex is 20 ticks… it's often about as fast-paced as FPS games get. Is it because the number of players on the same map is a lot higher than in other games?
And a few more players.
Client update was measured to be 73, not quite matching the 128 server tick and update rate. Maybe it changed in the last 5 years. CSGO private servers also ran with 128 tick rate.
https://www.youtube.com/watch?v=ftC1Rpi8mtg
OSRS plays on 0.6 TPS... or 100 ticks per minute. Kind of funny how different that is.
No, OSRS is 100 ticks per minute, which gives 0.6-second ticks, i.e. about 1.667 ticks per second.
Eve Online probably wins the slowest tickrate award with its whopping 1 tick per second.
IIRC it gets even slower during massive battles where there are hundreds/thousands of players on the same server and area
https://wiki.eveuniversity.org/Time_dilation
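The mechanism is pleasantly simple to state: when a tick's work exceeds the real-time budget, scale the game clock down instead of dropping updates. A back-of-the-envelope C++ sketch (the 10% floor is from EVE's public docs; everything else is invented):

```cpp
// Returns the factor by which game time advances relative to real
// time. 1.0 = real time; EVE floors time dilation ("TiDi") at 10%.
double ComputeTimeDilation(double tickWorkSeconds, double tickBudgetSeconds) {
    if (tickWorkSeconds <= tickBudgetSeconds)
        return 1.0;                                    // keeping up fine
    double tidi = tickBudgetSeconds / tickWorkSeconds; // < 1.0 when overloaded
    return tidi < 0.10 ? 0.10 : tidi;                  // documented 10% floor
}

// Per tick, the simulation then advances game time by realDt * tidi,
// so a huge fleet fight can run at one-tenth speed but stay consistent.
```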
Yup. As a plugin dev it has its weird quirks, but it's quite amazing how the entire game runs at that speed.
OSRS players hate when they have to actually play the game. They just want to click the screen every 10 minutes while playing something else.
...what? the game is full of highly technical and demanding challenges which require "tick-perfect" inputs 2-5x per 0.6s game tick.
OSRS is a rhythm game. Fight me.
I don't think anyone would fight you on that. Inferno cheat plugins looked like 100 bpm Guitar Hero back in the day.
me_irl
Fallout 76, for example, lets you see where other players are facing or looking, or where they are pointing their guns even if they don't fire. The models are animated according to the input of their users.
I don't think its ticks per second are great, because the game is known for significant lag when more than a dozen players are in the same place shooting at things.
Lag is different than ‘unloaded’ ticks per second.
20-30 is commonly believed by the Fallout 76 community.
We should add 2020 to this. I read this article earlier and thought there had been some updates to the architecture.
This deep dive article is very nice.
> At any given time, ~50 of those games are going to be in the buy phase. Players will be purchasing equipment safely behind their spawn barriers and no shots can hurt them. We realized we don’t even need to do any server-side animation during the buy phase, we could just turn it off.

That explains the current trend of "online" video games that is so annoying: for 10 minutes of play, you have to wait 10 minutes of lobby time and forced animations, like end-game animations.

On BO6 it kills me. You just want to play, and sometimes you don't have more than 30 minutes for a quick video game session, but with the current games you always have to wait a very, very long time. Painfully annoying.
This is not equivalent to "lobby time or end game animations" in other games.
In Valorant (similar to Counter Strike), at the start of the game you have 60 seconds to buy your weapons and abilities for the round. Valorant/CS is typically a best-of-13, and before each round is a 60 second "buy" period.
In CS you can leave the buy zone immediately. I don't necessarily believe that Valorant's decision to fence players in their spawn for the first minute during the buy period is simply to save on server costs, especially because they realized that optimization possibility after the fact. Being able to buy your weapons quickly may have an element of skill, but it doesn't make for particularly interesting gameplay. They may have just decided that they'd rather level this part of the playing field so people can focus on the core tactical FPS gameplay.
In CS you have a 20 second+ buy phase where you can't move.
Buy phase is also used for navigating the map on your "side" before you can shoot at the other guys. Also for planning.
In CS you have 15 seconds to buy, which is more than enough for any non-newbie.
It's the idea that if they leave more players idling in a lobby, buy period, or animation, it costs them less.
It's a deceptive way to sell people less game.
> It's a deceptive way to sell people less game.
That's a dumb take. The buying phase is an integral part of the game mode. And the game is free.
They bury/obscure quite an important detail in this article:
> We were still running on the older Intel Xeon E5 processors, ...
> Moving to the more modern Xeon Scalable processors showed major performance gains for our server application
But I was unable to find any mention in the article of which processors they were actually comparing in their before/after.
Counter-Strike: Global Offensive was also able to handle 128 TPS just fine. They just chose to never implement it in official matchmaking (64 TPS). It did work very smoothly on community servers.
Counter-Strike 2 implements a controversial "sub tick" system on top of 64 TPS. It is not comparable to actual 128 TPS, and often worse than standard 64 TPS in practice.
Lots of things work fine when you throw twice as many dollars at them. It’s not a matter of it working or not. It’s a matter of economics.
Most game servers are single threaded because the goal is to support the maximum number of players per dollar.
A community server doesn’t mind throwing more compute dollars to support more players or higher tick rate. When you have one million concurrent players - as CounterStrike sometimes does - the choice may be different.
Sure, twice as many dollars in the immediate term. In the context of a tick rate being 10 or 20 years old, it's more like deciding whether you can host 50x or 100x as many players per dollar of infrastructure. The upside of a higher tick rate grows as computers and connections improve and the downside shrinks semi-exponentially.
I vaguely remember Counter-Strike Source servers running at 33, 66, or 100 tick. My high school gaming clan was called "10tik", poking fun at the ancient Pentium box that I ran the CSS server on.
You can mess with the code all day long, but you're not getting away from raw latency.
The modern matchmaking approach groups people by skill, not latency, so you get a pretty wild mix of latency.
It feels nothing like the old regional servers. Sure the skill mix was varied, but at least you got your ass handed to you in crisp <10ms by actual skill. Now it's all getting knife-noscoped around a corner by a guy who rubberbanded 200ms into the next sector of the map already, while insulting your mom and wearing a unicorn skin.
Good thing they thought of that. Disclaimer: I was at Riot during some of the Valorant dev cycle, and the stated goal in this tech blog [0] was a huge goal (keeping latency < 35ms).
This was only really doable because Riot has invested significantly in buying dark fiber and peering at major locations worldwide [1][2]
[0] - https://technology.riotgames.com/news/peeking-valorants-netc...
[1] - https://technology.riotgames.com/news/fixing-internet-real-t...
[2] - https://technology.riotgames.com/news/fixing-internet-real-t...
I miss playing on consistent <10ms servers in the CS 1.6 days.
The Houston/Dallas/Austin/San Antonio region was like a mini universe of highly competitive FPS action. My 2mbps roadrunner cable modem could achieve single digit ping from Houston to Dallas. Back in those days we plugged the modem directly into the gaming PC.
> The modern matchmaking approach groups people by skill not latency
I work at a game studio, and something I have seen is that nobody is on wired anymore. You are a power user if you are on wired. The overwhelming 99% of users will be on mobile or Wi-Fi, 10ms to the first hop or two.
Very interesting read. It seems like management/engineering/vendors were all willing to get on the same page to hit the frame budget. Especially the bit about profiling every line of game code into an appropriate bucket - that sounds like a lot of work which paid off handsomely.
If you just make a list of “performance tweaks” you might learn about in, say, a game dev blog post on the internet, and execute them without considering your application’s specific needs and considerations, you might hurt performance more than you help it.
nice.
Great irony: I finished a League game just now where the whole game lagged (for everyone in it), only to find this at the top of HN.
not very ironic at all really, since they're different games?
Same company, different focus
I wonder if any game servers are implemented in Erlang?
You'll never get a modern FPS game server with good performance written in a GC language. Erlang is also pretty slow, with Python-like performance, very far from C#, Go, and Java.
The other reason is that the client and the server have to be written in the same language.
> The other reason is that the client and the server have to be written in the same language.
This isn't true at all.
Sure, it can help to have both client and server built using the same engine or framework, but it's not a hard requirement.
Heck, the fact that you can have browser-based games when the server is written in Python is proof enough that they don't need to be the same language.
Browser games don't need performance. I'm talking about AAA online games here, 99% of which are built in C++ and the rest in C#.
You can just turn off GC for the duration of the match, or during rounds.
I am currently doing this! Working on an MMO game server implemented in Elixir. It works AMAZING and you get so many extra observability and reliability features for FREE.
I don't know why it's not more popular. Before I started the project, some people said that BeamVM would not cut it for performance. But this was not true. For many types of games, we are not doing expensive computation on each tick. Rather it's just checking rules for interactions between clients and some quick AABB + visibility checks.
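The kind of per-tick check being described, sketched in C++ for concreteness (the actual server above is Elixir; names invented):

```cpp
struct AABB { float minX, minY, maxX, maxY; };

// Cheap axis-aligned bounding-box overlap (2D for brevity).
bool Overlaps(const AABB& a, const AABB& b) {
    return a.minX <= b.maxX && b.minX <= a.maxX &&
           a.minY <= b.maxY && b.minY <= a.maxY;
}

// Visibility / area-of-interest: only send entity B's updates to
// client A if B is within A's view range. Squared distance avoids sqrt.
bool InViewRange(float ax, float ay, float bx, float by, float range) {
    float dx = bx - ax, dy = by - ay;
    return dx * dx + dy * dy <= range * range;
}
```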
I've been told that Erlang is somewhat popular for matchmaking servers. It ran the Call of Duty matchmaking at one point. Not the actual game servers though - those are almost certainly C++ for perf reasons.
I distinctly remembered that Eve Online was in Erlang, went to go find sources and found out I was 100% wrong. But I did find this thread about a game called "Vendetta Online" that has Erlang... involved, though the blog post with details seems to be gone. Anyway, enjoy! http://lambda-the-ultimate.org/node/2102
Eve used Stackless Python.
CoD Black Ops used/uses Erlang for most of its backend afaik. https://www.erlang-factory.com/upload/presentations/395/Erla...
It sounds like they're so heavily invested in Unreal Engine that it's become the entire stack.
I was imagining some blindingly fast C or Rust on bare metal.
That UE4 code snippet is brutal on the eyes.
Network connection, lobby, matchmaking, leaderboards or even chats, yes. But the actual simulation, probably not for fast paced twitchy shooter.
Also, it's not just for performance reasons (I wouldn't call BeamVM hard realtime) but also for code: your game server would usually be the client but headless (without rendering). Helps with reuse and architecture.
In the case of Call of Duty: Black Ops 1, the matchmaking + leaderboards system was implemented by DemonWare (3rd party) in Erlang.
Erlang actually has good enough performance for many types of multiplayer games, though you are correct that it may not cut it for fast-paced twitch shooters. Well... I'm not exactly sure about that. You can offload lots of expensive physics computations to NIFs. In my game the most expensive computation is AI path-finding, though this never occurs on the main simulation tick; other processes run this on their own time.
The biggest hurdle to a game server written entirely on the BEAM is the GC. GC pauses just take too much time, and when you need to get out (for example) 120 updates per second, you can't afford them. Even offloading stuff to C or C++ does not save you, because you either have to use the GC, do a copy, or both.
Game servers typically use very cheap memory allocation techniques like arenas and utilize DOD. It's not uncommon for a game server simulation to be just a bunch of arrays that you grow, never shrink, and then reset at the end of the game.
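That pattern, as a minimal bump-arena sketch in C++ (illustrative; real engines add alignment guarantees, debug poisoning, etc.):

```cpp
#include <cstddef>
#include <cstdlib>

// Bump arena: allocation is a pointer increment, there is no per-object
// free, and Reset() "frees" everything at once (e.g. at end of round).
class Arena {
    char*  buf_;
    size_t cap_;
    size_t used_ = 0;
public:
    explicit Arena(size_t cap)
        : buf_(static_cast<char*>(std::malloc(cap))), cap_(cap) {}
    ~Arena() { std::free(buf_); }

    void* Alloc(size_t n, size_t align = alignof(std::max_align_t)) {
        size_t p = (used_ + align - 1) & ~(align - 1);  // round up to align
        if (p + n > cap_) return nullptr;               // arena exhausted
        used_ = p + n;
        return buf_ + p;
    }
    void Reset() { used_ = 0; }  // everything gone in O(1), memory kept warm
};
```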
IIRC, Activision/Blizzard uses Erlang for their matchmaking systems (or used to... I saw it in a very old talk)
> In VALORANT’s case, .5ms is a meaningful chunk of our 2.34ms budget. You could process nearly a 1/4th of a frame in that time! There’s 0% chance that any of the game server’s memory is still going to be hot in cache.
This feels like a less-than-ideal architectural choice, if this is the case!?
Sounds like each game server is independent. I wonder if anyone has done more shared-state multi-hosting? Warm up a service process, then fork it as needed, so there's some shared i-cache? Have things like levels and hitboxes in an immutable memfd, shared with each service instance, so that the d-cache can maybe share across instances? (A sketch of that idea follows below.)
With Spectre/Meltdown et al., a context switch probably has to totally burn down the caches nowadays? So maybe this wouldn't be enough to keep data hot, and you might need a multi-threaded rather than multi-process architecture to see shared caching wins. Obviously I dunno, but it feels like caches are shorter-lived than they used to be!
I remember being super hopeful that maybe something like Google Stadia could open up some interesting game architecture wins, by trying to render multiple different clients cooperatively rather than as individual client processes. Afaik nothing like that ever emerged, but it feels like there's some cool architecture wins out there & possible.
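The memfd idea above, as a Linux-only sketch (illustrative; error handling omitted, and to be clear this is speculation about an architecture, not anyone's actual setup):

```cpp
#include <fcntl.h>     // fcntl, F_ADD_SEALS, F_SEAL_*
#include <sys/mman.h>  // memfd_create, mmap
#include <unistd.h>    // ftruncate, fork
#include <cstring>

// Parent loads immutable data (level geometry, hitboxes) into a sealed
// memfd. Forked game-server processes then mmap the same fd read-only,
// so all instances share the same physical pages in the page cache.
int MakeSharedLevelData(const void* data, size_t len) {
    int fd = memfd_create("level_data", MFD_ALLOW_SEALING);
    ftruncate(fd, static_cast<off_t>(len));
    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    std::memcpy(p, data, len);
    munmap(p, len);  // must drop writable mappings before sealing writes
    fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE | F_SEAL_SHRINK | F_SEAL_GROW);
    return fd;  // after fork(): mmap(nullptr, len, PROT_READ, MAP_SHARED, fd, 0)
}
```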
It does sound like each server is its own process. I think you're correct that it would be a little faster if all games shared a single process. That said, if one crashed it'd bring the rest down.
This is one of those things that might take weeks just to _test_. Personally I suspect the speedup by merging them would be pretty minor, so I think they've made the right choice just keeping them separate.
I've found context switching to be surprisingly cheap when you only have a few hundred threads. But ultimately, there's no way to know for sure without testing it. A lot of optimization is just vibes and hypotheses.
Take notes, Valve.
Sub-tick is probably more accurate overall, but I do think the CS2 animation netcode is crap and hides a lot of the positives. Hopefully moving to Animgraph 2 will help that; who knows.
The partially-implemented animgraph2 heroes in Deadlock are looking pretty good and hits feel accurate in-game.
In general, Valve designs software that is incomparably better than Riot. Compare the League of Legends client to the Dota 2 game (which doesn't even have a client/game distinction), for instance - the quality gap is massive in favor of Valve.
This is from 2020. Valve wanted to be smart and invented a new "subtick" system in 2023 which isn't as good as 128 tick. To make things worse, CS is a paid game, not free like Valorant, and makes probably much more money. They seemingly just don't care enough about the problem to solve it correctly. That or there is more work to be done on subtick to make it work better than 128.
Nowadays Counter-Strike 2 is free to play, although a paid Prime upgrade is almost required if you want to play decent matches with fewer cheaters. FACEIT requires Prime status too.
https://help.steampowered.com/en/faqs/view/4D81-BB44-4F5C-9B...
CSGO could do 128 tick, Valve just doesn't want to pay for it, but you can easily find private hosted servers with 128 tick. Riot did put in a lot of work to get it down so much though.
With the CS2 update, CS can no longer do 128 tick, even on private hosted servers.
> CS is a paid game, not free like Valorant, and makes probably much more money
(Veering offtopic here) Remember that Valve invented the free-to-play business model when they made TF2 free. As Gabe Newell said in some interview long ago, they made more money from TF2 after it went F2P ("sell more hats!")
Point being, being a paid vs free game is largely irrelevant to the profitability & engineering budget.
That said, I'm not sure why you say CS is a paid game. It is also free-to-play. Is some playable content locked behind a paywall?
As a lifelong Valve/CS fan, I've been so disappointed with subtick. It was pitched as a generational evolution of the game's netcode, yet years later they're still playing catch-up to what CS:GO provided...
Hopefully competition from Valorant and others puts more pressure to make things happen at Valve.
I didn’t notice any difference between 64 and subtick.
There’s a long history of subtick bugs that have been identified and patched over the years. CS2 still isn’t quite as stable as 128-tick CS:GO (which perhaps benefits from a decade of patches and a simpler architecture).
But animations are now lerped across every 4 frames, so the tick rate is effectively 32 with interpolation. Not sure if sudden direction changes might now result in ghost hits. Some hardcore Quake fans probably know the answer.
This post reads less like an engineering deep dive and more like a Xeon product brochure that wandered into a video game blog. They casually name-drop every Intel optimization short of tattooing "Hyperthreaded" on their foreheads.
Well, of course they would. They bought all Intel hardware, and they are making one of the most performant multiplayer servers ever. They should be mentioning every optimization possible. If they had AMD Threadripper servers they would mention all those features too.