It’s slightly buried in there, but I spotted this:
“I had guessed conservatively, 600 players online would be our max.”
(a sensible estimate, in 2000-2005, for an unoptimized, badly-written server that you threw together quickly because getting your game launched mattered more than spending time on “beautiful code”)
“In mid April we hit 2,000 concurrent users … our server began to buckle. Round trip to punch something could take a full second and people were constantly being disconnected.”
(Both “1 second RTT” and “people being disconnected a lot” are classic signs of a server that is FUBAR and needs some emergency work to fix the scaling problems)
“I would write a “V2 server” … that would [use] multiple cores; it made no sense that our hardware had 16 cores and 32 threads but our entire Growtopia server process was run in a single thread.”
Wait, … what? You’re single-threaded?
Everything I wrote above (about “normal” conservative estimates etc.) was valid assuming you used multi-threaded code that was badly written, had poor synchronization design (it locked often because you’d been too lazy to think about your code carefully), and so on.
Growtopia isn’t unique – but I believe it’s symptomatic of what’s happened more widely. Off-the-shelf tech has reached the point where scalability is finally irrelevant for indie MMOs. Just write your code (badly) as multi-threaded, and you’ll be fine. (By “off the shelf” I mean: standard libraries + standard VMs + standard hardware + standard OSes.)
Your assumption today – when you hack together version 0.1, quickly – should be “this should be OK for 5,000 concurrent users (pessimistically)”.
Things have changed…
5k concurrent is somewhere between 100k and 200k actual users. If you’ve got 100k users and you’re not making a considerable chunk of money … you’re doing something very wrong with your business model :). By the time your scalability becomes an issue, you’ll have the cash to pay someone (maybe yourself) to write “version 1.0” of your server code.
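To spell out the arithmetic: those numbers imply that peak concurrency is roughly 2.5% to 5% of your total user base, i.e. a multiplier of 20x to 40x. Here is a minimal sketch, assuming that ratio holds for your game. It is a rule of thumb read off the figures above, not a measured constant, and the function name and parameters are made up for illustration.

```python
def estimate_user_base(peak_concurrent, low_multiplier=20, high_multiplier=40):
    """Return (low, high) estimates of total users for a given peak concurrency.

    The 20x-40x multipliers are assumptions derived from the claim that
    5k concurrent is somewhere between 100k and 200k actual users;
    real games will vary.
    """
    return peak_concurrent * low_multiplier, peak_concurrent * high_multiplier


low, high = estimate_user_base(5_000)
print(f"5,000 concurrent ~ {low:,} to {high:,} total users")
```

Plug in your own peak-concurrency target to see roughly how big a user base you would need before the "version 0.1" server becomes the bottleneck.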
To be clear: Not “something amazing, highly optimized, super-slick and efficient”, but rather: “doesn’t suck”.
When I started in online games / MMOs, scalability was a “mega-critical” issue. I did a lot of work on theory and application of server optimization (from the architecture design, through the choices of programming languages, to the usage of low-level calls on specific OSes) – I even got a patent for my work, and wrote a chapter for the Game Programming Gems series.
Today, there’s still a lot of FUD around scalability – and it’s made worse IMHO by the branding/commercialization of scalability (cloud computing, NoSQL etc. are really marketing ideas from startups and corporates, *not* tech ideas). People are *afraid* of scaling.
For a while, that was a good thing: there are many “failed” MMOs from the 2000s (names omitted to protect the innocent) that were hurried along to their deaths by terrible, non-scalable tech.
But the world has moved on. Server speed tends to advance more slowly than client speed (compare servers to graphics cards…) – but we’re at the point now where your servers are already so fast that IT DOESN’T MATTER. Yay!