botsin.space postmortem

21 December 2024

I switched botsin.space to read-only mode on December 16th. I essentially just prevented the creation of new toots -- you can still log in, migrate your account, request an account archive, etc. Unsurprisingly, the load on the website has plummeted. In the next couple of weeks I hope to spend some time downsizing the instance, cleaning up some junk, etc, etc, to have a lean archive of the server that I can keep online without much expense for as long as possible.

After I made the announcement that I was shutting down the instance, a few things happened. First, I received an overwhelming amount of thanks and support. I'm so incredibly grateful for all the kind words. So many people reached out to offer their thanks for running botsin.space, and their understanding for my need to end things. A few people (literally just one or two out of hundreds) were angry about it, which I understand.

Several people/organizations reached out with offers to either fund the service, offer spare servers/bandwidth, promote their new cloud platform, or take it over in one way or another. I rejected these for several reasons. First, several of them involved me continuing to run botsin.space, and as soon as I made the announcement that I wanted to shut down the instance, I realized just how badly I didn't want to do that anymore. Second, a few of the offers involved established orgs/people with a decent presence on the fediverse, but I was concerned that handing over control would change how botsin.space would be run in ways that made me uncomfortable. For example, I blocked Threads/Meta on botsin.space, and I know that some of the orgs that offered their support are either cool with Meta or even supported by Meta. That's their right for sure, but it's not something I was comfortable with.

But even more importantly, I was really worried about the privacy implications of giving the service to someone else. It has always been really important to me to maintain the privacy of the people with accounts on the server. As a rule, it's good behavior to identify the creator of a bot/automated account in the profile of the bot, but it's not strictly required, and there are certainly good reasons to remain anonymous. From time to time someone would request the contact information of a bot owner, and I basically never handed out that information. I would try to be an intermediary instead -- someone contacted me, I would contact the bot owner and pass along their message, etc. I have also been concerned for a few years now about the possibility of law enforcement/state actors asking/demanding information from the instance. It's extremely plausible that the database of an average Mastodon service includes some toots about someone's abortion for example, which is the sort of information that police officers are suddenly interested in having in parts of the US. Ultimately, my concerns are probably irrelevant because my hosting provider could hand over the information without me ever knowing, but I worried that handing the server to someone else would leave these questions unresolved in a way that was unacceptable to me.

I was also contacted by at least one known bad actor who definitely had malice in mind when they offered to take over the instance, and went so far as to set up and populate a throwaway account to contact me.

Finally, one or two archivists have reached out, but that's a little different from maintaining the service, and I'll probably see what the options are there.

Numbers

I thought it might be interesting to cobble together some stats, graphs, etc. for the instance. They show certain trends, and might be helpful for people thinking about their own instances.

Expenses

Here's a chart of the monthly expenses for botsin.space:

Monthly expenses

The red line is the bandwidth/storage cost, and the blue line is the server/CPU cost. The yellow line is the cost for the mail/SMTP service I used, which changed a couple of times over the course of the project. It started off being essentially free, then $24/year, then $10/month when the number of emails sent by the instance hit a certain threshold. The green line is the sum total of everything.

Observant folks will notice that the bandwidth expense trends up over time, but then has a big drop near the end of 2023. I'd been worrying about the creeping expense for months at that point, and I implemented a cache layer to try and save some money. Originally I used AWS to host files, but I moved to DigitalOcean in 2022. Both services charge separately for storage and bandwidth. As the bandwidth expense grew, I experimented with a proxy/cache system on the botsin.space server itself, which gets a certain amount of 'free' bandwidth every month. This helped a bit, but not for very long, as the bandwidth usage continued to grow and grow.
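As a rough illustration of why a cache on the server helps and then stops helping, here is a minimal sketch of the billing arithmetic. It is not the actual billing model of AWS or DigitalOcean: the function name, the parameters, and every price and volume below are made-up assumptions. It assumes cache misses are billed as object-storage egress, and that all traffic leaves through the server, which only pays for what exceeds its free monthly allowance.

```python
import math

def monthly_bandwidth_cost(total_gb, hit_ratio, storage_price_per_gb,
                           free_server_gb, server_overage_per_gb):
    """Estimate one month's bandwidth bill (hypothetical model and prices)."""
    # Only cache misses are fetched from object storage and billed as egress there.
    miss_gb = total_gb * (1 - hit_ratio)
    storage_cost = miss_gb * storage_price_per_gb
    # Everything is served through the server; only traffic above the
    # free allowance is billed.
    overage_gb = max(0.0, total_gb - free_server_gb)
    return storage_cost + overage_gb * server_overage_per_gb

# Hypothetical numbers: $0.01/GB everywhere, 2000 GB free on the server.
no_cache = monthly_bandwidth_cost(1000, 0.0, 0.01, 2000, 0.01)
with_cache = monthly_bandwidth_cost(1000, 0.5, 0.01, 2000, 0.01)
after_growth = monthly_bandwidth_cost(5000, 0.5, 0.01, 2000, 0.01)
print(no_cache, with_cache, after_growth)
```

In this toy model the cache halves the bill at first, but once total traffic passes the server's free allowance the overage charge grows with usage regardless of the hit ratio.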

Toot/Media stats

Here's some charts for the number of toots tooted each month, as well as the number and size of media files processed by month. See if you can guess when Elon Musk bought Twitter.

Statuses created each month

Media files per month

Media size per month

In many ways, that giant spike in 2022 was the beginning of the end for botsin.space. The monthly active accounts total went from 700 to 1100, and the number of statuses created each month went from just under a million to 3 million (!). The media usage and bandwidth grew similarly. There's not a whole lot else to say about these three charts except that Elon Musk is a real chaos agent.

Tech Specs

I've written a bit about how I ran botsin.space already, and it's still fairly accurate. In the end times, I added a Varnish caching layer to try and help keep things running, and I also added two extra servers, one devoted to web requests and one to background jobs.

Originally the instance was a 2 CPU instance with 4GB of RAM. In the end, it was an 8 CPU instance with 16GB of RAM, with the two extra instances being slightly smaller. It moved from an s-2vcpu-4gb to an s-8vcpu-16gb-amd on this table, with a few steps in the middle. I think one takeaway from these graphs is that if I'd been running an instance for a small, limited number of accounts, it would've been manageable for a very long time. It took a long time for the expense to creep past $50/month, and another long time to go above $100/month.

Starting in 2020, I moved the database storage off of the instance and onto a separate volume. Having data on a volume separate from everything else meant that I was able to take instant snapshots of the database before running upgrades, and this really saved my ass several times. There was a Mastodon upgrade sometime in 2022 that required some complicated data migrations, and I think an upgrade to Postgres. When I ran the upgrade, something went horribly wrong and botsin.space was pretty badly broken. I ended up taking the instance down for several days while I dealt with the issue. Ultimately, I needed to rewrite the migration scripts from scratch and run them on a backup snapshot of the database. This was really hard work -- I had to set up a new server, attach a copy of the database, debug my script, etc, etc. I was well qualified for this work -- I've been working with Ruby on Rails since it was beta software. But I still came pretty close to calling it quits right here and shutting down the instance forever. Anyway, make good backups of your server!
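The "snapshot first, then upgrade" habit can be sketched as a small wrapper. This is a generic pattern, not the scripts used for botsin.space: `upgrade_with_snapshot` and the three callables passed to it are hypothetical names, and in practice each would wrap whatever your hosting provider and Mastodon version actually require. The automatic rollback is also an assumption for illustration; the recovery described above was done by hand.

```python
def upgrade_with_snapshot(take_snapshot, run_upgrade, restore_snapshot):
    """Run an upgrade only after a snapshot exists; roll back if it fails.

    All three arguments are caller-supplied callables (hypothetical here).
    """
    # If taking the snapshot raises, the upgrade never starts.
    snapshot_id = take_snapshot()
    try:
        run_upgrade()
    except Exception:
        # Put the data back the way it was, then surface the original error.
        restore_snapshot(snapshot_id)
        raise
    return snapshot_id


# Usage with stand-in functions that only record what happened.
log = []

def fake_snapshot():
    log.append("snapshot")
    return "snap-1"

def failing_upgrade():
    log.append("upgrade")
    raise RuntimeError("migration failed")

def fake_restore(snapshot_id):
    log.append("restore " + snapshot_id)

try:
    upgrade_with_snapshot(fake_snapshot, failing_upgrade, fake_restore)
except RuntimeError:
    pass
print(log)
```

Keeping the snapshot step inside the same function as the upgrade makes it hard to skip on a day when you're in a hurry.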

Assorted Challenges

Technical challenges

There were long stretches in the beginning when the service ran with little or no attention from me. Other than the event I described above, there weren't any epic challenges involving downtime, server crashes, etc. I'm fortunate to have a lot of knowledge and expertise regarding running servers. I tend to know how to identify a problem, I can figure out how to fix it, etc. Docker doesn't particularly bother me and using it made it easier for me to tune the service. I wrote scripts to handle things like expanding the database volume when needed, running cleanup tasks, etc. I maintained the configuration for the server in git. I also have an instance of Jenkins that I use to deploy my personal projects, and I used it here too. Every time I pushed an update to the setup for botsin.space, Jenkins would grab the changes, build a new Docker image and push it to the right place, and take care of a few other tasks. Then I would ssh into the server and run a command or two to switch to the new version. All things considered, I had it running pretty smoothly for years.

Using S3/DigitalOcean spaces to store files meant that I never really needed to worry about hard drives filling up, although when I decided to move from S3 to DigitalOcean, I ended up with a huge AWS bill for the expense of transferring all of those files. Oops!

Two annoyances were spiders/spam, and background jobs. Every now and then some web scraper, spammer, or well-meaning but malfunctioning bot script would tie up a huge chunk of system resources until I blocked their IP address or found some other way to deal with the issue. It was never a catastrophic problem, but it would be annoying from time to time.

But background jobs could frequently become an issue. Basically, every time someone creates a toot, a bunch of background jobs are created to send it to all the followers of an account. There's a similar set of jobs to deal with incoming toots. Sometimes this would slow down and there would be a backlog of work to be done. This was probably happening for several reasons. First, it involves communicating with remote servers, some of which might be slow to respond, or they might be offline. Second, sometimes a job would create multiple other jobs. In other words, a popular/viral toot can create an unexpected cascade of background jobs. The main solution here was usually to run extra copies of Sidekiq to process the backlog, but sometimes this meant doing a lot of juggling to keep from overwhelming the db/web instances that also needed to run.
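To make the cascade concrete, here is a toy model of the fan-out. It is a deliberate simplification and not how Mastodon or Sidekiq actually schedule work: it assumes one delivery job per follower, and that each boost fans out again to the booster's own followers. The function name, account names, and follower graph are all invented for the example.

```python
from collections import deque

def count_delivery_jobs(followers, boosters, author):
    """Count delivery jobs triggered by one toot in a simplified model.

    followers: dict mapping an account to the list of accounts following it
    boosters:  set of accounts that boost the toot when they receive it
    author:    the account that posted the toot
    """
    jobs = 0
    seen = {author}           # accounts that have already posted or boosted
    queue = deque([author])
    while queue:
        account = queue.popleft()
        for follower in followers.get(account, []):
            jobs += 1         # one delivery job per follower
            if follower in boosters and follower not in seen:
                seen.add(follower)
                queue.append(follower)  # a boost fans out again
    return jobs

# Invented follower graph: "bot" has three followers, two of whom boost.
followers = {"bot": ["a", "b", "c"], "b": ["d", "e"], "e": ["f"]}
quiet = count_delivery_jobs(followers, set(), "bot")
viral = count_delivery_jobs(followers, {"b", "e"}, "bot")
print(quiet, viral)
```

Even in this tiny graph two boosts double the job count, and with real follower counts the multiplier gets much larger.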

Other challenges

Dealing with new accounts was never too much of a big deal, especially after I shut down unverified signups, and asked for a 'magic word' in the signup flow. It probably got a little tiring during the months that a couple hundred people signed up.

Moderation was generally not much of a problem most of the time. Most months the number of reports was low, and sometimes there weren't any reports at all.

Moderation reports created each month

There were definitely some real problems in these reports -- bots behaving badly, accounts on remote servers being assholes. Lots of complaints about spammers on mastodon.social. But in general, a huge chunk of these reports contained literally no information in them -- no description of the problem, and no links to specific statuses. I generally classified these as "I don't like this account/content, but I'm going to make someone else do labor by clicking the report button instead of the block/unfollow button." They were really frustrating, because I'd generally end up doing some digging to see if anything was going on with the reported account, and inevitably nothing was. I've said this in the past, but I think it would be really easy for a bad actor to weaponize reports to cause a lot of pain for people running instances. I also think that I didn't have a typical experience regarding moderation, since as a rule any interaction between a remote account and a botsin.space account involved a non-human, and while bots can definitely say terrible things and make awful mistakes and are ultimately a human product, in general this didn't happen a whole lot (or at least, I rarely heard about it).

Weirdly, I didn't get a single DMCA takedown request. It seemed basically inevitable, and it never happened.

There were a few times that I added an instance to the blocklist and would get brigaded by people being total assholes about it. They'd hop on new accounts from other servers to harass me, they'd email me spewing hate or trying to convince me to undo the block (or both at the same time). Sometimes threats were involved. But I recognize that I had it pretty easy compared to a lot of other people running instances. I rarely announced blocks, which I think helped, but maybe wasn't the greatest way to do things.

Once or twice I needed to deal with content that was CSAM/CSAM-adjacent, and honestly it was very upsetting. I think all the articles out there about moderators at big social networks struggling to deal with the mental hardship of dealing with content like this are actually underselling the problem. It really sucked, I lost sleep over it, and what I needed to deal with was pretty trivial compared to other people.

Some final thoughts

When I first started drafting this post, I thought I would write something about how you shouldn't launch an instance without having a fallback plan, an exit strategy, a team to help manage your community, etc, but having spent a couple of days writing everything here, I'm not sure I feel that way anymore. I actually think that ideally, a person or group can decide they want to start a community on the fediverse, do that, and then when it's time to end, they end it, and have that all be fine. Communities don't have to last forever.

I think a lot about phpBB. It's easy to forget about in the post social media era, but there are a lot of phpBB/etc forums out there that host vibrant, active communities. They might be small, but that's ok, and it's likely even a good thing. A friend of mine runs an instance of phpBB devoted to cars with manual transmissions. I know of another forum devoted to parenting that started as a splinter group from yet another forum that was mostly about feminism. I used to frequent a board devoted to the Red Sox, and another one devoted to a 2d airplane game I used to love. Today I spend time on forums related to ambient music and making tape loops. There's forums devoted to medical issues, where people can go to talk, ask questions and find support. And most of these places have off-topic sections, where you can discuss random topics, what's for dinner, shitposting, etc. And those areas exist because phpBB is a great tool for building a community.

phpBB runs on PHP, arguably the most important programming language of all time, and certainly one of the most successful ones. Almost every hosting platform supports it by default. It runs on cheap hardware. You can get a Dreamhost/etc account on a server shared with 1000 other websites and run your forum without thinking twice. It's far from perfect, but you can do it with a relatively low amount of technical knowledge, especially with one-click installers and things like that. It's an incredibly successful and historically important software project.

Right now, there's basically no equivalent to phpBB for the fediverse. I think this is likely true in part because everyone's brains broke in the early dotcom era, and we all collectively decided that something wasn't worth doing online unless it could scale up to 10 million users. A web forum is useless to someone with that mindset. Mastodon requires quite a bit of technical knowledge, and if something goes wrong you need to be an expert to deal with it. Mastodon the software platform is built and maintained to support the scale required to run mastodon.social, an instance with two million users. And if you use Mastodon to run your instance, you're locked into the choices made to support a giant instance. There are alternatives to Mastodon, some of which make better choices for smaller instances, but I don't think any of them are really any easier to use.

I worry that this makes it very difficult to resist the inclination to centralize everything on a couple of big instances. And that inclination absolutely must be resisted. I would love to see a fediverse full of thousands of small instances of software platforms that are as easy to manage as a PHP web forum. But there are a lot of challenges to tackle before something like that could really exist. I could probably ramble about this for another thousand words but I'll stop now. If you made it this far, thanks for reading, and thanks for supporting botsin.space!