In 2018 we realized we had a problem that needed a radical solution. As the global political climate became increasingly unpredictable, our members were growing, experiencing sudden popularity and being hit with denial of service attacks, and we could not adjust their Internet resources quickly and easily enough to meet these needs.
Although we had an entire cabinet of servers with enough resources to go around, our infrastructure was not designed in a way that allowed us to move web sites, email boxes and databases from servers under a high load to servers with spare capacity.
Our previous network design was created in 2005 when the Internet was a very different place and May First was a very different organization. We started with three physical servers:
- hay: members control panel
- leslie: mailman email lists
- viewsic: all member services - web sites, databases, email accounts
Our design was based on necessity: we only had one server for member services, so we put email, web sites and databases together, all on the same server. As we grew, we simply added new servers (we even gave them a special name: MOSH servers).
During our first five years, we grew very quickly and added a new server almost every year (keep in mind - this was before the era of virtual servers, so when we added a "new server" that meant adding a real, physical machine to our cabinet). Since we were quite small, we could not always afford new, enterprise grade servers.
All of these factors made our design very useful: because all services were concentrated on a single server, each server was 100% independent of the others. If one server had a hardware failure (yes, a common occurrence at the time), it did not affect any of our other servers. This design really worked for us!
By 2011, May First was already a very different organization in a very different world:
- We had more servers than could fit in our cabinet, so we began renting a second colocation cabinet
- The capacity of new servers was growing more quickly than hosting needs - allowing us to host far more resources on a single server than before.
- We could afford to purchase enterprise grade servers which were significantly more reliable
- Virtualization technology became stable, allowing us to add multiple "virtual" servers to each physical server
At this point, our old design began to show its age:
- Hardware failure was no longer a dire problem - so the "isolation" benefit of our design was not as valuable as it once was
- The number of MOSH servers was proliferating - we had over 50 of them. Some had hundreds of web sites, some only had a dozen, leading to servers way out of balance
- When a single web site got hit hard with a denial of service attack or just became popular, it degraded the performance of other sites on the same MOSH
- Since each MOSH hosted email, web sites and databases together on the same server:
  - it became hard to optimize each MOSH for the service it provided
  - it was hard to move a single resource (like a single web site under heavy load) to a new server, because moving a member's web site meant moving all of that member's resources (the web site, the database and all email boxes).
In hindsight, we should have begun re-designing in 2011, but at the time we didn't fully grasp the scope of the problem. It wasn't until 2018 that we finally accepted the idea that the only way to fix this problem was to start from scratch and re-imagine a fundamentally different network design. At this point, we put out a dramatic proposal for re-imagining our network.
The primary goal of the re-design: allocate our computer hardware more flexibly to meet the needs of our members
To achieve this goal, the new network design has two significant changes:
- Web, email and database services are separated into dedicated servers for each task
- A layer of "proxy" servers surrounds our data servers. The proxy servers receive all traffic and direct that traffic to the appropriate data server for the given request.
The new network design provides the flexibility we need because:
- With email, databases and web sites separated, we can easily move only the resource that needs to be moved
- Since all traffic goes through the proxy server first, we do not have to wait for a domain name change to propagate throughout the Internet. Instead, we can move the resources and simply update the proxy server to indicate the new location - a process that takes seconds instead of hours
- With web sites, we can more effectively block malicious traffic or cache frequently requested pages at the proxy stage before the traffic reaches our web origin servers.
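To illustrate why moving a resource behind a proxy takes seconds, here is a minimal sketch of a routing table. It is not our actual proxy software: the class name, the method names and the origin server names (weborigin001, weborigin002) are all hypothetical, chosen only to show the idea that a move is a single update on the proxy rather than a DNS change.

```python
from dataclasses import dataclass, field


@dataclass
class ProxyRouter:
    """Toy routing table: maps a site's hostname to the origin server holding it."""

    # Hypothetical mapping, e.g. {"example.org": "weborigin001"}
    routes: dict[str, str] = field(default_factory=dict)

    def origin_for(self, hostname: str) -> str:
        """Return the origin server that should receive requests for hostname."""
        try:
            return self.routes[hostname.lower()]
        except KeyError:
            raise LookupError(f"no origin configured for {hostname!r}") from None

    def move_site(self, hostname: str, new_origin: str) -> None:
        """Point hostname at a different origin; no DNS record is touched."""
        self.routes[hostname.lower()] = new_origin


router = ProxyRouter({"example.org": "weborigin001"})
before = router.origin_for("example.org")

# Moving the site is one in-memory update on the proxy.
router.move_site("example.org", "weborigin002")
after = router.origin_for("Example.org")  # lookups are case-insensitive
```

Because visitors only ever resolve the proxy's address, the public DNS record stays the same before and after the move.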
We have tried our best to minimize disruption for members. When it comes to the change in email and databases, it should be nearly transparent:
- The email change requires a change in your domain name's DNS records, so that your "mail exchange" (aka MX) record points to our new proxy mail servers (a.mx.mayfirst.org, b.mx.mayfirst.org and c.mx.mayfirst.org) instead of the name of your previous MOSH (e.g. chavez.mayfirst.org, malcolm.mayfirst.org, etc). And that's it! For members whose domain name is hosted with May First, we will be moving your email to the new infrastructure through 2023. For members with domain names hosted elsewhere, we will be contacting you to ask that you make the change.
- The database change will be fully transparent - we are running a "proxy" SQL server on each web server, so your web sites will continue connecting to the database on "localhost" but that connection will be transparently proxied to the network database server holding your database.
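If your domain name is hosted elsewhere, you may want to confirm that your MX records point at the new proxy mail servers. The sketch below assumes you have already looked up the records yourself (for example with `dig +short MX example.org`, which prints lines such as `10 a.mx.mayfirst.org.`) and only classifies the result. The function names and the three status labels are illustrative assumptions, not part of any May First tool.

```python
# The three proxy mail servers named in this post.
NEW_MX_HOSTS = {"a.mx.mayfirst.org", "b.mx.mayfirst.org", "c.mx.mayfirst.org"}


def mx_target(line: str) -> str:
    """Extract the hostname from an MX line like '10 a.mx.mayfirst.org.'."""
    host = line.split()[-1]  # the last field is the mail server name
    return host.rstrip(".").lower()  # drop the trailing DNS dot


def mx_status(mx_lines: list[str]) -> str:
    """Classify MX records as 'migrated', 'mixed' or 'not migrated'."""
    targets = {mx_target(line) for line in mx_lines if line.strip()}
    if not targets:
        return "not migrated"
    on_new = targets & NEW_MX_HOSTS
    if on_new == targets:
        return "migrated"
    return "mixed" if on_new else "not migrated"


status = mx_status(["10 a.mx.mayfirst.org.", "20 b.mx.mayfirst.org."])
```

A "mixed" result means some records still name a MOSH, so mail may continue to arrive at the old server until those records are removed.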
For web sites, however, the change requires you to modify some aspects of your workflow:
- New SSH hostname: To connect to your web site via ssh or sftp, you will always connect to the same hostname: shell.mayfirst.org. In other words, you will no longer use the domain name of your MOSH or the domain name of your web site. No matter which web origin server you are hosted on, you will always connect to <your-username>@shell.mayfirst.org. The shell server will transparently connect you to the proper web origin server.
- Pre-authorization: You must first log in to the Members control panel to authorize your ssh or sftp access. Once you log in to the control panel, you are authorized to use ssh or sftp for the next 24 hours. You can log in manually via your browser, or you can automate this control panel login.
- Paths: The absolute paths to your web site files have changed. Previously, they were in the format /home/members/<membername>/sites/<site-domain>/web. Now, they are in the format /home/sites/<web-id>/web. When we move web sites to the new infrastructure, we have automated the process of finding old paths in your configuration files and updating them to the new paths, but this automation is not perfect, so you may need to review your configuration files to ensure they have been properly updated.
- Scheduled Jobs: Scheduled jobs now use systemd timers instead of cron. When we move your scheduled jobs, we automatically convert them from cron to systemd timers; however, you may want to review your new scheduled jobs to ensure they are working properly.
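If you want to check your own configuration files for leftover old paths, something like the following sketch may help. It is not the automation we run during the move. The `web_ids` lookup table and the web id "1234" are hypothetical, since how web ids are assigned is not covered here; only the two path formats come from the description above.

```python
import re

# Old layout: /home/members/<membername>/sites/<site-domain>/web
# The lookahead avoids matching longer names such as ".../website".
OLD_PATH = re.compile(
    r"/home/members/(?P<member>[^/\s]+)/sites/(?P<domain>[^/\s]+)/web(?![\w.-])"
)


def rewrite_paths(
    text: str, web_ids: dict[tuple[str, str], str]
) -> tuple[str, list[str]]:
    """Replace old-style paths with /home/sites/<web-id>/web.

    web_ids maps (membername, site-domain) to a web id (hypothetical table).
    Returns the rewritten text plus the old paths that could not be mapped.
    """
    unmapped: list[str] = []

    def replace(match: re.Match) -> str:
        key = (match["member"], match["domain"])
        if key not in web_ids:
            unmapped.append(match.group(0))
            return match.group(0)  # leave unknown paths for manual review
        return f"/home/sites/{web_ids[key]}/web"

    return OLD_PATH.sub(replace, text), unmapped


config = (
    'DocumentRoot "/home/members/alice/sites/example.org/web/public"\n'
    "include /home/members/bob/sites/other.org/web/x.php\n"
)
new_config, leftovers = rewrite_paths(config, {("alice", "example.org"): "1234"})
```

Anything returned in `leftovers` is a path that still uses the old format and needs to be updated by hand.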
Fortunately, the new infrastructure brings some welcome improvements to the web developer's experience:
- Now, you can have as many different ssh and sftp user accounts as you want, and all of them will have access to the web site without any annoying permission issues. This means you can provide ssh or sftp access to a web developer without giving them access to your control panel.
- Scheduled jobs now support systemd services. That means if your web site depends on a service that runs forever (like a nodejs or django app) you can configure a scheduled job to run that service. The service will be run as a systemd service, ensuring that it restarts if the server restarts. Furthermore, you can check on the service output by logging in via ssh and running journalctl --user or systemctl --user commands.
The database change began back in 2020, but only for a handful of members so we would have plenty of time to test the results. The email changes began at the end of 2022, but also only for a handful of members (and for all new mailboxes).
During the second half of 2023 we will be moving existing members to the new database and email infrastructure more aggressively. These changes should be transparent to you, as everything will work as it did before.
For the web site changes, we will also be moving members over during the second half of 2023, but we will personally contact you before we move your web sites so you know the date in advance and can prepare for the change in workflow.