Design Considerations for a High Volume Mail Server
These notes, originated by Craig Mattson, are based on his experience supporting large mailing lists with up to a few million subscribers.
The purpose of this page is to discuss the issues involved in supporting large mailing lists and high-volume delivery in an industrial-strength mail server, in the hope that the functionality can be designed the right way from the get-go. As a side note, one of the requirements for Apache to begin using James as its mail server is efficient outbound email. Large mailing list support is a much harder problem, so if these design considerations can be resolved, efficient outbound email in general should be well supported.
Efficiency at all levels becomes important when you deal with very large mailing lists. Typically, the first problem you'll see is that sending a single message to, say, a million people hits your machine pretty hard. Therefore the first three key pieces of functionality are:
- ability to separate the MLM (mailing list manager) functionality from SMTP delivery and inbound SMTP
- ability to cluster SMTP delivery servers and mailing list manager servers
- ability to throttle rate of hand-off from the mailing list manager to the SMTP servers
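One way to picture the throttling requirement is a token bucket sitting between the mailing list manager and the SMTP servers. The sketch below is only an illustration under stated assumptions: the class name `HandOffThrottle`, its parameters, and the injectable clock are hypothetical and are not part of James.

```java
import java.util.function.LongSupplier;

/** Hypothetical token-bucket throttle for MLM-to-SMTP hand-off (not a James API). */
class HandOffThrottle {
    private final double ratePerSecond;   // steady-state hand-offs allowed per second
    private final double burst;           // maximum number of tokens that can accumulate
    private final LongSupplier nanoClock; // injectable time source, e.g. System::nanoTime
    private double tokens;
    private long lastNanos;

    HandOffThrottle(double ratePerSecond, double burst, LongSupplier nanoClock) {
        this.ratePerSecond = ratePerSecond;
        this.burst = burst;
        this.nanoClock = nanoClock;
        this.tokens = burst;              // start with a full bucket
        this.lastNanos = nanoClock.getAsLong();
    }

    /** Returns true if one message may be handed off now; never blocks. */
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        double elapsedSeconds = (now - lastNanos) / 1_000_000_000.0;
        lastNanos = now;
        // Refill in proportion to elapsed time, capped at the burst size.
        tokens = Math.min(burst, tokens + elapsedSeconds * ratePerSecond);
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

In this sketch the mailing list manager would call `tryAcquire()` before each hand-off and leave the message in its own queue when the call returns false, so the SMTP servers never see more than the configured rate plus the burst allowance.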
Once you solve the raw delivery problem, the next thing you notice is that deliveries to some domains are bouncing. Let's say you have 100,000 AOL subscribers on your list, and AOL's mail servers are down (this has been known to happen!). All of a sudden, you're dealing with bounces like crazy, and if your delivery machines aren't set up to handle it, you're dead. Therefore you must be able to:
- control the number of outbound threads per-SMTP machine in your cluster so that you have sufficient capacity on each delivery engine to deal with bounces
- control the number of inbound threads per-SMTP server, to be able to allocate capacity to bounce processing (and MLM functionality, if your outbound SMTP servers and inbound SMTP servers are the same machines)
- control the maximum number of threads that can be allocated to deliver to any specific mail server. Many bounces take orders of magnitude longer to deal with than successful deliveries, so you don't want to "clog" your mail servers by trying to deliver to servers that are malfunctioning
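The per-mail-server limit in the last point could be approximated with one counting semaphore per destination host. This is a minimal sketch, assuming a hypothetical `PerHostLimiter` class and a single shared limit for every host; a real server would likely make the limit configurable per host.

```java
import java.util.Locale;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

/** Hypothetical cap on concurrent deliveries to any single mail server. */
class PerHostLimiter {
    private final int maxPerHost;
    private final ConcurrentHashMap<String, Semaphore> permits = new ConcurrentHashMap<>();

    PerHostLimiter(int maxPerHost) {
        this.maxPerHost = maxPerHost;
    }

    /** Returns true if a delivery thread may connect to this host now; never blocks. */
    boolean tryStartDelivery(String host) {
        return semaphoreFor(host).tryAcquire();
    }

    /** Must be called exactly once after each successful tryStartDelivery. */
    void finishDelivery(String host) {
        semaphoreFor(host).release();
    }

    private Semaphore semaphoreFor(String host) {
        // Host names are case-insensitive, so normalize before using them as keys.
        return permits.computeIfAbsent(host.toLowerCase(Locale.ROOT),
                h -> new Semaphore(maxPerHost));
    }
}
```

A delivery thread that is refused a permit can move on to a message for another host instead of waiting, which keeps a malfunctioning server from tying up the whole outbound pool.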
It's sad but true that many SMTP servers out there don't conform properly to all of the standards. So you have to be able to account for all kinds of weird things (such as a server being unable to accept multiple recipients for a message). Therefore, it is best right from the start to be able to:
- provide per-domain and per-mail server outbound configuration
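As a rough illustration of per-domain and per-mail-server configuration, the sketch below resolves outbound settings by checking a mail-server override first, then a domain override, then the defaults. The class names, the two settings shown, and the lookup order are all assumptions made for this example, not an existing James configuration format.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical outbound settings; the two fields are examples only. */
final class OutboundSettings {
    final int maxRecipientsPerMessage; // 1 for servers that reject multiple recipients
    final int maxConnections;          // concurrent connections allowed to the target

    OutboundSettings(int maxRecipientsPerMessage, int maxConnections) {
        this.maxRecipientsPerMessage = maxRecipientsPerMessage;
        this.maxConnections = maxConnections;
    }
}

/** Hypothetical lookup: mail-server override, then domain override, then defaults. */
class OutboundConfig {
    private final OutboundSettings defaults;
    private final Map<String, OutboundSettings> byMailServer = new HashMap<>();
    private final Map<String, OutboundSettings> byDomain = new HashMap<>();

    OutboundConfig(OutboundSettings defaults) {
        this.defaults = defaults;
    }

    void overrideMailServer(String host, OutboundSettings settings) {
        byMailServer.put(key(host), settings);
    }

    void overrideDomain(String domain, OutboundSettings settings) {
        byDomain.put(key(domain), settings);
    }

    OutboundSettings resolve(String recipientDomain, String mailServerHost) {
        OutboundSettings settings = byMailServer.get(key(mailServerHost));
        if (settings == null) {
            settings = byDomain.get(key(recipientDomain));
        }
        return settings != null ? settings : defaults;
    }

    private static String key(String name) {
        return name.toLowerCase(Locale.ROOT);
    }
}
```

With this shape, a server that cannot handle multiple recipients gets an override with `maxRecipientsPerMessage` set to 1, while every other destination keeps the defaults.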
In order to be able to monitor what's going on so that you can configure these elements properly, you must:
- provide good tools to monitor the performance of the mail servers, especially being able to view what is going on with the outbound spools on a per-domain basis
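The per-domain view of the outbound spool can be as simple as a count of queued recipients grouped by domain. The following is a minimal sketch, assuming the spool can be read as a plain list of recipient addresses; the `SpoolStats` class and that representation are hypothetical.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical per-domain summary of an outbound spool. */
class SpoolStats {
    /** Counts queued recipients per domain, with domains sorted alphabetically. */
    static Map<String, Integer> countByDomain(List<String> queuedRecipients) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String address : queuedRecipients) {
            int at = address.lastIndexOf('@');
            // Addresses without '@' are grouped under a placeholder key.
            String domain = at >= 0
                    ? address.substring(at + 1).toLowerCase(Locale.ROOT)
                    : "(no domain)";
            counts.merge(domain, 1, Integer::sum);
        }
        return counts;
    }
}
```

A domain whose count keeps growing while the others drain is a hint that its mail servers are down or slow, which is the situation the thread limits above are meant to contain.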
Next is a set of features that corporate users will start requesting when they begin using your mailing list manager as a CRM (Customer Relationship Management) tool. Invariably, they will want to store their subscribers in a relational database. They will have lots more information on each subscriber than just an email address, and they will want to be able to use that information to (1) select "target groups" of subscribers to receive mailings, and (2) personalize mail messages. On the personalization point in particular, many personalizations will be sophisticated enough that you will want to actually embed some kind of code in the email message, to be processed "just-in-time" (that is, right before delivery), where "variables" in this code are replaced with information that comes out of the subscriber database. This would be similar to JSP on the web, but for email. Logging delivery histories and MLM requests becomes important in this environment.
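The simplest form of just-in-time personalization is plain variable substitution. The sketch below assumes a `${name}` placeholder syntax and a subscriber record already loaded into a map; both are illustrative choices, not something these notes specify, and embedded code in the JSP sense would need a real template engine.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical just-in-time substitution of subscriber fields into a message. */
class MessagePersonalizer {
    // Matches placeholders such as ${firstName}; the syntax is an assumption.
    private static final Pattern VARIABLE =
            Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)\\}");

    /** Replaces each known placeholder; unknown placeholders are left as written. */
    static String personalize(String template, Map<String, String> subscriber) {
        Matcher matcher = VARIABLE.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = subscriber.get(matcher.group(1));
            String replacement = value != null ? value : matcher.group(0);
            // quoteReplacement keeps '$' and '\' in the data from being interpreted.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Running the substitution right before delivery means one stored template serves the whole target group, and each message picks up whatever is in the subscriber database at send time.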
There is more to say, but this is getting pretty extensive already. A product with this level of functionality is sorely missing in the open source community. James is an interesting possibility, because its inherently multithreaded architecture could be used to deal well with many of these problems.