I’m subscribed to a slowly growing number of mailing lists and services that are aimed at keeping journalists and other such folk informed of trends and developments in the IT industry. Last night on one such list run by Ferris Research, I received an article that talks about how the practice of using a backup MX service can cause problems when used in conjunction with an improperly configured SPF/Sender ID screening implementation.
The problem they’re addressing: many people who use SPF or Sender ID forget to whitelist the hosts used by their backup MX service. Thus, when messages get submitted to those hosts and passed along, they may fail the SPF/Sender ID checks and cause mail to get improperly rejected or marked as spam.
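As a concrete sketch of the fix (the IP addresses here are hypothetical, standing in for your backup MX provider's hosts): in SpamAssassin, listing the backup MX hosts as trusted and internal relays tells the SPF check to evaluate the hop *before* the backup MX rather than the backup MX itself, so relayed mail isn't penalized:

```
# SpamAssassin local.cf (sketch; replace with your backup MX provider's IPs)
# Hosts listed here are treated as part of your own infrastructure,
# so SPF is checked against the first relay *outside* this boundary.
trusted_networks  192.0.2.25 192.0.2.26
internal_networks 192.0.2.25 192.0.2.26
```

Other filtering stacks have equivalent knobs; the point is the same in all of them: your inbound SPF/Sender ID check has to know which relays are yours.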
While the article is for subscribers only, I received permission to quote from the article here:
One way to avoid this situation is to ensure that SPF/SIDF checks are not performed on mail received from the backup mail services. However, this will either negate the effect of part of your spam filtering or require that the backup MX also perform backup spam filtering.
So far, so good…
We recommend, where possible, to simply not use a backup MX. If your primary MX is unavailable, mail should still queue at the sending MTA for several days. The sending MTA should continue to retry periodically until your site is available again. In many ways, backup MX configurations are an anachronism — a holdover from the days when connectivity was unreliable and some MTAs’ queuing algorithms weren’t great.
Hold the phone! This is where I disagree with the authors. While most MTAs have decent queuing algorithms and connectivity is usually a lot more stable, those very factors have combined to convince many mail admins that it’s no longer necessary to hold mail for the once-customary time of 7 days before NDRing it. (Ah, the good old days when everyone ran Sendmail. Of course, I hated Sendmail, but at least you knew what to expect.) And I’m not talking about small, insignificant hosts either. There are many major ISPs and mailing list services that will start bouncing mail in a shorter time than you might think.
And while connectivity is usually good, that’s not to say it’s perfect. When we moved the office, a simple misconfiguration by our data provider left a fair chunk of our public IP addresses unroutable for several days. It’s not hard to conceive of circumstances that could keep your mail servers offline for 48, 72, even 96 hours, and you could very well be losing important messages in that timeframe. Backup MXs are still a very good idea.
The real problem is identified in the first quote: never use backup MX hosts that aren’t doing all of the same basic filtering you are. Now, some types of filtering are hard to push out to backup MX hosts (legitimate-recipient filtering, for example) and don’t really need to be done until the message is ready to enter your organization. However, position-sensitive mechanisms like SPF/Sender ID, RBLs, and other IP-based techniques should be done by the first host that accepts responsibility for a message on your domain’s behalf.
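To make that concrete, here’s a minimal sketch of what those position-sensitive checks might look like on the backup MX itself, using Postfix as an example (the DNSBL zone shown is just one common choice, and your restriction list would normally be longer):

```
# /etc/postfix/main.cf on the backup MX (sketch, not a complete config)
smtpd_recipient_restrictions =
    permit_mynetworks,
    reject_unauth_destination,
    # IP-based checks run here, at the first host to take
    # responsibility for the message -- where the client IP
    # is still the actual sending MTA
    reject_rbl_client zen.spamhaus.org
```

Once the backup MX has accepted and relayed the message, the original client IP is only visible in Received headers, which is exactly why these checks can’t be deferred to the primary.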
Folks I’ve talked to are seeing a rise in the right kind of solutions for these problems: hosted MX services and hosted mail services. If you can’t maintain your own backup MX servers with the same message hygiene configuration and features as your primary MX hosts, then you should at least have a service that accepts all of your incoming mail, processes it consistently, and passes the resulting clean feed back to your mail server.
Even so, it’s best practice to have your backup MX hosts do your recipient filtering too. Otherwise, they’re stuck having accepted responsibility for messages addressed to non-existent recipients, and they’re going to generate NDRs. They could very well be contributing to backscatter.
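In Postfix terms, that recipient filtering might look like this on the backup MX (a sketch; the domain and file path are placeholders for your own):

```
# /etc/postfix/main.cf on the backup MX (sketch)
relay_domains = example.com
# With a populated map, unknown recipients are rejected during the
# SMTP conversation instead of accepted and later bounced
relay_recipient_maps = hash:/etc/postfix/relay_recipients
```

The `relay_recipients` file lists every valid address in the relayed domain. With it in place, mail to bogus recipients is refused up front, so the rejection falls on the sending MTA rather than turning into an NDR from your backup MX.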
Edit 0723 PST: The author of the article, Richi Jennings, also posted his argument on his blog, so you can go read it there.