Tuesday, May 18, 2010

How to read a bounce message

So, you or your user just sent an email, and it bounced back. One thing to notice (or ask the user) is how long the bounce took to come back. It isn't always important, but it can help tell a routing error from a "hard bounce" where the message was actually rejected, since rejections are often instant. That's especially useful if the user has already deleted the bounce.

If you're lucky enough to actually have the bounce, you usually have all the information you need to identify the server that had the issue and at least narrow down the cause (if not outright fix the issue). Unfortunately, many antispam services have taken to responding with generic status codes instead of saying that they blocked the message and why. This can make the status message (which I'll explain shortly) worthless, but generally it's a good place to start.

When you're looking at a bounce (and by "bounce" I mean both a DSN and an NDR) there are three main things to look for.
  1. The sending server
  2. The server that encountered the problem
  3. What the issue was

Now, if the email was going from one user to another on the same domain, 1 and 2 will be the same server. In that case you can skip to part 3. If the sender was sending the message out to the internet, 1 and 2 MAY be the same server, but not necessarily, and not nearly as often.

According to RFC 2821 (my clarifications in parentheses):

"If (a relay server) accepts the task (of relaying an email), it then becomes an SMTP client, establishes a transmission channel to the next SMTP server specified in the DNS ... and sends it the mail"
More information on client/server can be found on the main site, but basically the "client" is the sending server (#1 above), and the "server" is the #2 server above.

1. The Client (sending) server. Also called the "Reporting-MTA".

All email has to have a sender. Even in the event of a "Null sender" the sender exists, it's just Null :-). Generally mail servers will send bounces as a null sender, but forge their name into the headers like "root@sending.server.com".

One thing to note about the "from" string is that Exchange 2010 has a new "feature" (quotes very much intended) where it forges itself as the "from" string in the headers on incoming bounces regardless of whether or not it generated the bounce. This is outrageously annoying as it removes one of the first places end users and novice admins look to see where the error occurred.

It's often presumed that the server the bounce comes from is the server that had the issue. Although this is possible if it's a local error, most often it is the NEXT server in the route that caused the problem. More below...

You can find out what server is sending the message by looking in a few common places:

  • The "From" header - Often the "From" header in an NDR is an address "@sending.server.com", which shows that "sending.server.com" is the sending server.
  • The "Generating server" - If the body says "Generating server: sending.server.com", this is the server sending the bounce.
  • The "Reporting-MTA" - The "Reporting-MTA" may be listed in the bounce. If so, this is the sending server.

2. The "Server" server, or the "Remote-MTA"

Remember the section from RFC 2821 before where it says "If (a relay server) accepts the task (of relaying an email)"? Well, the "accepts the task" part is important because that's where 99% of your bounces will come from on a daily basis. Most bounces come from a server NOT "accepting the task", and instead returning a 4xx response (which is like a "not right now, maybe later") or a 5xx response (which is like an outright "absolutely not").
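If you find yourself sorting a pile of these, the "maybe later" versus "absolutely not" split is easy to script. This is only a rough sketch; the function name and the labels it returns are my own, not anything from the RFC.

```python
def classify_reply(code):
    """Map a three-digit SMTP reply code to what it means for delivery."""
    if 200 <= code <= 399:
        return "ok"         # the server went along with the command
    if 400 <= code <= 499:
        return "temporary"  # "not right now, maybe later": the sender should retry
    if 500 <= code <= 599:
        return "permanent"  # "absolutely not": this is the kind that bounces at once
    raise ValueError(f"not an SMTP reply code: {code}")


for code in (250, 451, 550):
    print(code, classify_reply(code))
```

A "temporary" answer usually only turns into a bounce after the sending server has retried for a while, which is one reason the timing of the bounce is worth asking about.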

The Remote-MTA is often the only place to find logs of WHY the message was refused. The Reporting-MTA will have logs THAT the message was refused, but will sometimes have less information even than the bounce message itself.

A few common places to find who is the "Remote-MTA" are:

  • The "Remote-MTA:" field in the bounce
  • The "Remote Server:" field in the bounce
  • Where it says "While talking to:" in the bounce
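
When the bounce is a properly formatted DSN, both server names sit in a machine-readable "message/delivery-status" part, so you can pull them out with Python's standard email package instead of reading by eye. Here is a minimal sketch. The sample bounce is made up (the host names are the placeholders used in this article), the helper names are mine, and plenty of real bounces are free-form text that this will not parse.

```python
from email import message_from_string

# Hypothetical bounce, trimmed down to the parts that matter here.
SAMPLE_BOUNCE = """\
From: MAILER-DAEMON@sending.server.com
To: user@example.com
Subject: Undelivered Mail Returned to Sender
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status; boundary="BOUND"

--BOUND
Content-Type: text/plain

Your message could not be delivered.

--BOUND
Content-Type: message/delivery-status

Reporting-MTA: dns; sending.server.com

Final-Recipient: rfc822; bob@example.net
Action: failed
Status: 5.1.1
Remote-MTA: dns; receiving.server.com
Diagnostic-Code: smtp; 550 5.1.1 User unknown

--BOUND--
"""


def delivery_status_fields(raw_message):
    """Collect the fields of every message/delivery-status part into one dict."""
    fields = {}
    for part in message_from_string(raw_message).walk():
        if part.get_content_type() != "message/delivery-status":
            continue
        if not part.is_multipart():
            continue  # malformed report, nothing structured to read
        # The parser exposes each blank-line-separated group of fields
        # as its own Message object.
        for block in part.get_payload():
            for name, value in block.items():
                fields.setdefault(name.lower(), value.strip())
    return fields


def mta_name(field_value):
    """Strip the "dns;" type prefix from a Reporting-MTA or Remote-MTA value."""
    return field_value.split(";", 1)[-1].strip()


fields = delivery_status_fields(SAMPLE_BOUNCE)
print("Reporting-MTA:", mta_name(fields["reporting-mta"]))
print("Remote-MTA:   ", mta_name(fields["remote-mta"]))
```

If "remote-mta" is missing from the result, that is a hint the error was local to the Reporting-MTA.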

3. What the issue was

If you don't have access to the Remote-MTA's logs (or if it's an internal bounce and there is no Remote-MTA), the bounce will usually have some type of diagnostic text. It won't be as much as the server's logs, but it's a whole lot easier to read a bounce than read through logs. I recommend starting with the bounce :-)

Here are some places to look/things to look for:

  • If the bounce says "while talking to: receiving.server.com" it will say after that "receiving.server.com said:"
  • The bounce may say in the body "Reason:", "Error:" or "SMTP 5.x.x [...diagnostic text...]"
  • The bounce may have text that is configurable by the Remote-MTA such as links explaining the error or general diagnostics. For instance "Rejected due to abuse. See www.some-website-or-other.com/bounce.html for more information"
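
If you need to pick the codes out of that diagnostic text automatically, a regular expression gets you most of the way. Treat this as a sketch under assumptions: the pattern and the function name are mine, and it expects the common "reply code, optional 5.x.x style code, free text" layout, which not every server follows.

```python
import re

# Reply code (e.g. 550), an optional enhanced status code (e.g. 5.1.1),
# then whatever free text the remote server chose to add.
DIAGNOSTIC = re.compile(
    r"\b(?P<reply>[245]\d{2})[ -]"
    r"(?:(?P<enhanced>[245]\.\d{1,3}\.\d{1,3})\s+)?"
    r"(?P<text>.*)"
)


def parse_diagnostic(line):
    """Return (reply code, enhanced code or None, text), or None if no code is found."""
    match = DIAGNOSTIC.search(line)
    if match is None:
        return None
    return match.group("reply"), match.group("enhanced"), match.group("text").strip()


print(parse_diagnostic("smtp; 550 5.1.1 User unknown"))
print(parse_diagnostic("450 Try again later"))
```

The first digit of either code tells you the same thing: 4 is temporary, 5 is permanent.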

There can be other information in a bounce like the "Received-From-MTA" or additional headers so you can see the route, but reading headers is often extraneous at this point, and probably deserves an article all to itself. If you're super excited, read RFC 2821 for information on SMTP, and then RFC 3464. That'll cool your heels :-)



Regards,


TEA

Monday, May 17, 2010

What is a Border MTA?

A Border MTA (Mail Transfer Agent) is the first mail server responsible for handling your email. It operates at the "border" of your network, hence the name.

This is commonly an antispam service or appliance, or sometimes your firewall will have an "SMTP Proxy" built in (common with SonicWall, Tumbleweed, Symantec and Cisco router/firewall combos). Sometimes all the packets are sent directly to the mail server, in which case the mail server itself is the "border MTA".

The easiest way to determine your Border MTA (although not always 100% accurate) is to see what server returns its name in response to the SMTP HELO command. In the event of an SMTP proxy on the gateway, it can proxy even these responses, but that's uncommon enough to ignore for most troubleshooting.
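
Here is a small sketch of that check using Python's standard smtplib. The function names and the example host names are hypothetical, and it assumes you can reach port 25 on the host your MX record points at, which many ISPs block from client networks.

```python
import smtplib


def name_from_reply(reply_text):
    """Return the first word of a HELO reply, which is normally the server's name."""
    words = reply_text.split()
    return words[0] if words else None


def helo_name(mx_host, timeout=10):
    """Connect to mx_host on port 25, send HELO, and return the name it answers with."""
    with smtplib.SMTP(mx_host, 25, timeout=timeout) as smtp:
        code, reply = smtp.helo()
    if code != 250:
        return None
    return name_from_reply(reply.decode("ascii", "replace"))


# Offline example of what a HELO reply typically looks like:
print(name_from_reply("mx1.example.com Hello client.example.org [192.0.2.10]"))
```

Call helo_name() with your own MX host to run it for real. If the name that comes back belongs to your antispam appliance or hosted filter rather than your mail server, that is your Border MTA.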

The reason an MTA in this location has a specific name is that there are some services that should run ONLY on the Border MTA, and if they are run anywhere else in your email's route they can cause disastrous issues.

For instance, SPF authentication should ALWAYS be run on the Border MTA. Microsoft, in their infinite wisdom, decided to include SPF (under a proprietary name, surprise surprise) as a default option in their IMF settings (which stands for "Intelligent Message Filtering", please keep your guffaws to yourself) in Exchange 2003 SP2 through 2010. Keeping in mind that many of their enterprise level customers would have several layers of protection, and therefore probably several MTAs in front of theirs, this was a less than educated decision IMHO.

SPF (or Sender Policy Framework) is a relatively new standard (Microsoft's Sender ID is derived from it) in which a domain publishes a DNS record listing the servers allowed to send mail for it. The domain checked is the one advertised in the "mail from:" command. More information can be found here, but essentially if the connecting IP is not listed in the SPF record, either directly or through a name that resolves to it, the email is bounced or flagged. Let's see how it's supposed to work:

Sender: me@domain.com
Sending server's IP: 1.2.3.4
domain.com's SPF record: "v=spf1 ip4:1.2.3.4 -all"
recipient's border MTA's IP: 10.0.0.1
recipient's mail server: 10.0.0.2

The recipient's Border MTA does an SPF lookup on domain.com since that's in my email address. My DNS server says the SPF record is "v=spf1 ip4:1.2.3.4 -all", which means that all email should be coming from a server with the IP of 1.2.3.4. Since the connecting IP is in fact 1.2.3.4 it passes this check and is either passed to the mail server or is at least allowed to continue being processed.
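
To make the check concrete, here is a deliberately tiny sketch of it in Python. It is NOT a real SPF implementation: it only understands "ip4:" entries and "-all", does no DNS lookups, and the function name is my own. The record string is the example one, with the "v=spf1" prefix that published records carry.

```python
import ipaddress


def spf_allows(record, connecting_ip):
    """Return True (pass), False (fail) or None (no verdict) for a toy SPF record."""
    ip = ipaddress.ip_address(connecting_ip)
    for term in record.lower().split():
        if term.startswith("ip4:"):
            # A bare address is treated as a one-address network.
            if ip in ipaddress.ip_network(term[4:], strict=False):
                return True
        elif term == "-all":
            return False  # nothing matched and the domain says "reject the rest"
    return None


RECORD = "v=spf1 ip4:1.2.3.4 -all"
print(spf_allows(RECORD, "1.2.3.4"))  # the sending server from the example
```

The thing to notice is that the only inputs are the record and the IP that actually connected. Nothing in the message itself is consulted.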

Now, how it DOESN'T work.

The recipient's Border MTA does an SPF lookup, and it passes the SPF check due to the reasons above. Then, the Border MTA passes the message to the mail server which now does another SPF lookup on it. This time however, the IP of the connecting server is actually 10.0.0.1 NOT 1.2.3.4. So, since this IP is not authenticated by the SPF record, the message bounces unless other steps are taken to prevent this (such as whitelisting your Border MTA's IP).
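
The same failure, and the whitelisting workaround, in a few lines of Python. This is a toy model with the example's addresses hard-coded. The names are hypothetical, and a real mail server exposes this as a configuration setting rather than code.

```python
import ipaddress

# The only address domain.com's SPF record authorizes in the example.
AUTHORIZED = ipaddress.ip_network("1.2.3.4/32")


def second_hop_verdict(connecting_ip, trusted_relays=frozenset()):
    """What an SPF check on the inner mail server decides about a connection."""
    if connecting_ip in trusted_relays:
        return "skipped"  # whitelisted: trust the check the border already did
    if ipaddress.ip_address(connecting_ip) in AUTHORIZED:
        return "pass"
    return "fail"


# The mail server only ever sees the Border MTA connecting to it.
print(second_hop_verdict("10.0.0.1"))
print(second_hop_verdict("10.0.0.1", trusted_relays={"10.0.0.1"}))
```

Without the whitelist every message relayed by the border fails, no matter how legitimate the original sender was.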

Other examples of tests that should NEVER be run after the border MTA include RBL checks, reverse DNS tests (and "Forward Confirmed Reverse DNS"), greylisting, or really any IP based authentication techniques.

Of course, all antispam tests should be completed at the Border MTA, not just the IP based ones. Content filtering, antivirus, and other tests should run there as well to prevent backscatter. Only in special circumstances should even rate limiting be done after the border since this can cause unforeseen bottlenecks that can take an essential route out of commission.

In conclusion, if you only read one paragraph this should be it. Run ALL tests, especially IP based tests, at the border. The only thing that should happen past the border MTA is routing/load balancing to different destination MTAs (where the email is delivered to the user). Ignoring this rule is guaranteed to add to your headaches, and to your employer's dissatisfaction.

Regards,


TEA