I recently received a report that attachments sent to Gmail from some servers were being corrupted. At first, I assumed that the reporter was mistaken, or that perhaps the problem was with the sender's mail client or server. One of my colleagues had already conducted some tests of his own and found that PDFs and TIFFs he tested with were indeed being corrupted. I had to investigate. Some quick tests proved that the reporter and my colleague were correct. Below is detailed information about the tests I conducted and my findings.
Three groups of servers were involved in my tests: my personal mail server (we'll call this the "PWB server"), my employer's mail servers (the "LT servers") and Google's mail servers (the "Gmail servers"). The PWB server's MTA is Postfix. The LT servers include Postfix relays and Kerio Connect mail servers. Mail sent out from the LT servers is first handled by Kerio Connect, then relayed to the outside world by Postfix.
I decided to limit my tests to a single attachment: a TIFF file I picked for convenience. This file is named eyes_color.tif and is 196926 bytes in size.
I conducted several tests, but limited this analysis to a representative batch:
- Test 1 - An email sent from LT to PWB. The attachment arrived intact. The result, as saved from PWB, is in the file test-good-lt_2_pwb.mbox.
- Test 2 - An email sent from PWB to Gmail. The attachment arrived intact. The result, as saved from Gmail's web interface, is in the file test-good-pwb_2_gmail.mbox.
- Test 3 - An email sent from LT to Gmail. The attachment was corrupted. The result, as saved from Gmail's web interface, is in the file test-bad-lt_2_gmail-1.mbox.
- Test 4 - Another email sent from LT to Gmail. The attachment was corrupted, but not in the same way as Test 3. The result, as saved from Gmail's web interface, is in the file test-bad-lt_2_gmail-2.mbox.
From all four tests, I extracted the base64-encoded attachment. The results from Test 1 and Test 2 matched, and decoding them gave back the original TIFF, which SHA-1 hashes confirmed. This correct base64 content is in good.base64. The content from Test 3 and Test 4 was corrupted. In both cases the extracted base64 content was the correct size, but a few bytes had been replaced with non-ASCII characters. The corruption differed between the two and appeared random. Test 3's extracted base64 content is in bad-1.base64, and Test 4's is in bad-2.base64. Diffing each against the correct base64 content yielded bad-1_v_good.patch and bad-2_v_good.patch, respectively. When viewed in Gmail's web interface, the attachments from Test 3 and Test 4 fail to show previews and, once downloaded, cannot be opened in an image viewer. SHA-1 hashes confirm that the attachments downloaded from the web interface do not match the original file sent in those emails.
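For anyone who wants to repeat this kind of check, the extraction and hash comparison can be sketched in Python with the standard library. This is only an illustration and not necessarily how I produced the files above: the function names (attachment_base64, sha1_of_base64, demo_message) are made up for the example, and the message is a synthetic one built in memory with stand-in data rather than one of the saved .mbox files.

```python
import base64
import hashlib
from email import message_from_bytes, policy
from email.message import EmailMessage


def attachment_base64(raw_message: bytes, filename: str) -> str:
    """Return the still-encoded base64 text of the named attachment."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    for part in msg.walk():
        if part.get_filename() == filename:
            # decode=False returns the transfer-encoded text, not the
            # decoded attachment bytes.
            return part.get_payload(decode=False)
    raise LookupError(f"no attachment named {filename!r}")


def sha1_of_base64(encoded: str) -> str:
    """Decode base64 text (line breaks allowed) and return its SHA-1 digest."""
    # b64decode raises ValueError for a str containing non-ASCII characters,
    # so this helper is only meant for content that is expected to be clean.
    data = base64.b64decode("".join(encoded.split()))
    return hashlib.sha1(data).hexdigest()


def demo_message(payload: bytes, filename: str) -> bytes:
    """Build a synthetic message with one attachment (illustration only)."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "recipient@example.com"
    msg["Subject"] = "attachment test"
    msg.set_content("See attached.")
    msg.add_attachment(payload, maintype="image", subtype="tiff",
                       filename=filename)
    return msg.as_bytes()


original = bytes(range(256)) * 8  # stand-in data, not the real TIFF
raw = demo_message(original, "eyes_color.tif")
encoded = attachment_base64(raw, "eyes_color.tif")

# The attachment survived if the decoded content hashes to the same value
# as the file that was sent.
print(sha1_of_base64(encoded) == hashlib.sha1(original).hexdigest())
```

To run the same check against a saved mailbox, the raw message bytes could come from the standard mailbox module instead of demo_message.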
These tests are representative of all of the tests I conducted. Gmail seems to corrupt attachments sent from some, but not all, servers. I do not know why, and I see no pattern in how the attachments are corrupted. When the same sending servers deliver mail to other servers, the attachments arrive in perfect condition. The corruption seems to depend on the sending server, as repeated tests from any given account give consistent results. I have only used a few accounts on each server, so it may instead depend on the sending mail account, but this seems less probable. In each corrupted attachment, a different handful of seemingly random bytes has been replaced with non-ASCII characters. I know that this corruption affects multiple file types, but I do not know whether all file types are affected.
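Because the corrupted base64 content is the same size as the correct content, the damaged bytes can be located with a simple position-by-position comparison. The sketch below assumes both inputs are read as raw bytes. The names corrupted_offsets and non_ascii_count are hypothetical, and the sample data is invented for the example rather than taken from good.base64 or the bad-*.base64 files.

```python
from __future__ import annotations


def corrupted_offsets(good: bytes, bad: bytes) -> list[tuple[int, int, int]]:
    """Return (offset, good_byte, bad_byte) for every position that differs."""
    if len(good) != len(bad):
        # A positional comparison is only meaningful for equal-sized inputs.
        raise ValueError("inputs differ in length; use a diff tool instead")
    return [(i, g, b) for i, (g, b) in enumerate(zip(good, bad)) if g != b]


def non_ascii_count(diffs: list[tuple[int, int, int]]) -> int:
    """Count differing positions where the bad byte is outside ASCII."""
    return sum(1 for _, _, b in diffs if b >= 0x80)


# Invented sample: a short base64 string with two bytes overwritten.
good = b"SUkqAAgAAAAQAP4ABAABAAAAAAAAAAAB"
damaged = bytearray(good)
damaged[5] = 0xE9
damaged[20] = 0x9F
bad = bytes(damaged)

diffs = corrupted_offsets(good, bad)
for offset, g, b in diffs:
    print(f"offset {offset}: expected 0x{g:02x}, found 0x{b:02x}")
print(f"{non_ascii_count(diffs)} of {len(diffs)} differing bytes are non-ASCII")
```

With real files, both inputs would be read in binary mode so that the non-ASCII bytes are preserved exactly as received.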
With assistance from folks at Google, I have identified a probable source of the corruption in the network path between the affected sending servers and the Gmail servers. I do not yet know why other receiving servers are unaffected, but it may be a difference in error detection and correction behavior (such as TCP checksum handling) or a performance difference that affects the chances of corruption. If the affected network provider fixes the problem, I will conduct further testing.
Comments from this post were discarded during a website migration.