Steven Den Beste writes, on the subject of random text in spam e-mail:
I don't think [bypassing the filter is] what they're trying to do when they include a lot of irrelevant prose from books or newspaper articles. I think they're trying to attack the filter directly. The goal isn't to increase the positive score for that particular message, it's to try to poison the filter. By including a lot of relatively normal text in what gets classified as spam, they hope it will start tagging legitimate mail as also being spam. If the false-positive rate is too high, the user will stop using the tool no matter how good it is at rejecting spam.
I first noted these random-etext spams a couple of months ago. Since I've told my e-mail client to prefer the text over the HTML portion of the message, I usually see the random text instead of whatever message the spammer is trying to send.
My impression is that this text was included so that different spam messages would have different checksums. Some anti-spam methods make a blacklist of known spam by storing a message hash calculated on the body of the e-mail. An anti-spam filter can check an incoming message against this blacklist by computing the same message hash and asking the blacklist server if the server knows that hash. (Vipul's Razor is such a system, which I use through SpamAssassin.)
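As a rough illustration of the checksum-blacklist idea, here is a minimal sketch in Python. The `HashBlacklist` class and its method names are hypothetical, an in-memory set stands in for the remote blacklist server, and SHA-1 is just an assumed hash; none of this is meant to describe how Vipul's Razor actually works.

```python
import hashlib


def body_hash(body: str) -> str:
    """Return a hex digest of the message body (SHA-1 is an assumption)."""
    return hashlib.sha1(body.encode("utf-8")).hexdigest()


class HashBlacklist:
    """Hypothetical stand-in for a blacklist server: a set of known spam hashes."""

    def __init__(self) -> None:
        self._known = set()

    def report(self, body: str) -> None:
        # Someone reports a spam message; only its hash is stored.
        self._known.add(body_hash(body))

    def is_known_spam(self, body: str) -> bool:
        # The filter computes the same hash and asks whether it is known.
        return body_hash(body) in self._known


blacklist = HashBlacklist()
blacklist.report("Buy cheap pills now!\n")

print(blacklist.is_known_spam("Buy cheap pills now!\n"))  # exact copy of reported spam
print(blacklist.is_known_spam("Lunch on Friday?\n"))      # never reported
```

The lookup only succeeds when the incoming body is byte-for-byte identical to a reported one, which is the weakness the spammers went after.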
The spammer's countermeasure against checksumming is to make every message unique, so each message will have a different checksum. First they tried something that could be called "gratuitous uniqueness", where random characters are added to each spam message to make it unique. I suspect this is why spam e-mail so often has garbage characters at the end of the Subject: line. Similarly, a few garbage characters can be added to the beginning or the end of the message body to defeat a body-only checksum.
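To see why a handful of junk characters is enough, consider this small sketch. The helper `add_gratuitous_uniqueness` is a made-up name, the junk length of eight letters is arbitrary, and SHA-1 is again only an assumed hash.

```python
import hashlib
import random
import string


def body_hash(body: str) -> str:
    """Exact checksum of the body (SHA-1 is an assumption)."""
    return hashlib.sha1(body.encode("utf-8")).hexdigest()


def add_gratuitous_uniqueness(body: str, rng: random.Random, n: int = 8) -> str:
    """Append a line of n random lowercase letters (hypothetical spammer trick)."""
    junk = "".join(rng.choice(string.ascii_lowercase) for _ in range(n))
    return body + "\n" + junk + "\n"


rng = random.Random(0)  # seeded only so the demo is repeatable
original = "Buy cheap pills now!"
copy_a = add_gratuitous_uniqueness(original, rng)
copy_b = add_gratuitous_uniqueness(original, rng)

# The sales pitch is untouched, but the checksums no longer match the original.
print(body_hash(original))
print(body_hash(copy_a))
print(body_hash(copy_b))
```

Because a cryptographic hash changes completely when any input byte changes, a blacklist keyed on the exact hash of the original never matches the padded copies.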
Vipul's Razor employs (actually, employed) a fuzzy signature algorithm precisely to tolerate the small changes that create uniqueness. It also randomly samples the message, hashing only a portion of it to compute the signature, so a small change that falls outside the sampled portion does not alter the signature.
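I don't know the details of Razor's fuzzy algorithm, so the following is not it. It is only a generic sketch of how a fuzzy comparison can shrug off a few garbage characters: each message is reduced to its set of overlapping three-word "shingles", and two messages are compared by the fraction of shingles they share (Jaccard similarity). The function names and the choice of three words are my own assumptions.

```python
import re


def shingles(body: str, k: int = 3) -> set:
    """Return the set of overlapping k-word sequences in the body."""
    words = re.findall(r"[a-z0-9']+", body.lower())
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of the two shingle sets (1.0 means identical sets)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


spam = ("Lose weight fast with our amazing new pills "
        "order today and save fifty percent")
tweaked = spam + " xqzvkj"  # same pitch plus one junk word at the end
unrelated = "Are we still meeting for lunch on Friday at noon"

print(similarity(spam, tweaked))    # 12 of 13 shingles shared, about 0.92
print(similarity(spam, unrelated))  # no shingles shared, 0.0
```

A filter built this way would treat anything above some similarity threshold as the same spam, so one junk word no longer buys the spammer a fresh identity.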
I believe the inclusion of large passages of English text (drawn from etext archives, articles, etc.) is in part a countermeasure against fuzzy and randomly sampled signatures. I note that Vipul's Razor has currently suspended the use of its fuzzy algorithm and is only sampling; a new algorithm is under development. Luckily for me, SpamAssassin also uses a Bayes filter like the one Den Beste describes.
In any case, good luck to Steven and his new toy.