I’ve spoken about ‘considered harmful’ several times, and my favorite article (or anti-favorite, really) is still Ian Hickson’s article about sending XHTML with a Content-Type: header of ‘text/html’.

That was bad enough, and horrifically damaging to advancing web standards – such as XHTML.

Now, however, the well-regarded and obviously well-educated and well-meaning gentleman has decided that XHTML is so terribly damaged by poor spec-writing and inconsistencies in implementation that he would create a new spec, HTML5. HTML5 is really a successor to HTML4, the last version of HTML we had before XHTML came into play.

This is an even worse step backwards than we had before.

XHTML, and XML, could have ushered in a lot of new technologies. Once enough content on the web was valid XML, browsers could have used high-speed parsers that halt on invalid documents, making for a higher-performing browsing experience. Instead of serving HTML documents with tags like <html>, <head>, <body>, and others, you might get to the point where you serve documents with tags like <invoice>, <lineitem>, <quantity>, and so on, styling these documents with XSL/XSLT. That could really be great, and usher in a world where you have one canonical format for your data, plus a stylesheet for human consumption. The computer can parse it, as-is. Wouldn’t that be great?
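To make that concrete, here is a rough sketch in Python of the "computer can parse it, as-is" idea. The invoice layout and the helper names are made up for illustration, and the standard-library parser here only checks well-formedness (it does not validate against a schema), so treat it as a sketch rather than a real invoice format.

```python
import xml.etree.ElementTree as ET

# Hypothetical invoice documents; the tag names follow the examples in the text.
GOOD_INVOICE = (
    "<invoice>"
    "<lineitem><quantity>3</quantity></lineitem>"
    "<lineitem><quantity>4</quantity></lineitem>"
    "</invoice>"
)
# Same idea, but <quantity> is never closed, so the document is malformed.
BROKEN_INVOICE = "<invoice><lineitem><quantity>3</lineitem></invoice>"


def total_quantity(xml_text):
    """Sum every <quantity> in the document; raises ET.ParseError if malformed."""
    root = ET.fromstring(xml_text)
    return sum(int(q.text) for q in root.iter("quantity"))


def total_or_none(xml_text):
    """A strict consumer: refuse the whole document instead of guessing at it."""
    try:
        return total_quantity(xml_text)
    except ET.ParseError:
        return None


if __name__ == "__main__":
    print(total_quantity(GOOD_INVOICE))
    print(total_or_none(BROKEN_INVOICE))
```

The parser stops at the first mismatched tag, which is the "halt on bad documents" behavior, as opposed to the error recovery an HTML parser would attempt.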

Yes, but Hixie has ruined it for us. Because of his ‘concern’ about serving slightly dented HTML content to the unfortunate users of IE, all forward progress must stop. Good job.

Now we’re back to where we started. Thank you so much, Mr. Hickson. I tell you what, since you’re so concerned about validity, why don’t we just scrap HTML in total and make the internet be nothing but text documents. Instead of links, we can have instructions in the middle of our text such as, “To see Brady’s blog, type http://uberbrady.blogspot.com into the address bar of your text browser.” Then all documents will be valid, forever. OK? Great.


Considered Harmful Still Considered Harmful

Autoincrement Considered Harmful.

Yet another post which talks about some drawbacks of autoincrement columns – well, not really autoincrement columns. A post that talks about some problems you can have in databases, which have little to do with autoincrement columns.

But ‘considered harmful’ makes geeks snap to attention. Anyways, the article is actually interesting – and points out some problems with RESTian architecture, combined with serial numbers on tables – but has little to nothing to do with Autoincrement columns being harmful or otherwise.

“Considered Harmful”, Spam, and SPF

So lately we’re getting tons of spam. Any sense of the word ‘we’ you can come up with, we are getting it. The stuff that seems to keep making it through everything tends to be image spam (can’t do Bayesian stuff to it, no text) for stock scams (no need to put a URL in the content of the email, which we would catch and block).

So at first I was considering running OCR on all email that came in and had images in it – but that’s really scary. It would mean having the computer figure out whether there’s text in every image, scanning it out, and then running SpamAssassin or whatever on that text. There seems to be one plugin for this and it seems crappy – it has to filter your image through an image converter, then into an OCR package, then the text that comes out gets checked against a static list. Lame. I would prefer the text be fed into SpamAssassin or something, so we get a little more flexibility out of the setup. But even then – the spammers just start making swirlier text, more obfuscated, and your OCR plug-in won’t be able to read it.

But I decided to look into some other options – and one I decided to implement is called SPF, the Sender Policy Framework. Microsoft has extended it into a proposal called Sender ID. You check DNS to see whether the server that’s sending you mail is listed in the sending domain’s TXT record as ‘authorized’ to send mail for that domain. If it isn’t, you can bounce the mail.
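Here is roughly what that check boils down to, as a Python sketch. It assumes you have already fetched the TXT record from DNS and it only understands the ip4/ip6 mechanisms and the trailing "all"; real SPF also has a, mx, include, redirect and so on, which need more DNS lookups. The function name and the example record are mine, not from any mail server.

```python
import ipaddress

# Map SPF qualifiers to result names.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_spf(record, sender_ip):
    """Evaluate a small subset of an SPF record against the connecting IP."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record at all
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier when none is written
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6"):
            # A bare address is treated as a single-host network.
            if ip in ipaddress.ip_network(value, strict=False):
                return QUALIFIERS[qualifier]
        elif name == "all":
            return QUALIFIERS[qualifier]
        # Mechanisms that need DNS lookups are ignored in this sketch.
    return "neutral"  # nothing matched


if __name__ == "__main__":
    # Hypothetical record for some domain, using documentation IP ranges.
    record = "v=spf1 ip4:192.0.2.0/24 -all"
    print(check_spf(record, "192.0.2.25"))
    print(check_spf(record, "198.51.100.7"))
```

A "fail" is the case where you can bounce; what to do with "softfail" or "neutral" is up to whoever runs the receiving server.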

Now, ultimately, the spam problem is a legal problem, but the law is impossible to enforce because of all the forging that goes on. Pump-and-dump stock schemes are an FTC issue, for example. But we can’t tell who’s spamming us because they’re sending through zombie networks with forged ‘from’ addresses. If we knew who they were, we could refer them to the FTC, and the FTC could attack them from that direction. SPF _may_ end up helping with that kind of thing. Maybe.

But today I had to wade through a ton of articles begging me not to implement SPF because of the horror and tragedy that would ensue. Oh no! But, as before, “X Considered Harmful” is just another way to cause a knee-jerk reaction. If some domain out there in the world chooses to publish SPF records for their domain, and you choose to obey those SPF records, it’s not a big deal. If you don’t like SPF records, don’t publish any, or publish a “+all” record if you want to be a dick about it. Why go on a tirade? If some guy publishes a record and fucks up his email, isn’t that his problem, not yours?

Now, that being said, there are problems with this SPF thing, among them the handling of forwarders. But the bulk of the technical disagreements here don’t seem valid. In the modern era, there are no open relays anymore. If you relay mail, you relay it for someone. Whoever ‘someone’ is, if they want, they can publish an SPF record that says so. If you’re trying to do some tricky thing with moving around and sending mail from dynamic addresses, you’re likely getting marked as spam anyway because of your address dynamicness.

But forwarders seem to be a legit problem. Domain A sends mail to Domain B. foo@b.com forwards to bar@c.com. So now we have the mail server at b.com sending mail from somebody at a.com to c.com. Wait, that’s not a problem, is it? No, it is – imagine c.com checks the SPF record – the mail claims to come from Domain A, so C will be checking A’s SPF record. A’s SPF record says that A will only send mail from A’s server, and B’s server isn’t A’s server. So that’s the infamous Forward problem. Eh, not good. But still, it’s A’s problem, not my problem (being Mr. C). Shit. Basically, the actions of the recipient on server B will affect whether or not his email will forward properly. He goes into his account settings, says ‘forward to server C’, and mysteriously finds that some messages (from servers other than A, who don’t use SPF) get through, whereas others (from servers like A, who _do_ use SPF with some kind of restrictive setting) will get mysteriously bounced or marked as spam. Well…I dunno. The user at C who changed his forward on server B is going to find his mail kinda does get delivered, kinda doesn’t. Depends on who it comes from. And that’s because I (owner of server C) turned on SPF checks. It only happens in the case of a ‘forward’, and it can be fixed by mangling the envelope sender so it appears to be from the B server’s domain…but…ugh. In any case, it’s a setting on A’s server that seems to cause the problem. If the user on C isn’t getting mail from A that’s going through his forward at B, well, don’t do the forward, or use a new-style forwarder thingee.
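The A, B, C mess is easier to see as a toy model. The Python sketch below fakes SPF with a table of which IPs each domain authorizes (standing in for the TXT records), and the rewritten-sender format is something I invented for the example; real sender rewriting schemes use their own formats. All addresses and IPs are placeholders.

```python
# Hypothetical policies standing in for published SPF records.
AUTHORIZED = {
    "a.com": {"192.0.2.1"},     # A only sends from A's own server
    "b.com": {"198.51.100.1"},  # B only sends from B's own server
}
B_SERVER_IP = "198.51.100.1"


def spf_result(envelope_from, connecting_ip):
    """What C concludes when it checks the envelope sender's domain."""
    domain = envelope_from.rsplit("@", 1)[1]
    allowed = AUTHORIZED.get(domain)
    if allowed is None:
        return "none"  # domain publishes no policy, nothing to enforce
    return "pass" if connecting_ip in allowed else "fail"


def forward_from_b(envelope_from, rewrite):
    """B relays a message to C; returns (envelope sender, IP that C sees)."""
    if rewrite:
        # Mangle the envelope sender so it belongs to B's domain.
        # This address format is made up for illustration only.
        local, domain = envelope_from.rsplit("@", 1)
        envelope_from = f"fwd+{local}={domain}@b.com"
    return envelope_from, B_SERVER_IP


if __name__ == "__main__":
    print(spf_result("alice@a.com", "192.0.2.1"))                       # direct from A
    print(spf_result(*forward_from_b("alice@a.com", rewrite=False)))    # plain forward
    print(spf_result(*forward_from_b("alice@a.com", rewrite=True)))     # mangled sender
    print(spf_result(*forward_from_b("dave@d.org", rewrite=False)))     # sender without SPF
```

The plain forward fails because C sees B’s IP next to A’s domain, while mail from a domain with no policy sails through, which is exactly the "kinda does get delivered, kinda doesn’t" behavior.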

Shit, maybe I do have to do some kind of OCR thing after all. Ugh. I hate this crap. And after I _manually_ went and applied patches onto qmail. I need a new mailserver, too.