XHTML 1.0 vs. HTML 4.01

If I may quote from:

http://hixie.ch/advocacy/xhtml

which is considered the ‘official’ piece on the subject:


Why using text/html for XHTML is bad
------------------------------------

What usually happens to authors who decide to send XHTML as text/html
is the following:

1. Authors write XHTML that makes assumptions that are only valid for
tag soup or HTML4 UAs, and not XHTML UAs, and send it as
text/html. (The common assumptions are listed below.)

2. Authors find everything works fine.

3. Time passes.

4. Author decides to send the same content as application/xhtml+xml,
because it is, after all, XHTML.

5. Author finds site breaks horribly. (See below for a list of
reasons why.)

6. Author blames XHTML.

Steps 1 to 5 have been seen by every single person I have spoken to
who has switched to using the XHTML MIME type. The only reason step 6
didn’t happen in those cases is that they were advanced authors who
understood how to fix their content.

I don’t know whether or not I agree with this whole thing yet. I like some of the thinking. But here’s my thing: I like the style of XHTML. All my tags nesting properly and stuff. That’s cool. But I don’t like things busting. That’s not cool. So can I do something really vile and put an HTML DTD at the top? Then it gets parsed as soup.

Anyway, let’s look closer at the steps he lists:

  • 1 – Number one sucks. We hate when that happens. You can’t stop ’em though.
  • 2 – Number two is unfortunate – our lives would be easier if this didn’t happen. Too late, genie, bottle, etc. Not a Christina Aguilera reference.
  • 3 – Yes it does.
  • 4, 5, and 6 are ALL THE SAME THING. I don’t care about ‘blame’; it has no technical merit whatsoever. I don’t care who gets blamed for what. That’s not my problem.

4 and 5 happen at the _same_ time, unless idiot Author changes his Content-type and then doesn’t check his site. If so he is stupid, and we cannot design for people who are stupid. (Average is OK. Clever is OK. Typical is OK. Really Stupid is not OK.) So the author actually has Tag Soup, and cannot serve it as application/xhtml+xml. He has to revert to text/html. WHO CARES? I don’t care who he blames. I don’t care at all. As soon as he tries to serve application/xhtml+xml, his site explodes, and he has to switch it back. Sounds like application/xhtml+xml is a very, very good ‘toggle’ for Real XML Content compatibility.
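To make the ‘toggle’ idea concrete, here’s a rough sketch in Python. This is my own illustration, not anything from Hixie’s piece: the helper names (is_well_formed, choose_content_type) are made up, and it only checks XML well-formedness, not validity against the XHTML DTD. The idea is simply that a page earns application/xhtml+xml only if a strict XML parser accepts it, and falls back to text/html otherwise.

```python
import xml.etree.ElementTree as ET

XHTML_TYPE = "application/xhtml+xml"
HTML_TYPE = "text/html"


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        # Unclosed tags, bad nesting, etc. all end up here.
        return False
    return True


def choose_content_type(markup: str) -> str:
    """Hypothetical helper: pick the MIME type the markup can survive."""
    return XHTML_TYPE if is_well_formed(markup) else HTML_TYPE


if __name__ == "__main__":
    real_xhtml = (
        '<html xmlns="http://www.w3.org/1999/xhtml">'
        "<head><title>t</title></head>"
        "<body><p>hi<br/></p></body></html>"
    )
    tag_soup = "<html><body><p>hi<br></p></body></html>"
    print(choose_content_type(real_xhtml))
    print(choose_content_type(tag_soup))
```

A real site would probably do this check once at publish time rather than on every request, but the principle is the same: the Content-type is the claim, and the parser is what checks it.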

So what can we do, instead? Serve your semi-valid XHTML as text/html. If you don’t like polluting the XHTML DTD, make your doctype HTML 4.01, and have invalid documents. The whole point of this is that we can’t fix the past (HTML 4.01); we can only try to set things up to work well in the future. And that’s precisely what’s happening in Hixie’s steps 4, 5 and 6, above. Those are in the future. And until Joe Blogger is serving his content as application/xhtml+xml, we can’t be absolutely sure that it’s XHTML, i.e. an XML document. We may want to treat it like Tag Soup. That’s what we do already.
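From the consuming side, that rule might look something like the sketch below. It’s Python, and it’s an assumption-laden illustration rather than how any real browser works: element_names is a hypothetical function that trusts the Content-type, using a strict XML parser only for application/xhtml+xml and a forgiving HTML parser for everything else.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class _TagCollector(HTMLParser):
    """Forgiving parser: records start tags, never complains about soup."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def element_names(body: str, content_type: str) -> list:
    """Hypothetical consumer: let the Content-type decide how to parse."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/xhtml+xml":
        # The author claims real XML, so hold them to it.
        # Malformed input raises xml.etree.ElementTree.ParseError.
        root = ET.fromstring(body)
        # Strip the "{namespace}" prefix ElementTree puts on tag names.
        return [el.tag.split("}")[-1] for el in root.iter()]
    # Anything else gets the tag soup treatment.
    collector = _TagCollector()
    collector.feed(body)
    collector.close()
    return collector.tags
```

Feed it unclosed tags under text/html and it shrugs and carries on; feed it the same thing under application/xhtml+xml and it blows up, which is exactly the behaviour being argued for here.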

I also think that we’re dealing with a very, very subtle problem here. Hixie publishes a nice plain text document, with the “Considered Harmful” lingo which sounds so RFC-like. And says some stuff about how valid XHTML is good. And people (programmers) like that. So they quote him and cite him. And it looks formal and correct and well-debated. But he’s wrong. Instead of moving us forward, he’s keeping us mired in HTML 4.01. I say, better to be invalid HTML 4.01 because you’re secretly striving for XHTML. Being valid HTML 4.01 doesn’t help you. But the tone and style of that document make it sound already settled, and it’s not. So beware documents that sound like that: plain text, “blah considered harmful”, etc.

Ultimately, your choices are: do you want Validity? Do you want it to Work? Do you want future-ness? Do you want XML-itude? And if your primary concerns are Validity and Workingness, use HTML 4.01, and validate it. If you want Futurey bits, and XML-itude, you can try to make valid XHTML 1.0 and serve it properly. But you’ll probably botch it because it’s hard. If it’s too hard for you, and you don’t care about validity, don’t serve it as application/xhtml+xml.

I don’t care about HTML 4.01 being valid. Fuck it. But I do care about XHTML being valid, and I think the “Content-type as validity toggle” is a good step for now.
