Which should we use, HTML or XHTML, and why?
The first W3C Recommendation for HTML was HTML 3.2, code-named Wilbur, published in January 1997. It was followed by HTML 4.0 in December 1997 and HTML 4.01 in December 1999. HTML 4.01 is the final revision of HTML 4. Over the years, some of the upgrades have included:
- Separation of presentation from structure
- Improved document rendering
- Improved accessibility
- Better internationalization
XHTML 1.0 was published shortly after HTML 4.01 to ease the transition from HTML to a new generation of XML-based markup languages. XHTML 1.1 is a modular reformulation of XHTML 1.0 Strict that brings the benefits of XML architecture. It also extends the semantics of HTML 4.01 by including the Ruby Annotation module, which is used for short annotations alongside base text in East Asian typography, for example pronunciation guides in Japanese (Reference: Ruby Specification).
For our purposes, we will look only at meta tags. The first example below is HTML 4.01 and the second is its XHTML 1.0 equivalent:
<META NAME="description" CONTENT="This is a sample page">
<meta name="description" content="This is a sample page" />
The syntax in these examples is very similar, with just a few differences.
For meta tags, the main difference you will observe is that XHTML elements must always be closed. An empty element such as meta has no closing tag, so it is closed with a slash at the end of the tag: />. Also, all element and attribute names must be lower case in XHTML; in HTML, names are case-insensitive, so they may or may not be.
Both HTML 4.01 and XHTML 1.0 come in three variants:
- Strict (recommended)
- Transitional
- Frameset
The “strict” variant is strongly recommended by the W3C for regular documents. It removes problematic elements and forces a clear separation between the structure of your document and its presentation. The transitional variant allows deprecated elements, to help implementers upgrade their software or content smoothly; you might want to consider it if you maintain legacy markup that cannot be cleaned up right away. The frameset variant is for documents that use frames.
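As a sketch of how a document selects one of the three variants (strict, transitional, frameset), these are the standard W3C document type declarations. A real document carries exactly one of them, placed before the opening html tag; they are listed together here only for comparison.

```html
<!-- HTML 4.01 Strict -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">

<!-- HTML 4.01 Transitional -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
  "http://www.w3.org/TR/html4/loose.dtd">

<!-- HTML 4.01 Frameset -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
  "http://www.w3.org/TR/html4/frameset.dtd">

<!-- XHTML 1.0 Strict -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<!-- XHTML 1.0 Transitional -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<!-- XHTML 1.0 Frameset -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
```

A validator reads this declaration to decide which set of rules the rest of the page is checked against.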
Does XHTML 1.0 have any advantages over HTML 4.01?
There is no simple answer; the benefits you gain depend on how you are using the language in a given situation.
Switching from HTML 4.01 to XHTML 1.0 brings no direct benefit to your site visitors, since browsers generally display equivalent pages the same way.
Web authors switch for reasons such as these:
XHTML is easier to maintain
XHTML syntax rules are far more rigorous than those of HTML. XHTML requires the following:
- all element and attribute names must appear in lower case
- all attribute values must be quoted
- non-empty elements require a closing tag
- empty elements are terminated with a trailing slash, preceded by a space for compatibility with older browsers (<hr />, <br />)
- no attribute minimization is allowed (write checked="checked", not just checked)
- in strict XHTML, all inline elements must be contained in a block element
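The rules above can be seen together in a minimal XHTML 1.0 Strict page. This is an illustrative sketch: the title, the text, and the /subscribe form target are placeholders, not part of any real site.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Sample page</title>
    <!-- empty element: closed with a space and a trailing slash -->
    <meta name="description" content="This is a sample page" />
  </head>
  <body>
    <!-- inline content sits inside a block element (p), never directly in body -->
    <p>First line<br />second line</p>
    <hr />
    <!-- /subscribe is a hypothetical URL used only for illustration -->
    <form action="/subscribe" method="post">
      <p>
        <!-- no attribute minimization: checked="checked", not checked -->
        <input type="checkbox" name="news" checked="checked" /> Send me news
      </p>
    </form>
  </body>
</html>
```

Every element and attribute name is lower case, every attribute value is quoted, and every non-empty element has an explicit closing tag.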
In HTML, on the other hand:
- mixed-case names, unquoted attribute values, omitted closing tags on many elements, and uncontained inline elements are allowed and commonplace
- the margin for error is much wider than in XHTML, where the rules are very clear
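For comparison, here is a sketch of markup that HTML 4.01 accepts even though it breaks several XHTML rules. The content and the /subscribe form target are made up for illustration.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>Sample page</TITLE>
<!-- upper-case names, an unquoted value, and no trailing slash -->
<META NAME=description CONTENT="This is a sample page">
</HEAD>
<BODY>
<!-- the closing tag of P is omitted; HR implicitly ends the paragraph -->
<P>First line<BR>second line
<HR>
<UL>
<!-- closing tags of LI are optional in HTML -->
<LI>first item
<LI>second item
</UL>
<!-- /subscribe is a hypothetical URL used only for illustration -->
<FORM ACTION="/subscribe" METHOD=post>
<!-- minimized attribute: CHECKED with no value -->
<P><INPUT TYPE=checkbox NAME=news CHECKED> Send me news
</FORM>
</BODY>
</HTML>
```

An HTML parser fills in the missing pieces, whereas an XML parser would reject this page as not well-formed.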
XHTML is XSL ready
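XHTML 1.0 is the reformulation of HTML 4.01 in XML. It follows that XHTML documents are both hypertext documents and XML documents, so they can be manipulated and transformed with XML technologies such as XSLT (Extensible Stylesheet Language Transformations).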
XHTML is easier to teach and to learn
The syntax rules defined by XML are far more consistent than those found in HTML, and therefore easier to explain than the SGML rules on which HTML is based. Many web authoring applications generate XHTML by default; Adobe (formerly Macromedia) Dreamweaver, for example, closes all tags by default and generates XHTML-compliant code.
For more information on XHTML see the W3C website: Using XSLT and XHTML.