
HTML From the Ground Up

by L. Downs

Structure vs. Appearance

By now you're probably becoming a bit more comfortable with writing pages in HTML, and have begun to view it as a simple way to define a page so that it will appear the same on a wide variety of browsers and operating systems. If you're reasonably computer-literate, in fact, you may have decided that it's just an ultra-simple relative of PostScript, which is another computer language used to do exactly that. Such a language is called a "page-description language."

However, HTML was never intended to be a page-description language, and the increasing use of it as such has considerably diluted its potential and made the Web a very difficult place to index. It's also made many pages extremely difficult to use by those with visual disabilities. In fact, strictly speaking, most Web pages out there seriously misuse HTML as it was originally conceived.

The original intention of HTML was to organize content, not appearance, and the tags were chosen and defined for this purpose. There are six levels of heading tags, for example, not so that you have access to six different sizes of text, but so that you can mark the various topics of your page in terms of their relative importance. Similarly, such tags as <DL>, <DT> and <DD>, often used to move text over from the left margin to avoid collisions with left-border graphics, were actually intended for lists of defined terms, using <DT> and </DT> to mark the term being defined, <DD> and </DD> to enclose the definition, and <DL> and </DL> to enclose the entire list of defined terms.
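
To make this concrete, here is a small sketch of headings and a definition list used the way they were intended. The page topic and the wording are invented purely for illustration, and the fragment is meant to sit inside the body of a page.

```html
<!-- Headings rank topics by importance, not by text size -->
<H1>Caring for Houseplants</H1>
<H2>Watering</H2>
<H3>How often to water</H3>
<H2>Light</H2>

<!-- A definition list: DL wraps the whole list,
     DT marks each term, DD encloses its definition -->
<DL>
  <DT>Annual</DT>
  <DD>A plant that completes its life cycle in one growing season.</DD>
  <DT>Perennial</DT>
  <DD>A plant that lives for more than two years.</DD>
</DL>
```

Read the heading tags alone and you get a usable outline of the page, which is exactly the kind of information software can extract.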

So what's wrong with the way things have developed? If a tag does what you want it to do, who cares whether it does what its creator intended it to do?

On a purely philosophical level, it's probably a moot point. The idealistic hopes of the Web's creators of an easily accessible worldwide repository of human knowledge have been hopelessly drowned in a sea of porn sites, blogs and commercial hucksterism. The utopian dream is dead, and it ain't comin' back. Get over it.

Not so easily dismissed, however, is the utter inaccessibility of most Web pages to someone with visual disabilities. For such a person to effectively use a Web page, they must use software which extracts the actual information and formats it in such a way as to render it audibly intelligible. While almost all such software can read aloud all text present in a page, this effectively ignores the many visual devices a good designer can use to direct attention towards the most important elements. Instead, the page becomes a morass of undifferentiated text, with the most important and least important elements treated equally.

A well-written text reader can in theory extract the important elements from a page if they are properly enclosed in appropriate HTML tags, or "marked up." This will work effectively if, for example, the most important concepts of a page appear in appropriately ranked heading tags. However, if a pair of <H1> tags is used for nothing more important than to make the word "Beware!" jump off the page at you visually, this isn't going to be very helpful.
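
The contrast might look something like the following sketch. The gallery notice is a made-up example; the second version is just one reasonable way to mark it up, not the only one.

```html
<!-- Misleading: H1 is used only to make a word big, so software
     reading the headings thinks the page's main topic is "Beware!" -->
<H1>Beware!</H1>
<P>The gallery is closed on Mondays.</P>

<!-- More helpful: the heading names the topic, and the warning
     is marked as strongly emphasized text within the paragraph -->
<H1>Gallery Hours</H1>
<P><STRONG>Beware!</STRONG> The gallery is closed on Mondays.</P>
```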

This use of tags to "mark up" the intellectual content of a page didn't originate with HTML; in fact, HTML is an application of an older, much more general language called SGML (Standard Generalized Markup Language). SGML is used by such organizations as jet plane manufacturers for very large collections of data because of the ease of extracting key information from such marked-up text.

What does this mean to you, the Web page designer? It means, first of all, that you should avoid using visual tags if the intent is to flag important information. For example, if you want to strongly emphasize a bit of text because it contains an important concept, consider using <STRONG> and </STRONG> tags instead of <B> and </B> tags. The former is aimed at the content of the text, while the latter is only concerned with its visual appearance. On the other hand, if all you want to do is make your text a bit heavier and there is no informational implication involved, then use the bold tags instead.
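
As a rough illustration of the difference (both sentences and the shop name are invented for this sketch):

```html
<!-- The warning matters to the reader, so it is marked
     as strongly emphasized content -->
<P><STRONG>Unplug the computer before opening the case.</STRONG></P>

<!-- Here the heavier type is purely decorative and carries
     no extra meaning, so plain bold is the honest choice -->
<P>Welcome to <B>Acme Hardware</B>, open seven days a week.</P>
```

Most browsers will probably draw both in bold, but only the first tells software that the text is important.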

Some other useful tags for describing content rather than just changing the way it looks are the <CITE> and </CITE> tags, for enclosing citations to other works, <CODE> and </CODE> tags for enclosing computer code displayed on a Web page, and <BLOCKQUOTE> and </BLOCKQUOTE> tags for enclosing quotations from other works.
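
Here is a brief sketch showing all three tags doing the jobs they were defined for. The sentences around the tags are invented for illustration.

```html
<!-- CITE encloses a citation to another work -->
<P>As <CITE>The Elements of Style</CITE> advises:</P>

<!-- BLOCKQUOTE encloses the passage quoted from that work -->
<BLOCKQUOTE>
  <P>Omit needless words.</P>
</BLOCKQUOTE>

<!-- CODE encloses computer code shown on the page; the angle
     brackets are written as &lt; and &gt; so they display
     as text instead of being read as a tag -->
<P>To start a new paragraph, type <CODE>&lt;P&gt;</CODE>.</P>
```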

Unfortunately, these tags (like the heading tags) are easily misused. Although the <CITE> tag has the side effect of displaying enclosed text in italics, it's much easier to type the <I> tag, so <CITE> is rarely abused. The <BLOCKQUOTE> tag, however, causes enclosed text to be indented from both margins, and is often used as an easy way to get this effect. Software which assumes that such text is a quotation will then give misleading information about the page. Similarly, the <CODE> tag is sometimes used just to display text in a monospaced font (such as Courier), and such text may be misinterpreted as computer code as a result.
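
For comparison, this is the sort of markup to watch out for. Both lines are invented examples of misuse, shown only so you can recognize the pattern.

```html
<!-- Misuse: this is not a quotation from anywhere;
     BLOCKQUOTE is here only to get the indented margins -->
<BLOCKQUOTE>Our store hours are 9 to 5, Monday through Friday.</BLOCKQUOTE>

<!-- Misuse: this is not computer code;
     CODE is here only to get a monospaced font -->
<P><CODE>Thank you for visiting!</CODE></P>
```

Neither line looks wrong on screen, which is the trouble: the damage only shows up for software that takes the tags at their word.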

Even if you have the social conscience of a hate radio host, there's another very powerful reason to use the rrright tag for the rrright job. Tags such as <STRONG>, <CITE> and <BLOCKQUOTE> are defined in the HTML standards only in terms of content; their visual appearance is not prescribed. Even though the most popular browsers have converged over the past few years in their ways of displaying these content-based tags, there's no guarantee that all browsers will continue to do so. (Remember how the first versions of Internet Explorer displayed heading tags in different colors rather than different sizes?) Using these tags for their appearance (based on how one or two currently popular browsers display them) risks eventual disaster. If you want your pages to display as similarly as possible across different browsers, your choices are:

A side benefit of properly marking up your pages is that some search engines are growing intelligent enough to attempt to extract the key concepts from them, which may make your pages more accessible to the rest of us.

In the next lesson we'll take a moment to cover various odds and ends that don't really fit under any one topic but which you still need to know.

Terms to know from this lesson
<STRONG> and </STRONG> tags: Enclose text to be strongly emphasized. Browsers may differ on how they do this, though rendering in bold is most common.
<CITE> and </CITE> tags: Enclose a bibliographic citation to another work.
<BLOCKQUOTE> and </BLOCKQUOTE> tags: Enclose a quotation from another work.
<CODE> and </CODE> tags: Enclose computer code displayed on a page.
Blog: If you don't know what this means you don't want to. Trust me on this one.

Portions of this tutorial originally appeared in Technotes, a publication of the UNLV Libraries, and are copyright by the University of Nevada, Las Vegas; used by permission. All remaining material © 2003 Lamont Downs.

This page last updated 2/5/2010.