XHTML Word Processing

By Ed Hurst | Posted at 4:13 AM

The basic purpose of a word processor is to format text for printing. If you aren't going to put it on paper, you really have no need for a word processor. However, I find a huge portion of the computer-using population doesn't distinguish between documents for printing and webpages. That is, not consciously. They may know instinctively that if they want the contents of a webpage to print the way they want, they'll have to copy from the page, paste into a word processor, format, then print. They focus on the presentation, and treat the information as a separate issue. Indeed, the former often takes precedence.

The problem here is that such users seldom have much more than a semi-conscious awareness of their own preferences, even when they focus on presentation. They have no idea what makes presentation effective for their audience; they simply assume people are affected as they were at that particular moment. Tomorrow the same printed page becomes trash because it no longer speaks to them the same way. No one seems to notice. People who have studied how the average brain processes information are more likely to cut all the trashy, flashy extras in presentation, and stick with the essentials. They know that all caps, all bold, all italics, all underlined, or all brightly colored text makes the average brain respond to the message as unimportant, because it takes such a hideous paint job to convey the importance the sender attaches to it. And we all know most people are more impressed with themselves than most other people are. Save the theatrics for the live presentations; in print there are precious few with a talent for it.

Unload all the silliness, and you arrive at the information itself. Adding attributes to plain text has a well-established meaning. The framework was established long ago, and various attempts were made to produce software which approximates that framework. Some details have shifted around a bit from those early standards for text displayed on a computer screen. For example, we don't worry about wasting paper or other resources which feed into printed matter. What we used to do in print fails to exploit the capability of computers. For most uses, print has shifted to match what we do on computers. For example, it's no longer acceptable to indent the first line to mark the beginning of a paragraph. Now we use vertical space between paragraphs, and leave lines unindented in the main text. There is some overlap: We still set off large blocks of quoted material by indenting the whole paragraph. Whole books have been written on the details. The point has always been, “What's the best way to inform the widest audience?”

In the end, we have web publishing standards as reflected in the system called HTML (HyperText Markup Language). That framework continues to develop, often through an ugly political process, but it's now the reflexive standard in the minds of most people who can read. I choose to rely on the XHTML (Extensible HyperText Markup Language) standard. Over the years, I have found myself moving away from the word processing model altogether. While I still have a printer attached to my computer, I seldom use word processing software. For longer documents which must be printed, I prefer a typesetting tool (LyX), but for most of my personal archives I prefer either plain text or HTML. All the more so since, even for printing, HTML processed by your browser seems a better way to go.

That is, it's possible to write formatting instructions (CSS, or Cascading Style Sheets) for inclusion in the webpage which are tailored just for printing. Even then, the whole idea behind the new and better XHTML is to avoid making choices for the reader, and to give them maximum freedom to alter the presentation, if they are so inclined. Of course, that means they'll need some awareness of what they can change and how, but for those who don't know, the basic standards are probably fine. I use minimum formatting, with one stylesheet for browser display, and another for printing. This lends itself to a certain global accessibility, since the print version does not direct the result to fit a particular size of paper. It allows the user to print on any paper, because paper size is solely an issue of browser and printer controls. All I do is provide relative guidelines, and those who really must change them can do so by telling their browser to prefer their own CSS over mine. I construct my CSS files specifically to be advisory, and easily overridden by anyone who has the savvy to set their own user CSS.
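As a rough illustration of the approach described above, here is a minimal sketch of advisory screen and print rules. It is not the actual stylesheet used on this site: the selectors and values are assumptions chosen for illustration, and it combines both media in one file with `@media` blocks for brevity rather than using two separate files.

```css
/* Hypothetical sketch: advisory rules in relative units only. */

/* Shared defaults: relative sizes, so the reader's own settings scale everything. */
body {
  font-family: serif;          /* generic family; the browser picks the actual font */
  line-height: 1.4;
}

p {
  margin: 1em 0;               /* vertical space between paragraphs */
  text-indent: 0;              /* no first-line indent */
}

blockquote {
  margin: 1em 2em;             /* set off quoted material by indenting the whole block */
}

a {
  text-decoration: underline;  /* underline reserved for hyperlinks */
}

/* Browser display only. */
@media screen {
  body {
    max-width: 40em;           /* line length relative to the font size */
    margin: 0 auto;
    padding: 0 1em;
  }
}

/* Printing only. */
@media print {
  body {
    color: black;              /* suggest black-on-white */
    background: white;
  }
  /* No @page size rule: paper size is left to the browser and printer. */
}
```

None of these declarations is marked `!important`, which is what keeps them advisory. In the CSS cascade, a declaration marked `!important` in a reader's user stylesheet takes precedence over the author's rules, so a reader who wants white-on-black, for example, can get it that way.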

For one thing, this forces me to focus on the content. It's pretty much the same for typesetting software. In my best academic style, I try to limit the use of italics or boldface, and reserve underline for marking hyperlinks. If you really need white-on-black, you can tell your browser to make it so, and it should work fine. If you really like blinking text, you are free to learn how to make it happen and apply it to my webpages. You can print any color you wish, because I only suggest black-on-white as the most common style. Depending on how your browser and printer cooperate, printing my webpages should work about the same as with a word processor.

I used to have a huge cache of word processor documents, and always worried about the format and whether anyone else could open them and read or print them. Over the past few years, I have changed them all to XHTML. It's become the most widely acceptable format for just about any purpose.

Ed Hurst is Associate Editor of Open for Business.


Join the Conversation

1 comment posted so far.

Re: XHTML Word Processing

Did you ever hear of a standard whose very purpose is long-time archiving? :-|

http://www.pdfa.org/doku.php

What actually is… PDF/A?

To put it simply - PDF/A is an ISO Standard for using PDF format for the long-term archiving of electronic documents. And what does this actually mean? Read more here…

I also (knowing XHTML) have no idea what you mean by saying you now store your documents in XHTML for reading and printing. With what, XSLT & FOP? This site needs to allow you to post longer articles, cos by the time you get going, you are cut off.

— [email address given to this form is a /dev/nuller] http://grandscheme.org

Posted by Eric Blair - May 13, 2009 | 12:26 AM