Semantic code: What? Why? How?
Posted in Tech/Development on: Tuesday, November 29, 2005 by Paul Boag
Web designers like to throw around a lot of jargon that can prove very confusing for those who have to work with them. With that in mind, over the coming weeks, I want to focus on the more popular techno babble and try to dispel some of the mystery. First up: semantic code.
What is semantic code?
Even if you are not a web designer, you are probably aware that your site has been written in HTML. HTML was originally intended as a means of describing the content of a document, not as a means to make it appear visually pleasing. Semantic code returns to this original concept and encourages web designers to write code that describes the content rather than how that content should look. For example, the title of a page could be coded like this:
<font size="6"><b>This is the page title</b></font>
This would make the title large and bold, giving it the appearance of a page title, but nothing in the code describes it as a title. As a result, a computer cannot identify it as the page title.
To write the same title semantically so that a computer understands that this is a title, you would use the following code:
<h1>This is the page title</h1>
The appearance of your heading can then be defined in a separate file called a "cascading style sheet" without interfering with your descriptive (semantic) HTML code.
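As a minimal sketch of how that separation works, the page below keeps the heading purely descriptive and puts its appearance in a style rule. The sizes chosen are only illustrative, and the rule is embedded in the page here to keep the example self-contained; on a real site it would normally sit in a separate style sheet file.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Example page</title>
  <style>
    /* Presentation lives here, not in the markup. */
    /* The values are illustrative assumptions, not a recommendation. */
    h1 {
      font-size: 2em;
      font-weight: bold;
    }
  </style>
</head>
<body>
  <!-- The markup only says what this is: the main heading. -->
  <h1>This is the page title</h1>
</body>
</html>
```

The heading still looks large and bold in a visual browser, but the code now states that it is a heading.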
Why is semantic code important?
I have already hinted at one reason why semantic code is important: unless the code explains what a piece of content is, a computer has no way of identifying it. A computer's ability to understand your content matters for a number of reasons:
- Many visually impaired people rely on speech browsers to read pages back to them. These programs cannot interpret pages very well unless the content is clearly described. In other words, semantic code aids accessibility.
- Search engines need to understand what your content is about in order to rank your site properly. Semantic code tends to improve your placement in search results, as it is easier for the "search engine spiders" to understand.
However, semantic code has other benefits too:
- As you can see from the example above, semantic code is shorter and so downloads faster.
- Semantic code makes site updates easier because you can apply design style to headings across an entire site instead of on a per page basis.
- Semantic code is easier for people to understand too, so a new web designer who picks up the code can learn it much faster.
- Because semantic code does not contain design elements, it is possible to change the look and feel of your site without recoding all of the HTML.
- Once again, because design is held separately from your content, semantic code allows anybody to add or edit pages without having to have a good eye for design. You simply describe the content and the cascading style sheet defines what that content looks like.
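To make the last few points concrete, here is a sketch of what a page on such a site might look like. The file name site.css and the page content are hypothetical; the assumption is that every page links to the same style sheet.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Our services</title>
  <!-- Hypothetical shared style sheet, linked from every page on the site. -->
  <link rel="stylesheet" href="site.css">
</head>
<body>
  <!-- Each element describes what the content is, not how it looks. -->
  <h1>Our services</h1>
  <p>We offer the following services:</p>
  <ul>
    <li>Design</li>
    <li>Development</li>
  </ul>
</body>
</html>
```

Nothing in this page mentions fonts, colours or layout, so redesigning the site would mean editing site.css rather than every page, and whoever adds a new page only has to describe its content.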
How do you ensure a site uses semantic code?
There is no tool that can check for semantic code. It is a matter of looking at the code and seeing if it refers to colours, fonts or layout instead of describing what the content is. If looking at code all sounds a bit too scary then a good place to start is by asking your web designer if he codes semantically. If he looks at you blankly or starts waffling, you can be sure he does not. At that point you need to decide if you wish to pressure him into getting up to speed or if you want to find yourself a new designer!
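As a rough guide to what to look for, the sketch below shows the same content written both ways. The content itself is made up for illustration. Elements such as font and attributes such as size, color and face describe appearance, so seeing a lot of them is a warning sign.

```html
<!-- Presentational: the code only describes how the content looks. -->
<font size="5" color="#cc0000"><b>Latest news</b></font><br><br>
<font face="Arial">Our new site is now live.</font>

<!-- Semantic: the code describes what the content is. -->
<h2>Latest news</h2>
<p>Our new site is now live.</p>
```

In the second version the colour, size and typeface would be set in the style sheet instead.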









12 Comments
Excellent article and reasons to use semantic code, Paul. Also note that semantic code is the underlying theme of XHTML, which if used correctly, future-proofs your site, and makes it more accessible to other devices.
Yes, this is a very simple explanation, which even clients can understand.
Great article!
I have a question that has been in my head for a couple of weeks. The question is related to XHTML and semantics, of course. Why does the XHTML 1.0 reference contain tags like <b> and <i> while <u> is deprecated? From what I understand, <b> and <i> should be deprecated too, since they define the look and do not have semantic content (?). Can you please explain what I am missing here?
@konou I’ve been wondering about similar things myself. It’s true that <b> and <i> are deprecated in favor of <strong> and <em>, but I’ve been wondering if it makes any sense to even include those, since they are also presentational. Am I wrong? I would think that this…
<p>I am not <em>that</em> good at this!</p>
could rather be..
<p>I am not that good at this!</p>
with the CSS defining emphasis as { font-style: italic; }
Is there something I haven’t understood?
Aalaap -
<p>I am not <em>that</em> good at this!</p>
<p>could rather be..</p>
<p>I am not that good at this!</p>
<p>with the CSS defining emphasis as { font-style: italic; }</p>
Your change would not give you the ability to style only the word “that” in italics the way the original example does.
Besides, emphasis is semantic! You are emphasising a certain word or phrase as especially important. Visual browsers simply indicate those emphasised words with italics. Aural browsers could change the voice inflection, and textual browsers could indicate emphasis another way. My point is that emphasis (<em>) by itself does not connote display, while italics (<i>) does. You could always put in emphasis tags and then style emphasised text with bold small-caps if you wanted.
Really great article, thanks for sharing it. I have written a similar article and linked to yours on: http://www.nellen.co.za/Articles/The-web-and-Usability-also-known-as-User-Friendliness/
I have recently found a W3C tool named “Semantic Data Extractor” which I think you might be interested in. This tool checks your document and creates an outline. You can see most of your mistakes (I had many!). Here is the link: http://www.w3.org/2003/12/semantic-extractor.html
Thank you for publishing this informative article.
Another related article:
http://www.analoga.com.uy/en/articles/html-based-copywriting.html
This is an amazing article. It answered all the doubts I ever had regarding semantic HTML, and the explanation makes the meaning of the term crystal clear. I have saved this page locally and will refer to it again and again. Great effort.
Very nice, good article on semantic code.
Dear all,
I’m coming back with an answer to my own question. I think that rlively already explained it.
The question was why <b> and <i> are not deprecated while <u> is?
The answer is: because <b> and <i> both define semantic meaning while <u> doesn’t.
<b> : defines a piece of text which is of some importance (like <strong>).
<i> : defines a piece of text of some emphasis (like <em>).
— — —-
However, I still think that <b> and <i> define feel and look since they refer to bold and italics which is clearly a font attribute and not a semantic reference. The semantic meaning is clearly defined by <strong> and <em>. Bold (<b>) defines the font type while <strong> defines that the text is of some importance (and the browser renders it as bold – what if tomorrow this rendering changed to red/bold, then <b> and <i> would not have an equal result).
So, wouldn’t it be much clearer if <b> and <i> were deprecated, leaving only <strong> and <em>?