
A podcast for those who design, develop and run websites.

Boagworld is the blog of web strategist Paul Boag who lives in the heart of rural Dorset (hence the cows). He produces a weekly podcast with UX consultant Marcus Lillington on building and running websites. They also run the web design agency Headscape.

Semantic code: What? Why? How?

Posted in Tech/Development on: Tuesday, November 29, 2005 by Paul Boag

Web designers like to throw around a lot of jargon that can prove very confusing for those who have to work with them. With that in mind, over the coming weeks, I want to focus on the more popular techno babble and try to dispel some of the mystery. First up: semantic code.

What is semantic code?

Even if you are not a web designer, you are probably aware that your site has been written in HTML. HTML was originally intended as a means of describing the content of a document, not as a means to make it appear visually pleasing. Semantic code returns to this original concept and encourages web designers to write code that describes the content rather than how that content should look. For example, the title of a page could be coded like this:

<font size="6"><b>This is the page title</b></font>

This would make the title large and bold giving it the appearance of a page title, but there is nothing that describes it as a title in the code. This means a computer is unable to identify this as being the page title.

To write the same title semantically so that a computer understands that this is a title, you would use the following code:

<h1>This is the page title</h1>

The appearance of your heading can then be defined in a separate file called a "cascading style sheet" without interfering with your descriptive (semantic) HTML code.
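
To make "a computer can identify the title" a little more concrete, here is a minimal Python sketch using the standard library's html.parser module. The names HeadingFinder and find_headings are invented for this illustration, and real search engines and speech browsers are far more sophisticated. It simply asks each snippet, "which parts of you are headings?"

```python
from html.parser import HTMLParser


class HeadingFinder(HTMLParser):
    """Collects the text found inside <h1> to <h6> elements."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []     # text of every heading found so far
        self._current = None   # text pieces of the heading being read, if any

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._current = []

    def handle_data(self, data):
        # Only keep text while we are inside a heading element.
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._current is not None:
            self.headings.append("".join(self._current).strip())
            self._current = None


def find_headings(html):
    """Return the text of every heading element in the given HTML."""
    finder = HeadingFinder()
    finder.feed(html)
    finder.close()
    return finder.headings


# A presentational and a semantic version of the same title.
presentational = '<font size="6"><b>This is the page title</b></font>'
semantic = "<h1>This is the page title</h1>"

print(find_headings(presentational))  # nothing is marked up as a heading
print(find_headings(semantic))        # the title is found
```

The first snippet looks like a title to a human but gives the program nothing to find. The second may look identical on screen, yet the program can pick the title out straight away.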

Why is semantic code important?

I have already hinted at one reason why semantic code is important: unless you explain what a piece of content is, a computer has no way of identifying it. A computer's ability to understand your content matters for a number of reasons:

  • Many visually impaired people rely on speech browsers to read pages back to them. These programs cannot interpret pages very well unless the content is clearly described. In other words, semantic code aids accessibility.
  • Search engines need to understand what your content is about in order to rank you properly. Semantic code tends to improve your placement on search engines, as it is easier for the "search engine spiders" to understand.

However, semantic code has other benefits too:

  • As you can see from the example above, semantic code is shorter and so downloads faster.
  • Semantic code makes site updates easier because you can apply style to headings across an entire site instead of on a per page basis.
  • Semantic code is easier for people to understand too, so if a new web designer picks up the code they can learn it much faster.
  • Because semantic code does not contain design elements it is possible to change the look and feel of your site without recoding all of the HTML.
  • Once again, because design is held separately from your content, semantic code allows anybody to add or edit pages without having to have a good eye for design. You simply describe the content and the cascading style sheet defines what that content looks like.

How do you ensure a site uses semantic code?

There is no tool that can check for semantic code. It is a matter of looking at the code and seeing if it refers to colours, fonts or layout instead of describing what the content is. If looking at code all sounds a bit too scary, then a good place to start is by asking your web designer if he codes semantically. If he looks at you blankly or starts waffling, you can be sure he does not. At that point you need to decide if you wish to pressure him into getting up to speed or if you want to find yourself a new designer!
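
A script cannot judge whether markup really describes the content, but it can do a rough first pass of the "look for colours and fonts" check described above. The sketch below is only that: a hedged illustration in Python. The lists of suspicious tags and attributes are my own assumptions rather than any official standard, and the names PresentationChecker and check_markup are made up for the example.

```python
from html.parser import HTMLParser

# Tags and attributes that describe appearance rather than meaning.
# Both sets are illustrative assumptions, not a complete or official list.
PRESENTATIONAL_TAGS = {"font", "center", "b", "i", "u"}
PRESENTATIONAL_ATTRS = {"color", "bgcolor", "face", "align", "style"}


class PresentationChecker(HTMLParser):
    """Records tags and attributes that hint at presentational markup."""

    def __init__(self):
        super().__init__()
        self.warnings = []

    def handle_starttag(self, tag, attrs):
        line, _column = self.getpos()  # position of the tag in the source
        if tag in PRESENTATIONAL_TAGS:
            self.warnings.append(f"line {line}: <{tag}> describes appearance")
        for name, _value in attrs:
            if name in PRESENTATIONAL_ATTRS:
                self.warnings.append(
                    f'line {line}: {name}="..." on <{tag}> describes appearance'
                )


def check_markup(html):
    """Return a list of warnings about presentational markup in the HTML."""
    checker = PresentationChecker()
    checker.feed(html)
    checker.close()
    return checker.warnings


page = '<h1>Title</h1>\n<p><font color="red">Sale!</font></p>'
for warning in check_markup(page):
    print(warning)
```

An empty list of warnings does not prove a page is semantic. It only means none of these particular warning signs turned up, so a look at the code by someone who knows what they are doing is still worthwhile.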

14 Comments

  • Dennis says:

    Excellent article and reasons to use semantic code, Paul. Also note that semantic code is the underlying theme of XHTML, which if used correctly, future-proofs your site, and makes it more accessible to other devices.

  • Nemanja says:

    Yes, this is a very simple explanation, which even clients can understand.
    Great article!

  • konou says:

    I have a question that has been in my head for a couple of weeks. The question is related to XHTML and semantics, of course. Why does the XHTML 1.0 Reference contain tags like <b> and <i> while <u> is deprecated? From what I understand, <b> and <i> should be deprecated too, since they define the look and have no semantic content (?). Can you please explain what I am missing here?

  • Aalaap says:

    @konou I’ve been wondering about similar things myself. It’s true that <b> and <i> are deprecated in favor of <strong> and <em>, but I’ve been wondering if it makes any sense to even include those, since they are also presentational. Am I wrong? I would think that this…
    I am not <em>that</em> good at this!
    could rather be..
    I am not that good at this!
    with the CSS defining emphasis as { font-style: italic; }
    Is there something I haven’t understood?

  • rlively says:

    Aalaap -
    I am not <em>that</em> good at this!
    could rather be..
    I am not that good at this!
    with the CSS defining emphasis as { font-style: italic; }
    Your change would not give you the ability to style only the word “that” in italics the way the original example does.
    Besides, emphasis is semantic! You are emphasising a certain word or phrase as especially important. Visual browsers simply indicate those emphasised words with italics. Aural browsers could change the voice inflection, and textual browsers could indicate emphasis another way. My point is that emphasis (<em>) by itself does not connote display, while italics (<i>) does. You could always put in emphasis tags and then style emphasised text with bold small-caps if you wanted.

  • Adriaan says:

    Really great article, thanks for sharing it. I have written a similar article and linked to yours on: http://www.nellen.co.za/Articles/The-web-and-Usability-also-known-as-User-Friendliness/

  • I have recently found a W3C tool named “Semantic Data Extractor” which I think you might be interested in. This tool checks your document and creates an outline. You can see most of your mistakes (I had many!). Here is the link: http://www.w3.org/2003/12/semantic-extractor.html

  • Michel says:

    Thank you for publishing this informative article.
    Another related article:
    http://www.analoga.com.uy/en/articles/html-based-copywriting.html

  • suraj says:

    this is an amazing article, answered all my doubts i ever had regarding semantic html…the word & the explanation throws a crystal clear meaning about it..i have saved this page on my local & will refer to it again & again.. great efforts

  • Mohd Kashif says:

    Very nice, good article on semantic code.

  • konou says:

    CORRECTED:
    Dear all,
    I’m coming back with an answer to my own question. I think that rlively already explained it.
    The question was why <b> and <i> are not deprecated while <u> is?
    The answer is: because <b> and <i> both define semantic meaning while <u> doesn’t.
    <b> : defines a piece of text which is of some importance (like <strong>).
    <i> : defines a piece of text of some emphasis (like <em>).
    — — —-
    However, I still think that <b> and <i> define feel and look since they refer to bold and italics which is clearly a font attribute and not a semantic reference. The semantic meaning is clearly defined by <strong> and <em>. Bold (<b>) defines the font type while <strong> defines that the text is of some importance (and the browser renders it as bold – what if tomorrow this rendering changed to red/bold, then <b> and <i> would not have an equal result).
    So, wouldn’t it be much clearer if <b> and <i> were deprecated, leaving only <strong> and <em>?

  • Lex Fitz says:

    A client sent me to this article, just now. “Uhuh, that’s what it means to code semantically!” Thank you Paul, you have raised my stocks.

    I’m somewhat of a “semantic code Evangelist” and it can be a pain to always have to explain why semantic code is so important. I will link to this article in an upcoming post on my blog.

    When looking at source code for reference to colors and fonts etc, remember to search for “table” too. Tables are fine for tabular data, but if they’re used for anything else, it is not semantic. It is still possible, and even likely, there is other non-semantic code on your site, despite having passed this test. But it is a great acid test nonetheless.

    It is still fairly common for web developers to fall back on tables, because they’re easy to use for otherwise complex layout problems. It’s not always out of laziness, though; sometimes they just don’t realize what is and isn’t tabular data. And after a decade where all web data was widely viewed as tabular, that’s not really surprising.

    A few days ago I published a related article on my blog at http://www.webdesignfront.com/rants/tableless-css-image-gallery-bulletproof-ie6-browser-support/ about semantically coded image galleries. It reminded me of what I think are the two biggest barriers to writing semantic code. One is how to properly clear floats, and the other is how to deal with Internet Explorer bugs.

    I wrote it because when I recently built a gallery for a client’s new site, I actually caught myself thinking, “Aren’t gallery images really just tabular data anyway?” Of course I laughed at the notion, but it was a reminder how insidious that 1998 coder mentality can be, even years after having crossed to the bright side. So as usual, I coded the gallery semantically as a list, because image galleries are just a list of images. Then I styled the list to look like a gallery. Remembering there are a couple of major “gotchas” when doing this properly, and having resolved them, I figured I’d write and share an article on how to do it right.

    I hope this will help somebody keep up the good fight. Thanks again for the article, Paul.

    Best regards,
    Lex Fitz

    P.S. I have subscribed by RSS and will stop by again.

  • It all sounds like semantics to me!!
