184. HTML5

On this week’s show: We interview Jeremy Keith about the truth of HTML5 and Ryan Carson shares some more advice about building your own web application.

Play

Download this show.

Launch our podcast player

News

Apple and UI design

Whether you are a fan of the Mac OS or Windows, there is no denying that Apple know what they are doing when it comes to user experience design. There is a lot that can be learnt from them whether you are developing a piece of software or even a website.

This week I have come across two sites which perfectly demonstrates just how deep Apple’s knowledge of UI design goes. The first is Finer Things in Mac, which is a site dedicated to highlighting the small details in the mac OS UI which makes it shine. Examples include:

  • The naming of buttons so they are more descriptive than “Okay” or “Cancel”
  • The way Time machine changes the clock in the menu bar as you go back in time
  • The accuracy of progress bars when compared to windows

The second area that demonstrates Apple’s expertise is in how they have chosen to tackle accessibility on the iPhone 3.1 interface. It is hard to imagine that a touchscreen device with only one physical button could ever be accessible to blind users. However, it would appear they have made it happen.

Marco, a totally blind user who works with Mozilla wrote in his post My first experience using an accessible touch screen device:

I must say this was an amazing experience! My fingers definitely need to get used to gestures such as flicking or tapping, or using a rotor. But having an iPod Nano 4th generation helped with that, since moving the finger over the screen like on a dialer felt most like tracking around the iPod’s click wheel. Even the sound the rotor makes is the same.

This amazing result is made possible by Apple’s VoiceOver technology which is extremely hard to describe. However Apple do provide a video that demonstrates the technology in action and I highly recommend you watch it.

Do you need cake if the icing is amazing?

On last week’s show I talked about 3 Ways to Stand Out From the Crowd. One of the things I mentioned as part of that post was Paul Annett’s talk at SXSW called Oh… that’s clever. In this talk he looked at ways to add design delights that excite and amuse users.

Although I am a massive fan of this approach, it can be taken too far. A Sitepoint article entitled: Do You Need Cake if the Icing is Amazing? looks at one example of where this happened.

The article is about the HEMA website where as the author describes it:

The site renders as what appears to be a garden-variety, IKEA-like online store: navigation tabs along the top and popular products displayed in a grid. Yawn. yawn.

That’s when reality seems to break, and strange and wonderful stuff start to happen.

It all begins when a plastic cup tumbles over, bumps the sticky tape, and a domino effect is set in motion. The tape then crashes onto the stapler before squishing the cake, noisily sliding down the xylophone, and knocking over the fluorescent pens like skittles.

This chain of slapstick events continues, drawing ironing boards, blenders, yo-yos, coat hangers, and kettles into the growing maelstrom before eventually breaking out into parts of the site navigation and text.

By the time this sequence of events has all played out, the tabs are torn and frayed, the navigation text has collapsed into a puddle, and confetti flutters about from above. Very, very cute.

The post is an interesting analysis of the pros and cons of this kind of approach. It concludes by saying:

But as sublimely clever as the animation is, I have to wonder if this project, and the buzz it created, has translated into anything particularly useful for the HEMA business.

What’s more, I wonder how many users have ended up feeling disappointed, frustrated, or confused by being unable to find some of the “bread and butter” basics like locating a store, giving feedback, or asking a question.

I have to say I agree. Although adding design delights and easter eggs is nice, it is easy to overstep the line and end up damaging usability and accessibility.

GIT: Your new best friend

If you develop websites as part of a team you probably use a version control system. We do at Headscape and it seems to be common practice. At the most basic level a version control system allows you to manage files, prevent overwriting by multiple contributors and allow rollback to previous versions of an entire site or a particular file.

The dominant player in this market is Subversion but there is a new kid on the block which seems to be getting a lot of press.

Git is a free & open source, distributed version control system designed to handle everything from small to very large projects. Every Git clone is a full-fledged repository with complete history and full revision tracking capabilities, not dependent on network access or a central server.

This distributed approach has a number of obvious benefits, but it would appear this is just the tip of the iceberg. Think Vitamin has published a post entitled “Why You Should Switch from Subversion to Git” which explains the advantages of Git in full.

There is also a detailed introduction to Git on Sitepoint entitled “Git: Your New Best Friend“.

I freely confess that I know nothing about version control systems (that is Dave and Craig’s areas of expertise). However, I have seen a lot of love for Git online and if you are a developer this is almost certainly worth checking out.

Project management and simplicity

Building and running a successful website is about a lot more than having a great design team. It is also about having a visionary website owner who can think creatively and manage projects successfully.

However, being able to manage projects and have time to think creatively about your site can be a challenging combination. Life is simply too hectic.

Fortunately there are two posts this week that might go someway to inspiring and organising you better.

The first is “How Simplicity Can Help Creativity“. It looks at how simple changes to your routine can help grow your creativity. It contains good solid advice, although admittedly some of the suggestions are primarily aimed at writers.

The second is a Smashing Magazine article entitled “How to Find Time For Everything!” This is essentially a productivity post. However, it is particularly aimed at web workers. It covers everything from your working environment to prioritising and planning.

Both posts are worth a read if you find yourself constantly caught up in the day to day running of your website and never get the opportunity to think more strategically.

Back to top

Interview: Jeremy Keith talks about HTML5

Paul: So we are doing our first interview of the day at dConstruct, and joining us, of course, is Jeremy Keith – hello Jeremy.

Jeremy: Hello.

Marcus: Such a happy hello.

Paul: It is, he’s a happy, jolly person; well known for his jolliness.

Jeremy: It’s true.

Paul: So, HTML 5.

Jeremy: Yes.

Paul: There has been so much talk about HTML 5 recently, we thought, as we’re going to be seeing you, as we’re going to be at dConstruct, we ought to grab you and talk about the subject of HTML 5. Now, I haven’t thought through in a huge amount of depth want I want to cover…

Marcus: Surely Jeremy in cartoon form is want you want to cover?

Paul: Yes, that’s a really good point.

Jeremy: I was really shocked you haven’t prepared.

Paul: Yeah, but actually I managed to watch the cartoon, or read the cartoon.

Jeremy: So that was about you’re level.

Paul: That was about my level, yeah, I felt happy there. So maybe just run through this fundamental issue that XHTML 2 has gone away; HTML 5 seems to be the new thing that all the cool kids like, so what’s going on there?

Jeremy: Okay, well first of all HTML 5 has been around for quite a while; they’ve been working on this, so nothing has particularly changed. Well, Google had their IO Summit thing a few months back, and they came out and said, “We totally support HTML 5, and we think it’s the future”, and suddenly a lot of people got very interested in HTML 5 who hadn’t been paying attention, which is a bit weird. It’s like it’s been going on for ages, but as soon as Google says it’s interesting then people are interested. So that was one thing that kind of sparked a bit of interest in HTML 5 which is kind of good to see. But then the whole XHTML 2 thing was just really bizarre because I hadn’t realised the misconceptions that a lot of plain old working web developers were under about the format. Basically, XHTML 2 has been dead for years, and I thought everyone understood this.

Paul: No.

Jeremy: Okay, so HTML 4.01 was finalised around 2001, and after that W3C said okay, HTML is done, and we think the future is XML based… all these various XML based technologies, and amongst that is XHTML. Now, we all rated XHTML 1, but all that was, was just HTML 4, really. I mean all it is, is just an XML version of HTML. Nothing new was added, it’s all the same elements; everything works pretty much the same. So, really, XHTML 1 is just HTML, and most people used it that way, they would serve it as HTML. This upset a lot of people who thought, “Oh no, you’re serving XML as HTML and that’s bad,” but most of us didn’t care, you know. Browsers rendered it, we were able to validate against it, and that was good. I mean that’s why I use XHTML 1. I serve it as text/html, I don’t care that browsers then treat it as HTML, I get the validator telling me if I forget to close a paragraph, or if I forget to quote an attribute; that’s useful to me as a web developer. But it’s not a completely different format to HTML, it’s just HTML redefined as XML.

XHTML 2 is what the W3C started working on after HTML 4 was finished. XHTML 2 was going to be a complete break; an utterly new format. I mean things like you wouldn’t have an image element, you would have ‘object’ to cover any kind of thing you would put on a page. All sorts of things that really, theoretically, were wonderfully pure and abstract, but practically, just no way anybody’s ever going to use this, because it wasn’t backwards compatible with all the content on the web today.

So there was this kind of split in this W3C around 2004. There was this one particular meeting where the W3C really put their foot down and said, “The future is XML and particularly, XHTML 2. We know it’s not backwards compatible, we don’t care, we’re going to plough forward with this.” And some people in the W3C said “That’s it, we can’t deal with this”, and split off into this separate splinter group (this was Opera and Mozilla at this point), and they called themselves the ‘WHAT Working Group’. Web Hypertext Application something… I don’t know. It’s clearly a ‘back-ronym’ right? They came up with the word ‘what’ and figured out words to go around it.

So, they start work on this thing which they call HTML 5. That’s kind of this buzzword umbrella term to cover one, the next version of the mark-up language we know as HTML, two, XForms which is a good thing that the W3C have been working on to make life better for web developers, but mostly this idea of what they were calling web applications one, which was the idea that we’re moving from documents to web apps, and that’s what real people working on the web really need; fill in the gaps in HTML… We’re working with HTML now to kind of force things to act like applications, we don’t have native things to build applications like, let’s be able to make applications natively so that we don’t have to rely on a plug-in like Flash, or Silverlight, or any of these other non-open technologies.

So that started in 2004. Meanwhile XHTML 2 is being developed at the W3C in the ivory tower where no-one is actually going to use it, and in 2006 Tim Berners Lee wrote a blog post and said, “Yeah we were wrong, we were wrong.”

Paul: Good for him.

Jeremy: Yeah he said, “Actually HTML is still good and is actually the future on the web,” and restarted a charter for the HTML working group. So in 2007, the W3C start work on HTML 5, as in the sequel to HTML 4.

Paul: Corr, this is complicated!

Jeremy: Working with the W… what, working with the WHAT working group?

Marcus: I need a coffee.

Jeremy: I know, I know.

So basically now you’ve got two groups working on this one spec. You’ve got the WHATWG which has a very open process, you’ve got the W3C which is generally more closed, although the HTML working group is more open than most W3C things. A lot of things with W3C happen behind closed doors. Actually with the HTML working group now, anybody can join; it’s a little bit convoluted how you join, but anybody can join.

Meanwhile XHTML 2 was still being developed, you know, down in the cellar they’re working on XHTML 2. But everyone knew, or I assume everyone knew, that it was like this kind of joke really – it wasn’t really intended for working web authors to use on the public web. So earlier this year, finally, 2009, W3C announced, “Oh buy the way, we’re not renewing the charter for XHTML 2″, and most of them went, yeah, no surprise there – really it’s been dead for years.

But a lot of people got really confused, “Oh no I have to change from XHTML to HTML now; I have to change all my mark-up back.” No, if you’re writing XHTML 1, that is basically HTML anyway, and HTML 4 and XHTML 1, those doctypes aren’t going anywhere, your documents are still perfectly good, you don’t have to change anything; everything’s absolutely fine. And what also happened was that some of these people, who were sort of traditionally, as I said, hated it when people like me would serve XHTML 1 as HTML, “Oh no the browser’s treated it as HTML”… They started gloating and saying, “Ah, this is great, this proves that the problem is XML and XHTML in particular, and everyone who ever used XHTML was a fool”, more or less… which really bugged, because like I said, Gareth Rushgrove write a great blog post about this. The reason I use XHTML is to make the validator more powerful, it catches those little things. Basically, it’s like a best practices coding style. So in the same way you’ve got things like JSLint that catches JavaScript coding practices, even if your JavaScript is perfectly valid and will execute, this thing will catch little things which aren’t best practices. That’s what I use the validator for when I write XHTML – it catches little things that wouldn’t be caught if I was writing HTML. So even though I think that’s the reason most developers use XHTML, these people who really looked down on XHTML from the start were like, “Hah-hah, we won!” and were gloating now, and that kicked off a whole shit storm. I’m sorry I shouldn’t say that!

Paul: That’s okay.

Marcus: You can say what you like!

Jeremy: Yeah, so that needed clearing up and a bunch of us were like, “Okay, this is the story and this is what it’s like with XHMTL 2. I know it sounds like XHTML 1, but actually they’re different as chalk and cheese.” I wrote a blog post, and that was okay, and this guy Brad turned it into a comic, and then everyone read it. Then they were, “Oh now I get it – cartoon Jeremy says it’s like cheese, so now I understand it.” No seriously, it did make a big difference, it was more accessible…

Marcus: You must have been happy with ham and hamster…

Jeremy: That I did like; that got quoted on Twitter a lot.

Marcus: Fantastic.

Paul: So this leaves you writing what now? Are you still writing XHTML?

Jeremy: Let’s take sites I’ve built already, so my own personal site, adactio.com, that’s XHTML 1, it has been since 2001, I’m not going to change that. Actually I’ve made a vow not to change my mark-up anyway. Because I kind of assume when people redesign their sites, they change their CSS and they change their mark-up, and I think…

Paul: That kind of destroys the whole point of it, yeah.

Jeremy: So whenever I do a redesign, what I actually do is I add sort of a skin on top. So I’ve got a bunch of skins to my site, I kind of made a vow to myself that even if I’m doing a redesign and think it’ll be really useful to change the mark-up here, but that’s just me personally. Actually at work, and on other projects, I do now use HTML 5.

Paul: Right. Now how does that work, because everybody says, “Well, HTML 5 isn’t supported.”

Jeremy: It’s not ready yet.

Paul: No, exactly.

Jeremy: 2022…

Paul: Yeah, yeah – all of that kind of stuff.

Jeremy: Here’s the thing. If you want to write HTML 5, you take whatever doctype you’re using today, XHTML 1, HTML 4, HTML 3.2, and you change that doctype to ‘doctype HTML’, and there you go you’re writing HTML 5.

Paul: Okay!

Jeremy: So most of HTML 5 is HTML 4. The vast, vast majority of it is exactly the same language, and yet even that is really, really ambitious because one of the things they’re doing is they’re going to define error handling for HTML. Every version of HTML before this, 3.2, 4… All of these previous version, they just defined here’s what authors can do, you’ve got this element, you’ve got that element, boom – the end.

They never defined what happens when you do something wrong, or what a browser should do when it encounters mis-nested tags, things like that. So the browsers have had to kind of invent it, and what happens is that most browsers… Talk to a browser maker and they spend fifty percent of their time figuring out how does Internet Explorer handle this error, and recreating that. A huge amount of time is wasted on this. Or, whatever the leading browser maker is today, how does that browser handle errors? We need to copy that and reverse engineer it. The spec doesn’t tell them how to handle errors. So HTML 5, one of the ambitions is to take HTML 4 and define error handling for everything in it. That alone is massive; massively ambitious. And that alone would take a long time.

In addition, there’s new stuff. So you’ve got this web form stuff, you’ve got new types of inputs, user agents will be able to give us, you know, calendars and sliders and all this stuff, really cool. You’ve got some new structural elements, you’ve got some new rich media stuff like audio and video, you’ve got canvas… So that’s on top of the error handling. So, yes HTML 5 has a lot of new stuff that, frankly, you can’t use today, although a lot of it’s quite well supported, obviously not Internet Explorer, but in a lot browsers, you can use stuff like audio and video, and canvas and stuff like that.

You can’t use all of HTML 5 today, that’s true. But we probably don’t use all of HTML 4 today. For example, if you say, “I’m not going to use a mark-up spec if all of it isn’t implemented in the major browser vendor,” which would be like Internet Explorer, you say, “I’m not going to use a mark-up spec unless Internet Explorer fully implements it.” And yet you would have been using HTML 4 for years. Now, Internet Explorer didn’t fully implement HTML 4 until Internet Explorer 8 when they added support for the abbreviation element.

Paul: Right, okay.

Jeremy: Internet Explorer 7, Internet Explorer 6, Internet Explorer 5, didn’t support HTML 4, if you define support as every single thing in the spec. And you’ve been using CSS 2.1 for years. There isn’t a browser out there… Actually Internet Explorer is the first browser to fully support CSS 2.1, although you could quibble on some stuff, and yet we’ve been using it for years. Because if, ideally, you have to wait until something is fully support until you use it, it’s kind of silly, you just support features. You decide this feature is supported enough that I can use it. Enough browsers support video or canvas that I can use it today. So when people say, “I can’t use it because it’s not ready or it’s not support,” that’s too binary; it’s too simplistic a way of looking at it, because all the technologies we use today, CSS, HTML, are partially support in different browsers – that’s just the way it is.

Paul: Let’s talk about the new structural elements that have been added in to HTML 5. So you’ve got header and footer, and these various other things. I mean obviously you can’t use those as is at the moment because the support isn’t wide enough, is that correct or incorrect?

Jeremy: Well, support’s actually pretty wide in a lot of browsers. What a lot of browsers will do… It’s not so much that they’re supporting these elements, it’s just that they allow you to use arbitrary elements. So in Firefox and Safari, I could write a tag called ‘foo’, and in my style sheets I could say ‘foo, display block, colour red’ and it would work. So if you define working, or support as that, then actually yeah in Firefox and Safari and Opera you can use these new elements today.

Internet Explorer doesn’t have that, Internet Explorer doesn’t let you use arbitrary elements, but there’s a little JavaScript hack you can use to make Internet Explorer behave. In fact we’ve been using this for years with the abbreviation element. If I wanted to use abbreviation and style it in Internet Explorer, in JavaScript I would say ‘document.createElement(‘abbr’)', and suddenly I could style it. It doesn’t make any sense, but then again whatever does with Internet Explorer. So we use the same hack to say create all these new elements.

Paul: Okay, so are you using them at the moment?

Jeremy: No. What I’m doing is, I’m trying to get my head around how these structural concepts work, but I’m not ready to make the move to new elements, and I’m not ready to do client side work that relies on JavaScript, which is what I’d be doing if I used this JavaScript hack. So what I do is, I take the concept, I take the names – header, footer, section – and I use them as my class names. So I’ll have div class equals section, div class equals header, div class equals footer, and it’s partly so I can get my head round how these things work together, and also, it allows me to basically have self documenting code in that when I’m handing it off to developers that are back end developers who are building a system, I often give templates – HTML templates. And I would usually have to write out this is how this class works, and I’d have to comment everything and say the class of ‘foobar’ is used for this kind of content. And now I can kind of say, just link to the HTML 5 spec; header is for this, footer is for that, section is for this, and okay I’m using classes not elements, but the definitions…

Paul: Are the same.

Jeremy: Are the same. So there’s benefit for me getting my head round this new stuff and sort of getting a jump up on what we’re going to be using in the future, and it’s a benefit for documentation because it’s a great big spec documenting all this stuff. So I’m using them but not really in day to day work.

Paul: Because there’s some debate, is there not, about these pre-defined elements, you know, things like footer I mean. I was reading Zeldman about, well footer isn’t what you expect it to be. So potentially, as this is still in draft, you might be using footer in what will ultimately end up being the wrong way because they might change the way footer works and then you’re going to be out of sync with things.

Jeremy: Yeah, but it’s by using these things that you get to know these problems, and flag them up and bring them to the working group. In Zeldman’s thing, he said… So Zeldman asked a bunch of us to get together with him in New York about two weeks ago, because basically we’re all working web developers, we’re all interested in HTML 5, but we’re kind of sceptical of it, so like we’d be hearing conflicting things people say, there’s a lot of rumours going round, we don’t know what to believe, let’s all get together in one room and go through the spec.

We spent two days literally just going through the spec, and figured out how it affects us. And we’re not interested in the browser features, and the DOM stuff and the APIs, we’re interested in the structural stuff; we spent most of the time talking about this and the semantic meanings of these new elements, and that’s where we came against these issues. Like, wait a minute, the content model for what they call footer is totally different from what any working web developer would call a footer. So in our element, footer, you can’t put a nav element, and you can’t use headings, you can’t use H1s or H2s. But you ask a working web developer what footer is, and they point you to Flickr, or their blog they have, you know, “Oh a footer’s where I pull in my pictures from Flickr, and I have other navigation, and stuff like that.” So instead you have a flat footer which has kind of got a bit more popular in the last few years. So there’s a clash there in the semantic meaning of footer in English for most working web developers, and the semantic meaning of footer the element, and that’s something that came out of that meeting, we realised okay, this is an issue. That’s one of the things that we flagged up.

And there’s a few things like that, that when you really sit down and look you realise hang on, these two elements sound really, really similar, why are there two of them? So there’s a section and an article… For historical reasons I can see why they’re different, but they’ve actually evolved over time now to converge and get really similar. So we found a bunch of these things as authors we realised there were issues with, and we’re bringing them to the working group, we’re flagging them up, we’re publishing them, we’re blogging about it. And I think the reason this hasn’t really come to light before, is that most of the people on the list, and in IRC and the WHAT Working Group, they’re kind of hard core developers making web apps. These are the people building Google Wave, and building these big, big apps, and they’re not really thinking so much about the semantic meaning of documents. And also, there’s the browser makers, which is good – you want browser makers involved in the spec so that you know this stuff’s going to be implemented…

Paul: Yeah, of course.

Jeremy: As Hicksy keeps saying, you know, if one browser maker says I won’t implement this feature, that feature comes out of the spec, because otherwise the spec is just fiction, and it wants to be practical. And of course, that means the spec won’t be the best possible spec it could be, but it will be the best possible spec it can be and be implemented…

Paul: …Realistically.

Jeremy: Exactly, realistically. So, there hasn’t been enough working web developers involved in the process, in my opinion. There’s a lot of programmers, a lot of browser makers, not enough just, you know, working web developers. So, recently, Zeldman made this effort, let’s all get together. I’d been researching it a lot, and so I was explaining stuff to them, they were telling me about how they felt about this stuff, there was Eric Meyer there who’s got a lot of history with working with specs, and Tantek was there and he knows about how to read these kind of specs. So together we had a really good group of people, and we were able to come to the consensus that we can all… I mean, we disagreed on some stuff, I’d like to have this feature, and I don’t care about that feature, but there was a core set of stuff we all agreed on that was in the spec: this was confusing, that needs to either change, be dropped, stuff like that, and that would be our concerns. And what I what I’ve been encouraging people to do on my blog, which has kind of turned into an HTML 5 blog…

Paul: Yeah it has, I’ve noticed this!

Jeremy: …Is to get you, the kind of people who would read my blog, to get involved in the process, to get involved in the WHAT WG, and I’ve seen it happening, it’s great. I’ve seen web developers going on the IRC channel asking really basic questions like, “I’m confused by this element, how am I supposed to use it?” And then these people who are writing the spec, having to explain in normal words how it’s supposed to work, and then realising hang on this might be a problem, if I can’t explain it well then maybe this element is going to be an issue, and things like footer are obviously a problem; it’s got to change.

Paul: I read some article which seemed very left field, which was basically saying why is the W3C actually defining things like nav, and footer, and article, and section and all the rest of it? Why can’t we make up our own tags? Which kind of almost brings you back to XML I guess.

Jeremy: Yeah, the idea of extensibility. I think it’s probably John Allsopp’s List-A-Part article you talking about.

Paul: It wasn’t actually, but yeah I was aware of that one as well.

Jeremy: So there’s basically two schools of thought. What you need to provide is a mechanism for extensibility that allows anybody to extend it, and this kind of is the W3C idea of RDF and RDFA, that potentially you could encode any data in the universe, it’s infinitely extendable, and any author can define a vocabulary. And then there’s the other school of thought which is you keep the extensibility deliberately limited, and deliberately centralised to a community, and that people have to cooperate to decide what extensibility is to be used. Now that basically comes from the micro-format school of thought, where you don’t try and encode everything in the universe. You look for what’s the most common use cases, what’s the minimum you need to allow people to encode those cases, and you quantify that. You say we’ll create an element for that because eighty percent of people are publishing it, but we’re not going to create an element for something that’s really fringe, and sort of left field.

And that’s the way the HTML 5 group have gone, it’s like we’re going to keep things scarce and controlled, and if we create a new element it’s for a really good reason; we’ve thought about it, and it’s actually going to help web developers. And if anything, I think they might have created too many, not too few, and it’s only a handful, there’s maybe like ten new elements, you know structural elements, and if anything I think so of them could go.

So there’s these two very different schools of thought, and I was reading a blog post from 2006, nothing to do with HTML 5, it was to do with this idea of namespaces in HTML, which is what you get in XML – allows namespaces to allow you to create your vocabulary, you can put anything in there. It said if namespaces had been allowed in HTML then during the browser wars in the late nineties when this browser was inventing this tag, and that browser was inventing that tag, and it was just this mess of stuff, if there was a way to infinitely extend HTML, it would have legitimised that. They would have been okay with that, and we couldn’t have complained. As it was, they had to sit down in the end and sit around the table and hash this stuff out, because we complained and said, “It’s a messed up landscape and you’ve got some browsers supporting some stuff and some browsers support another thing.”

So, because HTML is limited, you have a certain amount of interoperability, and so I’ve come around to that point of view. That actually I can see the case for extending HTML and we have some mechanisms, we have the class attribute, a fairly limited extensibility thing, and I can see why for your own personal needs you might want to extend HTML, but I do actually think there’s a benefit to having a community deciding what’s the best elements, as long as that community is listening to all concerns, and that includes authors. My concern is that the community isn’t getting enough input from working web developers, but I see that changing. So I’m actually pretty reassured.

Paul: Okay, that makes a lot more sense. There’s one other question that I want to ask before we wrap things up, which is this canvas element. There seems to be a lot of excitement about canvas, but very little descriptions about what it does and why it’s exciting that is accessible to a lot of people, and understandable by a lot of people. Can you kind of summarise the canvas?

Jeremy: The canvas is a dynamic image.

Paul: Right.

Jeremy: The image element is static, and the canvas element is dynamic over time, and it can be interactive. So basically I believe you can have interactive graphs, you can have moving things like animation, you can draw on it and define movements… So it can do a lot of stuff that say, you know, Flash 1 could do, all this basic sort of stuff, and it is very exciting that where you would have needed a plug-in, you can now write an element and write some script, and you don’t need a powerful piece of software to write this stuff, you just need a text editor and everything’s cool.

So that’s good, and it’s already implemented in a lot of browsers; Safari, Firefox – they’ve got canvas, and people have already done really exciting things. John Resic has ported the processing language into JavaScript, processing.js, that uses canvas; it’s amazing. You’ve got this ball bouncing, and lines going, and generative art; all this amazing stuff – no plug-in required, it’s all native to the browser. That’s great. So the spec that defines canvas is for interactive or dynamic images. It is not for text. The spec doesn’t ever say… in fact it will forbid you from using text. But because it can, people have started putting text in there and then dynamically…

Paul: Because doesn’t Google Wave use it?

Jeremy: Err, I’m not sure about Google Wave, I haven’t really checked that out…

Paul: Perhaps I’m getting confused…

Jeremy: But Google is certainly experimenting with canvas with some things. But there’re things like… Have you seen Bespin? It’s this Mozilla project which is basically a text editor in your browser, or a code editor in your browser, and it’s all built using canvas. It’s incredibly powerful, it’s really impressive. But, those bits of text that are in canvas are just shapes to the DOM, there is no DOM saying this is a string, this is an element, it’s just there are some vector shapes, because canvas is just vector. So any piece of assistive technology just sees a bunch of… like there’s an image here of some kind. And you can find some fallback to describe the image, this is a chart showing blah blah blah, but if you’re putting a text editor in there, there’s no way they can interact with it, it’s just impossible.

Now, you could say, “But that’s not the point,” and the spec should maybe say you don’t use it for this, but people are going to use it for that because you can. So you say, okay, then the challenge is, if people are going to use it for that, even if they should be using a different technology like SVG or Flash, if people are going to do that how do we make it accessible, and that’s where there’s a lot of work going on. There’s people at Apple working on this, James Craig, who’s a really smart guy, he’s working on this idea of a shadow DOM that’s in canvas, so there is work on that. I personally think it’s such a big issue, it’s such a big thing, it’s such a powerful thing, I think it might benefit from being split off into its own spec.

Paul: Oh, okay.

Jeremy: Which has already happened with some HTML 5 stuff. Things like storage and client-side database, there’s all this powerful stuff. A lot of that ends up being spun off into a separate spec because it’s like it’s getting too unwieldy for the mark-ups.

Paul: Yeah, yeah.

Jeremy: And I kind of have this suspicion that this might be the case with canvas, because otherwise it may hold up the rest of the spec, and we’ve seen this happen with CSS 3. There were certain parts of the modules that were really important to solve for internationalisation reasons like text modules, but because that one part of CSS 3 was hard to solve it held up all this other stuff like multiple background images and border radius, and it held the hold thing back, and I would rather see things split off and worked on separately than be part of a spec and holding it back. So we’ll see; canvas is a big, big issue.

Paul: So you wouldn’t be doing a huge amount yourself in Clear Left, doing huge amount with canvas at the moment?

Jeremy: No, but that’s not the kind of work we do, it doesn’t really require…

Paul: But then the kind of work you do is the kind of work that a lot of the people listening to this will be doing. Rather than big fancy web apps.

Jeremy: Yeah, what we tend to do is documents that have interactive elements to them rather than applications that sometimes have document-like parts, and I think most people working on the web are like that, you know documents… Some things you want to be more interactive. So there’s other things in HTML 5 I would use, but that wouldn’t be one of them.

I’ll use some of the new input types already, because the default behaviour, if you say input type equals foo, the browser doesn’t know what foo is, it will just display a text input. That’s the default fallback behaviour for every browser. So if you try one of these new things like input type equals search, well in Safari will get this nice search field the same way you get in the Google search part of Safari; you get this nice styled thing with a little ‘x’ in the corner. Every other browser just gets a text input, which is what you would have got anyway. And there’s a few things like that I use today, because there’s no harm and because there is that kind of degradation that works well. But it’s the little things I use.

Paul: So, let’s finish things by saying okay, there’s some people that have listened to this and thought, “Yeah, really cool. I want to be doing some of that kind of stuff that you’ve just mentioned.” Where do they go to learn that stuff? And don’t say the spec!

Marcus: Where do normal people go?

Jeremy: The full spec has got loads of stuff for browser makers; we don’t care about that – we’re not browser makers. Michael Smith at W3C has actually spun off… Parse the spec, take out all the stuff for browser makers, and that just leaves the stuff for authors.

Paul: See, now didn’t that… That was like today that came out wasn’t it?

Jeremy: No, no, it’s been out for a while, but I pointed Zeldman to it and he blogged about it.

Paul: Ah, that’s where I read about it. There we go.

Jeremy: Yeah, because we’ve got this basecamp going for our HTML 5 super-friends group, and err…

Paul: You’re a super-friend!

Jeremy: Yeah, with unicorns.

Paul: Okay, that’s good!

Jeremy: And Eric was saying, “We should create a version for the spec that’s just the author’s stuff without the browser stuff,” and I was like actually that exists and it’s over here and then Zeldman linked to it. If you look at Zeldman, he’s linked to it; I’ve linked to it in my blog a few times as well. So there’s that, that’s really useful for just getting the semantics. Going to the IRC channel and the chat there, there’s also sites like html5doctor.com, that’s people like Bruce Lawson, and Remi Sharp, and these people. You have a question, you write to them, they answer the question.

Paul: Brilliant.

Jeremy: They also get questions from people with genuine medical problems, because they give out good Google juice for the word ‘doctor’.

So that’s very useful for web developers who say, “How do I do this? I don’t understand this element.” Yeah, start using it. I would say you’ve basically got three options today. You can just swap out your doctype to ‘doctype html’ and stop there, and just carry on doing what you’re doing – that’s one way. You can swap out your doctype and start using the class names that are taken from these structural elements, that’s what I’m doing, that’s another way. Or you can be hardcore. Swap out your doctype and start actually using these new elements, and use this JavaScript hack for IE.

So you’ve got these three levels of how far you want to go. I mean, on your personal site you could use that third option, push the boat out and go for it. I wouldn’t really do it on a client’s site yet. So you’ve got those things, there’s a validator, html5.validator.nu, and actually the W3C validator uses that, so you can validate HTML 5 today, and that means you can use Aria roles in HTML 5, which you can’t do in HTML 4 or XHTML and validate at the same time. So that’s really, really useful having Aria and HTML together. So you can do it today, keep up with HTML 5 Doctor, hang out in the RSD channel, and if you’re up to it, join the mailing list, although there’s a lot of talk from browsers makers. It goes way over my head.

Paul: Okay, thank you so much Jeremy, that was really useful. I think that has certainly clarified a lot for me, and I’m sure for a lot of people listening too.

Jeremy: I sew clarity.

Paul: Yes, thank you for passing your wisdom on to us.

Jeremy: Any time Paul.

Paul: Okay thanks.

Thanks goes to Gareth James for transcribing this interview.

Back to top

Ryan Carson: advice on building web apps part 3

Hey everyone, I’m Ryan Carson, founder of Carsonified.com, the makers of Think Vitamin, DropSend and the events Future of Web Apps and Future of Web Design.

We’ve both failed and succeeded in building web apps so I’m going to share a few tips that we learned the hard way. Hope you find them useful!

  1. Measure marketing success. As developers and designers, we often shy away from marketing as it’s perceived as dirty. This will absolutely kill your web app. So make sure every time you place an ad, send out an email newsletter or ask someone to link to you, use the Google URL Builder and track the results as a campaign in Google Analytics.
  2. Write out the signup steps for your app. Before you even begin wireframing, spend a day jotting down the rough steps that the user will go through to sign up for an account. This helps you spot potential areas for improving the user experience. Make sure to pay special attention to which form fields should be required and what kind of help text you’ll write to accompany the page.
  3. Give away free-upgrade coupons. When we were running DropSend.com, we placed a simple status message at the top of every page of the app. We changed it weekly to make sure people noticed it and we also made it really noticeable with colour. We regularly offered a coupon code which allowed users to upgrade for free for the month and it worked surprisingly well. I think this is because it removes the risk of trying out the more expensive plan. I’d definitely recommend it.
  4. Create content for your potential users. One of the most powerful ways to do marketing for your product is to offer valuable content to your potential customers. This is expensive and time consuming but it’s highly effective. For example, if you’re building a todo list web app, write a blog that offers a free daily productivity tip or GTD hack. Make sure to offer a weekly newsletter – they’re very effective if they’re based on good content (and not cheesy sales offers).
  5. Get your hosting and deployment nailed down. The thing that will haunt you forever is poor hosting and deployment methods. I’d highly recommend that you make a 1-button deployment system that quickly deploys from a development environment to a staging environment and then to a production server. Make sure the database on the staging environment is identical to production, as this will help you spot bugs. I can’t tell you how many times we’ve had trouble because no one except the developer new how to deploy the site. It needs to be easy, bullet-proof, and reversible if things go wrong.

That’s it – hope everyone found those simple tips useful.

By the way, we’ll be discussing other important web app topics at The Future of Web Apps on Sep 30 – Oct 2nd in London. Speakers include Twitter, Facebook, Mozilla, Gary Vaynerchuk, Kevin Rose of digg, Mike Arrington of TechCrunch and more. Feel free to stop by the site at http://bit.ly/fowa-london-09

Back to top

180. Backend

On this week’s show: The Northeners are joined by the Headscape duo Craig and Dave. We talk about why you should care about .NET MVC and answer your questions about managing your code and friendly URLs.

Play

Download this show

Launch our podcast player

Our most complicated setup to date! Ryan in the studio, Craig and Dave in the barn and Stanton on the phone.

Housekeeping

Vote for our SXSW Panels!

It’s that time of year again and we’re asking our beloved listeners to vote for one or both of our SXSW panels.

Pain Free Design: Getting Client Sign Off

Boagworld Live – Open Mic

Any votes would be greatly appriciated!

News

Expanding your development skills with Creative Tech Projects

This post by Smashing Magazine tries to pursued you into doing something different every once in a while and points out that even if you’re a web developer, your next project doesn’t have to be a website! You can learn a lot by doing something outside your normal comfort zone, and there’s some great examples of different things you could play with, such as:

  • Write your own desktop application, using Air for example
  • Extend Firefox
  • Create interfaces for your favourite gadget, such as your iPhone or Wii
  • Play with Hardware, such as the WiiMote, Arduino kits or Lego Mindstorms

One of the things I love about working in this industry is the sheer amount of cool stuff available for us to play with. Admittedly, it can often be hard to find the time, or even justify spending time playing with cool stuff when client work is stacking up, but who knows, you might find that people out there would pay you good money to build things using the skills you acquire!

5 Advanced Photoshop techniques for web designers

Yes, this is a ‘top 5’ type post, but it’s quite a good one so I thought I’d tell you about it. This article on the Think Vitamin blog gives you a decent rundown of 5 popular visual effects in modern web design, and tells you how to replicate them.

There’s tons of screenshots and explanations of how to make awesome buttons, navigation menus, inset typography, faded shadows and depth. It’s a post to bookmark for those times when you have a spare few minutes to mess about in Photoshop and try new things.

Digg’s move to GIT

This is the first of a two part article detailing how the developers at Digg are making the move from Subversion to Git. I realise that source control doesn’t get discussed much on this show, but it’s something that every designer and developer should be using if they’re not already.

I’m not wanting to start a SVN vs GIT argument here, but I’m very interested in seeing how big established teams work in regards to source control and this is quite a candid account from the Digg team about the scaling issues they experienced as the development team expanded and SVN struggled under the load, and how they’re starting to use GIT to solve some of these problems, highlighting both the good and bad points of the new system.

Everyone will have their own source control preference, but if you’re part of a large team and have source control issues (don’t we all) then give this a read.

Back to top

Feature: Why .NET MVC? (and why should we care?)

Having previously written about the highs and, perhaps more importantly, lows of working as a .NET developer. This article will continue the trip into Microsoft World, only this time it’s to the land of MVC.

Read the Why .NET MVC? (and why should we care?)

Back to top

Listeners Questions

Managing your code

Question from Jamie…

I have recently started developing my own system for building web applications with. I have found that as projects have ticked by i have ended up with a large assortment of code of different versions and functionality. How do the backend development guys at headscape manage this code mountain beyond the project by project SVN style management?

Headscape has a strong design and consultancy background, however the development side of things has also been done internally since
the beginning.  In fact the design and Tech teams are of equivalent size and we have a large number of legacy and currently running projects
at any one time.  Source control and code management is therefore vitally important.

As a development team we rely heavily on three main methods of knowledge transfer over time, projects and team members.  This includes
the standard approach of code commenting, a source control system and an internal wiki for snippets, interesting decisions, rationales,
product roadmaps etc. The wiki, in the context of code, provides a space for longer descriptions and reasoning behind technical design and
implementation approach.

As many of you may be aware a large number of Headscape projects utilise our in-house CMS.  This acts as the base for our common code and
contains multiple projects – A common code repository (the equivalent of our ‘System’ namespace), a CMS class library project and
a base CMS Web project. When a new CMS project comes in we create a new project in source control, with the most recent labelled stable
version of the CMS code as the initial check in.  Changes on this development are then logged only within the context of the project.

Throughout key stages of development and during project washup changes and enhancements that can be generalised from this project will fed
back in to the main project after review with at least one other member of the tech team.  As some projects can be very bespoke we do not
currently utilise branching within our Source Control repository for the purpose of each project.

Friendly URLs

Daniel Farrell writes:

My university has a ridiculous URL naming scheme!

I can see what they are trying to achieve: human readable and logical ordering of pages. But by nesting on such a microscopic scale, they produce the opposite result. The pages are no longer memorable, and not even easy to read because you need a huge screen wide screen to see the whole URL.

Furthermore, because ‘software’ is a service provided by the ICT department is must be nested underneath it. This reflects the management structure of the department not necessarily the way a user thinks! For example, why couldn’t the URL be, /softwareshop/adobe?

What are your thoughts on human readable URLs and how would you tackle the problem of making such a huge site easy to use? Should I have more sympathy for the web team or do they need a good kick up the arse!

There are a number of reasons that large organisations use long and often convoluted URL schemes. One possibility is that different parts of the site could be hosted on different servers and managed by different people. There may be different systems running different sub sections such as a shop which generates its own URL structure under an already long base path.

Firstly, it doesn’t always matter. Unless it’s a URL you want people to remember, the majority of web users don’t really care what ends up in their URL bar once they start navigating a site. It makes no difference to a bookmark and can be shortened easily enough by any number of URL shortening services such as tinyurl or bit.ly.

So when does it matter? It matters when you want users to easily find something that could be tedious to find by navigating a site. A good example is TV or poster adverts where people need to remember the URL for a period of time or a subsite that isn’t linked to from the main site. (administration logins for example)

A good example of a website that manages this well is the BBC site. This is a huge site with many smaller subsites. It’s important for them to advertise concice and memorable URLs so many of their subsites are directly below the site root. One solution could be for the university to setup a series of shortcuts that redirect to the full URL.

Some tips for constructing easy to remember URLs
  • Keep sections concise. “personalcomputersupportandmobileservices” is bad, “ict” is good.
  • Try and use words that are easy to spell.
  • Avoid numbers and hyphens
  • If possible and necessary, create a couple of versions that are equivilent and redirect to the correct version (ie. Wikipedia’s redirection)

Back to top

144. Scale

On this week’s show Paul talks to Joe Stump from Digg about scalable websites, we review the best apps for web designers and investigate services for sending bulk emails.

Play

Download this show.

Launch our podcast player

News and events

How much should you charge?

If you are starting your freelance career the number one question you will have is ‘how much should I charge?’ It is important and yet strangely it is not something you are taught at college. Perhaps they don’t teach it because it is a damn hard question to answer. It is certainly something we have avoided talking about on this show.

Fortunately an article entitled ‘Six things to consider when setting your freelance rate‘ has been released this week. Although the article does not give a magic number, it does provide 6 valuable insights that will inform your final decision. These include…

  • Young freelancers and recent grads almost always ask for too little.
  • You can do things your clients can’t.
  • Your rate influences your perceived value.
  • You don’t get to keep it all.
  • An hour worked is not an hour billed.
  • The higher you start, the less you’ll need to increase.

I couldn’t agree more with everything said in this article. However, the one that really resonated with me was ‘You do not get to keep it all.’ Your rate has not only got to cover your billable hours but the cost of sales and marketing, as well as your various overhead. The article has a link to a superb rates calculator that helps you work out your chargeable rate based on these various costs. Definitely worth checking out.

A plethora of accessibility posts

With the implement arrival of WCAG 2.0. we are seeing a resurgence of interest in accessibility. This has led to a plethora of accessibility posts over the last few weeks. These include…

  • Writing good ALT text – This is a simple post about the use of the ALT attribute. It suggests two rules of thumb when it comes to writing ALT text. First, if you were to describe the document to someone over the phone, would you mention the image or its content? If you would, the image probably needs an alternative text. Second, does the alternative text of any images in the document make sense if you turn off the display of images in your web browser? Simple advice, but well worth remembering.
  • Designing for Dyslexia – This is a series of 3 in depth articles that look at the subject of Dyslexia. It asks what Dyslexia is and how we as web designers can improve our sites to accommodate the needs of Dyslexia users. Its interesting stuff. Part 1 defines what Dyslexia is. Part 2 looks at some of the conflicting requirements with users who have visual impairments. Part 3 suggests some specific things you can do to improve the legibility of your designs.
  • Accessible forms using WCAG 2.0. – This extensive post aims to provide web developers and others with practical advice about the preparation of accessible HTML forms. It compares the WCAG 1.0 accessibility requirements relating to forms with those contained in WCAG 2.0. Important stuff but not a 5 minute read!
  • Too much accessibility – The RNIB explains how the LEGEND tag can cause more harm than good if not concise and relevant. The reason? LEGEND text isn’t read at the start of the FIELDSET, it is read at the start of the label. It repeats at the beginning of every single text label in that FIELDSET.

A business case for deleting content

I find myself using the word ‘simplify’ a lot when I talk to clients these days. So many website owners are constantly wanting to add features or content to their site. However, in reality we should be removing not adding to our already bloated websites. This is particularly true for large institutional websites. However it does also apply to smaller sites.

Take for example the Headscape website. When we started the redesign process for our site, I sat down and really thought through what information prospective clients wanted. The answer was very little. In the end our large text heavy website was reduced to a single page. That is the power of simplicity.

Gerrry McGovern summed it up perfectly this week in his post entitled the ‘Business case for deleting content‘. He wrote:

The more you delete, the more you simplify. The more you simplify, the more you increase the chances of your customers succeeding on your website.

We may think that we cannot delete content because ‘somebody might want it’ or because we believe ‘it will help our search engine ranking’. However, bloated sites bring complexity and with complexity comes confusion. The more content on your site, the less chance a user will be able to find the content they need.

12 principles for keeping your code clean

We finish today with a great post for those who need help with their HTML code. Whether you are a student learning HTML or a designer who is more comfortable in Photoshop than Coda, this is a very useful article.

The post provides 12 excellent tips for keeping your code clean. These include…

  • Use a strict doctype
  • Set your character set and encode those characters
  • Indent your code
  • Keep your CSS and JavaScript external
  • Nest your tags properly
  • Eliminate unnecessary divs
  • Use better naming conventions
  • Leave typography to the CSS
  • Add a class/ID to your BODY tag
  • Validate
  • Order your code logically
  • Just do what you can!

The article explores each of these points in depth and communicates clearly current best practice in coding HTML. Well worth the read even if only as a reminder.

Back to top

Interview: Joe Stump on Building a Scalable Site

Paul: Ok, so joining me today is Joe Stump from Digg. Good to have you on the show Joe.

Joe: Oh, good to be here.

Paul: Have we had you on the show before?

Joe: Ah, not that I’m aware of.

Paul: Oh, wow, well we need to rectify that then. It’s good to have you on. Well, I have to say, this interview was arranged by Ryan, who is our producer. And he’s a developer, and I’m a designer. And he suggested we got you on the show, not that I wouldn’t like you on the show, obviously. That we got you on the show, obviously about scaling websites. Now, I’m going to be out of my depth very quickly here, so you’re going to have to be very gentle with me Joe.

Joe: Sure

Paul: So, in fact, it was so bad, that as I sat down to write questions I thought: "I don’t know what I’m doing here" , so I went and talked to some of the developers at headscape, and I asked some of the Boagworld listeners, and so we’ve got a little selection of questions for you, that, hopefully we can learn a little bit about how you go about doing things at Digg. Lets start off, what’s your job title, what is it that you do at Digg?

Joe: Ah, I have a really fancy job title that doesn’t mean a lot of anything, but ah, my official job title is "Lead Architect" and um, I think what best describes it, is that I manage the technical implementation on the code side.

Paul: OK

Joe: So, Digg’s broken up into a lot of different arenas on the tech side, we’ve got, R&D, which is headed up by Anton Cast, we’ve got operations, which is headed up by Scott Baker, and then under that are the people that I work with, ah, probably most closely in implementing solutions for Digg. Ron Gorodetzky is our lead systems engineer, Tim Ellis, also known as timeless, is our chief DB wonk, and then, Mike Newton is our lead network guy. So I think us four kind of steer the technical implementation along. The managers, ah, the manage, and handle the strategy and partners, and stuff like that.

Paul: You managed to say the word manager with real distain.

Joe: Oh, no actually, I have a great manager, John Quinn, he’s our VP of engineering, he’s by far the best direct manager I’ve probably ever worked with. Yeah, he’s really good.

Paul: OK, well lets go back in time a little bit. And start by, well, when was the point when you realized, that you were going to start having scaling issues with Digg? When did you start thinking about the whole subject of scaling?

Joe: Um, well Digg was pretty big when I came on board, so Digg was about 10 – 12 million uniques when I joined on.

Paul: Wow.

Joe: And I think we’d just cleared 35 million last month. So scaling was obviously an issue, but the big difference is that, I think sites generally go through a few different levels of scaling, where like the first one’s like, "I’m just going to throw it on a virtual server, or an Amazon server, you know, you’re basically just seeing if things are going to just "stick to the wall", and then they do. Ah, so the first thing you normally do is start breaking services off onto separate boxes. I want to put my DB on one box, my server on another box, and maybe memcached on each of them. And then you hit, read saturation on that one DB server, so then you go to the kinda next level of scaling. Which is where Digg was when I started, where you start dangling, a whole bunch of read slaves, off of your DB master, so, and for those who are not familiar with the master / slave terms, you send all your writes to one database server, and then that disseminates those writes to a whole bunch of slaves, and then you send all your read traffic to those slaves. So that’s where Digg was when I started. It’s write http traffic across a whole bunch of servers, its read traffic across a whole bunch of slaves, and then we have one master. And we’re now going through, what I think is the third phase, where you hit write saturation on your master, which is a bigger problem, because you then need to start sending some write traffic to some masters, and we’re actually going with a strategy that’s common with Facebook, and Flickr, and those kind of websites, where it’s called horizontal partitioning, where you put some of your records on one server, and the other records on another, so it’s like, you can say, for users, all users whose names start with A through J, would go on this database server, and K to Z live on this other database server. So we’re in the middle of implementing the first swipe at that. So we’ll be pretty aggressively into where everything will be federated and partitioned across a whole bunch of servers.

Paul: OK, one of the questions which kinda came up, which kinda relates to that, is, once you start spreading things across multiple servers, how do you handle things like user sessions, which have obviously got to be persistent.

Joe: Aha, so there are a couple of ways to handle that which, I’d say most people are handling it by.. There’s two ways, probably that you can do it easily. One of them, is if you have what they call "session affinity" on your load balancer, so the load balancer will say: "Oh, well this person, last time I had them here, they went to server A, so we’ll send it back to that server". So the session always lives on only one box. That’s one way to do it, we don’t do that here, we have a custom session handler in PHP which sorts the session in Memcache, and that allows you to.

Paul: Can you just clarify what memcache is, for idiots like me who don’t know.

Joe: Sure, memcache is a distributed caching system that’s actually, basically what it allows you to do, is expose a machines RAM over the network, and cache stuff into another machines RAM across the network.

Paul: Ah, OK

Joe: Yeah, it’s insanely fast, it was developed by Danga back in the day, and Brice Fitzpatrick, yeah so it’s heavily used by anyone whose scaling with LAMP, even a lot of people who aren’t. They all use memcached.

Paul: Wow

Joe: So, yeah, we store all of our session data in memcached, so PHP creates a unique session id, and we just stuff session data into that in memcached, and we can distribute that across, I don’t know, 50 or 60 memcached servers, and what not.

Paul: So how many servers do you guys have, it must be a staggering number by now.

Joe: Um, yeah, it’s kinda funny, every time I ask Ron that, he’s always like "Ah, I don’t know"

Paul: Laughs

Joe: Because we really can’t I mean, I couldn’t give you a specific number because on any given day, we’ll pull or push into production, a dozen servers, so, hundreds, there’s definitely hundreds in production. So.

Paul: I mean, with that many servers, so obviously you’re talking about taking servers on and offline, and all that kind of thing, I mean, making updates to the site, when Daniel comes up with some stupid idea, that you’ve got to apply to the site, of a new feature that he wants to apply on the site, and you’ve gone through the process of making it work. And you’ve then got to push it live.

Joe: Aha

Paul: How does that work? How do you go about pushing something like that live when there are so many servers involved.

Joe: So we have Ron Gorodetzky our lead systems engineer guy. So us developers have a bunch of M4 make files, that, when you check the code out, you run basically Make, Install, and it, for lack of a better word, it builds or compiles the website into a cohesive package, and then Ron pushes that to each server, I think he is still doing it by rsynch, but I know we are migrating over to Puppet, so it may happen via Puppet soon. The production side of things, is something that’s handled completely by operations, so I couldn’t tell you specifically how it happens, but generally, we make a tag of the website, and tell Ron, we need you to push "9.4.15" or something like that, and he does a checkout, builds it, and pushes it to all of the different servers.

Paul: So is that – do you actually have to take the site offline to do those updates? How do you minimize the downtime that’s involved with that.

Joe: Oh, well there’s a bunch of different ways. Um, we don’t bring the website down normally for pushes, it depends on the size and complexity of the push. But like, day to day pushes, we probably push I guess, a minimum of once a day, just little bug fixes and stuff like that. And those happen generally in the middle of the day, and nobody notices, it’s no big deal. Ah, the outages these days, are completely dependant on database activity, you know, like database fixes and stuff like that. And the ways that we’re getting around that these days, is using a different method of partitioning called vertical partitioning. Where, like for instance, like I think our comment Diggs table, of like, who’s dug a comment, up or down, that’s like 15 billion records in it.

Paul: Wow

Joe: that’s like, yeah, if you’re like to alter that table, you’d probably crash mysql, but if you were, it would probably take a week to alter it. So instead we probably create another table, where we have like comments, and then we have another one called comments_made_up, or something like comments_diggs, comments_diggs_beta, and that has a couple of extra fields in it. And so we’ll say, OK, we’re about to push the code, at the end. When we push the code, the first comment id that was added to the table was 15,000,000,001, so then you start at 15,000,000,000 and work my way back in the table. So we do some of that live as well. For the next push that we’re doing, we’re using a migration table, which will tell us how far along each record is, and which records we’ve actually migrated, and stuff like that. And then we’re going to use this package called "GearMan" which is a queuing system which we’ve had in production for a while. And we’re basically turning our servers into a giant BotNet, so we’ll back fill all those partitions quickly.

Paul: Wow, that kind of amount of data, it must create huge problems, even adding a new piece of functionality onto the site, to actually code it in a way that’s not going to have a momentous impact on the database. This must be something that’s always constantly on your mind I guess?

Joe: Yeah, I’ll tell you a really funny story that highlights that perfectly, we have these little green badges that are on the Digg button, and they indicate, that a friend of yours has dug that story. And when you hover it shows the last four friends to dig it or something. So that’s a pretty knurly query, against a very big table, and we’ve actually had to, what I would call "dial it down a bit", so that it only shows up on the stories that are 48 hours old, and it won’t show up if there are more than 500 diggs or something. So the features fairly usable, but it’s not like… Well before if someone went to the top of 365, it was basically crashing our servers. So we’ve been rewriting that, and we basically, the way that we’re rewriting it is: If you write something, we take that Digg and we drop it into each of your followers buckets. So we make a bucket for each story for each person. Any time one of their friends digs it, we drop that dig into their bucket, but the problem is, like Kevin Rose is followed by 40,000 people, so every time he digs something, I need to drop 40,000 things into 40,000 different buckets. And we did the math, and just to get that feature up and running in a vast sane manner, so that it will scale, so we can bring it back in full capacity so everyone can use it all the time. We need 1.25 GB of storage, and we need to be able to sustain 3000 writes per second in order to keep just that small feature online.

Paul: So that really kind of illustrates that a relatively small feature that someone comes up with, has massive ramifications from your point of view.

Joe: Yeah, this is something that has kind of been something that I always talk about. I mean even back when I was doing consulting, I’d kind of have clients come to me, and it would be: "Check out this awesome design", and I would be like "that designs awesome, but that little feature down there, that’s going to cost you know, $100,000 in servers, and 500 man hours. But it’s, like, well the designers think of sizes and shapes, and so Daniel always jokes around and says: "Well I can make it purple" if that will make it easier for you" you know, it’s like…

Paul: Laughs

Joe: Laughs – well that doesn’t make it easier!

Paul: Well, we’re going to get you and Daniel back on the show to talk about this whole design / developer relationships, so you can lace your side of it now, and get your side in first. Before he defends himself.

Joe: Sounds like a plan.

Paul: So are you at the point now where you’ve got an architecture that’s kind of infinitely scalable, or are you going to have to go back to the drawing board if Digg just goes even more, you know off the scale than it already is?

Joe: Yeah, well we’re actually at the drawing board right now.

Paul: Yeah?

Joe: Yeah, Ron, myself, and some of the other senior peeps, about 8 or 10 months ago, we started putting together… well we knew that we were going to start to have serious limitations, especially since we were going to be scaling internationally. You know there is a problem with latency. With you guys across the pond hitting the west coast and other things like that. So we want to be in multiple data centers. We want to be actively serving traffic from multiple data centers, so we’re are, well we need to horizontally partition, we need to make sure we can do that. And we need to live in two different data centers. We need to be able to survive one data center disappearing. So we spent basically a week in front of the white board, and we created this thing called IDDB, which is kind of an elastic storage engine, built on top of SQL, and memcachedb, that we’re going to be putting the first bits and pieces into production, probably over a month or so. And really, the whole partitioning thing isn’t the difficult thing to figure out. The difficult thing is basically spanning multiple data centers, and also we’ll have a couple hundred gigabytes of data, and I need to spray that across, you know, a couple dozen different servers, without bringing the site down. So we have a couple million – 3 or 4 million users, and I need to take all of their records, and all of their auxiliary records, here’s like your user record, and there’s also a bunch of cruft related to that. So I need to take all of that, and migrate it to different partitions. But I need to do that while the site’s still up and running, and I need to do that without breaking the site for you. So, that’s the more complex problem at this point.

Paul: I mean you talk about spreading across multiple data centers, and if one of those data centers goes down, the site does too, and whatever. How are you currently handling redundancy? How are you making sure the site stands up at the minute?

Joe: Right now, our only single point of failure at this point, is our actual data center, so if the data center falls off the planet, then we’ve got problems. But we’ve got a general architecture. We’ve got a couple of general balancers up front. And those two have, what we call a "heartbeat" between them, and if one of them falls off, the other instantly takes over traffic for it. And that balances traffic across, I couldn’t even tell you, dozens and dozens of web servers, and of course, the load balancer does help checks on those, so if any of those falls over, it will drop it out of the pool. Behind that, we have, I think, 4, I guess our masters are technically single points of failure, but we have multiple masters as well, and we have dozens of read slaves hanging off of them. I think right now it takes about 10 minutes to bring a new master into production if one fails. So, and then we have, to store our files, we have a thing called MogileFS, which is a distributed web dav storage engine of sorts, and we can loose multiple nodes on that, and not have any problem with that as well.

Paul: Yeah, so at the moment it’s a physical problem that you have, that if your data center gets hit by an earthquake or whatever, then you have problems. Please tell me it’s not in San Francisco?

Joe: It’s not in San Francisco.

Paul: Ha ha, yeah, you’re not actually going to say where it is are you?

Joe: No I can say we have one on the west coast, and we have one on the east coast.

Paul: Oh, well that’s at least something. Um, I mean so far we’ve concentrated a lot on scaling technology, but there’s kind of another side to this, as well, where you get something like Digg, that has grown incredibly rapidly, over a very short length of time, and that is, kind of scaling the team behind it. You know, I don’t know how many developers were working on Digg when you joined it, but there would certainly be a heck of a lot more now. And I’m quite interested in how you went about growing the team. And how you deal with that kind of rapid growth, and making sure everyone knows what they’re working on, and not overwriting others work, and all the complexity that goes along side of that. How have you dealt with that?

Joe: Sure, I guess, to put things into context a little bit, when I was hired, we had both Kurt Wilms and I, were both hired on the same day, and were respectively employees 18 and 19, and developers, I think there were 7 of us. So, now we’re at the low 20′s as far as developers, and we’re in the mid 80s, as far as total employees, and that’s been in a year and 9 months. So as far as scaling the teams go, some of the building blocks were well in place by the time I got here. Like, source repository, stuff like that. But I think the crucial things that we’ve done, since I’ve come on board, that were, we had some coding standards that were out there, but they weren’t really in force. And then we had, we didn’t really have any frameworks in place. I think one of the problems, you know, when Jay, our CEO, was asking, where do we find more senior developers, I told Jay, like that’s not the right question, the right question is like, how do we get the code, and how do we get the technology, in a position, where we don’t have to hire really senior people. So I think the keys to that are, we do have pretty strict coding standards, so we do enforce those rigorously, through continuous integeration environment, and code reviews. Every piece of code that gets pushed to production has to be reviewed. And that’s literally 4 or 5 coders, sitting in a room, going line by line through change sets, and stuff like that. And that sounds really time consuming, but without question, on every code review I’ve sat in on, we’ve found one show stopper bug. So, those have been crucial, in getting things put together. The other things we did as we grew, is we broke the team up into smaller teams, so we have a development team of 20 – 25 developers, but that’s broken up into teams of between 3 and 5 developers. This really helps in a couple of areas. 1, it enforces code ownership. So everybody has this problem. I code this, then I go and work on something completely different. And 6 months later I come back to this code and I’ve forgotten. I don’t remember any of that. Where as if you’re always working in the same area of the sites, not only do you remember a lot more, you’re a lot more familiar with that. But also, you feel a bit more of a sense of ownership over that. You’re not just this hired gun that’s come in and ploughs through this feature then moves on to something else. You have your own territory that you need to keep track of. You need to keep really nice and what-not. So we did that, and then we have a bunch of core frameworks, that we’ve built. We have a small application framework, we have an AJAX framework, and now, we have this data access layer that we’ve been building up called IDDB. So I think those are pretty crucial too. It’s difficult to find coders that are intimately aware of memcached, and race conditions that exist in memcached, and um, we have to be very selective about what fields we add indexes on in mySQL. We also have to be very selective about how we store that. Normalization flies totally out the window, if you’re a DBA guy. A lot of concepts, they are not bad developers, by any means, they do great AJAX work, they do great application level PHP work, but if you don’t have frameworks in place for them to not have to worry about the super-super complex stuff. It’s going to be really difficult to hire, and it’s going to be really difficult to, you know, get a lot of stuff running in parallel and stuff.

Paul: Wow.

Joe: Yeah, and then we also, another one of the things we’ve adopted, is the agile SCRUM. So we’re doing sprints, and we’re running those in parallel now across all the teams. So right now we have 4 major projects going on right now.

Paul: Ok. So you talk about a sense of ownership there, and the developers are split down into multiple certain areas. You know, does that not get boring, for the developer, having to work on the same bit of code long time, or do you rotate people?

Joe: Well, we don’t currently rotate people, the team structure’s been in place for about 4 or 5 months now. And you know, most of the work they get is fairly immediate, and we’re moving major projects like Facebook connect, so the "Tools and integration team", their doing facebook connect now, and after this, they will maybe work on a new version of the API and so on. It’s written out to wide swaths of the site, so that we have "Site Apps" which does like, all the different applications on the site. And then we have "Tools and Integration" where we have the external projection of Digg, then we have "Core Apps" which is like, search, R&D stuff like recommendation engine, and what not, and then we have "Core Infrastructure".

Paul: So it’s probably enough to be interesting?

Joe: Yeah, we have pretty broad teams, and also, when we put people on those teams, even if someone has an amazing core infrastructure background, but they say, look, like, one of our UI guys, used to be really heavy into core infrastructure stuff when he worked at Quest, and managed massive warehouses, but he says, like, "That’s not what gets me up in the morning any more". It’s like, "Javascript UI interfaces are". So we try to put people on the teams where, you know, where their passions lie. And that’s kind of another thing that people need to recognize. And that’s like, not all developers are driven by, or interested in the same things. We have some, what I would call "UI / Frontend" developers, where like, they could care less about PHP, but we have PHP guys who could care less about Javascript. So I think, recognizing strengths and weaknesses, and capitalizing on those, is pretty important too.

Paul: OK, one last question to finish off, and that is, well you know, the kind of things that you’ve been talking about are fascinating to hear, about the kind of challenges that you have to face with the size of Digg, and the amount of traffic you have to handle. But obviously a lot of people who are listening to this podcast, aren’t at that stage. They are not working on massive projects like that. So I’m really interested in what advice you would have, for those who are just beginning to suffer from scalability problems, and they feel that it’s coming, and it’s something they need to be paying attention to. But it’s not on the enormous scale that you have to deal with. What things can they do right now to prevent problems down the line?

Joe: Um, I think, the easiest things you can do, is you need to re-think the LAMP acronym, because that stack is actually no longer really that stack. I would take Linux, and I’d take Apache out of that stack, and it doesn’t matter, as long as you’re running on a Unix. And as far as Apache goes, Lighty and EngineX are much better at getting a lot more money out of your box, as far as scalability goes. The two areas, that I think people, they sound hard, but they are really easy. One of them is installing and using Memcached, and the other one is installing and using a queuing system of some sort. And I think, like, recently I went through this with a little side project, called "Please Dress Me" which AJ and Gary Benashack and I did.

Paul: Oh, yes yes.

Joe: And we had a very small virtual server at MediaTemple, that survived pretty crushing blows from TechCrunch, Digg, BoingBoing, with almost no load. And that was like, beforehand, memcached is so second nature to me at this point, that I was just like, "Oh, well I’m just going to cache everything in memcached", and it was literally just like this RAM spewing machine. It didn’t actually hit the database. Actually my sysadmin at MediaTemple was like "Something’s really weard, MySQL is only doing like 1 query a second, and you’re doing like 500 requests per second from BoingBoing. So I’m cached. Yeah memcached is just like, it takes literally 10 minutes to install, and probably another hour or two to implement.

Paul: Wow, that sounds excellent, that’s really good advice. Joe, thank you so much for coming on the show, and I can’t wait to get you and Daniel fighting with one an other in the not too distant future. I’m hoping to get a good violent argument about designers and developers, just because I can.

Joe: Laughs.

Paul: I was a little bit disappointed when you guys sat down at Future of Web Design, were far too nice to one another, compared to the evening before, when you’d had a bit to drink, and you were talking in the restaurant. That’s the kind of conversation I want, that real vicious session.

Joe: OK, I’ll make sure that Daniel and I get liquored up before coming on then.

Paul: Yeah, that’s the answer. Ok, thank you so much Joe, that’s really good advice, and we’ll talk to you soon.

Joe: Thanks Paul, bye.

Thanks goes to Nathan O’Hanlon for transcribing this interview.

Back to top

Listeners feedback:

Top web designer applications

Often this section of the show consists of questions for myself and Marcus. However for a change, I thought we should ask the questions. Via the forum, the boagworld site and twitter I recently asked you to vote for your ‘favourite web designer application‘. The response was overwhelming and you can see the complete list of suggestions on UserVoice. However, here are the top 5…

  1. Firebug – Firebug is a Firefox addon that puts a wealth of development tools at your fingertips while you browse. You can edit, debug, and monitor CSS, HTML, and JavaScript live in any web page. A less popular suggestion was the Web Inspector in Safari.
  2. Web developer toolbar – The Web Developers toolbar is a Firefox addon that provides a variety of web development tools. You can disable CSS and Javascript, visually highlight elements, manage cookies and much more. A less popular alternative was the IE developers toolbar.
  3. Adobe Photoshop – A professional image-editing and graphics creation software from Adobe. It provides a large library of effects, filters and layers. This is the grandfather of such applications and many (like myself) use it out of habit more than anything else. Less popular suggestions included Fireworks, Illustrator and Inkscape.
  4. Coda – Coda is a one window development environment for the mac. It includes FTP, text editor, browser preview and even terminal window. A beautifully designed app it appeals to the more visual web designer. Simple, easy to use and elegant.
  5. TextMate – TextMate is a powerful text editor for the mac with an extensive plug-in architecture. From its code highlighting in near endless languages to support for numerous version control systems, TextMate is probably the most powerful text editor out there.

If you disagree with the Boagworld Listeners top five or want to see the other entries then head on over to UserVoice and vote for yourself.

Sending out bulk emails

My second listener contribution comes from the forum. It is a question from Richard about sending bulk email.

Richard writes: I need to send out bulk emails to approx 200k registered (opted in) users on a monthly basis.

Does anyone have any recommendations for an outsourced bulk email provider?

As with the previous contribution I want to focus on your responses rather than my own. This is what the Boagworld community had to say…

Jamie was the first of a number of people to recommend Campaign Monitor. Judging by the feedback from the forum they offer an excellent service but are expensive when compared to others.

As well as recommending Campaign Monitor Nick mentions Silverpop, which he described as ‘an enterprise affair’. Apparently it is not as polished as campaign monitor but considerably more powerful.

Phil recommended two more, Mail Chimp and Mad Mini. He hasn’t used them personally but the prices look good and he says the user interfaces appear polished.

Doug doesn’t recommend a specific service but does refer Richard to a post on Creative Tips which provides a comprehensive review of Campaign Monitor, MailChimp, AWeber, and Constant Contact.

If you have suggestions for Richard or would just like to share your experiences of using bulk email services then contribute to the thread in the boagworld forum.

Back to top

131. Version Control

In this weeks show Ryan and Stanton return to talk about the importance of version control and answer your questions on project  management and invoicing applications, download sizes and page weight.

Play

Download this show.

Launch our podcast player

News and events

Twitter Cuts UK SMS

This week the team over at Twitter announced that they would no longer be delivering outbound SMS over there UK number. They go on to explain that the bill which up until now they’ve been footing is simply too great and that even with a limit of 250 messages per week they estimate a yearly cost of $1000 per user.

Thanks to established relationships with SMS services in Canada, India and the United States the outbound SMS service will be continuing uninterrupted in those countries.

Twitter has suggested a number of alternatives to the service, links to which can be found on their blog. It would also appear that a number of start-ups are rushing to fill the void as TechCrunch have also reported.

A large portion of Twitters popularity is due to their SMS facilities and it is feared that “freezing” out the UK and other countries from this service will be detrimental to their future.

It reminds me of when Pandora, the online radio station, closed its doors entirely to its UK audience due to licensing constraints and it begs to question do we poor souls in the UK miss out on all the good toys?

facelift (FLIR) Image Replacement for Fonts

Facelift Image Replacement (or FLIR, pronounced fleer) is an image replacement script that dynamically generates image representations of text on your web page in fonts that aren’t otherwise supported in web browsers. The generated image is automatically placed on your site and works in a similar way to sIFR, the big difference being the lack of Flash.

This script uses PHP and javaScript and utilises actual .ttf font files to generate its replacement images, so you can simply specify which elements you want to replace, h1, h2 tags etc, download a font you want to use, point the script to it and your done.

I’m looking forward to having a play with this script as it seems to be simple to use and the fact that you don’t have to mess around with Flash like you do with sIFR is a big bonus in my book.

Take a look at the number of examples they have on their website and see for yourself.

Gmail went down!

So Gmail went down for a few hours this week and as Josh Catone said in his sitepoint article article:

Judging by the reactions on Twitter and in the blogosphere, you’d have thought that the world ended.

There’s nothing really more we can say about this that Josh hasn’t already mentioned, but suffice it to say, no web sites/app is going to have 100% up time and this echoes what Stanton and I were talking about the other week in regards to S3 going down. It’s important to always have a backup and not to put all your eggs in one basket because when the service you’re using goes down, and invariably it will, you need a plan B.

Back to top

And Now For Something Completely Random

During the recording of this weeks podcast we were thrown completely when we spotted Paul Annett from Clear:Left dressed up as a Gorilla on Yahoo Live! and then proceeded to start dancing… always aiming to share the hysterics here’s proof. Random indeed.

Paul Annett Dresses as a Gorilla

Feature: To Version Control or Not?

Version control can seem like a very daunting thing to incorporate into your work flow, but once it’s there you can be left wondering how you ever lived without it. In this week’s feature Stanton shares his experiences with you in a bid to convince you why you need it.

Back to top

Listeners feedback:

Project Management and Invoicing Applications

James writes: I would like some boagworld advice. I’m a web designer and SharePoint specialist at a large company in Cambridge, UK. Over the last 3 to 4 years i have been messing around with web design etc. I now am very busy outside of work and it is getting busier every month.

I started of with a server under the bed at home with UPS hosting these sites. They ranged from personal sites, to company profile pages to shops. This server has now been replaced with a VPS hosted externally.

My plan is to keep working full time and manage my time very carefully outside of work and keep these sites coming in and out etc and then one day take the big leap into the self-employed world.

What could you recommend for me to manage my tasks, projects, time-management and invoicing etc?

I love the podcast and would be quite happy to chat further with you. Look forward to hearing your experience comments.

Well there is a multitude of online and desktop applications designed specifically for managing your business.

Probably the most popular project management app I know of is 37 Signals’ BaseCamp and that’s certainly the first one that springs to mind when I’m asked this question. Depending on what package you have, BaseCamp allows you to create projects, set milestones, to-do lists, manage time spent on tasks among other things, however BaseCamp is tailored more towards collaborative projects for when you’re working with a team of people. It doesn’t provide facilities for invoicing clients and managing your accounts and so it might not be the perfect choice if you working alone.

Another app I know of and which comes highly recommended is FreeAgent. FreeAgent like BaseCamp allows you to create and manage projects, clients and timescales, however in addition it provides you with the facility to generate invoices, manage your bank accounts as well as your expenses and incomes. It’s designed for sole traders, partnerships and limited companies and is wrapped up in a nice, user friendly interface.

A final mention goes to a Microsoft app that I came across a couple of years ago now, and has only this year been release in the UK. It’s called Office Accouting Express 2008 and it’s actually free to download and use. As you would expect it integrates with other Office applications and provides you with all the facilities you would expect from an accounting package, invoicing, client management etc. So if you’re working on a PC it’s worth having a look.

Luckily you can have a play with all these apps before you buy. BaseCamp has a free account which allows you to create 1 project so you can get in and see how it all works, FreeAgent has a series of demos you can use to see if the interface and facilities are to your liking and as I’ve said Office Accounting Express is free. So my advice would be check out them al
l and see what works for you and no doubt there will be several suggestions in the show comments on other apps that I haven’t mentioned here.

Download Sizes

Bob writes: After reading a recent post from Smashing Magazine on textures I started to wonder… what is a good rule of thumb regarding document size per page on the web? Most of the example pages in the article ranked in at close to 900kb per page… am I behind the times?

Very good question, and one I think we all worry about at points. There’s more than just the filesize to really worry about, there’s the general ‘page weight’ which is affected by many factors, such as:

  • The number of HTTP requests made – if you’re pulling in a lot of external javaScript or CSS files, each one has to be requested seperately. You can combine these into single files to reduce load times, but at the expense of readability, maintainability and organisation
  • The size of any javaScript files you’re pulling in – you can get minified versions of most libraries, for example, which strip out all the extra spaces and line breaks in the code, which aren’t needed in order for the code to execute
  • CSS expressions can be a useful tool, but are bloody slow, especially when used a lot
  • Image filesize can have a massive effect on load times, which is one of your main concerns as you mentioned textures. I’m assuming you’re already familiar with image optimisation, but also test to see if you can squeeze images into a GIF, or a PNG8 if possible, these formats will give you a nice small filesize if you only need a limited colour pallete.

In this day and age it’s nice to think that we’re all cruising on nice fast broadband connections, but in reality we know that’s not the case and you really have to consider your audience, and the context in which they may visit your site (Paul’s talked about this quite recently). If you expect an older demographic to your site, or people in remote areas, then they might still be hitting you on a dial up connection. Some visitors may be using poor public wifi (I get suicidal on the train to and from London as the wifi is usually worse than dial-up), or mobile devices where the data charges can be ridiculously high.

There are a couple of tools I use to get an idea of how my pages weigh in:

There is a Firebug addon called YSlow which provides some nifty statistics on what’s happening under the hood of the pages you visit, and also grades the page performance and suggests methods to improve the loading time of your page.

I tested 2 sites quickly with this extension to give an idea of what you can expect to see, Amazon and Boagworld.

  • Amazon.com weighs in at 501k with 85 HTTP requests and a performance rating of D
  • Boagworld.com is a bit lighter on it’s feet at 57.6k and 79 HTTP requests, but has a performance rating of F, due to (among other things) including 37 external javascript files compared to Amazon’s 8, and 33 CSS background images compared to 9 with Amazon.

I also use a Firefox plugin called Firefox Throttle which lets you simulate a specific network speed (such as 56k) and get an idea of how long your site will take on certain connections.

Unfortunately I don’t think there’s a good rule of thumb here. Personally, I don’t let the page weight issue affect or limit my design, but try and make savings where I can nearer the end of the project, by optimising images, switching to minified JS libraries and reducing the amount of HTTP requests where possible.

Back to top

 

To Version Control or Not?

Version control can seem like a very daunting thing to incorporate into your work flow, but once it’s there you can be left wondering how you ever lived without it. Paul Stanton gives his thoughts and experiences on the subject.

This post has stemmed from an e-mail by Simon Hamp whom has just started working for a PHP development team and has been given the task of assessing their current systems. Simon writes:

I wanted a professional opinion on a proposal I have made to move from file-based version control (i.e. manual folder management) to full-blown source code management.
Is it worth making the change from an existing system and move to something like Subversion for a team of our size, considering that this would change the processes for those who have been here the longest quite considerably?

Version control is one of those things that you already know you should be using, and if you’re like me, have promised yourself for a while that you’ll get round to learning and integrating into your own workflow. Chances are that one day, disaster will strike and you’ll realise why it’s so important.

I’m going to be honest with you here, I can’t profess to be an expert in version control. I’ve been playing around with it, and the team I work with are very close to making it a mandatory part of our design and development workflow, but I hope i can provide some form of assistance in your decision to adopt a version control system, and maybe convince myself in the process.

That being said, I don’t think there even is a decision to be made. You – need – version control. There’s no real question about it. The biggest challenge we face is integrating this into our existing workflows.

Let me take a step back, and explain what version control is for anyone unfamiliar with the concept.

Version Control Systems (VCS) (also known as Revision Control, Source Control or Code Management) is a way of managing multiple versions of the same information, which is most likely – in this case – the source code of your website.

In the simplest terms, VCS allows you to keep track of the development of your source code, keeping track of the changes you make as you go along with all the changes being stored in a repository. You may already use a crude method of version control yourself, naming files such as index1, index2 etc. VCS systems keep track of all these changes, and let you revert to any previous version if the need arises.

The real benefit of VCS comes when you have more than one person working on the same file. Instead of you overwriting each other’s changes, or attempting to manually merge the changes from one developer with the changes of another, the VCS can take care of all this for you, creating branches for each developer to work on which allows multiple developers to work on the same file simultaneously, then merging the changes into a single version, highlighting any conflicts or problems that may arise.

There’s plenty of more detailed descriptions of VCS systems online if you’d like to learn more (Wikipedia might be a good place to start), so I’ll save diving too deep into the rabbit hole for now, I’m sure that if you’d really like to learn more, Paul can work it into future shows. For now, I’ll return to our case in point; “Is it worth making the change from an existing system and move to something like Subversion for a team of our size”.

I work in a team of 11, with 2 other designers and 3 other developers, and we ran into issues recently with a site that needed some last minute changes made. I was asked to do some markup modifications and CSS changes which required me to modify multiple files. This project wasn’t in a VCS system, so I was given a copy of the files on a flash drive. I made my changes, re-saved the files to the flash drive and handed back to the designer. In the mean-time, the designer had made other changes to the files, and needed to merge my changes with her own manually by copying and pasting from my copies of the files. To further complicate matters, the developer then needed to take these files and integrate the markup and CSS into their .NET templates.

I don’t need to tell you that this was an absolute nightmare, at one point we were managing 5 local copies as well as 2 remote copies (one on the dev server, and one on the live server) To be fair I’m not sure if the team has ever hit a problem such as this before, with multiple people working simultaneously on a project and unfortunately we don’t have VCS as a mandatory part of our workflow as yet, but after this experience, I’m sure it won’t take long.

If we did have VCS, I could have checked out the necessary files, made my changes, and checked them in, the other designer could have checked out the same files and made her changes simultaneously, and also checked them in when she’d finished. Then we could have simply merged these together into one, any conflicts would have been highlighted for us to easily rectify. The developer, when ready, could easily deploy these from the dev server to the live one with a couple of commands.

There will be issues with integrating this into your own processes, especially with people who have their own tried and tested systems, but personally I think you really have to put your foot down here. VCS should be a mandatory part of your workflow, especially as your team grows and there are more people involved with projects.

If possible, I’d recommend arranging training sessions on whichever VCS you choose to adopt, you should only need a day, or even a half day to go through the basics of using a VCS for your dev team, I’d even wager (or should that be “hope”) that those who have been professional developers for some time will already have experience of one form of VCS or another and shouldn’t be resistant to using one.

My last point would be to explore the various VCS systems on offer. The main 3 that come to mind when writing this article are Subversion, GIT and Mercurial.

Subversion (SVN) is the system that most listeners will have probably heard of. SVN is a centralised VCS system which centers around a single repository. Subversion is a well known, and generally well supported tool with a wide range of user interface tools available such as such as IDE extensions for most popular IDE’s and Windows Explorer shell extensions like TortoiseSVN. There’s also native Windows and Mac GUI tools available to make the normal command line interface a bit easier to digest.

GIT and Mercurial are slightly different in that they’re Distributed Version Control Systems (or DVCS for short). While Subversion manages a single repository, a DVCS system like GIT has multiple repositories. The best way I can explain is that imagine if my team was working on a project, and we had a repository on our dev server, I can then clone that whole repository to my laptop, and take it away to work on at home, on the train, or at the top of a mountain if I so wished. With a centralised VCS like subversion, you really need access to that central repository to commit your changes, with a distributed system, al
l your developers can mainta

in their own local copies of the repository, commiting changes to the master repository when needed.

GIT is actually the system I’m getting to grips with at the moment, the branching, tagging and merge tools are more stable than the ones within Subversion. Please note that I’m not trying to bait anyone here, and I don’t want to get into a flame war about which VCS is better than the other. GIT’s distributed nature makes more sense to me personally, and allows me to manage my branch without network access if needs be, and in the event that our development web server with the master repository blows up, each developer has a copy of the repo in some form, and there is a good chance that the whole codebase can be rebuilt easily.

So I hope this persuades you into adopting some form of version control, but if you really ask yourself, you already know that you need to use it, and to return to Simon’s original question, I bet your developers already know they should be using it, they just need a little encouragement to integrate version control into their processes, the benefits far, far outweigh the disadvantages.

By Paul Stanton

121. Coda

In this weeks show we discuss 5 quick fixes to accessibility, and we review the mac code editor Coda.

Download this show.

Launch our podcast player

Watch the behind the scenes video

News and events

Skipping Photoshop

The biggest news this week is a post from 37Signals entitled ‘Why we skip Photoshop‘. The article outlines some excellent reasons why they choose to bypass designing in Photoshop, instead going straight from sketches to HTML/CSS. Reasons include…

  • Mock-ups are not interactive
  • Photoshop draws you into the details too early
  • Text on Photoshop is not like text on the web
  • Photoshop is not productive
  • Photoshop does not aid collaboration
  • Photoshop is too complex

They are all valid points. However, although I accept this is right for 37Signals, it is not right for Headscape. Our view is echoed completely by the response of Jeff Croft at Blue Favor. He argued…

  • 37Signals are working with an established visual aesthetic
  • That 37Signals aesthetic is simple and so is better suited to pure HTML/CSS
  • That 37Signals do not work with clients
  • That working in HTML/CSS can lead to constrained design.

That said, the post has made me consider experimenting occasionally with the approach. For me that made it worth reading.

It is a great discussion and I am really glad Jason at 37Signals brought it up. It has certainly created a lively debate including posts from Jon Hicks and Mark Boulton.

Web Designers should do their own HTML/CSS

But we haven’t finished with 37Signals yet. They have posted a second blog entry this week entitled ‘Web designers should do their own HTML/CSS‘. The title is fairly self explanatory and they put forward a good argument as to why designers should never produce a design and then simply hand it off to ‘code monkeys’ who make it work.

At the end of the article they write…

We simply don’t consider designers who don’t get their hands dirty with the materials relevant to the kind of work we do.

If you’re a designer working with the web who still doesn’t do your own implementation, I strongly recommend that you pick up the skills to do so.

Whether you agree with 37Signals or not, the message is clear: You will struggle to get a job if you do not know how to code pages as well as design them.

We would certainly never hire somebody unless they know HTML/CSS just as well as they know Photoshop. The nature of the web means that an understanding of the medium is crucial to creating a great user experience.

Beyond CAPTCHA

I hate SPAM. I hate it with a passion. I particularly hate comment/forum SPAM because it not only inconveniences me but also affects my users.

One common approach to the problem is CAPTCHA. CAPTCHA presents the users with a distorted word(s) that they have to type in before they can comment.

An example of CAPTCHA in action

Although in principle CAPTCHA sounds great it does have a number of weaknesses…

  • It creates accessibility problems
  • It are hard for normal users to complete
  • It can be beaten by spammers
  • It make SPAM the users problem

In short, CAPTCHA doesn’t work. So what is the alternative? Well, that is what James Edward (AKA Brothercake) explores in a post on Sitepoint entitled ‘Beyond CAPTCHA‘.

He looks at server side solutions, services like Akismet and honeytrap approaches. He also looks at OpenID and other forms of authentication.

The conclusion is that there is no perfect solution. However, he argues we need to stop making this the problem of users and take on the responsibility ourselves.

I can certainly see his position and generally speaking I agree. However, when you are faced with limited time and budget it can be necessary to cut corners. Personally, I cannot stand CAPTCHA and I regularly fail to complete them first time. However, I have no problem completing a basic question such as found on the boagworld website.

Read the article and make up your own mind. At the very least it will offer you some alternatives to CAPTCHA that can be implemented quickly and easily.

Website Owner’s Manual

Our last news story is a little bit of news about the book I have been working on. For a start it has a title; ‘The website owners manual‘. However, the big news is that you can start reading it and contributing to the final version.

Back to top

Feature: Quick Fix Accessibility

Complying with accessibility guidelines can seem like a massive undertaking. However, addressing 5 simple problems can make a huge difference to your sites accessibility. We discuss these in this weeks feature

Back to top

Review: Coda

Find out why I am seriously considering abandoning the code editor I have been using for over a decade in favour of Coda for the mac in this weeks review.

Back to top

Listeners feedback:

Team working environment

Gareth writes: I have been “promoted” from a support desk position for an Oracle based financial system to the company’s single web designer. We are not by trade a dedicated web design firm and as such i am having to develop procedures and polices by myself. I have been reasonably successful in this thanks in large part to your podcast, which has in turn led me to blogs and websites such as A List Apart, Sitepoint, Headscape (obvious one that) and many more that have also helped me.

Due to the sheer volume of work that is coming in this year we have found ourselves needing to recruit an additional web designer. At the moment i have all of my work saved on my laptop and all my tasks and appointnments stored in my Outlook.

What tips can you give me in relation to creating a centralised working environment that can be used by both myself and this new person as well managing our work loads. What do Headscape do? I should probably point out that we will be office based in the sane room rather than working from home.

Why is it that Ryan our producer, keeps picking questions he knows I am not an expert on. I am a front end interface guy. What do I know about this kind of thing! Also we primarily work remotely so have a different setup anyway.

That said, I am willing to give anything a go and ignorance has never stopped me before.

Okay, if you are sitting in the same office communication is not going to be the primary problem.
However, you still may want to take a look at Basecamp. Its a great way of organising team working.

The main problems will come in the form of file sharing, backup and overwriting each others work. One thing you might want to consider is a version control system like Subversion. At Headscape we use something called Source Anywhere however this is just personal preference. These systems allow you to…

  • check out files, preventing others from overwriting them,
  • rollback to previous versions of a file,
  • branch files, allowing multiple versions of the same file.

However, for some this might be an over the top solution. The biggest danger is overwriting files. There are a number of code editors which prevent this including Dreamweaver and Coda. This just leaves the problem of shared storage and backup. You could solve these problems separately. However, personally I like the look of Drobo. Its not that cheap ($499 plus the drives) but it provides an incredibly expandable solution that minimises the problem of data loss.

No doubt my ignorance is showing in this question so if you have better advice please post it on the show notes.

Internal Search

Stephanie writes: I have a question regarding internal site search. I am wondering what types of solutions there might be for enabling a site search when one does not have a development team to turn to. All I can come up with is Google custom search and it has some drawbacks (ad serving in the free edition and blog posts do not get indexed right away).

Love the new site!

So you want to add search to your site eh? If you’re using a popular engine such as MovableType, then there will be a built in search, so let’s assume you’re not. If you’ve just built your site using HTML, or aren’t happy with the results of your CMS’s out-of-the-box search, you still have options.

If PHP is your game, you can install a spider on your server, such as Sphider. This will index your site and provide a very customisable solution, that doesn’t send queries off to a third party server. If you’re looking after a large site, with huge numbers of pages and documents to index, you might consider a program called SearchBlox. SearchBlox is expensive, but powerful. It runs as a java based web app on your server, with many fine tuning features that will keep even the most fastidious of clients happy.

If it’s a free, third party service you’re after then you might consider Atomz or Google. Atomz is easy to setup, free and customisable but does include text based ads, similar to Google. The indexing schedule is regular, but only weekly. Google is an established name in search, but also has the downside of irregular indexing and ad supported results. It is of course possible to spend a little extra money to remove these, with Google Site Search

There is however an interesting alternative service called JRank. JRank don’t stuff adverts into the results, they only require that you provide a link to their website on the page that you set as the index for crawling. They also have a REST API, so without much work you can integrate the results in your website, as the PHP code below demonstrates:

<?php
$jrank = file_get_contents('http://www.jrank.org/api/search/v2.xml?key=[API key]&q=[query]');
$xml = new SimpleXMLElement($jrank);
$result = $xml->xpath('//entries/entry');
while(list( , $node) = each($result)) {
echo '<h3>' . $node->title . '</h3>';
echo '<p>' . $node->content . '</p>';
echo '<a href=”' . $node->url . '”>' . $node->url . '</a>';
}
?>

An interesting point in the question was that Google doesn’t index blog posts right away. In my experience, search is used to find old articles or those that can’t easily be found by tags or menus. Newer articles should be easy to find from the home-page of the site, particularly if it is a blog site. If powerful search is required, then you’re going have to put up with the ads, or fork out for a bespoke solution.

Back to top

104. Give us your money

On this week’s show: Paul shares 10 tips for getting designs signed off. Marcus looks at how to present to a prospective client and Michael Slater introduces us to Ruby on Rails.

Play

Download this show.

Launch our podcast player

News and events | Marcus: How to present to a prospective client | Paul: 10 tips for design sign off | Michael Slater talks about Ruby on Rails | Question of the week

Housekeeping

All change

I have a bit of housekeeping news before we go any further with the show. I am changing things around a bit with my podcasting line up. After a chat with Dan Oliver from .net magazine we have decided that I will no longer be doing their show. They have some great plans for it in the future but it just didn’t make sense for me to keep doing two very similar shows. Before people start emailing, no we haven’t had a falling out and I still love Dan very much… if only I wasn’t already married.

The good news is that this allows me to introduce some more guests onto this show and bring in a bit more discussion. In order to accommodate this we will be having just one feature section each week instead of my bit and Marcus bit. Some weeks I will do it and other weeks it will be Marcus.

The final aspect of all of this is that we are going to start recording the show together rather than over skype. This should deal with the audio problems we have been having as well as making for a much better dynamic.

Christmas giving

I thought it might be nice to use the mighty power of the Boagworld listeners to raise a bit of money this Christmas and wondered if you might all be so kind as to help.

We have been doing this show for well over 2 years and have never charged or done much in the way of advertising. We are therefore wondering if just this once you would dip your hands into your pockets and give a bit of cash.

I want to raise some money for a charity I have been personally supporting for a while. A friend I grew up with now runs a school and orphanage in a very rural part of India. The kids they work with have far from the best background and the school is the only hope they have of breaking out of their circumstances.

I wont emotionally blackmail you with sob stories (because I know you are hardened cynical geeks) but simply ask that you give me some cash in return for the two years of free shows I have given you.

Because I am unorganised and only thought of this a couple of days ago we are going to simply use my paypal account to collect donations. I will then pass the money on to the charity. So to give a donation just use the bottom below (be warned its not the most intuitive system ever but you are all clever chaps. I am sure you will work it out).

Give to the Boagworld Christmas Appeal

News and events

24 ways is back

My first story of the day is actually 12 days late because it is the re-launch of 24 ways. In case you haven’t come across 24 ways before I should explain that it is an advent calendar for web designers.

Each day in December leading up to Christmas they publish an article written by some of the leading lights in web design (oh yes, and me). The articles are somewhat random but also incredibly practical and hands on. Articles range from creating a never-ending background to working with online maps.

But don’t panic that you have missed the first half of advent. You can access all of the previous days. In fact you can even access the last 2 years of articles. Ample to keep you amused while we are away over Christmas.

Tips for development and design

If 24 ways isn’t enough to quench your thirst for knowledge then let’s throw two more articles into the mix both of which provide some top tips.

The first is for all you developers out there. The guys at Blue Flavor have shared their top 10 tips for a successful development project. The article includes great advice like, always create a functional spec and talk to your clients. Interestingly one of the suggestions is to use a version control system. This is also a tip in our second article which is aimed instead at designers.

Jina Bolton has written an interesting article for Think Vitamin entitled “creating sexy stylesheets“. Like the blue flavor article this one lists 10 tips. However this time they are for producing better stylesheets. Now, although I would argue that nothing makes CSS sexy this is still a very useful list. The tips for organising your CSS file and building your own framework are particularly good.

So if you are into top 10 lists then you should be happy this week whether you are a designer or a developer.

24 wayswhich post articles on web design over the Christmas period. Well, I was asked to contribute to the site so I wrote an article entitled 10 tips for design sign off. Although some of the tips have been covered on the show I thought generally it would make a good segment for the show.

The problem is that getting design sign off can be one of the most challenging parts of the web design process. It can prove time consuming, demoralizing and if you are not careful can lead to a dissatisfied client. What is more you can end up with a design that you are ashamed to include in your portfolio.

How then can you ensure that the design you produce is the one that gets built? How can you get the client to sign off on your design? (Question of the week

What tips do you have for getting designs signed off?

 

Podcast 26: Technical considerations

In this weeks show Paul and Marcus are joined by Chris and Mark to discuss the more technical aspects of launching a new website.

Play

Download this show.

What are the technical considerations you need to take into account before building your new website? Understanding things like the technical constraints faced by your users or inherent in your hosting environment, helps to define the functionality of your site.

Techno buster: RSS

Every site you visit these days seems to have those little orange RSS buttons. But what is RSS, how does it work and why should you consider adding an RSS feed to your site? Listen to this week’s podcast to find out more or check out the following post on web feeds.

Main feature: Technical considerations

In this weeks show we look at three main areas for consideration when developing a new site:

  • The technical constraints faced by users
  • The hosting environment where your website will reside
  • The integration with other business systems

Technical constraints faced by users

Probably the biggest set of technical constraints imposed on us when developing a website comes from our target audience. Factors like; the browsers used, available plugins, screen resolution, JavaScript support and bandwidth, all affect how a site is designed and built. Understanding the technical environment your users work in will dramatically alter the way you approach a web project.

Hosting environment

If you have an existing hosting platform for your site that cannot be altered, this will significantly change how your site is built. Different hosting environments support different sever side languages and databases. Also things like bandwidth capacity and streaming media services can impact whether you are able to provide rich media on your site or not.

Business systems

Integrating your website into existing business systems can be problematic. Whether you are endeavouring to tie your site into stock control / accounting systems, or ensuring it is compatible with legacy databases, backend integration can be a time consuming (and expensive process). However, if it is not properly addressed, poor integration can lead to data inconsistencies and a lot of re-keying of data.

Web resources: Backup and recovery

In response to a listener’s question, the team discuss tactics and tools to handle backup and recovery of websites and data. In particular, two tools were mentioned:

CVS
CVS is a file recovery and version control systems that can be used to manage changes to your site. As well as preventing one developer overwriting another’s work, it also allows you to rollback your website to any previous incarnation. However, be warned, this does not handle database recovery.

Groove
Although this is not strictly a backup or recovery tool, it is still invaluable if you have a distributed network of developers working on a project. By using peer-to-peer technology, it can ensure that all development files associated with a website are backed up over multiple locations. It has some great project management facilities too.

Home working and departing friends

Well, today sees the first member of staff to leave Headscape since setting the company up in January 2002. It is a strange experience having a member of staff leave when you are a virtual company.

I do not think I have mentioned in this blog before that Headscape is a virtual company. This is probably not the best term to use as it gives the impression we don’t do real work, but I cannot think of a better way of wording it.

The benefits of a virtual company

By virtual company, I mean we do not have a central office. Each member of staff works from home and we communicate and file share with tools such as Skype, CVS and Groove.

People are often curious about an entire company home working and ask how well it works in reality. My answer is usually that it is brilliant. From the employee perspective, you do not have to commute and you can see a lot more of your family. For example, if I were still working for IBM when I used to commute an hour and a half everyday, I would only see my 2-year-old son at weekends. As it is, I see him throughout the day whether I want to or not! As an employer, I love it because my staff tend to work the hours they would commute and generally home working is seen as a big bonus that keeps people at the company longer. Not to mention the savings made on premises.

Communication really is not a big problem. There are so many tools out there these days that help, and broadband means that even telephone conversations are now free.

The drawbacks of a virtual company

I guess the only thing you lack is some of the social aspects of working in an office. Rob leaving Headscape today has brought this into sharp focus. There was no office party, no embarrassing speeches, and not even any goodbyes really. No doubt, he will log on to MSN messenger again on Monday and we will see almost as much of him as we normally would anyway. I think the fact that 99% of my communication with Rob has been online or the telephone means that his departure will take a long time to sink in.

A friend moving on

What I do know is that from a work point of view he will be sorely missed. He joined us as a graduate and has grown to be a talented web developer who has been hard working and really got stuck into the business. He has left Headscape to shape young minds by teaching IT and web development in school. How can any company compete with the opportunity to influence the next generation?

I wish him all the best in his future career and advise any web design companies out there looking for a web developer to offer him obscene amounts of money to lure him away from his noble new calling!