Information Architecture: Past, Present and Future
by James Tauber

Originally written in 2009 as part of Site Sprint II

Here are some thoughts on the information elements and architecture of my site in the past and how it should be structured moving forward. After a tour of the history of the site, I talk about the types of information I'll likely want to support on the new site and ways of organizing them.

What has the site historically contained?

First Age (1996-1998)

Early on, jtauber.com contained brief biographical information and a list of interests. The idea was that if someone saw me on a mailing list or USENET group and wanted to know a bit more about me and whether we shared common interests, they could visit the one-page site.

It was also a place I could put photos, files, etc. for friends to download.

Then I started writing some informational articles (mostly about SGML at first) and hosted some projects (like the Java Class Warehouse, a collection of reusable Java classes written by others) and mailing list archives.

Second Age (1999-2001)

I started doing speaking engagements on XML in 1997 and that really started to pick up by 1999. I was also doing my first open source projects that had other users. So jtauber.com became: brief biographical information, upcoming speaking engagements (also slides from previous ones), interests and open source projects.

The open source projects were initially hosted on the site but then started to move to places like SourceForge.

I also had a bunch of genealogical information up on the site.

Third Age (2002-2003)

By 2002, I'd stopped doing speaking engagements. Most of my open source projects were on SourceForge although I had some small scripts hosted on jtauber.com itself. The home page had biographical information and a list of non-code projects, all external sites. It also linked to a separate page for open source software and interests.

The software page just had links to SourceForge or local scripts.

The interests page had a hierarchy of areas of interest, some of the nodes of which were links to specific pages talking about the particulars of my interest. For example, I had a /music/ section and within that /music/prokofiev/ that said a little about my love for the music of Prokofiev and linked to a page /music/prokofiev/toccata/ dedicated to his Op. 11 Toccata.

I was using Advogato as a diary of occasional software development musings—a forerunner to a blog.

Fourth Age (2003-2004)

In 2003, rather than continuing with essentially static HTML, I wrote a basic CMS called Leonardo that let me add new topic pages more easily and do so via a web interface.

The home page contained biographical information and a list of links to open source software projects, other non-OSS projects and interests. I had a small number of informational articles on topics within my areas of interest.

The site was essentially a password-protected wiki at first. I later added a blog which was nothing more than a date-based wiki page.

I served up some static files such as old slide presentations, samples of my music compositions, etc.

Fifth Age (2004-2008)

My blogging really took off and most of the activity on the site was in the form of blog posts. Some posts were short announcements or musings. Others were informational articles or series of articles running for years.

I continued to develop Leonardo and ultimately a framework started to emerge out of it. I ended up spending more of my time on framework features than improving my site and ultimately did a quick-and-dirty port of Leonardo to Django. Part of this sprint is rewriting that code :-)

Besides the fact that Leonardo was a wiki that happened to also be a blog, another differentiator that I still think is quite innovative is that categories are really just other pages. In other words, when I put a blog post in the "django" category / tag, it actually just links it to the general Django page on my site. This also means you can go to a page about one of my projects or interests and see the blog posts about that project or interest, not mediated through a separate tag or category: the project/interest is the tag/category.
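The idea that a category is just another page can be sketched in a few lines of Python. This is only an illustration and not Leonardo's actual code: the Page class, the file_under method and the example slugs and titles are all hypothetical names made up for the sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field


# eq=False keeps identity-based comparison, which avoids recursing
# through the cyclic topic/related links when pages are compared.
@dataclass(eq=False)
class Page:
    """Any page on the site: a topic, a project or a blog post (hypothetical model)."""

    slug: str
    title: str
    # Pages this page is filed under; each "tag" is itself a Page.
    topics: list[Page] = field(default_factory=list)
    # Pages that have been filed under this one (the reverse links).
    related: list[Page] = field(default_factory=list)

    def file_under(self, topic: Page) -> None:
        """Tag this page by linking it to an ordinary page, not a separate tag object."""
        self.topics.append(topic)
        topic.related.append(self)


# Hypothetical example data.
django = Page("django", "Django")
post = Page("blog/leonardo-on-django", "Porting Leonardo to Django")
post.file_under(django)

# The Django page can now list the posts about it directly.
print([p.title for p in django.related])
```

Because the topic is a full page, it can carry its own description and links as well as the list of posts filed under it, which a bare tag string could not.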

Sixth Age (2008-2009)

I started to use Twitter a lot more and my blogging pretty much dried up. In November 2008, I forced myself to write 51 blog posts as part of the blog version of NaNoWriMo but I've struggled to write one tenth of that in the year since and many of those were just announcements.

It's partly this shift over the last few years (or indeed thirteen years) that has got me rethinking what should be on my site.

The site also increasingly had links to my presence elsewhere, not only on Twitter but also Facebook, LinkedIn and GitHub.

Summary

Here are some of the various components my site has had over the last thirteen years:

biographical information
contact information and links to external presence
interests (organized into a taxonomy with some description of my particular interests)
affiliations (with links and in some cases a bit of history)
open source software projects (links, descriptions and sometimes everything)
short code snippets
web site projects (links and descriptions)
private files to make available for others to download
MIDI files or recordings of music compositions
family history information
informational articles
speaking and other event information
blog posts

That is not to say the new site needs to cover all of that. There are plenty of external services that provide hosting of this sort of thing. I have, however, long been interested in the datalibre concept of hosting your own data and simply letting others aggregate it.

A taxonomy of blog post types

I think one of the keys to the information architecture of my new site is going to be separating out the different things I've used my blog for over the last 5 years. As a precursor to that, here's an initial attempt at some of the sorts of things I've blogged about:

administrative announcements about the site
reflections on the site, stats, etc. (kind of like this page)
year in review (New Year or birthday; achievements in past year, future goals)
personal announcements (life event or maybe just some external reference like an article or interview)
project announcements (release, milestone, progress update, etc.); the project could be code, a website, a film, a piece of music, my linguistic research
results of some (usually computational) experiment
blogging an event (e.g. conference: interesting talks, who I met, etc.)
public congratulations of others
technical observations / cool little tricks
news with some analysis
something funny or annoying that just happened to me
quick question / show of hands
I'm at X / going to X so who wants to meet up?
linguistic observations
from the archives (something of possible interest I've dug up from my past)
other scientific observations / questions (less developed than informational article below)
book, film or music review
code snippet to do something fun (implement an algorithm, demonstrate a concept, etc)
tech problem / troubleshooting / request for help (sometimes with solution!)
musings about technology (especially Web or programming)
informational article or series—style/tone: sometimes conveying orthodoxy, sometimes musing on my own slant, sometimes documenting my own path to understanding; topic: music theory, mathematics, science, programming, filmmaking, photography, linguistics, economics

Some posts overlap multiple types and some are now far better suited to Twitter but that leads on to the next topic.

Blogging, Twitter and something in between

I don't want to get too deep into the psychology of why I stopped blogging other than to suggest that when you don't blog for a while, it raises the bar of what you break your blogging drought with. There was one time I didn't blog for a couple of months and the next time I blogged, a friend said "I've waited months for a blog post and you post THAT!".

So to get back to putting more content on my site, I need to give myself permission to do shorter, less well-thought-out posts and not feel that every post has to be an epic article. Looking at the taxonomy above, it's clear that in the past I have made blog posts considerably shorter than informational articles.

I think there is value in distinguishing short-form and long-form posts and making enough of a separation that there is less pressure to always do long-form posts. But as well as the dimension of length, I think it also makes a lot of sense to distinguish posts which are ephemeral (or at least quite specific to the time in which they were made) from those that are, for the most part, timeless.

Taking these two dimensions into account, we can see that short, ephemeral posts and things like announcements and quick questions can be done on Twitter or on a different part of the site from more in-depth articles. And even the informational articles might be better suited to the original wiki pages that Leonardo had before support for blogs was introduced.
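As a rough sketch, the two dimensions could be modelled like this in Python. All of the names here (Length, Lifespan, Post, suggested_home) are hypothetical. Only two of the four combinations come from the discussion above (short and ephemeral goes to Twitter or a separate part of the site; long and timeless goes to a wiki-style page); the other two destinations are placeholder assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Length(Enum):
    SHORT = "short"
    LONG = "long"


class Lifespan(Enum):
    EPHEMERAL = "ephemeral"
    TIMELESS = "timeless"


@dataclass(frozen=True)
class Post:
    title: str
    length: Length
    lifespan: Lifespan


def suggested_home(post: Post) -> str:
    """Map the two dimensions to a place to publish (illustrative only)."""
    if post.lifespan is Lifespan.EPHEMERAL:
        if post.length is Length.SHORT:
            # From the text: announcements and quick questions.
            return "Twitter or a separate part of the site"
        # Assumption: long but time-bound pieces stay as dated posts.
        return "dated post"
    if post.length is Length.LONG:
        # From the text: informational articles suit wiki-style pages.
        return "wiki-style article page"
    # Assumption: short but timeless notes sit with musings.
    return "musings"


question = Post("Quick question / show of hands", Length.SHORT, Lifespan.EPHEMERAL)
article = Post("An informational article", Length.LONG, Lifespan.TIMELESS)
print(suggested_home(question))
print(suggested_home(article))
```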

For a time, blogging was a hammer so everything looked like a blog post to me. After this site sprint, I'd like my site to make more of a distinction between different kinds of information.

A tentative approach I'm adopting for this sprint is to make the following distinctions:

announcements
musings, thoughts and observations
informational articles

I'm still giving more thought to whether these three categories are enough (should reviews be separate? what about questions?) and exactly how all the blog post types listed in the previous section might fit in here (excepting the ones that will be exclusively on a site like Twitter).

There is also the question of whether there are things I'm using Twitter for that could be done on the site (perhaps in addition to, rather than instead of). Useful or interesting links might be an example of that. I've never posted bare links on my blog before but it might make sense to do so elsewhere on the site, perhaps in the context of what the link relates to.

The broader site and a focus on projects and interests

The goal of the site sprint isn't just to rethink my blog and break it apart in some act of balkanization. It is also to rethink the overall structure of the site and the parts which aren't just time-based posts.

One way I'm considering doing that is to think about the site top-down in terms of projects I'm working on, interests I have and how various announcements, questions, musings, links, articles, etc. actually relate to the project or interest.

Blogging has, in a number of cases, been a bad way to make available stuff I'm working on (at least to the extent that it's the only way I'm making it available). A good example is my Dr Horrible covers which I posted a few at a time on my blog. That made it difficult to have a single page where someone could go to get all of them. If the Dr Horrible Covers project was the primary object and the blog posts were viewed merely as announcements of new releases to the project, it would have been easier for me to manage and easier for site visitors to get to what they wanted.
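One way to picture the project as the primary object is the following Python sketch. It is an assumption about how such a structure might look, not a description of the site's code: Project, Release, add_release and index are hypothetical names, and the release titles and dates are placeholders.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Release:
    title: str
    released: date


@dataclass
class Project:
    """The primary object; announcements are derived from its releases."""

    name: str
    releases: list[Release] = field(default_factory=list)

    def add_release(self, title: str, released: date) -> str:
        """Record a release and return the text of its announcement."""
        self.releases.append(Release(title, released))
        return f"New in {self.name}: {title}"

    def index(self) -> list[str]:
        """Everything in the project on one page, oldest first."""
        return [r.title for r in sorted(self.releases, key=lambda r: r.released)]


# Placeholder titles and dates, for illustration only.
covers = Project("Dr Horrible Covers")
second = covers.add_release("Cover B", date(2008, 9, 1))
first = covers.add_release("Cover A", date(2008, 8, 1))

print(second)          # the announcement for one release
print(covers.index())  # the single page listing all of them
```

The announcement is just a by-product of adding to the project, so the complete list is always available from the project itself rather than having to be reassembled from scattered posts.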

I'll update this page with more on that soon.

