Creating an archive of my online writing, from 2002-2017

I’ve just spent an inordinate amount of time creating an archive of all my past online writing work, in particular of the tech blog I founded, ReadWriteWeb. I thought I’d outline my reasons for doing this, and why I ended up relying heavily on the Internet Archive instead of the original website sources.

With ReadWriteWeb (RWW), I had a career archive spanning approximately 3,000 posts over 9.5 years. I ran the site from April 2003 to October 2012. During that time I wrote an average of 315 posts per year (6 per week). Given that many of those posts were of medium-to-long length, and I was doing a lot of business and editorial work too, that’s a pretty good haul.

You may be wondering why I felt the need to create an archive of all my RWW posts, given that many of those posts still live on the current iteration of ReadWrite.com.

The main reason is that the current ReadWrite design has moved too far away from my original RWW brand. That’s created a feeling, at least for me, that the old content looks out of place on the current site. The latest ReadWrite designs also introduced a bunch of technical glitches to the older content. For these reasons, I chose to link to the Internet Archive copies of my posts. Also, I simply wanted to remember all those classic RWW designs from 2003-2012.

So how did the current design get so far removed from the 2003-12 designs? Good question, and probably no longer my business, since I sold the site over five years ago. But when I left RWW in October 2012, then-owner SAY Media changed the domain name to ReadWrite.com and overhauled the design. I remember that was a big change, because the underlying publishing platform also changed. I’m not sure if it was that redesign which messed up the formatting, images and more in all my old posts. If not, then it was one of the several redesigns that were done after I left. ReadWrite.com is now owned by a company that used to be called Wearable World, but (confusingly) now calls itself ReadWrite as well. In any case, the current design makes a dog’s breakfast of all the old, pre-2013 content.

Here’s an example of a post from 2011 and how it looks on ReadWrite now:

You’ll notice the lead image has disappeared. In its place is social media icon cruft, which dominates the upper part of the page. Also, I’m not too happy to be shown as a faceless avatar and listed as a “Contributing Writer,” when I was the founder of the site. It gets worse further down the page…

The remaining images in the post are now inexplicably tiny and blurry. That’s because they’re not linked to the original source images anymore. Instead there’s some weird CDN (Content Delivery Network) hot-linking funk going on here, whereby the original images on the RWW server have been swapped out for some rubbish CDN versions. To top it all off, the comments under the post have been obliterated (not just on this post, but on most of the old posts from what I can see).

It isn’t just these glitches that make the old posts a mess. The current, rather bland, design just doesn’t suit the content that was written prior to 2013 (well, IMHO anyway). It’s not just my content affected, but posts from RWW Hall of Famers like Josh Catone, Marshall Kirkpatrick, Alex Iskold, Emre Sokullu, Sarah Perez, Audrey Watters, Jolie O’Dell, Dan Rowinski, Taylor Hatmaker, Jon Mitchell and many others.

One final gripe: not all of the old content still exists on ReadWrite.com. Some posts have been deleted entirely, seemingly at random. That was another reason I used the Internet Archive as my go-to source, for both the list of all my posts and the links.

Here then is the above post in the best approximation of its original design:

Now admittedly there are broken images on this page too, because the Internet Archive doesn’t save all images. But despite that, what’s archived in the Wayback Machine is a damn sight easier on the eyes than the current abomination. It also just feels more RWW.

The point is: I like to remember the content how it originally was, so the Internet Archive was an obvious choice for linking all my old RWW content. (Incidentally, archive.is also has a good copy of RWW in 2012; but alas there’s nothing prior to that.)
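For anyone wanting to build a similar list of links, here is a minimal sketch of how a Wayback Machine link can be put together from an original post URL. It is not the process I actually used (much of that was manual), and the function name and the example post URL are hypothetical. It assumes the usual Wayback Machine link pattern of `https://web.archive.org/web/<timestamp>/<original URL>`, where the timestamp is a run of digits in `YYYYMMDDhhmmss` order and the Wayback Machine generally redirects to the capture nearest that date.

```python
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, timestamp: str = "") -> str:
    """Build a Wayback Machine link for original_url (hypothetical helper).

    timestamp is an optional string of 4 to 14 ASCII digits in
    YYYYMMDDhhmmss order, e.g. "2011" or "20110615". No network request
    is made here, so the link is not checked against real captures.
    """
    if not timestamp:
        # With no timestamp, the link points at the archive's default capture.
        return f"{WAYBACK_PREFIX}/{original_url}"
    valid = timestamp.isascii() and timestamp.isdigit() and 4 <= len(timestamp) <= 14
    if not valid:
        raise ValueError("timestamp must be 4 to 14 digits (YYYYMMDDhhmmss)")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


# Hypothetical post URL, used only to illustrate the link format.
example = wayback_url("http://www.readwriteweb.com/archives/example_post.php", "20110615")
```

Because the helper only assembles a string, each link still needs to be opened by hand (or checked some other way) to confirm that a capture exists and looks right.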

Here then is a link to my archives. These include not only the ten years of my RWW posts, but all the posts I wrote for ZDNet (2005-06), IONRSS (2005), eBook Culture (2004), some guest posts I’ve done on other sites, and even a couple of screen captures of a RWW precursor called Modern Web (2002). I’m adding posts from my current group blog, AltPlatform, as I go. Any columns I write are being actively archived too. For each of the old archives, but particularly the RWW pages, I’ve included some screenshots from the Internet Archive to give a flavour of the design at the time those pages were published.

Btw, when I say ‘archive’ in this context, I mean linking to copies of the content on the Web. But I also saved copies of the original files to my cloud storage, using a Firefox extension called DownThemAll!. Just in case. Heaven help us if the Internet Archive falters in future, because it really is an amazing resource.

I was partly inspired to do this archiving project after seeing Jon Udell’s archives. In a 2014 blog post he wrote:

“In some cases I’ve moved archives to my own personal web space. But I prefer to keep them alive in their original contexts, if possible.”

My approach differed only in that I preferred to link to the Wayback Machine, to approximate their original design as closely as possible. Because of that I didn’t need to save any content to my personal web space, since Internet Archive has almost all my old posts (the exception being some ZDNet posts, but unlike ReadWrite that company still has a decent live copy of them).

I’m pleased I’ve gone through all this effort to archive my back catalogue of online writing. I’m proud of what I’ve written, even if the odd post from back then makes me cringe now. But that’s life, isn’t it? Most of that old content still holds up, and it certainly packs a lot of memories.

45 thoughts on “Creating an archive of my online writing, from 2002-2017”

  1. I created an archive & here I explain why I chose to link to @internetarchive over the original sources like @RWW. richardmacmanus.com/2017/07/12/cre…

  5. Your content is yours: this is a central tenet of IndieWeb. It’s a philosophy that promotes ownership of your online content and it’s been labelled POSSE, an acronym for “Publish (on your) Own Site, Syndicate Elsewhere.” Some in the IndieWeb community take this to the extreme and save literally everything they do on the Web, from tweets to check-ins and much more. AltPlatform contributor Chris Aldrich is in this camp – he’s even come up with an elaborate workaround to post onto Facebook, via his own website, but without getting the familiar “mom-autolike” (when your mother likes everything you post, because…well, she’s your mom).
    I admire the POSSE philosophy, but I don’t fully agree with it. That’s because I have no desire to post everything I do online onto my website. This is partly due to my profession: I’m a professional writer, so I see my website as kind of an aggregator for all the types of writing I do. I list and promote my books there, I do the same for columns I write for media organisations and posts I write for AltPlatform, I showcase my career archive, I even write the occasional personal blog post. But…I don’t wish to tweet from my website, nor do I want to use it as a social network (that’s what Facebook is for). My website is the central place for anything related to my career, but it’s not a place for me to post a family photo or tweet about the NBA.
    That’s just my personal viewpoint, so I’m not saying my way is the best way. Just as what feed reader you use is a personal choice, what you do with your own website is up to you.
    The POSSE philosophy did, however, make me think harder about my career archive. Before a few weeks ago, I’d never thought much about archiving all the content I’ve published online – dating back to 2002. Most of that content, and virtually all of it from before 2013, lives on external sites. ReadWriteWeb was the main repository, since that was the professional blog I founded and ran from 2003 to 2012. When I looked into it, I discovered I’d written nearly 3,000 posts in nearly a decade. That’s a lot of blood, sweat and tears. Yet there was no record of most of those posts on my personal website. So I set out to rectify that and create a career archive.
    This week I completed that archiving project. I ended up archiving not just all my ReadWriteWeb posts, but all the articles I’d written for ZDNet and a few other sites. I’d like to tell you this was an easy project, but actually it was very time-consuming. Mainly because most of the original sources (ReadWriteWeb in particular) have changed significantly over the years. So not only was my old RWW content difficult to find on the current site, ReadWrite.com, but almost all of it was buggy (e.g. missing or corrupted images) and looked out of place in the site’s current, rather bland, design. To make matters worse, ReadWrite had deleted some of my old content entirely. Long story short, I used the wonderful Internet Archive to do almost all of my archiving. But that in itself was a painstaking journey – e.g. re-formatting lists of my posts, finding missing months through various hacks, correcting broken links, etc.
    You can read more about how I constructed my career archive in a post I did for my personal site.
    My point here is that although I don’t wish to post everything I do online on my personal website, I do wish to have a record of all my professional writing work. To me, IndieWeb (a.k.a. Open Web) is about taking care of the content that is important to me from a career perspective. Of course there is content I post on Facebook that is important to me too – such as when I post family photos or post about a show I went to see with my wife. But IMHO that content is native to a social network, not my website. All those likes and comments which may accumulate on a Facebook post belong on that platform. Sure the content may disappear in time, if Facebook ever goes under or (more likely) turns into a massive Virtual Reality social network. But I’m willing to live with that, because I simply don’t want that content on my personal website. It doesn’t belong there.
    Of course this is just one person’s perspective, so I’m curious how you view the POSSE philosophy – and how you want to take care of your online content over time.

  7. Creating an archive of my online writing, from 2002-2017 by Richard MacManus (richardmacmanus.com)

    I’ve just spent an inordinate amount of time creating an archive of all my past online writing work, in particular of the tech blog I founded ReadWriteWeb. I thought I’d outline my reasons for doing this, and why I ended up relying heavily on the Internet Archive instead of the original website sources.

    Journalists, take note of how Richard MacManus created an online archive of his writing work!
    I’m sure it took a tremendous amount of work given his long history of writing, but he’s now got a great archive as well as a nearly complete online portfolio of his work. If you haven’t done this or have just started out, here are some potentially useful resources to guide your thoughts.
    I’m curious how others are doing this type of online archive. Feel free to share your methods.

    Author: Chris Aldrich

    I’m a biomedical and electrical engineer with interests in information theory, complexity, evolution, genetics, signal processing, theoretical mathematics, and big history.

    I’m also a talent manager-producer-publisher in the entertainment industry with expertise in representation, distribution, finance, production, content delivery, and new media.