Links on the web are brittle. A way to make them more robust over time is to decorate them.
There are really two problems:
- Link rot: Following a link yields a "404 Not Found" error message.
- Content drift: The content at the end of the link changes over time,
possibly to the point where it loses all similarity with the originally linked content.
To address these problems, those who care about link robustness resort to
the following strategy:
- When linking to a web page, a snapshot of the state of the page at linking time is created
in a web archive. Several web archives provide on-demand snapshot functionality,
including the Internet Archive and archive.today.
- With the snapshot created, rather than linking to the
original web page, a link to the snapshot is put in place.
For example, I am writing this web page on January 21 2015, and I want to link to
http://www.w3.org/. That's the W3C's home page, and it changes rather frequently.
In order for future readers of my page to see the same W3C content that I saw when linking,
I create a snapshot, say in the archive.today web archive.
That snapshot has its own URI in the archive
and, rather than linking to http://www.w3.org/,
I link to the snapshot.
While the creation of the snapshot is definitely an essential step in the right direction,
there are problems with the linking approach:
- Linking to the snapshot
assumes that the archive.today web archive will exist forever.
Unfortunately, there are already plenty of indications that web archives
do not have eternal life either. If the web archive in which I created the
snapshot suffers a temporary glitch, moves its content to another web location,
or ceases to exist,
visiting the snapshot becomes impossible.
- When linking to the snapshot,
the URI of the W3C's page,
http://www.w3.org/, is lost.
As a result, future readers of my page cannot visit the W3C page
to see its evolved state.
Link decoration is a way to address these problems and to increase the chances
that links will lead to meaningful content, even a long time
after they were put in place. In order to maximize link robustness, the following information should be available,
in a machine-actionable manner, for a link:
- The URI of the snapshot, in our example the URI of the archive.today snapshot.
- The URI of the original resource, in our example http://www.w3.org/.
- The datetime of linking, in our example
January 21 2015.
The latter two information elements can be used to automatically find snapshots in other web archives in case
archive.today's service is interrupted, and the snapshot
becomes inaccessible as a result.
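One way to act on the original URI and the linking datetime is the Memento protocol (RFC 7089), in which a client asks a TimeGate for the snapshot of an original URI that is closest to a given datetime. The sketch below only builds such a request and does not send it. The aggregator endpoint and the function name timegate_request are assumptions made for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Assumed TimeGate endpoint of a Memento aggregator; substitute the one you use.
TIMEGATE = "http://timetravel.mementoweb.org/timegate/"


def timegate_request(original_uri: str, linking_date: str) -> tuple[str, dict[str, str]]:
    """Return (url, headers) asking for the snapshot closest to linking_date.

    linking_date is an ISO date such as "2015-01-21" (hypothetical helper).
    """
    when = datetime.strptime(linking_date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    # Memento's Accept-Datetime header carries an RFC 1123 date expressed in GMT.
    headers = {"Accept-Datetime": format_datetime(when, usegmt=True)}
    return TIMEGATE + original_uri, headers


url, headers = timegate_request("http://www.w3.org/", "2015-01-21")
print(url)
print(headers["Accept-Datetime"])
```

The returned pair can be handed to any HTTP client. A TimeGate normally answers with a redirect to the best-matching snapshot, which may live in a different archive than the one used at linking time.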
Link decorations are conveyed in a manner that leverages
HTML5's custom data attributes. Using that approach,
this decorated link to the W3C home page
is expressed as follows:
<a href="http://www.w3.org/" data-versionurl="…" data-versiondate="2015-01-21">this decorated link to the W3C home page</a>
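Because the decoration lives in ordinary attributes, it can be read back mechanically. The following is a minimal sketch that uses Python's standard html.parser to collect decorated links. Only data-versiondate appears in the text above. The sketch assumes that href holds the original URI and that the snapshot URI sits in an attribute named data-versionurl. The class and function names are hypothetical, and the snapshot URI in the sample is a placeholder.

```python
from html.parser import HTMLParser


class DecoratedLinkParser(HTMLParser):
    """Collect <a> elements that carry a data-versiondate decoration."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        if "data-versiondate" in attributes:
            self.links.append({
                # Assumption: href holds the URI of the original resource.
                "original": attributes.get("href"),
                # Assumption: data-versionurl names the snapshot URI.
                "snapshot": attributes.get("data-versionurl"),
                "versiondate": attributes.get("data-versiondate"),
            })


def extract_decorated_links(html):
    """Return one dict per decorated link found in the HTML string."""
    parser = DecoratedLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


# The snapshot URI below is a placeholder, not a real archive location.
sample = (
    '<a href="http://www.w3.org/" '
    'data-versionurl="https://archive.example/snapshot" '
    'data-versiondate="2015-01-21">this decorated link to the W3C home page</a>'
)
print(extract_decorated_links(sample))
```

Links without a data-versiondate attribute are skipped, so undecorated anchors on the same page do not appear in the result.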
As can be seen by clicking the arrow that appears next to the above link,
link decorations can be made actionable, for example, by offering the snapshot
if the original link no longer works or does not yield the expected information. Note that
Robust Links can be generated not only from link decorations but also from a page's creation and/or modification dates.
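The fallback idea can be sketched as a small policy function: try the original URI first and fall back to the snapshot when the original no longer resolves. This is an illustrative sketch, not the behavior of any particular Robust Links implementation. The name choose_target and the injected status_of callable are assumptions, and the snapshot URI is a placeholder.

```python
from typing import Callable, Optional


def choose_target(original: str, snapshot: Optional[str],
                  status_of: Callable[[str], int]) -> str:
    """Return the original URI if it still resolves, else the snapshot URI.

    status_of is a caller-supplied function mapping a URI to an HTTP status
    code, so the policy can be exercised without touching the network.
    """
    try:
        status = status_of(original)
    except OSError:
        # A network-level failure is treated like a broken link.
        status = 0
    if 200 <= status < 300 or snapshot is None:
        return original
    return snapshot


# Simulated status lookup: the original now answers "404 Not Found".
statuses = {"http://www.w3.org/": 404}
print(choose_target("http://www.w3.org/",
                    "https://archive.example/snapshot",
                    statuses.__getitem__))
```

A status check only detects link rot. Content drift still returns a successful response, so deciding that a page "does not yield the expected information" is left to the reader, who can then choose the snapshot manually.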
Details are in the Link Decoration
Insight into the bigger picture that motivates the Robust Links work is provided in the slide deck below, which
addresses the creation of pockets of persistence on the web.