url which specifies the URL of the Robust Link. We can ask the API to create a Robust Link for https://abcnews.go.com by issuing the curl command below. The resulting JSON contains our Robust Links in the keys memento_url_as_href and original_url_as_href.
curl "https://robustlinks.mementoweb.org/api/?url=https%3A%2F%2Fabcnews.go.com"
{
"anchor_text": null,
"api_version": "0.8.1",
"data-originalurl": "https://abcnews.go.com",
"data-versiondate": "2020-06-15",
"data-versionurl": "https://archive.li/wip/hWZdd",
"request_url": "https://abcnews.go.com",
"request_url_resource_type": "original-resource",
"robust_links_html": {
"memento_url_as_href": "<a href=\"https://archive.li/wip/hWZdd\"\ndata-originalurl=\"https://abcnews.go.com\"\ndata-versiondate=\"2020-06-15\">https://archive.li/wip/hWZdd</a>",
"original_url_as_href": "<a href=\"https://abcnews.go.com\"\ndata-versionurl=\"https://archive.li/wip/hWZdd\"\ndata-versiondate=\"2020-06-15\">https://abcnews.go.com</a>"
}
}
original_url_as_href. Likewise, if the API user wants their readers to click the anchor text and reach the Memento by default, then they choose memento_url_as_href.
anchor_text parameter included in the query string. The example below uses cURL's -G argument to specify the GET method and the --data-urlencode argument to specify and encode the query string parameters. This combination of arguments converts all query string parameters to their URL-encoded form and appends the resulting query string to the URL.
curl -G --data-urlencode "url=https://abcnews.go.com" --data-urlencode "anchor_text=ABC News for June 15, 2020" https://robustlinks.mementoweb.org/api/
{
"anchor_text": "ABC News for June 15, 2020",
"api_version": "0.8.1",
"data-originalurl": "https://abcnews.go.com",
"data-versiondate": "2020-06-15",
"data-versionurl": "https://archive.li/wip/hWZdd",
"request_url": "https://abcnews.go.com",
"request_url_resource_type": "original-resource",
"robust_links_html": {
"memento_url_as_href": "<a href=\"https://archive.li/wip/hWZdd\"\ndata-originalurl=\"https://abcnews.go.com\"\ndata-versiondate=\"2020-06-15\">ABC News for June 15, 2020</a>",
"original_url_as_href": "<a href=\"https://abcnews.go.com\"\ndata-versionurl=\"https://archive.li/wip/hWZdd\"\ndata-versiondate=\"2020-06-15\">ABC News for June 15, 2020</a>"
}
}
On line 1, the user encodes the URL https://abcnews.go.com and the anchor text ABC News for June 15, 2020 with the --data-urlencode argument. URL encoding is necessary to ensure the clean transmission of this input.
On lines 11 and 12, the user can extract the relevant Robust Link from the output. Unlike in the previous example, each link now contains the requested anchor text.
url query string variable (e.g., https://abcnews.go.com becomes url=https%3A%2F%2Fabcnews.go.com)
anchor_text query string variable (e.g., ABC News for June 15, 2020 becomes anchor_text=ABC%20News%20for%20June%2015%2C%202020)
https://robustlinks.mementoweb.org/api/? (e.g., https://robustlinks.mementoweb.org/api/?url=https%3A%2F%2Fabcnews.go.com&anchor_text=ABC%20News%20for%20June%2015%2C%202020)
https://abcnews.go.com with the anchor text ABC News for June 15, 2020 and use the original URL (URI-R) of https://abcnews.go.com as the default link target. Where possible, these examples require no external libraries. Examples follow in Python, Ruby, and JavaScript.
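The encoding steps can be checked offline before any request is sent. The sketch below is only an illustration: build_request_url and API_ENDPOINT are names made up for this example, and it assumes that percent-encoding spaces as %20 (rather than +) is the form wanted, to match the encoded strings shown in this section.

```python
from urllib.parse import quote, urlencode

# The endpoint used throughout this page
API_ENDPOINT = "https://robustlinks.mementoweb.org/api/"


def build_request_url(url, anchor_text=None):
    """Return the API request URL with percent-encoded query string parameters.

    Hypothetical helper for illustration; it is not part of the Robust Links API.
    """
    params = {"url": url}
    if anchor_text is not None:
        params["anchor_text"] = anchor_text
    # quote_via=quote encodes spaces as %20; the default (quote_plus) would emit "+"
    return API_ENDPOINT + "?" + urlencode(params, quote_via=quote)


request_url = build_request_url("https://abcnews.go.com", "ABC News for June 15, 2020")
print(request_url)
```

Printing the URL first makes it easy to compare a client's encoding against the cURL examples before issuing a request.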
import json
import urllib.parse
import urllib.request

url = "https://abcnews.go.com/"
anchor_text = "ABC News for June 15, 2020"

# URL-encode the parameters and append them to the API endpoint
query_string = urllib.parse.urlencode({ 'anchor_text': anchor_text, 'url': url })
api_url = "https://robustlinks.mementoweb.org/api/?" + query_string

# Issue the HTTP GET request and parse the JSON response
response = urllib.request.urlopen(url=api_url)
json_data = json.loads(response.read())

print(json_data['robust_links_html']['original_url_as_href'])
This example shows how one would complete this process with Python.
The urllib.parse.urlencode call encodes the anchor text and URL as a query string.
The urllib.request.urlopen call issues the HTTP GET request with this data.
The final two statements extract the Robust Link HTML from the JSON response and print the Robust Link as a string.
This prints HTML output where the original resource URL is the value assigned to the href attribute:
<a href="https://abcnews.go.com/"
data-versionurl="https://archive.li/wip/hWZdd"
data-versiondate="2020-06-15">ABC News for June 15, 2020</a>
The browser renders this HTML as a link with a Robust Links menu to its right. From that menu, a reader can choose to visit the live web resource, the Memento, or other Mementos for this resource. If the reader clicks on the anchor text ABC News for June 15, 2020, the browser delivers them to the original resource as it currently exists.
If the user wishes to use the Memento URL as the link target, then they can replace the final print statement with the following:
print(json_data['robust_links_html']['memento_url_as_href'])
This prints HTML output where the Memento URL is the value assigned to the href attribute:
<a href="https://archive.li/wip/hWZdd"
data-originalurl="https://abcnews.go.com/"
data-versiondate="2020-06-15">ABC News for June 15, 2020</a>
The browser renders this HTML as a link with a Robust Links menu to its right. From that menu, a reader can choose to visit the live web resource, the Memento, or other Mementos for this resource. If the reader clicks on the anchor text ABC News for June 15, 2020, the browser delivers them to the Memento for this resource, https://archive.li/wip/hWZdd, captured on 2020-06-15.
Though the anchor text is the same, and the two links look the same to the reader, they deliver the reader to different destinations. This gives the page author control over which resource the reader reaches by default. In addition, the Robust Links menu gives the reader further options if they wish to visit a version other than the default.
require 'net/http'
require 'json'
require 'uri'

url = "https://abcnews.go.com/"
anchor_text = "ABC News for June 15, 2020"

# Build the API URL and attach the URL-encoded query string
api_url = URI('https://robustlinks.mementoweb.org/api/')
api_url.query = URI.encode_www_form( { :url => url, :anchor_text => anchor_text } )

# Issue the HTTP GET request and parse the JSON response
res = Net::HTTP.get_response(api_url)
json_data = JSON.parse(res.body)

puts json_data['robust_links_html']['original_url_as_href']
This example shows how one would complete this process with Ruby.
The URI.encode_www_form call encodes the URL and anchor text as the query string for an HTTP GET.
The Net::HTTP.get_response call issues the HTTP request with the full URL.
The final two statements extract the Robust Link HTML from the JSON response and print the Robust Link as a string.
This prints HTML output where the original resource URL is the value assigned to the href attribute:
<a href="https://abcnews.go.com/"
data-versionurl="https://archive.li/wip/hWZdd"
data-versiondate="2020-06-15">ABC News for June 15, 2020</a>
The browser renders this HTML as a link with a Robust Links menu to its right. From that menu, a reader can choose to visit the live web resource, the Memento, or other Mementos for this resource. If the reader clicks on the anchor text ABC News for June 15, 2020, the browser delivers them to the original resource as it currently exists.
If the developer wishes to use the Memento URL as the link target, then they can replace the final puts statement with the following:
puts json_data['robust_links_html']['memento_url_as_href']
This prints HTML output where the Memento URL is the value assigned to the href attribute:
<a href="https://archive.li/wip/hWZdd"
data-originalurl="https://abcnews.go.com/"
data-versiondate="2020-06-15">ABC News for June 15, 2020</a>
The browser renders this HTML as a link with a Robust Links menu to its right. From that menu, a reader can choose to visit the live web resource, the Memento, or other Mementos for this resource. If the reader clicks on the anchor text ABC News for June 15, 2020, the browser delivers them to the Memento for this resource, https://archive.li/wip/hWZdd, captured on 2020-06-15.
Though the anchor text is the same, and the two links look the same to the reader, they deliver the reader to different destinations. This gives the page author control over which resource the reader reaches by default. In addition, the Robust Links menu gives the reader further options if they wish to visit a version other than the default.
var url = "https://abcnews.go.com";
var anchor_text = "ABC News for June 15, 2020";
var api_url = "https://robustlinks.mementoweb.org/api/?" + "anchor_text=" + encodeURIComponent(anchor_text) + "&url=" + encodeURIComponent(url);

var client = new XMLHttpRequest();
client.open("GET", api_url, true);

// Callback: runs on each state change and handles the completed response
client.onreadystatechange = function() {
    if (this.readyState === XMLHttpRequest.DONE && this.status === 200) {
        var obj = JSON.parse( client.responseText );
        console.log( obj["robust_links_html"]["original_url_as_href"] );
    }
};

client.send();
This example shows how one would complete this process with JavaScript.
Because JavaScript is event driven and this request is asynchronous, the function assigned to client.onreadystatechange is the callback that extracts and prints the Robust Link once we have a response. Inside this event handler, the JSON.parse and console.log calls extract the Robust Link HTML from the JSON and print it as a string.
The client.send() call issues the HTTP GET request with this data. The event handler will execute once the response is received.
This prints HTML output where the original resource URL is the value assigned to the href attribute:
<a href="https://abcnews.go.com/"
data-versionurl="https://archive.li/wip/hWZdd"
data-versiondate="2020-06-15">ABC News for June 15, 2020</a>
The browser renders this HTML as a link with a Robust Links menu to its right. From that menu, a reader can choose to visit the live web resource, the Memento, or other Mementos for this resource. If the reader clicks on the anchor text ABC News for June 15, 2020, the browser delivers them to the original resource as it currently exists.
If the developer wishes to use the Memento URL as the link target, then they can replace the console.log call with the following:
console.log( obj["robust_links_html"]["memento_url_as_href"] );
This prints HTML output where the Memento URL is the value assigned to the href attribute:
<a href="https://archive.li/wip/hWZdd"
data-originalurl="https://abcnews.go.com/"
data-versiondate="2020-06-15">ABC News for June 15, 2020</a>
The browser renders this HTML as a link with a Robust Links menu to its right. From that menu, a reader can choose to visit the live web resource, the Memento, or other Mementos for this resource. If the reader clicks on the anchor text ABC News for June 15, 2020, the browser delivers them to the Memento for this resource, https://archive.li/wip/hWZdd, captured on 2020-06-15.
Though the anchor text is the same, and the two links look the same to the reader, they deliver the reader to different destinations. This gives the page author control over which resource the reader reaches by default. In addition, the Robust Links menu gives the reader further options if they wish to visit a version other than the default.
The Robust Links API accepts three inputs.
Memento-Datetime header in the HTTP response. If the submitted URL belongs to an archive that does not support the Memento Protocol, then the API cannot make this determination and treats it as an original resource. A machine client submits the URL as a value for the url query string parameter.
anchor_text query string parameter.
archive.org and archive.today are supported. By default, a web archive is randomly chosen. A machine client submits this text as a value for the archive query string parameter.
If the generation of a Robust Link via HTTP GET is successful, then the API responds with an HTTP status code of 200 and a JSON data structure, as shown in the cURL examples above. In the examples above, we focused primarily on the memento_url_as_href and original_url_as_href keys, but machine clients can also acquire other information relevant to creating Robust Links for this resource. Here we show the example again and detail the meaning of each field.
{
"anchor_text": "ABC News for June 15, 2020",
"api_version": "0.8.1",
"data-originalurl": "https://abcnews.go.com",
"data-versiondate": "2020-06-15",
"data-versionurl": "https://archive.li/wip/hWZdd",
"request_url": "https://abcnews.go.com",
"request_url_resource_type": "original-resource",
"robust_links_html": {
"memento_url_as_href": "<a href=\"https://archive.li/wip/hWZdd\"\ndata-originalurl=\"https://abcnews.go.com\"\ndata-versiondate=\"2020-06-15\">ABC News for June 15, 2020</a>",
"original_url_as_href": "<a href=\"https://abcnews.go.com\"\ndata-versionurl=\"https://archive.li/wip/hWZdd\"\ndata-versiondate=\"2020-06-15\">ABC News for June 15, 2020</a>"
}
}
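As a rough illustration of how a client might consume a response like this one, the sketch below checks the echoed request_url and then picks one of the two HTML variants. The helper name select_robust_link is hypothetical, the sample is a trimmed copy of the response above, and the strict equality check on request_url is an assumption; a real client may need to normalize URLs before comparing them.

```python
import json

# Trimmed copy of the example response shown above (raw string keeps the JSON escapes intact)
SAMPLE_RESPONSE = json.loads(r'''
{
  "anchor_text": "ABC News for June 15, 2020",
  "data-originalurl": "https://abcnews.go.com",
  "data-versiondate": "2020-06-15",
  "data-versionurl": "https://archive.li/wip/hWZdd",
  "request_url": "https://abcnews.go.com",
  "request_url_resource_type": "original-resource",
  "robust_links_html": {
    "memento_url_as_href": "<a href=\"https://archive.li/wip/hWZdd\"\ndata-originalurl=\"https://abcnews.go.com\"\ndata-versiondate=\"2020-06-15\">ABC News for June 15, 2020</a>",
    "original_url_as_href": "<a href=\"https://abcnews.go.com\"\ndata-versionurl=\"https://archive.li/wip/hWZdd\"\ndata-versiondate=\"2020-06-15\">ABC News for June 15, 2020</a>"
  }
}
''')


def select_robust_link(response, submitted_url, prefer_memento=False):
    """Return the Robust Link HTML whose href is the Memento or the original URL.

    Hypothetical helper; the equality check below is an assumption, not API behavior.
    """
    if response["request_url"] != submitted_url:
        raise ValueError("the API echoed a different URL than the one submitted")
    key = "memento_url_as_href" if prefer_memento else "original_url_as_href"
    return response["robust_links_html"][key]


print(select_robust_link(SAMPLE_RESPONSE, "https://abcnews.go.com"))
```

Passing prefer_memento=True selects the variant whose default link target is the Memento.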
The response provides the following JSON keys:
anchor_text - The anchor text submitted to the service, or null if none was submitted; this key allows the client to verify that the anchor text was correctly interpreted.
api_version - The version of the API, not of the software running it.
data-originalurl - The original resource URL. This does not necessarily match the submitted URL. If a Memento URL is submitted, then this contains the URL of the original resource that was captured, as identified by the Memento Protocol. This is the value used for the attribute of the same name in a Robust Link.
data-versiondate - The date of the Memento's capture in YYYY-mm-dd format as used by Robust Links. This is the value used for the attribute of the same name in a Robust Link.
data-versionurl - The Memento URL. This does not necessarily match the submitted URL. If an original resource URL is submitted, then this contains the URL of the Memento that the Robust Links service created. This is the value used for the attribute of the same name in a Robust Link.
request_url - The URL submitted to the API; this key allows the client to verify that the URL was correctly received.
request_url_resource_type - The resource type of the submitted URL, either memento or original-resource; if the value is original-resource, then the API is indicating to the client that it created a new Memento for the client.
robust_links_html - A key containing the HTML of the Robust Links, each in a different subkey:
memento_url_as_href - The HTML of the Robust Link with the Memento URL as the default link target.
original_url_as_href - The HTML of the Robust Link with the original resource URL as the default link target.
The Robust Links API has the following response codes if an error occurs:
400 - there was an issue with the input submitted with the request
archive parameter is not a supported web archive
403 - legal issues or other rules prevent the API from processing the input:
404 - the requested URL endpoint is something other than /api/ or the Robust Links documentation and does not exist
405 - an unsupported method was used; this API only supports the GET or POST methods; may not return a JSON object
414 - the submitted query string is too long for GET; try switching to POST
500 - all error states not accounted for elsewhere
502 - the API experienced an issue communicating with a web archive:
data-versiondate
503 - the API experienced an issue trying to retrieve the given Memento URL:
504 - the API service itself timed out; does not return a JSON object
In addition to a non-200 status code, the Robust Links API also returns JSON containing information about the error. The JSON below displays an example. Because we cannot predict everything that could go wrong, we do not list all possible errors here.
{
"friendly error": "The submitted URL 'https://' is invalid. Please check it and resubmit.",
"error data": "Traceback (most recent call last):\n File \"/data/venv/robustlinks_api/lib/python3.5/site-packages/robustlinks/api/errors.py\", line 18, in handle_errors\n response_json = function_name(input_url, data, current_app.config.domain_blocklist)\n File \"/data/venv/robustlinks_api/lib/python3.5/site-packages/robustlinks/utils.py\", line 174, in create_robustlinks_response_content\n validate_url(input_url)\n File \"/data/venv/robustlinks_api/lib/python3.5/site-packages/robustlinks/utils.py\", line 99, in validate_url\n \"The Robust Links API detected no hostname in the URL {}\".format(url)\nrequests.exceptions.InvalidURL: The Robust Links API detected no hostname in the URL https://\n",
"arguments": {
"anchor_text": "ABC News for June 15, 2020",
"archive": null
},
"error string": "InvalidURL('The Robust Links API detected no hostname in the URL https://',)",
"input URL": "https://"
}
Every error response provides the following JSON keys:
arguments - Additional data submitted besides the input URL.
input URL - The URL submitted to this service.
friendly error - A friendly error message.
error string - An error message containing more details.
error data - Additional data about the error, including, when appropriate, the stack trace leading to the error.
If you need to report issues on the Robust Links API, include this JSON in your report.
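One way a Python client might surface these error responses is sketched below. The helper names summarize_error and fetch_robust_link are made up for this illustration, and the fallback message for bodies that are not JSON is an assumption based on the note that some status codes may not return a JSON object.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

API_ENDPOINT = "https://robustlinks.mementoweb.org/api/"


def summarize_error(status, body):
    """Return a one-line summary of an error response (hypothetical helper)."""
    try:
        details = json.loads(body)
    except ValueError:
        # Some responses (e.g., 405 or 504) may not carry a JSON object
        details = None
    if not isinstance(details, dict):
        return "HTTP {}: no JSON error details returned".format(status)
    message = details.get("friendly error") or details.get("error string") or "unknown error"
    return "HTTP {}: {}".format(status, message)


def fetch_robust_link(url, anchor_text=None, archive=None):
    """Request a Robust Link and raise RuntimeError with the API's message on failure."""
    params = {"url": url}
    if anchor_text is not None:
        params["anchor_text"] = anchor_text
    if archive is not None:
        params["archive"] = archive
    request_url = API_ENDPOINT + "?" + urllib.parse.urlencode(params)
    try:
        with urllib.request.urlopen(request_url) as response:
            return json.loads(response.read())
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for non-2xx codes; its body holds the error JSON
        raise RuntimeError(summarize_error(err.code, err.read())) from err
```

Keeping the raw error JSON alongside the summary would also be reasonable, since the full JSON is what an issue report should include.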
We intend for machine clients to employ the Robust Links API to create Robust Links and return the corresponding HTML. Because of the functionality this API provides, users may be tempted to employ it for purposes we did not intend. In this section, we outline some of the use cases we want to discourage.
If a machine client submits an original resource URL, the Robust Links API creates a Memento. This convenience feature helps users quickly automate the replacement of existing links. If a machine client wishes to create new Mementos from original resources explicitly, then tools like ArchiveNow perform this action with more features and without the additional overhead.
The Robust Links API provides the original resource URL (in data-originalurl) associated with the Memento URL. It also identifies a resource as a Memento or original resource. If a machine client needs only this information, then the Robust Links API adds unnecessary overhead. Instead, a machine client should query the Memento directly by employing the Memento Protocol. The py-memento-client Python library can help developers do this in Python applications.
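For clients that prefer to avoid a dependency, a rough standard-library sketch of such a direct check follows. It assumes the Memento Protocol conventions that a Memento response carries a Memento-Datetime header and a Link header entry with rel="original". The helper names classify_resource and fetch_headers are hypothetical, and the Link parsing is deliberately simplified rather than a complete parser.

```python
import re
import urllib.request


def classify_resource(headers):
    """Return (resource_type, original_url) from a dict of HTTP response headers."""
    normalized = {name.lower(): value for name, value in headers.items()}
    # No Memento-Datetime header: treat the URL as an original resource
    if "memento-datetime" not in normalized:
        return ("original-resource", None)
    original_url = None
    # Each Link entry looks like <target>; rel="..."; other parameters
    for match in re.finditer(r"<([^>]+)>([^<]*)", normalized.get("link", "")):
        target, params = match.groups()
        rel = re.search(r'rel\s*=\s*"([^"]*)"', params)
        if rel and "original" in rel.group(1).split():
            original_url = target
            break
    return ("memento", original_url)


def fetch_headers(url):
    """Issue an HTTP HEAD request and return the response headers as a dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return dict(response.headers.items())
```

A client would call classify_resource(fetch_headers(some_url)); the two steps are kept separate so the classification logic can be exercised without network access.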