Autolink Revisited

I was, unfortunately, flying between Chicago, Atlanta, and Salt Lake City when the whole Google Toolbar Autolink kafuffle occurred, and so I didn’t have a chance to comment on the phenomenon at the time. No matter. Instead, I’ll pretend that I’ve been sitting back and carefully pondering the topic, waiting for the right time to unleash my thoughts on the world. Yeah, that’s it.

Reviewing the discourse, it’s clear the arguments fall into two lines of thought:

  • The user is God: Whatever the user wants, the user should get. After all, the content is rendered on the local machine – how is what Google is doing any different from existing toolbar plugins that scrape away ads and spyware, or otherwise alter the content to enable accessibility? And isn’t anything that makes the user’s life easier, even if by altering content, a Good Thing? Cory Doctorow plays ringleader in this court.
  • The content developer is God: How dare you sully my precious content with links that I, the content provider/developer, did not deem worthy of inclusion! I demand the right to opt out! Chief adherent to this line of thinking: Dave Winer.

There were others who started on one side of the argument, then flip-flopped and ended up on the other side. In the end, reasonable suggestions from Robert Scoble, Tim Bray, and others pointed to the universal truth in debates such as these: the reasonable answer lies somewhere between the extremes.

I found it ironic that Dave Winer, patron saint of the user, came down on the side of content producers (or “developers”), given his historic railings against developer-centric thinking about software applications. To his credit, Cory Doctorow remained quite consistent in his vision of “user’s rights”, applying the same “right to remix” arguments he uses against DRM technologies.

In my opinion, this was not an argument so much about what the Google Toolbar currently does, but rather about what it (or other applications like it) could do in the future. If it’s acceptable to allow insertion of links, would it be OK to change existing links or add new text in such a way as to alter the original vision or intent of the author? Would we effectively be allowing Google to put words in authors’ mouths? Much of this ignored, I believe, that Google has no real interest in what any particular author has to say. They’re simply interested in helping people find what they’re looking for, and automatically linking ISBNs to Amazon.com and addresses to Google Maps seems like a good mechanism.

While I agree with Scoble’s assertion that there need to be some rules about linking in order to prevent competing toolbars from engaging in “link fights”, I would argue against his assertion that autolinking could result in a “wall-garden”. As things stand right now, services like Yahoo Maps, Mapquest, and others have become deeply embedded in the web as more and more people link to them – but what happens if something new (like Google Maps) comes along that a reader prefers? Tough! Those links still point to the services dictated by the page author.

Imagine instead a toolbar that would allow you to select your favorite application for “classes” of links (links to movies, links to books, links to maps, etc.) – this would help dislodge obsolete services from the web and migrate users transparently to newer, more innovative services faster. In an ideal world, content producers could signal to a browser, “This is a movie link to ‘Forrest Gump’ – if the user clicks this link, take them to their preferred vendor for movies (Amazon.com, Netflix, Blockbuster, whatever), or if none is available, take them to Amazon.com. Oh, and here are all my affiliate IDs for these services, in case the user buys something”.
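To make the idea concrete, here is a minimal sketch of how such a toolbar might resolve a “classed” link. Everything in it is an assumption for illustration: the PROVIDERS table, the resolve_link function, the placeholder .example hosts, and the q and tag query parameters are all hypothetical, and none of them reflect how the Google Toolbar or any real vendor actually works.

```python
from urllib.parse import urlencode

# Hypothetical URL templates, keyed by link class and then by provider.
# The hosts are placeholders, not the real services' URL schemes.
PROVIDERS = {
    "movie": {
        "amazon": "https://amazon.example/search",
        "netflix": "https://netflix.example/search",
    },
    "map": {
        "google-maps": "https://maps.example/g",
        "mapquest": "https://maps.example/mq",
    },
}


def resolve_link(link_class, query, user_prefs, author_default, affiliate_ids=None):
    """Build a URL for the reader's preferred provider, else the author's default."""
    available = PROVIDERS.get(link_class, {})
    provider = user_prefs.get(link_class)
    if provider not in available:
        # No usable reader preference: fall back to the author's choice.
        provider = author_default
    if provider not in available:
        raise ValueError(f"no provider {provider!r} for class {link_class!r}")
    params = {"q": query}
    # Attach the author's affiliate ID for whichever provider was chosen, if any.
    tag = (affiliate_ids or {}).get(provider)
    if tag:
        params["tag"] = tag
    return available[provider] + "?" + urlencode(params)


# A reader who prefers Netflix for movie links:
url = resolve_link("movie", "Forrest Gump", {"movie": "netflix"}, "amazon",
                   affiliate_ids={"amazon": "aff-1"})
```

The reader’s preference wins when one exists, the author’s default applies otherwise, and the author’s affiliate ID travels with whichever vendor ends up being used.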

Unfortunately, this would require authors to change their content to provide this meta-data. Would the web be willing to trade its ideals of absolute author rights for an autolinking mechanism that mapped existing URLs to a user’s preferred service provider for a given “class” of URL? I’m not sure. And, of course, this entirely ignores the other question about autolinking: is altering the author’s content in this fashion even legal?

A Question Of Copyright

Poor Martin Schwimmer has stirred up quite a hornet’s nest (Scoble’s got the running summary) with his recent post on his decision to ask Bloglines to stop aggregating his blog’s RSS feed. While many are quick to criticize, I think it’s important to stop to examine the issue a little deeper to see if there’s any validity to his concerns.

First, let’s examine Martin’s opening volley:

This website is published under a Creative Commons license that allows for non-commercial use, provided there is attribution. Commercial use and derivative works are prohibited.

It was brought to my attention that a website named Bloglines was reproducing the Trademark Blog, surrounding it with its own frame, stripping the page of my contact info.

At first, you might think this is a bit ridiculous, but let’s break down the issue by examining the site’s licensing terms.

Non-Commercial Use

This is probably a pretty valid argument – although I’m not clear whether or not Bloglines is currently making money, they certainly are an ongoing commercial entity. But it is a fine line – I seem to remember there being similar grumblings in the Open Source community, back when people started building commercial services on the back of GPL software. Actually, now that I think about it, that argument is ongoing.

Maybe it would help if we took a step back. Can we agree that if someone took Martin’s page and sold it on t-shirts, then that would be an infringement? Absolutely. And if someone offered copies of the content printed and bound? Again, a blatant violation. But what about an individual user viewing the page through a commercial aggregator located on their desktop? No way in hell is that a violation.

But once the jump is made to a server-based aggregator that provides the same functionality, the line between commercial and non-commercial purpose becomes a little less certain.

(Another question: is a web-cafe that charges for Internet access in violation if one of its users views Martin’s site?)

Martin, in a response on his blog, laments:

At least with Google’s contextual ad program, the blog creator gets some money.

True. Although Martin certainly doesn’t make any money from Google when it creates a derivative work from his blog and displays the result in its search listings on Google.com. I wonder: has Martin contacted Google to have his site removed from its search engine? Apparently not.

No Derivative Works

You have to concede this one to Martin – “no derivative works” is a pretty clear statement.

Attribution

What constitutes “attribution”? Well, if you go according to the definition provided on the Creative Commons’ Licenses Explained page:

Attribution. You let others copy, distribute, display, and perform your copyrighted work – and derivative works based upon it – but only if they give you credit.

Hmm…that doesn’t really shed much light on it, does it? What constitutes “credit”? What “contact info” would satisfy Martin’s requirements under the Creative Commons licensing scheme? A telephone number? An address? A link directly to his contact page?

How about a link to the original web site?

A quick examination of any blog in Bloglines reveals that it displays a prominent banner featuring the name of the blog as part of its user interface. And a link to the original top-level blog URL. And links to each item on the original source blog. And the following description of the blog:

The Trademark Blog from the law offices of Schwimmer and Associates

If this doesn’t satisfy the criteria of “attribution”, what will?

An Interesting Twist

Up to now, the discussion has been focused on the terms of Martin’s Creative Commons license. But there’s an interesting twist: Martin’s RSS feed doesn’t actually contain his Creative Commons license! That’s right, if you examine the raw XML, you’ll find a “copyright” element with the contents:

Copyright 2005 Martin Schwimmer

Hmm. That’s interesting – given that every other page on Martin’s site contains an embedded link to his CC license, would I be right in thinking that the RSS is not subject at all to its licensing terms? Could it be that his feed is, gasp, protected using plain ol’ regular copyright? In that case, it would appear that all bets are off.
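For anyone who wants to run the same check on a feed, here is a rough sketch using Python’s standard library. It assumes an RSS 2.0 feed and assumes that a Creative Commons license, if declared, would appear as a license element from the creativeCommons RSS module. The feed_rights function and the SAMPLE document are illustrative: the sample is a cut-down stand-in built from the values quoted above, not Martin’s actual feed.

```python
import xml.etree.ElementTree as ET

# Namespace of the creativeCommons RSS module (assumed to be how a feed
# would declare its CC license).
CC_NS = "http://backend.userland.com/creativeCommonsRssModule"


def feed_rights(rss_text):
    """Return (copyright text, CC license URL) for an RSS 2.0 channel.

    Either value is None when the corresponding element is absent.
    """
    channel = ET.fromstring(rss_text).find("channel")
    copyright_text = channel.findtext("copyright")
    license_url = channel.findtext(f"{{{CC_NS}}}license")
    return copyright_text, license_url


# Illustrative stand-in for a feed with a copyright notice but no CC license.
SAMPLE = """<rss version="2.0"><channel>
<title>The Trademark Blog</title>
<copyright>Copyright 2005 Martin Schwimmer</copyright>
</channel></rss>"""

rights = feed_rights(SAMPLE)
```

On a feed like the sample, the copyright text comes back but the license slot is empty, which is exactly the situation described above.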

While I certainly don’t wish this to be the case, you have to concede that Martin is following the letter of the license he stipulated, both for the original page and for the RSS feed. We may not like the outcome, or the fact that such an attitude will balkanize useful applications and innovation on the web, but I don’t think you can argue the facts – after all, trademarks and copyright are his beat.

Implications

While it may not have been his intent, I think Martin’s actions have highlighted a legitimate concern for both content creators and aggregators. The proliferation of aggregation services is driven by an age-old secret to business: steal from the commons. The web is being viewed by web-based businesses as a wonderful resource for building value-added services, but it’ll only take one really well-funded lawsuit to bring down this house of cards. Web-based services need to think about embedding CC recognition into server-based applications to protect themselves from this possibility.

For us, the blog community, we need to remember that the purpose of the Creative Commons license is to allow the creator to exert control over the fruits of their labor. While we might want everyone to choose the least restrictive CC licensing terms, if we choose to blatantly disregard those licensing terms when we don’t agree with them (or dog-pile on the creator), we’re undermining the viability of the licensing scheme as a whole. To that end, perhaps we should be browbeating web-based services, such as Bloglines, Rojo, Feedster, Feedburner, and PubSub, to incorporate CC license recognition intelligence into their services and use it to filter out content that hasn’t been properly licensed for their purposes. Doing so would serve two purposes: protect these services from future infringement litigation, and further cement the Creative Commons licensing scheme’s reputation as a legitimate mechanism for creators to exert control over their works. Indirectly, such action may also illustrate to copyright owners like Martin the value of participating in these services and choosing a less-restrictive CC license, enabling the creation of technologies that benefit not only readers but also content creators themselves.
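As a sketch of what “CC license recognition intelligence” might look like at its simplest, the snippet below decides whether a service should carry a feed based on the license URL the feed declares. It assumes license URLs of the form http://creativecommons.org/licenses/by-nc-nd/2.0/, where the path segment lists the terms. The license_terms and may_aggregate functions and their parameters are hypothetical, and the policy is a deliberately crude reading of the terms discussed above, not a legal test.

```python
from urllib.parse import urlparse

CC_HOSTS = ("creativecommons.org", "www.creativecommons.org")


def license_terms(license_url):
    """Return the set of CC terms (e.g. {"by", "nc", "nd"}), or None if unrecognized."""
    if not license_url:
        return None
    parsed = urlparse(license_url)
    parts = [p for p in parsed.path.split("/") if p]
    if parsed.netloc.lower() not in CC_HOSTS or len(parts) < 2 or parts[0] != "licenses":
        return None
    return set(parts[1].split("-"))


def may_aggregate(license_url, commercial=True, modifies_content=False):
    """Conservative policy: carry a feed only if its declared license allows it."""
    terms = license_terms(license_url)
    if terms is None:
        return False  # no recognizable CC license: treat as plain copyright and skip
    if commercial and "nc" in terms:
        return False  # non-commercial only, and this service is commercial
    if modifies_content and "nd" in terms:
        return False  # no derivative works, and this service alters the content
    return True


# A commercial aggregator looking at a feed licensed by-nc-nd:
allowed = may_aggregate("http://creativecommons.org/licenses/by-nc-nd/2.0/")
```

Treating a missing or unrecognized license as “do not carry” is the cautious default here; a real service would have to decide for itself how to handle feeds that say nothing at all.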

I, for one, would like to thank Martin for the attention he’s brought to this issue. While it may not have been his intention to bring about this level of discussion, I think it’s been valuable nonetheless.