Back when I wrote my book, I was surprised at the lack of sophistication in the publishing industry. I had always figured that the desktop publishing revolution would have streamlined the publishing industry – I envisioned elaborate templates and tools that would enable a publisher to easily choke down text and automatically pump out a finished book. Instead, the tools provided by my publisher consisted of a Word template that rendered everything (titles, headings, body text, etc.) as monospaced Courier – all of which was later laid out in QuarkXPress by hand.
Rewind to last week at Web 2.0: Brewster Kahle presented the seductive vision of universal access to knowledge that could be achieved by scanning the entirety of the Library of Congress for a pitiful $260 million. This revelation followed the announcement of Google Print, Google’s answer to Amazon.com’s Search Inside the Book feature, which will enable users to find information in books as part of their Google search experience.
While I applaud both Google’s and Brewster’s visions, I sense a gap: Brewster’s proposal will give digital access to books from the past; Google’s service will give (limited) digital access to books from the present. All I can wonder is: who will give digital access to books in the future?
While it is obvious that digitizing the Library of Congress is a manual procedure, it might come as a surprise that Google’s efforts are equally manual. Google generously offers to scan publishers’ content, thereby making it available via the Google Print service while protecting that content. Scanning. Just like Amazon.com. By hand. This means that 75 years after an author’s death, Brewster’s organization will have to scan the author’s books by hand – books that Google will probably already have in digital form.
All of these undertakings smack of massive amounts of physical (i.e. non-digital) labour. So, if Amazon.com and Google are both doing it, why not cut out the middleman? Why not just have the publishers provide the PDFs (or whatever the appropriate digital format is) of their content directly to Google or Amazon.com? Or, better yet, why not have the Library of Congress solicit electronic versions of books directly from publishers and escrow them for the time when they enter the public domain, just as they do for physical copies? Aside from the efforts of the Library of Congress to digitize rare books, I’m not aware of whether or not they do this already – does anyone know?
My fear here is that Google and Amazon.com will amass a digital library of scanned books that will remain gated off from the public even once the books within it have entered the public domain. Do we really want to still be running Project Gutenberg in another hundred or so years? Probably not.
If the Library of Congress isn’t already cooperating with publishers to escrow electronic copies of books, wouldn’t it make sense for Google and Amazon.com to pledge to release their electronic copies to the public, the Library of Congress, or Brewster Kahle’s organization once the books are in the public domain? After all, it’s not like they even have to fulfill the pledge for another 75 years.
Does anyone know if this is already part of Google/Amazon/Brewster’s plans?