The Book Deal May Be Dead, But Google Is Still Right


The Google book settlement — which the search giant signed with the Authors Guild and the Association of American Publishers in 2008, after a dispute over the company’s scanning of books — was struck down by a judge this week as too far-reaching, which is arguably true (although Google would undoubtedly disagree). But the fact that the arrangement has been rejected might not be such a bad thing, because it puts the spotlight back where it should be: on the fact that Google is doing nothing wrong — legally or morally — in scanning books without the permission of the authors or the publishers of those books.

Just to recap, Google started scanning books sometime in 2002, as part of its expressed desire to “index all of the world’s information.” In addition to deals with certain publishers and various university libraries — deals that are not affected by the book settlement or the legal ruling — Google also began sourcing and scanning books that were either in the public domain or were “orphaned” (a term used to refer to books that are still under copyright, but whose author or publisher can’t be found).

So far, so good. But Google also started scanning and indexing books that were under copyright, then offered authors and publishers the ability to “opt out” of the program and have their books removed. Some felt that this was a good bargain — especially since Google was going to help promote their books (by revealing them in search and at the Google Books site) and give readers an easy way to buy them. Others, however, said that scanning and indexing their books without explicit permission was wrong, and filed the lawsuits in 2005 that led to the agreement.

The crux of this argument is that scanning a book makes a copy of that book, and that copying is not permitted unless a copyright holder specifically agrees. The authors and publishers made this argument despite the fact that Google only ever shows a small fraction of a text when it displays a book online. It’s not as though the company planned to make copies of all books freely available to anyone through some kind of Google Books version of Napster. But the plaintiffs argued that simply scanning them was bad enough.

This is a ridiculous position, and always has been. Scanning something makes a copy of it in the same way that my viewing a web page makes a copy of it in the RAM of my computer — I’m surprised that authors and publishers haven’t tried to argue that this is secondary copyright infringement as well.

The reality is that Google’s use of selected extracts from books or any other work is protected by the principle of fair use (PDF link), which allows anyone to make limited use of published content of all kinds (text, images, etc.) without asking for permission from the creator or the rights holder. It’s the same principle that allows Google to index and show search results for images, web pages and other content without having to ask every single site publisher or photographer. Fair use requires that the user of the content meet the so-called “four factors” test, but Google arguably passes all four.

Why is this important? Because without that ability, search engines as we know them couldn’t exist, and they are a positive force for society as a whole — just as having a single way to search (and buy) every published book in the world would be a positive thing. Imagine if we were setting up public libraries now: would any author or publisher agree to have copies of their books just sitting there on shelves, for free, with anyone allowed to borrow them for as long as they wanted to? Unlikely (and e-book publishers like Amazon are trying to roll back borrowing abilities for digital works as well).

The big problem with the Google book settlement, as noted by the judge who struck it down (PDF link), is that the settlement gave the web giant the exclusive right to do whatever it wished with all scanned works, including selling orphan books, which is arguably over-reaching. But that doesn’t change the fact that Google’s initial impulse was the right one: it does have the right to scan and display extracts from books, regardless of what the Authors Guild and the AAP say, and it should continue doing so.

alan herrell

It is the nature and terms of current copyright that are causing this mess. Copyright is too long and is not being adequately enforced.

Just staying with current US copyright illustrates this quite nicely.
Copyright’s original goal was to give creators a limited monopoly on reproduction for a limited term (14 years), after which the work became part of the Public Domain, available to all. In the litigation land grab, everybody sweeps the public domain under the rug.

The only significant change to copyright was the ability for the Author to renew their copyright for an additional 14 years.
It can be argued that this was the beginning of copyright as a welfare payment system for heirs and/or publishers, depending on contract, which was not the intent of copyright in the first place. Adding sound and video recordings just updated it for different methods of creativity.

In 1976 the US dropped the requirement for mandatory registration and granted copyright automatically under the vague rubric of ‘fixed form’, which in ’76 meant printed in some form. This is the biggest mistake made around copyright and gave birth to the current crop of litigation and full employment for copyright and ‘intellectual property’ lawyers.
Two points here: registration is required for copyright litigation to proceed, and most registrations are corporate in nature as part of publishing contracts. Signing over your rights is usually the bedrock of any publishing contract regardless of media.

As you mentioned, the digital world has changed not only what fixed form is, but also its reproduction, making publication something that happens on a computer in one’s bathrobe rather than in a building with printing presses, sound or movie studios.

Most current copyright litigation is industry driven, whether by publishers or their ‘association’ mouthpieces: the RIAA, MPAA, etc.

The Authors Guild and the AAP are coming late to the party and are disguising motion as activity to cover up their shortcomings in ‘protecting’ the interest of their members.

That Google was first out of the box in thinking about media in digital form should be applauded; it should not be scorned for saving cultural items. How any money that is made should be split is an issue for another day, although I would bring back mandatory registration, cut copyright back to 14 years with no extensions, use the money to enhance the Copyright Office’s storage, cataloging, and enforcement of copyright, and create a more robust Public Domain System.


Actually, Google did breach the copyright, as most published books carry the clause:
“No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without the permission in writing from the publisher.”

So scanning and storing, even for search, runs afoul of the “information storage and retrieval” clause and thus is a breach of copyright.

Mathew Ingram

That’s true, Eric — except that “fair use” is an exemption from copyright. It doesn’t have to be explicitly allowed, that’s the whole point of the exemption.

Wendell Dryden

“Scanning something makes a copy of it in the same way that my viewing a web page makes a copy of it in the RAM of my computer”

Nonsense. The latter is transient, like a TV show pulled out of the ether and streamed on to my 20″ Zenith: the former is recorded and saved, like that same show stored on a VHS tape.

“Imagine if we were setting up public libraries now: would any author or publisher agree to have copies of their books just sitting there on shelves, for free, with anyone allowed to borrow them for as long as they wanted to? Unlikely”

Yes! This is an important and compelling argument. We can think about the internet the way we think about libraries. We gain *one* perspective on rules governing internet content by considering how they would impact public service vehicles like libraries, museums, schools…. Of course, that raises the question of “for profit” versus “charitable” institutions.

“without that ability, search engines as we know them couldn’t exist, and they are a positive force for society as a whole”

Yeah? Maybe. But so what. Do they have the right to exist just because they are a societal good? Do they also have the right to make money? I’m not arguing one way or another – I’m just saying. Tied up with this “public good” argument is a “private property” argument. Somebody is going to make private money – as they always have – off of the reproduction of creative and intellectual content. This debate is really just an argument about who that “somebody” is going to be.


“This debate is really just an argument about who that “somebody” is going to be”

Greg Satell

Thanks for this. I agree that Google is taking a reasonable and honorable approach.

What’s really important is that it shows a lot of maturity on Google’s part. They are not only creating useful technology, but taking stakeholders into account. They are striving to find solutions rather than simply raging against the machine.

– Greg

jam ray

thanks for summarizing the whole thing. i’m on google’s side on this one. authors are just greedy and stupid. the world’s changing, when are you gonna keep up?

jam ray

i’m just a user who pays for those books, music or whatever. if only i can discover them, find them and securely pay for them.


I really can’t see how you found Google’s view to be correct. They show much more than excerpts in the books I usually stumble upon in search – they show the entire book with some pages missing. If that is not copyright infringement, I don’t know what is.

Since the settlement I’ve seen book results much less frequently in searches, and I think that was the main purpose of the lawsuit.

Cyndy Aleo

Google is so unbelievably wrong here it makes my skin crawl, and the EFF, which I usually agree with, is smoking hash in your commune, Mathew.

Fair Use doesn’t include the right to reproduce pages and pages of text. Even the articles you cited note that the less used the better unless it comes to something like parody. Take a search I did this morning when looking for precedent for the Kodak lawsuit:

I’m taken to several pages that basically give me all the information I needed. If I did want the book? Sure, there are links to several online retailers, but the big, gray button with “BUY THIS BOOK” staring at me takes me where? Google Checkout.

If this was Barnes & Noble or Indigo photocopying pages and handing them out in the store for you to read, people would be furious. But Google couches it as “helpful” and “search-related” and that makes it okay?

In addition, the burden of proof that the EFF (and you) cite is impossible to meet in this era of digital reproduction. Copyright law needs a major revamp, starting there. It’s one thing to prove you own a copyright when the avenues in which to violate copyright were limited. When millions of web pages are reproduced each day, it becomes like trying to find a needle in a haystack.

As part of several online writing communities, I’ve seen “authors” actually retype entire novels, sometimes cobbling together several novels and attempting to pass it off as their own work. How many more cases are there like that where it’s not caught because it wasn’t recognized by readers? In one instance, the same user kept signing up for new accounts time after time and copying the same published work over and over again. It becomes time- and cost-prohibitive for authors to try to pursue each and every instance even when they know it’s happening because of the legal costs.

Both you and I make our living based on the premise that there is value in the creation of content. If publishing as a whole ends up subject to the loosest interpretation of copyright, as shown by Google in this instance, we’ll soon all be out of our jobs.

Edited to fix link

Mathew Ingram

I’m not in favor of Google publishing pages and pages of text from books either, Cyndy — that’s why I specifically mentioned the short excerpts that Google provides for most books (unless they have an agreement with the publisher to show more). And the burden of proof has to be on the copyright holder, because that’s where it belongs — and that principle applies in all kinds of cases, from YouTube to regular web content. Requiring Google (or anyone else) to verify and contact the rights-holder in every case would render much of the Internet obsolete, and I think we would be giving up a lot if that were to happen — a lot of things that benefit everyone, not just authors and publishers.

David Gerard

Fair use is what a judge rules it to be and can quite easily be up to and including 100% of the work in question, depending on circumstances. Your assertion that “Fair Use doesn’t include the right to reproduce pages and pages of text” is simply factually incorrect, and very misleading. You should stop saying this, unless you add a *lot* of qualification.


“If this was Barnes & Noble or Indigo photocopying pages and handing them out in the store for you to read, people would be furious. But Google couches it as “helpful” and “search-related” and that makes it okay?”

You can sit in Barnes & Noble and Borders and read however many pages out of a book or magazine stocked in the place. You can do it at many used booksellers. Many have chairs and comfortable seating areas. Are you asking the brick and mortar places to strictly police the clientele with signs up saying “No Reading Allowed In The Bookstore”? Worse, you can do it in and out of the library with their publications and each library has probably bought only one, two or just a few copies of the item.

“As part of several online writing communities, I’ve seen “authors”…attempting to pass it off as their own work.”

A red herring with nothing to do with Google. The bogus authors can hunt down the majority of these works as used paper copies, and they don’t have to retype them; they can just scan the publication themselves.

“Copyright law needs a major revamp”

Let’s try this tack. I’d grant you the most draconian copyright measures and penalties possible, but copyright only lasts 5 years. How does that sound?

Craig C

The example that you gave for the Kodak lawsuit located here: IS NOT a good example AT ALL.

That book is reprinted in its entirety because the PUBLISHER GAVE EXPRESS PERMISSION TO DO SO. This is noted in the bottom left-hand corner of the page. Any time you find large excerpts or entire books in Google search, it is only because the publisher has expressly authorized Google to show them. Authors and publishers have complete control over how much is shown in Google Books, so your point isn’t valid at all.


Isn’t this somewhat of a “nip it in the bud” thing – selling orphan books now, but who knows what Google might come up with next, or what the next fight (lawsuit) would be about? Better to kill or redo it now, even if that needs something from Congress?

