Sunday, December 07, 2008

The Elusive Moose and OCLC

Over the next few weeks I'm going to try to approach this issue in a number of different ways. Here's a first try.

Thought experiment. I walk into the Portland, ME public library and look up The Elusive Moose. Who owns the database record, with the title and subjects and so forth?

Who owns it? Joan Gannij wrote the book, and Clare Beaton illustrated it. Barefoot Books of Cambridge, MA published it.

To qualify for the Library of Congress's Cataloging in Publication program, Barefoot Books filled out forms and submitted the basic data. The catalogers at the Library of Congress used that and some sample chapters to make the basic record, providing the publisher with the core cataloging information it printed in the book.

Then people at three other companies improved the record--Ingram Library Services, Baker & Taylor (twice), and Yankee Book Peddler. After that, catalogers at two public libraries worked on it--the Vancouver Public Library and the Southfield Public Library. Finally the Anchorage, Alaska School District added the finishing touches. No doubt they know from moose! (LibraryThing, located in Portland, ME, knows about moose too!)

Whose record is it? The authors? Publisher? The companies? The public libraries? The school library? How about me, or nobody? Aren't libraries supposed to be about free access to information?

The Answer. Right now, it's unclear. Probably no one owns it. The Library of Congress did the most work, and, by law, their work is free to all. And anyway, the record is composed of facts, which can't be copyrighted.

Come February, however, it will acquire a new owner, an organization known to few Americans and accountable to fewer, the Online Computer Library Center (OCLC) of Dublin, Ohio. OCLC's contribution came late--after the Library of Congress had created the base record, the LC uploaded it to OCLC so other libraries could have access to it. And that contribution was minimal--warehousing 1k of data and sending it over the (free) internet. And for that work OCLC was very well paid, both directly and for the services it offers on top.

OCLC's new license purports to offer carrots to libraries. But it's mostly carrots from their own gardens. And it comes at a steep legal price, transforming the legal relationship between librarians and their labor, and making everyone else come begging to Dublin for information about books. OCLC will be asserting a perpetual, retroactive and explicitly viral license over the records--as good as ownership. The OCLC policy will cover many if not most library records in the world, even at the LC and other national libraries, and is designed to spread to derivative works.* All use will be on OCLC terms--which, of course, like any such license, they can change at any time. The terms shut down the Open Library, a giant open-data cataloging project sponsored by the non-profit Internet Archive. And they shut down all commercial use of records--including LibraryThing's, unless we go through their new owner.

Petitions. If this bothers you as much as it does me, check out the Stop the OCLC Powergrab Petition, put up by Aaron Swartz, Tech Lead at Open Library. Aaron also wrote an excellent blog post about the issue.

If you're a librarian, check out Elaine Sanchez's Petition for OCLC to Collaboratively Re-write Policy for Use and Transfer of WorldCat Records.

BTW: Don't worry too much about LibraryThing. One way or another we'll get through this. More and more I'm confident either the Policy will change and OCLC will embrace and lead a future of openness and collaboration, or opposition to it will create what OCLC is trying to prevent—a free and open repository of high-quality bibliographic data.

*There are millions of "OCLC-derived" records at the LC. I think I'm going to write my next post trying to figure out what the Policy means for the LC and other federally-funded libraries.

Blogger Alexander Gieg said...

Hi! I've signed the petition and also mentioned this in a post at Wikipedia's WikiProject Books' talk page called OCLC controversy. I hope it helps bring the issue to the attention of more people.

12/07/2008 2:02 PM  
Blogger Russ said...

While a petition is an excellent start, I also recommend people in the U.S. contact their congressional representatives. Monopolies are not allowed under anti-trust laws as I understand them. Further, if they are essentially claiming work that has already been declared free by the Library of Congress, then they should be prevented from doing so. Congress can override any license they try to enforce.

12/07/2008 2:07 PM  
Anonymous Anonymous said...

I've been trying to figure this out for months, but I still don't get it. I'm not a librarian, but I've been trying to understand libraries better for some research I am doing.

Here's what I don't get:
1. Is it possible to copyright a library record?
2. Assuming it was possible to copyright such records, have the Library of Congress and all other major libraries somehow either (a) assigned copyright to OCLC or (b) given some sort of exclusive license to OCLC?
3. What prevents libraries from sharing their records? Technical challenges? Legal problems? Fear of repercussions?
4. Why don't all libraries make Z39.50 feeds available?
5. Why don't all libraries make all of their original cataloguing available for free in MARC?
6. Is there currently no way to provide standard library services (e.g., inter-library loan, copy cataloguing) without OCLC?
7. Do large libraries use OPACs that can actually use data sources other than OCLC?
8. What prevents major libraries from not using OCLC after the changeover in policy?

The whole situation basically makes no sense to me. It seems as though all of the MARC records in the US could fit on a standard 500 GB, maybe 1 TB hard drive at this point (though Tim probably knows the actual size). This isn't to slight the hard work of original cataloguers, but to question why this bizarre middleman exists nominally to distribute infinitesimally small pieces of data.
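As a back-of-envelope check on that hard drive estimate, here is a minimal sketch. The per-record size of about 1 KB comes from the post's "1k of data" figure; the record counts are hypothetical values chosen only for illustration, not actual catalogue sizes.

```python
def estimated_size_gb(record_count: int, bytes_per_record: int = 1024) -> float:
    """Rough storage estimate in gigabytes (1 GB = 10**9 bytes).

    bytes_per_record defaults to the ~1 KB per record mentioned in the post.
    """
    return record_count * bytes_per_record / 1e9


# Hypothetical record counts (assumptions, not measured figures).
for count in (10_000_000, 100_000_000, 500_000_000):
    print(f"{count:>11,} records ~ {estimated_size_gb(count):,.1f} GB")
```

Under these assumptions even half a billion records come to roughly half a terabyte, which is consistent with the 500 GB to 1 TB guess above.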

(Also, comment to Russ: as I understand it, monopolies aren't illegal, so much as monopolies engaging in anti-competitive behaviour, which is somewhat harder to prove.)

12/07/2008 3:39 PM  
Anonymous Nicole C Engard said...

Here's one thing that confuses me still. Like you said the LOC records are legally free to all - so how can these new terms change that fact? Will LOC have to stop submitting records to OCLC?

Just some questions I haven't been able to answer on my own - and that I've been wondering about.

12/07/2008 6:57 PM  
Anonymous GeekChic said...

As someone who has worked in Systems in Canadian, Mexican and American libraries for many years I'll try to answer Anonymous:

1) IANAL and am not going to touch this. ;) Do note that we're not just talking about U.S. copyright, we're talking about licensing.

2) IANAL. Again, the issue is more licensing not U.S. copyright.

3) There are many libraries that are not members of OCLC and that do not have Z39.50 servers (of the 30+ libraries I've worked with only 1 was an OCLC member). That said, there are other ways to share (such as union catalogues and national catalogues such as AMICUS in Canada).

What keeps people from sharing is often: a) the difficulty of doing so in an efficient manner (uploading records to AMICUS, for example, is a pain and an extra step); b) cost (in the case of OCLC - almost none of the libraries I have worked with could afford membership dues); c) bandwidth (not all libraries have fast connections); d) other priorities.

4) Z39.50 servers are not trivial for some libraries (both in terms of setup and maintenance). There is also the feeling that a library's limited bandwidth is best used for other things. Finally, there can be the perception that an anonymous inbound connection from anywhere is a security risk (there has never been a documented attack through a Z39.50 connection that I am aware of - but that is often not understood by the folks manning the firewall).

5) Available to whom? Where? How? This must be something that is efficient in terms of staff time and not a strain on bandwidth or the library will not bother. There are libraries that have the philosophy that they have made the record and they are not going to give it away (I have only heard this from a few specialist libraries that do their own original cataloguing).

6) ILL and copy cataloguing can definitely be provided without OCLC - it happens all the time. Many simply find it more efficient to go to one place to get everything (a monopoly isn't necessarily a bad thing in their eyes).

7) The OPAC is irrelevant - this is the patron portion of the integrated library system. The cataloguing portion of the integrated library systems that I am familiar with does not require OCLC to be functional - my current employer uses 10 data sources, none of which are OCLC.

8) Efficiency of using OCLC. Previous software purchases / contracts. Smaller Tech. Services departments that cannot handle any inefficiencies. All of these things can be overcome if the institution so desires and has the necessary money to allocate.

12/07/2008 9:46 PM  
Anonymous Anonymous said...


Thanks for your answers. It sounds like a lot of this stuff is inertia as opposed to anything else, but who knows what's up with the licensing issues...


Maybe this is what you're planning for your next post re: LOC, but I'm curious how distributed original cataloguing is for the sorts of books that are in LibraryThing. MARC bibliographic has 001 and 003 for the original cataloguing institution, no? Is there a power law in terms of the number of records from original cataloguing institutions (+ Amazon)? (Also, if a library modifies a record, is that somehow noted in MARC?) How many institutions would LT users have to convince to keep their records un-OCLC licensed in order to have a relatively minimum impact on LT? You've got over 700 data sources, but how many would get you the majority of LT MARC records in use?

12/08/2008 2:45 AM  
OpenID bibliotecaria2 said...

One question I have that may be related to this issue: who creates/ controls/ modifies the MARC21 coding that is the carrier for all this information? Does this have any impact on the argument?

12/08/2008 9:05 AM  
Anonymous GeekChic said...

@ Anonymous: Inertia is definitely a part of it. That said, I do appreciate (even as someone who has almost never used OCLC) the efficiency of OCLC. Many cataloguing and ILL departments have been shrunk so drastically that it will be difficult for them to change quickly without staffing or monetary assistance.

Libraries do indicate when they create or modify a MARC record. This information goes in the 040 field (using their library symbol).
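A minimal sketch of how that history can be read out of the 040 field. In MARC 21, subfield $a holds the original cataloging agency, $c the transcribing agency, and each repeated $d a modifying agency. The field is modelled here as a plain list of (subfield code, value) pairs rather than through any particular MARC library, and every symbol except DLC (the Library of Congress) is a hypothetical placeholder.

```python
def cataloging_history(subfields):
    """Summarize a MARC 040 field given as (subfield code, value) pairs."""
    history = {"original": None, "transcribing": None, "modifying": []}
    for code, value in subfields:
        if code == "a":        # $a: original cataloging agency
            history["original"] = value
        elif code == "c":      # $c: transcribing agency
            history["transcribing"] = value
        elif code == "d":      # $d: modifying agency, repeatable and in order
            history["modifying"].append(value)
    return history


# Hypothetical 040 field: DLC is the Library of Congress symbol; VENDOR1,
# VENDOR2, and LIB1 are made-up placeholders, not real library symbols.
field_040 = [
    ("a", "DLC"), ("c", "DLC"),
    ("d", "VENDOR1"), ("d", "VENDOR2"), ("d", "VENDOR2"), ("d", "LIB1"),
]

history = cataloging_history(field_040)
print("Created by: ", history["original"])
print("Modified by:", " -> ".join(history["modifying"]))
```

A symbol can appear more than once among the $d subfields when the same institution touched the record at different times.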

@ bibliotecaria2: The actual MARC21 code is maintained jointly by the Library of Congress and Library and Archives Canada.

12/08/2008 9:41 AM  
Anonymous Anonymous said...

I came to this page from The Onion site. If I didn't know better, I could believe I was still on a subsite of The Onion.

12/12/2008 10:28 AM  
Blogger JLH said...

Anonymous: Yeah, library and cataloging people can get like that - out there, you know. But what I am curious about is Blacklight, at U. Va. -- is it working yet? Are there other projects like it?

1/05/2009 3:08 PM  

