Tuesday, January 08, 2008

While you were sleeping, ThingISBN got better.

LibraryThing does a lot of cool things nobody else does. And, as we grow, we do them better and better.

I've got a very good example for today: the ThingISBN service. It was good when it was launched more than a year ago, becoming LibraryThing's first API, and it's been getting better ever since. (And where its competitor became a paid service, ThingISBN is still free for non-commercial use.)

The ThingISBN service provides something called "edition disambiguation." Give it an ISBN and it will shoot back a list of "related" ISBNs—other editions, other media, and translations. Edition disambiguation is valuable stuff. Retailers use it to aggregate reviews and other data across editions, and to sell you something when the book you searched for is no longer available. Libraries use it to make sure a patron leaves with a copy of a book, even if the edition the patron searched for is checked out.

You can get ThingISBN in two ways:
  • As a REST-based API. Just change the ISBN in this URL as needed.
  • As a complete feed (thingISBN.xml.gz in /feeds). We ask that people not hit the API more than 1,000 times per day. Instead, pick up the full feed.
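As a rough illustration of the API route, here is a minimal Python sketch. The endpoint below is a hypothetical placeholder (the real URL is not reproduced here), and the response shape, a flat XML list of `isbn` elements, is an assumption rather than a documented format. The sample payload is made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder; substitute the real ThingISBN endpoint.
API_BASE = "https://example.org/thingISBN/"


def build_url(isbn: str) -> str:
    """Return the lookup URL for one ISBN, with hyphens and spaces removed."""
    return API_BASE + isbn.replace("-", "").replace(" ", "")


def parse_related(xml_text: str) -> list[str]:
    """Extract related ISBNs from a response.

    Assumes a payload shaped like <idlist><isbn>...</isbn>...</idlist>.
    """
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter("isbn")
            if el.text and el.text.strip()]


# Made-up sample payload, not a real API response.
sample = "<idlist><isbn>0441172717</isbn><isbn>0399128964</isbn></idlist>"
related = parse_related(sample)
```

Anyone needing more than about 1,000 lookups a day would load the full feed once and query it locally instead of calling `build_url` in a loop.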
What's cool here? LibraryThing isn't the only supplier of this data. The other supplier, OCLC, the Dublin, Ohio-based library data organization, compiles its data through clever automated analysis of its billion-plus records. Their data and algorithms do a great job. Unfortunately, they charge for the service, called xISBN.

LibraryThing does it differently, relying instead on members, who add, combine and separate editions by the thousands every day. In return, LibraryThing members get better connections with other users. That is, you gain connections and enhanced recommendations by connecting your edition with others. The result is a detailed set of correspondences between editions, assembled by thousands and improving every day.
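To make "add, combine and separate" concrete, here is a toy Python sketch of how such member edits could maintain edition groups. It is not LibraryThing's actual implementation; the `WorkIndex` class, its method names, and the integer work ids are all hypothetical.

```python
from collections import defaultdict


class WorkIndex:
    """Toy model: each ISBN belongs to exactly one work (a set of editions)."""

    def __init__(self):
        self.work_of = {}                  # ISBN -> work id
        self.editions = defaultdict(set)   # work id -> set of ISBNs
        self._next_id = 1

    def _new_work(self, isbn):
        """Put the ISBN into a freshly created work of its own."""
        wid = self._next_id
        self._next_id += 1
        self.work_of[isbn] = wid
        self.editions[wid].add(isbn)
        return wid

    def add(self, isbn):
        """Register an ISBN if it is new; return its work id."""
        if isbn not in self.work_of:
            return self._new_work(isbn)
        return self.work_of[isbn]

    def combine(self, isbn_a, isbn_b):
        """Merge the work containing isbn_b into the work containing isbn_a."""
        keep, drop = self.add(isbn_a), self.add(isbn_b)
        if keep == drop:
            return
        for isbn in self.editions.pop(drop):
            self.work_of[isbn] = keep
            self.editions[keep].add(isbn)

    def separate(self, isbn):
        """Split one ISBN out of its current work into a new work."""
        old = self.work_of.get(isbn)
        if old is not None:
            self.editions[old].discard(isbn)
            if not self.editions[old]:
                del self.editions[old]
        return self._new_work(isbn)

    def related(self, isbn):
        """Return all ISBNs sharing a work with the given one, sorted."""
        wid = self.work_of.get(isbn)
        return sorted(self.editions[wid]) if wid is not None else []


idx = WorkIndex()
idx.combine("hardback", "paperback")
idx.combine("paperback", "first-edition")
```

After these two edits, `idx.related("hardback")` lists all three editions. A later `idx.separate("first-edition")` would undo one join without touching the other.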

You've got to admit it's getting better. If you improve every day, you can get pretty good, and that's what's happened to ThingISBN. OCLC still beats LibraryThing in quantity, but LibraryThing is closer, and, it seems to me, has a clear advantage for paperbacks.

I want to revisit some of the examples I gave when ThingISBN debuted:
  • OCLC's canonical example is Frank Herbert's Dune. I don't have the exact counts, but LibraryThing originally trailed OCLC. (I know because I used it as an example in a number of talks.) As of now, however, LibraryThing has passed OCLC, with 89 ISBNs to OCLC's 80.
  • Peter Green, Alexander of Macedon. When ThingISBN started, both LibraryThing and OCLC knew the recent hardback, and one other edition. That is, LibraryThing knew the paperback and OCLC knew the 1974 first edition. Since then, LibraryThing has discovered the first edition, giving it three ISBNs; OCLC still doesn't know about the paperback.
  • Lee Strobel, The Case for a Creator. OCLC knew of two editions, LibraryThing eight. OCLC now knows three, LibraryThing eleven. It's about paperbacks, obviously.
  • Emily Bronte, Wuthering Heights. Originally LibraryThing had 92 ISBNs, OCLC a commanding 326 ISBNs. OCLC is still in the lead, with 424 ISBNs, but LibraryThing has more than tripled its count, to 285.
Now, I'm quite sure that, overall, OCLC's xISBN service still beats LibraryThing in coverage. LibraryThing only covers 2.7 million ISBNs. OCLC must cover more.

But LibraryThing is gaining. It's getting better faster.

And while OCLC continues to sink resources, including staff, into the project (now a paid service for all but minimal use, run as part of its Peace-is-War-ish Openly division), I can tell you honestly that I haven't touched ThingISBN in six months. I haven't made it better, even a little. Members made it better.

Now as then, that's pretty revolutionary stuff.

See you next January, OCLC.


7 Comments:

Blogger Eric said...

OK, so you've copied the xISBN service and added better paperback joins. Why get bent out of shape if OCLC ups the ante a bit? You don't offer service for commercial use. You limit service to 1,000 requests per day. Does none of your metadata come somehow from OCLC? You elide the fact that OCLC has a free level of non-commercial service that returns MORE metadata than tISBN.

Bottom line: letters are all equal, but x is more equal than others.

1/11/2008 2:26 AM  
Blogger Tim said...

Eric, head of the xISBN project, welcome.

Actually, we do offer a commercial service. So far, we're not pushing it. Let me know if you want to subscribe.

While we ask people to limit it to 1,000/day, we provide feed access to the entire file for free. We have fewer servers than OCLC, but if you want to hit something 1,000 times/day, you probably want the speed of having the file yourself.

It's true you're returning more now. We've thought about returning LCCNs, etc., but, so far, there's not as much interest.

Incidentally, you're right that some of the ISBN numbers that members have decided to combine into works came from MARC records from libraries that are OCLC members, mostly the Library of Congress.

While I understand that OCLC believes that all your bases are belong to us, that's a pretty tenuous case, amounting to the use of a primary key, one over which Bowker, thank God, has responsibility, not OCLC. LibraryThing has, of course, never signed any data agreement with OCLC. But if you want to send out strongly worded letters to libraries to stop sharing their ISBNs outside the circle of trust, be my guest.

1/11/2008 2:43 AM  
Anonymous Anonymous said...

Disclaimer: this comment contains technical acronyms!

Thanks for the update! I included thingISBN as an example of the SeeAlso linkserver protocol in my implementation of a SeeAlso server in Perl (SeeAlso::Server at CPAN). See the source and a live demo for how to use thingISBN data in HTML via JavaScript.

At the moment I am thinking about aggregating our library catalog records and index with thingISBN to start supporting FRBR. LT benefits from library catalogues by copying their data, so it should be fair if libraries benefit from LT by copying its data, shouldn't it? ;-)

1/22/2008 10:25 AM  
Blogger Tim said...

LT benefits from library catalogues by copying their data, so it should be fair if libraries benefit from LT by copying its data, shouldn't it? ;-)

Absolutely! I completely agree. We want libraries to use it. That's why it's free. As for WalMart, they have to pay?

1/22/2008 12:00 PM  
Blogger Tim said...

That ? is wrong.

2/15/2008 1:00 AM  
Blogger BillSeitz said...

Is there a unique "work" identifier?

3/06/2008 12:50 PM  
Blogger Tim said...

Yeah, but it changes over time...

3/06/2008 1:51 PM  
