Saturday, November 10, 2007

An academic take on LibraryThing tags

I just discovered Tiffany Smith's "Cataloging and You: Measuring the Efficacy of a Folksonomy for Subject Analysis".* It's the first detailed academic study of LibraryThing tagging—and a very sympathetic one.

The article focuses on five books, comparing their tags with their Library of Congress Subject Headings (LCSH). The books are Harry Potter and the Half-Blood Prince, Susanna Clarke's Jonathan Strange and Mr. Norrell, Ian McEwan's Atonement, Marjane Satrapi's Persepolis and John Hodgman's The Areas of My Expertise.

LibraryThing doesn't "win" every comparison, but it comes out pretty well. I've already co-opted her observations on two titles into my talks, namely Persepolis and The Areas of My Expertise, each of which rates only a single, very general subject heading. On the latter:
"How do you identify the subject of a fictionalized almanac, which, according to the Library Journal blurb on the back cover, is 'a handy desk reference for those needing a dose of nonsense'? If you’re the Library of Congress, you call it 'American wit and humor', and move on to the next item on your book cart. You’d be accurate, because Hodgman is American and the book is witty and humorous, but you wouldn’t have captured the specificity of this item."
Smith contrasts this with LibraryThing's florid tag cloud, sporting such terms as almanac, hoboes, alchemy, cheese, cryptozoology, eels, omens, portents and absurdities. Record-by-record these tags may only serve to amuse, but if you can't recall the title, Hodgman's strange work can be easily retrieved by looking for books tagged both "eels" and "humor" or "hoboes" and "almanac". By contrast, I would not recommend wading through the American Wit and Humor subject!
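To make the retrieval point concrete, here is a minimal sketch of a tag-intersection lookup. It is not how LibraryThing actually implements search; the `books` dictionary and the `find_by_tags` helper are made up for illustration, and the tags listed for Persepolis and Atonement are invented placeholders.

```python
# Hypothetical catalog: title -> set of member tags.
# The Hodgman tags come from the tag cloud described above; the
# tags on the other two titles are invented for this example.
books = {
    "The Areas of My Expertise": {"humor", "almanac", "hoboes", "eels", "cryptozoology"},
    "Persepolis": {"graphic novel", "memoir"},
    "Atonement": {"fiction", "england"},
}


def find_by_tags(catalog, *wanted):
    """Return the titles that carry every one of the wanted tags."""
    wanted_set = {tag.lower() for tag in wanted}
    # A title matches only if the wanted tags are a subset of its tags.
    return sorted(
        title for title, tags in catalog.items() if wanted_set <= tags
    )


print(find_by_tags(books, "eels", "humor"))
print(find_by_tags(books, "hoboes", "almanac"))
```

Each tag on its own is either silly or far too broad, but requiring two at once narrows the result to a single book.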

I was also gratified to see the author notice an effect I've mentioned periodically but which has found no echo in other examinations of the topic and in the whole tired expert-vs-amateur polemic. As she writes, LibraryThing members pick up on the Napoleonic Wars element in Jonathan Strange, which LCSH misses:
"This may speak to the problem of the physical impossibility of the library cataloger reading the entirety of this roughly 800 page book to get to all of the detail. The Napoleonic element is not evident for the first third of the book and is not represented in the chapter titles, although it plays a pivotal role in the plot development."
Fundamentally, I'm willing to concede the virtues of expertise, but there's a lot to be said for reading the book all the way through, and library catalogers are not often able to do that.

In this connection, I've previously noted how my wife's third novel, Love in the Asylum, acquired an erroneous "Alcoholism" subject, derived ultimately from bad publisher flap copy. Clearly neither the librarian nor the publicist had read the book. (My wife caught the copy before it went to print, but not before it had acquired Cataloging in Publication LCSHs.) And the LCSH team also missed the topic of American Indians (Abenakis), a major presence in the book, but not touched on in the first third or the flap copy.

Anyway, it's an interesting read. Since Smith did her research LibraryThing has grown almost 100%, and there are a few things I'd quibble with**, but it's a very good outside examination of why LibraryThing members' tags should be dismissed by librarians interested in cataloging quality.


*"in"—as they say in academia—Lussky, Joan, Eds. Proceedings 18th Workshop of the American Society for Information Science and Technology Special Interest Group in Classification Research, Milwaukee, Wisconsin.
**For example, Smith was confused about why some LibraryThing works had subjects that were not present in the Library of Congress record, which she believes is our source. In fact, we get our Library of Congress Subject Headings (LCSH) from many libraries. Libraries are free to augment the LC's headings, and many do; we pick up anything in the 600s of all the MARC records that make up a work.
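The aggregation described in that note can be sketched roughly as follows. This is an illustration under simplifying assumptions, not LibraryThing's actual code: each MARC record is reduced to a list of (tag, value) pairs, and the `work_subjects` helper, the title and the "Subject A" / "Subject B" headings are all placeholders.

```python
def work_subjects(records):
    """Collect every distinct 6XX field value across a work's records."""
    subjects = set()
    for record in records:
        for tag, value in record:
            # MARC subject fields live in the 600-699 range.
            if len(tag) == 3 and tag.isdigit() and tag.startswith("6"):
                subjects.add(value)
    return sorted(subjects)


# Hypothetical records for one work: the LC record carries a single
# heading, while another library has added one of its own.
lc_record = [("245", "Example title"), ("650", "Subject A")]
other_library_record = [
    ("245", "Example title"),
    ("650", "Subject A"),
    ("651", "Subject B"),
]

print(work_subjects([lc_record, other_library_record]))
```

Because the result is the union over every contributing record, a work can show headings that appear nowhere in the LC record itself.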


10 Comments:

Anonymous said...

Joan Lussky is a professor in my LIS program, I'll have to point this post out to her, maybe she can give the author of the paper a heads up about it.

It is really interesting to see an actual comparison with LCSH, since my experience with LibraryThing has pretty much matched these results.

11/10/2007 10:36 PM  
Anonymous said...

At the recent Internet Librarian conference in Monterey, someone from CiteULike was saying their analysis found only a 10% overlap between user tags on CiteULike and LCSH.

Tom Reamy of KAPS Group apparently did some research on tagging which included LibraryThing, but I don't know much of the details, beyond his finding many tags too broad.

11/11/2007 12:33 AM  
Anonymous said...

I do believe your last sentence is missing a "not" -- rather than "tags should be dismissed" I suspect you meant "tags should not be dismissed".

11/11/2007 1:43 AM  
Anonymous said...

I agree with your point on personal interaction from readers. Cataloging has certain restrictions and rules in order to be more uniform. Readers' advisory research has shown that people react to and rewrite the books they read based on their life experiences and cultural background at that time. Having the actual readers tag books allows for all of the variations in reading interest to come out, while still pointing towards the "wisdom of the crowd".
The only issue I have with tagging, is that there has to be a larger number of people doing it for it to work. Just having a single library or consortia catalog tagging doesn't harness the power of numbers like LibraryThing or Amazon.

11/11/2007 1:06 PM  
Blogger Tim said...

"The only issue I have with tagging, is that there has to be a larger number of people doing it for it to work."

Very very true. And although LT has many tags, they're distributed in a very steep "long-tail."

11/11/2007 1:29 PM  
Blogger Tiffany said...

Thanks very much for reading my paper and for writing such a thoughtful response!
Thanks also for the clarification on the LCSH aggregation. For the purposes of this (exploratory) research, I was attempting to limit the comparison to just the headings assigned within the LC catalog, but if/when I delve deeper, I'll be sure to take this into account.
SIG/CR provided a great opportunity to engage in conversation about folksonomies and controlled vocabularies: interested parties may want to visit the conference paper archive at DLIST to read more.

11/11/2007 10:44 PM  
Blogger bibliotecaria said...

It's not impossible to get the LCSH headings changed at LC. We do listen to input, especially author's input. Email me and I'll see what I can do.

11/13/2007 3:51 PM  
Blogger Tim said...

bibliotecaria (if you're watching),

What about things that already hit CIP? Does the LC feel bound to that?

T

11/13/2007 9:23 PM  
Blogger bibliotecaria said...

We are not necessarily bound to what was done during the CIP process, because all too often we don't have the complete galley in hand. I've had at least one author request a change of subject headings, which I did in fact change, to more clearly express the main ideas of the book. And although we don't greet with enthusiasm the idea of redoing old work, don't let that stop you. Email me at mpol AT loc DOT gov.

11/14/2007 10:47 AM  
Blogger LilySea said...

This is only obliquely related, but funny story:

When I worked for an Evil Empire (huge chain) bookstore, they forced the people on the floor to follow the company's computer designation for shelving books (to idiot-proof working at the store--right).

Thus I was made to move the libretto for Mozart's "Magic Flute" from the opera section where I had shelved it to ... (drumroll) ... New Age.

I could just picture the philistine who came up with that designation. Global Corporation+Bookstore=Bad Idea.

6/22/2008 11:37 PM  
