Tuesday, February 20, 2007

When tags work and when they don't: Amazon and LibraryThing

This is an extensive post, revealing the results of a statistical comparison between Amazon and LibraryThing tags, and exploring why tagging has turned out relatively poorly for Amazon. I end by making concrete recommendations for ecommerce sites interested in making tagging work.



Both LibraryThing and Amazon allow users to tag books. But with a tiny fraction of Amazon's traffic, LibraryThing appears to have accumulated *ten times* as many book tags as Amazon—13 million tags on LibraryThing to about 1.3 million on Amazon. (See below for the method I used to find this out.)

Something is going on here—something with broad implications for tagging, classification and "Web 2.0" commerce. There are a couple of lessons, but the most important is this: Tagging works well when people tag "their" stuff, but it fails when they're asked to do it to "someone else's" stuff. You can't get your customers to organize your products, unless you give them a very good incentive. We all make our beds, but nobody volunteers to fluff pillows at the local Sheraton.

A tale of two tagging sites.

LibraryThing began on August 30, 2005. From the start, we allowed members to tag their books. We showed that people could embrace book tagging, much as they had photo and website tagging. But LibraryThing was a marginal player.

Three months later, Amazon unveiled its tagging feature. This was a big deal in certain quarters. To many, Amazon's move signalled that tagging had "arrived." As CNet blogger Daniel Terdiman wrote:
"[This] may well prove to be the most visible example of a company incorporating tags as a way to bring order to information. Outfits like Flickr are big and have tremendous followings, but nothing compared to Amazon's."
Amazon's size was key. With something like 60 million registered customers, and one of the highest-traffic sites out there, tagging at Amazon must have seemed like a sure bet. Its visitors were a firehose. Point them at tagging and KABOOM!

Amazon's tagging was quick and easy—but would it work?
It didn't work out that way. Amazon visitors have not taken to tagging Amazon's books in significant numbers. With thousands of times the traffic, Amazon produced a tenth as many tags as LibraryThing. What's going on?

In fairness, Amazon didn't give tagging a lot of prominence. Tags were stuck in the middle of their ever-lengthening book page—one section for adding your own tags, another for showing others' tags. They didn't push them very hard.

It's likely Amazon could have done better. A higher profile could have increased familiarity and comfort with the feature. Some user-interface tweaks could have enhanced its appeal. Maybe Amazon will make changes, and Amazon tags will get some traction.

But there's a general message in this: If Amazon with its unsurpassed traffic is having trouble, can other ecommerce sites hope to make tagging work?

Numbers matter

Amazon's shortfall matters. To do anything useful with tags, you need numbers. With only a few tags, you can't conclude much. The tags could just be "noise."

A web of meaning: LibraryThing's tag cloud for Guns, Germs and Steel.
Take one example: LibraryThing users have applied over 3,900 tags to Jared Diamond's Guns, Germs and Steel, including "apples," "office" and "quite boring." With just a few tags, it might be thought a dessert cookbook, a business book or, worst of all, a boring one. But these are all single-instance tags. With a larger number of tags, clear patterns emerge, with high-level descriptors like "history" (755 times) and "anthropology" (293 times) standing out clearly against the noise. Even lower-frequency tags, like "social evolution" (25 times) and "pulitzer prize" (20 times), can be trusted as relevant.

Large numbers are particularly important when looking for best examples for a given tag. Go by numbers alone and you just get what's popular. By the numbers Guns, Germs and Steel, tagged "evolution" 39 times, is the number ten book on evolution. That's crazy. By looking at "tag share," LibraryThing can understand that Ernst Mayr's What Evolution Is is a better choice. Although tagged "evolution" only 25 times, those constitute a much larger percentage of its tags. (See the LibraryThing tag page for "evolution.")
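
To make "tag share" concrete, here is a minimal sketch in Python. It is not LibraryThing's actual code. The "evolution" counts (39 and 25) and the 3,900-tag total come from the paragraphs above; every other number, the "other" bucket and the function names are invented for illustration.

```python
from collections import Counter

# Per-book tag counts. Only the "evolution" counts and the ~3,900 total for
# Guns, Germs and Steel come from the post; the rest is made-up filler.
books = {
    "Guns, Germs and Steel": Counter(
        {"evolution": 39, "history": 755, "anthropology": 293, "other": 2813}
    ),
    "What Evolution Is": Counter(
        {"evolution": 25, "science": 20, "biology": 15, "other": 10}
    ),
}


def tag_share(tags, tag):
    """Fraction of all of a book's tag applications that are `tag`."""
    total = sum(tags.values())
    return tags[tag] / total if total else 0.0


def rank_by_count(books, tag):
    """Raw popularity: most applications of `tag` first."""
    return sorted(books, key=lambda title: books[title][tag], reverse=True)


def rank_by_share(books, tag):
    """Tag share: books where `tag` is the largest slice of their tags first."""
    return sorted(books, key=lambda title: tag_share(books[title], tag), reverse=True)


print(rank_by_count(books, "evolution"))
print(rank_by_share(books, "evolution"))
```

With these made-up totals, ranking by raw count puts Guns, Germs and Steel first, while ranking by share puts What Evolution Is first. Share only means something once a book has enough tags, which is the critical-mass problem again.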

Critical mass is important, even if we can't pinpoint the line. Ten tags are never enough; a thousand almost always is. Unfortunately, Amazon's low numbers translate into a broader failure to reach critical mass. With ten times as many tags overall, LibraryThing has fifteen times as many books with 100 tags, and 35 times as many with over 200 tags.

ISBN tag distribution for A Farewell to Arms. Doubles as an example of my Excel-fu.
The "problem of small numbers" is compounded by Amazon's failure to aggregate tags across editions. In Amazon, the tags for the various paperback, hardback, British, French and German editions of a work are all in separate "buckets." LibraryThing's unique user-generated "works" concept combines editions and their data, compounding tag statistics. Thus, Amazon's top edition of A Farewell to Arms has 28 tags, where LibraryThing's has 716. But when all of LibraryThing's editions are combined under a work, LibraryThing has 1,914 tags, 68 times as many! A Farewell to Arms is a very well-known book, but Amazon's 28 tags can't mean much. With 1,914 tags, LibraryThing has a truly extensive "web of meaning," created by its members. You can do a lot more when the data is so rich.
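
The idea of pooling editions can be sketched in a few lines of Python. This is an illustration under assumptions, not how LibraryThing's "works" system is built: the ISBN labels, the work id, the tag counts and the function name are all hypothetical.

```python
from collections import Counter

# Hypothetical per-edition tag counts, keyed by a placeholder ISBN label.
edition_tags = {
    "isbn-paperback": Counter({"fiction": 3, "wwi": 2}),
    "isbn-hardback": Counter({"fiction": 1, "hemingway": 1}),
    "isbn-uk": Counter({"wwi": 1}),
}

# Hypothetical mapping from each edition to the work it belongs to.
isbn_to_work = {
    "isbn-paperback": "work-1",
    "isbn-hardback": "work-1",
    "isbn-uk": "work-1",
}


def combine_by_work(edition_tags, isbn_to_work):
    """Pool the tag counts of every edition under its work."""
    works = {}
    for isbn, tags in edition_tags.items():
        # An edition with no known work stays in its own bucket.
        work_id = isbn_to_work.get(isbn, isbn)
        works.setdefault(work_id, Counter()).update(tags)
    return works


works = combine_by_work(edition_tags, isbn_to_work)
print(works["work-1"].most_common())
```

Three thin buckets become one bucket with eight tag applications, and the per-tag counts start to look like a pattern rather than noise.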

Why Ecommerce Tagging Fails

Amazon's tagging suffers a failure of incentive. The causes are multiple:
  • Tags work best when they're about memory, so tagging makes the most sense when you have a lot of something to remember. On LibraryThing, members with under 50 books seldom tag, but users with 200 or more usually do. When you get right down to it, few of us need to remember 200 books on Amazon. For most of us, the "wishlist" feature is good enough. We don't need to sub-segment out the "anthropology" books.
  • When you tag on LibraryThing, you're putting your library in order. The pleasure and use is not unlike reshelving your books the way you want them, except that tags can draw together books that must otherwise reside separately on the shelves. And tagging on LibraryThing is connected to a social system—tag something "anthropology" and you're connected to all the other anthropology buffs out there.
  • Amazon is a store, not a personal library or even a club. Organizing its data is as fun as straightening items at the supermarket. It's not your stuff and it's not your job.
  • Amazon underplays the social. Tagging really kicks into high gear when the personal blooms into the social, when organizing your web pages or your books turns into an hours-long exploration of others' web pages and books. But Amazon doesn't want you to hang out—they want you to buy! Tags on book pages do not list their taggers. You need to click around a lot before the tags turn into people. (The failure is particularly surprising in light of Amazon's clear grasp of social software. Amazon got "social" years before it was trendy. What are reviews and Listmania but social sharing and user-generated content?)
  • Users don't "own" their tags. There is no way to export them. Considering how central APIs are to Amazon, and to its success, this comes as a surprise. (I'm guessing they'll add this eventually.)
The problem of opinion tags

Some of the tags from Ann Coulter's Treason. But what is it about? (Compare with LibraryThing's page.)
The limited utility of tagging on Amazon produces an unintended consequence: a surfeit of "opinion tags." So, Daniel Silva's The Unlikely Spy gets "wow what a book" and Nick Hornby's High Fidelity gets "good" and "good book." Not infrequently, opinion outnumbers other types of tags. Five of the seven tags applied to Bette Greene's Summer of My German Soldier are opinion tags: "aweful" (sic), "obnoxious," "pathetic," "stupid" and "wonderful story."

The takeover is total with political books. Ann Coulter's Treason gets a lot of tags like "craptacular," "evil" and "brain dead." Coulter's tag-defenders weigh in with "you won't disprove the facts," "you can't disprove the facts," "no one has proven this book wrong" and "try and disprove this book." (Well, I guess that settles it!) Finally, Coulter has also received "dildo" (elsewhere applied mostly to Bill O'Reilly books*), "vibrator," "lunesta" and "xanax." It seems the naughty teenagers and the pharmaceutical spammers have discovered Amazon tags!**

Tag-spam on Amazon
Amazon's all-items tag cloud shows the impact of partisan tagging. After "DVD," "Music" and "fiction," the largest tag is "Defectivebydesign," applied by a small, pitchfork-wielding mob of Microsoft DRM haters.***

Ultimately, I don't care about the commercial side of things, but "opinion tagging" in a low-numbers environment holds commercial risks. Summer of My German Soldier is actually a pretty good book (I hear). Although Amazon won't let me confirm it, one suspects all the negative tags come from one user. Is it fair to let one anonymous reader shape a book's tag cloud so completely?

How to make Ecommerce Tagging Work

Big suggestions:
  • Figure out why your customers would want to tag your stuff. Don't fool yourself.
  • Make tagging as easy as possible. (Amazon's tags are quite easy to add, although registration is a pain.)
  • Understand that commercial tagging can turn people off. Avoid crass commercialism. Respect your taggers—these people are helping you out!
  • Make taggers feel like it's "their" thing. Encourage users to give out their tag URLs—people love to show off—and let them export their tags any way they want.
  • Keep tagging social. Stop selling and start connecting. If you connect people up right, the selling will follow. Think Tupperware!
  • Consider whether a non-commerce site has the data you need. Back when LibraryThing had a million tags, Amazon could have bought our data for the price of a cup of coffee. Now that we're big and important and have three employees, that'll be THREE cups of coffee, buster!
Small suggestions:
  • Put methods in place to fight spamming and tag-bombing. LibraryThing does this by considering both the number of times a tag has been applied and the number of users who use it. A single angry user can't make a tag really big on the tag cloud.
  • Have logical URLs. Amazon's tag URLs are full of junk, much of it rather crass attempts at search engine optimization (e.g., the book title is inserted into the tag URL, but it works without it). It seems getting a little search engine help trumped providing users with easy-to-remember URLs.
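
On the spam-fighting suggestion: one way to weigh both the number of applications and the number of distinct taggers is a geometric mean. The sketch below is an assumption made for illustration, not LibraryThing's actual formula. The function name is hypothetical, and the 2,000-user figure is invented; the 4,200 products and six taggers are the "bards and minstrels" numbers from the footnote below.

```python
import math


def cloud_weight(applications, distinct_users):
    """Tag-cloud weight as the geometric mean of uses and users (an assumed formula)."""
    return math.sqrt(applications * distinct_users)


# "bards and minstrels": 4,200 applications by six taggers (from the footnote).
bombed = cloud_weight(4200, 6)

# An invented comparison: the same number of applications spread over 2,000 users.
organic = cloud_weight(4200, 2000)

print(round(bombed), round(organic))
```

Under this weighting, a handful of very busy taggers still register, but they cannot outweigh a tag that thousands of people each used a couple of times.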
Methods

To my knowledge, Amazon doesn't release any total tag statistics. So I tried a statistical sample:
  • I picked 1000 random entries from LibraryThing libraries, and retrieved their ISBNs.
  • I ran the ISBNs through LibraryThing and Amazon, counting tag numbers. I did it by hand through 100 before I decided to write a quick scraper.
  • I compiled the results and did some simple math. You can find my Excel file to the right.
The final results were 56,185 tags on LibraryThing, 5,528 on Amazon. Extrapolating on the sample, I conclude that Amazon has something like 1,337,388 tags in total, to LibraryThing's 13,593,069.
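
For anyone checking the arithmetic, here is one plausible reading of the extrapolation in Python: scale LibraryThing's total by the Amazon-to-LibraryThing ratio seen in the sample. It assumes the 13,593,069 figure is LibraryThing's known total, and the function and variable names are made up. The spreadsheet may have done something slightly different, so treat this as a sketch.

```python
# Sample counts and the LibraryThing total, as quoted in the post.
sample_lt = 56_185
sample_amazon = 5_528
lt_total = 13_593_069


def extrapolate(sample_a, sample_b, total_b):
    """Estimate site A's total from B's known total and the A/B ratio in the sample."""
    return total_b * sample_a / sample_b


ratio = sample_lt / sample_amazon
amazon_estimate = extrapolate(sample_amazon, sample_lt, lt_total)

print(f"LibraryThing has about {ratio:.1f} times as many tags in the sample")
print(f"Estimated Amazon total: about {amazon_estimate:,.0f}")
```

This gives a ratio a little over ten to one and an Amazon estimate of roughly 1.34 million, close to the figure above though not identical to the last digit.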

If anyone wants to duplicate the test, let me know. By default, LibraryThing doesn't think of tags ISBN-by-ISBN, so I'd need to give you an API to that data.

Problems with my method
  • It only covers books. Maybe DVD tagging is a different phenomenon. I note, however, that Amazon's page for bananas—yes, Amazon sells bananas—is overrun with Borat-themed tags.
  • The random books were drawn from LibraryThing. Maybe LibraryThing's ISBNs are unrepresentative of Amazon's ISBNs as a whole—that the sort of books that are tagged on LibraryThing are not tagged on Amazon. There may be some truth to this insofar as LibraryThing includes a lot of older books, while Amazon focuses on the new and in-print.
  • I only sampled Amazon's US site. LibraryThing has a fair number of non-US editions.
Let me know what you think

As usual, I'm dying to hear what people think about this post. I know it's imperfect—I bit off more than I could chew. But it says a lot of things I've been keeping in my head for months. Leave comments here. If enough interest develops, we can start something on Talk.

*Shouldn't it be "falafel"? And YES! O'Reilly's Culture Warrior IS tagged "falafel"! I swear I did NOT do it.
**Out of 60 unique tags applied to the book, I can spy only four that read like subject tags.
***Small numbers also mean Amazon is open to manipulation. One of the larger tags on their tag cloud is for "bards and minstrels," applied to 4,200 products by six taggers. The tag has never been used on LibraryThing, Flickr or Del.icio.us. I suspect a conspiracy.

56 Comments:

Anonymous Jason Lefkowitz said...

Great post! I would suggest one ancillary observation, though -- I think there's a very simple precept we can take away from your data:

People WILL tag things if the tags are useful to THEM.

People WILL NOT tag things if the tags are useful to SOMEONE ELSE.

In an app like LibraryThing, tags are useful because they provide a direct benefit to the user -- they help you easily find items in your collection.

In an app like Amazon, tags are less useful because the stuff you're tagging is not "your stuff" until you buy it -- at which point you will never look at it on Amazon.com again. So by tagging items there, you're not helping yourself, you're helping theoretical future shoppers interested in the same item. What's in it for the user? Nothing. So they don't tag.

2/20/2007 5:35 PM  
Blogger Tim said...

Yeah. I think you've got about 90% of what I argued, in 2% of the space.

I should probably have stuck with something that small. Sometimes I chafe against the size restrictions of a blog post. I know I should break things down into the "smallest bloggable unit." But sometimes I want to make a long form argument--even while taking advantage of the diminished expectations for quality and completeness blogs provide :)

2/20/2007 5:39 PM  
Anonymous Jason Lefkowitz said...

What can I say, I took my Strunk & White to heart :-)

Lots of interesting ideas here to chew on, though -- thanks for taking the time to write them all up!

2/20/2007 5:53 PM  
Blogger jed said...

Great analysis!

We're working on tagging and see the issue of incentives as central. Obviously we all agree: tagging has to work for the person who does it first of all, and then we can build social value on top of that.

Our approach increases each user's payoff from tagging. We train text classifiers on a given user's tagged items. With a few examples our software can apply the tag fairly accurately to lots of other items the user may have never seen. The user can continue training the system until the system knows that user's definition of the tag.

This would work extraordinarily well for books, except that the text of most books isn't available on line. If anyone can figure out a way around that, we'd love to help.

We're planning to release our code as open source; you can learn more at the peerworks site.

2/20/2007 6:31 PM  
Anonymous Jay Fienberg said...

I think you make great points about the conditions that make tagging a useful, integral feature of a website. But, it's worth noting that Amazon often adds "features" to its site through a very federated, non-integrated process.

I don't think anyone at Amazon said: let's design a tagging system that capitalizes on how people use the Amazon site, or that creates a new way for people to use the site. Rather, I think it was more like: we just found out that one of our teams built a tag feature, and lots of people in web tech are excited about tags—let's launch it as another widget on the page and run stats to see if it's worth developing further.

Another way to compare LibraryThing vs Amazon: tagging works more when it's a designed system (as in LibraryThing) than when it's merely an implemented feature (as in Amazon).

2/20/2007 6:37 PM  
Anonymous janeblum said...

I found this fascinating, as much for the questions it raised as for those it answered. Tags are also a hot topic in libraries, and some catalogs are adding tagging features for patrons.

I'll be really interested to see whether the users of library catalogs fall more into the Amazon or the LibraryThing pattern of tagging.

2/20/2007 7:14 PM  
Blogger Blue Tyson said...

Jason's right, I think. Why would I do any work for Amazon, a foreign company, for nothing, and for zero benefit for me?

2/20/2007 8:16 PM  
Blogger Paul said...

Tim:

Super post - lots of things to think about - I have a longer reply in my blog at Duke Listens! but here's some thoughts on opinion tags:

You talk about the problem with 'opinion tags' especially when tagging is light. You point to the difference in tags for Ann Coulter's book 'Treason' - Amazon has user tags like "craptacular," "evil" and "brain dead" while LibraryThing has tags like 20th century, 21st c., america 1975-present, American History, cold war, commentary, Conservatism - I find it interesting that none of the LibraryThing tags are opinion based, while almost all of Amazon's tags are. I think this may point more to a cultural difference between LibraryThing's and Amazon's users. Amazon has a long history of cultivating a culture of critique. Amazon users are accustomed to and encouraged to offer their opinions about everything they see on Amazon. Amazon's customers write reviews, they rate products, they make lists of products that they recommend. With this culture of critique it is not surprising that Amazon's tags are opinion-centered. LibraryThing, on the other hand, is more oriented toward people who want to organize their book collections. With this culture of organization, we see tags that are more subject-oriented. Amazon's tags are essentially one word reviews, while LibraryThing's are one word descriptions.

2/20/2007 9:00 PM  
Blogger Paul said...

And by the way, Defective By Design is not anti-Microsoft, it is anti-DRM.

2/20/2007 9:11 PM  
Anonymous Steve said...

Enjoyed reading your take on this, Tim. I think you (and those who have commented on it before me) have it right. People "feed the machine" of social networking sites because they feel like they're part of a community. LibraryThing is a community. Amazon's a store. Why should I feed a store?

Your analysis also fits nicely into a concept I've been working on at work the last two months. (I work at a public library.) I want to more fully integrate social networking ideas into as many aspects of the library as possible. I'd love to see tagging become part of our catalog, but if we don't make the entire library and website more of an interactive, energetic community, we run the risk of failing. We would be very much like a store.

We need to make our collections (and our activities) BELONG to our patrons.

Thanks again for your great post!

2/20/2007 9:19 PM  
Blogger Kath said...

I agree with the conclusions you've made and with the other commentators here. When I first read your study, I thought "Amazon is a store. Why would I tag items I'm buying?" I wouldn't even look at Amazon's tags because I don't consider the general mass (morass is more like it) of Amazon users to be nearly as literate, articulate, or as book-loving as the members of LT.

Tagging in LT not only prompts my memory about each book but it allows me to find similar books I want to read. Tags allow me to quickly list my books by subject matter or when I bought them or when I read them -- and lists are wondrous things to glance upon.

You must lay awake at night thinking this stuff up.

2/20/2007 9:59 PM  
Blogger Tim said...

>You must lay awake at night thinking this stuff up.

Sad but true.

2/20/2007 11:41 PM  
Blogger Tim said...

Here's the complete tag list for Coulter's book. It's got more oddballs than usual, I must say:

politics (42)
non-fiction (11)
Conservatism (7)
history (6)
conservative (6)
Current Affairs (5)
read (5)
political (5)
Nonfiction (4)
commentary (3)
borrowed (2)
government (2)
own (2)
hardback (2)
cold war (2)
US History (1)
political conservative (1)
america 1975-present (1)
unread (1)
Legally Trained Author (1)
terrorism (1)
read in part (1)
shelf-1T2 (1)
used (1)
public policy (1)
US politics and government (1)
zeitgeist (1)
contemporary conservative comm (1)
paperback (1)
Owned (1)
Liberalism (1)
islam (1)
history - us (1)
American (1)
21st century (1)
hard cover (1)
anti-communist movements (1)
Partially read (1)
republicans (1)
Women Authors (1)
WWII (1)
political science (1)
jingoism (1)
General & Current Events (1)
20th century (1)
US liberalism (1)
Philosophy (1)
military history (1)
Conservative Women (1)
filed c09-s3-b06 (1)
republican (1)
finished (1)
conservative politics (1)
overblown (1)
recommend (1)
Female Author (1)
find-library (1)
leftism (1)
US foreign relations (1)
American History (1)
biblical research (1)
American Political Thought (1)
finished 2004-03-09 (1)
espionage (1)
signed (1)
immigration (1)
conserative ideology (1)
political commentary (1)
NF Politics (1)
nojacket (1)
current events (1)
Autographed (1)
science fiction (1)

2/20/2007 11:51 PM  
Anonymous andyl said...

It should also be noted that amazon.co.uk doesn't allow tagging. Not that I think it would make much difference as I too feel that the only tagging that makes sense is to help me find stuff. In a book shopping situation I cannot see tagging being helpful at all.

2/21/2007 7:05 AM  
Blogger Tim said...

Yeah, I didn't look into that as I should have. Probably none of them do. So that covers maybe 5-10% of LT's books. Then again, if they combined across ISBNs, there would be no problem.

2/21/2007 7:41 AM  
Blogger Kurt A Beard said...

It’s interesting that Amazon can get ratings and reviews, and seems to be somewhat successful at that, but can’t get tags. That may be an interesting analysis to run, Amazon’s ratings vs. LT ratings: how many are rated, what their ratings are, etc. In the same vein as the tags, it would be interesting to see the review comparison from each site.
While I agree with the analysis at large, I may disagree with the critique of ‘opinion’ tags. They can be useful; let’s say Comic Book Guy from the Simpsons tags 3 separate books “Worst Book Ever.” Since he is a political liberal and I’m conservative, and the tags are applied to the political books he has been reading, his tags can become useful to me. If he tagged Ann Coulter's Treason and 2 of Bill O'Reilly’s books “worst book ever,” I learn these books have something in common: they are hated by liberals, and since I hate liberals (I don’t really hate liberals) I may like these books.

I may argue that opinion tags often contain a hidden meaning, though they are more about the tagger than the book. If a site were smart enough to separate the opinion tags from topic tags, the opinion tags could be used to give context to members. This context could be used to suggest books: if I have 5 books by right wing authors and give them negative tags, the site can begin to learn that I may not be a conservative. This could add another layer to the book suggestions and book ratings. Opinion tags could be cataloged and referenced on severity, hate being worse than bad and so on. This gets hard when we want to tag a book on global warming: I tag it “hot air” because it’s full of hot air and Comic Book Guy tags it “hot air” because it has a chapter called hot air.

So the tag “worst book ever” in conjunction with the tag “politics” tells me something about Comic Book Guy’s views. If I’ve tagged Treason with “best book ever” and “politics” the opposite tags are matched up to recommend a book by Bill O'Reilly.


p.s. If you want I can think of more bad examples.

2/21/2007 8:47 AM  
Anonymous Anonymous said...

The problem with 'tags' is that they are so dated. Tags from 2 years ago will be meaningless as they represent the then current pop dumbed down culture

2/21/2007 10:29 AM  
Anonymous Tito Sierra said...

Tim, I actually appreciated the lengthy post. You've included many interesting details. This topic would make for an interesting presentation. Are you going to shop it around?

There is a lot of debate over the merits of tagging, often devolving into the basic value argument between expert vs. user contributed classification (analogous to the wikipedia debate). But as with any method of organization the value of it depends on context. Motivation ("what's in it for me?") plays a big role in the success of user contributed content and I think you've got some data to back this up, in terms of tagging at least.

Regarding Blue Tyson's comment "why would I do any work for Amazon, a foreign company, for nothing, and for zero benefit for me": actually, many people freely do work for Amazon in the form of product reviews and Listmania! lists, which are significantly more work than tagging products. For those submitting reviews the incentive may be a desire to have their opinion heard, a way to vent about a bad product experience, or maybe they just want to increase their reviewer rank. I think user contributed content can be very successful in an e-commerce context, but the incentives have to be clear and compelling.

2/21/2007 12:06 PM  
Blogger Steve Lawson said...

Nice post. If you haven't already, you should see Joshua Porter's The del.icio.us lesson.

The lesson is that "personal value precedes network value," or, as has been mentioned above, people do stuff on the internet that is useful to them, not out of the desire to make a nifty tagsonomy.

2/21/2007 12:07 PM  
Anonymous johne said...

'I know I should break things down into the "smallest bloggable unit."'
For Heaven's sake, why?
Fascinating post.

2/21/2007 2:42 PM  
Blogger Tim said...

>'I know I should break things down into the "smallest bloggable unit."'
>>For Heaven's sake, why?

Well, that's the advice. I think it's more "blog"-like. There's also a "leave them wanting more" aspect. Blogs aren't very good for long-form presentation plus argument.

Then again, blog conventions to heck—I use footnotes after all :)

2/21/2007 2:45 PM  
Anonymous Anonymous said...

Who needs tags on Amazon when you have wishlists? Each list is a category/tag. I never tag things on Amazon; in fact, I ONLY tag things on Delicious, nowhere else.

2/21/2007 3:07 PM  
Anonymous Anonymous said...

Great post, great comments, something else to think about:

People here are saying they don't want to tag for other people, but have you read Amazon reviews? So in-depth, so pleading with you to watch this movie or avoid this iPod accessory, so sincere and insistent - and all unsolicited and uncompensated.

So clearly it's not the mere fact that tagging is for someone else - after all, reviewing a product you've already bought and consumed in some way doesn't really change your purchase or consumption.

So two suggestions then:

1) Why doesn't Amazon do impromptu tag clouds based on user reviews? Since they get a lot more push out of them anyway, just generate tag clouds based on the reviews and you should at least get some useful descriptions and consensus opinions.

2) Along the same lines, why not offer the tag box along with the review box? Amazon has these two separated, but it seems that people writing reviews would be much more likely to contribute a list of tags than your average person.

Finally, there's the observation that tags work best in a niche environment: tag your books at LT, tag your links at delicious, tag your news at technorati, and ne'er the twain shall meet. So universalist Amazon suffers because their users are numerous but dispassionate.

2/21/2007 3:45 PM  
Blogger Amanda Ellis said...

Fantastic article. I think you should approach WIRED to do a longer article and look at Flickr and del.icio.us etc too. These sites make it harder to tag with more than one word, but the users have more incentive to use the tags (your "it's theirs" argument).

In relation to your Problems with my method section

* The random books were drawn from LibraryThing. Maybe LibraryThing's ISBNs are unrepresentative of Amazon's ISBNs as a whole—that the sort of books that are tagged on LibraryThing are not tagged on Amazon. There may be some truth to this insofar as LibraryThing includes a lot of older books, while Amazon focuses on the new and in-print.

You could do further analysis on how representative they are by looking at the user's data source (the LT source field). A lot would have used Amazon for the covers, except for the librarians who want their immaculate data!

* I only sampled Amazon's US site. LibraryThing has a fair number of non-US editions.

True! But because the US is the largest publishing market for books in English a lot of US editions show up in places like Australia, where a special interest title like No Plot? No Problem!: A Low-Stress, High-Velocity Guide to Writing a Novel in 30 Days would be unlikely to sell enough copies to justify a local edition. Amazon.com has been one of the top ten ecommerce sites used by Australians for over a decade because of the larger choice and cheaper books (booksellers charge like a wounded bull for imports). So using Amazon data is still representative of the international--English centric bookworld.

2/21/2007 6:33 PM  
Blogger xiao said...

An interesting discussion. I wonder how the difference in demographics plays into this. I mean, Amazon is huge, practically a household word in the online marketplace. Imagine if Wal-Mart allowed their customers to tag their purchases. Wouldn't the spirit and tone of the tags used differ from those collected by a less well-known retailer?

2/21/2007 6:41 PM  
Blogger Ray said...

The most important "tag" on Amazon is the price tag.

The best books are not always best sellers. We all certainly hope that great books by great authors will have great sales.

Opinion_Tags, i.e. this_really_sucks or I_love_this, are just bundled tags that act like a phrase. I have used them myself before I knew they had a name. They do help the memory sometimes, and do not have to be opinion related.

Well, Tim, I think you have opened up Pandora's box with your thoughtful post.

What should I tag your post as?

2/21/2007 9:59 PM  
Blogger vanderwal said...

This was good, but over the top in points. Amazon did start adding tags to some of its users' pages in December 2005, but most did not get them until mid-way through 2006, and then they were at the bottom of the page.

Amazon has a vastly different audience with a different purpose. It also offers many alternate ways to annotate, hold, and share their items for sale. Sites like discomusic.com have tagged their top picks so that fans of the site can find them easily.

The DefectiveByDesign tag is anti-DRM and has half as many tags as DVD, the true top tag. Many people who use Amazon tags use DefectiveByDesign to know if their CD will keep them from copying the music to their MP3 player. It is a well loved tag by those who know what it means and want to avoid it. One of the reasons people tag is to fill in missing metadata, and Amazon does not indicate DRM protections on the music or devices it sells. Tagging stands out better than reviews to make clear the status of that product.

The info about LibraryThing users' scaling and use of tags is really good info.

Amazon has been iterating their tagging quite a bit in the last 4 months and has been seeing 10 fold monthly growth in the last three months. I have been using Amazon's use of pivots as examples in presentations and have been watching and tracking its growth. There are still some things that Amazon could do with tagging that would help it grow, but tagging is competing with reviews, ratings, wishlist, listmania, friends sharing, recommendations, and sharing purchase lists.

2/21/2007 10:33 PM  
Blogger Tim said...

Thomas:

Thanks for the corrections.

1. You're right, Music and DVD (and Fiction, but very close so it may move around) are bigger than DefectiveByDesign. I also appreciate the DBD point. But I don't think that calling Vista, the Zune and the iPod DBD fills in gaps. If you're aware of the phrase, you are certainly aware of where the iPod stands with respect to DRM.

2. I am really girding my loins to doubt the idea that Amazon's tagging has been "seeing 10 fold monthly growth in the last three months." (You mean ten-fold over three months, not each month, right?) It doesn't accord with my previous, albeit less thorough, investigations of the LT/Amazon gap. Could it be growth in non-book tags? Where are your numbers coming from?

Looking closer at the tag cloud (and that nice tool-tip) suggests that I could do some tag-by-tag comparisons. They are crazy lopsided (e.g., Fiction 5,816 vs. 846,507, history 3,509 vs. 235,365). Although I'd need to analyze many, it suggests another interesting fact about Amazon tags--they cluster much less tightly than Amazon tags. That cuts both ways, but it's interesting.
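
To put a number on "crazy lopsided," here is a minimal sketch that divides the larger count by the smaller for the two tags quoted above. It assumes the first figure in each pair is Amazon's and the second is LibraryThing's (the comment does not say which is which), and the names `tag_counts` and `lopsidedness` are made up for illustration.

```python
# Tag counts as quoted in the comment above.
# ASSUMPTION: each pair is (amazon_count, librarything_count);
# the comment does not label the two sides.
tag_counts = {
    "fiction": (5816, 846507),
    "history": (3509, 235365),
}


def lopsidedness(counts):
    """Return, per tag, how many times larger the second count is than the first."""
    return {tag: second / first for tag, (first, second) in counts.items()}


ratios = lopsidedness(tag_counts)
for tag, ratio in ratios.items():
    print(f"{tag}: roughly {ratio:.0f} to 1")
```

Two tags are far too few to say anything about clustering, so this only illustrates the arithmetic; a real comparison would need the same calculation across many tags.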

2/21/2007 10:50 PM  
Anonymous esme said...

Good post! I've wondered about the utility of Amazon's tags, myself, and here's where I think I land on it (thinking only about non-opinion tags for the time being): When we tag our books on LT, we're organizing in ways that make sense to us based on our experiences with the books. At Amazon, I'm almost always going to be looking up a book when I don't own it and likely have never read it -- instead, I'm thinking about purchasing it for myself or a friend. Even if Amazon were to give me an incentive to tag the book, I wouldn't have a meaningful way of doing so; I'd be able to do nothing more than approximate some general subject categories based on what I've heard or where I've seen the book shelved at Barnes & Noble. (For example, Becoming Madame Mao, which I've tagged "multicultural" and "historical fiction" in my own library, would probably have gotten a "fiction/literature" if I'd tagged it on Amazon before reading it; C.S. Lewis's "Space Trilogy," oh-so-cleverly tagged "space trilogy, fantasy, religion" in my library, would likely have gotten a "fiction," or even a completely misleading "psychology" since I purchased them for a Psych class.) In that case, you'd be better off just downloading card catalog-type subject information from the Library of Congress.

So I guess what I'm saying is that I find the value of tags to be in their context -- that each one developed and was applied based on an experience (whether mine or someone else's). On an e-commerce site, a large percentage of non-opinion tags must necessarily be context-free, and they're likely to remain so until Amazon can convince us that there's some worth in going back to tag those books we bought long after the sales transactions have been concluded.

2/22/2007 12:17 AM  
Anonymous CR Haynes said...

Esme makes a good point about not having a meaningful way to tag items at Amazon, because if you don't own it yet, you can't describe it well.

I've found that tags work well on sites that have no navigation or cataloging system whatsoever, and whose content is not self-describing (like Flickr and delicious).

Tagging fails when a site already has a highly structured navigation and a category system in place. Both Amazon and Slashdot have good text searches and fairly rigid categories for their content. User tags add _another_ layer on top of this, and it becomes superfluous. And so tagging fails at both sites.

2/22/2007 1:37 AM  
Anonymous BGF said...

Fascinating, especially since I've never thought about tags much at all, let alone in the ways you've brought it up.

There are only two places where I use tags, my LibraryThing and my blog. In both cases the tags are an organizational tool for my benefit alone, set up in a way that best works for me. The thought of someone else's use of my tags never, ever hit, and that could be because to me LT is more a repository for me than a social thing. Perhaps that's why I've ignored the 'merge tag' feature on LibraryThing, because I don't care how other people arrange their stuff, and don't want to use their guides. I certainly wouldn't want a librarian or commerce outlet's tagging system used as default for my system.

Until you pointed it out, I didn't even notice that Amazon now had tags! That's probably because of the way I use Amazon. When I approach other people's blogs/sites, I've assumed their tags are of primary meaning to them. I only notice and click on a tag if something interesting/impressive about a post makes me want to see what else they may have in that category.

Thanks for putting this post up. Glad I read it! Interesting stuff to learn about.

2/22/2007 2:34 AM  
Anonymous Anonymous said...

Amazon changed their tags system somewhere along the way, causing me to have to change my tagging labels and I'm guessing causing others to abandon the idea of tagging on Amazon. (About Wishlists, yes I used them for cataloguing, but only for cataloguing books I haven't read but want to.)

Here's the problem I had with Amazon tagging (though it didn't use to be):

I (a Christian Zionist) used to have tags like "israel" or "good bibles," and others would use the same tags. Amazon initially showed the entire group of all books using those tags, but also broke it down by the users using the tags. A visitor to my profile page wanting suggestions for a Bible or a book on Israel from my Christian dispensationalist perspective (as indicated by my profile and reviews) could click on my tag and see not only everyone's suggestions, but those that were uniquely mine. So my tag could be helpful for someone who identified with my worldview and just needed guidance. However, somewhere along the way Amazon changed it so that the tags no longer showed the individual user's collection in that tag. So to differentiate I then had to relabel my tag "israel" to be "israel-" and then when others started using "israel-" for books I'd definitely not recommend, I again changed it to "israel--" and started using tags less. Same with "good bibles," you can imagine, one person thinks a real loose, yet inaccurate paraphrase is "good," so I had to change it to "good bibles-". Or take the tag "bad theology," obviously a nondispensationalist will think my books are bad while I'll think theirs are bad, so I had to make a tag "bad theology-."

But from a quick glance on my visit to LibraryThing, it looks like each person can display their own collection within the broader tag, so it's more useful to self and all.

("Encompassed Runner" on Amazon)

2/22/2007 8:52 AM  
Blogger peta said...

It would be interesting to see how the tagging behaviour of librarians on librarything differs from that of non-librarians. Do the members of the Librarians who Librarything Group have more or less tagging in their collections, and with a culture of describing resources for the benefit of other users do they consider that when tagging their own collections?
I believe the Librarians who Librarything group is the largest group - does that have some influence on why tagging has been much more successful on Librarything? If they tend to tag more than others then creating that critical mass for other users to build on may have been a driver for others to see why tagging is a good thing.

2/22/2007 6:02 PM  
Blogger Patrick said...

This is a great analysis Tim, and I wouldn't apologise for the detail - this is what distinguishes possibly true opinion from a grounded understanding. I only wish I had had this piece when I was writing my folksonomies chapter for my new book - it will have to go in edition 2! (There's an early version of the chapter at http://www.greenchameleon.com/gc/blog_detail/folksonomies_and_rich_serendipity/). I have been looking at social tagging from the perspective of enterprise knowledge management, and while you've focused on ecommerce, the lessons you draw are highly relevant to that environment too.

You've hit the nail on the head as far as I'm concerned - in respect of ownership and self interest, social exposure of personal collections, and getting a critical mass of tags to be able to see "wisdom of crowd" patterns.

I think an interesting next step would be to take your rule of thumb, e.g. that "a thousand is almost always enough" - how do we know when an item has aggregated sufficient tags to be tagged authoritatively? Or alternatively, what is the critical number of taggers a content item has to be exposed to, for meaningful tagging patterns to emerge?

Your article suggested to me another factor I hadn't really considered before, which is aggregation of tags across editions/iterations - it seems such a simple thing but not obvious to implementers.

I think there is one slightly weaker part of your comparison with Amazon that you haven't fully accounted for, which is that Amazon uses tagging as only one late addition to a range of many other findability strategies such as "people who bought this book also bought", reading lists, ratings, Amazon categories, personal preference algorithms etc etc - so while you acknowledge that tagging has not been given prominence in Amazon, this doesn't fully recognise that the findability support is extremely strong, and the incentive to tag is therefore less strong - I don't have to look after myself by tagging "my stuff" (which I can have in a looser way on Amazon too) - whereas on LibraryThing, tagging is your members' primary findability strategy apart from author title stuff (I'm a member too).

But great stuff... am blogging it today, and I look forward to seeing more on this!

2/22/2007 8:05 PM  
Blogger JWG said...

Good info in this post.

One possible explanation lies in the fact that people who visit a particular book's page on Amazon are usually looking to acquire the book, and are much less familiar with the content. I know that I use Amazon to shop for books that I don't own and don't know very well. I have absolutely no incentive to go back to Amazon after I've read a book and put tags on it then.

In short, I think that people on Amazon are less knowledgeable about the books they are viewing and therefore less likely to provide accurate tags.

2/23/2007 8:37 AM  
Anonymous Patrick Maué said...

Here's another suggestion:

* provide career opportunities and give out rewards.

Busy and well-rated reviewers get into the Top100 of the Amazon Reviewers. Do the same with taggers. Give out gift certificates if a user's tags are proven to be good.

(also discussed in my blog)

2/23/2007 11:43 AM  
Blogger paper said...

Another easy way to up the use of tags on Amazon would be to incorporate them into their recommendations algorithm. That's the only reason I ever play with the ratings on Amazon, and it's the only reason I think I would ever tag things there.

2/23/2007 9:00 PM  
Blogger Blue Tyson said...

Do they give the top Amazon reviewers money, cash, free books, shipping, or anything, out of interest?

2/23/2007 9:09 PM  
Blogger Debra Hamel said...

Google's come up with a way to make tagging images fun: http://images.google.com/imagelabeler/

2/24/2007 12:38 PM  
Blogger Wendy said...

Great post! I have some thoughts about the library questions others have asked. I think that people who have mentioned that we go to Amazon to shop (and therefore aren't ready to tag, since we haven't read the books yet) hit the nail on the head. And the same is true of the library. When I check out a book, I wouldn't tag it at that time - I'd only tag after reading it. And, to be honest, I don't think I'm likely to go to the library website after the fact and enter tags, unless I was prompted in some way to do so: if, for example, the library sent me an e-mail asking for tags after I'd returned a book. I'm pretty sure this will never happen in my current library system, though, since they deliberately wipe out the records of who checked out which books as soon as books are returned (to prevent them from having to give info to govt. agencies if asked). So, in all honesty, I don't think I would be much help in any library tagging system, unless there was some way to remind me (it's not at all that I'd want compensation; just that I wouldn't think to go and do it without some outside encouragement).

As someone above mentioned, I only use tags here at LT and on my blog, and pretty much only for my own ease of use. I love the fact that my tags might be helpful to others, but that's a side benefit, not my main purpose in tagging.

2/25/2007 3:18 AM  
Anonymous Lenny said...

For what it's worth, defectivebydesign was the only Amazon tag I ever found useful.

Since Amazon introduced the feature, I thought it was a bad idea, since there is no reason to tag. I tag bookmarks at del.icio.us prolifically, but I tag first for myself, and benefit from the social/sharing aspect of the site as a side-effect.

2/26/2007 5:16 PM  
Blogger St. Dunstan Library said...

Tim, I'll buy all three of you a cup of coffee -- just let me know and I'll send a check or cyberpay. If it's local or even DD, go for two. If it's Starbucks, only one, not that they're evil but that they're too expensive.
And why? Because you, Tim, are super-duper! And I believe you do it because you love it and you can, and you're smart, as smart as all those West Coast fellows who started those innovative businesses. Thanks (and let me know if you need some DD coupons) ...
A Yankee from the Shell,
JLH in NC

P.S. I have notes on points made by respondents here, things I wanted to respond to -- "impromptu tag clouds" -- yay! I'll vote for that! -- your generosity in giving tips to Amazon -- the large number of informed responses to your posts -- Pandora's Box! -- Esme's post! -- and Wendy's -- all fine -- so for now I just say hooray for Tim and for LT!

2/26/2007 8:10 PM  
Anonymous Gerhard Jan said...

Nice post. But actually the second statement in Jason Lefkowitz' comment - People WILL NOT tag things if the tags are useful to SOMEONE ELSE - seems to be missing the point, because what is useful to SOMEONE ELSE need not be unuseful to THEM. And Debra Hamel is right. Google Image Labeler, based on ideas of Luis von Ahn (see his talk on human computation at Google, July 2006), is an example of tagging made fun. The crux is that the developer of the tagging system does well to think in terms of isolating the tagging activity from the application in which the tagging results are put to use (tag interoperability). There are many ways in which you can involve an audience, although one should be cautious of tagging artifacts, i.e. the task at hand may be decisive for your choice of labels. Which, by the way, offers additional opportunities for developers!

2/27/2007 8:28 AM  
Blogger JoelSapp said...

Excellent Post. It does seem correct that a retail company would have competing interests between social networking and its bottom line.
The most valuable space, where the reader's eyes are first drawn, is most probably taken up by something that will generate revenue.
Tagging elements may need to be in the same space to gain wide usage and understanding.

3/03/2007 5:20 PM  
Anonymous Anonymous said...

You said 1000 tags are almost always enough. Were you including the names of authors as well as people - fictional or not - who are the subjects of those books?

I've been playing with the tag cloud at DailyKos and your article was mentioned there today. Glad to have found you.

3/03/2007 11:24 PM  
Anonymous Lee said...

It may be helpful (especially on LibraryThing, and also on Amazon) to structurally segregate tags by type, including separate tag types for content, opinion, and status. The content type “anthropology” and perhaps status types “loved it” have shared social value. The status type “I’d love to read this” or “stored in the basement” has little or no value to others.

3/22/2007 9:12 AM  
Blogger Tim said...

I hear you, but it would add bother to the equation, and that's the best thing about tags: no bother. And people would disagree on these categories. You'd need to basically tag tags. There's no end to the metadata you can put on metadata; it's turtles all the way down.

Also, check out "unread." I submit that although personal, it's actually a little funny. Not all books are equally unread...

3/22/2007 1:41 PM  
Blogger Ben Still said...

Fantastic analysis - thank you for sharing this information. Your post has helped us rethink the process of just grabbing one of the usual UI elements (tag clouds et al) and sticking it in a design!
We ended up using a text cloud approach rather than tags, based on ideas in a post by Joe Lamantia. You can have a look at what we did, and download source code on our blog, which hopefully will be useful to someone else! Thanks again.

7/09/2007 2:44 AM  
Blogger Wendy C. Allen a.k.a. EelKat said...

WOW! What a great post! I never realized the effect of tags before.

On Amazon when I tag somethingm, it's usually for my benifit not for other users benifit, as a result I only tag book I buy or intend to buy from Amazon.

I've been buying stuff from Amazon sice 1997, so I've been a regular Amazom surfer since their very early days. However, I rarely use their tags. I don't see Amazon's tags as very user friendly. They are actually kind of pointles, because they don't help you to search for books on their site, they are just kind of stuck in haphazardly.

On LibraryThing, though that is differant. I can use LibraryThing's tags to find books I might like, to find others who have the same tastes in books I have, etc. It' loads of fun, and it's one of the things that makes LibraryThing so great. LibraryThing is much more user friendly with tags than Amazon is, so I tag everything in sight! LOL! So far I've onlt tagged about half of my books, but eventually I'll get them all tagged. I just do it a little at a time here and there so that I don't get bored by doing it all at once.

Over all, judging from the way I personaly use Amazon tags vs LibraryThing tags, I can see how Amazon came so low on the scale while LibraryThing came in so high. LibraryThing tags are just made better, are more user friendly, and help users better than Amazon tags do.

~~Wendy

9/28/2007 3:20 PM  
Blogger BrianFH said...

In the end, the only reason to tag others' books is if you care how many of who reads them. It's a persuasion thing, pro or con. If you don't give a rat's, it will seem like a waste of time, no matter how user-friendly.

Amazon's "also bought" is much more appropriate for a seller; it's real-world action, and reflects the material in the book from a "read" instead of "unread" POV.

I think libraries would also have a hard time motivating useful tagging; why does one user care what another one checks out? BUT -- a list of "this reader also checked out ..." would be much more informative. If you MUST use tags, I suggest a little sticker to go on each book as it's handed back with room for, say, 10 tags. OCR might work on the stickers, otherwise they'd have to be hand data-entered.

BTW -- Wendy; are you a recent school product? Yore spelin sugs.

Tim: in one of your comments you say, "Although I'd need to analyze many, it suggests another interesting fact about Amazon tags--they cluster much less tightly than Amazon tags." I can't make this make sense.

It seems to me there are two utterly distinct tagging functions being conflated here: grouping and rating. Their goals and methods and contents would be very different.

Kurt: your suggestion presumes the tags can be linked to known taggers, which is kind of hard to envisage being very handy to do. BTW: vain, not vane, and worst book, not word book.

10/08/2007 10:51 PM  
Anonymous Sharon G said...

Great post!
Tim, I know this is a little old but I couldn't find the Excel file. Is it still online?
And also, can you please provide the dates when the data was scraped? (I'd like to cite this...)
Cheers
Sharon

4/21/2008 7:55 AM  
Blogger Tim said...

Click the Excel graphic within the post to download the file. If you don't think you got it, it probably downloaded to your computer.

As for the date, it's in the days immediately preceding the post.

4/21/2008 9:33 AM  
Anonymous tagsoda said...

Hi Guys,

Can I talk with someone about my new ecommerce tagging web site? I would like someone to write about it.

www.tagsoda.com

thanks
joel

5/06/2008 11:04 PM  
Blogger prodys said...

Yeah, I didn't look into that as I should have. Probably none of them do. So that covers maybe 5-10% of LT's books. Then again, if they combined across ISBNs, there would be no problem.

9/21/2008 4:18 PM  
Anonymous Anonymous said...

I tag stuff at Amazon, but I use del.icio.us to do it. Why use a different system for each site or portal, when you can use one for the entire web?

7/25/2009 7:28 PM  
Anonymous China said...

I was just doing some research on Amazon tags and found this old blog post. A bit dated now in 2010, but fascinating nonetheless, especially compared with the evolution of Amazon's tags over the past 3 years. I think all I have to add now is that independent authors rely on Amazon's tags for visibility on the site as a substitute for lack of sales/customer reviews. Tags offer unknown authors a fighting chance at being noticed in otherwise crowded genres. I'm sure once this catches on, Amazon will tweak the system, just to keep the little guy down, but for now, I think tags are more important than ever. Thanks for the data!

4/21/2010 2:55 AM  
