Random tags for scholars
Someone asked me to come up with a page of truly random tags for an academic project that needed to assess typical tagging. It might prove interesting to other students and scholars doing projects on LibraryThing.
Here's the page.
It's an HTML page, not an XML feed or other such format. Techies will scoff, but I've been asked for a lot of data like this, particularly from MLS students. The people who can easily parse XML in programming languages are not generally writing graduate school papers.
Labels: folksonomy, tags
9 Comments:
Tim, quite a few of the tags just say "private member" - can you add on a script to exclude private libraries from the pool you're drawing from?
Yeah, the user wanted to know how much was being excluded. Meh. I don't think it hurts too much.
Interesting stuff. One discovers all sorts of things looking at random data. Until now, I was blissfully ignorant of 'sports sex' as a genre.
How are you making the random choices? Is it choose a random work, then choose a random tag from that work, or do you start with the tag? I ask because it can make a difference to the analysis (depending on what the students are doing with it).
I can understand you not offering it in XML given who wants it. But CSV should be accessible to most students without effort, and would make some tasks a little easier, such as sorting.
If only there were some hybrid of HTML and XML.
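For anyone who does want CSV, here is a minimal Python sketch of the conversion. It assumes the tags sit in an ordinary HTML table, which is a guess about the page's markup and not something confirmed here. The class name `TableRows`, the function `html_table_to_csv`, and the `SAMPLE` data are all made up for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Invented sample markup; the real page's columns may differ.
SAMPLE = """
<table>
<tr><th>tag</th><th>work</th></tr>
<tr><td>fiction</td><td>Work A</td></tr>
<tr><td>history, military</td><td>Work B</td></tr>
</table>
"""

if __name__ == "__main__":
    print(html_table_to_csv(SAMPLE), end="")
```

The resulting text can be saved with a .csv extension and opened in a spreadsheet for sorting. Tags that contain commas are quoted by the `csv` module, so they stay in one column.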
How are you making the random choices? Is it choose a random work, then choose a random tag from that work, or do you start with the tag?
My assumption is that he's starting with the tags, picking one at random, and then pulling out the individual using the tag, the work being tagged, and the rest of the tag cloud assigned by that individual to that work. (Tim, am I right?)
It's very interesting because it's not just a generic tag, but a specific instance of the use of a tag, which is actually even more specific than I originally envisioned. Thanks again!
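To see why the starting point matters for analysis, here is a small Python sketch comparing the two schemes discussed above. It is an illustration under stated assumptions, not a description of how the page is actually generated: the data in `TAG_USES` is invented, and the function names are hypothetical.

```python
import random
from collections import Counter

# Toy data, invented for illustration: each tuple is one tag use
# (user, work, tag). None of it comes from LibraryThing.
TAG_USES = [
    ("alice", "Work A", "fiction"),
    ("alice", "Work A", "classics"),
    ("bob", "Work A", "fiction"),
    ("carol", "Work A", "fiction"),
    ("bob", "Work B", "obscure"),
]


def sample_tag_use(rng):
    """Scheme 1: pick one tag use uniformly at random."""
    return rng.choice(TAG_USES)


def sample_work_then_tag(rng):
    """Scheme 2: pick a work uniformly, then one of that work's tag uses."""
    works = sorted({work for _, work, _ in TAG_USES})
    work = rng.choice(works)
    uses = [use for use in TAG_USES if use[1] == work]
    return rng.choice(uses)


def tag_frequencies(sampler, n=10_000, seed=0):
    """Draw n samples and return the share of draws that hit each tag."""
    rng = random.Random(seed)
    counts = Counter(sampler(rng)[2] for _ in range(n))
    return {tag: counts[tag] / n for tag in sorted(counts)}


if __name__ == "__main__":
    print("tag use first:", tag_frequencies(sample_tag_use))
    print("work first:   ", tag_frequencies(sample_work_then_tag))
```

In this toy data, "obscure" is one of five tag uses, so sampling tag uses directly returns it about 20% of the time. Sampling a work first returns it about 50% of the time, because "Work B" is one of only two works and carries no other tag. Heavily tagged works dominate the first scheme and rarely tagged works are over-represented in the second, so anyone estimating "typical" tagging needs to know which one produced the sample.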
Interesting way to find spam postings: see line 340, and the other 'books' in that user's/author's library.
Anonymous, the page is different every time, so your spammer has slipped through the system, I fear :(
Feature request: accept a parameter, minimum_uses, and only return tags that have been applied to at least that many things.
>jm
Can't do that accurately. I'd have to work off summary tables (i.e., frozen information).