Archive News #3 – How is the Archive developing? + More about tags!

Welcome to our third Archive news post! These regular posts are a venue for us to answer some frequently asked questions about the Archive of Our Own. Questions about the Archive Roadmap, Tags, and Warnings…right under the cut!

Please leave your questions about the Archive in comments and we’ll answer them in upcoming posts. (This is a space for more general questions – if you have specific comments about the design or usability of the Archive, please send feedback on the Archive site itself, so it goes into our bugfix and design process.)

How far through the Archive Roadmap are the coding team?

You can view the AOOO Roadmap on our website. We’ve completed the basic work up to version 0.6, although we want to make some major revisions to Searching and Browsing, and we are also enhancing our Bookmarks.

When will you add the big features which are still outstanding?

Search and Browse
This is already in place, but we’re doing some intensive work to make it faster, stronger and better. We expect to finish our redesign by August 2009.

Bookmarks
Bookmarks are already in place, but we want to refine and add some features. We expect to finish our redesign by August 2009.

Collections and Challenges
We’re in the process of designing this feature, with the help of all the lovely people who contributed their thoughts on scenarios. We hope to have it fully designed and coded by October 2009 but are aware this is a big piece of work. In the best-case scenario, we hope to run a pilot this year; either way, we should be able to start accommodating larger archives around the New Year.

Subscriptions
This is scheduled for development during July-October 2009, when we will be welcoming an intern from Darmstadt University, Germany, who will be leading the design and coding for this area.

Will the Archive host fanworks other than fic?

Yes! The OTW supports all forms of fan creativity, and the Archive will ultimately host fanart. We’re in the very early stages of planning how we’ll host fanvids, which will probably be their own project rather than part of the AOOO. Thankfully, we can take advantage of existing and developing open source solutions for hosting online video: we won’t have to build a vid archive from scratch.

What does it mean for a tag to be ‘common’?

If you’ve ever clicked on a tag, you may have noticed that the page that comes up tells you ‘This is (or is not) a common tag’ – see the tag John Sheppard/Radek Zelenka for an example. ‘Common’ tags are used as options in our search filters (the ones you see on the right-hand side of the page, marked ‘filter your results’). This allows us to avoid having multiple different versions of a tag all showing up in the filters at the same time – so you don’t have John/Radek as well as John Sheppard/Radek Zelenka. The filters still find the alternate versions, because they’re all connected to the ‘common’ (aka ‘canonical’) tags behind the scenes. (See our previous news post on tags for a bit more information about this.)

I have used my tags multiple times, but they’re not marked ‘common’ – why not?

It may be that your tag has been connected to another tag – for example the tag John/Rodney has been merged into the Rodney McKay/John Sheppard tag.

If your tag is a freeform tag, for example kink_bingo, santa, or roadtrip, you might find that it has not been marked common or connected to any others. This is because our tag wranglers haven’t wrangled it yet, as we’re still working on the best way to use freeform tags in our filters. Once we figure out the best way to handle these, we’ll incorporate them into our search and browse so that you can easily find fics tagged in a particular way. However, this has a big potential impact on performance, since there are a lot of these tags. We’re currently rebuilding our search and browse interface in order to improve performance while enhancing choice and control on the part of the reader. In the meantime, our tag wranglers have been holding off on marking most freeform tags common and focusing on dealing with fandoms and characters.

How can I search for fics with tags that haven’t been marked common?

Our search box searches all fields, so you can search for anything you like in there. In response to popular demand, our latest code update adds some new tag functionality – clicking on any tag will now bring up works which use that tag. We hope to bring some more power to tag searching in the revamp of our search interface, but in the meantime we think it should be reasonably easy to find what you are looking for.

I’m not sure how to tag my crossover / rpf / other fic.

We mentioned in the last post that we’re still figuring out the coding logistics for crossovers and real person fiction. However, this work relates entirely to things which will happen behind the scenes, so please don’t let it delay you in posting fic to the Archive! Just tag your fics in the way that feels right to you, and let our tag wranglers deal with any headaches that may arise. The one thing we do ask is that you be generous with your tagging information – for example, giving characters’ full names makes life a bit easier for us.

How are you planning on organising the fandom tags for Books and Literature? They seem a bit inconsistent.

It’s true that the canonical tags for Books and Literature do not follow one consistent pattern – books fandoms vary a lot in the way they are referred to (book title, author name, series), so it’s almost impossible to use a ‘one size fits all’ approach.

Our official standard is to use ‘Book / Series name – Author name’, for example Chronicles of Narnia – C.S. Lewis. However, this standard is still evolving, and there are some complexities relating to fandoms such as Georgette Heyer which don’t fit neatly into this form. We’re working on some code that will make it easier for us to manage tricky fandom names like this, and we’ll likely make some changes to introduce more consistency in the future, but for now we’re focusing on having tags which will be meaningful and familiar to our users rather than on a single unified form. If you’re wondering how to tag your own fic, then please do use a format that seems sensible to you rather than worrying too much about the ‘right’ way from our POV.

When you wrangle tags, how do you deal with connections between tags which aren’t exact equivalents?

As we mentioned in our previous post, our tag wrangling process connects up tags which are synonyms, such as SPN and Supernatural. Some of you have pointed out that some tags may technically be synonyms, but the individual terms carry levels of nuance. For example, Alternate Universe and Alternate Reality are used interchangeably by some people, but others make a distinction. We currently don’t have any way to add nuance to our tag relationships – we’re working on ways to categorise tags as ‘related but not identical’, but we haven’t come up with the right code yet. In the meantime, we take these on a case-by-case basis – as a general rule we won’t mark tags as synonyms if there is the potential for confusion.

I want to add more warnings to my work than your standard warning tags allow.

When our Content Policy team decided on our warnings policy, they wanted to ensure that authors could give (and readers could get) as much information as they wanted about stories. This is one of the main purposes of our freeform tags – you can add as much information to your story as you see fit, and readers can add their own additional descriptions and tags if they choose to bookmark the fic. One of the key themes which has emerged in the recent debates on warnings is that many people just want more information about stories generally. Our freeform tags let you not only add other warnings to your own story, but also label your stories with themes, kinks, or other story elements readers might want to know about, and tag other stories on the archive with the information you find most relevant.

Our standard warning tags, which include “choose not to warn”, “choose not to warn for some content”, and “none of these warnings apply”, were designed to help the reader decide whether or not to seek out additional information about a story through the tag system, rather than as a comprehensive list of all serious warnings. We want to keep the number of core warnings low because those warnings are enforceable: if a story contains major character death, and is labeled “none of these warnings apply,” a reader can report that story to Abuse, who can contact the author and, if necessary, change the story’s label to “choose not to warn for some content.” The same applies for graphic violence, underage, and rape/noncon. The bigger the list of core warnings, the more difficult they are to enforce (not even taking into account the even greater difficulty of defining categories such as “dubcon”), especially given that our policy is to defer to authors’ judgment in close cases.

But if you’re someone who prefers not to have too much information about a story before you read, fear not! You may already have noticed that logged-in users have an option to hide all warnings by default (you set this on your preferences page, and when it’s enabled you can still reveal warnings on a case-by-case basis). We’re planning to add the same option for tags, so that if you prefer surprise!alien spiders–or even surprise!incest–you can have it – watch out for this new feature in a future site update.

As the latest discussion on warnings reminded everyone, it’s hard to find a solution that works for everyone – but we think the model we have developed is a good balance.

We hope this post answers a few of your questions! Please leave other questions and comments here. We won’t always answer comments on this post directly – we’ll put your feedback into our pool of things to answer in future posts.

Archive News #2 – Tags

Welcome to our second Archive news post! These regular posts are a venue for us to answer some frequently asked questions about the Archive of Our Own. Everything you wanted to know about tags–right under the cut!

Please leave your questions about the Archive in comments and we’ll answer them in upcoming posts. (This is a space for more general questions – if you have specific comments about the design or usability of the Archive, please send feedback on the Archive site itself, so it goes into our bugfix and design process.)

This week we’re looking at something quite specific: the way tags are used on the Archive. This is a bit more detailed than a lot of the posts we plan to have in this slot, but as tags work a bit differently in the Archive than on other sites you may use and we’ve had a lot of questions on them, we thought we’d do a special feature.

Where are tags used on the Archive?

Tags can be used on works (ie your fics) and on bookmarks. For the rest of this post, we’ll mainly be discussing the way tags work on works (bookmarks are a bit simpler).

For the purposes of the archive, almost all the metadata on works – that is Category, Warnings, Rating, Fandom, Characters, Pairings, and what is shown as Tags (when you enter them on the work form they’re known as ‘Freeform tags’) – is treated as a tag. So, all the bits in the grey header box on a work are tags:

[Image: Archive work header box]

The advantage of doing it this way is that it should be easier for people to search on any field they wish. If you click on a tag, then you will be shown all the works using that tag. Tags are also used to populate our search filters, which help you drill down through all the available fics.

What format do tags have?

  • They can have spaces – so you can put Harry Potter rather than Harry_Potter.
  • They can be up to 42 characters long – we may allow longer tags in the future, but we have to have some limit on the length, or our database might melt.
  • They can’t have commas – tags are comma separated at input, so if you include a comma, then the database assumes you’re making a new tag (apologies to fans of ‘At Swim, Two Boys’ and other comma-loving fandoms).
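
The format rules above can be sketched as a small input parser. This is only an illustration, not the Archive's actual code: the function name `parse_tag_input`, the whitespace trimming, and the choice to raise an error for over-long tags are all assumptions made for the example. Only the comma separation and the 42-character limit come from the rules above.

```python
MAX_TAG_LENGTH = 42  # limit described above; how the Archive enforces it is not stated


def parse_tag_input(raw: str) -> list[str]:
    """Split one comma-separated tag field into individual tag names."""
    tags = []
    for piece in raw.split(","):
        # Trim the ends and collapse inner whitespace runs (an assumption for tidiness).
        name = " ".join(piece.split())
        if not name:
            continue  # skip empty pieces, e.g. from a trailing comma
        if len(name) > MAX_TAG_LENGTH:
            # Hypothetical behaviour: reject the tag rather than truncate it.
            raise ValueError(f"tag longer than {MAX_TAG_LENGTH} characters: {name!r}")
        tags.append(name)
    return tags


# Spaces are fine inside a tag; a comma always starts a new tag.
print(parse_tag_input("Harry Potter, Severus Snape"))
print(parse_tag_input("At Swim, Two Boys"))
```

The second call shows why comma-loving fandoms are awkward: the title is split into two separate tags, ‘At Swim’ and ‘Two Boys’.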

What’s so special about the Archive tags?

The tags are a bit more organised than you’ll find on sites like Delicious. For those of you who are metadata-inclined, we’re using a sort of hybrid of folksonomy (user-defined tags) and classification. What this means is:

  • In most fields, our users can enter any tag, in exactly the form they want it.
  • Behind the scenes, our team of tag wranglers classify and make connections between tags, building a structure which adds extra meaning and helps make tags as useful as possible to all users.

The reason for this is to ensure maximum flexibility for authors, and maximum ease of searching for readers.

What else do tag wranglers do? How do the tag relationships work?

Our tag wranglers work to ensure that tags give our users as much meaningful information as possible, by hooking up related tags and clearing up ambiguities. The full details of how tags work behind the scenes and what tag wranglers do with them are fairly complex. However, some of the most important things are as follows:

* Some tags are marked as ‘canonical’. These are the tags that are the most meaningful when viewed alone. So, Dean Winchester/Sam Winchester would be canonical, while Sam/Dean wouldn’t. All the other versions are connected to this tag. Only canonical tags are visible in our filters, so you won’t see every tag you ever used in there, but since the canonical versions are hooked to all the others the filters still search every possible version of a tag.
* Some tags are ‘ambiguous’. These are the tags which could mean more than one thing. For example, the tag Dean could refer to Dean Winchester in Supernatural, Dean Forester in Gilmore Girls, or a whole host of other Deans. We are planning to introduce a special behind the scenes tag category called ‘Ambiguity’. In this case, the tag wranglers will mark the tag as ambiguous but also hook it up to all the possible canonical tags it could refer to.
* Tags are given relationships which put them in context. So, character tags John Sheppard and Rodney McKay ‘belong’ to the fandom tag Stargate Atlantis.
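
For the metadata-inclined, here is one way the three ideas above (canonical tags, ambiguous tags, and tags that ‘belong’ to a fandom) could be modelled. It is a minimal sketch under our own assumptions: the `Tag` class, its field names, and the `filter_options` helper are hypothetical and are not taken from the Archive's codebase.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Tag:
    name: str
    category: str                      # e.g. "Fandom", "Character", "Pairing"
    canonical: bool = False            # only canonical tags appear in filters
    merged_into: Optional[str] = None  # canonical tag this one is a synonym of
    parent: Optional[str] = None       # e.g. the fandom a character belongs to
    could_mean: list[str] = field(default_factory=list)  # set for ambiguous tags


# Illustrative data only, based on the examples in this post.
TAGS = {t.name: t for t in [
    Tag("Supernatural", "Fandom", canonical=True),
    Tag("Gilmore Girls", "Fandom", canonical=True),
    Tag("Dean Winchester", "Character", canonical=True, parent="Supernatural"),
    Tag("Dean Forester", "Character", canonical=True, parent="Gilmore Girls"),
    Tag("Dean Winchester/Sam Winchester", "Pairing", canonical=True, parent="Supernatural"),
    Tag("Sam/Dean", "Pairing", merged_into="Dean Winchester/Sam Winchester"),
    Tag("Dean", "Character", could_mean=["Dean Winchester", "Dean Forester"]),
]}


def filter_options(category: str) -> list[str]:
    """Names offered in the filters: canonical tags of the given category only."""
    return sorted(t.name for t in TAGS.values() if t.category == category and t.canonical)


def is_ambiguous(name: str) -> bool:
    """A tag is ambiguous if it points at more than one possible meaning."""
    return len(TAGS[name].could_mean) > 1


print(filter_options("Pairing"))
print(is_ambiguous("Dean"))
```

In this sketch Sam/Dean never shows up as a filter option, but it still records which canonical tag it has been hooked up to, so a filter on the canonical pairing can include it.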

Some tags appear with ‘freeform’ or ‘character’ after them. What’s that all about?

Each tag belongs to a category – Pairing, Character, Fandom, etc. Tag names must be unique, so if a tag already exists in one category, then when it is used in another category, the Archive will automatically add the name of the category.

For example:
* Buffyfan1 puts ‘Buffy the Vampire Slayer’ in as a fandom tag. A new tag is created.
* Buffyfan2 puts ‘Buffy the Vampire Slayer’ in as a character tag. Oh noes – this tag name is taken and can’t be reused! So, the Archive changes it to ‘Buffy the Vampire Slayer – character’.

Result:
* XanderGirl5 can click on ‘Buffy the Vampire Slayer – character’ and find all fics which feature Buffy as a character.
* The Archive has unique tag names and so the database does not melt.
* Everyone is happy.
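
The Buffy example can be sketched in a few lines. Again, this is a hedged illustration rather than the Archive's real implementation: `TagRegistry`, its `add` method, and the `SEPARATOR` constant are names we made up, and the exact suffix format is copied from the example above rather than from the code.

```python
SEPARATOR = " – "  # matches the ‘Buffy the Vampire Slayer – character’ example


class TagRegistry:
    """Keeps tag names unique across categories by qualifying clashes."""

    def __init__(self):
        self._category_by_name = {}

    def add(self, name: str, category: str) -> str:
        """Register a tag and return the name it is actually stored under."""
        existing = self._category_by_name.get(name)
        if existing is None:
            # First use of this name: it is free, so store it as-is.
            self._category_by_name[name] = category
            return name
        if existing == category:
            # Same name in the same category: reuse the existing tag.
            return name
        # Name is taken by another category: append the category name.
        qualified = f"{name}{SEPARATOR}{category}"
        self._category_by_name.setdefault(qualified, category)
        return qualified


registry = TagRegistry()
print(registry.add("Buffy the Vampire Slayer", "fandom"))
print(registry.add("Buffy the Vampire Slayer", "character"))
```

The first call keeps the plain name; the second gets the category appended, so both tags can live in the same table without their names colliding.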

People might use all kinds of crazy tags! Do I need to search for every possible variation of a tag?

Nope. Thanks to the work of our tag wranglers, different tags which mean the same thing are marked as synonymous. So, some authors will mark their fic Harry Potter/Severus Snape, others will put Harry/Severus and still others will put Snarry. Behind the scenes, the tag wranglers hook all these together so the Archive knows that if you search for Snarry you also want fics tagged with the other possible ways of describing the pairing.
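
To make the Snarry example concrete, here is a rough sketch of how a search could expand one tag into all of its synonyms before looking for works. The `SYNONYM_OF` and `WORKS` tables and both function names are invented for illustration; the real Archive stores these connections in its database, and we are not describing its actual queries.

```python
# Hypothetical wrangling result: each variant points at its canonical tag.
SYNONYM_OF = {
    "Harry/Severus": "Harry Potter/Severus Snape",
    "Snarry": "Harry Potter/Severus Snape",
}

# Hypothetical works, each with the tags exactly as its author typed them.
WORKS = {
    "Work A": {"Harry Potter/Severus Snape"},
    "Work B": {"Snarry"},
    "Work C": {"Harry/Severus"},
    "Work D": {"Dean Winchester/Sam Winchester"},
}


def equivalent_tags(tag: str) -> set[str]:
    """Return the canonical tag for `tag` plus every synonym hooked up to it."""
    canonical = SYNONYM_OF.get(tag, tag)
    group = {canonical}
    group.update(name for name, target in SYNONYM_OF.items() if target == canonical)
    return group


def search(tag: str) -> list[str]:
    """Find works tagged with `tag` or with any tag wrangled as its synonym."""
    wanted = equivalent_tags(tag)
    return sorted(title for title, tags in WORKS.items() if tags & wanted)


print(search("Snarry"))
```

Searching for any one of the three variants returns Works A, B and C, while Work D is left out because its pairing belongs to a different synonym group.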

My tag has been categorised wrongly! What should I do?

From time to time, there’s a chance the tag wranglers will make a mistake when they wrangle your tag. Maybe you only write Death Note fic based on the anime, and yet your fic is coming up when you search for Death Note (manga). Our tag wranglers are only human, and they don’t know every fandom in existence, so what’s obvious to you might not be clear to them. If you notice a mistake, then please let us know via the Archive Feedback Form.

How can I make sure the tag wranglers don’t make mistakes with my tags?

Be generous with your taggings! The more information you put in the easier it is for tag wranglers to make sense of what you meant.

  • Give full names (first name, family name) for characters and pairings, unless that’s not appropriate for some reason (e.g. ‘Angel’ from AtS and BtVS just goes by the one name).
  • If your fandom could be ambiguous – for example, you write Death Note fic for the anime only, not the manga – then add more detail. You can put Death Note (anime) in the Fandom field, or just add anime in the freeform tags.

What about RPF?

The Archive of Our Own is RPF-friendly! However, RPF is one of the most difficult areas for tag wranglers, since the way people classify their fandoms varies a lot. Do we use fandom name? Network name? What about general groups of people, like “Canadian Actors”, or historical figures?

We’ve come up with a concept for dealing with this, but we haven’t finished building it, so please bear with us.

What about crossovers?

We love crossovers too! However, we don’t have specific tags for crossovers, since crossovers are just fics with more than one fandom.

Please just enter each fandom in your crossover in the Fandom field, separated by commas. Don’t separate with slashes, as this creates a tag which we can’t wrangle into our search structure. Feel free to add Crossover as a freeform tag, though.

Can I become a tag wrangler?

Yes, probably! We will always need tag wranglers to keep the Archive tags in line. We are looking to add more diverse fandoms to our teams, so while we won’t take on everyone who applies (some fandoms are already well-represented) we’d love to hear from anyone who thinks they can help. If you’ve noticed a fandom languishing unwrangled for a while, it probably means that we have nobody with expertise for it – if you could fix that, please let us know! Anime, manga and comics are currently particularly under-represented.

Some known bugs with tags

We think we’ve fixed all our known bugs with tags, but there are a few which were extremely noticeable / annoying, so we’re listing them here. If you’ve encountered some other weirdness, do let us know via the feedback form.

  • Filtering not specific enough in some cases – e.g. filtering for Stargate Atlantis also brought up fics for SG-1. This was caused by the way tag relationships were working for fandoms with common characters. This should be fixed as of our deploy in late May.
  • ‘No fandom’ showing up as a possible fandom in the filters. This was caused by our need to mark tags which don’t belong to a specific fandom (for example, schmoop). It showing up in the filters was a bug, and should be fixed as of our deploy in late May.

We hope this post answers a few of your questions! Please leave other questions and comments here. We won’t be answering comments on this post directly – we’ll put your feedback into our pool of things to answer in future posts.

Notes from the Open Video Conference, Day Two

Summary of a couple of panels on Day 2:

Automated DMCA Takedowns and Web Video: Scott Smitelli, a professional sound designer and editor, is the fellow who wrote Fun with YouTube’s Audio Content ID System, in which he tried to test out the limits of YouTube’s fingerprinting system for audio. Conclusions: the software is mainly interested in the first 30 seconds of a song, and can be thwarted by pitch or time alterations of over 6% (which may be unhelpful to the musically sensitive among us, but there you go). Kevin Driscoll and others from YouTomb discussed the January Massacre: the massive increase of takedowns in December 2008 and January 2009. On a graph, it looks like takedowns have dropped off since then, but that may be deceptive: in fact, it seems like things are being detected so fast (within ten minutes) that YouTomb can’t keep track of them, or to put it another way: takedowns are low because stuff’s never getting UP in the first place. A suggestion: it would be great if every takedown left a webpage with a card saying, “This has been taken down,” because in many cases, people are not aware of what they can’t have. Oliver Day, also from YouTomb, told a chilling story: the filmmaker who shot the clouds that were used in the Anonymous anti-Scientology ads had his original footage taken down–not in deference to those ads, but in deference to a Huffington Post anti-Giuliani parody of those ads. As Day put it, “The power is with the powerful”: even though the original filmmaker’s footage was there first, it was assumed that he was infringing the Huffington Post, and not the other way around.

Who Owns Popular Culture? Remix and Fair Use in the Age of Corporate Mass Media: This was the panel hosted by Jonathan McIntosh and featuring animator Nina Paley (of Sita Sings The Blues), Neil Sieling from the Center for Social Media, political remixer Elisa Kreisinger, Karl Fogel from questioncopyright.org, and OTW Board Member Francesca Coppa. The panel largely discussed what the policing of online video and the over-enforcement of copyright mean for artists, remixers, and those interested in free speech. Nina Paley answered the question literally, by providing a list of who owns popular culture–or in her case, the songs, mostly from 1927-28, that she used in Sita Sings The Blues–while Elisa Kreisinger evoked many of the important visual artists, from Duchamp to Koons to Kruger to Lichtenstein to Warhol, for whom remixing and recontextualizing pop culture was a key artistic move. (She also showed her remixes of the Queer Housewives of New York City.)