This is a new article published in Information, Communication, and Society as part of their annual special issue for the Association of Internet Researchers (AoIR) conference. This year’s special issue was edited by Lee Humphreys and Tarleton Gillespie, who did a great job throughout the whole process.
Abstract: This article introduces and discusses the role of bespoke code in Wikipedia, which is code that runs alongside a platform or system, rather than being integrated into server-side codebases by individuals with privileged access to the server. Bespoke code complicates the common metaphors of platforms and sovereignty that we typically use to discuss the governance and regulation of software systems through code. Specifically, the work of automated software agents (bots) in the operation and administration of Wikipedia is examined, with a focus on the materiality of code. As bots extend and modify the functionality of sites like Wikipedia, but must be continuously operated on computers that are independent from the servers hosting the site, they involve alternative relations of power and code. Instead of taking for granted the pre-existing stability of Wikipedia as a platform, bots and other bespoke code require that we examine not only the software code itself, but also the concrete, historically contingent material conditions under which this code is run. To this end, this article weaves a series of autobiographical vignettes about the author’s experiences as a bot developer alongside more traditional academic discourse.
Official version at Information, Communication, and Society
Author’s post-print, free download [PDF, 382kb]
I’ve written a number of papers about the role that automated software agents (or bots) play in Wikipedia, claiming that they are critical to the continued operation of Wikipedia. This paper tests that claim and introduces a metric visualizing the speed at which fully-automated bots, tool-assisted cyborgs, and unassisted humans review edits in Wikipedia. In the first half of 2011, ClueBot NG – one of the most prolific counter-vandalism bots in the English-language Wikipedia – went down for four distinct periods, each lasting from days to weeks. Aaron Halfaker and I use these periods of breakdown as naturalistic experiments to study Wikipedia’s quality control network. Our analysis showed that the overall time-to-revert for damaging edits almost doubled when this software agent was down. Yet while a significantly smaller proportion of edits made during the bot’s downtime were reverted, we found that those edits were eventually reverted. This suggests that human agents in Wikipedia took over this quality control work, but performed it at a far slower rate.
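To make the time-to-revert comparison concrete, here is a minimal sketch of how one might split reverted edits by whether a bot was running when they were saved, then compare the median delay before each was reverted. This is not the analysis code from the paper. The downtime window, the function names, and the toy records are all hypothetical and chosen only for illustration, and the choice of the median as the summary statistic is an assumption.

```python
from datetime import datetime
from statistics import median

# Hypothetical downtime window (start, end); NOT the real ClueBot NG outage dates.
DOWNTIME = [(datetime(2011, 1, 10), datetime(2011, 1, 15))]


def bot_was_down(timestamp, windows=DOWNTIME):
    """Return True if the timestamp falls inside any downtime window."""
    return any(start <= timestamp < end for start, end in windows)


def median_time_to_revert(edits, windows=DOWNTIME):
    """Median seconds between an edit being saved and being reverted.

    `edits` is an iterable of (saved_at, reverted_at) datetime pairs for
    edits that were eventually reverted. Results are grouped by whether
    the bot was up or down when the edit was saved.
    """
    delays = {"up": [], "down": []}
    for saved_at, reverted_at in edits:
        key = "down" if bot_was_down(saved_at, windows) else "up"
        delays[key].append((reverted_at - saved_at).total_seconds())
    return {key: (median(vals) if vals else None) for key, vals in delays.items()}


# Made-up toy records, purely to exercise the function.
toy_edits = [
    (datetime(2011, 1, 5, 12, 0), datetime(2011, 1, 5, 12, 1)),    # bot up, 60 s
    (datetime(2011, 1, 6, 9, 0), datetime(2011, 1, 6, 9, 3)),      # bot up, 180 s
    (datetime(2011, 1, 12, 8, 0), datetime(2011, 1, 12, 8, 10)),   # bot down, 600 s
    (datetime(2011, 1, 13, 8, 0), datetime(2011, 1, 13, 8, 20)),   # bot down, 1200 s
]

result = median_time_to_revert(toy_edits)
print(result)
```

With real data, the (saved_at, reverted_at) pairs would come from revision histories, and deciding which edits count as reverted or damaging is a separate methodological question that this sketch does not address.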
This post for Ethnography Matters is a very personal, reflective musing about the first bot I ever developed for Wikipedia. It makes the argument that while it is certainly important to think about software code and algorithms behind bots and other AI agents, they are not immaterial. In fact, the physical locations and social contexts in which they are run are often critical to understanding how they both ‘live’ and ‘die’.
My frequent collaborator Aaron Halfaker has written up a fantastic article with John Riedl in Computer reviewing a lot of the work we’ve done on algorithmic agents in Wikipedia, casting them as Wikipedia’s immune system. Choice quote: “These bots and cyborgs are more than tools to better manage content quality on Wikipedia—through their interaction with humans, they’re fundamentally changing its culture.”
In the Wikipedia research community (that is, the group of academics and Wikipedians who are interested in studying Wikipedia), there has been a pretty substantial and longstanding problem with how research is published. Academics, from graduate students to tenured faculty, are deeply invested and entrenched in a system that rewards the publication of research. Publish or perish, as we’ve all heard. The problem is that the overwhelming majority of publications which are recognized as ‘academic’ require us to assign copyright to the publication, so that the publisher can then charge for access to the article. This is in direct contradiction with the goals of Wikipedia, as well as many other open source and open content creation communities, which are themselves the subject of a substantial amount of academic research.
I’m part of a Wikipedia research group called “Critical Point of View” centered around the Institute for Network Cultures in Amsterdam and the Centre for Internet and Society in Bangalore. (Just a disclaimer: the term ‘critical’ is meant in the sense of critical theory, not Wikipedia bashing for its own sake.) We’ve had some great conferences and are putting out an edited book on Wikipedia quite soon. My chapter is on bots, and the abstract and a link to the full PDF are below:
I describe the complex social and technical environment in which bots exist in Wikipedia, emphasizing not only how bots produce order and enforce rules, but also how humans produce bots and negotiate rules around their operation. After giving a brief overview of how previous research into Wikipedia has tended to mis-conceptualize bots, I give a case study tracing the life of one such automated software agent, and how it came to be integrated into the Wikipedian community.
The Lives of Bots [PDF, 910KB]
This is a paper I co-authored with David Ribes and recently presented at HICSS, the Hawaii International Conference on Systems Sciences. It’s a qualitative methodology based on analyzing logging data that we developed through my research on Wikipedia, but it has some pretty broad applications for studying highly-distributed groups. It’s an inversion of the previous paper we presented at CSCW, showing in detail how we traced Wikipedian vandal fighters as they collectively work to identify and ban malicious contributors.
Abstract: We detail the methodology of ‘trace ethnography’, which combines the richness of participant-observation with the wealth of data in logs so as to reconstruct patterns and practices of users in distributed sociotechnical systems. Trace ethnography is a flexible, powerful technique that is able to capture many distributed phenomena that are otherwise difficult to study. Our approach integrates and extends a number of longstanding techniques across the social and computational sciences, and can be combined with other methods to provide rich descriptions of collaboration and organization.
Trace Ethnography: Following Coordination through Documentary Practices (PDF, 361KB)
Citation: Geiger, R.S., & Ribes, D. (2011). Trace Ethnography: Following Coordination Through Documentary Practices. In Proceedings of the 44th Annual Hawaii International Conference on Systems Sciences. Retrieved from http://www.stuartgeiger.com/trace-ethnography-hicss-geiger-ribes.pdf
With the help of my advisor, Dr. David Ribes, I recently got a chapter of my master’s thesis accepted to the ACM conference on Computer Supported Cooperative Work, to be held in February 2010 in Savannah, Georgia. It is titled “The Work of Sustaining Order in Wikipedia: The Banning of a Vandal” and focuses on the roles of automated ‘bots’ and assisted editing tools in Wikipedia’s ‘vandal fighting’ network.
Abstract: In this paper, we examine the social roles of software tools in the English-language Wikipedia, specifically focusing on autonomous editing programs and assisted editing tools. This qualitative research builds on recent research in which we quantitatively demonstrate the growing prevalence of such software in recent years. Using trace ethnography, we show how these often-unofficial technologies have fundamentally transformed the nature of editing and administration in Wikipedia. Specifically, we analyze ‘vandal fighting’ as an epistemic process of distributed cognition, highlighting the role of non-human actors in enabling a decentralized activity of collective intelligence. In all, this case shows that software programs are used for more than enforcing policies and standards. These tools enable coordinated yet decentralized action, independent of the specific norms currently in force.
Download the full paper (PDF)
This week, I’m presenting a poster at WikiSym 2009 on “The Social Roles of Bots and Assisted Editing Tools.” Most of the work is distilled from my thesis.
Abstract: This project investigates various software programs as non-human social actors in Wikipedia, arguing that their influence must not be overlooked in research on the online encyclopedia project. Using statistical and archival methods, the roles of assisted editing programs and bots are examined. First, the proportion of edits made by these non-human actors is significantly greater than described in earlier research. Second, these actors have moved into new spaces, changing not just the practice of article writing and reviewing, but also administrative work.
Download the Poster (PDF)
Download the Extended Abstract (PDF)
And if you are interested in this topic, check out the full paper, The Work of Sustaining Order in Wikipedia: The Banning of a Vandal.
Jimmy Wales speaking at the conference keynote, by GreenReaper, CC BY-SA 3.0
A few months ago, I had the pleasure of presenting at the first (hopefully annual) WikiConference New York, sponsored by the Wikimedia New York City chapter with assistance from Free Culture @ NYU and the Information Law Institute at NYU’s law school. I know that I am atrociously late in writing this post, but I’m not really writing it for the Wikipedians out there. Rather, the WikiConference was an interesting experiment that seemed to apply Wikipedia’s philosophy towards editing to a conference, resulting in what the organizers called a “modified unconference.”