This is an updated schedule of track 113, Critical Data Studies, at the 2016 Annual Meeting of the Society for Social Studies of Science (4S) and the European Association for the Study of Science and Technology (EASST). Please contact Stuart Geiger if you have any questions.
We’re organizing a workshop on trace ethnography at the 2015 iConference, led by Amelia Acker, Matt Burton, David Ribes, and myself. See more information about it on the workshop’s website, or feel free to contact me for more information.
I built a script that dynamically generates a robots.txt file for search engine bots, which download the file when they seek direction on what parts of a website they are allowed to index. By default, it directs all bots to stay away from the entire site, but then presents an exception: only the bot that requests the robots.txt file is allowed free rein over the site.
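The idea can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the original script: the function name `build_robots_txt` is hypothetical, and it assumes the requesting bot is identified by echoing back its User-Agent header, whereas real crawlers usually match on a shorter product token.

```python
def build_robots_txt(user_agent: str) -> str:
    """Return a robots.txt body that bans every bot except the requester.

    Hypothetical helper: `user_agent` is assumed to be the raw User-Agent
    header of whoever is requesting /robots.txt.
    """
    # Keep only the first line of the header so a crafted User-Agent
    # cannot inject extra directives into the generated file.
    lines = user_agent.splitlines()
    agent = lines[0].strip() if lines else ""

    # Default rule: every crawler is told to stay away from the whole site.
    rules = ["User-agent: *", "Disallow: /"]

    if agent:
        # Exception: the requesting bot gets an empty Disallow (full access).
        rules += ["", f"User-agent: {agent}", "Disallow:"]

    return "\n".join(rules) + "\n"


if __name__ == "__main__":
    # "ExampleBot/1.0" is a made-up crawler name used only for illustration.
    print(build_robots_txt("ExampleBot/1.0"))
```

Because the file is generated per request, every bot that asks is told that it alone is the exception.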
This is a new article published in Information, Communication, and Society as part of their annual special issue for the Association of Internet Researchers (AoIR) conference. This year’s special issue was edited by Lee Humphreys and Tarleton Gillespie, who did a great job throughout the whole process.
I’ve written a number of papers about the role that automated software agents (or bots) play in Wikipedia, claiming that they are critical to the continued operation of Wikipedia. This paper tests this hypothesis and introduces a metric visualizing the speed at which fully-automated bots, tool-assisted cyborgs, and unassisted humans review edits in Wikipedia. In the first half of 2011, ClueBot NG – one of the most prolific counter-vandalism bots in the English-language Wikipedia – went down for four distinct periods, each period of downtime lasting from days to weeks. Aaron Halfaker and I use these periods of breakdown as naturalistic experiments to study Wikipedia’s quality control network. Our analysis showed that the overall time-to-revert for damaging edits almost doubled when this software agent was down. Yet while a significantly smaller proportion of edits made during the bot’s downtime were reverted, we found that those edits were eventually reverted. This suggests that human agents in Wikipedia took over this quality control work, but performed it at a far slower rate.
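To make the comparison concrete, here is a rough sketch of how one might compare time-to-revert during bot uptime and downtime. It is not the method used in the paper: the function name, the input format, and the choice of the median as the summary statistic are all assumptions made for illustration.

```python
from datetime import datetime
from statistics import median


def median_time_to_revert(reverts, downtime):
    """Compare revert speed for edits saved while the bot was up vs. down.

    reverts:  list of (edit_time, revert_time) datetime pairs (hypothetical format).
    downtime: list of (start, end) datetime pairs when the bot was offline.
    Returns (median_seconds_bot_up, median_seconds_bot_down); None where empty.
    """
    up, down = [], []
    for edit_time, revert_time in reverts:
        seconds = (revert_time - edit_time).total_seconds()
        # An edit counts as "downtime" if it was saved inside any outage window.
        bot_was_down = any(start <= edit_time < end for start, end in downtime)
        (down if bot_was_down else up).append(seconds)
    return (median(up) if up else None, median(down) if down else None)


if __name__ == "__main__":
    # Toy timestamps invented for this example; they are not data from the study.
    outage = [(datetime(2011, 3, 1), datetime(2011, 3, 8))]
    toy_reverts = [
        (datetime(2011, 2, 1, 12, 0), datetime(2011, 2, 1, 12, 1)),
        (datetime(2011, 2, 2, 12, 0), datetime(2011, 2, 2, 12, 3)),
        (datetime(2011, 3, 2, 12, 0), datetime(2011, 3, 2, 12, 10)),
    ]
    print(median_time_to_revert(toy_reverts, outage))
```

A sketch like this only covers edits that were eventually reverted; measuring the proportion of edits reverted would need the full edit history as well.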
This post for Ethnography Matters is a very personal, reflective musing about the first bot I ever developed for Wikipedia. It makes the argument that while it is certainly important to think about software code and algorithms behind bots and other AI agents, they are not immaterial. In fact, the physical locations and social contexts in which they are run are often critical to understanding how they both ‘live’ and ‘die’.
My frequent collaborator Aaron Halfaker has written up a fantastic article with John Riedl in Computer reviewing a lot of the work we’ve done on algorithmic agents in Wikipedia, casting them as Wikipedia’s immune system. Choice quote: “These bots and cyborgs are more than tools to better manage content quality on Wikipedia—through their interaction with humans, they’re fundamentally changing its culture.”
I don’t normally pick on people whose work I really admire, but I recently saw a tweet from Mark Sample that struck a nerve: “Look, if you don’t instagram your first pumpkin spice latte of the season, humanity’s historical record will be dangerously impoverished.” While it got quite a number of retweets and equally snarky responses, he is far from the first to make such a flippant critique of the vapid nature of social media. It also seriously upset me for reasons that I’ve been trying to work out, which is why I found myself doing one of those shifts that researchers of knowledge production tend to do far too often with critics: don’t get mad, get reflexive. What is it that makes such a sentiment resonate with us, particularly when it is issued over Twitter, a platform that is the target of this kind of critique? The reasons have to do with a fundamental disagreement over what it means to interact in a mediated space: do we understand our posts, status updates, and shared photos as representations of how we exist in the world which collectively constitute a certain persistent performance of the self, or do we understand them as a form of communication in which we subjectively and interactionally relate our experience of the world to others?
This was an interview I did with the wonderful Heather Ford, originally posted at Ethnography Matters (a really cool group blog) way back in January. No idea why I didn’t post a copy of this here back then, but now that I’m moving towards my dissertation I’m thinking about this kind of stuff more and more. In short, I argue for a non-anthropocentric yet still phenomenological ethnography of technology, studying not the culture of the people who build and program robots, but the culture of the robots themselves.
In the Wikipedia research community – that is, the group of academics and Wikipedians who are interested in studying Wikipedia – there has been a pretty substantial and longstanding problem with how research is published. Academics, from graduate students to tenured faculty, are deeply invested and entrenched in a system that rewards the publication of research. Publish or perish, as we’ve all heard. The problem is that the overwhelming majority of publications which are recognized as ‘academic’ require us to assign copyright to the publication, so that the publisher can then charge for access to the article. This is in direct contradiction with the goals of Wikipedia, as well as many other open source and open content creation communities – communities which are the subject of a substantial amount of academic research.
I recently saw Helvetica, a documentary directed by Gary Hustwit about the typeface of the same name – it is available streaming and on DVD from Netflix, for those of you who have a subscription. As someone who studies ubiquitous socio-technological infrastructures (and Helvetica is certainly one), I know how hard it is to seriously pay attention to something that we see every day. It may seem counter-intuitive, but as Susan Leigh Star reminds us, the more widespread an infrastructure is, the more we use it and depend on it, the more invisible it becomes – that is, until it breaks or generates controversy, in which case it is far too easy. But to actually say something about what well-oiled, hidden-in-plain-sight infrastructures are, how they came to have such a place in our society, and why they won out over their competitors is a notoriously difficult task. But I came to realize that the film is less of a history of fonts, and more of an anthropology of design.
So given what’s going on in Egypt and the Middle East, we in the West are fascinated by not so much revolutions and popular uprisings against dictatorial regimes, but an efficacious use of social media. Even Clinton is talking about the Internet as “the world’s town square”, and it seems that the old conversation about the Internet and the public sphere is going to flare up for the third time (1993-5 and 2001-3 are the other two times). Since Habermas is generally credited with bringing this notion of the public sphere to the forefront of popular, political, and academic discourse, it is natural to cite him. Then critique him to death, talking about how we need to get beyond an old white guy’s theories. And it feels good, I know.
This is a paper I co-authored with David Ribes and recently presented at HICSS, the Hawaii International Conference on System Sciences. It’s a qualitative methodology based on analyzing logging data that we developed through my research on Wikipedia, but it has some pretty broad applications for studying highly-distributed groups. It’s an inversion of the previous paper we presented at CSCW, showing in detail how we traced Wikipedian vandal fighters as they collectively work to identify and ban malicious contributors.
Have you done historical bibliometric analysis of a scientific field or topic area and found that there is a massive increase in research articles after 1990? Are you using ISI’s Web of Science and searching by topic or keyword? If so, don’t make the same mistake I did: these results aren’t because of some sea change or paradigm shift, but rather result from a poorly-documented shift in how ISI began indexing articles after 1990.
This is a paper that I recently got published in gnovis, which is a peer-reviewed journal run entirely by graduate students at Georgetown’s Communication, Culture, and Technology program. It is a sneakishly Latourian intervention into the debate between Habermasians and post-Habermasians regarding the Internet as a (part of the) public sphere. They have been arguing for some time about whether the Internet (and specifically blogging) leads to political fragmentation or real collective action. However, they have all taken for granted the highly-automated software infrastructures that mediate our knowledge of the blogosphere. The article is up in HTML on the gnovis site, but I’ve also made a full-text, metadata friendly PDF simply because Google Scholar likes those. The abstract is after the jump.
With the help of my advisor, Dr. David Ribes, I recently got a chapter of my master’s thesis accepted to the ACM conference on Computer Supported Cooperative Work, to be held in February 2010 in Savannah, Georgia. It is titled “The Work of Sustaining Order in Wikipedia: The Banning of a Vandal” and focuses on the roles of automated ‘bots’ and assisted editing tools in Wikipedia’s ‘vandal fighting’ network.
In an age of information overload, the history of Wikipedia’s co-evolving media use and governance model gives us a powerful lesson regarding the way in which the development of social structures and media technologies are fundamentally interrelated in the digital era.
I show that asking whether Wikipedia is a reliable academic source enframes Wikipedia into an objectless standing-reserve of potential citations, foreclosing many other possibilities for its use. Instead of asking what Wikipedia has done to reality, I ask: what have we done to Wikipedia in the name of reality?
This is a review of Julian Orr’s Talking About Machines, an ethnography of Xerox photocopier technicians. Blurring the line between ethnomethodology, organizational communication, infrastructure studies, human-computer/machine interaction, business administration, and traditional ethnography of work, his study reveals more than just the daily practices of what may initially seem like a boring job.
A response to Thomas Kuhn’s The Structure of Scientific Revolutions, in which his work is applied to a personal vignette of experimentation practices in a High School Physics class. When in the course of scientific education should students be allowed to modify scientific theories to fit experimental data instead of modifying experiments to fit the theories?
This is a tentative article-length introduction to my thesis on Wikipedia. It is an attempt to analyze Wikipedia from an interdisciplinary perspective that tries to make problematic various assumptions, concepts, and relations that function quite well in the “real world” but are not well-suited to studying Wikipedia. I begin by talking about the nature of academic disciplines, then proceed to a detailed but sparse review of certain prior research on Wikipedia. By examining the problems in previous research within the context of disciplines, I establish a tentative methodology for a holistic study of Wikipedia.
As some of you might know, I work part-time at the Federation of American Scientists. Most of what I do has involved the creation of a wiki for virtual worlds, and I am proud to say that it is ready for the world. It is not simply a wiki, but a structured semantic wiki. This means that when you edit a page on a virtual world, you get a customizable form instead of a massive textbox. Check it out!
As someone who studies Internet culture, one of my biggest problems is “link rot,” or broken links. I’m a big fan of the Internet Archive, but they are usually six to eight months behind on even the most popular sites. I also applaud sites like Wikipedia for providing stable version histories so that I can point to a specific revision of a page. However, for all other websites, the only option is self-archiving, which is technically difficult and fraught with problems. What I have found incredibly useful is WebCite, a free webpage archiving service that fills in this gap.
An outright ban on technology in the classroom - which may or may not include the pen and paper - is not the right answer. If one wishes to curb disruptive behavior, then ban disruptive behavior instead of banning all the little things that could be disruptive.
As you may know, Google often thinks it knows what you are looking for better than you do. It will suggest different search queries and display them underneath the top three results for your original query. So today I did a simple Google search for “Phenomenology of Spirit,” an 1807 book written by German philosopher G.W.F. Hegel, and found a very interesting suggestion.
This was my final project for an Information Studies class I took back in 2006, when I was an undergraduate at the University of Texas. Our assignment was to transform information from one form to another, and I chose to perform this analysis of Deleuze and Guattari’s A Thousand Plateaus. I scanned and OCRed the entire book and did a visual frequency representation of certain words. I analyzed by chapter and comprehensively with certain core themes in the work. I also did a comprehensive analysis with more general or common words. It is intended to look the way it does, as I am going for a “1960s IBM goes to the academy” look. Take what you will from it: it is about 35% art, 25% snarky pastiche, 15% pretending to be linguistics, and -5% serious intellectual critique. Here is a sample:
Content on my website and my Flickr account has been licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives license for a while. I was pretty proud of myself. But then I got to thinking: why don’t I choose Attribution-ShareAlike? Obviously, it was the product of two kneejerk reactions: I don’t want someone else to make money off my stuff, and I don’t want someone messing with my stuff.
Director of the Library of Alexandria, Dr. Ismail Serageldin gave a keynote speech on the first day of Wikimania 2008 titled, New Paradigms for New Tomorrows. It was quite thoughtful and inspiring – the man is one of the most amazing individuals I have heard. He is learned in so many different areas of academic and cultural knowledge, as well as incredibly wise. I would recommend watching the video of his speech, but if you are pressed for time you can read my notes.
Collaborative research on Wikiversity by Cormac Lawler (user Cormaggio on Wikimedia projects) at the University of Manchester. Wikiversity is a relatively young project in the Wikimedia umbrella, but I think it is a natural development and a great space to realize the potential of all the educators currently on Wikipedia, Wiktionary, Wikibooks, and all the other projects.
From “Flagged Revisions,” a presentation at Wikimania 2008 by Philip Birken. In my opinion, flagged revisions realize the concept of stable versions without making the article actually stable. It is not a system of voting to approve new revisions – a new revision is approved as soon as a single autoconfirmed user says it is vandalism-free. Yes, it won’t solve everything, but it will make things much better. We can get rid of protecting articles that are experiencing heavy vandalism if we do this, because an edit only updates to the public when it is flagged as not-vandalism by a trusted user. However, vandals (or any other users) immediately see the results of their edit for an hour, which is just ingenious. Also, you can choose whether the most recent revision is shown by default, or make it so that certain users (like anonymous users) only see the most recent reviewed revision. For those who feel that it threatens “the wiki way,” I suggest making the most recent version appear by default and giving people the option to see the latest reviewed version.
This panel was at Wikimania 2008, and featured James Forrester, Andrew Lih, Kat Walsh, and Charles Matthews. Everyone except for Lih is or has been on the Arbitration Committee, and this turned into a discussion about admins.
As an ethnographer, I enter into communities, learn their customs, beliefs, and practices, then report back to the academy to share what I have discovered. In this presentation, I wish to do the opposite, presenting to the Wikipedian community an ethnography of academics as they relate to Wikipedia.
Content and the Internet in the (Globalized) Middle East, Dr. Ahmed Tantawi, Technical Director, IBM Middle East and North Africa. Another copy of my notes from Wikimania 2008 – this was the keynote speech on the second day of the conference. He began by warning us that, “I’ve changed this presentation, and I’ll change it during. That is open content, yes?” Everyone laughed.
This was part of the opening keynote in Wikimania 2008, given by the Egyptian Deputy Minister of Communication and IT, Hoda Baraka. Here are my notes, again without any commentary – I apologize for them not being cleaner.
The official theme or slogan for this year’s Wikimania is “the knowledge revolution that is changing wisdom.” I think this phrase – especially the difference between knowledge and wisdom – was chosen very carefully and I think it is an excellent distinction to make. This morning’s opening ceremony began with a speech from the Egyptian Minister of State for Administrative Development, Dr. Ahmed Darwish. I will relay his comments here, without much analysis – that will come later, when I have the time.
I am currently in Egypt for Wikimania 2008, which is being held this year at the Library of Alexandria. On Sunday, I will be presenting my ethnographic analysis of conceptions and misconceptions academics hold about Wikipedia. This presentation was going to be about old, computer-illiterate professors but has turned into something much more interesting: a commentary on Wikipedia’s status in the so-called postmodern digital humanities. I will update the post on this site as I finalize my presentation.
I feel bad that I have not written a new entry in so long. I feel like I should apologize - not to the readers, but to the software, to the site itself. I ought to write a new post; I ought to update my status. How did I get into a situation whereby these collections of code could make ethical demands upon me? And is this bad?
Brian Williams talked about how this year’s primary season has shown that even in the age of the Internet, we still have a longing for real communities. I take issue with his use of “virtual community” and claim that most political communities are virtual.
I explore the memetic inkblot, which refers to units of cultural information that have effectively no singular semiotic value and therefore serve as a psychosocial indicator. In other words, they are so vague and open to interpretation that you can learn a lot about someone by asking them to give a simple definition of one.
Works licensed under the GPL and the GFDL can be modified and then freely redistributed, as long as the modified versions are released under the same conditions. Why are we not allowed to modify these licenses and redistribute them?
Benjamin Wiker’s book on bad books throughout the ages is misinformed and makes a few critical errors in its analysis. Specifically, it ignores the cultural context around each book it critiques, treating them as pure subliminal propaganda.
This presentation was adapted from a chapter in my Senior thesis on Wikipedia’s legal system that focused on a dispute over the inclusion of images of the Islamic prophet Muhammad in an article about him, using a methodology of communicative ethnography. Most who opposed the image were not familiar with Wikipedia’s unique method of content regulation and dispute resolution, as well as its editorial standards and principles. However, most who argued in favor of keeping the image knew these and initially used them to their advantage. This ethnographic study of the communicative strategies used by the parties involved in the dispute shows how new editors to the user-written encyclopedia first emerged in a hostile communicative environment and subsequently adapted their argumentative strategies. This conflict is an excellent example of how disputes are resolved in Wikipedia, showing how this new media space regulates its own content.
An investigation into the community formed by the small number of Wikipedia contributors who care enough to decide how, at some level, Wikipedia is run. The work discusses identity, communication, and organizational hierarchy in this subculture.
My thesis studied the legal culture of Wikipedia to examine the law through stories and histories, giving the reader a sense of not only what the Wikipedian legal system is, but also what fundamental assumptions the community makes in utilizing such a system.
This is a piece of web art or net art, with an included work of art criticism about the piece. The work makes the argument that while interactive digital art can be considered user-centered, this new style and medium is only centered around those possibilities that the creator wishes to make available to the user. You can see The Facticity of Art at http://stuartgeiger.com/art/art-intro.shtml.
This is a response to the hypertext fiction work Patchwork Girl by Shelley Jackson. It is comprised in part of ‘patches’ of other works, most notably Mary Shelley’s Frankenstein. I have made this essay entirely out of parts from the novel.
William Gibson’s novel Neuromancer tells the story of a team of radically different technologically-savvy individuals who are recruited by a young artificial intelligence named Wintermute, who desires to bypass the limitations placed on it by its owners and the authorities.
In his book Me++: The Cyborg Self and the Networked City, William Mitchell describes how information technology – specifically digital, wireless networks which are accessed primarily through portable devices – fundamentally changes how we interact with others. More than anything else, “[c]onnectivity had become the defining characteristic of our twenty-first-century urban condition” (11). For Mitchell, we have given up the virtual reality fantasy that dominated predictions made in previous decades in favor of a subtler revolution: that of the networked self, the Me++.
This was a CSS stylesheet I wrote for the CSS Zen Garden, which is a really cool concept in web design. There is a standard HTML page in which all the content is wrapped up in div tags, and the idea is to write a CSS stylesheet that makes it pretty. Mine was based on blueprints, and can be accessed here. It turns out that I didn’t make it into the accepted designs, but I did get on the list of those that didn’t make the cut. I can see why – it needs some cleaning up around the lines which I might do if I have some time. But I’ll take being top of that list.
The vast worlds of MMORPGs seem close to postmodern theories of identity, as a player is able to radically constitute their on-line self at will. Despite this, these virtual gaming communities should not be seen as safe spaces in which a subject can realize their true (or ideal) self.