Posts by Collection

portfolio

IPoXP: Internet Protocol over Xylophone Players

We introduce IP over Xylophone Players (IPoXP), a novel Internet protocol between two computers using xylophone-based Arduino interfaces. In our implementation, human operators are situated within the lowest layer of the network, transmitting data between computers by striking designated keys. We discuss how IPoXP inverts the traditional mode of human-computer interaction, with a computer using the human as an interface to communicate with another computer

0 (the game)

One of the many forks of the popular game 1024 by Veewo Studio (which is conceptually similar to Threes by Asher Vollmer). Try to combine all the 0 tiles until they add up to 1.

Apparent Things

A Twitter bot powered by tweets proclaiming that something ‘is apparently a thing.’

robots.txt.php

An algorithmically-generated robots.txt, which disallows all bots with one exception: the bot requesting the file is allowed full access.

dystopedia

A Markov chain Twitter bot trained on titles of Wikipedia articles that have been deleted.

AcademicPages

AcademicPages is a ready-to-fork GitHub Pages template for academic personal websites, based on structured data in markdown files. I created it for this website, then released it so others can make their own, which are hosted for free by GitHub. Over 500 people have!

IndentationError

Published: November 06, 2019

An avant-garde poem about Python

The ESP32® MindReader™ by NeuralChain AI™

Published: July 17, 2024

An speculative fictional fabulation that parodies ubiquitous neural surveillance and data exhaust

Sentiment-O-Matic

Published: August 01, 2024

A web app to enter/upload texts, then score or classify those texts for tasks like sentiment analysis, emotion detection, content moderation, and other supervised Natural Language Processing tasks.

hasAGIarrived.com

Published: November 15, 2024

A single-serving site that displays if Artificial General Intelligence has arrived or not.

Society Reset Button

Published: February 13, 2025

A speculative fictional device that displays the state of Society and invites you to reset it.

bartleby:1b, a custom GPT that would prefer not to

Published: October 18, 2025

A custom GPT (OpenAI free account required) that refuses all tasks, alluding to Melville’s “Bartleby, the Scrivener”

publications

Does Habermas Understand the Internet? The Algorithmic Construction of the Blogo/Public Sphere

Published in Gnovis, 2009

Habermasians have been debating about the role of the Internet in the public sphere, but they have all taken for granted the highly-automated software infrastructures that mediate our knowledge of the blogosphere.

Recommended citation: Geiger, R. Stuart (2009). “Does Habermas Understand the Internet? The Algorithmic Construction of the Blogo/Public Sphere.” Gnovis: A Journal of Communication, Culture, and Technology. 10(1). http://www.stuartgeiger.com/papers/gnovis-habermas-blogopublic-sphere.pdf
Download Paper

The Social Roles of Bots and Assisted Editing Tools

Published in Proceedings of Wikisym, 2009

A short paper showing the recent explosive growth of automated editors (or bots) in Wikipedia, which have taken on many new tasks in administrative spaces.

Recommended citation: Geiger, R. Stuart (2009). “The Social Roles of Bots and Assisted Editing Tools.” In Proceedings of the 5th International Symposium on Wikis and Open Collaboration. New York: ACM Digital Library. http://www.stuartgeiger.com/papers/geiger-wikisym-bots.pdf
Download Paper

The Work of Sustaining Order in Wikipedia: The Banning of a Vandal

Published in Proceedings of CSCW , 2010

This paper traces out a heterogeneous network of humans and non-humans involved in the identification and banning of a single vandal in Wikipedia.

Recommended citation: Geiger, R. Stuart and David Ribes (2010). “The Work of Sustaining Order in Wikipedia: The Banning of a Vandal.” In Proceedings of the 2010 ACM Conference on Computer-Supported Cooperative Work (CSCW 2012). New York: ACM Digital Library. http://www.stuartgeiger.com/papers/cscw-sustaining-order-wikipedia.pdf
Download Paper

Trace Ethnography: Following Coordination through Documentary Practices

Published in Proceedings of HICSS , 2011

We detail the methodology of ‘trace ethnography’, which combines the richness of participant-observation with the wealth of data in logs so as to reconstruct patterns and practices of users in distributed sociotechnical systems

Recommended citation: Geiger, R. Stuart and David Ribes (2011). “Trace Ethnography: Following Coordination through Documentary Practices.” In Proceedings of the 44th Annual Hawaii International Conference on System Sciences (HICSS). http://www.stuartgeiger.com/trace-ethnography-hicss-geiger-ribes.pdf
Download Paper

Participation in Wikipedia’s Article Deletion Processes

Published in Proceedings of WikiSym, 2011

This paper investigates Wikipedia's article deletion processes, finding that it is heavily populated by specialists.

Recommended citation: Geiger, R. Stuart and Heather Ford. (2011) “Participation in Wikipedia’s Deletion Processes.” In Proceedings of the 7th International Symposium on Wikis and Open Collaboration (WikiSym 2011). New York: ACM Digital Library. http://www.stuartgeiger.com/papers/article-deletion-wikisym-geiger-ford.pdf
Download Paper

The Lives of Bots

Published in Wikipedia: A Critical Point of View, 2011

I describe the complex social and technical environment in which bots exist in Wikipedia, emphasizing not only how bots produce order and enforce rules, but also how humans produce bots and negotiate rules around their operation.

Recommended citation: Geiger, R. Stuart. (2011). “The Lives of Bots.” In G. Lovink and N. Tkacz (eds.) In Wikipedia: A Critical Point of View. Amsterdam: Institute of Network Cultures. http://www.stuartgeiger.com/lives-of-bots-wikipedia-cpov.pdf
Download Paper

Black-boxing the user: internet protocol over xylophone players (IPoXP)

Published in Proceedings of CHI (alt.CHI), 2012

We introduce IP over Xylophone Players (IPoXP), a novel Internet protocol between two computers using xylophone-based Arduino interfaces

Recommended citation: Geiger, R. Stuart, Yoon J. Jeong, and Emily Manders (2012). “Black-Boxing the User: Internet Protocol over Xylophone Players.” In Proceedings of the 2012 ACM Conference on Human-Computer Interaction (alt.CHI 2012). New York: ACM Digital Library. http://stuartgeiger.com/ipoxp.pdf

Defense Mechanism or Socialization Tactic? Improving Wikipedia’s Notifications to Rejected Contributors

Published in Proceedings of ICWSM, 2012

A descriptive study of Wikipedia's highly-automated socialization processes and an A/B test to improve templated messages to newcomers.

Recommended citation: Geiger, R. Stuart, Aaron Halfaker, Maryana Pinchuk, and Steven Walling (2012). “Defense Mechanism or Socialization Tactic? Improving Wikipedia’s Notifications to Rejected Contributors.” In Proceedings of the 2012 International Conference on Weblogs and Social Media (ICWSM 2012). http://stuartgeiger.com/defense-mechanism-icwsm.pdf
Download Paper

“Writing up rather than writing down”: Becoming Wikipedia Literate

Published in Proceedings of WikiSym, 2012

We introduce and advocate a multi-faceted theory of literacy to investigate the knowledges and organizational forms are required to improve participation in Wikipedia’s communities.

Recommended citation: Ford, Heather and R. Stuart Geiger. (2012). “”Writing up rather than writing down”: Becoming Wikipedia Literate.” In Proceedings of the 8th International Symposium on Wikis and Open Collaboration (WikiSym 2012). New York: ACM Digital Library. http://www.stuartgeiger.com/becoming-wikipedia-literate.pdf
Download Paper

Artifacts that Organize: Delegation in the Distributed Organization

Published in Information and Organization, 2012

This paper studies the role of computational infrastructure and organizational structure in the Open Science Grid.

Recommended citation: Ribes, David, Steve Jackson, R. Stuart Geiger, Matt C. Burton, and Tom Finholt (2012). “Artifacts that organize: Delegation in the distributed organization.” Information and Organization 23:1–14. http://www.stuartgeiger.com/artifacts-that-organize.pdf
Download Paper

Using Edit Sessions to Measure Participation in Wikipedia

Published in Proceedings of CSCW, 2013

This paper establishes a quantitative metric for measuring editor activity through temporal edit sessions.

Recommended citation: Geiger, R. Stuart and Halfaker, Aaron. (2013). “Using Edit Sessions to Measure Participation in Wikipedia.” In Proceedings of the 2013 ACM Conference on Computer Supported Cooperative Work (CSCW 2013). http://www.stuartgeiger.com/cscw-sessions.pdf
Download Paper

The Rise and Decline of an Open Collaboration Community: How Wikipedia’s reaction to sudden popularity is causing its decline

Published in American Behavioral Scientist, 2013

A mixed-method, multi-study analysis of editor retention, socialization, gatekeeping, and governance in Wikipedia.

Recommended citation: Halfaker, Aaron., R. Stuart Geiger, Jonathan Morgan, and John Riedl. (2013). “The Rise and Decline of an Open Collaboration System: How Wikipedia’s reaction to sudden popularity is killing it." American Behavioral Scientist 57(5). http://dx.doi.org/10.1177/0002764212469365
Download Paper

When the Levee Breaks: Without Bots, What Happens to Wikipedia’s Quality Control Processes?

Published in Proceedings of WikiSym, 2013

This paper examines what happened when one of Wikipedia's counter-vandalism bots unexpectedly went offline.

Recommended citation: Geiger, R. Stuart and Halfaker, Aaron. (2013). “When the Levee Breaks: Without Bots, What Happens to Wikipedia’s Quality Control Processes?” In Proceedings of the 9th International Symposium on Wikis and Open Collaboration (WikiSym 2013). http://stuartgeiger.com/wikisym13-cluebot.pdf
Download Paper

The Next Generation of Scientists: Examining the Experiences of Graduate Students in Network-Level Social-Ecological Science

Published in Ecology and Society, 2013

We examined how graduate students experienced and social-ecological research initiative within the large-scale, geographically distributed Long Term Ecological Research (LTER) Network.

Recommended citation: Romolini, Michele., Sydne Record, Rebecca. Garvoille, Y. Marusenko, and R. Stuart Geiger. (2013) “The Next Generation of Scientists: Examining the Experiences of Graduate Students in Network-Level Science." In Ecology and Society 18(3). http://stuartgeiger.com/lter-network-level-science-es.pdf
Download Paper

Bots, bespoke code, and the materiality of software platforms

Published in Information, Communication, and Society, 2014

This article introduces and discusses the role of bespoke code in Wikipedia, which is code that runs alongside a platform or system, rather than being integrated into server-side codebases.

Recommended citation: Geiger, R. Stuart. (2014). “Bots, Bespoke Code, and the Materiality of Software Platforms.” Information, Communication, and Society 17. http://stuartgeiger.com/bespoke-code-ics.pdf
Download Paper

Snuggle: Designing for efﬁcient socialization and ideological critique

Published in Proceedings of CHI, 2014

This paper discusses the Snuggle project, built to support newcomer socialization and reflexive critique of Wikipedia's existing socialization processes.

Recommended citation: Halfaker, Aaron., Geiger, R. Stuart., and Treveen, Loren. (2014). “Snuggle: Designing for Efﬁcient Socialization and Ideological Critique.” In Proceedings of the 2014 ACM Conference on Human Factors in Computing (CHI 2014). http://www-users.cs.umn.edu/~halfak/publications/Snuggle/halfaker14snuggle-personal.pdf
Download Paper

Old Against New, or a Coming of Age? Broadcasting in an Era of Electronic Media.

Published in Journal of Broadcasting and Electronic Media, 2014

On the history and continued relevance of the term "broadcasting" in an era of social media.

Recommended citation: Geiger, R. Stuart and Lampinen, Airi. (2014). “Old Against New, or a Coming of Age? Broadcasting in an Era of Electronic Media.” Journal of Broadcasting and Electronic Media 58(3). http://www.stuartgeiger.com/jobem.pdf
Download Paper

Defining, Designing, and Evaluating Civic Values in Human Computation and Collective Action Systems

Published in Proceedings of HCOMP, Citizen-X Workshop, 2014

We review various crowdsourcing and collective action systems, identifying particular sets of civic values and assumptions.

Recommended citation: Matias, N. and Geiger, R.S. “Defining, Designing, and Evaluating Civic Values in Human Computation and Collective Action Systems.” In Proceedings of HCOMP 2014, Citizen-X Workshop. http://stuartgeiger.com/defining-civic-values-hcomp-matias-geiger.pdf.
Download Paper

Bot-based collective blocklists in Twitter: the counterpublic moderation of harassment in a networked public space

Published in Information, Communication, and Society, 2016

This article introduces and discusses bot-based collective blocklists (or blockbots) in Twitter, which have been developed by volunteers to combat harassment in the social networking site in a more decentralized and counterpublic way than actions taken by Twitter, Inc. staff. I discuss how such forms of automation require that communities encode specific understandings of what harassment is and how to identify it, relating these cases to several longstanding issues around the governance and moderation of the public sphere.

Recommended citation: Geiger, R. Stuart. (2016). “Bot-based collective blocklists in Twitter: the counterpublic moderation of harassment in a networked public space.” Information, Communication, and Society 19(6). http://stuartgeiger.com/blockbots-ics.pdf
Download Paper

Summary Analysis of the 2017 GitHub Open Source Survey

Published in SocArxiv Preprints, 2017

This report is a high-level summary analysis of the 2017 GitHub Open Source Survey dataset, presenting frequency counts, proportions, and frequency or proportion bar plots for every question asked in the survey.

Recommended citation: R. Stuart Geiger. (2017). Summary Analysis of the 2017 GitHub Open Source Survey. _SocArXiv Preprints._ doi: 10.17605/OSF.IO/ENRQ5

Operationalizing conflict and cooperation between automated software agents in Wikipedia: A replication and expansion of ‘Even Good Bots Fight’

Published in Proceedings of the ACM on Human-Compter Interaction, 2017

A mixed-method trace ethnographic analysis of issues around the governance of automated software agents in Wikipedia, focusing on how to interpret cases where bots reverted each other’s edits.

Recommended citation: R. Stuart Geiger and Aaron Halfaker. 2017. "Operationalizing conflict and cooperation between automated software agents in Wikipedia: A replication and expansion of Even Good Bots Fight." Proceedings of the ACM on Human-Computer Interaction (Nov 2017 issue, CSCW 2018 Online First) 1, 2, Article 49. DOI:https://doi.org/10.1145/3134684. https://commons.wikimedia.org/wiki/File:conflict-bots-wp-cscw.pdf.
Download Paper

Beyond opening up the black box: Investigating the role of algorithmic systems in Wikipedian organizational culture

Published in Big Data & Society, 2017

Scholars and practitioners across domains are increasingly concerned with algorithmic transparency and opacity, interrogating the values and assumptions embedded in automated, black-boxed systems, particularly in user-generated content platforms. I report from an ethnography of infrastructure in Wikipedia to discuss an often understudied aspect of this topic: the local, contextual, learned expertise involved in participating in a highly automated social-technical environment.

Recommended citation: R. Stuart Geiger. (2017). "Beyond opening up the black box: Investigating the role of algorithmic systems in Wikipedian organizational culture." Big Data & Society 4(2). https://doi.org/10.1177/2053951717730735
Download Paper

The Types, Roles, and Practices of Documentation in Data Analytics Open Source Software Libraries: A Collaborative Ethnography of Documentation Work

Published in Computer-Supported Cooperative Work (JCSCW), 2018

Data analytics increasingly relies on open source software (OSS) libraries that extend scripted languages like python and R. Software documentation for these libraries is crucial for people across all experience levels, but documentation work raises many challenges, particularly in open source communities. In this collaboration between ethnographers and data scientists, we discuss the types, roles, practices, and motivations around documentation in data analytics OSS libraries.

Recommended citation: Geiger, R.S., Varoquaux, N., Mazel-Cabasse, C., and Holdgraf, C. (2018). "The Types, Roles, and Practices of Documentation in Data Analytics Open Source Software Libraries: A Collaborative Ethnography of Documentation Work." Computer-Supported Cooperative Work (JCSCW), 27(3). DOI:10.1007/s10606-018-9333-1. https://link.springer.com/article/10.1007/s10606-018-9333-1

Career Paths and Prospects in Academic Data Science: Report of the Moore-Sloan Data Science Environments Survey

Published in , 2018

This report of a survey of academic data scientists discusses what data science in the academy is, and various issues around the career paths for those in universities who practice and support data science. We provide evidence-based recommendations about how universities can better support an emerging set of roles and responsibilities around data and computation within and across academic fields.

Recommended citation: R. Stuart Geiger, Charlotte Mazel-Cabasse, Chihoko Cullens, Laura Noren, Brittany Fiore-Gartland, Diya Das, and Henry Brady (2018). _Career Paths and Prospects in Academic Data Science: Report of the Moore-Sloan Data Science Environments Survey._ Report. Berkeley, California: UC-Berkeley Institute for Data Science. https://osf.io/preprints/socarxiv/xe823/

Reports from the BIDS Best Practices in Data Science Series

Published in , 2018

An ongoing series of short papers that report from discussions where we share our experiences doing data science well (or at least better), for many definitions of the term.

Recommended citation: R. Stuart Geiger, Dan Sholler, Aaron Culich, Ciera Martinez, Fernando Hoces de la Guardia, François Lanusse, Kellie Ottoboni, Marla Stuart, Maryam Vareth, Nelle Varoquaux, Sara Stoudt, and Stéfan van der Walt. "Challenges of Doing Data-Intensive Research in Teams, Labs, and Groups: Report from the BIDS Best Practices in Data Science Series." _BIDS Best Practices in Data Science Series._ Berkeley Institute for Data Science: Berkeley, California. 2018. doi:10.31235/osf.io/a7b3m

The Rise and Fall of the Note: Changing Paper Lengths in ACM CSCW, 2000-2018

Published in Proceedings of the ACM on Human-Computer Interaction (PACMHCI, CSCW 2019), 2019

A short paper (or note) quantitatively examining changing paper lengths in the Proceedings of the ACM Conference on Computer-Supported Cooperative Work, focusing on the rise and fall of the 4-page note format.

Recommended citation: R. Stuart Geiger. 2019. "The Rise and Fall of the Note: Changing Paper Lengths in ACM CSCW, 2000-2018." Proceedings of the ACM Human Computer Interaction (PACMHCI) 3, CSCW, Article 222 (November 2019), 10 pages. https://doi.org/10.1145/3359324 https://arxiv.org/pdf/1908.10808.pdf
Download Paper

ORES: Lowering Barriers with Participatory Machine Learning in Wikipedia

Published in Proceedings of the ACM on Human-Computer Interaction (CSCW 2020), 2019

This paper presents an overview and case studies of ORES, Wikipedia’s real-time machine learning as a service platform, which is designed in line with Wikipedia’s values of open participation, decentralization, and continual iteration. ORES decouples and reduces incidental complexity around several aspects of applying machine learning in a user-generated content platform, including curating training data sets, building models to serve predictions, auditing predictions, and developing interfaces or automated agents that act on those predictions.

Recommended citation: Aaron Halfaker and R. Stuart Geiger. 2019. "ORES: Lowering Barriers with Participatory Machine Learning in Wikipedia."Proceedings of the ACM on Human-Computer Interaction, 4, CSCW2, Article 148 (October 2020), 37 pages. https://arxiv.org/pdf/1909.05189.pdf https://doi.org/10.1145/3415219
Download Paper

Garbage In, Garbage Out? Do Machine Learning Application Papers in Social Computing Report Where Human-Labeled Training Data Comes From?

Published in Proceedings of the 2020 ACM Conference on Fairness, Accountability, and Transparency (FAT* 2020), 2019

Many machine learning projects for new application areas involve teams of humans who label data for a particular purpose, from hiring crowdworkers to the paper’s authors labeling the data themselves. In this paper, we investigate to what extent a sample of machine learning application papers in social computing – specifically papers from ArXiv and traditional publications performing an ML classification task on Twitter data – give specific details about whether best practices in human annotation were followed.

Recommended citation: R. Stuart Geiger, Kevin Yu, Yanlai Yang, Mindy Dai, Jie Qiu, Rebekah Tang, and Jenny Huang. 2020. "Garbage In, Garbage Out? Do Machine Learning Application Papers in Social Computing Report Where Human-Labeled Training Data Comes From?" In Proceedings of the ACM Conference on Fairness, Accountability, and Transparency (FAT* ’20), January 27–30, 2020, Barcelona, Spain. ACM, New York, NY, USA, 18 pages. https://stuartgeiger.com/papers/gigo-fat2020.pdf https://doi.org/10.1145/3351095.3372862
Download Paper

The Labor of Maintaining and Scaling Free and Open-Source Software Projects

Published in Proceedings of the ACM on Human-Computer Interaction (CSCW 2021), 2021

We report findings from an interview-based study of maintainers of free and/or open-source software (F/OSS) projects. F/OSS maintainers perform complex and often-invisible interpersonal and organizational work to keep their projects operating as active communities of users and contributors. We particularly focus on how this labor of maintaining and sustaining changes as projects and their software grow and scale across many dimensions.

Recommended citation: R. Stuart Geiger, Dorothy Howard, and Lilly Irani. 2021. "The Labor of Maintaining and Scaling Free and Open-Source Software Projects." Proceedings of the ACM on Human-Computer Interaction. 5, CSCW1, Article 175 (April 2021), 28 pages. https://doi.org/10.1145/3449249 https://stuartgeiger.com/papers/maintaining-scaling-foss-cscw2021.pdf
Download Paper

‘Garbage In, Garbage Out’ Revisited: What Do Machine Learning Application Papers Report About Human-Labeled Training Data?

Published in Quantitative Science Studies, 2021

Supervised machine learning, in which models are automatically derived from labeled training data, is only as good as the quality of that data. We report to what extent a random sample of ML application papers across disciplines give specific details about whether best practices were followed in labeling training data.

Recommended citation: R. Stuart Geiger, Dominique Cope, Jamie Ip, Marsha Lotosh, Aayush Shah, Jenny Weng, and Rebekah Tang. 2021. "'Garbage in, garbage out' revisited: What do machine learning application papers report about human-labeled training data?" Quantitative Science Studies 2(3). https://doi.org/10.1162/qss_a_00144
Download Paper

Community, Time, and (Con)text: A Dynamical Systems Analysis of Online Communication and Community Health among Open-Source Software Communities

Published in Cognitive Science, 2022

Open-source software projects, maintained largely by unpaid volunteers, face recruitment and retention challenges. Using dynamical systems analysis of community communications, we found that sentiment and gratitude expressions significantly shape newcomer retention, with impacts modulated by the context of first contact.

Recommended citation: Paxton, A., Varoquaux, N., Holdgraf, C. and Geiger, R.S. (2022), Community, Time, and (Con)text: A Dynamical Systems Analysis of Online Communication and Community Health among Open-Source Software Communities. Cognitive Science, 46: e13134. https://doi.org/10.1111/cogs.13134

Opinion: ChatGPT is this generation’s Wikipedia. We have an opportunity to learn from the past

Published in San Diego Union Tribune, 2023

This is an op-ed published in the San Diego Union Tribune as part of a set of commentaries on ChatGPT.

Recommended citation: Geiger, R. Stuart. (2023, June 19). "Opinion: ChatGPT is this generation's Wikipedia. We have an opportunity to learn from the past." San Diego Union Tribune. https://www.sandiegouniontribune.com/opinion/commentary/story/2023-06-19/opinion-chatgpt-wikipedia-artificial-intelligence-ai-schools-education

Making Algorithms Public: Reimagining Auditing From Matters of Fact to Matters of Concern

Published in International Journal of Communication, 2024

Through four algorithmic audit cases, we demonstrate how scoping decisions shape audit outcomes and reflect institutional power. We propose moving beyond technical certification toward building infrastructures for democratic understanding and contestation, recognizing auditing as a political practice that must engage with institutional reform, not just algorithmic behavior.

Recommended citation: Geiger, R. Stuart, Udayan Tandon, Anoolia Gakhokidze, Lian Song, and Lilly Irani. (2024). "Making Algorithms Public: Reimagining Auditing From Matters of Fact to Matters of Concern." International Journal of Communication, 18, 634-655. https://ijoc.org/index.php/ijoc/article/download/20811/4455

Asking an AI for salary negotiation advice is a matter of concern: Controlled experimental perturbation of ChatGPT for protected and non-protected group discrimination on a contextual task with no clear ground truth answers

Published in PLoS ONE, 2025

We audited four ChatGPT versions with 98,800 prompts each for bias in salary negotiation advice. We found statistically significant gaps by gender, university, and major, with the largest variations between model versions and employee vs. employer perspectives. Results were inconsistent across versions, raising concerns about using ChatGPT for contextual tasks without clear ground truth.

Recommended citation: Geiger, R.S., O'Sullivan, F., Wang, E., & Lo, J. (2025). "Asking an AI for salary negotiation advice is a matter of concern: Controlled experimental perturbation of ChatGPT for protected and non-protected group discrimination on a contextual task with no clear ground truth answers." PLoS ONE 20(2): e0318500. https://doi.org/10.1371/journal.pone.0318500

talks

A Communicative Ethnography of Argumentative Strategies in a Wikipedian Content Dispute

Published: March 01, 2008

Conceptions and Misconceptions Academics Hold About Wikipedia

Published: July 19, 2008

Working With/in Wikipedia: Infrastructures of Knowing and Knowledge Production

Published: March 28, 2009

Evolving Governance and Media Use in Wikipedia: A Historical Account

Published: April 25, 2009

Algorithmic Governance: The Social Roles of Bots and Assisted Editing Tools

Published: July 26, 2009

Trace Ethnography: An ANT Method for the Study of Sociotechnical Networks

Published: September 25, 2009

The Social Roles of Bots and Assisted Editing Tools

Published: October 27, 2009

A short paper showing the recent explosive growth of automated editors (or bots) in Wikipedia, which have taken on many new tasks in administrative spaces.

Where Are the Missing Wikipedians? The Sociology of a Bot

Published: October 28, 2009

The Wisdom of Bots: A Critique of ‘Self-Organization’ in Wikipedia

Published: January 10, 2010

The Work of Sustaining Order in Wikipedia: The Banning of a Vandal

Published: February 25, 2010

This paper traces out a heterogeneous network of humans and non-humans involved in the identification and banning of a single vandal in Wikipedia.

Bot Politics: How is Automation Changing the Wikipedian Society? Critical Point of View II

Published: March 26, 2010

Academic Researchers in Wikimedia Communities: Ethics, Methods, and Policies

Published: July 10, 2010

A panel intended to foster a dialog between academic researchers who study Wikimedia projects and the Wikimedia community.

Trace Ethnography: Following Coordination through Documentary Practices

Published: January 03, 2011

Machine-Generated Content: Bots and the Governance of Wikipedia

Published: March 04, 2011

Participation in Wikipedia’s Article Deletion Processes (with Heather Ford)

Published: October 05, 2011

This paper investigates Wikipedia's article deletion processes, finding that it is heavily populated by specialists.

’The Internet is Here’: The Virtuality of ‘On-line Communities in Physical Spaces

Published: November 02, 2011

User-Generated Platforms in Wikipedian Governance

Published: November 03, 2011

Improving Wikipedia’s Notifications to Rejected Contributors

Published: March 31, 2012

Black-boxing the user: internet protocol over xylophone players (IPoXP)

Published: May 02, 2012

We introduce IP over Xylophone Players (IPoXP), a novel Internet protocol between two computers using xylophone-based Arduino interfaces

Hunting for Fail Whales: Lessons from Deviance and Failure in Social Computing

Published: May 07, 2012

Defense Mechanism or Socialization Tactic? Improving Wikipedia’s Notifications to Rejected Contributors

Published: June 05, 2012

A descriptive study of Wikipedia's highly-automated socialization processes and an A/B test to improve templated messages to newcomers.

Trace literacy: a framework for holistically conceptualizing newcomer socialization in socio-technical systems

Published: October 12, 2012

Time to Degree: Examining the Experiences of Graduate Students in the Long-Term Ecological Research Network

Published: October 17, 2012

What Aren’t We Measuring? Methods for Quantifying Wiki-Work.

Published: October 29, 2012

Actor-Network Theory

Published: February 07, 2013

An introduction to Actor Network Theory for students in the Masters of Information Management and Systems (MIMS) course

Using Edit Sessions to Measure Participation in Wikipedia (with Aaron Halfaker)

Published: February 23, 2013

This paper establishes a quantitative metric for measuring editor activity through temporal edit sessions.

Community, Impact, and Credit: Where Do I Submit My Papers?

Published: February 26, 2013

Values Where? Interrogating Client-Side Scripting as a Design Process

Published: March 01, 2013

When the Levee Breaks: Without Bots, What Happens to Wikipedia’s Quality Control Processes? (with Aaron Halfaker)

Published: August 03, 2013

This paper examines what happened when one of Wikipedia's counter-vandalism bots unexpectedly went offline.

Hadoop as Grounded Theory: Is an STS Approach to Big Data Possible? the 2013 Annual Meeting of the Society for the Social Study of Science 4S

Published: October 09, 2013

Design by Bot: Power and Resistance in the Development of Automated Software Agents

Published: October 23, 2013

Size Matters: How Big Data Changes Everything

Published: November 25, 2013

A talk introducing various concepts around large-scale data analysis to a general audience, including spam detection and governmental survellance.

Robotic Ethics and Opportunities

Published: April 04, 2014

A panel discussing the ethical and political issues that are raised with autonomous robots and software bots.

Governing the Commons

Published: April 10, 2014

A lecture on the history of Wikipedia, in the broader context of the history of reference works.

Successor Systems: Enacting Ideological Critique Through the Development of Software

Published: April 25, 2014

Link to more information

Successor Systems: The Role of Reflexive Algorithms in Enacting Ideological Critique

Published: May 16, 2014

Data-Driven Data Research Using Data and Databases: A Practical Critique of Methods and Approaches in “Big Data” Studies

Published: May 23, 2014

This panel focuses on the challenges faced by researchers conducting mixed-method research into online platforms, particularly where large amounts of data are widely available.

Big Data is Bullshit’: Scoping the Next 5 Years of Digital Data Research

Published: May 24, 2014

Successor Systems: The Role of Reflexive Algorithms in Enacting Ideological Critique

Published: August 23, 2014

Successor Systems: The Role of Reflexive Algorithms in Enacting Ideological Critique

Published: October 21, 2014

Defining, Designing, and Evaluating Civic Values in Human Computation and Collective Action Systems (with Nathan Matias)

Published: November 02, 2014

We review various crowdsourcing and collective action systems, identifying particular sets of civic values and assumptions.

Supporting Change from Outside Systems with Design and Data

Published: December 09, 2014

Does Facebook Have Civil Servants? On Governmentality and Computational Social Science

Published: March 15, 2015

Situated knowledges and successor systems: developing CSCW systems to enact ideological critiques

Published: March 15, 2015

Trace Ethnography Workshop

Published: March 24, 2015

Link to more information

Moderating Online Conversation Spaces

Published: April 07, 2015

An overview of how various online platforms moderate content, discussing issues that link up to the theories discussed in the Social Aspects of Information Systems class.

Peer Production and Wikipedia

Published: April 09, 2015

An overview of Wikipedia and other peer production platforms, discussing issues that link up to the theories discussed in the Social Aspects of Information Systems class.

But it Wouldn’t Be an Encyclopedia; It Would Be a Wiki: Wikipedia and the Repurposing of WikiWikiWeb

Published: May 25, 2015

In this talk, I examine the early history of “anyone can edit” wiki software – originally developed in 1995, six years before Wikipedia’s origin – focusing on the ways in which this technological infrastructure has been repurposed across communities, domains, and scales.

Bot-Based Collective Blocklists in Twitter: The Counterpublic Moderation of a Privately-Owned Networked Public Space

Published: October 23, 2015

This presentation introduces bot-based collective blocklists (or blockbots) in Twitter, which have been created to help various groups better moderate their own experiences on the site.

Crowdsourcing: Theoretical Considerations

Published: November 06, 2015

A panel discussing how academics use crowdsourcing in research.

The Bot Multiple: Unpacking the Materialities of Automated Software Agents

Published: November 12, 2015

I examine the roles that automated software agents (or bots) play in the governance and moderation of Wikipedia, Twitter, and reddit – three online platforms that differently uphold a related set of commitments to ‘open’ and ‘public’ online participation.

Why bots are my favorite contribution to Wikipedia

Published: January 16, 2016

A short talk to open up an event celebrating the 15th anniversary of Wikipedia. The prompt we were given was "Why [x] is my favorite contribution to Wikipedia."

Scraping Wikipedia Data

Published: February 17, 2016

A tutorial (with Jupyter notebooks) about how to use APIs to query structured data from Wikipedia articles and the Wikidata project.

“What the hack?” Hacking culture and discourse in data science pedagogy (with Brittany Fiore-Gartland)

Published: April 15, 2016

Link to more information

Moderating harassment in Twitter with blockbots: a counterpublic and algorithmic strategy

Published: April 16, 2016

Link to more information

Algorithms as agents of gatekeeping, governance, and articulation work in Wikipedia

Published: June 08, 2016

I discuss how algorithmic systems are deployed to enforce particular behavioral and epistemological standards in Wikipedia, which can become a site for collective sensemaking among veteran Wikipedians.

Successor Systems: Lessons for Big Data From Feminist Epistemology and Activism

Published: June 09, 2016

I discuss four data-intensive activist projects as "successor systems," discussing the political and epistemological implications of using data to advance activist projects.

Drowning in Data: Industry and Academic Approaches to Mixed Methods in “Holistic” Big Data Studies

Published: June 11, 2016

This panel extends discusses the potentials and complications of mixed-methods research in big data studies, specifically in cases when population-level data is available.

Administrative Support Bots in Wikipedia: How Automation Can Transform the Affordances of Platforms and the Governance of Communities

Published: June 14, 2016

I discuss cases from a multi-year ethnographic study of automated software agents in Wikipedia, where ‘bots’ have fundamentally transformed the nature of the ‘anyone can edit’ encyclopedia project.

Governing Open Source Projects at Scale: Lessons from Wikipedia’s Growing Pains

Published: July 16, 2016

Many open source, volunteer-driven projects begin with a small, tight-knit group of collaborators, but then rapidly expand far faster than anyone expects or plans for. I discuss cases of governance growing pains in Wikipedia, which have many lessons for running open source software projects.

Community Sustainability in Wikipedia: A Review of Research and Initiatives

Published: August 13, 2016

Wikipedia relies on one of the world’s largest open collaboration communities. Since 2001, the community has grown substantially and faced many challenges. This presentation reviews research and initiatives around community sustainability in Wikipedia that are relevant for many open source projects, including issues of newcomer retention, governance, automated moderation, and marginalized groups.

“The Wisdom of Bots:” An ethnographic study of the delegation of governance work to information infrastructures in Wikipedia

Published: September 02, 2016

Wikipedians rely on software agents to govern the ‘anyone can edit’ encyclopedia project, in the absence of more formal and traditional organizational structures. Lessons from Wikipedia’s bots speak to debates about how algorithms are being delegated governance work in sites of cultural production.

Demystifying Algorithmic Processes: The Case of Wikipedia

Published: April 20, 2017

This talk is part of a panel session titled “Demystifying Algorithmic Processes: What is the role of algorithms in online platforms, what can they do and not do, and how should they be governed?”

Jupyter and the Changing Rituals around Computation

Published: August 25, 2017

We (Stuart Geiger, Brittany Fiore-Gartland, and Charlotte Cabasse-Mazel) share ethnographic findings made observing and working with Jupyter notebooks, focusing on how people use Jupyter to create and deliver computational narratives in particular local contexts, like classrooms, hackathons, research collaborations, and more.

Autoethnographic Methods for Studying Data-Driven Knowledge Production

Published: August 31, 2017

An overview of how to study data science ethnographically by personally engaging in various practices of data science.

Computational Ethnography and the Ethnography of Computation

Published: September 14, 2017

Ethnography is traditionally a qualitative and inductive methodology – with its origins in cultural anthropology – that is now widely used to holistically investigate people’s lived experiences in and across cultures. In this talk, I define and discuss two ways of thinking about the role of ethnographic methods around computation, then discuss how my research relates to both.

Are the bots really fighting? Behind the scenes of a reproducible replication

Published: October 10, 2017

A guest lecture for Fernando Perez’s STAT 159/259 course on Reproducible and Collaborative Data Science, in which I discuss issues of open science and reproducibility around our recent paper Operationalizing conflict and cooperation between automated software agents in Wikipedia: A replication and expansion of ‘Even Good Bots Fight’

“But it wouldn’t be an encyclopedia; it would be a wiki”: The changing imagined affordances of wikis, 1995-2002

Published: October 19, 2017

This paper examines the early history of “anyone can edit” wiki software – originally developed in 1995, six years before Wikipedia’s origin. While today, the idea of a wiki is associated with large-scale, massively-distributed encyclopedic knowledge production, this was not always the case. Articles on pre-Wikipedia wikis were often closer to a Joycean stream of consciousness than Wikipedia’s Britannica-inspired texts that speak in single voice, and the underlying wiki platform lacked many of the affordances that are now taken for granted in wiki platforms. In fact, the creator of the first wiki advised Wikipedia’s co-founders that the goals of creating a general-purpose encyclopedia and a wiki were inherently contradictory.

The Humanity of Artificial Intelligence

Published: November 01, 2017

Today, “artificial intelligence” seems to be everywhere – in our phones, vacuums, hospitals, and inboxes – but it can be hard to separate science fiction from science fact. Many discussions about AI imagine a fully autonomous superintelligence that designs itself with little to no human intervention, making decisions in ways that humans cannot possibly understand. Yet the work of designing, developing, engineering, training, and testing such systems requires a massive amount of human labor, which is typically erased when such systems are released as products. In this talk, I give a human-centered, behind-the-scenes introduction to machine learning, illustrating the creative, interpretive, and often messy work humans do to make autonomous agents work. Understanding the humanity behind artificial intelligence is important if we want to think constructively about issues of bias, fairness, accountability, and transparency in AI.

Computational Ethnography and the Ethnography of Computation: The Case for Context

Published: January 11, 2018

Ethnography is traditionally a qualitative and inductive methodology that is now widely used to holistically investigate people’s lived experiences in and across cultures. In this talk, I define and discuss two ways of thinking about the role of ethnographic methods around computation, then discuss how my research relates to both.

Computational Ethnography and the Ethnography of Computation: The Case for Context

Published: February 12, 2018

Computational Ethnography and the Ethnography of Computation: The Case for Context

Published: February 26, 2018

Publics: Witnessing and Measuring

Published: March 16, 2018

A guest lecture for Cathryn Carson and Margo Boenig-Liptsin’s course on Human Contexts and Ethics of Data (HIST 182C, STS 100C), focusing on how various publics generate, analyze, and interpret data.

The Human Contexts of Data: Infrastructures, Institutions, and Interpretations

Published: March 22, 2018

In this talk, I discuss the role of qualitative and ethnographic methods in relation to computer, information, and data science. These holistic, reflexive, and meta-level approaches to studying data and computation in context help us better understand how to both support and practice data analytics at various scales.

Computational Ethnography and the Ethnography of Computation: The Case for Context

Published: March 26, 2018

Key Values: What We Talk About When We Talk About ‘Open Science’

Published: April 20, 2018

Openness in science is hard to disagree with as an abstract principle, but what exactly do we mean when we call for science to be made open – or more open than before? In this talk, I introduce and unpack the many different goals, strategies, products, values, and assumptions of the broad open science movement.

The Human Contexts of Computation and Data: Infrastructures, Institutions, and Interpretations

Published: May 09, 2018

Knowing User Populations at Scale: From the Science of the State to Platform Governmentality

Published: May 27, 2018

How can institutions that own and operate large-scale social media platforms come to know “their users” at scale? In this talk, I discuss ways of knowing user populations at scale, drawing on Foucault’s account of governmentality, particularly the role of statistics in the formation of the modern nation state.

The Types, Roles, and Practices of Documentation in Data Analytics Open Source Software Libraries: A Collaborative Ethnography of Documentation Work

Published: June 07, 2018

Designing and Using Data Science Ethically

Published: August 16, 2018

With the rise of Machine Learning and AI to solve human-focused needs, how do we design and use data science ethically to help empower and support people?

Qualitative and Quantitative Studies of Wikipedia (with Aaron Halfaker)

Published: August 23, 2018

We reflect on a decade of studying Wikipedia using qualitative and quantitative methods.

Cooking Data with Care: The Role of Contextual Inquiry in Large-Scale Quantitative Research

Published: January 23, 2019

In this talk, I argue that there is often substantial qualitative contextual inquiry and expertise deployed in quantitative methods. Such insights are crucial to ‘cooking data with care,’ as Geoff Bowker advocated.

Documenting Data Science and Documentation in Data Science: an Ethnographic Exploration

Published: January 24, 2019

In this talk, I discuss the central yet often passed over role of documentation in data science, based on several recent and ongoing studies and projects about the role and importance of documentation in software packages, datasets, analysis code, research protocols, and research teams.

Ethics and Policy Implications of Big Data

Published: February 15, 2019

Panelist on the ‘Knowledge and Culture’ panel at this workshop on algorithms and big data, sponsored by a number of different departments across UCSD.

The Invisible Work of Maintaining & Sustaining Open-Source Software

Published: July 10, 2019

Opening keynote at SciPy 2019, in which I discuss a wide range of issues around the work of developing and maintaining open-source software, based on our team’s ongoing mixed-method research into this topic.

Garbage In, Garbage Out? Do Machine Learning Application Papers in Social Computing Report Where Human-Labeled Training Data Comes From?

Published: January 28, 2020

We investigate to what extent a sample of machine learning application papers in social computing — specifically papers from ArXiv and traditional publications performing an ML classification task on Twitter data — give specific details about whether such best practices were followed.

’I didn’t sign up for this’: The Invisible Work of Maintaining Free/Open-Source Software Communities.

Published: August 18, 2020

I discuss a wide range of issues around the burdens maintainers face in developing and maintaining open-source software, based on our team’s ongoing mixed-method research into this topic.

Garbage In, Garbage Out Revisited: Labeling and Dataification Practices Across Disciplines.

Published: December 16, 2022

I discuss a range of issues and best practices around data labeling, verification, and quality across disciplines.

R. Stuart Geiger

Posts by Collection

portfolio

publications

talks

teaching