Bibliography

Below is a working collection of publications on machine learning, ranging from the technical to the philosophical. This bibliography is by no means exhaustive, but it covers a broad range of works that together provide a useful resource for critical machine learning research.

Each publication is assigned topic tags, listed below. Search in-page for a tag to find the associated publications.

Tags: Abstraction | Accountability | Adversariality | Affect | Agency | AI art | Algorithmic genealogy | Algorithmic thought | Algorithmic unconscious | Algorithms in law | Anthropomorphism | Antiblackness | Artificial Intelligence | Attention | Authorship | BERT | Bias | Big data | Biometrics | Black boxes | Blackness and anti-blackness | Carcerality | CNN | Cognition | Complexity | Computational statistics | Computer vision | Computing methodology | Concept-based explainability | Connectionism | Convolution | Cultural analytics | Culture | Data choices | Data ethics | Data society | Databases | Datasets | Decision-making | Decisionism | Decolonization | Deep learning | Deep neural networks | Design | Determinism | Difference | Differences in algorithms | Digital methods | Disciplinarity | Disciplinary concerns | Discourse | Discrimination | Discrimination in law | Domain expertise | Doubt | Empiricism | Epidemiology | Ethics | Ethics and design | Ethics of design | Ethnography | Evaluation | Explainability | Facebook | Fairness | FAT | Foundation models | Game theory | GANs | Gender | Genealogy | Genomics | Gestalt theory | Governmentality | GPT | GPT-2 | GPT-3 | Habitus | Hate speech | Hermeneutics | History | Homophily | Human agency | Humans vs. computers | Idealizations | Image-to-image translation | Imaginations | Indexicality | Inductive biases | Industries | Information visualization | Infrastructure | Institutions | Intentional stance | Interpretability | Interpretability tradeoffs | Jurisprudence | Knowledge representation | Language | Language models | Law | Legality | Limitations | Machine decisions | Mythology/allegory | Media analytics | Medicine | Methodology | Micropolitics | ML genealogy | Model selection | Modeling the senses | Modularity | Multilingual machines | N. Katherine Hayles | Neural machine translation | Neural networks | Neuroscience | NLI | NLP | No-free-lunch theorems | Nooscope | Nuisance variation | Olfactory perception | Opacity | OpenAI | Optics and perception | Perception | Phenomenology | Platforms | Political questions | Politics | Positivism | Post-truth | Posthumanism | Pragmatics | Predictive crime | Psychoanalysis | Psychology | Public health | Queer epistemology | Race | Race as covariate | Race and visuality | Recommender systems | Reinforcement Learning | Representation | Restorative justice | Rights | Semiotics | Sex | SHAP values | Shortcomings | Social groups | Social learning | Social media | Social questions | Sociality | Society | Sociolinguistics | Sociological theory | Sociotechnical imaginaries | Strategic manipulation | Subjectivity | Subversive AI | Surveillance | Technical paper | Technicity | Translation | Transformers | Transparency | Trustworthy AI | Twitter | Uncertainty | Values | Word2Vec | XAI | YouTube

Adadi, Amina and Mohammed Berrada. “Peeking inside the black-box: A survey on explainable artificial intelligence (XAI).” IEEE Access 6 (September, 2018): 52138–52160. Tags: XAI, black boxes
At the dawn of the fourth industrial revolution, we are witnessing a fast and widespread adoption of artificial intelligence (AI) in our daily life, which contributes to accelerating the shift towards a more algorithmic society. However, even with such unprecedented advancements, a key impediment to the use of AI-based systems is that they often lack transparency. Indeed, the black-box nature of these systems allows powerful predictions, but it cannot be directly explained. This issue has triggered a new debate on explainable AI (XAI), a research field that holds substantial promise for improving trust and transparency of AI-based systems. It is recognized as the sine qua non for AI to continue making steady progress without disruption. This survey provides an entry point for interested researchers and practitioners to learn key aspects of the young and rapidly growing body of research related to XAI. Through the lens of the literature, we review the existing approaches regarding the topic, discuss trends surrounding its sphere, and present major research trajectories.

Alkhatib, Ali. “Anthropological/Artificial Intelligence & the HAI.” Last modified March 26, 2019. https://ali-alkhatib.com/blog/anthropological-intelligence. Tags: Industries, ethics
Last week Stanford launched the institute for human-centered artificial intelligence, and to kick things off James Landay posted about the roles AI could play in society, and the importance of exploring smart interfaces.

Amaro, Ramon. “As If.” e-flux architecture vol. 97 (February 2019). Tags: Computer vision, optics and perception, Blackness and anti-Blackness, race and visuality.
In 2016, Joy Buolamwini, a researcher with the Civic Media group at the MIT Media Lab and founder of Code4Rights, developed the Aspire Mirror. Buolamwini describes the Mirror on its website as a device that allows one to “see a reflection of [their] face based on what inspires [them] or what [they] hope to empathize with.” The project draws inspiration from Thing From The Future, an imagination card game that asks players to collaboratively and competitively describe objects from a range of alternative futures. The Mirror draws additional influence from futuristic machines and speculative imaginaries found in popular science fiction novels (for instance, the empathy box and mood organ in Philip K. Dick’s Do Androids Dream of Electric Sheep?, tales of shape shifting from the Ghanaian tale of the spider Anansi, and movies like Transformers). Buolamwini says she developed the Mirror to induce empathies that can help facilitate the spread of compassion in humanity. Another important goal of the Mirror is to catalyze individual reflection based on a set of cultural values like humility, dedication, oneness with nature, harmony, faith and self-actualization. Ultimately for Buolamwini, these transformative futures are a “hall of possibilities” where individuals can explore self-determinant futures, “if only for a small period of time.”
Aspire Mirror relies on facial detection and tracking software to capture and interpret image data before transforming them into futuristic scenes or “paintings.” During testing, Buolamwini encountered a problem. The Mirror could not detect details of her presence due to her dark skin tones and facial features. In order to validate the device and generate an alternative reality, Buolamwini had to first alter her existing appearance to make herself visible, and gain access to aspirational futures. She accomplished this by wearing a white facial mask, with features that were more easily detected. For Buolamwini, this was no surprise. She had encountered this limitation before, while developing an earlier computer vision system.

Amoore, Louise. “Doubt and the Algorithm: On the Partial Accounts of Machine Learning.” Theory, Culture & Society 36, no. 6 (November 2019): 147–69. https://doi.org/10.1177/0263276419851846. Tags: Ethics, Politics, Posthumanism, doubt.
In a 1955 lecture the physicist Richard Feynman reflected on the place of doubt within scientific practice. ‘Permit us to question, to doubt, to not be sure’, proposed Feynman, ‘it is possible to live and not to know’. In our contemporary world, the science of machine learning algorithms appears to transform the relations between science, knowledge and doubt, to make even the most doubtful event amenable to action. What might it mean to ‘leave room for doubt’ or ‘to live and not to know’ in our contemporary culture, where the algorithm plays a major role in the calculability of doubts? I propose a posthuman mode of doubt that decentres the liberal humanist subject. In the science of machine learning algorithms the doubts of human and technological beings nonetheless dwell together, opening onto a future that is never fully reduced to the single output signal, to the optimised target.

Amoore, Louise. “Introduction: Thinking with Algorithms: Cognition and Computation in the Work of N. Katherine Hayles.” Theory, Culture & Society 36, no. 2 (March 2019): 3-16. doi:10.1177/0263276418818884. Tags: Ethics, cognition, N. Katherine Hayles.
In our contemporary moment, when machine learning algorithms are reshaping many aspects of society, the work of N. Katherine Hayles stands as a powerful corpus for understanding what is at stake in a new regime of computation. A renowned literary theorist whose work bridges the humanities and sciences, Hayles has, among her many works, detailed ways to think about embodiment in an age of virtuality (How We Became Posthuman, 1999), how code as performative practice is located (My Mother Was a Computer, 2005), and the reciprocal relations among human bodies and technics (How We Think, 2012). This special issue follows the 2017 publication of her book Unthought: The Power of the Cognitive Nonconscious, in which Hayles traces the nonconscious cognition of biological life-forms and computational media. The articles in the special issue respond in different ways to Hayles’ oeuvre, mapping the specific contours of computational regimes and developing some of the ‘inflection points’ she advocates in the deep engagement with technical systems.

Andersen, Jack. “Understanding and Interpreting Algorithms: Toward a Hermeneutics of Algorithms.” Media, Culture & Society 42, no. 7–8 (October 2020): 1479–1494, doi:10.1177/0163443720919373. Tags: Hermeneutics
This article develops a hermeneutics of algorithms. By taking a point of departure in Hans-Georg Gadamer’s philosophical hermeneutics, developed in Truth and Method, I am going to examine what it means to understand algorithms in our lives. A hermeneutics of algorithms is consistent with the fact that we do not have direct access to the meaning of algorithms in the same way as we do not have direct access to the meaning of other cultural artifacts. We are forced to interpret cultural artifacts in order to make meaning out of them. The act of interpretation is an action on behalf of the interpreter. However, interpreters are not free to interpret cultural artifacts in whatever way they like. Interpreters are bound by the cultural artifact and its embeddedness in tradition. Furthermore, the act of interpretation is not to recover the historicity of the cultural artifact. Rather, interpretation concerns the way we make sense of algorithms in everyday life and how they are part of a tradition. It is about living with algorithms. Understanding and interpreting algorithms are therefore a mode of existence and mode of living with and enacting algorithms.

Apprich, Clemens. “Secret Agents: A Psychoanalytic Critique of Artificial Intelligence and Machine Learning.” Digital Culture & Society 4, no. 1 (June 2018): 29–44. doi: https://doi.org/10.25969/mediarep/13524. Tags: Psychoanalysis, representation, neural networks, determinism
“Good Old-Fashioned Artificial Intelligence” (GOFAI), which was based on a symbolic information-processing model of the mind, has been superseded by neural-network models to describe and create intelligence. Rather than a symbolic representation of the world, the idea is to mimic the structure of the brain in electronic form, whereby artificial neurons draw their own connections during a self-learning process. Critiquing such a brain physiological model, the following article takes up the idea of a “psychoanalysis of things” and applies it to artificial intelligence and machine learning. This approach may help to reveal some of the hidden layers within the current A. I. debate and hints towards a central mechanism in the psycho-economy of our socio-technological world: The question of “Who speaks?”, central for the analysis of paranoia, becomes paramount at a time, when algorithms, in the form of artificial neural networks, operate more and more as secret agents.

Ashwin, William Agnew, Umut Pajaro, Hetvi Jethwani, and Arjun Subramonian. “Rebuilding Trust: Queer in AI Approach to Artificial Intelligence Risk Management.” Preprint, submitted in January 2022. arXiv:2110.09271. Tags: Queer epistemology, trustworthy AI.
Trustworthy artificial intelligence (AI) has become an important topic because trust in AI systems and their creators has been lost. Researchers, corporations, and governments have long and painful histories of excluding marginalized groups from technology development, deployment, and oversight. As a result, these technologies are less useful and even harmful to minoritized groups. We argue that any AI development, deployment, and monitoring framework that aspires to trust must incorporate both feminist, non-exploitative participatory design principles and strong, outside, and continual monitoring and testing. We additionally explain the importance of considering aspects of trustworthiness beyond just transparency, fairness, and accountability, specifically, to consider justice and shifting power to the disempowered as core values to any trustworthy AI system. Creating trustworthy AI starts by funding, supporting, and empowering grassroots organizations like Queer in AI so the field of AI has the diversity and inclusion to credibly and effectively develop trustworthy AI. We leverage the expert knowledge Queer in AI has developed through its years of work and advocacy to discuss if and how gender, sexuality, and other aspects of queer identity should be used in datasets and AI systems and how harms along these lines should be mitigated. Based on this, we share a gendered approach to AI and further propose a queer epistemology and analyze the benefits it can bring to AI. We additionally discuss how to regulate AI with this queer epistemology in vision, proposing frameworks for making policies related to AI & gender diversity and privacy & queer data protection.

Asif, Manan Ahmed. “Technologies of Power - From Area Studies to Data Sciences.” Spheres: A Journal for Digital Cultures 5 (November 2019). Tags: Genealogy, disciplinarity, history, decolonization.
This essay is an attempt to bring together two seemingly divergent trends in the American university of the recent past: first is the disciplinary presence of ‘area studies’ in the US academy since 1958, and the second is the rise of ‘data science institutes’ on US campuses since 2008. The first is responsible for the training of vast numbers of US citizens in languages and cultures (sometimes labeled ‘civilizations’), most frequently, of the places of the world which are of geo-strategic concern to the United States. This training has resulted in the concomitant production of academic scholars of “Near East, East Asia, Middle East, Southeast and South Asia” over the decades with hundreds, perhaps thousands, tenured professorships, monographs etc. The second is the result of a strategic shift of funding away from area studies in 2008 and towards automation, algorithmic capacities, and data analysis which created new offices, new buildings, new faculty positions in data sciences on American campuses. Where the Department of State was the federal funding agency for the first, the Defense Advanced Research Project Agency (DARPA) is often the federal funding agency for the second. What combines the two, this essay will argue, is the presence of philology and the primacy of the military concerns of the state – they are both technologies of power, which ought to be collectively studied. In linking ‘area studies’ to ‘data sciences’, I am arguing not for a simple rhetorical framing but to see how critical the philological method was to the accumulation of data about the colonized body, in continental North America and later in the global south. I offer two interventions: first, a re-definition of ‘data’ in order to fold in the history of philology, and a recognition of the grammar and phrase books and the dictionaries not only as critical tools of colonization but also as data for it. Second, I argue that we need to build upon Bernard S. Cohn’s work on colonial knowledge production, and envisage an ‘algorithmic modality’ within which both the history of philological sciences and data sciences co-exist for the American imperial past and present.

Barocas, Solon, Andrew D. Selbst, and Manish Raghavan. “The hidden assumptions behind counterfactual explanations and principal reasons.” In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, FAT* ’20, Barcelona, Spain, 2020, 80–89. New York: Association for Computing Machinery. Tags: Social questions, interpretability, FAT, explainability
Counterfactual explanations are gaining prominence within technical, legal, and business circles as a way to explain the decisions of a machine learning model. These explanations share a trait with the long-established “principal reason” explanations required by U.S. credit laws: they both explain a decision by highlighting a set of features deemed most relevant—and withholding others. These “feature-highlighting explanations” have several desirable properties: They place no constraints on model complexity, do not require model disclosure, detail what needed to be different to achieve a different decision, and seem to automate compliance with the law. But they are far more complex and subjective than they appear. In this paper, we demonstrate that the utility of feature-highlighting explanations relies on a number of easily overlooked assumptions: that the recommended change in feature values clearly maps to real-world actions, that features can be made commensurate by looking only at the distribution of the training data, that features are only relevant to the decision at hand, and that the underlying model is stable over time, monotonic, and limited to binary outcomes. We then explore several consequences of acknowledging and attempting to address these assumptions, including a paradox in the way that feature-highlighting explanations aim to respect autonomy, the unchecked power that feature-highlighting explanations grant decision makers, and a tension between making these explanations useful and the need to keep the model hidden. While new research suggests several ways that feature-highlighting explanations can work around some of the problems that we identify, the disconnect between features in the model and actions in the real world—and the subjective choices necessary to compensate for this—must be understood before these techniques can be usefully implemented.

Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT '21), Virtual Event, Canada, 2021, 610-623. New York: Association for Computing Machinery. https://doi.org/10.1145/3442188.3445922. Tags: Language models, computing methodology, NLP, limitations, ethics and design, FAT
The past 3 years of work in NLP have been characterized by the development and deployment of ever larger language models, especially for English. BERT, its variants, GPT-2/3, and others, most recently Switch-C, have pushed the boundaries of the possible both through architectural innovations and through sheer size. Using these pretrained models and the methodology of fine-tuning them for specific tasks, researchers have extended the state of the art on a wide array of tasks as measured by leaderboards on specific benchmarks for English. In this paper, we take a step back and ask: How big is too big? What are the possible risks associated with this technology and what paths are available for mitigating those risks? We provide recommendations including weighing the environmental and financial costs first, investing resources into curating and carefully documenting datasets rather than ingesting everything on the web, carrying out pre-development exercises evaluating how the planned approach fits into research and development goals and supports stakeholder values, and encouraging research directions beyond ever larger language models.

Binder, Jeffrey M. “Romantic Disciplinarity and the Rise of the Algorithm.” Critical Inquiry 46, no. 4 (June 2021): 813-834. https://doi.org/10.1086/709225. Tags: Algorithmic genealogy
Scholars in both digital humanities and media studies have noted an apparent disconnect between computation and the interpretive methods of the humanities. Alan Liu has argued that literary scholars employing digital methods encounter a “meaning problem” due to the difficulty of reconciling algorithmic methods with interpretive ones. Conversely, the media scholar Friedrich Kittler has questioned the adequacy of hermeneutics as a means of studying computers. This paper argues that this disconnect results from a set of contingent decisions made in both humanistic and mathematical disciplines in the first half of the nineteenth century that delineated, with implications that continue to resonate in the present day, which aspects of human activity would come to be formalized in algorithms and which would not. I begin with a discussion of Nicolas de Condorcet, who attempted, at the height of the 1789 revolution, to turn algebra into a universal language; his work, I argue, exemplifies the form of algorithmic thinking that existed before the Romantic turn. Next, I discuss William Wordsworth’s arguments about the relationship of poetry and science. While Wordsworth is sometimes viewed as a critic of science, I argue that his polemic is specifically targeted at highly politicized projects like Condorcet’s that sought to supplant existing modes of thought with scientific rationality. Finally, I demonstrate the importance of Romantic thought for George Boole, creator of the logic system that would eventually form the basis of digital electronics. The reason Boole was able to succeed where Condorcet had failed, I argue, was that Romantic notions of culture enabled him to reconcile a mechanical view of mathematical reasoning with an organic view of the development of meaning—a dichotomy that remains a key assumption of computer interfaces in the twenty-first century.

Birhane, Abeba, Pratyusha Kalluri, Dallas Card, William Agnew, Ravit Dotan, and Michelle Bao. “The Values Encoded in Machine Learning Research.” In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’22), Seoul, Republic of Korea, June 21–24, 2022. New York: Association for Computing Machinery. Tags: Ethics, society, values.
Machine learning currently exerts an outsized influence on the world, increasingly affecting institutional practices and impacted communities. It is therefore critical that we question vague conceptions of the field as value-neutral or universally beneficial, and investigate what specific values the field is advancing. In this paper, we first introduce a method and annotation scheme for studying the values encoded in documents such as research papers. Applying the scheme, we analyze 100 highly cited machine learning papers published at premier machine learning conferences, ICML and NeurIPS. We annotate key features of papers which reveal their values: their justification for their choice of project, which attributes of their project they uplift, their consideration of potential negative consequences, and their institutional affiliations and funding sources. We find that few of the papers justify how their project connects to a societal need (15%) and far fewer discuss negative potential (1%). Through line-by-line content analysis, we identify 59 values that are uplifted in ML research, and, of these, we find that the papers most frequently justify and assess themselves based on Performance, Generalization, Quantitative evidence, Efficiency, Building on past work, and Novelty. We present extensive textual evidence and identify key themes in the definitions and operationalization of these values. Notably, we find systematic textual evidence that these top values are being defined and applied with assumptions and implications generally supporting the centralization of power. Finally, we find increasingly close ties between these highly cited papers and tech companies and elite universities. Code: https://github.com/wagnew3/The-Values-Encoded-in-Machine-Learning-Research

Bolukbasi, Tolga, Kai-Wei Chang, James Y. Zou, Venkatesh Saligrama, and Adam T. Kalai. “Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings.” In Proceedings of Advances in Neural Information Processing Systems 29 (NIPS 2016), Barcelona, Spain, December 2016, 4349–4357. Red Hook, NY: Curran Associates, Inc. Tags: Gender, bias, NLP
The blind application of machine learning runs the risk of amplifying biases present in data. Such a danger is facing us with word embedding, a popular framework to represent text data as vectors which has been used in many machine learning and natural language processing tasks. We show that even word embeddings trained on Google News articles exhibit female/male gender stereotypes to a disturbing extent. This raises concerns because their widespread use, as we describe, often tends to amplify these biases. Geometrically, gender bias is first shown to be captured by a direction in the word embedding. Second, gender neutral words are shown to be linearly separable from gender definition words in the word embedding. Using these properties, we provide a methodology for modifying an embedding to remove gender stereotypes, such as the association between the words receptionist and female, while maintaining desired associations such as between the words queen and female. Using crowd-worker evaluation as well as standard benchmarks, we empirically demonstrate that our algorithms significantly reduce gender bias in embeddings while preserving its useful properties such as the ability to cluster related concepts and to solve analogy tasks. The resulting embeddings can be used in applications without amplifying gender bias.
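For readers unfamiliar with the geometric operation the abstract describes, here is a minimal NumPy sketch of identifying a gender direction and projecting it out of gender-neutral words. The toy vectors and word lists are invented for illustration; this is not the authors' released code, which also includes an equalization step for definitional pairs.

```python
import numpy as np

# Toy embedding: word -> vector (invented; the paper uses word2vec vectors
# trained on Google News).
emb = {
    "he":        np.array([ 0.9, 0.1, 0.3]),
    "she":       np.array([-0.9, 0.1, 0.3]),
    "doctor":    np.array([ 0.4, 0.8, 0.1]),
    "homemaker": np.array([-0.5, 0.6, 0.2]),
}

# 1. Estimate a "gender direction" from a definitional pair (he/she, ...).
g = emb["he"] - emb["she"]
g = g / np.linalg.norm(g)

def neutralize(v, direction):
    """Remove the component of v that lies along the bias direction."""
    return v - np.dot(v, direction) * direction

# 2. Neutralize words that should be gender-neutral (occupations, etc.),
#    leaving definitional words like "he"/"she" untouched.
for word in ["doctor", "homemaker"]:
    emb[word] = neutralize(emb[word], g)

# After neutralization, the neutral words have ~0 projection onto the direction.
print({w: round(float(np.dot(v, g)), 3) for w, v in emb.items()})
```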

Brown, Tom B., Dandelion Mané, Aurko Roy, Martín Abadi, and Justin Gilmer. “Adversarial patch.” Preprint, submitted in May 2018. https://arxiv.org/abs/1712.09665. Tags: Adversariality
We present a method to create universal, robust, targeted adversarial image patches in the real world. The patches are universal because they can be used to attack any scene, robust because they work under a wide variety of transformations, and targeted because they can cause a classifier to output any target class. These adversarial patches can be printed, added to any scene, photographed, and presented to image classifiers; even when the patches are small, they cause the classifiers to ignore the other items in the scene and report a chosen target class.
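A hedged sketch of the step described above: in the paper, the patch pixels are optimized over random scenes, locations, and transformations to maximize a classifier's probability of a target class; the code below only shows how a masked patch is composited into a scene, with all arrays invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def apply_patch(image, patch, center, mask):
    """Composite a masked patch onto an image, centered at `center`.

    image:  (H, W, 3) float array in [0, 1], the scene
    patch:  (P, P, 3) float array in [0, 1], the learned adversarial pattern
    center: (row, col) position of the patch center in the scene
    mask:   (P, P) array in {0, 1} giving the patch's shape (e.g. a disc)
    """
    out = image.copy()
    p = patch.shape[0]
    y, x = center
    ys = slice(y - p // 2, y - p // 2 + p)
    xs = slice(x - p // 2, x - p // 2 + p)
    m = mask[..., None]
    out[ys, xs] = (1 - m) * out[ys, xs] + m * patch
    return out

# Toy example: a random "scene" and a random disc-shaped patch.
scene = rng.random((224, 224, 3))
patch = rng.random((50, 50, 3))
yy, xx = np.mgrid[:50, :50]
mask = ((yy - 24.5) ** 2 + (xx - 24.5) ** 2 <= 24.5 ** 2).astype(float)

patched = apply_patch(scene, patch, center=(112, 112), mask=mask)
print(patched.shape)
```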

Brown, Tom B., Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. “Language Models are Few-Shot Learners.” Preprint, submitted in July 2020. arXiv:2005.14165v4. Tags: NLP, GPT-3, technical paper.
Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.
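As a minimal illustration of the few-shot setting described in the abstract, the sketch below builds a prompt containing the task description and demonstrations as plain text; no gradient update is involved. The task and examples are invented for illustration rather than drawn from the paper.

```python
# Few-shot prompting: the "training examples" live entirely in the prompt text
# and the model's weights are never updated. Task and examples are invented.
task = "Translate English to French:"
examples = [
    "cheese => fromage",
    "house => maison",
    "cat => chat",
]
query = "dog =>"

prompt = task + "\n" + "\n".join(examples) + "\n" + query
print(prompt)
# This string is fed to the language model as ordinary input; whatever the
# model generates next (ideally "chien") is read off as the answer.
```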

Bruder, Johannes. “Where the Sun never Shines: Emerging Paradigms of Post-enlightened Cognition.” Digital Culture & Society 4, no. 1 (June 2018): 133–153. Tags: Cognition, neuroscience, psychology.
In this paper, I elaborate on deliberations of “post-enlightened cognition” between cognitive neuroscience, psychology and artificial intelligence research. I show how the design of machine learning algorithms is entangled with research on creativity and pathology in cognitive neuroscience and psychology through an interest in “episodic memory” and various forms of “spontaneous thought”. The most prominent forms of spontaneous thought – mind wandering and day dreaming – appear when the demands of the environment abate and have for a long time been stigmatized as signs of distraction or regarded as potentially pathological. Recent research in cognitive neuroscience, however, conceptualizes spontaneous thought as serving the purpose of, e. g., creative problem solving and hence invokes older discussions around the links between creativity and pathology. I discuss how attendant attempts at differentiating creative cognition from its pathological forms in contemporary psychology, cognitive neuroscience, and AI put traditional understandings of rationality into question.

Bruder, Johannes and Orit Halpern. “Optimal Brain Damage: Theorizing our Nervous Present.” Culture Machine, vol. 20 (2021). Tags: Cognition, neuroscience, neural networks.
The COVID-19 pandemic has seemingly naturalized the relationship between computation and human survival. Digital systems, at least in the Global North, sustain our supply chains, labor, vaccine development, public health, and virtually every manner of social life. Nowhere has this link become more powerful than at the intersection of statistics, artificial intelligence and disease modelling.

Bucher, Taina. “A technicity of attention: How software ‘makes sense’.” Culture Machine, vol. 13 (2012). Tags: Governmentality, technicity, Facebook, attention
In this essay, I develop an understanding of a technicity of attention in social networking sites. I argue that these sites treat attention not as a property of human cognition exclusively, but rather as a sociotechnical construct that emerges out of the governmental power of software. I take the Facebook platform as a case in point, and analyse key components of the Facebook infrastructure, including its Open Graph protocol, and its ranking and aggregation algorithms, as specific implementations of an attention economy. Here I understand an attention economy in the sense of organising and managing attention within a localised context. My aim is to take a step back from the prolific, anxiety-ridden discourses of attention and the media which have emerged as part of the so-called ‘neurological turn’ (see Carr, 2012; Wolf, 2007). In contrast, this essay focuses on the specific algorithmic and ‘protocological’ mechanisms of Facebook as a proactive means of enabling, shaping and inducing attention, in conjunction with users.

Buckner, Cameron. “Empiricism without magic: Transformational abstraction in deep convolutional neural networks.” Synthese 195, no. 12 (September 2018): 5339–5372. Tags: Abstraction, connectionism, deep learning, convolution, empiricism, nuisance variation
In artificial intelligence, recent research has demonstrated the remarkable potential of Deep Convolutional Neural Networks (DCNNs), which seem to exceed state-of-the-art performance in new domains weekly, especially on the sorts of very difficult perceptual discrimination tasks that skeptics thought would remain beyond the reach of artificial intelligence. However, it has proven difficult to explain why DCNNs perform so well. In philosophy of mind, empiricists have long suggested that complex cognition is based on information derived from sensory experience, often appealing to a faculty of abstraction. Rationalists have frequently complained, however, that empiricists never adequately explained how this faculty of abstraction actually works. In this paper, I tie these two questions together, to the mutual benefit of both disciplines. I argue that the architectural features that distinguish DCNNs from earlier neural networks allow them to implement a form of hierarchical processing that I call “transformational abstraction”. Transformational abstraction iteratively converts sensory-based representations of category exemplars into new formats that are increasingly tolerant to “nuisance variation” in input. Reflecting upon the way that DCNNs leverage a combination of linear and non-linear processing to efficiently accomplish this feat allows us to understand how the brain is capable of bi-directional travel between exemplars and abstractions, addressing longstanding problems in empiricist philosophy of mind. I end by considering the prospects for future research on DCNNs, arguing that rather than simply implementing 80s connectionism with more brute-force computation, transformational abstraction counts as a qualitatively distinct form of processing ripe with philosophical and psychological significance, because it is significantly better suited to depict the generic mechanism responsible for this important kind of psychological processing in the brain.
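The alternation of linear and non-linear processing that Buckner describes can be made concrete with a toy convolutional network. The PyTorch sketch below is an illustrative assumption, not a model from the paper.

```python
import torch
import torch.nn as nn

# Each block pairs a linear operation (convolution) with non-linear ones
# (ReLU and max-pooling); stacking blocks yields representations that are
# progressively more tolerant to nuisance variation such as translation.
class TinyDCNN(nn.Module):
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # linear
            nn.ReLU(),                                     # non-linear
            nn.MaxPool2d(2),                               # discards exact position
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, n_classes)

    def forward(self, x):
        h = self.features(x)                  # (B, 32, 8, 8) for 32x32 inputs
        return self.classifier(h.flatten(1))

logits = TinyDCNN()(torch.randn(1, 3, 32, 32))  # toy 32x32 RGB input
print(logits.shape)
```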

Buolamwini, Joy and Timnit Gebru. “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification.” In Proceedings of the 1st Conference on Fairness, Accountability and Transparency, New York, February 2018, 1–15. Proceedings of Machine Learning Research 81. Tags: Social questions, gender, race, computer vision
Recent studies demonstrate that machine learning algorithms can discriminate based on classes like race and gender. In this work, we present an approach to evaluate bias present in automated facial analysis algorithms and datasets with respect to phenotypic subgroups. Using the dermatologist approved Fitzpatrick Skin Type classification system, we characterize the gender and skin type distribution of two facial analysis benchmarks, IJB-A and Adience. We find that these datasets are overwhelmingly composed of lighter-skinned subjects (79.6% for IJB-A and 86.2% for Adience) and introduce a new facial analysis dataset which is balanced by gender and skin type. We evaluate 3 commercial gender classification systems using our dataset and show that darker-skinned females are the most misclassified group (with error rates of up to 34.7%). The maximum error rate for lighter-skinned males is 0.8%. The substantial disparities in the accuracy of classifying darker females, lighter females, darker males, and lighter males in gender classification systems require urgent attention if commercial companies are to build genuinely fair, transparent and accountable facial analysis algorithms.
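The audit's core computation is a disaggregated error rate per intersectional subgroup. A minimal pandas sketch follows, with made-up records standing in for the benchmark data; column names and values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical audit records: one row per face image, with its intersectional
# group, the true label, and a commercial classifier's prediction.
df = pd.DataFrame({
    "group": ["darker_female", "darker_female", "darker_male",
              "lighter_female", "lighter_male", "lighter_male"],
    "true": ["F", "F", "M", "F", "M", "M"],
    "predicted": ["M", "F", "M", "F", "M", "M"],
})

df["error"] = (df["true"] != df["predicted"]).astype(float)

# Disaggregated (per-subgroup) error rates, the quantity the audit reports.
print(df.groupby("group")["error"].mean().sort_values(ascending=False))
```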

Burrell, Jenna. “How the machine ‘thinks’: Understanding opacity in machine learning algorithms.” Big Data & Society 3, no. 1 (June 2016). doi:10.1177/2053951715622512. Tags: Opacity, algorithmic thought.
This article considers the issue of opacity as a problem for socially consequential mechanisms of classification and ranking, such as spam filters, credit card fraud detection, search engines, news trends, market segmentation and advertising, insurance or loan qualification, and credit scoring. These mechanisms of classification all frequently rely on computational algorithms, and in many cases on machine learning algorithms to do this work. In this article, I draw a distinction between three forms of opacity: (1) opacity as intentional corporate or state secrecy, (2) opacity as technical illiteracy, and (3) an opacity that arises from the characteristics of machine learning algorithms and the scale required to apply them usefully. The analysis in this article gets inside the algorithms themselves. I cite existing literatures in computer science, known industry practices (as they are publicly presented), and do some testing and manipulation of code as a form of lightweight code audit. I argue that recognizing the distinct forms of opacity that may be coming into play in a given application is a key to determining which of a variety of technical and non-technical solutions could help to prevent harm.

Bzdok, Danilo, Naomi Altman, and Martin Krzywinski. “Statistics versus machine learning.” Nature Methods 15, no. 4 (April 2018): 233–234. Tags: Machine learning, statistics
Many methods from statistics and machine learning (ML) may, in principle, be used for both prediction and inference. However, statistical methods have a long-standing focus on inference, which is achieved through the creation and fitting of a project-specific probability model. The model allows us to compute a quantitative measure of confidence that a discovered relationship describes a ‘true’ effect that is unlikely to result from noise. Furthermore, if enough data are available, we can explicitly verify assumptions (e.g., equal variance) and refine the specified model, if needed. By contrast, ML concentrates on prediction by using general-purpose learning algorithms to find patterns in often rich and unwieldy data. ML methods are particularly helpful when one is dealing with ‘wide data’, where the number of input variables exceeds the number of subjects, in contrast to ‘long data’, where the number of subjects is greater than that of input variables. ML makes minimal assumptions about the data-generating systems; they can be effective even when the data are gathered without a carefully controlled experimental design and in the presence of complicated nonlinear interactions. However, despite convincing prediction results, the lack of an explicit model can make ML solutions difficult to directly relate to existing biological knowledge.
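The contrast the authors draw between inference and prediction can be illustrated with a short sketch: an explicit probability model fitted for coefficients and confidence intervals versus a flexible learner used only to predict. The simulated data and library choices (statsmodels, scikit-learn) are assumptions for illustration, not from the article.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Simulated data: 200 subjects, 5 predictors, outcome depends on the first two.
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200) > 0).astype(int)

# Statistical route: fit an explicit probability model and read off
# coefficients, confidence intervals, and p-values (inference).
logit = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
print(logit.summary())

# ML route: fit a flexible learner and use it purely for prediction.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print(rf.predict_proba(X[:5])[:, 1])  # predicted probabilities, no p-values
```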

Castelle, Michael. "The Linguistic Ideologies of Deep Abusive Language Classification." In Proceedings of Empirical Methods in Natural Language Processing (EMNLP) Abusive Language Workshop, Brussels, Belgium, November 2018, 160-170. Association for Computational Linguistics. Tags: NLP, sociolinguistics, pragmatics
This paper brings together theories from sociolinguistics and linguistic anthropology to critically evaluate the so-called “language ideologies” — the set of beliefs and ways of speaking about language—in the practices of abusive language classification in modern machine learning-based NLP. This argument is made at both a conceptual and empirical level, as we review approaches to abusive language from different fields, and use two neural network methods to analyze three datasets developed for abusive language classification tasks (drawn from Wikipedia, Facebook, and StackOverflow). By evaluating and comparing these results, we argue for the importance of incorporating theories of pragmatics and metapragmatics into both the design of classification tasks as well as in ML architectures.

Castelle, Michael. “Deep Learning as an Epistemic Ensemble.” · Castelle.org. Last modified September 15, 2018. https://castelle.org/pages/deep-learning-as-an-epistemic-ensemble.html. Tags: deep learning, epistemology
The research area of deep learning undergirding the current explosion of interest in artificial intelligence is, as with many fast-growing fields of study, torn between two narratives: the first a familiar language of revolution and disruption (in this case with an added dose of born-again millenarianism), the other a more restrained (and sometimes dismissive) perspective which often uses as its foil the most hysterical proponents of the former. Are we at the cusp of a revolution in automation and “superintelligence”? Or are these techniques merely the temporary best-in-class performers in the present iteration of machine learning research? Alternatively, once these techniques are understood by a broader variety of thinkers, will there be a conceptual impact from deep learning in the social sciences and humanities? What will be its scale and scope?

Castelle, Michael. “Social Theory for Generative Networks (and Vice Versa).” Castelle.org, Last modified September 28, 2018. https://castelle.org/pages/social-theory-for-generative-networks-and-vice-versa.html Tags: GANs, semiotics, NLP, habitus
Similarly, Mark Zuckerberg’s April 2018 testimony to the U.S. Senate reveals a misrecognition between his company’s ability to detect terrorist propaganda (perhaps identifiable by stereotyped iconography and particular keywords/phrases) and the potential to automatically detect online hate speech, which, according to Zuckerberg, his “A.I. tools” currently find challenging because of their “nuances”. From Zuckerberg’s perspective, with the right training data, specialized recurrent neural network architectures, and an appropriately-defined loss function, it will eventually be possible to efficiently separate the hateful wheat from the chaff. To believe this, however, is to mistake instances of abusive language — as perhaps computer vision practitioners mistake impressionist paintings — for artifacts whose meaningfulness exists independently of their social and interpretative contexts; and it is clear that as machine learning techniques and methodology become increasingly immersed in our everyday social and semiotic lives, ML/DL practitioners will come to need to recognize the relationships between their interventions and more complex understandings of aesthetic and linguistic meaning.

Castelle, Michael. “The Social Lives of Generative Adversarial Networks.” In Proceedings of Conference on Fairness, Accountability, and Transparency (FAT* ’20), Barcelona, Spain, January 2020, 413-425. New York, NY: Association for Computational Machinery. https://doi.org/10.1145/3351095.3373156. Tags: GANs, sociological theory, habitus, social learning, FAT
Generative adversarial networks (GANs) are a genre of deep learning model of significant practical and theoretical interest for their facility in producing photorealistic ‘fake’ images which are plausibly similar, but not identical, to a corpus of training data. But from the perspective of a sociologist, the distinctive architecture of GANs is highly suggestive. First, a convolutional neural network for classification, on its own, is (at present) popularly considered to be an ‘AI’; and a generative neural network is a kind of inversion of such a classification network (i.e. a layered transformation from a vector of numbers to an image, as opposed to a transformation from an image to a vector of numbers). If, then, in the training of GANs, these two ‘AIs’ interact with each other in a dyadic fashion, shouldn’t we consider that form of learning... social? This observation can lead to some surprising associations as we compare and contrast GANs with the theories of the sociologist Pierre Bourdieu, whose concept of the so-called habitus is one which is simultaneously cognitive and social: a productive perception in which classification practices and practical action cannot be fully disentangled. Significantly, Bourdieu used this habitus concept to help explain the reproduction of social stratification in both education and the arts. In the case of learning, Bourdieu showed how educational institutions promote inequality in the name of fairness and meritocracy through the valorization of elite forms of ‘symbolic capital’; and in the arts, he often focused on the disruptive transitions in 19th-century French painting from realism to impressionism. These latter avant-garde movements were often characterized by a stylistic detachment from economic capital, as “art for art’s sake”, and this cultural rejection of objective-maximization—a kind of denial of an aesthetic ‘loss function’—can in turn help highlight a profound paradox at the core of contemporary machine learning research.

Christodoulou, Evangelia, Jie Ma, Gary S. Collins, Ewout W. Steyerberg, Jan Y. Verbakel, and Ben Van Calster. “A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models.” Journal of Clinical Epidemiology 110 (2019): 12–22. Tags: Model selection, complexity, technical paper
Objectives: The objective of this study was to compare performance of logistic regression (LR) with machine learning (ML) for clinical prediction modeling in the literature.
Study Design and Setting: We conducted a Medline literature search (1/2016 to 8/2017) and extracted comparisons between LR and ML models for binary outcomes.
Results: We included 71 of 927 studies. The median sample size was 1,250 (range 72-3,994,872), with 19 predictors considered (range 5-563) and eight events per predictor (range 0.3-6,697). The most common ML methods were classification trees, random forests, artificial neural networks, and support vector machines. In 48 (68%) studies, we observed potential bias in the validation procedures. Sixty-four (90%) studies used the area under the receiver operating characteristic curve (AUC) to assess discrimination. Calibration was not addressed in 56 (79%) studies. We identified 282 comparisons between an LR and ML model (AUC range, 0.52-0.99). For 145 comparisons at low risk of bias, the difference in logit(AUC) between LR and ML was 0.00 (95% confidence interval, -0.18 to 0.18). For 137 comparisons at high risk of bias, logit(AUC) was 0.34 (0.20-0.47) higher for ML.
Conclusion: We found no evidence of superior performance of ML over LR. Improvements in methodology and reporting are needed for studies that compare modeling algorithms.
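As a sketch of the kind of comparison the review aggregates, the code below computes cross-validated AUC for logistic regression and a machine learning model on the same data. The synthetic dataset is an assumption for illustration, not data from any reviewed study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic binary-outcome data standing in for a clinical dataset.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=300, random_state=0),
}

# Cross-validated AUC is the discrimination measure most reviewed studies use;
# calibration would require a separate check.
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=10, scoring="roc_auc")
    print(f"{name}: AUC = {auc.mean():.3f} +/- {auc.std():.3f}")
```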

Cho, Kyunghyun, Bart van Merrienboer, Dzmitry Bahdanau, and Yoshua Bengio. "On the Properties of Neural Machine Translation: Encoder–Decoder Approaches." Preprint, submitted in October 2014. arXiv:1409.1259. Tags: Neural Machine Translation, technical paper
Neural machine translation is a relatively new approach to statistical machine translation based purely on neural networks. The neural machine translation models often consist of an encoder and a decoder. The encoder extracts a fixed-length representation from a variable-length input sentence, and the decoder generates a correct translation from this representation. In this paper, we focus on analyzing the properties of the neural machine translation using two models; RNN Encoder--Decoder and a newly proposed gated recursive convolutional neural network. We show that the neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase. Furthermore, we find that the proposed gated recursive convolutional network learns a grammatical structure of a sentence automatically.
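A minimal PyTorch sketch of the encoder-decoder idea: the encoder compresses a variable-length source sentence into one fixed-length vector, and the decoder generates the target conditioned on it. This is a generic GRU illustration, not the RNN Encoder-Decoder or gated recursive convolutional model analyzed in the paper; all vocabulary sizes and dimensions are arbitrary.

```python
import torch
import torch.nn as nn

SRC_VOCAB, TGT_VOCAB, EMB, HID = 1000, 1000, 64, 128

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(SRC_VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)

    def forward(self, src):                  # src: (B, T_src) token ids
        _, h = self.rnn(self.emb(src))       # h: (1, B, HID) fixed-length summary
        return h

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(TGT_VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, TGT_VOCAB)

    def forward(self, tgt, h):               # tgt: (B, T_tgt) shifted target ids
        o, _ = self.rnn(self.emb(tgt), h)
        return self.out(o)                   # (B, T_tgt, TGT_VOCAB) logits

src = torch.randint(0, SRC_VOCAB, (2, 7))    # toy batch of source sentences
tgt = torch.randint(0, TGT_VOCAB, (2, 5))
logits = Decoder()(tgt, Encoder()(src))
print(logits.shape)
```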

Coenen, Andy, Emily Reif, Ann Yuan, Been Kim, Adam Pearce, Fernanda Viégas, and Martin Wattenberg. “Visualizing and Measuring the Geometry of BERT.” Preprint, submitted in October 2019. arXiv:1906.02715. Tags: NLP, BERT, large language models, technical paper
Transformer architectures show significant promise for natural language processing. Given that a single pretrained model can be fine-tuned to perform well on many different tasks, these networks appear to extract generally useful linguistic features. A natural question is how such networks represent this information internally. This paper describes qualitative and quantitative investigations of one particularly effective model, BERT. At a high level, linguistic features seem to be represented in separate semantic and syntactic subspaces. We find evidence of a fine-grained geometric representation of word senses. We also present empirical descriptions of syntactic representations in both attention matrices and individual word embeddings, as well as a mathematical argument to explain the geometry of these representations.
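The objects whose geometry the paper studies, per-layer hidden states and attention matrices, can be extracted from a pretrained BERT with the Hugging Face transformers library. A minimal sketch (it downloads the bert-base-uncased weights on first run):

```python
import torch
from transformers import BertModel, BertTokenizer

# Ask the model to return per-layer hidden states and attention matrices.
tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased",
                                  output_hidden_states=True,
                                  output_attentions=True)
model.eval()

inputs = tok("The bank raised interest rates.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

print(len(out.hidden_states), out.hidden_states[-1].shape)  # 13 layers, (1, T, 768)
print(len(out.attentions), out.attentions[0].shape)         # 12 layers, (1, 12, T, T)
```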

Covington, Paul, Jay Adams, and Emre Sargin. “Deep Neural Networks for YouTube Recommendations.” In Proceedings of the 10th ACM Conference on Recommender Systems (RecSys '16), Boston, MA, September 2016. New York: Association for Computing Machinery. Tags: Deep learning, recommender systems, YouTube, technical paper
YouTube represents one of the largest scale and most sophisticated industrial recommendation systems in existence. In this paper, we describe the system at a high level and focus on the dramatic performance improvements brought by deep learning. The paper is split according to the classic two-stage information retrieval dichotomy: first, we detail a deep candidate generation model and then describe a separate deep ranking model. We also provide practical lessons and insights derived from designing, iterating and maintaining a massive recommendation system with enormous user-facing impact.
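A rough NumPy sketch of the two-stage structure described above: a cheap candidate-generation step retrieves a few hundred items by dot product, and a ranking step rescores only those candidates. All embeddings and the stand-in ranking function are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1 -- candidate generation: embed users and items in a shared space and
# retrieve the few hundred items with the highest dot-product score.
n_items, dim = 100_000, 32
item_emb = rng.normal(size=(n_items, dim))
user_emb = rng.normal(size=dim)

scores = item_emb @ user_emb
candidates = np.argpartition(-scores, 200)[:200]     # ~200 best candidates

# Stage 2 -- ranking: score only the candidates with a richer model (here a
# fake stand-in for a deep ranking network).
def ranking_model(user, item_ids):
    return item_emb[item_ids] @ user + rng.normal(scale=0.1, size=len(item_ids))

ranked = candidates[np.argsort(-ranking_model(user_emb, candidates))]
print(ranked[:10])  # final top-10 recommendations
```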

Cowgill, Bo. “The impact of algorithms on judicial discretion: Evidence from regression discontinuities.” NBER working paper, National Bureau of Economic Research, Cambridge, MA, 2018. Tags: Social questions, carcerality, law, recidivism
How do judges use algorithmic suggestions in criminal proceedings? I study bail-setting in criminal cases in Broward County Florida, where judges are provided predictions of defendants’ recidivism using an algorithm derived from historical data. The algorithm’s output is continuous, but is shared with judges in rounded buckets (low, medium and high). Using the underlying continuous score, I examine judicial decisions close to the thresholds using a regression discontinuity design. Defendants slightly above the thresholds are detained an average extra one to four weeks before trial, depending on the threshold. Black defendants’ outcomes are more sensitive to the thresholds than white defendants. When I link jail decisions to outcomes, I find that the extra jail-time given to defendants above the thresholds corresponds to a small increase in recidivism within two years. These results suggest that algorithmic suggestions have a causal impact on criminal proceedings and recidivism.

Crowe, Simon. “Micropolitics of a Recommender System - Machine Learning and the Machinic Unconscious.” spheres: A Journal for Digital Cultures 5 (November 2019). Tags: Micropolitics, recommender systems, algorithmic unconscious. | With Crowe (2019)--Source Code
In this text I set out to critically examine part of the source code of the recommender system LightFM. To this end I deploy the micropolitics developed by Gilles Deleuze and Félix Guattari, as well as Andrew Goffey’s and Maurizio Lazzarato’s readings of their micropolitical ideas. I build an argument around Guattari’s suggestion that subjectivity is not solely the product of human brains and bodies, and that the technical machines of computation intersperse with what might be thought of as human in the production of subjectivity. Drawing upon contemporary approaches to the nonhuman, machine learning and planetary-scale computation, I develop a framework that situates the recommender system in assemblages of self-ordering matter and links it to historical practices of control through tabulation. While I acknowledge the power of source code in that it always carries the potential for control, in this reading, I impute greater agency to computation. In what follows, rather than reducing it to an algorithm, I attempt to address the recommender system as manifold: a producer of subjectivity, a resident of planet-spanning cloud computing infrastructures, a conveyor of inscrutable semiotics and a site of predictive control.

Crowe, Simon. “Micropolitics of a Recommender System - Source Code.” spheres: A Journal for Digital Cultures 5 (November 2019). Tags: Micropolitics, recommender systems. | With Crowe (2019)
This text aims to explain some of the source code of the open source recommender system LightFM. This piece of software was originally developed by Maciej Kula while working as a data scientist for the online fashion store Lyst, which aggregates millions of products from across the web. It was written with the aim of recommending fashion products from this vast catalogue to users with few or no prior interactions with Lyst. At the time of writing, LightFM is still under active development by Kula with minor contributions from 17 other developers over the past three years. The repository is moderately popular, having been starred by 2,032 GitHub users; 352 users have forked the repository, creating their own version of it that they can modify. Users have submitted 233 issues, such as error reports and feature requests to LightFM over the course of its existence, which suggests a modest but active community of users. To put these numbers in perspective, the most popular machine learning framework on GitHub, Tensorflow, has been starred 113,569 times and forked 69,233 times with 14,306 issues submitted. While the theoretical text that accompanies this one addresses aspects of machine learning in LightFM, none of the source code quoted here actually does any machine learning. Instead, the examples here are chosen to demonstrate how an already trained model is processed so as to arrive at recommendations. The machine learning aspect of LightFM can be briefly explicated using Tom Mitchell’s much-cited definition: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.” Task T, in this case, is recommending products to users that they are likely to buy. E is historical data on interactions between users and products as well as metadata about those users and products. P is the extent to which the model’s predictions match actual historical user-item interactions.
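For orientation, the "trained model to recommendations" step that this text walks through can be reproduced with LightFM's public Python API. The sketch below trains a small model on the bundled MovieLens data rather than Lyst's catalogue, and the helper function is ours, not part of LightFM.

```python
import numpy as np
from lightfm import LightFM
from lightfm.datasets import fetch_movielens

# Train a small model, then score every item for one user and rank the scores.
data = fetch_movielens(min_rating=4.0)
model = LightFM(loss="warp")
model.fit(data["train"], epochs=10, num_threads=2)

def recommend(model, user_id, n=5):
    n_items = data["train"].shape[1]
    scores = model.predict(user_id, np.arange(n_items))   # one score per item
    top = np.argsort(-scores)[:n]                          # highest-scored items
    return data["item_labels"][top]

print(recommend(model, user_id=3))
```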

Das, Sauvik. "Subversive AI: Resisting automated algorithmic surveillance with human-centered adversarial machine learning." In Proceedings of the Resistance AI Workshop at NeurIPS 2020, Virtual Event, December 2020. Tags: Subversive AI, Adversariality, human agency, surveillance.
How can we balance the power dynamics of AI in favor of everyday Internet users, particularly those from populations who are disproportionately harmed by automated algorithmic surveillance? I argue that, when advanced from a human-centered perspective, "adversarial" machine learning (AML) can make way for "subversive" AI (SAI). The goal of SAI is to empower end-users with usable obfuscation technologies that protect the content they share online against automated algorithmic surveillance without affecting how that content is consumed by the intended human audience. SAI employs a human-centered design process that spans three phases of work: (i) modeling lived threats; (ii) exploratory co-design; and, (iii) implementation with human-centered evaluations. I outline a research agenda for Subversive AI to help orient interested researchers and practitioners.

Dancy, Christopher L. and P. Khalil Saucier. “AI & Blackness: Towards moving beyond bias and representation.” IEEE Transactions on Technology and Society 3, no. 1 (March 2022): 31-40. Tags: Antiblackness, bias, representation.
In this paper, we argue that AI ethics must move beyond the concepts of race-based representation and bias, and towards those that probe the deeper relations that impact how these systems are designed, developed, and deployed. Many recent discussions on ethical considerations of bias in AI systems have centered on racial bias. We contend that antiblackness in AI requires more of an examination of the ontological space that provides a foundation for the design, development, and deployment of AI systems. We examine what this contention means from the perspective of the sociocultural context in which AI systems are designed, developed, and deployed and focus on intersections with anti-Black racism (antiblackness). To bring these multiple perspectives together and show an example of antiblackness in the face of attempts at de-biasing, we discuss results from auditing an existing open-source semantic network (ConceptNet). We use this discussion to further contextualize antiblackness in design, development, and deployment of AI systems and suggest questions one may ask when attempting to combat antiblackness in AI systems.

Deng, Jia, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Fei-Fei Li. “ImageNet: a Large-Scale Hierarchical Image Database.” In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Miami, Florida, June 2009, 248-255. Tags: Computer Vision, Deep Learning, Foundation models, databases, technical paper
The explosion of image data on the Internet has the potential to foster more sophisticated and robust models and algorithms to index, retrieve, organize and interact with images and multimedia data. But exactly how such data can be harnessed and organized remains a critical problem. We introduce here a new database called “ImageNet”, a large-scale ontology of images built upon the backbone of the WordNet structure. ImageNet aims to populate the majority of the 80,000 synsets of WordNet with an average of 500-1000 clean and full resolution images. This will result in tens of millions of annotated images organized by the semantic hierarchy of WordNet. This paper offers a detailed analysis of ImageNet in its current state: 12 subtrees with 5247 synsets and 3.2 million images in total. We show that ImageNet is much larger in scale and diversity and much more accurate than the current image datasets. Constructing such a large-scale database is a challenging task. We describe the data collection scheme with Amazon Mechanical Turk. Lastly, we illustrate the usefulness of ImageNet through three simple applications in object recognition, image classification and automatic object clustering. We hope that the scale, accuracy, diversity and hierarchical structure of ImageNet can offer unparalleled opportunities to researchers in the computer vision community and beyond.

Denton, Emily, Alex Hanna, Razvan Amironesei, Andrew Smart, and Hilary Nicole. “On the Genealogy of Machine Learning Datasets: A Critical History of ImageNet.” Big Data & Society 8, no. 2 (July 2021): 1-14. https://doi.org/10.1177/20539517211035955. Tags: Genealogy, ethics, algorithmic fairness.
In response to growing concerns of bias, discrimination, and unfairness perpetuated by algorithmic systems, the datasets used to train and evaluate machine learning models have come under increased scrutiny. Many of these examinations have focused on the contents of machine learning datasets, finding glaring underrepresentation of minoritized groups. In contrast, relatively little work has been done to examine the norms, values, and assumptions embedded in these datasets. In this work, we conceptualize machine learning datasets as a type of informational infrastructure, and motivate a genealogy as method in examining the histories and modes of constitution at play in their creation. We present a critical history of ImageNet as an exemplar, utilizing critical discourse analysis of major texts around ImageNet’s creation and impact. We find that assumptions around ImageNet and other large computer vision datasets more generally rely on three themes: the aggregation and accumulation of more data, the computational construction of meaning, and making certain types of data labor invisible. By tracing the discourses that surround this influential benchmark, we contribute to the ongoing development of the standards and norms around data development in machine learning and artificial intelligence research.

Dobbe, Roel, Sarah Dean, Thomas Gilbert, and Nitin Kohli. “A Broader View on Bias in Automated Decision-Making: Reflecting on Epistemology and Dynamics.” Preprint, submitted in July 2018. arXiv:1807.00553. Tags: Epistemology, bias, design.
Machine learning (ML) is increasingly deployed in real world contexts, supplying actionable insights and forming the basis of automated decision-making systems. While issues resulting from biases pre-existing in training data have been at the center of the fairness debate, these systems are also affected by technical and emergent biases, which often arise as context-specific artifacts of implementation. This position paper interprets technical bias as an epistemological problem and emergent bias as a dynamical feedback phenomenon. In order to stimulate debate on how to change machine learning practice to effectively address these issues, we explore this broader view on bias, stress the need to reflect on epistemology, and point to value-sensitive design methodologies to revisit the design and implementation process of automated decision-making systems.

Dodge, Jesse, Suchin Gururangan, Dallas Card, Roy Schwartz, and Noah A. Smith. “Show your work: Improved reporting of experimental results.” In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, Hong Kong, China, November 2019, 2185–2194. Association for Computational Linguistics. Tags: NLP, methodology, evaluation
Research in natural language processing proceeds, in part, by demonstrating that new models achieve superior performance (e.g., accuracy) on held-out test data, compared to previous results. In this paper, we demonstrate that test-set performance scores alone are insufficient for drawing accurate conclusions about which model performs best. We argue for reporting additional details, especially performance on validation data obtained during model development. We present a novel technique for doing so: expected validation performance of the best-found model as a function of computation budget (i.e., the number of hyperparameter search trials or the overall training time). Using our approach, we find multiple recent model comparisons where authors would have reached a different conclusion if they had used more (or less) computation. Our approach also allows us to estimate the amount of computation required to obtain a given accuracy; applying it to several recently published results yields massive variation across papers, from hours to weeks. We conclude with a set of best practices for reporting experimental results which allow for robust future comparisons, and provide code to allow researchers to use our technique.
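The abstract’s central quantity, expected validation performance as a function of the number of hyperparameter trials, can be estimated from a set of observed validation scores. The sketch below follows the standard expected-maximum calculation under the empirical distribution of scores; it illustrates the idea and is not necessarily the exact estimator released with the paper.

```python
# Hedged sketch: expected best validation score after n hyperparameter trials,
# estimated from N observed scores via the empirical distribution. This
# illustrates the quantity described in the abstract; details may differ from
# the paper's own code.
import numpy as np

def expected_max_performance(val_scores, n):
    """E[max of n i.i.d. draws] from the empirical distribution of val_scores."""
    v = np.sort(np.asarray(val_scores, dtype=float))   # ascending order
    N = len(v)
    cdf = np.arange(1, N + 1) / N                      # P(score <= v[i])
    prev = np.arange(0, N) / N
    weights = cdf**n - prev**n                         # P(max falls exactly on v[i])
    return float(np.sum(weights * v))

# Hypothetical validation accuracies from 20 random-search trials.
scores = np.random.default_rng(0).uniform(0.70, 0.85, size=20)
for n in (1, 5, 10, 20):
    print(n, round(expected_max_performance(scores, n), 4))
```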

Doshi-Velez, Finale and Been Kim. “Towards a rigorous science of interpretable machine learning.” Preprint, submitted in March 2017. arXiv:1702.08608, 2017. Tags: Black boxes
As machine learning systems become ubiquitous, there has been a surge of interest in interpretable machine learning: systems that provide explanation for their outputs. These explanations are often used to qualitatively assess other criteria such as safety or non-discrimination. However, despite the interest in interpretability, there is very little consensus on what interpretable machine learning is and how it should be measured. In this position paper, we first define interpretability and describe when interpretability is needed (and when it is not). Next, we suggest a taxonomy for rigorous evaluation and expose open questions towards a more rigorous science of interpretable machine learning.

Elsayed, Gamaleldin F., Shreya Shankar, Brian Cheung, Nicolas Papernot, Alexey Kurakin, Ian Goodfellow, and Jascha Sohl-Dickstein. “Adversarial examples that fool both computer vision and time-limited humans.” In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montreal, Canada, December 2018, 3914–3924. Red Hook, NY: Curran Associates, Inc. Tags: Adversariality, humans vs. computers, computer vision.
Machine learning models are vulnerable to adversarial examples: small changes to images can cause computer vision models to make mistakes such as identifying a school bus as an ostrich. However, it is still an open question whether humans are prone to similar mistakes. Here, we address this question by leveraging recent techniques that transfer adversarial examples from computer vision models with known parameters and architecture to other models with unknown parameters and architecture, and by matching the initial processing of the human visual system. We find that adversarial examples that strongly transfer across computer vision models influence the classifications made by time-limited human observers.

Ernst, Christoph, Jens Schröter, and Andreas Sudmann. “AI and the Imagination to Overcome Difference.” Spheres: A Journal for Digital Cultures 5 (November 2019). Tags: sociotechnical imaginaries, idealizations, imaginations, positivism. | With Ganesh (2019)
The history of AI is essentially characterised by high expectations. Much has been written about these expectations and the disappointments they result in. This is due to the fact that future-oriented ideas of what is technically feasible have always been closely related to the ways in which (popular) culture has been imagining different applications of AI. Maybe we are now for the first time confronted with the historical situation in which the divide between AI as science fiction and AI as empirical research has become so minimal that it is no longer an easy task to distinguish both realms. With this contribution, we seek to demonstrate that the high expectations, manifesting both in the historical research as well as in the imagination of AI in (popular) culture, share a substantial similarity: various “sociotechnical imaginaries” to overcome difference.
Given the rapid development of diverse social applications of AI-based technologies, this essay aims to discuss how idealisations of AI as a ‘universal’ technology in key fields of current debates mirror the imaginations (or even phantasms) of overcoming social and cultural differences in particular, and the difference between humans and machines in general. In the following, we give an overview of the concept of a universal translation of language, the idea of machines erasing the difference to human labour, and discuss the notion of ‘autonomy’ in debates on autonomous weapons systems. Of course, the articulation of overcoming difference varies in all of these scenarios. We believe, however, that there are significant similarities and relations between those articulations that reveal important aspects of how AI has been and continues to be imagined and explored.

Eykholt, Kevin, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song. “Robust Physical-World Attacks on Deep Learning Models.” Preprint, submitted April 2018. arXiv:1707.08945. Tags: Adversariality, Deep neural networks, technical paper
Recent studies show that the state-of-the-art deep neural networks (DNNs) are vulnerable to adversarial examples, resulting from small-magnitude perturbations added to the input. Given that emerging physical systems are using DNNs in safety-critical situations, adversarial examples could mislead these systems and cause dangerous situations. Therefore, understanding adversarial examples in the physical world is an important step towards developing resilient learning algorithms. We propose a general attack algorithm, Robust Physical Perturbations (RP2), to generate robust visual adversarial perturbations under different physical conditions. Using the real-world case of road sign classification, we show that adversarial examples generated using RP2 achieve high targeted misclassification rates against standard-architecture road sign classifiers in the physical world under various environmental conditions, including viewpoints. Due to the current lack of a standardized testing method, we propose a two-stage evaluation methodology for robust physical adversarial examples consisting of lab and field tests. Using this methodology, we evaluate the efficacy of physical adversarial manipulations on real objects. With a perturbation in the form of only black and white stickers, we attack a real stop sign, causing targeted misclassification in 100% of the images obtained in lab settings, and in 84.8% of the captured video frames obtained on a moving vehicle (field test) for the target classifier.

Fazi, M. Beatrice. “Introduction: Algorithmic Thought.” Theory, Culture & Society 38, no. 7-8 (December 2021): 5-11. doi:10.1177/02632764211054122. Tags: Epistemology, cognition, algorithmic thought.
In our contemporary moment, when machine learning algorithms are reshaping many aspects of society, the work of N. Katherine Hayles stands as a powerful corpus for understanding what is at stake in a new regime of computation. A renowned literary theorist whose work bridges the humanities and sciences, Hayles has, among her many works, detailed ways to think about embodiment in an age of virtuality (How We Became Posthuman, 1999), how code as performative practice is located (My Mother Was a Computer, 2005), and the reciprocal relations among human bodies and technics (How We Think, 2012). This special issue follows the 2017 publication of her book Unthought: The Power of the Cognitive Nonconscious, in which Hayles traces the nonconscious cognition of biological life-forms and computational media. The articles in the special issue respond in different ways to Hayles’ oeuvre, mapping the specific contours of computational regimes and developing some of the ‘inflection points’ she advocates in the deep engagement with technical systems.

Fazi, M. Beatrice. “Beyond Human: Deep Learning, Explainability and Representation.” Theory, Culture & Society 38, no. 7-8 (December 2021): 55-77. doi:10.1177/0263276420966386. Tags: Algorithmic thought, deep neural networks, explainability, interpretability.
This article addresses computational procedures that are no longer constrained by human modes of representation and considers how these procedures could be philosophically understood in terms of ‘algorithmic thought’. Research in deep learning is its case study. This artificial intelligence (AI) technique operates in computational ways that are often opaque. Such a black-box character demands rethinking the abstractive operations of deep learning. The article does so by entering debates about explainability in AI and assessing how technoscience and technoculture tackle the possibility to ‘re-present’ the algorithmic procedures of feature extraction and feature learning to the human mind. The article thus mobilises the notion of incommensurability (originally developed in the philosophy of science) to address explainability as a communicational and representational issue, which challenges phenomenological and existential modes of comparison between human and algorithmic ‘thinking’ operations.

Finlayson, Samuel G., John D. Bowers, Joichi Ito, Jonathan L. Zittrain, Andrew L. Beam, and Isaac S. Kohane. “Adversarial attacks on medical machine learning.” Science 363, no. 6433 (March 2019): 1287–1289. Tags: Adversariality, medicine
With public and academic attention increasingly focused on the new role of machine learning in the health information economy, an unusual and no-longer-esoteric category of vulnerabilities in machine-learning systems could prove important. These vulnerabilities allow a small, carefully designed change in how inputs are presented to a system to completely alter its output, causing it to confidently arrive at manifestly wrong conclusions. These advanced techniques to subvert otherwise-reliable machine-learning systems—so-called adversarial attacks—have, to date, been of interest primarily to computer science researchers (1). However, the landscape of often-competing interests within health care, and billions of dollars at stake in systems’ outputs, implies considerable problems. We outline motivations that various players in the health care system may have to use adversarial attacks and begin a discussion of what to do about them. Far from discouraging continued innovation with medical machine learning, we call for active engagement of medical, technical, legal, and ethical experts in pursuit of efficient, broadly available, and effective health care that machine learning will enable.

Fradkov, Alexander L. “Early History of Machine Learning.” IFAC-PapersOnLine 53, no. 2 (July 2020): 1385-1390, ISSN 2405-8963. Tags: History of Machine Learning.
Machine learning sits at the crossroads of cybernetics (control science) and computer science. It has recently attracted overwhelming interest from both professionals and the general public. The talk provides a brief overview of the historical development of the machine learning field, focusing on the development of its mathematical apparatus in its first decades. A number of little-known facts published in hard-to-reach sources are presented.

Frey, William R., Desmond U. Patton, Michael B. Gaskell, and Kyle A. McGregor. “Artificial intelligence and inclusion: Formerly gang-involved youth as domain experts for analyzing unstructured Twitter data.” Social Science Computer Review 38, no. 1 (February 2018): 42–56. Tags: Social media, domain expertise, NLP
Mining social media data for studying the human condition has created new and unique challenges. When analyzing social media data from marginalized communities, algorithms lack the ability to accurately interpret off-line context, which may lead to dangerous assumptions about and implications for marginalized communities. To combat this challenge, we hired formerly gang-involved young people as domain experts for contextualizing social media data in order to create inclusive, community-informed algorithms. Utilizing data from the Gang Intervention and Computer Science Project—a comprehensive analysis of Twitter data from gang-involved youth in Chicago—we describe the process of involving formerly gang-involved young people in developing a new part-of-speech tagger and content classifier for a prototype natural language processing system that detects aggression and loss in Twitter data. We argue that involving young people as domain experts leads to more robust understandings of context, including localized language, culture, and events. These insights could change how data scientists approach the development of corpora and algorithms that affect people in marginalized communities and who to involve in that process. We offer a contextually driven interdisciplinary approach between social work and data science that integrates domain insights into the training of qualitative annotators and the production of algorithms for positive social impact.

Fuchs, Mathias and Ramón Reichert. “Introduction: Rethinking AI. Neural Networks, Biometrics and the New Artificial Intelligence.” Digital Culture & Society 4, no. 1 (October 2018): 5–13. Tags: Neural Networks, Artificial Intelligence, Biometrics.
Recently, the long-standing research tradition of Artificial Intelligence has undergone a far-reaching re-evaluation. When Herbert Simon in 1956 announced in one of his classes that “[…] over Christmas Allen Newell and I invented a thinking machine” (Gardner 1985: 146), the pioneers of Artificial Intelligence overstated the possibilities of algorithmic problem solving and they underestimated the pragmatic potential of it. They overrated the power of their program by proposing that human thinking can be performed algorithmically, and they underestimated it by not being able to foresee what machine learning algorithms would be able to accomplish some 60 years later. Simon and Newell’s “thinking machine” was the Logic Theorist, a programme that could create logical statements by combining any out of five logical axioms. This was a scientific sensation in the 1950s and was celebrated as evidence-by-machine of Alfred North Whitehead and Bertrand Russell’s theoretical exposition in the Principia Mathematica (Russell and Whitehead 1910). Russell and Whitehead demonstrated in an intelligent way that logical theorems could be deduced in an entirely formal manner, i.e. without creative intelligence. Raymond Fancher reports that Russell later admitted that one of the machine deductions was “more elegant and efficient than his own” (Fancher 1979). Today we have arrived at a state of computational power that makes automated problem solving of many tasks more efficient than the ones under human conduct. “Elegance”, however, seems not to be an issue any longer.

Ganesh, Maya Indira. “The Difference that Difference Makes.” Spheres: A Journal for Digital Cultures 5 (November 2019). Tags: Difference, imaginations. With Ernst (2019)
Christoph Ernst, Jens Schröter, and Andreas Sudmann’s essay, “AI and the Imagination to Overcome Difference” examines how the imagination of AI systems emerges from the instrumentalization of technology – that a singular, unified technology will address an astonishing diversity of nuanced social conditions like language translation, work, and the automation of war. There is a flattening of differences, they say, between human and machine, that ignores the social, cultural and political dimensions of these complex technologies. In this comment piece, I want to think through ‘difference’ in terms of some of its synonyms, such as ‘gap’, ‘distinction’, ‘diversity’ and ‘discrimination’; and differences not just between human and machine, but also between humans; and thus discuss the further implications of AI technologies in society.

Gerkin, Richard C. “Parsing Sage and Rosemary in Time: The Machine Learning Race to Crack Olfactory Perception.” Chemical Senses 46 (April 2021): 1-5. Tags: Modeling the senses, olfactory perception.
Color and pitch perception are largely understandable from characteristics of physical stimuli: the wavelengths of light and sound waves, respectively. By contrast, understanding olfactory percepts from odorous stimuli (volatile molecules) is much more challenging. No intuitive set of molecular features is up to the task. Here in Chemical Senses, the Ray lab reports using a predictive modeling framework—first breaking molecular structure into thousands of features and then using this to train a predictive statistical model on a wide range of perceptual descriptors—to create a tool for predicting the odor character of hundreds of thousands of available but previously uncharacterized molecules (Kowalewski et al. 2021). This will allow future investigators to representatively sample the space of odorous molecules as well as identify previously unknown odorants with a target odor character. Here, I put this work into the context of other modeling efforts and highlight the urgent need for large new datasets and transparent benchmarks for the field to make and evaluate modeling breakthroughs, respectively.

Goodfellow, Ian, Jonathon Shlens, and Christian Szegedy. “Explaining and harnessing adversarial examples.” Preprint, submitted March 2015. arXiv:1412.6572, 2015. Tags: Adversariality, neural networks, technical paper
Several machine learning models, including neural networks, consistently misclassify adversarial examples---inputs formed by applying small but intentionally worst-case perturbations to examples from the dataset, such that the perturbed input results in the model outputting an incorrect answer with high confidence. Early attempts at explaining this phenomenon focused on nonlinearity and overfitting. We argue instead that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature. This explanation is supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets. Moreover, this view yields a simple and fast method of generating adversarial examples. Using this approach to provide examples for adversarial training, we reduce the test set error of a maxout network on the MNIST dataset.
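The “simple and fast method of generating adversarial examples” referred to here is the paper’s fast gradient sign method. Restated from the paper (not quoted from the abstract), it perturbs an input x with label y, under model parameters θ and training loss J, along the sign of the loss gradient:

```latex
% Fast gradient sign method, restated from the paper (not part of the abstract);
% \epsilon bounds the size of the perturbation added to the input x.
\[
  \tilde{x} \;=\; x + \epsilon \cdot \operatorname{sign}\!\bigl(\nabla_{x} J(\theta, x, y)\bigr)
\]
```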

Griffiths, Catherine. “Visual Tactics Toward an Ethical Debugging.” Digital Culture & Society 4, no. 1 (January 2018): 217-226. Tags: Black boxes, interpretability, ethics, ethics of design
To advance design research into a critical study of artificially intelligent algorithms, strategies from the fields of critical code studies and data visualisation are combined to propose a methodology for computational visualisation. By opening the algorithmic black box to think through the meaning created by structure and process, computational visualisation seeks to elucidate the complexity and obfuscation at the heart of artificial intelligence systems. There are rising ethical dilemmas that are a consequence of the use of machine learning algorithms in socially sensitive spaces, such as in determining criminal sentencing, job performance, or access to welfare. This is in part due to the lack of a theoretical framework to understand how and why decisions are made at the algorithmic level. The ethical implications are becoming more severe as such algorithmic decision-making is being given higher authority while there is a simultaneous blind spot in where and how biases arise. Computational visualisation, as a method, explores how contemporary visual design tactics including generative design and interaction design, can intersect with a critical exegesis of algorithms to challenge the black box and obfuscation of machine learning and work toward an ethical debugging of biases in such systems.

Guidotti, Riccardo, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi. “A Survey of Methods for Explaining Black Box Models.” ACM Computing Surveys 51, no. 5 (August 2018): Article 93, 42 pages. Tags: Black boxes, interpretability.
In recent years, many accurate decision support systems have been constructed as black boxes, that is as systems that hide their internal logic to the user. This lack of explanation constitutes both a practical and an ethical issue. The literature reports many approaches aimed at overcoming this crucial weakness, sometimes at the cost of sacrificing accuracy for interpretability. The applications in which black box decision systems can be used are various, and each approach is typically developed to provide a solution for a specific problem and, as a consequence, it explicitly or implicitly delineates its own definition of interpretability and explanation. The aim of this article is to provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box system. Given a problem definition, a black box type, and a desired explanation, this survey should help the researcher to find the proposals more useful for his own work. The proposed classification of approaches to open black box models should also be useful for putting the many research open questions in perspective.

Hassabis, Demis, Dharshan Kumaran, Christopher Summerfield, and Matthew Botvinick. “Neuroscience-Inspired Artificial Intelligence.” Neuron 95, no. 2 (June 2017): 245-258. Tags: Neural networks, cognition, neuroscience
The fields of neuroscience and artificial intelligence (AI) have a long and intertwined history. In more recent times, however, communication and collaboration between the two fields has become less commonplace. In this article, we argue that better understanding biological brains could play a vital role in building intelligent machines. We survey historical interactions between the AI and neuroscience fields and emphasize current advances in AI that have been inspired by the study of neural computation in humans and other animals. We conclude by highlighting shared themes that may be key for advancing future research in both fields.

Hadzi, Adnan and Denis Roio. “Restorative Justice in Artificial Intelligence Crimes.” Spheres: A Journal for Digital Cultures 5 (November 2019). Tags: Restorative justice, predictive crime, jurisprudence.
In order to address AI crimes, the paper will start by outlining what might constitute personhood in discussing legal positivism and natural law. Concerning what constitutes AI crimes, the paper uses the criteria given in Thomas King et al.’s paper Artificial Intelligence Crime: An Interdisciplinary Analysis of Foreseeable Threats and Solutions, where King et al. coin the term “AI crime”, mapping five areas in which AI might, in the foreseeable future, commit crimes, namely: commerce, financial markets, and insolvency; harmful or dangerous drugs; offences against persons; sexual offences; theft and fraud, and forgery and personation.

Hu, Lily, Nicole Immorlica, and Jennifer Wortman Vaughan. “The Disparate Effects of Strategic Manipulation.” Preprint, submitted May 2019. arXiv:1808.08646v4. Tags: Strategic manipulation, agency.
When consequential decisions are informed by algorithmic input, individuals may feel compelled to alter their behavior in order to gain a system’s approval. Models of agent responsiveness, termed “strategic manipulation,” analyze the interaction between a learner and agents in a world where all agents are equally able to manipulate their features in an attempt to “trick” a published classifier. In cases of real world classification, however, an agent’s ability to adapt to an algorithm is not simply a function of her personal interest in receiving a positive classification, but is bound up in a complex web of social factors that affect her ability to pursue certain action responses. In this paper, we adapt models of strategic manipulation to capture dynamics that may arise in a setting of social inequality wherein candidate groups face different costs to manipulation. We find that whenever one group’s costs are higher than the other’s, the learner’s equilibrium strategy exhibits an inequality-reinforcing phenomenon wherein the learner erroneously admits some members of the advantaged group, while erroneously excluding some members of the disadvantaged group. We also consider the effects of interventions in which a learner subsidizes members of the disadvantaged group, lowering their costs in order to improve her own classification performance. Here we encounter a paradoxical result: there exist cases in which providing a subsidy improves only the learner’s utility while actually making both candidate groups worse-off—even the group receiving the subsidy. Our results reveal the potentially adverse social ramifications of deploying tools that attempt to evaluate an individual’s “quality” when agents’ capacities to adaptively respond differ.

Hu, Lily and Issa Kohler-Hausmann. “What's Sex Got To Do With Fair Machine Learning?” In Proceedings of Conference on Fairness, Accountability, and Transparency (FAT* ’20), Barcelona, Spain, January 2020, 513-524. New York: Association for Computing Machinery. Tags: Sex, modularity, social groups
The debate about fairness in machine learning has largely centered around competing substantive definitions of what fairness or nondiscrimination between groups requires. However, very little attention has been paid to what precisely a group is. Many recent approaches have abandoned observational, or purely statistical, definitions of fairness in favor of definitions that require one to specify a causal model of the data generating process. The implicit ontological assumption of these exercises is that a racial or sex group is a collection of individuals who share a trait or attribute, for example: the group “female” simply consists in grouping individuals who share female-coded sex features. We show this by exploring the formal assumption of modularity in causal models, which holds that the dependencies captured by one causal pathway are invariant to interventions on any other causal pathways. Modeling sex, for example, in a causal model proposes two substantive claims: 1) There exists a feature, sex-on-its-own, that is an inherent trait of an individual that then (causally) brings about social phenomena external to it in the world; and 2) the relations between sex and its downstream effects can be modified in whichever ways and the former feature would still retain the meaning that sex has in our world. We argue that these ontological assumptions about social groups like sex are conceptual errors. Many of the “effects” that sex purportedly “causes” are in fact constitutive features of sex as a social status. Together, they give the social meaning of sex features, and these social meanings are precisely what make sex discrimination a distinctively morally problematic type of act that differs from mere irrationality or meanness on the basis of a physical feature. Correcting this conceptual error has a number of important implications for how analytic models can be used to detect discrimination. If what makes something discrimination on the basis of a particular social grouping is that the practice acts on what it means to be in that group in a way that we deem wrongful, then what we need from analytic diagrams is a model of what constitutes the social grouping. Only then can we have a normative debate about what is fair or nondiscriminatory vis-à-vis that group. We suggest that formal diagrams of constitutive relations would present an entirely different path toward reasoning about discrimination (and relatedly, counterfactuals) because they proffer a model of how the meaning of a social group emerges from its constitutive features. Whereas the value of causal diagrams is to guide the construction and testing of sophisticated modular counterfactuals, the value of constitutive diagrams would be to identify a different kind of counterfactual as central to an inquiry on discrimination: one that asks how the social meaning of a group would be changed if its non-modular features were altered.

Huang, Ling, Anthony D. Joseph, Blaine Nelson, Benjamin I. P. Rubinstein, and J. Doug Tygar. “Adversarial machine learning.” In Proceedings of the 4th ACM Workshop on Security and Artificial Intelligence, Chicago, Illinois, October 2011, 43-58. ACM. Tags: Adversariality.
In this paper (expanded from an invited talk at AISEC 2010), we discuss an emerging field of study: adversarial machine learning---the study of effective machine learning techniques against an adversarial opponent. In this paper, we: give a taxonomy for classifying attacks against online machine learning algorithms; discuss application-specific factors that limit an adversary's capabilities; introduce two models for modeling an adversary's capabilities; explore the limits of an adversary's knowledge about the algorithm, feature space, training, and input data; explore vulnerabilities in machine learning algorithms; discuss countermeasures against attacks; introduce the evasion challenge; and discuss privacy-preserving learning techniques.

IEEE Spectrum. “History of Natural Language Processing.” https://spectrum.ieee.org/tag/history+of+natural+language+processing. Tags: NLP, history

Ilyas, Andrew, Shibani Santurkar, Dimitris Tsipras, Logan Engstrom, Brandon Tran, and Aleksander Madry. “Adversarial Examples Are Not Bugs, They Are Features.” Preprint, submitted in August 2019. arXiv:1905.02175. Tags: Adversariality
Adversarial examples have attracted significant attention in machine learning, but the reasons for their existence and pervasiveness remain unclear. We demonstrate that adversarial examples can be directly attributed to the presence of non-robust features: features derived from patterns in the data distribution that are highly predictive, yet brittle and incomprehensible to humans. After capturing these features within a theoretical framework, we establish their widespread existence in standard datasets. Finally, we present a simple setting where we can rigorously tie the phenomena we observe in practice to a misalignment between the (human-specified) notion of robustness and the inherent geometry of the data.

Jannach, Dietmar, Paul Resnick, Alexander Tuzhilin, and Markus Zanker. “Recommender Systems: Beyond Matrix Completion.” Communications of the ACM 59, no. 10 (October 2016): 94-102. https://doi.org/10.1145/2891406. Tags: Recommender systems.
Recommender systems have become a natural part of the user experience in today's online world. These systems are able to deliver value both for users and providers and are one prominent example where the output of academic research has a direct impact on the advancements in industry. In this article, we have briefly reviewed the history of this multidisciplinary field and looked at recent efforts in the research community to consider the variety of factors that may influence the long-term success of a recommender system. The list of open issues and success factors is still far from complete and new challenges arise constantly that require further research. For example, the huge amounts of user data and preference signals that become available through the Social Web and the Internet of Things not only leads to technical challenges such as scalability, but also to societal questions concerning user privacy. Based on our reflections on the developments in the field, we finally emphasize the need for a more holistic research approach that combines the insights of different disciplines. We urge that research focuses even more on practical problems that matter and are truly suited to increase the utility of recommendations from the viewpoint of the users.

Jones, Matthew. “How we became instrumentalists (again): Data positivism since World War II.” Historical Studies in the Natural Sciences 48, no. 5 (November 2018): 673–684. Tags: History, computational statistics, big data
In the last two decades, a highly instrumentalist form of statistical and machine learning has achieved an extraordinary success as the computational heart of the phenomenon glossed as “predictive analytics,” “data mining,” or “data science.” This instrumentalist culture of prediction emerged from subfields within applied statistics, artificial intelligence, and database management. This essay looks at representative developments within computational statistics and pattern recognition from the 1950s onward, in the United States and beyond, central to the explosion of algorithms, techniques, and epistemic values that ultimately came together in the data sciences of today. This essay is part of a special issue entitled Histories of Data and the Database edited by Soraya de Chadarevian and Theodore M. Porter.

Kalchbrenner, Nal and Philip Blunsom. “Recurrent Continuous Translation Models.” In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, Seattle, Washington, October 2013, 1700–1709. Association for Computational Linguistics. Tags: Neural Machine Translation, technical paper.
We introduce a class of probabilistic continuous translation models called Recurrent Continuous Translation Models that are purely based on continuous representations for words, phrases and sentences and do not rely on alignments or phrasal translation units. The models have a generation and a conditioning aspect. The generation of the translation is modelled with a target Recurrent Language Model, whereas the conditioning on the source sentence is modelled with a Convolutional Sentence Model. Through various experiments, we show first that our models obtain a perplexity with respect to gold translations that is > 43% lower than that of state-of-the-art alignment-based translation models. Secondly, we show that they are remarkably sensitive to the word order, syntax, and meaning of the source sentence despite lacking alignments. Finally we show that they match a state-of-the-art system when rescoring n-best lists of translations.

Kellogg, Sam P. “The Mountain in the Machine: Optimization and the Landscapes of Machine Learning.” Culture Machine 20 (2021). Tags: Epistemology, mythology/allegory.
Where is the privileged site of knowledge today? ‘Prophecies are no longer proclaimed from the mountaintop but from the metrics’, write Arjun Appadurai and Paula Kift (2020), warning of the emplacement of quantification at the very apex of knowledge production. Neither Sinai, nor Sri Pada, nor Olympus, nor Fuji, but Number, set on high with solid footing and a place of normative exception from which to speak. Our metric societies are obsessed with the precise measurement of deviation and error; everything is to be optimized, but to questionable ends. Meanwhile, the essential computability of reality is taken for granted, or else the question is sidestepped entirely: what matters most is that an algorithm works (or is thought to work), less often how or why. Despite this, the new machine learning systems will, we are told, eventually realize human-level artificial general intelligence (AGI): the holy grail of the quantified society. Surveyed from the lofty heights of algorithmic rationality, entire worlds unfold as vast terrains of data—measurable and mappable—receding towards the horizon.

Kim, Been, Emily Reif, Martin Wattenberg, Samy Bengio, and Michael C. Mozer. “Neural Networks Trained on Natural Scenes Exhibit Gestalt Closure.” Preprint, submitted in June 2020. arXiv:1903.01069. Tags: Neural Networks, Gestalt theory, CNN, perception
The Gestalt laws of perceptual organization, which describe how visual elements in an image are grouped and interpreted, have traditionally been thought of as innate despite their ecological validity. We use deep-learning methods to investigate whether natural scene statistics might be sufficient to derive the Gestalt laws. We examine the law of closure, which asserts that human visual perception tends to "close the gap" by assembling elements that can jointly be interpreted as a complete figure or object. We demonstrate that a state-of-the-art convolutional neural network, trained to classify natural images, exhibits closure on synthetic displays of edge fragments, as assessed by similarity of internal representations. This finding provides support for the hypothesis that the human perceptual system is even more elegant than the Gestaltists imagined: a single law---adaptation to the statistical structure of the environment---might suffice as fundamental.

Kleinberg, Jon, Jens Ludwig, Sendhil Mullainathan, and Cass R. Sunstein. “Discrimination in the age of algorithms.” Journal of Legal Analysis 10 (April 2019): 113-174. https://doi.org/10.1093/jla/laz001. Tags: Discrimination, law, discrimination in law, algorithms in law.
The law forbids discrimination. But the ambiguity of human decision-making often makes it hard for the legal system to know whether anyone has discriminated. To understand how algorithms affect discrimination, we must understand how they affect the detection of discrimination. With the appropriate requirements in place, algorithms create the potential for new forms of transparency and hence opportunities to detect discrimination that are otherwise unavailable. The specificity of algorithms also makes transparent tradeoffs among competing values. This implies algorithms are not only a threat to be regulated; with the right safeguards, they can be a potential positive force for equity.

Kroll, Joshua A., Joanna Huey, Solon Barocas, Edward W. Felten, Joel R. Reidenberg, David G. Robinson, and Harlan Yu. “Accountable Algorithms.” University of Pennsylvania Law Review 165, no. 7 (2017): 633-705. Tags: Accountability, transparency, legality
Many important decisions historically made by people are now made by computers. Algorithms count votes, approve loan and credit card applications, target citizens or neighborhoods for police scrutiny, select taxpayers for IRS audit, grant or deny immigration visas, and more. The accountability mechanisms and legal standards that govern such decision processes have not kept pace with technology. The tools currently available to policymakers, legislators, and courts were developed to oversee human decisionmakers and often fail when applied to computers instead. For example, how do you judge the intent of a piece of software? Because automated decision systems can return potentially incorrect, unjustified, or unfair results, additional approaches are needed to make such systems accountable and governable. This Article reveals a new technological toolkit to verify that automated decisions comply with key standards of legal fairness. We challenge the dominant position in the legal literature that transparency will solve these problems. Disclosure of source code is often neither necessary (because of alternative techniques from computer science) nor sufficient (because of the issues analyzing code) to demonstrate the fairness of a process. Furthermore, transparency may be undesirable, such as when it discloses private information or permits tax cheats or terrorists to game the systems determining audits or security screening. The central issue is how to assure the interests of citizens, and society as a whole, in making these processes more accountable. This Article argues that technology is creating new opportunities—subtler and more flexible than total transparency—to design decision making algorithms so that they better align with legal and policy objectives. Doing so will improve not only the current governance of automated decisions, but also—in certain cases—the governance of decision making in general. The implicit (or explicit) biases of human decision-makers can be difficult to find and root out, but we can peer into the “brain” of an algorithm: computational processes and purpose specifications can be declared prior to use and verified afterward.

Kurakin, Alexey, Ian Goodfellow, and Samy Bengio. “Adversarial examples in the physical world.” Preprint, submitted February 2017. arXiv:1607.02533. Tags: Adversariality, technical paper
Most existing machine learning classifiers are highly vulnerable to adversarial examples. An adversarial example is a sample of input data which has been modified very slightly in a way that is intended to cause a machine learning classifier to misclassify it. In many cases, these modifications can be so subtle that a human observer does not even notice the modification at all, yet the classifier still makes a mistake. Adversarial examples pose security concerns because they could be used to perform an attack on machine learning systems, even if the adversary has no access to the underlying model. Up to now, all previous work have assumed a threat model in which the adversary can feed data directly into the machine learning classifier. This is not always the case for systems operating in the physical world, for example those which are using signals from cameras and other sensors as an input. This paper shows that even in such physical world scenarios, machine learning systems are vulnerable to adversarial examples. We demonstrate this by feeding adversarial images obtained from cell-phone camera to an ImageNet Inception classifier and measuring the classification accuracy of the system. We find that a large fraction of adversarial examples are classified incorrectly even when perceived through the camera.

Larsonneur, Claire. “The Disruptions of Neural Machine Translation.” Spheres: A Journal for Digital Cultures 5 (November 2019). Tags: Translation, Neural Machine Translation.
According to the Future of Humanity Institute in Oxford, an estimate recently widely relayed by the World Economic Forum, machine translation should outperform human translation by 2024… and so translation and more generally the language services industry stand to be among the first to be disrupted by AI technologies. The recent launch of free online neural machine translation tools, such as DeepL or Google Translate’s new version, are bound to disrupt the translation market due to their worldwide availability and enhanced performance. These tools are already easily accessible either on the net or via apps, and the next step for the industry is to embed them in many smart appliances like digital assistants or cars. This obfuscates the whole production and mediation process, masking the materiality of translation. Yet this very process, the medium of production is essential, especially when dealing with such a core social and political activity as language. In his study of digital economy, Olivier Bomsel pointed out that the medium acts both as a materialization of the symbols it transmits, a tool to organize meaning, a tool to enact physical distribution and an exclusion tool upon which the definition of property and the attending rules rely. One might add that the medium also acts as an identification tool (specifying the source, the author of the text) and as a venue for the conquest and assertion of power. As the materiality of translation disappears from view (and public debate), it becomes urgent to investigate the making of neural machine translation (NMT), to identify the genealogy and specificity of translation tools, to uncover the current sociology and geography of NMT agents, and to examine its impact on our relation to language.

LeCun, Yann, Yoshua Bengio, and Geoffrey E. Hinton. “Deep Learning.” Nature 521 (May 2015): 436–444. Tags: Deep learning, neural networks, technical paper
Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.

Lipton, Zachary C. “The Mythos of Model Interpretability.” Preprint, submitted March 2017. arXiv:1606.03490. Tags: Interpretability, black boxes
Supervised machine learning models boast remarkable predictive capabilities. But can you trust your model? Will it work in deployment? What else can it tell you about the world? We want models to be not only good, but interpretable. And yet the task of interpretation appears underspecified. Papers provide diverse and sometimes non-overlapping motivations for interpretability, and offer myriad notions of what attributes render models interpretable. Despite this ambiguity, many papers proclaim interpretability axiomatically, absent further explanation. In this paper, we seek to refine the discourse on interpretability. First, we examine the motivations underlying interest in interpretability, finding them to be diverse and occasionally discordant. Then, we address model properties and techniques thought to confer interpretability, identifying transparency to humans and post-hoc explanations as competing notions. Throughout, we discuss the feasibility and desirability of different notions, and question the oft-made assertions that linear models are interpretable and that deep neural networks are not.

Lipton, Zachary C. and Jacob Steinhardt. “Troubling trends in machine learning scholarship.” Preprint, submitted in July 2018. arXiv:1807.03341, 2018. Tags: Shortcomings, disciplinary concerns
Collectively, machine learning (ML) researchers are engaged in the creation and dissemination of knowledge about data-driven algorithms. In a given paper, researchers might aspire to any subset of the following goals, among others: to theoretically characterize what is learnable, to obtain understanding through empirically rigorous experiments, or to build a working system that has high predictive accuracy. While determining which knowledge warrants inquiry may be subjective, once the topic is fixed, papers are most valuable to the community when they act in service of the reader, creating foundational knowledge and communicating as clearly as possible.
Recent progress in machine learning comes despite frequent departures from these ideals. In this paper, we focus on the following four patterns that appear to us to be trending in ML scholarship: (i) failure to distinguish between explanation and speculation; (ii) failure to identify the sources of empirical gains, e.g., emphasizing unnecessary modifications to neural architectures when gains actually stem from hyper-parameter tuning; (iii) mathiness: the use of mathematics that obfuscates or impresses rather than clarifies, e.g., by confusing technical and non-technical concepts; and (iv) misuse of language, e.g., by choosing terms of art with colloquial connotations or by overloading established technical terms.
While the causes behind these patterns are uncertain, possibilities include the rapid expansion of the community, the consequent thinness of the reviewer pool, and the often-misaligned incentives between scholarship and short-term measures of success (e.g., bibliometrics, attention, and entrepreneurial opportunity). While each pattern offers a corresponding remedy (don't do it), we also discuss some speculative suggestions for how the community might combat these trends.

Lundberg, Scott M. and Su-In Lee. “A Unified Approach to Interpreting Model Predictions.” In Proceedings of Advances in Neural Information Processing Systems 30, Long Beach, California, December 2017, 4765–4774. Red Hook, NY: Curran Associates, Inc. Tags: Interpretability, SHAP values, technical paper.
Understanding why a model makes a certain prediction can be as crucial as the prediction’s accuracy in many applications. However, the highest accuracy for large modern datasets is often achieved by complex models that even experts struggle to interpret, such as ensemble or deep learning models, creating a tension between accuracy and interpretability. In response, various methods have recently been proposed to help users interpret the predictions of complex models, but it is often unclear how these methods are related and when one method is preferable over another. To address this problem, we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature an importance value for a particular prediction. Its novel components include: (1) the identification of a new class of additive feature importance measures, and (2) theoretical results showing there is a unique solution in this class with a set of desirable properties. The new class unifies six existing methods, notable because several recent methods in the class lack the proposed desirable properties. Based on insights from this unification, we present new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.
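The class of “additive feature importance measures” mentioned in the abstract has a simple form, restated here from the paper rather than the abstract: the explanation model g is a linear function over binary variables indicating which of the M simplified input features are present.

```latex
% Additive feature attribution, restated from the paper: each simplified feature
% i receives a weight \phi_i, and z' \in \{0,1\}^M marks which features are present.
\[
  g(z') \;=\; \phi_0 + \sum_{i=1}^{M} \phi_i \, z'_i
\]
```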

Lundgren, Björn. “Ethical machine decisions and the input-selection problem.” Synthese 199 (August 2021): 11423–11443. https://doi.org/10.1007/s11229-021-03296-0. Tags: Ethics, machine decisions, data choices, uncertainty.
This article is about the role of factual uncertainty for moral decision-making as it concerns the ethics of machine decision-making (i.e., decisions by AI systems, such as autonomous vehicles, autonomous robots, or decision support systems). The view that is defended here is that factual uncertainties require a normative evaluation and that the ethics of machine decisions faces a triple-edged problem, which concerns what a machine ought to do, given its technical constraints, what decisional uncertainty is acceptable, and what trade-offs are acceptable to decrease the decisional uncertainty.

Mackenzie, Adrian. “Machine Learning and Genomic Dimensionality: From Features to Landscapes.” In Postgenomics: Perspectives on Biology after the Genome, edited by Sarah S. Richardson and Hallam Stevens, 73-102. Durham and London: Duke University Press, 2015. Tags: Genomics
The Google Compute Engine, a globally distributed ensemble of computers, was briefly turned over to exploration of cancer genomics during 2012 and publicly demonstrated during the annual Google I/O conference. Midway through the demonstration, in which a human genome is visualized as a ring in “Circos” form (see chap. 6 of this volume), the speaker, Urs Hölzle, senior vice president of infrastructure at Google, “then went even further and scaled the application to run on 600,000 cores across Google’s global data centers.” The audience clapped. The world’s “3rd largest supercomputer,” as it was called by TechCrunch, a prominent technology blog, “learns associations between genomic features.” We are in the midst of many such demonstrations of “scaling applications” of data in the pursuit of associations between “features.”

Malik, Momin M. and Jürgen Pfeffer. “Identifying Platform Effects in Social Media Data.” In Proceedings of the Tenth International AAAI Conference on Web and Social Media, Cologne, Germany, May 2016, 241-250. Palo Alto, California: The AAAI Press. Tags: Bias, datasets, platforms, social media
Even when external researchers have access to social media data, they are not privy to decisions that went into platform design—including the measurement and testing that goes into deploying new platform features, such as recommender systems, seeking to shape user behavior towards desirable ends. Finding ways to identify platform effects is thus important both for generalizing findings and for understanding the nature of platform usage. One approach is to find temporal data covering the introduction of a new feature; observing differences in behavior before and after allows us to estimate the effect of the change. We investigate platform effects using two such datasets, the Netflix Prize dataset and the Facebook New Orleans data, in which we observe seeming discontinuities in user behavior that we know or suspect are the result of a change in platform design. For the Netflix Prize, we estimate user ratings changing by an average of about 3% after the change, and in Facebook New Orleans, we find that the introduction of the ‘People You May Know’ feature locally nearly doubled the average number of edges added daily, and increased by 63% the average proportion of triangles created by each new edge. Our work empirically verifies several previously expressed theoretical concerns, and gives insight into the magnitude and variety of platform effects.
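The before/after logic the authors describe can be illustrated with a minimal sketch that compares average daily behavior on either side of a known feature launch. The dataframe, column names, and launch date below are hypothetical placeholders, not the paper's Netflix or Facebook data, and the paper's actual estimates rest on more careful modeling than a raw difference in means.

```python
# A rough sketch of a before/after comparison around a feature launch, in the
# spirit of the platform-effect estimates described above. All values here are
# invented for illustration.
import pandas as pd

events = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2008-03-01", "2008-03-02", "2008-03-03", "2008-03-04", "2008-03-05", "2008-03-06"]
        ),
        "edges_added": [120, 131, 118, 240, 255, 248],
    }
)
launch = pd.Timestamp("2008-03-04")  # hypothetical date of the feature launch

before = events.loc[events["date"] < launch, "edges_added"].mean()
after = events.loc[events["date"] >= launch, "edges_added"].mean()
print(f"daily edges before: {before:.1f}, after: {after:.1f}, ratio: {after / before:.2f}")
```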

Malik, Momin M. “A Hierarchy of Limitations in Machine Learning.” Preprint, submitted February 2020. arXiv:2002.05193. Tags: Shortcomings
“All models are wrong, but some are useful,” wrote George E. P. Box (1979). Machine learning has focused on the usefulness of probability models for prediction in social systems, but is only now coming to grips with the ways in which these models are wrong—and the consequences of those shortcomings. This paper attempts a comprehensive, structured overview of the specific conceptual, procedural, and statistical limitations of models in machine learning when applied to society. Machine learning modelers themselves can use the described hierarchy to identify possible failure points and think through how to address them, and consumers of machine learning models can know what to question when confronted with the decision about if, where, and how to apply machine learning. The limitations go from commitments inherent in quantification itself, through to showing how unmodeled dependencies can lead to cross-validation being overly optimistic as a way of assessing model performance.

Manovich, Lev. “Can We Think Without Categories?” Digital Culture & Society 4, no. 1 (2018): 17-27. Tags: Media analytics, cultural analytics, data society
In this article, methods developed for the purpose of what I call “Media Analytics” are contextualized, put into a historical framework and discussed in regard to their relevance for “Cultural Analytics”. Large-scale analysis of media and interactions enables NGOs, small and big businesses, scientific research and civic media to create insight and information on various cultural phenomena. They provide quantitative analytical data about aspects of digital culture and are instrumental in designing procedural components for digital applications such as search, recommendations, and contextual advertising. A survey of key texts and propositions from 1830 until the present sketches the development of the “Data Society’s Mind”. I propose that even though Cultural Analytics research uses dozens of algorithms, behind them there is a small number of fundamental paradigms. We can think of them as types of data society’s and AI society’s cognition. The three most general paradigmatic approaches are data visualization, unsupervised machine learning, and supervised machine learning. I will discuss important challenges for Cultural Analytics research. Now that we have very large cultural data available, and our computers can do complex analysis quite quickly, how shall we look at culture? Do we only use computational methods to provide better answers to questions already established in the 19th and 20th century humanities paradigms, or do these methods allow fundamentally different new concepts?

Manovich, Lev. "Computer vision, human senses, and language of art," AI & SOCIETY 36, no. 4 (December 2021), https://doi.org/10.1007/s00146-020-01094-9. Tags: Computer Vision, language, cultural analytics
What is the most important reason for using Computer Vision methods in humanities research? In this article, I argue that the use of numerical representation and data analysis methods offers a new language for describing cultural artifacts, experiences and dynamics. The human languages such as English or Russian that developed rather recently in human evolution are not good at capturing analog properties of human sensorial and cultural experiences. These limitations become particularly worrying if we want to compare thousands, millions or billions of artifacts—i.e. to study contemporary media and cultures at their new twenty-first century scale. When we instead use numerical measurements of image properties standard in Computer Vision, we can better capture details of a single artifact as well as visual differences between a number of artifacts–even if they are very small. The examples of visual dimensions that numbers can capture better than languages include color, shape, texture, contours, composition, and visual characteristics of represented faces, bodies and objects. The methods of finding structures and relationships in large numerical datasets developed in statistics and machine learning allow us to extend this analysis to very big datasets of cultural objects. Equally importantly, numerical image features used in Computer Vision also give us a new language to represent gradual and continuous temporal changes—something which natural languages are also bad at. This applies to both single artworks such as a film or a dance piece (describing movement and rhythm) and also to changes in visual characteristics in millions of artifacts over decades or centuries.

Marcus, Gary. “Deep learning: A critical appraisal.” Preprint, submitted January 2018. arXiv:1801.00631, 2018. Tags: Deep learning, shortcomings
Although deep learning has historical roots going back decades, neither the term "deep learning" nor the approach was popular just over five years ago, when the field was reignited by papers such as Krizhevsky, Sutskever and Hinton's now classic (2012) deep network model of Imagenet. What has the field discovered in the five subsequent years? Against a background of considerable progress in areas such as speech recognition, image recognition, and game playing, and considerable enthusiasm in the popular press, I present ten concerns for deep learning, and suggest that deep learning must be supplemented by other techniques if we are to reach artificial general intelligence.

McCoy, Tom, Ellie Pavlick, and Tal Linzen. “Right for the wrong reasons: Diagnosing syntactic heuristics in natural language inference.” In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, July 2019, 3428–3448, ACL. Tags: NLI, NLP, methodology, datasets
A machine learning system can score well on a given test set by relying on heuristics that are effective for frequent example types but break down in more challenging cases. We study this issue within natural language inference (NLI), the task of determining whether one sentence entails another. We hypothesize that statistical NLI models may adopt three fallible syntactic heuristics: the lexical overlap heuristic, the subsequence heuristic, and the constituent heuristic. To determine whether models have adopted these heuristics, we introduce a controlled evaluation set called HANS (Heuristic Analysis for NLI Systems), which contains many examples where the heuristics fail. We find that models trained on MNLI, including BERT, a state-of-the-art model, perform very poorly on HANS, suggesting that they have indeed adopted these heuristics. We conclude that there is substantial room for improvement in NLI systems, and that the HANS dataset can motivate and measure progress in this area.
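The lexical overlap heuristic the authors diagnose can be made concrete with a toy baseline that predicts entailment whenever every hypothesis word also occurs in the premise. The sentences below are illustrative inventions, not items from MNLI or HANS, but the second pair shows the kind of case on which such a heuristic fails.

```python
# Toy illustration of the lexical overlap heuristic diagnosed by HANS: predict
# "entailment" whenever every hypothesis word also appears in the premise.
# The example sentences are illustrative, not drawn from the HANS dataset.
def lexical_overlap_predict(premise: str, hypothesis: str) -> str:
    premise_words = set(premise.lower().split())
    hypothesis_words = set(hypothesis.lower().split())
    return "entailment" if hypothesis_words <= premise_words else "non-entailment"


# A case where the heuristic happens to give a reasonable answer ...
print(lexical_overlap_predict("The doctor met the artist", "The doctor met the artist"))
# ... and a HANS-style case where full word overlap does not imply entailment.
print(lexical_overlap_predict("The doctor was paid by the artist", "The doctor paid the artist"))
```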

McQuillan, Dan. “People’s councils for ethical machine learning.” Social Media + Society 4, no. 2 (June 2018): 1-10. Article 2056305118768303. Tags: Ethics, politics
Machine learning is a form of knowledge production native to the era of big data. It is at the core of social media platforms and everyday interactions. It is also being rapidly adopted for research and discovery across academia, business, and government. This article explores the way the affordances of machine learning itself, and the forms of social apparatus that it becomes a part of, will potentially erode ethics and draw us into a drone-like perspective. Unconstrained machine learning enables and delimits our knowledge of the world in particular ways: the abstractions and operations of machine learning produce a “view from above” whose consequences for both ethics and legality parallel the dilemmas of drone warfare. The family of machine learning methods is not somehow inherently bad or dangerous, nor does implementing them signal any intent to cause harm. Nevertheless, the machine learning assemblage produces a targeting gaze whose algorithms obfuscate the legality of its judgments, and whose iterations threaten to create both specific injustices and broader states of exception. Given the urgent need to provide some kind of balance before machine learning becomes embedded everywhere, this article proposes people’s councils as a way to contest machinic judgments and reassert openness and discourse.

Mendon-Plasek, Aaron. “Mechanized Significance and Machine Learning: Why It Became Thinkable and Preferable to Teach Machines to Judge the World.” In The Cultural Life of Machine Learning, edited by Jonathan Roberge and Michael Castelle. London: Palgrave Macmillan, 2021. https://doi.org/10.1007/978-3-030-56286-1_2. Tags: Genealogy, history
The slow and uneven forging of a novel constellation of practices, concerns, and values that became machine learning occurred in 1950s and 1960s pattern recognition research through attempts to mechanize contextual significance that involved building “learning machines” that imitated human judgment by learning from examples. By the 1960s two crises emerged: the first was an inability to evaluate, compare, and judge different pattern recognition systems; the second was an inability to articulate what made pattern recognition constitute a distinct discipline. The resolution of both crises through the problem-framing strategies of supervised and unsupervised learning and the incorporation of statistical decision theory changed what it meant to provide an adequate description of the world even as it caused researchers to reimagine their own scientific self-identities.

Mikolov, Tomas, Kai Chen, Greg Corrado, and Jeffrey Dean. “Efficient Estimation of Word Representations in Vector Space.” Preprint, submitted January 2013. arXiv:1301.3781. Tags: NLP, Word2Vec, technical paper. With Mikolov et al. (2013)
We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements in accuracy at much lower computational cost, i.e. it takes less than a day to learn high quality word vectors from a 1.6 billion words data set. Furthermore, we show that these vectors provide state-of-the-art performance on our test set for measuring syntactic and semantic word similarities.
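A minimal way to experiment with the architectures proposed here is the gensim library's Word2Vec implementation. The sketch below assumes gensim 4.x (where the dimensionality argument is `vector_size`) and uses an invented toy corpus that is far too small to yield meaningful vectors; it is meant only to show the training interface.

```python
# A minimal sketch of training skip-gram word vectors with the gensim library
# (assumes gensim >= 4.0). The toy corpus is invented and far too small for
# meaningful vectors.
from gensim.models import Word2Vec

corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "animals"],
]

model = Word2Vec(
    sentences=corpus,
    vector_size=50,   # dimensionality of the word vectors
    window=2,         # context window size
    min_count=1,      # keep every token in this tiny corpus
    sg=1,             # 1 = skip-gram, 0 = CBOW
    epochs=50,
)

print(model.wv["cat"].shape)          # (50,)
print(model.wv.most_similar("cat"))   # nearest neighbours in vector space
```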

Mikolov, Tomas, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. “Distributed representations of words and phrases and their compositionality.” Preprint, submitted October 2013. arXiv:1310.4546. Tags: NLP, Word2Vec, technical paper. With Mikolov et al. (2013)
The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of “Canada” and “Air” cannot be easily combined to obtain “Air Canada”. Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.
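The negative-sampling objective mentioned in this abstract can be written out directly. The NumPy sketch below evaluates the loss for a single (center word, context word) pair with a handful of sampled negatives; the embedding tables and word indices are random placeholders, and a full trainer would add gradient updates, subsampling of frequent words, and a smoothed unigram noise distribution.

```python
# A NumPy sketch of the skip-gram negative-sampling objective for one
# (center word, context word) pair with k sampled negative words. This is a
# simplification of the full training procedure, for illustration only.
import numpy as np

rng = np.random.default_rng(0)
dim, vocab = 50, 1000

# Input (center) and output (context) embedding tables, randomly initialised.
W_in = rng.normal(scale=0.1, size=(vocab, dim))
W_out = rng.normal(scale=0.1, size=(vocab, dim))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def negative_sampling_loss(center_id, context_id, negative_ids):
    v_c = W_in[center_id]
    positive = np.log(sigmoid(W_out[context_id] @ v_c))
    negatives = np.log(sigmoid(-W_out[negative_ids] @ v_c)).sum()
    return -(positive + negatives)

# Hypothetical word indices; negatives would normally be drawn from a smoothed
# unigram distribution rather than uniformly at random.
loss = negative_sampling_loss(center_id=3, context_id=17,
                              negative_ids=rng.integers(0, vocab, size=5))
print(loss)
```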

Milano, Silvia, Mariarosaria Taddeo, and Luciano Floridi. “Recommender systems and their ethical challenges.” AI & SOCIETY 35, no. 4 (December 2020): 957–967. https://doi.org/10.1007/s00146-020-00950-y. Tags: Ethics, recommender systems
This article presents the first, systematic analysis of the ethical challenges posed by recommender systems through a literature review. The article identifies six areas of concern, and maps them onto a proposed taxonomy of different kinds of ethical impact. The analysis uncovers a gap in the literature: currently user-centred approaches do not consider the interests of a variety of other stakeholders—as opposed to just the receivers of a recommendation—in assessing the ethical impacts of a recommender system.

Mittelstadt, Brent Daniel, Patrick Allo, Mariarosaria Taddeo, Sandra Wachter and Luciano Floridi. “The ethics of algorithms: Mapping the debate.” Big Data & Society 3, no. 2 (December 2016): 1-21. Tags: Ethics, discourse
In information societies, operations, decisions and choices previously left to humans are increasingly delegated to algorithms, which may advise, if not decide, about how data should be interpreted and what actions should be taken as a result. More and more often, algorithms mediate social processes, business transactions, governmental decisions, and how we perceive, understand, and interact among ourselves and with the environment. Gaps between the design and operation of algorithms and our understanding of their ethical implications can have severe consequences affecting individuals as well as groups and whole societies. This paper makes three contributions to clarify the ethical importance of algorithmic mediation. It provides a prescriptive map to organise the debate. It reviews the current discussion of ethical aspects of algorithms. And it assesses the available literature in order to identify areas requiring further work to develop the ethics of algorithms.

Mittelstadt, Brent, Chris Russell, and Sandra Wachter. “Explaining explanations in AI.” In Proceedings of FAT*’19: Conference on fairness, accountability, and transparency, Atlanta, Georgia, January 2019, 279-289. ACM. Tags: Explainability, interpretability, accountability.
Recent work on interpretability in machine learning and AI has focused on the building of simplified models that approximate the true criteria used to make decisions. These models are a useful pedagogical device for teaching trained professionals how to predict what decisions will be made by the complex system, and most importantly how the system might break. However, when considering any such model it’s important to remember Box’s maxim that "All models are wrong but some are useful." We focus on the distinction between these models and explanations in philosophy and sociology. These models can be understood as a "do it yourself kit" for explanations, allowing a practitioner to directly answer "what if questions" or generate contrastive explanations without external assistance. Although a valuable ability, giving these models as explanations appears more difficult than necessary, and other forms of explanation may not have the same trade-offs. We contrast the different schools of thought on what makes an explanation, and suggest that machine learning might benefit from viewing the problem more broadly.

Monea, Alexander. “Race and computer vision.” In The democratization of artificial intelligence: Net politics in the era of learning algorithms, edited by Andreas Sudmann, 189–208. Bielefeld, Germany: Transcript (distributed by Columbia University Press), 2019. Tags: Computer vision, race, race and visuality, interpretability
Any analysis of the intersection of democracy with AI must first and foremost engage the intersection of AI with pre-existing practices of marginalization. In the United States, perhaps no intersection is more salient than that of AI and race. As AI is increasingly positioned as the future of the economy, the military, state bureaucracy, communication and transportation across time and space, in short, as the bedrock of humanity’s future, questions of how AI intersects with pre-existing practices of racial marginalization become central. These questions are particularly difficult to answer given the black-boxed nature of most contemporary AI systems. While it is certainly a worthwhile endeavor to push for increasing transparency into the datasets and algorithms powering AI systems, that transparency lies in an anticipated future and cannot help us now to analyze the operations of current AI systems. This picture is only complicated by the fact that AI systems, particularly those operating at web scale, are difficult for even their engineers to understand at later stages in their operation. For instance, a programmer may be able to easily describe the seed data and the machine learning algorithm that she started with, but may be completely unable to explain the rationale behind the subsequent classifications that the system learns to make. Again, it is certainly worthwhile to call for AI explicability—namely, requiring AI programmers and engineers to develop systems that can explain their decision-making processes or, in the most extreme case, only make decisions that can be explained clearly to a human—but this again is an anticipated future that is of little use in answering the immediate question of race and AI, which already has dire consequences at this very moment.

Olah, Chris, Arvind Satyanarayan, Ian Johnson, Shan Carter, Ludwig Schubert, Katherine Ye, and Alexander Mordvintsev. “The building blocks of interpretability.” Distill (2018). https://doi.org/10.23915/distill.00010. Tags: Interpretability
With the growing success of neural networks, there is a corresponding need to be able to explain their decisions — including building confidence about how they will behave in the real-world, detecting model bias, and for scientific curiosity. In order to do so, we need to both construct deep abstractions and reify (or instantiate) them in rich interfaces. With a few exceptions, existing work on interpretability fails to do these in concert.

Parisi, Luciana. “Reprogramming Decisionism.” e-flux journal 85 (2017). Tags: Decisionism, affect, post-truth
Post-truth politics is the art of relying on affective predispositions or reactions already known or expressed to stage old beliefs as though they were new. Algorithms are said to capitalize on these predispositions or reactions recorded as random data traces left when we choose this or that music track, this or that pair of shorts, this or that movie streaming website. In other words, the post-truth computation machine does not follow its own internal, binary logic of either/or, but follows instead whatever logic we leave enclosed within our random selections. To the extent that post-truth politics has a computational machine, then, this machine is no longer digital, because it is no longer concerned with verifying and explaining problems. The logic of this machine has instead gone meta-digital because it is no longer concerned with the correlation between truths or ideas on the one hand, and proofs or facts on the other, but is instead overcome by a new level of automated communication enabled by the algorithmic quantification of affects.

Parisi, Luciana. “The alien subject of AI.” Subjectivity 12, no. 1 (March 2019): 27–48. https://doi.org/10.1057/s41286-018-00064-3. Tags: Subjectivity
Immersed in the networks of artificial intelligences that are constantly learning from each other, the subject today is being configured by the automated architecture of a computational sovereignty (Bratton 2015). All levels of decision-making are harnessed in given sets of probabilities where the individuality of the subject is broken into endlessly divisible digits. These are specifically re-assembled at check points (Deleuze in Negotiations: 1972–1990, Columbia University Press, New York, 1995), in ever growing actions of predictive data (Cheney-Lippold in We are data and the making of our digital selves, NYU Press, New York, 2017), where consciousness is replaced by mindless computations (Daston in “The rule of rules”, lecture Wissenschaftskolleg Berlin, November 21st, 2010). As a result of the automation of cognition, the subject has thus become ultimately deprived of the transcendental tool of reason. This article discusses the consequences of this crisis of conscious cognition at the hands of machines by asking whether the servo-mechanic model of technology can be overturned to expose the alien subject of artificial intelligence as a mode of thinking originating at, but also beyond, the transcendental schema of the self-determining subject. As much as the socio-affective qualities of the user have become the primary sources of capital abstraction, value, quantification and governmental control, so has technology, as the means of abstraction, itself changed nature. This article will suggest that the cybernetic network of communication has not only absorbed physical and cognitive labour into its circuits of reproduction, but is, more importantly, learning from human culture, through the data analysis of behaviours, the contextual use of content and the sourcing of knowledge. The theorisation of machine learning as involving a process of thinking will be taken here as a fundamental inspiration to argue for the expansion of an alien space of reasoning, envisioning the possibility of machine thinking against the servo-mechanic model of cybernetics.

Pasquinelli, Matteo. “Machines That Morph Logic: Neural Networks and the Distorted Automation of Intelligence as Statistical Inference.” Glass Bead site 1 (2017). Tags: Neural networks, perception, phenomenology
The term Artificial Intelligence is often cited in popular press as well as in art and philosophy circles as an alchemic talisman whose functioning is rarely explained. The hegemonic paradigm to date (also crucial to the automation of labour) is not based on GOFAI (Good Old-Fashioned Artificial Intelligence that never succeeded at automating symbolic deduction) but on the neural networks designed by Frank Rosenblatt back in 1958 to automate statistical induction. The text highlights the role of logic gates in the distributed architecture of neural networks, in which a generalized control loop affects each node of computation to perform pattern recognition. In this distributed and adaptive architecture of logic gates, rather than applying logic to information top-down, information turns into logic, that is a representation of the world becomes a new function in the same world description. This basic formulation is suggested as a more accurate definition of learning to challenge the idealistic definition of (artificial) intelligence. If pattern recognition via statistical induction is the most accurate descriptor of what is popularly termed Artificial Intelligence, the distorting effects of statistical induction on collective perception, intelligence and governance (over-fitting, apophenia, algorithmic bias, ‘deep dreaming’, etc.) are yet to be fully understood.

Pasquinelli, Matteo. “Three Thousand Years of Algorithmic Rituals: The Emergence of AI from the Computation of Space.” e-flux journal 101 (2019). Tags: ML Genealogy.
In a fascinating myth of cosmogenesis from the ancient Vedas, it is said that the god Prajapati was shattered into pieces by the act of creating the universe. After the birth of the world, the supreme god is found dismembered, undone. In the corresponding Agnicayana ritual, Hindu devotees symbolically recompose the fragmented body of the god by building a fire altar according to an elaborate geometric plan. The fire altar is laid down by aligning thousands of bricks of precise shape and size to create the profile of a falcon. Each brick is numbered and placed while reciting its dedicated mantra, following step-by-step instructions. Each layer of the altar is built on top of the previous one, conforming to the same area and shape. Solving a logical riddle that is the key of the ritual, each layer must keep the same shape and area of the contiguous ones, but using a different configuration of bricks. Finally, the falcon altar must face east, a prelude to the symbolic flight of the reconstructed god towards the rising sun—an example of divine reincarnation by geometric means.

Pasquinelli, Matteo. “How a machine learns and fails – a grammar of error for Artificial Intelligence.” Spheres: A Journal for Digital Cultures 5 (November 2019). Tags: Shortcomings. With Velasco (2019)
What does it mean for intelligence and, in particular, for Artificial Intelligence to fail, to make a mistake, to break a rule? Reflecting upon a previous epoch of modern rationality, the epistemologist David Bates has argued that the novelty of the Enlightenment, as a quest for knowledge, was a new methodology of error rather than dogmatic instrumental reason. In contrast, the project of AI (that is to say, almost always, corporate AI), regardless of and maybe due to its dreams of superhuman cognition, falls short in recognising and discussing the limits, approximations, biases, errors, fallacies, and vulnerabilities that are native to its paradigm. A paradigm of rationality that fails at providing a methodology of error is bound, presumably, to end up as a caricature for puppetry fairs, as is the case with the flaunted idea of AGI (Artificial General Intelligence).

Pasquinelli, Matteo and Vladan Joler, “The Nooscope Manifested: Artificial Intelligence as Instrument of Knowledge Extractivism.” AI & SOCIETY 36, no. 4 (November 2020): 1263-1280. Tags: Nooscope, limitations
Some enlightenment regarding the project to mechanise reason. The assembly line of machine learning: data, algorithm, model. The training dataset: the social origins of machine intelligence. The history of AI as the automation of perception. The learning algorithm: compressing the world into a statistical model. All models are wrong, but some are useful. World to vector: the society of classification and prediction bots. Faults of a statistical instrument: the undetection of the new. Adversarial intelligence vs. statistical intelligence: labour in the age of AI. Diagram.

Phan, Thao and Scott Wark. “What personalisation can do for you! Or: how to do racial discrimination without ‘race’.” Culture Machine 20 (2021). Tags: Race, Facebook, homophily.
Between 2016 and 2020, Facebook allowed advertisers in the United States to target their advertisements using three broad ‘ethnic affinity’ categories: African American, U.S.-Hispanic, and Asian American. Superficially, these categories were supposed to allow advertisers to target demographic groups without using data about users’ race, which Facebook explicitly does not collect. This article uses the life and death of Facebook’s ‘ethnic affinity’ categories to argue that they exemplify a novel mode of racialisation made possible by machine learning techniques.

Possati, Luca M. “Algorithmic unconscious: why psychoanalysis helps in understanding AI.” Palgrave Communications 6, article 70 (April 2020). https://doi.org/10.1057/s41599-020-0445-0. Tags: Psychoanalysis, algorithmic unconscious.
The central hypothesis of this paper is that the concepts and methods of psychoanalysis can be applied to the study of AI and human/AI interaction. The paper connects three research fields: machine behavior approach, psychoanalysis and anthropology of science. In the “Machine behavior: research perspectives” section, I argue that the behavior of AI systems cannot be studied only in a logical-mathematical or engineering perspective. We need to study AI systems not merely as engineering artifacts, but as a class of social actors with particular behavioral patterns and ecology. Hence, AI behavior cannot be fully understood without human and social sciences. In the “Why an unconscious for AI? What this paper is about” section, I give some clarifications about the aims of the paper. In the “Unconscious and technology. Lacan and Latour” section, I introduce the central thesis. I propose a re-interpretation of Lacan’s psychoanalysis through Latour’s anthropology of sciences. The aim of this re-interpretation is to show that the concept of unconscious is not so far from technique and technology. In the “The difficulty of being an AI” section, I argue that AI is a new stage in the human identification process, namely, a new development of the unconscious identification. After the imaginary and symbolic registers, AI is the third register of identification. Therefore, AI extends the movement that is at work in the Lacanian interpretation of the mirror stage and Oedipus complex and which Latour’s reading helps us to clarify. From this point of view, I describe an AI system as a set of three contrasting forces: the human desire for identification, logic and machinery. In the “Miscomputation and information” section, I show how this interpretative model improves our understanding of AI.

Radford, Alec, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. “Improving Language Understanding by Generative Pre-Training.” Preprint, OpenAI, June 2018. Tags: NLP, GPT, technical paper.
Natural language understanding comprises a wide range of diverse tasks such as textual entailment, question answering, semantic similarity assessment, and document classification. Although large unlabeled text corpora are abundant, labeled data for learning these specific tasks is scarce, making it challenging for discriminatively trained models to perform adequately. We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve effective transfer while requiring minimal changes to the model architecture. We demonstrate the effectiveness of our approach on a wide range of benchmarks for natural language understanding. Our general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon the state of the art in 9 out of the 12 tasks studied. For instance, we achieve absolute improvements of 8.9% on commonsense reasoning (Stories Cloze Test), 5.7% on question answering (RACE), and 1.5% on textual entailment (MultiNLI).

Radford, Alec, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. “Language models are unsupervised multitask learners.” Preprint, submitted February 2019. Tags: NLP, GPT-2, technical paper.
Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset - matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
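For readers who want to see the zero-shot generation described here in practice, the publicly released GPT-2 weights can be queried through the Hugging Face transformers library. The snippet below is a minimal sketch assuming transformers and a PyTorch backend are installed; the prompt is arbitrary and the sampled continuation will vary from run to run.

```python
# A small sketch of zero-shot text generation with the publicly released GPT-2
# weights via the Hugging Face `transformers` library.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
outputs = generator(
    "Natural language processing tasks are typically approached with",
    max_new_tokens=40,          # length of the generated continuation
    num_return_sequences=1,
)
print(outputs[0]["generated_text"])
```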

Ramesh, Aditya, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, and Ilya Sutskever. “Zero-Shot Text-to-Image Generation.” Preprint, submitted February 2021. arXiv:2102.12092. Tags: DALL-E, text-to-image, technical paper.
Text-to-image generation has traditionally focused on finding better modeling assumptions for training on a fixed dataset. These assumptions might involve complex architectures, auxiliary losses, or side information such as object part labels or segmentation masks supplied during training. We describe a simple approach for this task based on a transformer that autoregressively models the text and image tokens as a single stream of data. With sufficient data and scale, our approach is competitive with previous domain-specific models when evaluated in a zero-shot fashion.

Raval, Noopur. “An Agenda for Decolonizing Data Science.” Spheres: A Journal for Digital Cultures 5 (November 2019). Tags: genealogy, history, decolonization.
In his essay “Technologies of Power: From Area Studies to Big Data”, Manan Asif charts a fascinating continuity between early 20th century philological projects that were funded by the United States through a range of state and private entities and resulted in the field of ‘area studies’ and its echoes in the project of (big) data science studies. As Asif points out, in the aftermath of the Second World War, with the dawn of “the universal age” and the centring of the United States as the new global hegemon, the question of knowing the world arose – its peoples, areas, their ideological leanings and the way in which they could be managed. The Cold War era’s anxieties about identifying, ‘sniffing out’ and converting countries that were supposedly in the danger of, or on the brink of falling prey to communism is also striking to me to the extent that this political ideology appeared as a certain predisposition that threatened the US-led new empire and needed to be quelled at all costs.

Rieder, Bernhard. “Big Data and the Paradox of Diversity.” Digital Culture & Society 2, no. 2 (2016): 39–54. Tags: Big data, epistemology, digital methods.
This paper develops a critique of Big Data and associated analytical techniques by focusing not on errors – skewed or imperfect datasets, false positives, underrepresentation, and so forth – but on data mining that works. After a quick framing of these practices as interested readings of reality, I address the question of how data analytics and, in particular, machine learning reveal and operate on the structured and unequal character of contemporary societies, installing “economic morality” (Allen 2012) as the central guiding principle. Rather than critiquing the methods behind Big Data, I inquire into the way these methods make the many differences in decentred, non-traditional societies knowable and, as a consequence, ready for profitable distinction and decision-making. The objective, in short, is to add to our understanding of the “profound ideological role at the intersection of sociality, research, and commerce” (van Dijck 2014: 201) the collection and analysis of large quantities of multifarious data have come to play. Such an understanding needs to embed Big Data in a larger, more fundamental critique of the societal context it operates in.

Rosenfeld, Amir, Mahdi Biparva, and John K. Tsotsos. “Priming neural networks.” In Proceedings of The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Salt Lake City, Utah, June 2018, 2124–2133. Tags: Computer vision, technical paper
Visual priming is known to affect the human visual system to allow detection of scene elements, even those that may have been near unnoticeable before, such as the presence of camouflaged animals. This process has been shown to be an effect of top-down signaling in the visual system triggered by the said cue. In this paper, we propose a mechanism to mimic the process of priming in the context of object detection and segmentation. We view priming as having a modulatory, cue dependent effect on layers of features within a network. Our results show how such a process can be complementary to, and at times more effective than simple post-processing applied to the output of the network, notably so in cases where the object is hard to detect such as in severe noise, small size or atypical appearance. Moreover, we find the effects of priming are sometimes stronger when early visual layers are affected. Overall, our experiments confirm that top-down signals can go a long way in improving object detection and segmentation.

Rouast, Philipp V., Marc T. P. Adam and Raymond Chiong. “Deep Learning for Human Affect Recognition: Insights and New Developments.” IEEE Transactions on Affective Computing 12, no. 2 (June 2021): 524-543. doi: 10.1109/TAFFC.2018.2890471. Tags: Affect recognition, deep learning, HCI
Automatic human affect recognition is a key step towards more natural human-computer interaction. Recent trends include recognition in the wild using a fusion of audiovisual and physiological sensors, a challenging setting for conventional machine learning algorithms. Since 2010, novel deep learning algorithms have been applied increasingly in this field. In this paper, we review the literature on human affect recognition between 2010 and 2017, with a special focus on approaches using deep neural networks. By classifying a total of 950 studies according to their usage of shallow or deep architectures, we are able to show a trend towards deep learning. Reviewing a subset of 233 studies that employ deep neural networks, we comprehensively quantify their applications in this field. We find that deep learning is used for learning of (i) spatial feature representations, (ii) temporal feature representations, and (iii) joint feature representations for multimodal sensor data. Exemplary state-of-the-art architectures illustrate the progress. Our findings show the role deep architectures will play in human affect recognition, and can serve as a reference point for researchers working on related applications.

Rudin, Cynthia. “Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead.” Nature Machine Intelligence 1, no. 5 (September 2019): 206–215. Tags: Black boxes, explainability, model selection.
Black box machine learning models are currently being used for high stakes decision-making throughout society, causing problems throughout healthcare, criminal justice, and in other domains. People have hoped that creating methods for explaining these black box models will alleviate some of these problems, but trying to explain black box models, rather than creating models that are interpretable in the first place, is likely to perpetuate bad practices and can potentially cause catastrophic harm to society. There is a way forward – it is to design models that are inherently interpretable. This manuscript clarifies the chasm between explaining black boxes and using inherently interpretable models, outlines several key reasons why explainable black boxes should be avoided in high-stakes decisions, identifies challenges to interpretable machine learning, and provides several example applications where interpretable models could potentially replace black box models in criminal justice, healthcare, and computer vision.
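As one concrete example of the kind of inherently interpretable model the paper argues for, the sketch below fits a shallow decision tree whose complete decision logic can be printed and audited. The dataset is scikit-learn's bundled breast cancer data, used here purely as a stand-in rather than as one of the paper's case studies.

```python
# A sketch of an inherently interpretable alternative to a black-box model:
# a shallow decision tree whose full decision logic can be printed and read.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
print(f"held-out accuracy: {tree.score(X_test, y_test):.2f}")
# The entire model, rendered as human-readable if/else rules.
print(export_text(tree, feature_names=list(X.columns)))
```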

Schmidt, Anna and Michael Wiegand. “A Survey on Hate Speech Detection using Natural Language Processing.” In Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media, Valencia, Spain, April 2017, 1–10. Stroudsburg, PA: Association for Computational Linguistics. Tags: NLP, sociality, hate speech, social media.
This paper presents a survey on hate speech detection. Given the steadily growing body of social media content, the amount of online hate speech is also increasing. Due to the massive scale of the web, methods that automatically detect hate speech are required. Our survey describes key areas that have been explored to automatically recognize these types of utterances using natural language processing. We also discuss limits of those approaches.

Schwartz, Oscar. “Competing Visions for AI: Turing, Licklider and Generative Literature.” Digital Culture & Society 4, no. 1 (June 2018): 87–105. doi: https://doi.org/10.25969/mediarep/13527. Tags: History, posthumanism, Turing.
In this paper, I will investigate how two competing visions of machine intelligence put forward by Alan Turing and J.C.R Licklider – one that emphasized automation and another that emphasized augmentation – have informed experiments in computational creativity, from early attempts at computer-generated art and poetry in the 1960s, up to recent experiments that utilise Machine Learning to generate paintings and music. I argue that while our technological capacities have changed, the foundational conflict between Turing’s vision and Licklider’s vision plays itself out in generations of programmers and artists who explore the computer’s creative potential. Moreover, I will demonstrate that this conflict does not only inform technical/artistic practice, but speaks to a deeper philosophical and ideological divide concerning the narrative of a post-human future. While Turing’s conception of human-equivalent AI informs a transhumanist imaginary of super-intelligent, conscious, anthropomorphic machines, Licklider’s vision of symbiosis underpins formulations of the cyborg as human-machine hybrid, aligning more closely with a critical post-human imaginary in which boundaries between the human and technological become mutable and up for re-negotiation. In this article, I will explore how one of the functions of computational creativity is to highlight, emphasise and sometimes thematise these conflicting post-human imaginaries.

Seaver, Nick. “Algorithms as culture: Some tactics for the ethnography of algorithmic systems.” Big Data & Society 4, no. 2 (December 2017): 1-12. doi:10.1177/2053951717738104. Tags: ethnography, culture
This article responds to recent debates in critical algorithm studies about the significance of the term “algorithm.” Where some have suggested that critical scholars should align their use of the term with its common definition in professional computer science, I argue that we should instead approach algorithms as “multiples”—unstable objects that are enacted through the varied practices that people use to engage with them, including the practices of “outsider” researchers. This approach builds on the work of Laura Devendorf, Elizabeth Goodman, and Annemarie Mol. Different ways of enacting algorithms foreground certain issues while occluding others: computer scientists enact algorithms as conceptual objects indifferent to implementation details, while calls for accountability enact algorithms as closed boxes to be opened. I propose that critical researchers might seek to enact algorithms ethnographically, seeing them as heterogeneous and diffuse sociotechnical systems, rather than rigidly constrained and procedural formulas. To do so, I suggest thinking of algorithms not “in” culture, as the event occasioning this essay was titled, but “as” culture: part of broad patterns of meaning and practice that can be engaged with empirically. I offer a set of practical tactics for the ethnographic enactment of algorithmic systems, which do not depend on pinning down a singular “algorithm” or achieving “access,” but which rather work from the partial and mobile position of an outsider.

Seaver, Nick. “Captivating algorithms: Recommender systems as traps.” Journal of Material Culture 24, no. 4 (December 2019): 421-436. Tags: Recommender Systems, behaviorism, infrastructure
Algorithmic recommender systems are a ubiquitous feature of contemporary cultural life online, suggesting music, movies, and other materials to their users. This article, drawing on fieldwork with developers of recommender systems in the US, describes a tendency among these systems’ makers to describe their purpose as ‘hooking’ people – enticing them into frequent or enduring usage. Inspired by steady references to capture in the field, the author considers recommender systems as traps, drawing on anthropological theories about animal trapping. The article charts the rise of ‘captivation metrics’ – measures of user retention – enabled by a set of transformations in recommenders’ epistemic, economic, and technical contexts. Traps prove useful for thinking about how such systems relate to broader infrastructural ecologies of knowledge and technology. As recommenders spread across online cultural infrastructures and become practically inescapable, thinking with traps offers an alternative to common ethical framings that oppose tropes of freedom and coercion.

Segar, Matthew W., Byron C. Jaeger, Kershaw V. Patel, Vijay Nambi, Chiadi E. Ndumele, Adolfo Correa, Javed Butler, Alvin Chandra, Colby Ayers, Shreya Rao, Alana A. Lewis, Laura M. Raffield, Carlos J. Rodriguez, Erin D. Michos, Christie M. Ballantyne, Michael E. Hall, Robert J. Mentz, James A. de Lemos, and Ambarish Pandey. “Development and Validation of Machine Learning–Based Race-Specific Models to Predict 10-Year Risk of Heart Failure: A Multicohort Analysis.” Circulation 143, no. 24 (April 2021): 2370-2383. Tags: Epidemiology, race as covariate, race.
Heart failure (HF) risk and the underlying risk factors vary by race. Traditional models for HF risk prediction treat race as a covariate in risk prediction and do not account for significant parameters such as cardiac biomarkers. Machine learning (ML) may offer advantages over traditional modeling techniques to develop race-specific HF risk prediction models and to elucidate important contributors of HF development across races.

Selbst, Andrew D., danah m. boyd, Sorelle A. Friedler, Suresh Venkatasubramanian, and Janet Vertesi. “Fairness and Abstraction in Sociotechnical Systems.” In Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* ’19, Atlanta, GA, January 2019, 59–68. New York: ACM. Tags: Social questions, data ethics
A key goal of the fair-ML community is to develop machine-learning based systems that, once introduced into a social context, can achieve social and legal outcomes such as fairness, justice, and due process. Bedrock concepts in computer science—such as abstraction and modular design—are used to define notions of fairness and discrimination, to produce fairness-aware learning algorithms, and to intervene at different stages of a decision-making pipeline to produce "fair" outcomes. In this paper, however, we contend that these concepts render technical interventions ineffective, inaccurate, and sometimes dangerously misguided when they enter the societal context that surrounds decision-making systems. We outline this mismatch with five "traps" that fair-ML work can fall into even as it attempts to be more context-aware in comparison to traditional data science. We draw on studies of sociotechnical systems in Science and Technology Studies to explain why such traps occur and how to avoid them. Finally, we suggest ways in which technical designers can mitigate the traps through a refocusing of design in terms of process rather than solutions, and by drawing abstraction boundaries to include social actors rather than purely technical ones.

Sterkenburg, Tom and Peter D. Grünwald. “The no-free-lunch theorems of supervised learning.” Synthese 199, no. 4 (December 2021): 9979–10015. https://doi.org/10.1007/s11229-021-03233-1. Tags: Inductive biases, differences in algorithms, no-free-lunch theorems.
The no-free-lunch theorems promote a skeptical conclusion that all possible machine learning algorithms equally lack justification. But how could this leave room for a learning theory that shows that some algorithms are better than others? Drawing parallels to the philosophy of induction, we point out that the no-free-lunch results presuppose a conception of learning algorithms as purely data-driven. On this conception, every algorithm must have an inherent inductive bias that wants justification. We argue that many standard learning algorithms should rather be understood as model-dependent: in each application they also require as input a model, representing a bias. As generic algorithms, they can then be given a model-relative justification.

Stevens, Nikki and Os Keyes. “Seeing infrastructure: race, facial recognition and the politics of data.” Cultural Studies 35, no. 4-5 (March 2021): 833-853. Tags: Social questions, race, Face Recognition Technology, datasets, surveillance
Facial recognition technology (FRT) has been widely studied and criticized for its racialising impacts and its role in the overpolicing of minoritised communities. However, a key aspect of facial recognition technologies is the dataset of faces used for training and testing. In this article, we situate FRT as an infrastructural assemblage and focus on the history of four facial recognition datasets: the original dataset created by W.W. Bledsoe and his team at the Panoramic Research Institute in 1963; the FERET dataset collected by the Army Research Laboratory in 1995; MEDS-I (2009) and MEDS-II (2011), the datasets containing dead arrestees, curated by the MITRE Corporation; and the Diversity in Faces dataset, created in 2019 by IBM. Through these four exemplary datasets, we suggest that the politics of race in facial recognition are about far more than simply representation, raising questions about the potential side-effects and limitations of efforts to simply ‘de-bias’ data.

Sudmann, Andreas. “On the Media-political Dimension of Artificial Intelligence: Deep Learning as a Black Box and OpenAI.” Digital Culture & Society 4, no. 1 (October 2018): 181-200. Tags: Deep learning, black boxes, interpretability, OpenAI, infrastructure
The essay critically investigates the media-political dimension of modern AI technology. Rather than examining the political aspects of certain AI-driven applications, the main focus of the paper is centred around the political implications of AI’s technological infrastructure, especially with regard to the machine learning approach that since around 2006 has been called Deep Learning (also known as the simulation of Artificial Neural Networks). Firstly, the paper discusses the extent to which Deep Learning is a fundamentally opaque black box technology, only partially accessible to human understanding. Secondly, and in relation to the first question, the essay takes a critical look at the agenda and activities of the research company OpenAI, which supposedly aims to promote the democratization of AI and tries to make technologies like Deep Learning more accessible and transparent.

Szegedy, Christian, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. “Intriguing Properties of Neural Networks.” Preprint, submitted February 2014. arXiv:1312.6199. Tags: knowledge representation, adversariality, deep neural networks, technical paper
Deep neural networks are highly expressive models that have recently achieved state of the art performance on speech and visual recognition tasks. While their expressiveness is the reason they succeed, it also causes them to learn uninterpretable solutions that could have counter-intuitive properties. In this paper we report two such properties. First, we find that there is no distinction between individual high level units and random linear combinations of high level units, according to various methods of unit analysis. It suggests that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks. Second, we find that deep neural networks learn input-output mappings that are fairly discontinuous to a significant extent. We can cause the network to misclassify an image by applying a certain hardly perceptible perturbation, which is found by maximizing the network’s prediction error. In addition, the specific nature of these perturbations is not a random artifact of learning: the same perturbation can cause a different network, that was trained on a different subset of the dataset, to misclassify the same input.
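The "hardly perceptible perturbation" described here can be illustrated with a gradient-based attack. The PyTorch sketch below takes a single fast-gradient-sign step rather than the box-constrained optimisation used in the paper, and runs it on a small untrained network with random input, purely to show the mechanics; on a trained classifier, perturbations of this kind routinely flip the prediction.

```python
# A PyTorch sketch of a gradient-based adversarial perturbation. Single
# fast-gradient-sign step, untrained toy network, random "image" data:
# a simplification of the paper's method, for illustration only.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(1, 1, 28, 28, requires_grad=True)   # stand-in for an input image
y = torch.tensor([3])                               # an arbitrary "true" label

loss = loss_fn(model(x), y)
loss.backward()                                     # gradient of the loss w.r.t. the input

epsilon = 0.05
# Step in the direction that increases the loss, keeping pixel values in [0, 1].
x_adv = (x + epsilon * x.grad.sign()).clamp(0, 1).detach()

print("original prediction:", model(x).argmax(dim=1).item())
print("perturbed prediction:", model(x_adv).argmax(dim=1).item())
```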

Velasco, Pablo R. “Artificial Intelligibility and Proxy Error - A Comment on How a Machine Learns and Fails.” Spheres: A Journal for Digital Cultures 5 (November 2019). Tags: Black boxes, neural networks. With Pasquinelli (2019)
A representative moment of Artificial Intelligence (AI) capturing the social imaginary took place in March 2016, when Google’s AlphaGo computer program beat professional Go player Lee Sedol. Ten years before, IBM’s Deep Blue computer defeated chess grandmaster Garry Kasparov. It is worth revisiting two major insights that a decade of ‘intelligent’ machines left. First, an image search of both terms – “ibm deepblue” and “google alphago” – would reveal photos of both Kasparov’s and Sedol’s struggling matches, but it is also significantly telling that the Deep Blue query will also return a squared black box. After all, Deep Blue was primarily an advanced piece of hardware, while AlphaGo takes the stage as software. Both consist, of course, of a coupling of logical instructions and computing power, but the machinery in the case of AlphaGo is shown as less relevant, less present. Second, Deep Blue’s advanced hardware was needed to run its Minimax algorithm, a non-probabilistic method for minimizing losing scenarios: for each move made, it examines possible reactions from the opponent in future turns, as far away as the computing power allows. As complex as this algorithm is, it is relatively easy to understand. As for AlphaGo, this particular issue differs noticeably. An article calling for the demystification of AI in the Scientific American journal quotes Alan Winfield, professor of robot ethics at the University of the West of England, on the neural networks that make AI systems like AlphaGo work: “It’s very difficult to find out why [a neural net] made a particular decision […] We still can’t explain it”. There is no need for an apocalyptic position regarding an inescapable black box here: it is possible to review every parameter in the neural network behind AlphaGo’s resolution; in this sense it is a box that can be opened. However, the article continues, the ‘meaning’ of the decision is not exactly intelligible, as it is encoded in billions of connections.

Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. “Attention is all you need.” In Proceedings of Advances in Neural Information Processing Systems 30, Long Beach, California, December 2017. Red Hook, NY: Curran Associates, Inc. Tags: Attention, Transformers, technical paper.
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.0 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature.
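The attention mechanism the architecture is built on reduces to a short computation. The NumPy sketch below implements single-head scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, on random toy matrices, leaving out masking and the multi-head projections described in the paper.

```python
# A NumPy sketch of scaled dot-product attention, the core operation of the
# Transformer: Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V.
# Single head, no masking, random toy inputs.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)  # query-key similarity scores
    weights = softmax(scores, axis=-1)              # each query's distribution over keys
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d_model = 5, 16
Q = rng.normal(size=(seq_len, d_model))
K = rng.normal(size=(seq_len, d_model))
V = rng.normal(size=(seq_len, d_model))

output, attn = scaled_dot_product_attention(Q, K, V)
print(output.shape, attn.shape)  # (5, 16) (5, 5)
```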

Wachter, Sandra, Brent Mittelstadt, and Chris Russell. “Counterfactual explanations without opening the black box: Automated decisions and the GDPR.” Harvard Journal of Law and Technology 31, no. 2 (April 2018): 841–887. Tags: Explainability, law, black boxes
There has been much discussion of the existence of a “right to explanation” in the EU General Data Protection Regulation (“GDPR”), and its merits and disadvantages. Attempts to implement a right to explanation that opens the “black box” to provide insight into the internal decision-making process of algorithms face four major legal and technical barriers. First, a legally binding right to explanation does not exist in the GDPR. Second, even if legally binding, the right would only apply in limited cases (when a negative decision was solely automated and had legal or other similar significant effects). Third, explaining the functionality of complex algorithmic decision-making systems and their rationale in specific cases is a technically challenging problem. Explanations may likewise offer little meaningful information to data subjects, raising questions about their value. Finally, data controllers have an interest in not sharing details of their algorithms to avoid disclosing trade secrets, violating the rights and freedoms of others (e.g. privacy), and allowing data subjects to game or manipulate the decision-making system.

Wang, Qianwen, Jun Yuan, Shuxin Chen, Hang Su, Huamin Qu, and Shixia Liu. "Visual Genealogy of Deep Neural Networks." IEEE Transactions on Visualization and Computer Graphics 26, no. 11 (November 2020): 3340–3352. doi: 10.1109/TVCG.2019.2921323. Tags: Information visualization, Deep neural networks, interpretability.
A comprehensive and comprehensible summary of existing deep neural networks (DNNs) helps practitioners understand the behaviour and evolution of DNNs, offers insights for architecture optimization, and sheds light on the working mechanisms of DNNs. However, this summary is hard to obtain because of the complexity and diversity of DNN architectures. To address this issue, we develop DNN Genealogy, an interactive visualization tool, to offer a visual summary of representative DNNs and their evolutionary relationships. DNN Genealogy enables users to learn DNNs from multiple aspects, including architecture, performance, and evolutionary relationships. Central to this tool is a systematic analysis and visualization of 66 representative DNNs based on our analysis of 140 papers. A directed acyclic graph is used to illustrate the evolutionary relationships among these DNNs and highlight the representative DNNs. A focus + context visualization is developed to orient users during their exploration. A set of network glyphs is used in the graph to facilitate the understanding and comparing of DNNs in the context of the evolution. Case studies demonstrate that DNN Genealogy provides helpful guidance in understanding, applying, and optimizing DNNs. DNN Genealogy is extensible and will continue to be updated to reflect future advances in DNNs.

Ward, Logan, Ankit Agrawal, Alok Choudhary, and Christopher Wolverton. “A general-purpose machine learning framework for predicting properties of inorganic materials.” npj Computational Materials 2 (August 2016): Article 16028. Tags: Materials.
A very active area of materials research is to devise methods that use machine learning to automatically extract predictive models from existing materials data. While prior examples have demonstrated successful models for some applications, many more applications exist where machine learning can make a strong impact. To enable faster development of machine-learning-based models for such applications, we have created a framework capable of being applied to a broad range of materials data. Our method works by using a chemically diverse list of attributes, which we demonstrate are suitable for describing a wide variety of properties, and a novel method for partitioning the data set into groups of similar materials to boost the predictive accuracy. In this manuscript, we demonstrate how this new method can be used to predict diverse properties of crystalline and amorphous materials, such as band gap energy and glass-forming ability.
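The partition-then-predict idea summarized above can be sketched generically; the snippet below uses synthetic features and scikit-learn stand-ins rather than the paper’s chemically informed attribute set or its actual partitioning method.

```python
# Generic "partition, then fit a model per group" sketch with synthetic data;
# the features, clustering choice, and regressor are stand-ins, not the
# attribute set or partitioning method used in the paper.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                          # stand-in attribute vectors
y = X[:, 0] * np.sign(X[:, 1]) + 0.1 * rng.normal(size=500)

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
models = {}
for g in np.unique(km.labels_):
    mask = km.labels_ == g                              # one regressor per group
    models[g] = RandomForestRegressor(n_estimators=100, random_state=0).fit(X[mask], y[mask])

x_new = rng.normal(size=(1, 10))
group = km.predict(x_new)[0]                            # route to the matching group
print(models[group].predict(x_new))
```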

Watson, David and Luciano Floridi. "The Explanation Game: A Formal Framework for Interpretable Machine Learning." Synthese 198, no. 10 (April 2020): 9211–9242. Tags: Interpretability, game theory (methodology)
We propose a formal framework for interpretable machine learning. Combining elements from statistical learning, causal interventionism, and decision theory, we design an idealised explanation game in which players collaborate to find the best explanation(s) for a given algorithmic prediction. Through an iterative procedure of questions and answers, the players establish a three-dimensional Pareto frontier that describes the optimal trade-offs between explanatory accuracy, simplicity, and relevance. Multiple rounds are played at different levels of abstraction, allowing the players to explore overlapping causal patterns of variable granularity and scope. We characterise the conditions under which such a game is almost surely guaranteed to converge on a (conditionally) optimal explanation surface in polynomial time, and highlight obstacles that will tend to prevent the players from advancing beyond certain explanatory thresholds. The game serves a descriptive and a normative function, establishing a conceptual space in which to analyse and compare existing proposals, as well as design new and improved solutions.
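As a toy illustration of the three-dimensional frontier the authors describe (not their formal game), the non-dominated set over invented accuracy, simplicity, and relevance scores can be computed directly:

```python
# Invented scores for a handful of candidate explanations along the three
# criteria (accuracy, simplicity, relevance); the Pareto frontier is the set
# of candidates not dominated on all three at once.
candidates = {
    "sparse linear surrogate": (0.72, 0.95, 0.80),
    "full decision tree":      (0.90, 0.40, 0.75),
    "single rule":             (0.55, 1.00, 0.60),
    "deep surrogate model":    (0.88, 0.35, 0.70),
}

def dominated(a, b):
    """True if `a` is nowhere better than `b` and strictly worse somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

pareto = [name for name, score in candidates.items()
          if not any(dominated(score, other)
                     for o, other in candidates.items() if o != name)]
print(pareto)   # the deep surrogate is dominated by the full decision tree
```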

Watson, David. “The rhetoric and reality of anthropomorphism in Artificial Intelligence.” Minds and Machines 29, no. 3 (September 2019): 417–440. Tags: Anthropomorphism, Epistemology
Artificial intelligence (AI) has historically been conceptualized in anthropomorphic terms. Some algorithms deploy biomimetic designs in a deliberate attempt to effect a sort of digital isomorphism of the human brain. Others leverage more general learning strategies that happen to coincide with popular theories of cognitive science and social epistemology. In this paper, I challenge the anthropomorphic credentials of the neural network algorithm, whose similarities to human cognition I argue are vastly overstated and narrowly construed. I submit that three alternative supervised learning methods—namely lasso penalties, bagging, and boosting—offer subtler, more interesting analogies to human reasoning as both an individual and a social phenomenon. Despite the temptation to fall back on anthropomorphic tropes when discussing AI, however, I conclude that such rhetoric is at best misleading and at worst downright dangerous. The impulse to humanize algorithms is an obstacle to properly conceptualizing the ethical challenges posed by emerging technologies.
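The three method families Watson holds up against neural networks can be fitted side by side with scikit-learn; the snippet below is a generic sketch on synthetic data, not a reproduction of any analysis in the paper.

```python
# Lasso penalties, bagging, and boosting fitted side by side on synthetic
# regression data; dataset and hyperparameters are arbitrary.
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=300, n_features=20, noise=10.0, random_state=0)

models = {
    "lasso":    Lasso(alpha=1.0),                       # sparsity via an L1 penalty
    "bagging":  BaggingRegressor(n_estimators=50, random_state=0),
    "boosting": GradientBoostingRegressor(random_state=0),
}
for name, model in models.items():
    score = cross_val_score(model, X, y, cv=5).mean()   # mean R^2 over folds
    print(f"{name:9s}{score:.3f}")
```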

Weatherby, Leif and Brian Justie. “Indexical AI.” Critical Inquiry 48, no. 2 (January 2022): 381–415. https://doi.org/10.1086/717312. Tags: Neural networks, indexicality, semiotics
This article argues that the algorithms known as neural nets underlie a new form of artificial intelligence that we call indexical AI. Contrasting with the once dominant symbolic AI, large-scale learning systems have become a semiotic infrastructure underlying global capitalism. Their achievements are based on a digital version of the sign-function index, which points rather than describes. As these algorithms spread to parse the increasingly heavy data volumes on platforms, it becomes harder to remain skeptical of their results. We call social faith in these systems the naive iconic interpretation of AI and position their indexical function between heuristic symbol use and real intelligence, opening the black box to reveal semiotic function.

Wiens, Jenna and Erica S. Shenoy. “Machine Learning for Healthcare: On the Verge of a Major Shift in Healthcare Epidemiology.” Clinical Infectious Diseases 66, no. 1 (January 2018): 149–153. https://doi.org/10.1093/cid/cix731. Tags: Epidemiology, patient risk stratification, public health
The increasing availability of electronic health data presents a major opportunity in healthcare for both discovery and practical applications to improve healthcare. However, for healthcare epidemiologists to best use these data, computational techniques that can handle large complex datasets are required. Machine learning (ML), the study of tools and methods for identifying patterns in data, can help. The appropriate application of ML to these data promises to transform patient risk stratification broadly in the field of medicine and especially in infectious diseases. This, in turn, could lead to targeted interventions that reduce the spread of healthcare-associated pathogens. In this review, we begin with an introduction to the basics of ML. We then move on to discuss how ML can transform healthcare epidemiology, providing examples of successful applications. Finally, we present special considerations for those healthcare epidemiologists who want to use and apply ML.

Wu, Shijie and Mark Dredze. “Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT.” In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, November 2019, 833–844. Stroudsburg, PA: Association for Computational Linguistics. https://doi.org/10.18653/v1/D19-1077. Tags: BERT, multilingual machines, technical paper
Pretrained contextual representation models (Peters et al., 2018; Devlin et al., 2019) have pushed forward the state-of-the-art on many NLP tasks. A new release of BERT (Devlin, 2018) includes a model simultaneously pretrained on 104 languages with impressive performance for zero-shot cross-lingual transfer on a natural language inference task. This paper explores the broader cross-lingual potential of mBERT (multilingual) as a zero-shot language transfer model on 5 NLP tasks covering a total of 39 languages from various language families: NLI, document classification, NER, POS tagging, and dependency parsing. We compare mBERT with the best-published methods for zero-shot cross-lingual transfer and find mBERT competitive on each task. Additionally, we investigate the most effective strategy for utilizing mBERT in this manner, determine to what extent mBERT generalizes away from language-specific features, and measure factors that influence cross-lingual transfer.
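A minimal illustration of the single multilingual model at issue, assuming the Hugging Face transformers package and torch are installed; it shows only that one set of weights encodes sentences from different languages, and is not the paper’s zero-shot transfer evaluation.

```python
# One multilingual model, two languages: mBERT encodes both sentences with a
# single shared vocabulary and set of weights. Requires the Hugging Face
# `transformers` package and `torch`; the mean-pooled sentence vectors and the
# example sentences are only illustrative.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")

sentences = ["The cat sleeps on the sofa.", "El gato duerme en el sofá."]
batch = tok(sentences, padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state           # (2, seq_len, 768)

emb = hidden.mean(dim=1)                                 # crude sentence vectors
print(float(torch.nn.functional.cosine_similarity(emb[0], emb[1], dim=0)))
```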

Wu, Yonghui, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Lukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, Jeffrey Dean. “Google’s neural machine translation system: Bridging the gap between human and machine translation.” Preprint, submitted October 2016. arXiv:1609.08144. Tags: translation, language, neural networks, technical paper
Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems. Unfortunately, NMT systems are known to be computationally expensive both in training and in translation inference. Also, most NMT systems have difficulty with rare words. These issues have hindered NMT's use in practical deployments and services, where both accuracy and speed are essential. In this work, we present GNMT, Google's Neural Machine Translation system, which attempts to address many of these issues. Our model consists of a deep LSTM network with 8 encoder and 8 decoder layers using attention and residual connections. To improve parallelism and therefore decrease training time, our attention mechanism connects the bottom layer of the decoder to the top layer of the encoder. To accelerate the final translation speed, we employ low-precision arithmetic during inference computations. To improve handling of rare words, we divide words into a limited set of common sub-word units ("wordpieces") for both input and output. This method provides a good balance between the flexibility of "character"-delimited models and the efficiency of "word"-delimited models, naturally handles translation of rare words, and ultimately improves the overall accuracy of the system. Our beam search technique employs a length-normalization procedure and uses a coverage penalty, which encourages generation of an output sentence that is most likely to cover all the words in the source sentence. On the WMT'14 English-to-French and English-to-German benchmarks, GNMT achieves competitive results to state-of-the-art. Using a human side-by-side evaluation on a set of isolated simple sentences, it reduces translation errors by an average of 60% compared to Google's phrase-based production system.
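The beam-search heuristics mentioned in the abstract, length normalization and a coverage penalty, can be sketched roughly as follows; the functional forms follow the general shape described in the paper, but the constants and toy attention matrix here are placeholders.

```python
# Length normalisation and coverage penalty for scoring beam hypotheses,
# roughly in the shape described in the paper; alpha, beta, and the toy
# attention matrix below are placeholder values.
import math

def length_penalty(length, alpha=0.6):
    return ((5 + length) ** alpha) / ((5 + 1) ** alpha)

def coverage_penalty(attention, beta=0.2):
    # attention[i][j]: weight placed on source word i at target step j
    return beta * sum(math.log(min(sum(row), 1.0)) for row in attention)

def beam_score(log_prob, length, attention):
    return log_prob / length_penalty(length) + coverage_penalty(attention)

attn = [[0.5, 0.4], [0.5, 0.6]]            # 2 source words x 2 target steps
print(beam_score(log_prob=-4.2, length=2, attention=attn))
```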

Yee, Kyra, Uthaipon Tantipongpipat and Shubhanshu Mishra. “Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency.” Preprint, submitted September 2021. arXiv:2105.08667v2. Tags: Twitter, social questions, fairness.
Twitter uses machine learning to crop images, where crops are centered around the part predicted to be the most salient. In fall 2020, Twitter users raised concerns that the automated image cropping system on Twitter favored light-skinned over dark-skinned individuals, as well as concerns that the system favored cropping women’s bodies instead of their heads. In order to address these concerns, we conduct an extensive analysis using formalized group fairness metrics. We find systematic disparities in cropping and identify contributing factors, including the fact that the cropping based on the single most salient point can amplify the disparities because of an effect we term argmax bias. However, we demonstrate that formalized fairness metrics and quantitative analysis on their own are insufficient for capturing the risk of representational harm in automatic cropping. We suggest the removal of saliency-based cropping in favor of a solution that better preserves user agency. For developing a new solution that sufficiently addresses concerns related to representational harm, our critique motivates a combination of quantitative and qualitative methods that include human-centered design.
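The mechanism under scrutiny, cropping around the single most salient point, can be sketched in a few lines; the image and saliency map below are random arrays rather than Twitter’s model or data.

```python
# Crop a fixed-size window centred on the single most salient pixel, the
# step where the "argmax bias" discussed above can enter. The image and the
# saliency map are random arrays, not Twitter's model or data.
import numpy as np

def argmax_crop(image, saliency, crop_h, crop_w):
    y, x = np.unravel_index(np.argmax(saliency), saliency.shape)
    top = np.clip(y - crop_h // 2, 0, image.shape[0] - crop_h)
    left = np.clip(x - crop_w // 2, 0, image.shape[1] - crop_w)
    return image[top:top + crop_h, left:left + crop_w]

rng = np.random.default_rng(0)
image = rng.random((400, 600, 3))                       # stand-in photo
saliency = rng.random((400, 600))                       # stand-in saliency scores
print(argmax_crop(image, saliency, 200, 200).shape)     # (200, 200, 3)
```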

Yeh, Chih-Kuan, Been Kim, Sercan O. Arik, Chun-Liang Li, Tomas Pfister, Pradeep Ravikumar. “On Completeness-aware Concept-Based Explanations in Deep Neural Networks.” Preprint, submitted June 2020. arXiv:1910.07969. Tags: interpretability, concept-based explainability, Deep Neural Networks, technical paper
Human explanations of high-level decisions are often expressed in terms of key concepts the decisions are based on. In this paper, we study such concept-based explainability for Deep Neural Networks (DNNs). First, we define the notion of completeness, which quantifies how sufficient a particular set of concepts is in explaining a model's prediction behavior based on the assumption that complete concept scores are sufficient statistics of the model prediction. Next, we propose a concept discovery method that aims to infer a complete set of concepts that are additionally encouraged to be interpretable, which addresses the limitations of existing methods on concept explanations. To define an importance score for each discovered concept, we adapt game-theoretic notions to aggregate over sets and propose ConceptSHAP. Via proposed metrics and user studies, on a synthetic dataset with apriori-known concept explanations, as well as on real-world image and language datasets, we validate the effectiveness of our method in finding concepts that are both complete in explaining the decisions and interpretable. (The code is released at this https URL.)
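As a toy illustration of the game-theoretic aggregation behind ConceptSHAP (not the paper’s definition of completeness or its released code), exact Shapley values over a small, invented concept “completeness” function can be computed by enumerating subsets:

```python
# Exact Shapley values of three invented "concepts" with respect to a made-up
# completeness function eta(S); both the concepts and eta are fabricated for
# illustration and are not the paper's definition or code.
from itertools import combinations
from math import factorial

concepts = ["stripes", "wheels", "sky"]
eta = {frozenset(): 0.0,
       frozenset({"stripes"}): 0.4, frozenset({"wheels"}): 0.3, frozenset({"sky"}): 0.1,
       frozenset({"stripes", "wheels"}): 0.8, frozenset({"stripes", "sky"}): 0.5,
       frozenset({"wheels", "sky"}): 0.4,
       frozenset({"stripes", "wheels", "sky"}): 0.9}

def shapley(concept):
    n, total = len(concepts), 0.0
    others = [c for c in concepts if c != concept]
    for k in range(n):
        for S in combinations(others, k):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            total += weight * (eta[frozenset(S) | {concept}] - eta[frozenset(S)])
    return total

for c in concepts:
    print(c, round(shapley(c), 3))          # the three values sum to eta(all) = 0.9
```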

Yosinski, Jason, Jeff Clune, Anh Nguyen, Thomas Fuchs, and Hod Lipson. “Understanding neural networks through deep visualization.” In Proceedings of the Deep Learning Workshop at the International Conference on Machine Learning (ICML 2015), Lille, France, July 2015. Tags: Computer vision, black boxes, technical paper
Recent years have produced great advances in training large, deep neural networks (DNNs), including notable successes in training convolutional neural networks (convnets) to recognize natural images. However, our understanding of how these models work, especially what computations they perform at intermediate layers, has lagged behind. Progress in the field will be further accelerated by the development of better tools for visualizing and interpreting neural nets. We introduce two such tools here. The first is a tool that visualizes the activations produced on each layer of a trained convnet as it processes an image or video (e.g. a live webcam stream). We have found that looking at live activations that change in response to user input helps build valuable intuitions about how convnets work. The second tool enables visualizing features at each layer of a DNN via regularized optimization in image space. Because previous versions of this idea produced less recognizable images, here we introduce several new regularization methods that combine to produce qualitatively clearer, more interpretable visualizations. Both tools are open source and work on a pretrained convnet with minimal setup.
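The second tool’s core idea, regularized optimization in image space, reduces to gradient ascent on the input with respect to a chosen unit’s activation. Below is a bare-bones PyTorch sketch, assuming torch and torchvision are available; it uses a much simpler regularizer than the paper’s and is not the authors’ released tool.

```python
# Gradient ascent on the input image to maximise one class score, with a very
# simple L2 regulariser in place of the paper's combined regularisers.
# Requires torch and torchvision; the class index and step sizes are arbitrary,
# and this is not the authors' released tool.
import torch
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).eval()
for p in model.parameters():
    p.requires_grad_(False)

target_class = 130                                      # an arbitrary ImageNet class
img = torch.zeros(1, 3, 224, 224, requires_grad=True)
optimizer = torch.optim.Adam([img], lr=0.05)

for step in range(200):
    optimizer.zero_grad()
    score = model(img)[0, target_class]
    loss = -score + 1e-4 * img.norm()                   # maximise score, keep image small
    loss.backward()
    optimizer.step()
```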

Yu, Mingyi. “The Algorithm Concept, 1684–1958.” Critical Inquiry 47, no. 3 (March 2021): 592–609. https://doi.org/10.1086/713556. Tags: Genealogy
The word algorithm has become the default descriptor for anything vaguely computational to the extent that it appears synonymous with computing itself. It functions in this respect as the master signifier under which a spectrum of sense is subsumed, less a well-defined and stable expression than the vehicle through which innumerable concerns are projected. Commenting on this nebulous quality, Massimo Mazzotti has dubbed the term “a site of semantic confusion.” Yet, rather than “engaging in a taxonomic exercise to norm the usage of the word,” Mazzotti proposes that a more generative approach would “consider its flexible, ill-defined, and often inconsistent meanings as a resource: a messy map of our increasingly algorithmic life”—that is, he attempts to take “the omnipresent figure of the algorithm as an object that refracts collective expectations and anxieties.” Similarly, this study has little fondness for semantic discipline. Unlike Mazzotti, however, my focus here is primarily historical: How did the algorithm come to be such an “omnipresent figure”? What was at stake in aligning the computational with the algorithmic?

Zeilinger, Martin. “Generative Adversarial Copy Machines.” Culture Machine 20 (2021). Tags: GANs, AI art, authorship.
This essay explores the redistribution of expressive agency across human artists and non-human entities that inevitably occurs when artificial intelligence (AI) becomes involved in creative processes. In doing so, my focus is not on a 'becoming-creative' of AI in an anthropocentric sense of the term. Rather, my central argument is as follows: if AI systems will be (or already are) capable of generating outputs that can satisfy requirements by which creativity is currently being evaluated, validated, and valorised, then there is a potential for AI to disturb prevailing aesthetic and ontological assumptions concerning anthropocentrically framed ideals of the artist figure, the work of art, and the idea of creativity as such. I will elaborate this argument by way of a close reading of Generative Adversarial Network (GAN) technology and its uses in AI art (discussing the work of Helena Sarin and Anna Ridler, among others), alongside examples of ownership claims and disputes involving GAN-style AI art. Overall, this discussion links to cultural theories of AI, relevant legal theory, and posthumanist thought. It is across these contexts that I will reframe GAN systems, even when their 'artistic' outputs can be interpreted with reference to the original creations of the singular author figure, as 'Generative Adversarial Copy Machines'. Ultimately, I want to propose that the disturbances effected by AI in artistic practices can pose a critical challenge to the integrity of cultural ownership models – specifically, intellectual property (IP) enclosures – which rely on an anthropocentric conceptualisation of authorship.

Zerilli, John, Alistair Knott, James Maclaurin, and Colin Gavaghan. “Transparency in algorithmic and human decision-making: Is there a double standard?” Philosophy & Technology 32, no. 4 (September 2018): 661–683. Tags: Explainability, transparency, intentional stance, decision-making, posthumanism
We are skeptical of concerns over the opacity of algorithmic decision tools. While transparency and explainability are certainly important desiderata in algorithmic governance, we worry that automated decision-making is being held to an unrealistically high standard, possibly owing to an unrealistically high estimate of the degree of transparency attainable from human decision-makers. In this paper, we review evidence demonstrating that much human decision-making is fraught with transparency problems, show in what respects AI fares little worse or better and argue that at least some regulatory proposals for explainable AI could end up setting the bar higher than is necessary or indeed helpful. The demands of practical reason require the justification of action to be pitched at the level of practical reason. Decision tools that support or supplant practical reasoning should not be expected to aim higher than this. We cast this desideratum in terms of Daniel Dennett’s theory of the intentional stance and argue that since the justification of action for human purposes takes the form of intentional stance explanation, the justification of algorithmic decisions should take the same form. In practice, this means that the sorts of explanations for algorithmic decisions that are analogous to intentional stance explanations should be preferred over ones that aim at the architectural innards of a decision tool.

Zerilli, John. “Explaining machine learning decisions.” Philosophy of Science 89, no. 1 (January 2022): 1–19. Tags: Interpretability tradeoffs, XAI.
The operations of deep networks are widely acknowledged to be inscrutable. The growing field of “Explainable AI” (XAI) has emerged in direct response to this problem. However, owing to the nature of the opacity in question, XAI has been forced to prioritise interpretability at the expense of completeness, and even realism, so that its explanations are frequently interpretable without being underpinned by more comprehensive explanations faithful to the way a network computes its predictions. While this has been taken to be a shortcoming of the field of XAI, I argue that it is broadly the right approach to the problem.

Zhou, Zhenglong and Chaz Firestone. “Humans can decipher adversarial images.” Nature Communications 10, article 1334 (March 2019): 1–9. Tags: Adversariality, epistemology
Does the human mind resemble the machine-learning systems that mirror its performance? Convolutional neural networks (CNNs) have achieved human-level benchmarks in classifying novel images. These advances support technologies such as autonomous vehicles and machine diagnosis; but beyond this, they serve as candidate models for human vision itself. However, unlike humans, CNNs are “fooled” by adversarial examples—nonsense patterns that machines recognize as familiar objects, or seemingly irrelevant image perturbations that nevertheless alter the machine’s classification. Such bizarre behaviors challenge the promise of these new advances; but do human and machine judgments fundamentally diverge? Here, we show that human and machine classification of adversarial images are robustly related: In 8 experiments on 5 prominent and diverse adversarial imagesets, human subjects correctly anticipated the machine’s preferred label over relevant foils—even for images described as “totally unrecognizable to human eyes”. Human intuition may be a surprisingly reliable guide to machine (mis)classification—with consequences for minds and machines alike.
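For readers unfamiliar with adversarial perturbations, the “seemingly irrelevant image perturbation” idea can be shown in miniature with a fast gradient sign step on a toy, untrained classifier; the paper’s image sets were produced by other published attacks, so this sketch is purely illustrative.

```python
# A fast gradient sign step on a toy, untrained classifier: a perturbation no
# larger than epsilon per pixel, aligned with the loss gradient. The model,
# image, and label are all stand-ins; the paper's image sets come from other
# published attacks.
import torch
import torch.nn.functional as F

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.rand(1, 1, 28, 28, requires_grad=True)        # stand-in image
label = torch.tensor([3])

loss = F.cross_entropy(model(x), label)
loss.backward()

epsilon = 0.05
x_adv = (x + epsilon * x.grad.sign()).clamp(0.0, 1.0)   # small, structured nudge
print(float((x_adv - x).abs().max()))                   # at most epsilon
```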

Zhu, Jun-Yan, Taesung Park, Phillip Isola, and Alexei A. Efros. “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks.” Preprint, submitted March 2017. arXiv:1703.10593. Tags: Image-to-image translation, computer vision, representation, technical paper
Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training data will not be available. We present an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples. Our goal is to learn a mapping G:X→Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss. Because this mapping is highly under-constrained, we couple it with an inverse mapping F:Y→X and introduce a cycle consistency loss to push F(G(X))≈X (and vice versa). Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, etc. Quantitative comparisons against several prior methods demonstrate the superiority of our approach.
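The cycle-consistency term, pushing F(G(X)) back toward X, can be sketched with toy stand-in networks; the adversarial losses and the real generator and discriminator architectures are omitted.

```python
# Cycle-consistency term only: map X to Y with G and back with F, and penalise
# the L1 distance to the original input (and symmetrically for Y). G and F are
# toy single-layer stand-ins for the paper's generators.
import torch
import torch.nn as nn

G = nn.Conv2d(3, 3, kernel_size=3, padding=1)           # toy X -> Y mapping
F = nn.Conv2d(3, 3, kernel_size=3, padding=1)           # toy Y -> X mapping

x = torch.rand(4, 3, 64, 64)                            # batch from domain X
y = torch.rand(4, 3, 64, 64)                            # batch from domain Y

cycle_loss = nn.functional.l1_loss(F(G(x)), x) + nn.functional.l1_loss(G(F(y)), y)
cycle_loss.backward()                                   # combined with GAN losses in practice
print(float(cycle_loss))
```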

Books/Issues
Amaro, Ramon. The Black Technical Object. MIT Press (2022).
Amoore, Louise. Cloud Ethics. Duke University Press (2020).
Apprich, Clemens, Wendy Hui Kyong Chun, Florian Cramer, and Hito Steyerl. Pattern Discrimination. University of Minnesota Press (2018).
Chun, Wendy Hui Kyong. Discriminating Data: Correlation, Neighborhoods, and the New Politics of Recognition. MIT Press (2021).
Culture Machine, Vol. 20 (2021).
Digital Culture & Society. Rethinking AI, Vol. 4 (2018).
Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press (2016).
Mackenzie, Adrian. Machine Learners. MIT Press (2017).
Roberge, Jonathan and Michael Castelle (eds). The Cultural Life of Machine Learning. Palgrave Macmillan Cham (2021).
Russell, Stuart and Peter Norvig. Artificial Intelligence: A Modern Approach. Pearson (2022).
Spheres: A Journal for Digital Cultures. Spectres of AI, no. 5 (November 2019).