Crash Course: A Professor Quickly Learns about AI
Earlier this year, as ChatGPT, Midjourney, Runway and a long list of other AI tools ignited a national conversation about artificial intelligence, many of my colleagues in the School of Cinematic Arts at the University of Southern California shuddered in horror over the displacement of human craft and creativity by visuals created through simple text prompts. Columnist Kevin Roose’s description of his creepy conversation with Bing, published in The New York Times in February, added gasoline to the fire, prompting a desire to prohibit the use of all AI across all of our programs. And what about plagiarism?! The general vibe was anxious fretting.
While AI’s seemingly sudden presence, increasing capacity and rapid speed of development sparked a sense of unease, we were not quite sure how best to respond. Some institutions around the country programmed AI-related events. For example, Bart Weiss, a filmmaker and professor in the Art & Art History Department at the University of Texas at Arlington, hosted a conversation titled “DIALOGUES IN ART: THE AESTHETICS OF AI” featuring Lynn Hershman Leeson, David Stout, Ira Greenberg and Kevin Page. Similarly, Stanford University’s HAI (Human-Centered Artificial Intelligence) hosted a symposium titled “Creativity in the Age of AI: AI Impacting Arts, Arts Impacting AI,” with a terrific lineup of presenters, including Golan Levin and Lauren Lee McCarthy. These events were great, with a refreshing level of precision about AI; however, despite reading and hearing so much about AI, most of us had no real understanding of how AI tools actually work or what they might offer a creative community.
To alleviate this illiteracy, we invited computational linguist Noya Kohavi to lead an intensive three-session workshop explaining the foundations of the technology for faculty in SCA. Kohavi is currently part of the Antikythera program of the Berggruen Institute in L.A., which brings together an interdisciplinary group of scholars, designers and artists to consider planetary-scale computation. Led by Benjamin Bratton, the project is exploring forms of synthetic intelligence, planetary sapience and methods of worldbuilding. Noya took time away from intensive research on that massive project to present “From the Chinese Room to the Embeddings Space: A Workshop About Language and AI.” At the beginning of the workshop, I admit I had no idea what the Chinese Room was, never mind embeddings space, which I assumed must have been a misspelling.
On day one, Kohavi took us through foundational concepts of intelligence and cognition, from the power attributed to Clever Hans, the horse thought to be able to complete math problems in the early 1900s, to the Turing Test and Mechanical Turk. The Chinese Room, it turns out, names a thought experiment introduced by philosopher John Searle in 1980 that helps us understand how computers function. Searle imagines himself alone in a room; his job is to respond to Chinese characters that are delivered to the room even though he does not understand Chinese. He completes his task using a program for manipulating the characters to create the appropriate response. As a result, to those outside the room who send and receive the Chinese characters, it might appear that someone inside the room understands Chinese and is responding. However, what Searle shows is that all that’s needed is the program. No actual comprehension or interpretation by the man inside the room is required. The thought experiment offers a quick way to point to both the limitations of intelligence in a system like ChatGPT and our tendency to overattribute competencies to computational systems based on what we imagine to be happening.
Kohavi then moved on to explore more complex concepts, showing, for example, how language models use sophisticated forms of pattern recognition to predict word strings. We talked about the Distributional Hypothesis, which states that semantically similar words will tend to occur in related contexts. This concept may not seem particularly illuminating, but in the context of language models, the hypothesis not only begins to show their spatial dimension but also demonstrates how text prediction functions. Words become vectors through a process of embedding, which in turn allows us to calculate the probability of appropriate outputs in text strings.
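For readers who like to see an idea run, here is a minimal Python sketch of the two notions in the paragraph above. It is my own toy illustration, not material from Kohavi's workshop, and it is not how a system like ChatGPT is built: real language models learn dense embeddings with neural networks trained on enormous amounts of text, whereas this sketch simply counts neighboring words. The six-sentence corpus and the function names (`cooccurrence_vectors`, `cosine`, `next_word_probabilities`) are all hypothetical, invented for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus, invented purely for illustration.
CORPUS = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
    "the director edits the film",
    "the director shoots the film",
]


def cooccurrence_vectors(sentences, window=2):
    """Turn each word into a sparse vector of counts of the words
    that appear within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def next_word_probabilities(sentences, word):
    """Estimate the probability of each word that follows `word`,
    using simple bigram counts (a stand-in for a real model)."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for current, following in zip(tokens, tokens[1:]):
            if current == word:
                counts[following] += 1
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


if __name__ == "__main__":
    vectors = cooccurrence_vectors(CORPUS)
    print("cat vs dog: ", round(cosine(vectors["cat"], vectors["dog"]), 3))
    print("cat vs film:", round(cosine(vectors["cat"], vectors["film"]), 3))
    print("after 'cat':", next_word_probabilities(CORPUS, "cat"))
```

In this tiny corpus, "cat" and "dog" keep similar company (both drink and chase), so their vectors point in nearly the same direction, while "cat" and "film" sit farther apart. That is the Distributional Hypothesis in miniature, and the last function shows the bare statistical sense in which a next word can be "predicted."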
Even just this bit of clarity helped us begin to see through the hazy rhetoric that touts the “magic” of AI: language models basically use a giant multidimensional collection of text as a foundation to infer relationships among words so that they can predict which words should come next. Furthermore, the rootedness of these models in statistics and probabilities made us aware of a very different logic at work than that of the analog image. With this fresh in our minds, Kohavi pointed us to artist Hito Steyerl’s recent essay in New Left Review, “Mean Images,” in which Steyerl explains, “Visuals created by ml [machine learning] tools are statistical renderings, rather than images of actually existing objects.” She continues, “They shift the focus from photographic indexicality to stochastic discrimination. They no longer refer to facticity, let alone truth, but to probability.” While I bristle at the implication that photographic indexicality necessarily embodies truth, Steyerl’s biting critique goes on to consider the many ramifications of “mean” as a term, from the statistical average to notions of nastiness.
With Kohavi, we went on to talk about how generative pre-trained transformers (GPTs) work, with attention to ethics, politics, labor, bias and environmental costs, as well as compelling concepts such as what N. Katherine Hayles, in her book Unthought, calls a “planetary cognitive ecology” to reference human and machine-based tools that, when integrated, prompt questions about the changing nature of cognition, not to mention the human.
Many of us teaching in film programs began using image- and video-generating tools as soon as we could and, as a result, gleaned a sense of their capacities and limitations through practice. However, Kohavi’s foundational workshop has been incredibly grounding, helping explain the rootedness of machine learning in statistics. My sense is that all of us in filmmaking programs need this basic literacy, not simply to understand the underlying logic of AI and its larger cultural implications, but also to be better equipped to explain it to our students.