A Journal of Rhetoric, Writing, and Culture

Exploring the Multimodal Gutter: What Dissociation Can Teach Us About Multimodality

Amy Anderson, West Chester University1

(Published, October 17, 2017)

About twenty minutes into “The Empty Hearse” episode of the BBC series Sherlock, John Watson sits at a restaurant table across from his girlfriend Mary, nervously fumbling toward a marriage proposal. A troublesome waiter interrupts the conversation. Removing his glasses, the waiter reveals himself to be none other than Sherlock Holmes, a dear friend whom Watson has presumed dead for the last two years. While Holmes awkwardly jokes about his faked death, the tension rises between the two men. It comes to a head when Watson grabs Holmes by the throat and starts to choke him—and carnivalesque music starts to play in the background.

The music changes everything.

The tension drains away. Even though the images flashing across the screen show one man choking another, the audience knows from the music that we should take this scene in good fun. Neither man is truly in danger. But how does the audience know that the music should determine our interpretation of the images? There aren’t any directives telling us that the music is the key to the scene’s meaning, but somehow, we resolve the juxtaposed sound and images.

The question of how we interpret this particular scene from Sherlock points to a larger question about how media interact in multimodal compositions: When two or more media are juxtaposed, how do we make meaning out of the composition as a whole? In this article, I propose that using dissociation to examine how juxtaposed media interact in a space that I call the multimodal gutter helps us learn more about the limitations and possibilities of dissociation, the process of creating closure, and the canon of invention itself.   

Juxtaposition, the Digital Invention Box, and the Multimodal Gutter

In recent years, rhetoric and composition scholars have noted juxtaposition’s important role in the composition process; the concept is a key tool for inventing new meaning, particularly in multimodal composing. Bre Garrett, Denise Landrum-Geyer, and Jason Palmeri survey a range of scholars in “Re-Inventing Invention: A Performance in Three Acts,” and they conclude that “theorists have long recognized that ‘original’ compositions arise (at least in part) out of the creative juxtaposition of existing materials.” They argue that juxtaposition is a crucial inventional tool because “composing is a process of making connections, rearranging materials (words, images, concepts) in unexpected ways.” The connections that form between juxtaposed materials reveal unexpected associations and ideas. Garrett et al. also note that “digital technologies open up new possibilities for practicing creative juxtaposition in our pedagogy and scholarship.” Often building on the multiliteracies work of the New London Group (Cazden et al.), additional scholars like Susan H. Delagrange, Jeff Rice, Johndan Johnson-Eilola, Nicholas C. Burbules, Joddy Murray, and Jonathan Alexander and Jacqueline Rhodes have theorized the ways that digital technology facilitates multimodal juxtaposition, leading to unique inventional possibilities across disparate types of media. Delagrange asserts that juxtaposition in digital spaces is particularly “generative, epistemologically rich and powerful.” Juxtaposition is thus a critical part of the multimodal composition process, particularly when this process takes place in digital spaces.

But what exactly happens when various types of media are juxtaposed? How, precisely, is new meaning generated? These questions are more difficult to answer. If we believe that invention is an important part of the composition process, then we need to understand how invention happens through juxtaposition. Two places to begin theorizing what takes place during multimodal juxtaposition are Scott McCloud’s concepts of the comics gutter and closure, and Garrett et al.’s Digital Invention Box.

McCloud introduces the comics gutter as the physical and conceptual space between comics panels (60-93). The spatial limitations of comics art prevent the depiction of every single action in a given story, so instead comics are composed of sequentially ordered panels illustrating a string of individual moments (8-23). Despite the blank space—the gutters—between the panels, McCloud notes that readers assemble a coherent story through the process of closure (60-63). He describes closure as the “phenomenon of observing the parts but perceiving the whole” (63, italics original), and he demonstrates the concept by presenting two panels (68):

Figure 1. McCloud uses two panels to demonstrate how viewers fill in space to create the illusion of action.

Despite only seeing these two static images, viewers create closure between them and assume a whole action takes place: the man with the axe has committed a murder. We never see the murder, but, through closure, we assume that it has happened. McCloud further explains the process in this way: “Here in the limbo of the gutter, human imagination takes two separate images and transforms them into a single idea. Nothing is seen between the panels, but experience tells you something must be there” (66). The juxtaposition of the images encourages viewers to create a connection between them, imagining a complete event taking place, and thereby inventing new meaning.

At first it seems that McCloud describes the process of closure in almost magical terms. He writes, “No matter how dissimilar one image may be to another, there is a kind of—alchemy at work in the space between panels which can help us find meaning or resonance in even the most jarring of combinations” (73). Although this quote implies a passive relationship that just appears between the panels, McCloud later argues that the reader plays a proactive role in the process of closure: “[C]losure in comics is . . . anything but involuntary. Every act committed to paper by the comics artist is aided and abetted by a silent accomplice. An equal partner in crime known as the reader” (68). McCloud argues that what happens between the panels is the reader’s responsibility: “I may have drawn an axe being raised in this example, but I’m not the one who decided to let it drop or decided how hard the blow, or who screamed, or why. That, dear reader, was your special crime, each of you committing it in your own style” (68). Closure, then, requires the reader’s collusion. The comics artist juxtaposes the panels and creates the gutter space, but the reader is responsible for stepping into the gutter and supplying the meaning that links the panels.

Other scholars and comics creators have a similar understanding of the role of the gutter. Like McCloud, renowned graphic novelist Art Spiegelman writes that “comics are a gutter medium” (100), and Thierry Groensteen theorizes that narrative occurs between comics panels, in the gutter. Richard Harrison and Hillary L. Chute emphasize the reader’s responsibility to forge connections between panels.  Charles Acheson calls attention to the personal investment that readers make in comics, arguing that readers draw on their own memories and experiences of previous panels when navigating the gutter (306-307). The gutter is thus a physical space where conceptual work is accomplished. Forging meaning across the gutter is a deeply individualized process, but creating closure is always active and participatory.

While these theorists have all considered how the gutter functions, they have focused less on how readers actually create closure between the texts and images combined in individual panels. Joshua C. Hilst considers the inventive work involved in closure, noting that the gutter invites readers to engage in a “slasher rhetoric” that “match[es] disparate phenomena, different ideas, pairing them together to invent new topoi, new idioms” (171). Jason Helms offers an expanded definition of the gutter that provides further insight into the process of closure. Helms suggests that the comics gutter is the place where the audience encounters “the relationship between multiple elements, whether those be panels, images, text, emminata, or the page itself.” More than the physical space between panels, the gutter as conceived by Hilst and Helms is the conceptual space where the audience navigates the juxtapositions inherent in comics. Helms’s expanded notion of the gutter makes the space relevant beyond comics and for multimodal compositions more broadly. Indeed, I want to suggest that the process of crossing the comics gutter space is essentially the same as the process of making meaning between different media and modes in other types of multimodal compositions. Texts, images, and even audio are distinct modes,2 but when they are juxtaposed in a single composition—ranging from comics panels to print advertisements and films—viewers, like comics readers, must create coherent meaning between them in what I call the multimodal gutter. Like the comics gutter, the multimodal gutter can be a physical space. It is, however, primarily a conceptual space where the audience undertakes meaning-making work when faced with a multimodal composition.

While McCloud provides a theoretical framework for understanding multimodal gutter spaces, Garrett et al.’s Digital Invention Box highlights the audience’s active role in crossing the multimodal gutter. The Digital Invention Box invites us to consciously engage with the inventional process of creating closure between juxtaposed elements. When viewers press the “Invent” button, the Box generates a random pair of texts and images from a curated collection of ninety-nine elements. Sometimes two quotes appear side by side, sometimes two images pop up, and sometimes a quote is juxtaposed with an image. Making connections between the pairings, however, requires viewer participation.

A recent visit to the Box gave me three different pairings. The Box opened with an image of a snow-covered bird bath on the left side of the screen and the following quote on the right: “‘Out of dissonance, out of moments that don’t seem to relate, a pattern appears. That pattern is the focus point of invention.’ (Rice, ‘Moments of Invention,’ Yellow Dog Blog)” (Fig. 2). The pairing of the bird bath and the quote was, indeed, a moment of dissonance that left me searching for a connection between the two elements. Perhaps they suggested the pattern of changing seasons as a space for invention? Pressing the “Invent” button again gave me two quotes, one from Winterowd asserting the prevalence of invention in the composition process, and the other from Dunn commenting on the relationship between experience and knowledge (Fig. 2). Did this juxtaposition imply that invention is experiential? A third click on the “Invent” button provided two images (Fig. 2). On the left was an old-fashioned photo of women reading; on the right, an image of concrete parking stops in an empty lot. The rows of women pulled up to the tables mirrored the rows of parking spaces; their busyness contrasted with the lot’s emptiness.

Figure 2. Three iterations of the Invention Box

The Box invites viewers into a multimodal gutter space where juxtaposition spurs invention: the spatial proximity of the quotes and images suggests a connection, but the audience is responsible for creating closure across the multimodal gutter. Because the Box doesn’t mandate the outcome of the invention process, it spotlights the viewer’s role in creating meaning. The Box gave me the image of a bird bath and the Rice quote, but I created closure in the gutter between them by linking the image’s suggested seasonal patterns with Rice’s idea of inventional patterns. Likewise, I forged a conceptual link between the Winterowd and Dunn quotes, and I found the similarities between the images of the women and the parking stops. The Box’s various pairings of the two types of media present an opportunity for invention, but viewers do the mental work that invention requires.

McCloud’s theories name the space created by juxtaposed media (the gutter) and the inventional process that takes place in this space (closure), while the Digital Invention Box reveals the audience’s participation in the process. The gutter, closure, and audience participation form the theoretical framework for the multimodal gutter, but how does the audience know what connections to make in the multimodal gutter?

What Happens in the Multimodal Gutter…

The process of closure begins with the assumption that juxtaposed elements should be understood as a whole. McCloud notes that the sequential order of comics panels creates a tension between the panels that holds them together: “By creating a sequence with two or more images, we are endowing them with a single—overriding identity, and forcing the viewer to consider them as a whole. However different they had been, they now belong to a single organism” (73). Unlike many comics panels, the various elements in a multimodal composition aren’t always held together by sequential force, but they are nevertheless combined into one composition. The borders of a print advertisement indicate that the texts and images contained within that space are working together to present one message. The opening and ending credits of a movie tell the audience that the intervening images and audio should be considered in tandem. When the audience knows that various media should be understood together, they are driven into the multimodal gutter to find relationships between the elements.

Film scholars have looked closely at how the juxtaposition of images and sound creates cinematic magic, as in the scene from Sherlock that opened this essay. Michel Chion introduces the idea of the audiovisual contract, a “symbolic contract that the audio-viewer enters into, agreeing to think of sound and image as forming a single entity” (216). When images and audio are paired in a film sequence, he notes that the audience creates one meaning through the process of synchresis (63). Other scholars in film studies have attempted to give more detail about the interactions between images and audio. Claudia Gorbman (Unheard) and Carol Vernallis suggest that sound calls out specific details in the accompanying images, while Tisha Turk observes that music and images can be mutually transforming (172). These theorists’ approaches combine elements of both juxtaposition and closure to argue that meaning is generated from the relationship between various media.

Although film scholars’ work focuses on audio-visual interactions, and rhetoric and composition scholars consider interactions across a range of media, both groups have come to similar conclusions about the importance of the audience’s beliefs and values. Film scholars Gorbman (“Aesthetics and Rhetoric”) and Lauren Anderson suggest that the audience plays a more important role in interpreting film scores than critics have previously acknowledged. Anderson argues that more attention should be paid to how audience members’ personal taste, age, gender, and cultural and personal associations affect their interpretation of a soundtrack (38-46). Paralleling Anderson, rhetorician Scott Lloyd DeWitt proposes that the audience’s prior knowledge drives their understanding of the interaction between juxtaposed elements in the composition process (34-35). The importance of this prior knowledge is evident in McCloud’s example of the raised axe and scream panels: viewers assume that someone was killed because we know that a raised axe followed by a scream usually doesn’t imply an afternoon picnicking in the park. We’ve seen enough horror movies (or their trailers) to know when death is implied.

While acknowledging that prior knowledge can help the reader establish connections, rhetoric and composition scholars have also considered how various media sometimes have an unequal influence on a multimodal composition’s overall meaning. Gunther Kress proposes that modes and media can carry varying “communicational loads” (161). He considers a presentation in a science classroom, noting how the teacher sometimes relies on speech, but other times draws on gesture or images (162-66). At different moments of the presentation, one of the modes bears a heavier communicational load than the others, and the teacher shifts between the modes to accommodate the audience (164). Kress proposes that a rhetor orchestrates the dominant modes, but the audience also has the freedom to focus on one mode over another, altering the way that the rhetor’s message is received (161, 164). Ellen Cushman likewise notes that meaning in a multimodal composition arises from “the (im)balances of competing tensions” between juxtaposed elements. The interaction between juxtaposed media is thus dynamic and shifting.

Delagrange further complicates this interaction, positing that within a multimodal composition, one mode or medium can control the interpretation of another. While reflecting on a particular screen of “Wunderkammer, Cornell, and Visual Canon of Arrangement,” a piece she had previously composed for Kairos, she writes that images can dominate text: “[T]he viewer is connecting the words to what she discovers or questions about the images. The images control (the meaning of) the words.” She makes a similar observation about a different screen from the piece, which pairs a moving image with text: “It is very difficult NOT to watch a moving image, so this movement would establish the meaning-making primacy of the animations.” The images are privileged in both screens, but for different reasons: the information conveyed in the first screen’s images makes them dominant, while movement catches readers’ attention and causes them to privilege the second screen’s image. Delagrange’s comments reveal that there can be a hierarchy in the interaction between juxtaposed modes, and that hierarchy may shift depending on the context. In a previous article, I found a similar hierarchical interaction between texts and images in advertisements for Levy’s rye bread from the 1950s (122). In these ads, however, the texts dominate the images and determine how the reader should interpret them. I suggested that Chaïm Perelman and L. Olbrechts-Tyteca’s concept of dissociation is useful for understanding the hierarchical relationship between the advertisements’ texts and images (121-123). I want to extend that work and propose that dissociation can be more broadly useful for understanding the inventional juxtaposition that takes place in the multimodal gutter.

A Framework of Dissociative Multimodality

When Chaïm Perelman and Lucie Olbrechts-Tyteca introduced dissociation in The New Rhetoric, they intended the concept to be used for examining “discursive,” or language-based, arguments (8). Dissociation models what happens when unequally valued concepts come into contact, using the form of ratios called philosophical pairs: term I/term II.  Term II is the more valued concept and is thus the framework through which term I is understood (422). As Olbrechts-Tyteca later pointed out, however, term I is the concept that is usually recognized first (“Les Couples Philosophiques” 82). It often isn’t until an incompatibility arises between the juxtaposed terms that we realize that term II is shaping our understanding of term I. Dissociation doesn’t happen every time two concepts come together. Olbrechts-Tyteca notes that dissociative pairs are distinguished from classificatory pairs (like odd/even) and antithetical pairs (like good/evil) because even though there may be dissonance between the terms in these pairs, they are considered equally (81-82). In a dissociative philosophical pair, one term is always valued more highly and shapes how the other is understood.

Perelman and Olbrechts-Tyteca offer the appearance/reality pair as the prototypical dissociative philosophical pair in some strands of Western philosophy, and they illustrate it with the iconic example of a stick in a half-filled glass of water. Viewed from the side of the cup, the stick appears broken, even though viewers know that in reality it is straight. We assume that the appearance of the stick is false because we trust and value our concept of reality more highly than appearances. Even though we notice the appearance of the stick (term I) first, our understanding of the stick’s shape is determined by our often-unseen concept of reality (term II) (415-16).

Dissociation has been used primarily for explaining how definitional categories are formed. Within the framework, Perelman and Olbrechts-Tyteca explain that the more highly valued term II is the criterion that determines which ideas are included in and excluded from the definitional boundaries of term I (444-450). David Zarefsky et al. use dissociation to explain Reagan’s changing definition of the “needy” (113-114). Kathryn M. Olson draws on the concept to explain changes in the Shakers’ definitions of “success” and “growth” (51-53), and Janice W. Fernheimer considers how race and religion dissociatively interact to define Jewish identity (52-58). In each instance, the process of dissociation is inventional, generating new ideas about the term or category in question.

Dissociation also has inventional uses beyond definitions. Perelman and Olbrechts-Tyteca’s explanation of the concept deals primarily with text-based discourse, but Olbrechts-Tyteca’s later work opens up the possibility of multimodal dissociation. Several years after The New Rhetoric was written, Lucie Olbrechts-Tyteca published Le Comique du Discours, a monograph that includes an entire chapter on dissociation in comedy. Olbrechts-Tyteca observes that comedy doesn’t just exist to make us laugh; when used rhetorically, it leads to a better grasp of the significance of an argument and its most salient points (8). Comedy often occurs in moments of dissonance or rupture, and she suggests that dissociation can help us understand how we make meaning of that dissonance and why we find it so amusing (321). These moments of dissonance are often driven by an encounter with the appearance/reality pair, which she notes frequently occurs in the theater when the audience sees one thing but knows something else to be true (334). By applying dissociation to comedy, Olbrechts-Tyteca expands the concept beyond the purely discursive framework of The New Rhetoric Project and applies it to both play scripts and theater performances, where audio and visual modalities are combined. It is thus not a stretch to consider how dissociation informs the way we interpret other multimodal compositions.

Because philosophical pairs demonstrate a hierarchical juxtaposition between two concepts, the pairs are useful for modeling how meaning is made when media bearing Kress’s unequal communicational loads are part of a single multimodal composition. The concept is akin to Hilst’s “slasher rhetoric” (171), but when the slash is a dissociative slash, one of the paired terms guides the understanding of the other. In an earlier article, I argued that the printed slogan and image in the Levy’s advertisements form an image/text philosophical pair: the ad’s written slogan is privileged over the accompanying image, closing off several possible interpretations of the image and directing the viewer to a preferred meaning (121-123). Viewers are called on to invent new meaning in the ads’ dissociative multimodal gutter space.

Dissociation can also help us understand the interactions between the media juxtaposed in multimodal film compositions like the scene from Sherlock that I used to open this article. In the case of that scene, the philosophical pair representing the interaction between the soundtrack and the images would place the soundtrack, or audio, in the position of term II because the audio drives the meaning of the scene: image/audio. There is a noticeable incompatibility between the images and the music in the scene, and the viewers must work through a dissociation in the multimodal gutter to reconcile them. Given a different soundtrack, perhaps one more ominous and suspenseful, the scene could be interpreted differently and viewers would be more concerned about Sherlock’s well-being. Instead, the carnivalesque music signifies a light mood. We are free to laugh because we privilege the music over the images, creating the image/audio pair.

But how do we know to privilege the music? How do we know which medium is in the position of term II? Sometimes narrative conventions drive audiences to a particular interpretation. As Delagrange noticed, sometimes the flashier medium catches our attention first. In the scene from Sherlock, the abrupt start of the music catches the audience’s attention, but there is another factor causing us to privilege the audio in this example: the underlying appearance/reality philosophical pair. Long-time Sherlock fans know that Holmes and Watson have a deep (albeit quirky) friendship. In the scene, however, the audience is presented with an incompatibility between what they see (Watson’s anger, manifested by attacking Holmes) and what they know to be true (the reality of the two men’s friendship). The lighthearted music references the deeper reality of their friendship, and because we value reality more than appearances, we interpret the images through the music. In Olbrechts-Tyteca’s view of comedy, the dissonance between the information conveyed in the images and the music leads viewers to a place of dissociation, resulting in humor. Resolving the dissociation reveals the purpose of the scene: to reassure us that the two men’s friendship will survive despite Watson’s anger.

Using dissociation to interpret the Sherlock scene again highlights the audience’s active role in constructing the scene’s meaning in the multimodal gutter. The viewers are responsible for navigating the seeming mismatch between the images and the soundtrack, and for determining which medium will guide the interpretation of the other. In this particular case, as DeWitt suggests, the audience must draw on their prior knowledge of the relationship between the two men. Without the audience’s involvement in working out the dissociation between the two media, the scene is unresolved. Multimodal dissociation thus adds a rhetorical twist to film scholars’ work on audio-visual interactions. It explains what is taking place during Chion’s synchresis and gives insight into why audio can lead to different interpretations of the visual, as Gorbman and Vernallis suggest. But multimodal dissociation is more dynamic because it can be used to understand the interactions between a range of media beyond film.

Inspired by Garrett et al.’s Digital Invention Box, I worked with professional coder Tim Ambler to create the Multimodal Gutter, where users can experience and explore multimodal dissociation. The space has a similar interface to the Digital Invention Box—a blank screen with buttons that users can click on to generate examples of various types of media. Unlike the Digital Invention Box, however, the Multimodal Gutter has a wider and more random selection of media. The Digital Invention Box consists of a curated group of ninety-nine quotes and images. In contrast, the Multimodal Gutter site generates text from a dictionary file. Images are randomly selected from Flickr (with somewhat unpredictable results), and the audio soundtracks are taken from a selection of pieces purchased from Incompetech. Because the media options in the Digital Invention Box are preselected with affinities in mind, the Box itself takes on some of the work of invention. In contrast, the Multimodal Gutter puts that inventional work back in the hands of users. The wider pool of media gives users greater control over creating the juxtapositions and challenges them to ask why a given medium is privileged. This space even allows users to change the position and size of the text and image to see how spatial orientation impacts the ways viewers interpret the media.

As visitors juxtapose combinations of image, text, and audio in the Multimodal Gutter, they are faced with questions like these: How does switching the soundtrack paired with an image alter that image’s interpretation—and how does swapping out the image transform the way the soundtrack is understood? Does pairing a soundtrack with a word change the way the soundtrack is perceived? When an image and word are juxtaposed, which medium drives the meaning of the composition? In which circumstances do we privilege one medium over another? What are the value hierarchies that influence which medium we privilege? What happens when we simply can’t make meaning across the gutter—when the pairings are so random that they don’t make sense? Are some multimodal gutters too wide to cross? 

Dissociation’s Discontents

While dissociation is a powerful tool for interpreting multimodal arguments, it also has limitations. The first of these is that philosophical pairs are seemingly unable to model what happens when there is not a clear hierarchy between the terms. In the case of the Digital Invention Box, it is often not evident which element frames the interpretation of the other, particularly when two elements in the same medium are paired. When the images of the women and the parking barriers appeared, it wasn’t immediately apparent which one should drive the interpretation. The same happened with the juxtaposition of the Winterowd and Dunn quotes. It’s possible that the spatial positioning of the media plays a role, and the quote or image to the left is privileged because it is seen first. If this is the case, though, the spatial privilege is subtle and does not necessarily guide the invention process. If dissociation is involved in the interpretation of the Digital Invention Box’s pairs, it likely happens at the level of the concepts conveyed rather than through the media by which they are conveyed. McCloud’s raised axe and scream comics panels are also not dissociative. The logic connecting the panels is an enthymeme, rather than a dissociating philosophical pair. Indeed, the Digital Invention Box and McCloud’s work show us that the relationships between media are not necessarily hierarchical, and thus not necessarily dissociative.

The second limitation on dissociation is that it is essentially a binary concept. As conceived by Perelman and Olbrechts-Tyteca, the dissociative process always involves pairs of unequally valued terms. These pairs may become more complex, splitting apart into what Perelman and Olbrechts-Tyteca call “fan-type” dissociations (431), like the one below:

In a fan-type dissociation, either term I or term II can be divided into two additional terms, which themselves have a dissociative relationship. Fan-type pairs make the dissociative system more flexible by incorporating more than two terms, and it is possible to set up a complex series of branching dissociations by incorporating numerous fan-type dissociations. Nevertheless, the system is still binary and ultimately requires the terms to relate to each other in a series of pairs. If three concepts (I, II, and III) are brought together and two of the concepts (perhaps II and III) are equally privileged, there is no way to represent this in a single fan-type philosophical pair. The only option is to create two philosophical pairs, both with term I on the top and with differing bottom terms: I/II and I/III. The three terms can only be united in a single philosophical pair if terms II and III are brought into a hierarchical relationship and a fan-type system is formed, as in:

As long as two terms (II and III) are valued equally, it is not possible to represent all three in a dissociative philosophical pair.

A second film scene offers a more concrete example of dissociation’s limitations for explaining what happens in the multimodal gutter, while also demonstrating how even these limitations can be spaces of invention. Late in Stanley Kubrick’s 1968 film 2001: A Space Odyssey, the audience watches the astronaut Dave dismantling HAL, the artificial intelligence (AI) program running his spacecraft. HAL has just caused mechanical malfunctions that killed the rest of Dave’s crew, and the AI has attempted to kill Dave by stranding him outside the spaceship. After forcing his way back inside, Dave decides to disconnect HAL’s circuits. The scene where this takes place is constructed of visual and aural elements. There is dissonance between the information conveyed in the visual and aural elements, but there is also dissonance between the content of the audio and the audio’s intonation. The audio content and intonation thus need to be considered separately, so there are essentially three separate media at work.


The images of the scene show Dave in a space suit, floating next to a circuit bank bathed in red light. He methodically moves down a line of circuit blocks, unscrewing each one until it pops out. Dave’s movements are purposeful and mechanical as he focuses on the bank of circuits, suggesting maintenance or repair work.

Audio Content

The soundtrack of this scene consists primarily of HAL speaking, attempting to dissuade Dave from shutting down the system. In the background, a hiss of air can be heard along with Dave’s heavy, regular breathing. The content of HAL’s speech is pleading:

“Dave, stop. Stop, will you. Stop, Dave. Will you stop, Dave. Stop, Dave. I’m afraid. I’m afraid, Dave. Dave. My mind is going. I can feel it. I can feel it. My mind is going. There is no question about it. I can feel it. I can feel it. I can feel it. I’m a . . . fraid.” (2001: A Space Odyssey)

HAL then lapses into reciting his mechanical pedigree: “Good afternoon, gentlemen. I am a HAL 9000 computer. I became operational at the HAL plant in Urbana, Illinois on the twelfth of January, 1992. My instructor was Mr. Langley, and he taught me to sing a song. If you’d like to hear it, I can sing it for you.”

Dave replies, “Yes, I’d like to hear it, HAL. Sing it for me.”

HAL continues, “It’s called ‘Daisy.’ Daisy, Daisy, give me your answer do. I’m half-crazy all for the love of you. It won’t be a stylish marriage. I can’t afford a carriage . . . ” HAL’s voice then becomes unintelligible.

Audio Intonation

The intonation of HAL’s speech complicates the scene. The AI’s computer-generated speech is flat and mechanical: there is no emotion. HAL’s vocal pacing is even, unhurried until the end of his speech, when it slows. Dave’s sole response is quick and to the point, but similarly devoid of emotion. HAL’s final song degenerates into mechanical tones, and there is no emotional inflection to mark the demise of the AI. 

Multimodal Interaction

The interaction between the images, audio content, and audio intonation in this scene is complex, and not only because there are three media at play. If we were going to interpret these three elements of the scene through a dissociative framework, we would have to decide which one to privilege, and privileging different elements sets up fan-type dissociations that offer disparate interpretations of the scene.

Privileging Images

If the images are assumed to be the dominant medium, the remaining media are forced into a dissociative relationship where the audio intonation shapes the way the audience interprets the audio content:

Privileging the images means that we understand the scene as Dave performing a mechanical operation and repairing the ship’s broken computer system. HAL is then a machine composed of circuits, and we have to interpret the AI’s protests as malfunctioning electronic pulses. Placing images as the ultimate term II means that HAL’s audio intonation overshadows the content of his speech, so there is a secondary dissociation going on in the top part of the philosophical pair.

Privileging Audio Intonation

Privileging the audio intonation offers a similar interpretation. If we privilege the mechanical tone of HAL’s words, then we assume that the AI is a mechanical construct. We have to then understand the audio content through the lens of the images, with the result that Dave is again working on a machine and HAL’s protests are simply mechanical malfunctions.

In this scene, the audio intonation and the images convey a similar message, so privileging either one sets off a second dissociation that privileges the other. The audio content is always the element that is understood differently when either the images or the audio intonation drives the dissociative process.

Privileging Audio Content

The scene’s meaning changes if the audio content is privileged. HAL’s protestations turn Dave into a cold-blooded murderer. HAL begs to be spared, expressing a very human fear as his mechanical mind slips away. Dave, in contrast, is impervious to HAL’s entreaties. The astronaut is a heartless aggressor against HAL, the helpless victim. Privileging the audio content means that the three elements can’t be placed into one fan-type dissociation, however, because there isn’t a clear hierarchy between the images and the audio intonation. The latter two elements carry the same information, so they are valued equally once the audio content is privileged. The result is two dissociative pairs instead of one fan-type dissociation:

Examining these two pairs highlights how HAL’s intonation leads to an alternate interpretation of the other media in the scene, but the two pairs aren’t entirely sufficient for representing the scene because they don’t acknowledge the interaction between the images and audio content in the multimodal gutter. In this case, the binary limitations of dissociation don’t allow us to hold all three elements together in one whole. The result is a dissonance that reflects the larger questions about artificial intelligence inherent in the scene.

Kubrick doesn’t let one medium clearly dominate the viewer’s interpretation because the question about whether or not an AI has human qualities does not have a simple answer. There wasn’t a widely accepted answer to this ethical dilemma when the movie was created in 1968, and there still isn’t one today. Because we don’t have cultural knowledge or movie conventions to guide our interpretation of the scene, the viewer is left wrestling with the various dissociative possibilities. HAL’s previous actions of stranding Dave in space and killing the other crew members point towards privileging the scene’s audio content, but the images and the audio intonation in the scene offer other plausible possibilities. The complexity of the relationship between the media brings us back to the prototypical dissociative appearance/reality philosophical pair. If HAL in reality is alive, we can discount the appearance of mechanical qualities in his voice and Dave’s actions. But if the reality is that HAL is a machine, then the apparent emotional content of his speech is false. The way that the audience resolves the relationship between appearance and reality is directly connected to how we resolve the relationships between the different media.

The lack of a clear dissociative path does not, however, make the scene ineffective. It is instead the key to the scene’s power. Fernheimer has shown that when dissociation breaks down, the process is still productive. What she terms “dissociative disruption” occurs when dissociation is not completed because the audience doesn’t accept the proposed concept hierarchy (63-64). In these moments of disruption, Fernheimer demonstrates that dissociation is nevertheless generative because it opens up space for discussion and invention (64-65). The scene from 2001: A Space Odyssey opens up exactly this sort of space. The scene’s creation of multiple, competing dissociations forces the audience to question their views of appearance and reality, machines and humans, AI and astronauts. What the audience perceives as ambiguity in the multimodal gutter between the media is actually ambiguity in their own beliefs and cultural values.

Conclusions from the Multimodal Gutter

My analysis of how meaning is made in the multimodal gutter has two larger implications. First, the process of creating closure in the multimodal gutter is not a black box. The ways in which an audience associates or dissociates various media often reflect broader beliefs and cultural values. Can Holmes and Watson’s friendship survive a fight? Is an AI capable of humanity? Is reality more trustworthy than appearance, and if so, what is the nature of reality? When the audience chooses which medium to privilege in a multimodal dissociation, they are often choosing to reify or challenge the larger cultural value hierarchies embedded in the multimodal composition itself. As McCloud notes, the audience’s act of creating closure in the gutter implicates them in the story or argument that results. Considered in this light, multimodal gutters are essentially Rorschach tests that reveal the audience’s implicit and explicit beliefs and values.

Second, multimodal dissociation asks us to reconsider the relationship between the canons of invention and arrangement. Invention is tied to juxtaposition in multimodal compositions, and juxtaposition happens through the spatial and chronological arrangement of the different media. Multimodal compositions like comics highlight the spatial dimension of arrangement, while film soundtracks underscore the importance of chronology. These two dimensions of arrangement—spatial and chronological—create the multimodal gutter and call on the audience to begin inventing connections. Arrangement, then, is not an entirely separate canon from invention in multimodal compositions. At the same time, part of the responsibility for invention is turned over to a multimodal composition’s audience when they create the links between the various types of media to interpret the composition. Invention is thus dispersed and relies partly on the values and beliefs of the audience.

While dissociation isn’t the key to understanding all juxtaposed media, the concept itself shows the interconnectedness of arrangement and invention. Dissociation’s philosophical pairs spatially model juxtaposed concepts, placing words next to each other, separated by a slash that indicates the audience must perform intellectual work. Just as the arrangement of the concepts in a dissociative pair creates a physical space that calls the audience into invention, the arrangement of media in a multimodal composition is an invitation to invention.

As multimodality becomes ubiquitous in rhetoric and composition, we need to pay closer attention to how closure occurs in the multimodal gutter. Indeed, identifying dissociative relationships and finding the belief systems that motivate hierarchies should be a key part of multimodal analysis and pedagogy. The Multimodal Gutter website is a place to start investigating how these hierarchies are formed and where they fail. Even in situations where dissociative hierarchies are inadequate for capturing the meaning-making process, such as the scene from 2001: A Space Odyssey, we can still gain new insight into the inventional activity that takes place during closure. Dissociation reveals the beliefs and cultural values that audiences bring to a composition, highlighting how they frame the audience’s meaning-making invention process. Although it is not the only means of revealing these beliefs, dissociation is a place to start exploring what happens in the multimodal gutter.

  • 1.  Acknowledgements: I would like to thank Janice Fernheimer, David Frank, Martin Camper, Jason Helms, Matt Davis, Timothy Oleksiak, and Randall Cream for the conversations that helped to solidify the ideas in this piece. I also benefitted from the feedback graciously provided by Susan Jarrett, Barbara Biesecker, Ashley Hall, Kim Moreland, and Kathryn Lambrecht, and the reviewers at enculturation. Finally, I am incredibly grateful for Tim Ambler’s (http://bitcubby.com) smart work creating the Multimodal Gutter space.
  • 2. Helms points out that when pushed to their limits the verbal|visual and image|text binaries break down, and that comics readers are forced to continually juggle reading and seeing. The fonts used in comics are one example of this breakdown. Helms is certainly correct, but there is nevertheless often some distinction between the various media and modes in a multimodal composition, often made clear by the disparate information that each medium contributes to the whole. For the sake of my argument, I will assume that at least on the surface, some distinction can be made between modalities.

Works Cited

Acheson, Charles. “Expanding the Role of the Gutter in Nonfiction Comics: Forged Memories in Joe Sacco’s Safe Area Goražde.” Studies in the Novel, vol. 47, no. 3, Fall 2015, pp. 291-307. ProQuest, http://gateway.proquest.com.ezproxy.uky.edu/openurl?ctx_ver=Z39.88-2003&xri:pqil:res_ver=0.2&res_id=xri:lion&rft_id=xri:lion:ft:abell:R05281982:0.

Alexander, Jonathan and Jacqueline Rhodes. On Multimodality: New Media in Composition Studies. National Council of Teachers of English, 2014. CCCC Studies in Writing & Rhetoric.

Anderson, Amy K. “Dissociation and Visual Arguments: Creating Customers for Levy’s Real Jewish Rye.” Argumentation and Advocacy, vol. 52, no. 2, Fall 2015, pp. 109-124.

Anderson, Lauren. “Beyond Figures of the Audience: Towards a Cultural Understanding of the Film Music Audience.” Music, Sound, and the Moving Image, vol. 10, no. 1, 2016, pp. 25-51, Project Muse, doi:10.3828/msmi.2016.2.

Burbules, Nicholas C. “Rhetorics of the Web: Hyperreading and Critical Literacy.” Page to Screen: Taking Literacy into the Electronic Era. Edited by Ilana Snyder. Routledge, 1998, pp. 102-122.

Cazden, Courtney, et al. “A Pedagogy of Multiliteracies: Designing Social Futures.” Harvard Educational Review, vol. 66, no. 1, Spring 1996, pp. 60-92. ProQuest, http://ezproxy.uky.edu/login?url=http://search.proquest.com.ezproxy.uky.edu/docview/212258378?accountid=11836.

Chion, Michel. Audio-Vision: Sound on Screen. Edited and translated by Claudia Gorbman, Columbia UP, 1994.

Chute, Hillary L. Graphic Women: Life Narrative & Contemporary Comics. Columbia UP, 2010.

Cushman, Ellen. “Composing New Media: Cultivating Landscapes of the Mind.” Kairos: A Journal of Rhetoric, Technology, and Pedagogy, vol. 9, no. 1, 2004, http://kairos.technorhetoric.net/9.1/binder.html.

Delagrange, Susan H. “When Revision is Redesign: Key Questions for Digital Scholarship.” Kairos: A Journal of Rhetoric, Technology, and Pedagogy, vol. 14, no. 1, Fall 2009, http://kairos.technorhetoric.net/14.1/.

DeWitt, Scott Lloyd. Writing Inventions: Identities, Technologies, Pedagogies. State U of New York P, 2001.

Fernheimer, Janice W. “Black Jewish Identity Conflict: A Divided Universal Audience and the Impact of Dissociative Disruption.” Rhetoric Society Quarterly, vol. 39, no. 1, Winter 2009, pp. 46-72. MLA International Bibliography, doi:10.1080/02773940802555530.

Garrett, Bre, et al. “Re-Inventing Invention: A Performance in Three Acts.” The New Work of Composing. Edited by Debra Journet, Cheryl E. Ball, and Ryan Trauman, Computers and Composition Digital P/Utah State UP, 2012, http://ccdigitalpress.org/nwc/index.html.

Gorbman, Claudia. “Aesthetics and Rhetoric.” American Music, vol. 22, no. 1, Spring 2004, pp. 14-26. JSTOR, http://www.jstor.org/stable/3592963.

---. Unheard Melodies: Narrative Film Music. Indiana UP, 1987.

Groensteen, Thierry. The System of Comics. Translated by Bart Beaty and Nick Nguyen. UP of Mississippi, 2007.

Harrison, Richard. “Seeing and Nothingness: Michael Nicoll Yahgulanaas, Haida Manga, and a Critique of the Gutter.” Canadian Review of Comparative Literature/Revue Canadienne de Littérature Comparée, vol. 43, no. 1, Mar 2016, pp. 51-74. Project Muse, doi:10.1353/crc.2016.0009.

Helms, Jason. Rhizcomics: Rhetoric, Technology, and New Media Composition. Sweetland Digital Rhetoric Collaborative and Michigan P, 2017, http://www.digitalrhetoriccollaborative.org/rhizcomics/index.html.

Hilst, Joshua C. “Gutter Talk: (An)Other Idiom of Rhetoric.” JAC: A Journal of Rhetoric, Culture, & Politics, vol. 31, no. 1/2, 2011, pp. 153-176. JSTOR, http://www.jstor.org/stable/20866989?seq=1#page_scan_tab_contents.

Johnson-Eilola, Johndan. Datacloud: Toward a New Theory of Online Work. Hampton P, 2005. New Dimensions in Computers and Composition.

Kress, Gunther. Multimodality: A Social Semiotic Approach to Contemporary Communication. Routledge, 2010.

McCloud, Scott. Understanding Comics: The Invisible Art. HarperCollins, 1993.

Olbrechts-Tyteca, Lucie. “Les Couples Philosophiques: Une Nouvelle Approche.” Revue Internationale de Philosophie, vol. 33, no. 127-128, 1979, pp. 81-98.

---. Le Comique du Discours. Éditions de l’Université de Bruxelles, 1974.

Olson, Kathryn M. “The Role of Dissociation in Redeeming Knowledge Claims: Nineteenth-Century Shakers’ Epistemological Resistance to Decline.” Philosophy & Rhetoric, vol. 28, no. 1, 1995, pp. 45-68. JSTOR, http://www.jstor.org.proxywcupa.klnpa.org/stable/40237837.

Perelman, Chaim and L. Olbrechts-Tyteca. The New Rhetoric: A Treatise on Argumentation. Translated by John Wilkinson and Purcell Weaver. U of Notre Dame P, 1969.

Rice, Jeff. The Rhetoric of Cool: Composition Studies and New Media. Southern Illinois UP, 2007.

Spiegelman, Art. “Those Dirty Little Comics.” Art Spiegelman: Comix, Essays, Graphics and Scraps (From Maus to Now to MAUS to Now). Sellerio Editore, 1999, pp. 97-100.

“The Empty Hearse.” Sherlock, directed by Jeremy Lovering, performance by Benedict Cumberbatch and Martin Freeman, season 3, episode 1, Hartswood Films, 2014.

Turk, Tisha. “Transformation in a New Key: Music in Vids and Vidding.” Music, Sound, and the Moving Image, vol. 9, no. 2, 2015, pp. 163-176, Project Muse, doi:10.3828/msmi.2015.11.

Vernallis, Carol. “Music Video’s Second Aesthetic?” The Oxford Handbook of New Audiovisual Aesthetics. Edited by John Richardson et al., Oxford UP, 2013, pp. 437-465.

Zarefsky, David, et al. “Reagan’s Safety Net for the Truly Needy: The Rhetorical Uses of Definition.” Central States Speech Journal, vol. 35, no. 2, 1984, pp. 113-119.

2001: A Space Odyssey, directed by Stanley Kubrick, performance by Keir Dullea, Metro-Goldwyn-Mayer (MGM), 1968.