The Quantitative Analysis of Comics: Towards a Visual Stylometry of Graphic Narrative

2018, Empirical Comics Research: Digital, Multimodal and Cognitive Methods

This book chapter, published as part of the edited volume "Empirical Comics Research" (Routledge, 2018), puts forward a quantitative approach to the study of comics by introducing a stylistic analysis of a corpus of around 240 graphic novels, graphic memoirs and other non-fiction.

Empirical Comics Research: Digital, Multimodal, and Cognitive Methods. Edited by Alexander Dunst, Jochen Laubrock, and Janina Wildfeuer. First published 2019 by Routledge (New York and Abingdon), an imprint of the Taylor & Francis Group. Series: Routledge Advances in Comics Studies, 6. LCCN 2018014754 (https://lccn.loc.gov/2018014754). ISBN 978-1-138-73744-0 (hbk), 978-1-315-18535-4 (ebk).

Chapter 3 (Part I: Digital Approaches to Comics Research)

Alexander Dunst and Rita Hartel

1. Introduction: The Increasing Diversity of Comics Studies

For most of its relatively brief history, academic comics research was dominated by methods drawn from literary studies. A distinct form of close reading characterized this growing field of inquiry: Focusing on a small, emerging canon, scholars offered reflections on selected comics series, graphic novels, and manga that sought to historicize the development of the medium or put forward versions of ideology critique inspired by cultural studies.
As Bart Beaty has argued, this methodological focus tended to neglect the visual aspects of comics insofar as it read them as a form of literature rather than art (Comics). More recently, a wider range of empirical and formalist approaches has supplemented literary methods, a consequence in part of the rising cultural prestige of comics. Multimodal research, the first forays by cognitive scientists, an emerging transmedia narratology, computer science, and the digital humanities (DH) now all contribute to our understanding of comics.

This chapter adds a decidedly quantitative approach to the study of comics by introducing a stylistic analysis of a large corpus of book-length comics, or graphic narratives. As the next section explains in more detail, even empirical research in the field has yet to move beyond the analysis of a limited canon. To date, humanities scholars interested in comics have lacked the means to study large numbers of images given the combined challenges of digitization, annotation, and computation. In the few cases where computer scientists have assembled or analyzed large collections of comics, their work has focused on the automatic recognition of formal features such as panels, speech balloons, or characters, rather than researching stylistic or narrative features (Fujimoto et al., Iyer et al., Guerin et al.). In this essay, we begin to remedy this situation and introduce methods that distinguish between different genres, detail the historical evolution of graphic novels and memoirs, and shed light on central aspects of authorship and publication formats. Our visual analysis is based on the first representative corpus of graphic narrative, which we understand to be book-length comics that are aimed at an adult readership and tell continuous, or closely related, stories.
Thus, we follow scholars such as Hillary Chute, Daniel Stein, and Jan-Noël Thon in distinguishing between graphic narrative as long-form comics that tell both fictional and non-fictional stories, and individual genres such as the graphic novel and graphic memoir (Chute 453 and Stein & Thon 4–7). While many publishers, literary critics, and even comics artists casually use the graphic novel as an umbrella term in the former, broader sense, this usage is highly misleading. Neither Art Spiegelman’s Maus nor Joe Sacco’s Palestine is a graphic novel in any meaningful sense of the word. Both constitute examples of graphic non-fiction: a memoir in the first and journalistic reportage in the second instance. These theoretical distinctions take on practical importance in empirical research and will form the conceptual foundation for the automated genre distinction we present here.

In this chapter, an introduction to our research question and hypotheses is followed by sections on method, corpus design, and data analysis. For readers unaccustomed to technical discussions, these subchapters may present a challenge. However, comprehension of the chapter does not require the understanding of equations or statistics. Two final sections analyze results and offer an outlook on future work.

2. Rationale and Hypotheses: Moving from Qualitative to Quantitative Research

A number of roadblocks have impeded quantitative research on comics. Until recently, studying comics was the domain of individual fans or scholars who often lacked access to technical infrastructure and expertise. The large-scale digitization of comics by publishers, researchers, or on crowdsourced online databases represents a recent phenomenon that faces severe copyright restrictions. Crowdsourced materials add further challenges, as their quality may differ drastically from one scanned image to another. Skewed or yellowed pages, blurred captions or other scanning artifacts will impede text and image recognition.
In turn, copyright restrictions may not only restrict digitization but also prevent the sharing of corpora that already exist. Like other researchers in DH, comics scholars also face methodological challenges and increasingly need expertise in statistics and practical programming skills in addition to disciplinary knowledge. Where these hurdles are overcome, annotation adds another, if frequently necessary, bottleneck. Even semiautomated mark-up remains a cost-intensive and time-consuming task, proving prohibitive for studying large data sets.

Overcoming these challenges becomes an even more urgent task given the necessary limitations of qualitative research. Experimental studies of cognitive processes add an invaluable perspective to our understanding of comics narratives. For the first time, methods such as eye tracking and EEG allow for the direct study of reception processes and brain functions. However, experimental set-ups remain restricted to studying excerpts, shorter texts such as comic strips and brief stories, or smaller numbers of long-form comics. Similarly, narratology and multimodal approaches currently lack the powers of automation that would enable their application to large corpora of visual narratives. This restriction to excerpts and small samples risks obscuring both the historical evolution and aesthetic properties of the medium. Individual case studies, however pertinent the insights they may produce, must be checked against quantitative analyses of comics as a cultural system. Ultimately, cognitive and literary case studies may form one element within a tripartite research program. Ideally, corpus research will eventually complement experimental studies and close readings (Beaty, Sousanis & Woo; Cohn, this volume).
Both, in turn, may then be embedded in a cultural sociology that mediates between different levels of analysis and provides a media-specific theory of production, circulation, and consumption (Underwood & English). Automated stylistic measures mark an important step in making this ambitious research program a reality.

In what follows, we introduce several measurements that describe the visual style of graphic narrative. This focus on visual properties means that text enters into our analysis only as image, not linguistic, data. We maintain this focus for pragmatic as well as methodological reasons. The automatic detection of text on comics pages remains a work in progress, and optical character recognition (OCR) still struggles with handwritten or quasi-handwritten comics fonts. As was mentioned earlier, manual mark-up remains extremely time-intensive. This means that we cannot easily annotate the text contained in hundreds of book-length comics. At the same time, there exists a wide range of methods for analyzing literary texts that may be applied to comics once we successfully adapt OCR to the latter’s idiosyncrasies. In contrast, our research for the first time describes and differentiates between some of the central concepts that underpin the study of comics on the basis of visual style.

The work we present in this chapter draws its inspiration from several developments in DH. Issues of genre and authorship have been at the center of digital literary studies recently, drawing on stylometry, topic modeling, and social network analysis (Moretti, Jockers, and Underwood). These methods will likely gain importance for comics research in the years ahead. However, comics are also, and primarily, a visual medium whose verbal components are presented as written text, rather than given auditorily. Any systematic study of comics must therefore engage with their image content and the issue of artistic style.
One of the first DH scholars to address this question was Lev Manovich, whose research presented exploratory visualizations of manga and modernist painting and led to the development of a software tool for large image sets (“Style Space”). Manovich’s methods were highly suggestive. However, they proved mostly illustrative when applied to specific media and offered little insight into aesthetic form. James Cutting’s quantitative studies of Hollywood film, which attend to aspects such as shot length, movement, and color to trace cinema’s formal development, proved another inspiration. Cutting’s work, based in psychology, shares our own interest in popular culture and visual style and presents opportunities for comparing two related media.

In formulating our main research questions and hypotheses, we drew on the approaches elaborated above: Stylometry and literary theory, digital art history, and empirical film studies. We were also guided by the experience of assembling and studying a growing corpus of graphic narratives. Thus, it seemed to us that graphic memoirs were often brighter than other examples of long-form comics. In one of his papers, Cutting described the development of Hollywood cinema with the terms “quicker, faster, darker” (“Quicker” 569). Quicker and faster (a decrease in the average shot length and an increase in motion and movement) seemed specific to film. Brightness, however, constitutes an important aspect of comics as well. If Hollywood cinema was becoming darker (noticeable in the action films that generate a large share of industry profits), would we find a similar change in graphic narratives? Another intuition concerned the visual regularity of graphic novels and memoirs.
Prominent examples of the two genres, as different as Alison Bechdel’s Fun Home and Chris Ware’s Jimmy Corrigan, eschewed the visual spectacle of contemporary Marvel and DC comics to tell their stories with the help of regular panel grids and restrained color palettes. This observation gave rise to our second research question: Were graphic narratives becoming less animated in their visual style? If so, did this development affect all genres within that broad category? Earlier studies, in which we analyzed over a hundred cover images, suggested a different hypothesis (Dunst et al. “Corpus Analysis”). The increasing stylistic variety of covers pointed to a process of internal diversification. Artists and publishers, we speculated, might be trying to distinguish themselves in a crowded market by designing increasingly varied covers. Would we be able to discern a similar process at work within graphic narratives?

Thinking about genre and the dynamics of the still-developing format of graphic narrative led us to the nexus of authorship and publishing. Historically, most comics were published serially, often anonymously, and by teams of writers, illustrators, letterers, and inkers. Serial publication remains relatively common for graphic narratives, a characteristic they share with the literary novel at an earlier stage of its development in the eighteenth and nineteenth centuries. Most typically, however, graphic narratives are a one-shot form, published as novel-length books. Similarly, a team of creatives collaborating on a book is no rarity. However, the preponderance of single authors means that a version of auteur theory has grown around research into graphic novels. Did these configurations of authorship and publication leave artistic traces that we could detect with the help of automatic measures?
In other words, and with the proviso that we were just beginning this kind of work: Are concepts such as the single author and the book format meaningful for the study of graphic narratives?

3. Method: Automatic Measurements and Artistic Style in Graphic Narrative

This section describes the motivations for building a corpus of graphic narrative, presents a brief overview of the principles underlying the corpus design, and details the basic measurements and procedures on which our analysis of comics images is based.

3.a Material: The Graphic Narrative Corpus

In a recent interview, Art Spiegelman described the graphic novel as the “dominant form” of contemporary comics, defining it simply as the “single book that tells a story” (“Public Conversation” 35). Our interest in graphic narrative (what Spiegelman termed, following popular convention, comics novels) stems from a similar observation: The rapid rise of long-form comics in the United States and internationally, and its transformative effect on the medium as a whole. Studying the evolution of graphic narratives thus offers the opportunity to do cultural history in situ: to explore the dynamics of a popular form during a period when it continues to morph into high, or at least middlebrow, culture, in a process we might term aesthetic gentrification (see note 1). The fact that graphic narratives are highly labor-intensive and constitute a niche product in market terms, often devised by a single author over a period of several years, means that they are published in far smaller numbers than literary novels or feature films. Graphic narrative’s brief history and comparatively small numbers enable even a mid-sized corpus to provide insight into an entire cultural form. Their appropriation of aspects of the novel and cinema also makes graphic narratives an ideal test case for inter- and transmedial research.
Motivated by these research interests, the graphic narrative corpus (GNC) collects fictional and non-fictional texts, including graphic novels and memoirs, as well as graphic journalism (Dunst et al., “Graphic Narrative Corpus”). An additional element is what we refer to as graphic fantasy, which includes examples of the fantasy, fairy tale, horror, mystery, superhero, and science fiction genres. In our definition, the term graphic narrative refers to book-length comics that exceed 64 pages in length, tell one continuous story or several closely related stories, are aimed primarily at an adult readership, and form one single volume or a limited series (such as a trilogy). Historically, our corpus stretches from the mid-1970s, when the graphic novel came into its own, to the present. At the time of writing, we include around 240 titles. This collection of texts has been conceived as a stratified monitor corpus. This means that we keep adding new titles to increase representativeness and aim to balance aspects like genres or the gender of our authors within the overall corpus.

Because of the pop-cultural status of graphic narratives and the recent advent of systematic research on the topic, it is impossible to know exactly how many graphic narratives exist. Most libraries, if they collect graphic narratives at all, have only recently begun to do so and lack systematic focus in their acquisitions policy. Established online databases such as the Grand Comics Database (GCD) theoretically provide a more complete overview but do not always distinguish graphic narratives from serial comics. Therefore, we have drawn on a wide range of sources in constructing our corpus.
These are: international comics prizes (Eisner, Ignatz, Harvey, and the British Comics Award), academic databases (JSTOR, MLA, Bonn Online Bibliography of Comics Research), Amazon.com bestseller lists, online bibliographies (Grand Comics Database, Comicvine), library collections (Library of Congress, Billy Ireland Cartoon Library at Ohio State University), literary histories, a survey of international comics experts, and media articles about graphic narratives (The Guardian, Time, etc.). By casting our net widely, we aim to balance pop-cultural narratives drawn from genres such as superhero and horror comics with more prestigious forms such as graphic memoirs and offset the biases of individual sources.

3.b Design and Procedure: Basic Measurements

This section provides a brief overview and explanation of the stylistic measurements we used in our research for this chapter. At the time of analysis in November 2017, 209 full-length books in the corpus running to nearly 50,000 pages had been scanned by a commercial provider at 600 dpi and saved in PNG format. Scans were checked manually for quality, and pages containing scanning artifacts were replaced with new scans. For each page of each graphic narrative in our corpus, we collected the following three basic measurements:

Brightness

In order to measure the mean brightness of a page, we transformed the page into a grayscale image by computing the luma of each pixel, i.e., the weighted sum of its gamma-compressed RGB values, which can be viewed as a linear approximation of the pixel’s luminance. The resulting matrix contained one brightness value for each pixel of the image. The mean brightness of a page can then be calculated as the mean of the brightness values of all pixels of the page. In addition, we calculated the standard deviation of all brightness values to obtain a measure of the page’s diversity in brightness.
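The brightness measurement described above can be illustrated with a minimal sketch. It assumes the Rec. 601 luma weights (0.299, 0.587, 0.114) and a population standard deviation, neither of which the chapter specifies; the names `to_luma` and `page_brightness` and the tiny test page are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np

# Rec. 601 luma weights. This choice is an assumption: the chapter only speaks of
# a weighted sum of the gamma-compressed RGB values without naming coefficients.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def to_luma(rgb):
    """Convert an (H, W, 3) RGB array with values 0..255 to an (H, W) luma array."""
    rgb = np.asarray(rgb, dtype=np.float64)
    return rgb[..., :3] @ LUMA_WEIGHTS


def page_brightness(rgb):
    """Return (mean brightness, standard deviation of brightness) for one page."""
    luma = to_luma(rgb)
    # Population standard deviation (ddof=0); the chapter does not say which variant was used.
    return float(luma.mean()), float(luma.std())


# Hypothetical 2 x 2 page: left column black, right column white.
page = np.array([[[0, 0, 0], [255, 255, 255]],
                 [[0, 0, 0], [255, 255, 255]]], dtype=np.uint8)
mean_brightness, brightness_std = page_brightness(page)
```

For a real scan, the array could be obtained with Pillow, for example `np.asarray(Image.open(path).convert("RGB"))`, where `path` stands for a page image on disk.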
Entropy

In information theory, entropy (also known as Shannon entropy) may be defined as the expected value of the information contained in a message. The entropy $H(X)$ of a message $X = (x_1, \ldots, x_n)$ of length $n$ is defined to be

$$H(X) := -\sum_{i=1}^{n} P(x_i) \cdot \log_2 P(x_i)$$

To calculate the entropy of an image, we take the message $X$ to be the list of the brightness values of all pixels, with each $x_i$ ranging between 0 and 255. In addition, $n$ is the total number of pixels of the image. As $P(x_i)$ denotes the probability or relative frequency of item $x_i$, we can compute $P(x_i)$ for a given $x_i$ by $P(x_i) := (\text{number of pixels having value } x_i) / n$.

The lowest entropy $H(X) = 0$ will be measured for an image of a single color, as this image does not contain any unpredictable information. A black-and-white image that contains only two colors will have an entropy between zero and one: the value is close to zero if one color clearly dominates the other and approaches 1 if both colors occur equally often within the image. The highest possible entropy (8 bits for 256 brightness levels) will be achieved in cases where each level of brightness occurs in a color image and all levels of brightness are equally distributed. As soon as one brightness value dominates the others, the entropy becomes smaller (as the image becomes more predictable).

Number of Shapes

Similar to entropy in that it measures the unpredictability of an image, the number of shapes contained in an image describes its agitation. In order to yield normalized values, and thus values that are comparable within a graphic narrative as well as between them, we scaled the image to a height of 250 pixels.
We then split each grayscale image into five binary sub-images, one for each of five equal-sized brightness intervals. A sub-image contains a 1-bit for a pixel p at coordinate (x, y) if the pixel of the original grayscale image belongs to the brightness interval assigned to this sub-image. Next, we filled small holes of 0-bits in the image up to a diameter of four pixels. For each sub-image we then counted the number of connected components, i.e., areas of 1-bits that are next to each other. The number of shapes of an image thus amounts to the sum of connected components of all its sub-images. In a final step, we discounted components that came to less than 10 pixels in size. We did so in order to avoid counting individual letters of the comics text as separate shapes.

Figure 3.1 shows this process in simplified fashion. First, the picture is split into five sub-images. In this case, the darkest part contains the outlines of a woman and a book, the second incorporates the woman’s shirt and hair, as well as the book cover, and the third just contains visual noise. The fourth sub-image contains the woman’s face, neck, arm, and pants plus some noise, and the brightest part encompasses the background. Staying with our example, we would count four connected components (face, neck, arm, pants) for the fourth image but ignore the noise, as these components are too small. For the third sub-image we would not count any shapes at all, as they are all too small. Thus, in total, we would compute that this image contains 13 + 7 + 0 + 4 + 3 = 27 shapes.

Figure 3.1 Illustration of Shape definition. The image is an adapted excerpt from https://commons.wikimedia.org/wiki/File:BD-propagande_colour_en.jpg (Licensed under CC BY).

3.c Data Processing

Following the calculation of our images’ raw data (brightness, entropy, and number of shapes) as described in the previous subsection, we processed these measurements for different purposes.
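The two remaining raw measurements from section 3.b, entropy and number of shapes, can be sketched as follows. This is an illustration under stated assumptions and not the authors' code: the entropy sums over the distinct brightness levels that occur (the usual form of Shannon entropy, which reproduces the limiting cases named in the text); shapes use 8-connectivity; a "hole" is read as any region of 0-bits whose bounding box fits into 4 x 4 pixels; and the scaling to a height of 250 pixels is omitted. The function names and the `demo` image are hypothetical.

```python
import numpy as np
from scipy import ndimage


def shannon_entropy(gray):
    """Entropy in bits of an 8-bit grayscale image, summed over occupied brightness levels."""
    counts = np.bincount(np.asarray(gray, dtype=np.uint8).ravel(), minlength=256)
    p = counts[counts > 0] / counts.sum()  # relative frequency of each level that occurs
    return float(-(p * np.log2(p)).sum())


# 8-connectivity for shapes is an assumption; the chapter only says "next to each other".
EIGHT_CONNECTED = np.ones((3, 3), dtype=bool)


def count_shapes(gray, n_intervals=5, max_hole=4, min_size=10):
    """Count connected components over equal-width brightness intervals.

    Expects a grayscale array that has already been scaled to the common height.
    """
    gray = np.asarray(gray, dtype=np.uint8)
    interval = (gray.astype(np.int64) * n_intervals) // 256  # interval index 0..n_intervals-1
    total = 0
    for k in range(n_intervals):
        sub = interval == k  # binary sub-image for this brightness interval
        # Fill small holes: 0-regions whose bounding box fits in max_hole x max_hole.
        # Reading "diameter" as bounding-box size is an assumption.
        holes, _ = ndimage.label(~sub)
        for idx, box in enumerate(ndimage.find_objects(holes), start=1):
            height = box[0].stop - box[0].start
            width = box[1].stop - box[1].start
            if height <= max_hole and width <= max_hole:
                region = sub[box]  # view into sub, so the assignment below edits sub
                region[holes[box] == idx] = True
        labels, n_components = ndimage.label(sub, structure=EIGHT_CONNECTED)
        if n_components:
            sizes = np.bincount(labels.ravel())[1:]  # pixel count per component
            total += int((sizes >= min_size).sum())  # drop components under min_size pixels
    return total


# Hypothetical test image: bright background, one dark square, one tiny speck.
demo = np.full((50, 50), 250, dtype=np.uint8)
demo[5:15, 5:15] = 10     # dark 10 x 10 square, counted as a shape
demo[30:32, 30:32] = 100  # 2 x 2 speck, below the 10-pixel threshold
n_shapes = count_shapes(demo)
```

In the `demo` image the background and the dark square each form one component, while the speck is first filled as a small hole in the background and then discarded as too small in its own interval.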
Median Values for Brightness, Entropy, and Shapes

First, we aggregated the per-page values of each measure into a single value per graphic narrative by calculating the median of that measure across all pages.

Stylistic Diversity within a Graphic Narrative

In order to measure the stylistic diversity within a text, we calculated the standard deviation of each of the three basic measures (brightness, entropy, and number of shapes) across all pages of a graphic narrative.

3.d Data Analysis

We used our three basic measurements, as well as their standard deviations as measures of internal diversity, for the analysis of the following concepts: genre, authorship, and publication format. We performed ANOVA and Tukey’s HSD tests to distinguish individual categories, with p < 0.05 as the threshold for statistical significance.

Genre

All titles in the corpus were assigned one or several of a total of 23 subgenre categories. As these categories proved too fine-grained to produce reliable results, subgenres were further grouped into four genres: graphic novel, graphic memoir, graphic fantasy, graphic non-fiction (see note 2). A small number of titles that did not fit these larger categories were collected under the term miscellaneous. A research assistant assigned subgenre terms on the basis of information provided by publishers, booksellers, or, if necessary, a book’s content. Genre categories were checked for consistency by the authors before our measurements were calculated.

Authorship

Our data set included several authors who were represented with more than one graphic narrative. For scatter plots that aimed to distinguish between individual artistic styles, we chose authors with three or more titles in our sample. Another series of plots distinguished between the following authorship categories: single author, collaboration between one author and one illustrator, and multiple authors.
Publication Format

As we mentioned earlier, serial publication remains relatively widespread for graphic narratives. Therefore, we distinguished between four different formats: single-issue, or so-called one-shot, book publications; graphic narratives that were published as part of a book series, such as a trilogy or more than three novel-length installments; graphic narratives that originally appeared as a limited series; and titles that assemble parts of an unlimited series in one volume.

4. Results and Discussion

Results show that all three of our analytic categories (genre, authorship, and publication format) are meaningful distinctions for the quantitative study of graphic narratives. This answers our initial research question about the applicability of these concepts to the study of graphic narratives in the affirmative.

4.a Genre

The genres graphic fantasy, graphic memoir, and graphic novel demonstrate significant distinctions in visual style, both synchronically and diachronically. Graphic fantasy, in particular, is significantly darker than other genre categories (Figure 3.2). Titles in this group are also distinctly less regular, with graphic fantasy showing higher median entropy levels than all other genres, in particular graphic memoirs. Examples in our data set include canonical texts such as Watchmen and V for Vendetta, the superhero arcs Batman: Year One and Batman: Dark Victory, or the fantasy narrative Fables: 1001 Nights of Snowfall. This result supports our anecdotal observation that the core genres of graphic narrative, the graphic novel and the graphic memoir, are characterized by a relatively unvarying artistic style. Titles with the lowest entropy and lowest deviation from mean brightness are overwhelmingly graphic memoirs and graphic novels.

Figure 3.2 Mean Brightness across genres: Graphic Memoir - Graphic Novel (p < 0.016), Graphic Fantasy - Graphic Novel (p < 0.001).
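The kind of group comparison reported in Figure 3.2 (per-title aggregation as in section 3.c, followed by ANOVA and Tukey's HSD as in section 3.d) can be sketched as below. All numbers are synthetic values invented for illustration and are not the chapter's data; `summarize_title` is a hypothetical helper, and the use of SciPy's `f_oneway` and `tukey_hsd` is an assumption about tooling, since the chapter does not name its statistics software.

```python
import numpy as np
from scipy import stats


def summarize_title(page_values):
    """Aggregate per-page values of one measure into (median, standard deviation)."""
    values = np.asarray(page_values, dtype=np.float64)
    # Population standard deviation (ddof=0) is an assumption.
    return float(np.median(values)), float(values.std())


# Synthetic per-title median brightness values, invented for illustration only.
rng = np.random.default_rng(0)
genres = {
    "graphic fantasy": rng.normal(120, 5, size=20),
    "graphic novel": rng.normal(150, 5, size=20),
    "graphic memoir": rng.normal(180, 5, size=20),
}

# One-way ANOVA across all genres, then pairwise Tukey HSD comparisons.
anova = stats.f_oneway(*genres.values())
tukey = stats.tukey_hsd(*genres.values())

names = list(genres)
significant_pairs = [
    (names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if tukey.pvalue[i, j] < 0.05  # significance threshold used in the chapter
]
```

The ANOVA indicates whether any genre differs at all, while the Tukey step identifies which pairs of genres differ, which is the form in which the figure captions report their p-values.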
Results did not support our broad analogy between graphic narrative and Hollywood cinema. Instead, our measurements indicate a more precise and fascinating parallel between film and graphic fantasy. Cutting had demonstrated a historical evolution towards ever-darker images. As we just saw, mean brightness proved a useful category when it came to distinguishing graphic memoirs from other genres. These distinctions are also visible in a diachronic view. Yet, Figure 3.3 does not show a development akin to Hollywood film in graphic narrative as a whole. If we follow the curve of all graphic narratives contained in our sample, we see that the form seems to have been at its darkest from the early to mid-1980s to the late 1990s. This extreme was followed by a steep upswing, and mean brightness has remained more or less stable since the 2000s.

Cutting attributes the decrease of luminance in Hollywood cinema to a number of factors, such as the greater visual range of modern film stock and then digital video, as well as cognitive benefits. As he writes: “a darker film in a dark theatre allows for greater dynamic contrast, which in turn allows for better control over viewers’ attention” (574). While this development has no equivalent in graphic narrative as a whole, it is worth noting that graphic fantasy evinces greater brightness contrasts and darker images than other genres. Our comparatively shorter time period shows only a slight increase on these scores since the 1980s. Nonetheless, graphic fantasy demonstrates clear parallels with contemporary Hollywood cinema, a form with which it shares an emphasis on visual spectacle, and for which it serves as frequent source material. Whether these similarities are motivated by cognitive benefits, or whether we should attribute the darker images of graphic fantasy to its depiction of personal conflict and dystopian worlds, remains an open question.

Figure 3.3 Year vs. Mean Brightness by Genre.

Figure 3.4 Year vs. Standard Deviation from Mean Brightness by Genre.

The diachronic view also fleshes out our characterization of the graphic memoir as comparatively uniform in style. Figure 3.4 details how this development has become markedly more pronounced since the mid-2000s. These years saw the publication of well-known examples of the genre such as Bechdel’s Fun Home, which is included in our sample. The book’s visual regularity and monochrome color scheme contribute to this development, and can also be seen to influence the further evolution of the graphic memoir.

4.b Authorship

As with genre, our automated measurements were able to distinguish between individual authors on the basis of visual style. Figure 3.5 visualizes the artistic signature of authors who are present with three or more titles in our sample. Authors such as Jason Lutes, Osamu Tezuka, and Chester Brown, who evince a comparatively consistent style across several titles, take up a smaller space within the overall matrix. Other authors cover a far larger swathe of “style space” (Manovich). These include Frank Miller, Dave McKean, and David Mazzucchelli. To anyone who has read books by these authors, the results will not come as a surprise. At one end of the scale, Brown favors a highly distinctive drawing style characterized by simple lines and the autobiographical description of everyday life. At the other extreme, Mazzucchelli’s graphic novel Asterios Polyp marks a conscious attempt to combine radically different styles within one over-arching narrative. As an aesthetic experiment, it is worlds apart from his other titles contained in our sample: City of Glass, an adaptation of Paul Auster’s novel that successfully translates a literary aesthetic into the comics format, and his work in the superhero genre: Daredevil: Born Again and Batman: Year One.

Figure 3.5 PCA Author Style.
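Figure 3.5 is labeled as a PCA plot, but the chapter does not list the features that enter it. As a hedged sketch, assume each title is described by a few of the aggregated measures from section 3.c. A two-dimensional projection and a simple per-author spread (how much of the plot an author's titles occupy) could then be computed as below. The feature values, the author labels "A" and "B", and the helpers `pca_scores` and `author_spread` are all invented for illustration.

```python
import numpy as np


def pca_scores(features, n_components=2):
    """Project standardized feature rows onto their first principal components."""
    x = np.asarray(features, dtype=np.float64)
    std = x.std(axis=0)
    z = (x - x.mean(axis=0)) / np.where(std == 0, 1.0, std)  # z-score each column
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt[:n_components].T


def author_spread(scores, authors):
    """Mean distance of each author's titles from that author's centroid."""
    authors = np.asarray(authors)
    spread = {}
    for name in np.unique(authors):
        points = scores[authors == name]
        distances = np.linalg.norm(points - points.mean(axis=0), axis=1)
        spread[str(name)] = float(distances.mean())
    return spread


# Invented rows: hypothetical median brightness, entropy, shapes, brightness deviation.
features = [
    [100, 5.0, 50, 1.0], [101, 5.1, 51, 1.1], [99, 4.9, 49, 0.9],  # author "A": consistent
    [50, 2.0, 20, 0.5], [200, 7.0, 90, 3.0], [120, 6.5, 10, 2.0],  # author "B": varied
]
authors = ["A", "A", "A", "B", "B", "B"]

scores = pca_scores(features)
spread = author_spread(scores, authors)
```

Under these assumptions, a stylistically consistent author yields a small spread and a stylistically varied author a large one, which corresponds to the smaller and larger areas of the plot discussed in the text.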
Tezuka’s mingling among the crowd highlights the limits of this approach. Clearly, these stylistic measurements do not help us differentiate between individual manga titles and Western graphic narratives. Whether they’ll be able to do so once we’ve constructed a comparison corpus that includes a sufficient number of manga remains to be seen.

Alan Moore and Harvey Pekar present different challenges. These two are featured on the book covers as the authors of their works. Unlike the other authors included in this scatter plot, however, they work with professional artists to draw their comics. And yet, their titles show remarkable consistency, especially in Pekar’s case. It is tempting to attribute these results to Moore and Pekar’s authorial influence over these graphic narratives. Moore, for one, has been known to imagine his stories in meticulous detail that includes many of their visual features. In turn, most of Pekar’s comics are characterized by a relatively narrow set of locales and themes: the depiction of everyday life in Cleveland. Thus, it may be that our stylistic measurements, despite their small number and relative simplicity at this point, capture something of the underlying content described by their authors and realized visually by different artists. A less optimistic take may question whether these measurements remain too broad at this point to ground such an interpretation.

Figure 3.6 Standard Deviation from Shapes per Page: 1 Author - Author + Illustrator (p < 0.0114).

That the division of labor between writer and visual artist does lead to measurably different results becomes clear in Figure 3.6. Titles that are co-authored tend to be more varied stylistically, showing a higher standard deviation of the number of shapes per page. We can interpret this as a surplus of artistic creativity, but this surplus may also make a graphic novel less coherent, and therefore more difficult to read.
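A comparison of this kind can be sketched in a few lines. The chapter does not name the test behind the reported p-value, so the example below swaps in a two-sided permutation test as an assumption. The page-level shape counts, the group constants `SINGLE_AUTHOR` and `CO_AUTHORED`, and both function names are hypothetical illustrations, not data or code from the study.

```python
import random
from statistics import mean, pstdev

# Hypothetical shape counts per page, one inner list per title.
SINGLE_AUTHOR = [
    [40, 42, 41, 39],
    [55, 51, 58, 57],
    [30, 36, 31, 33],
    [61, 60, 66, 62],
    [48, 44, 49, 52],
]
CO_AUTHORED = [
    [20, 60, 35, 80],
    [15, 70, 40, 55],
    [90, 30, 60, 45],
    [25, 75, 50, 65],
    [10, 50, 85, 40],
]


def title_variability(shapes_per_page):
    """Standard deviation of the number of shapes across a title's pages."""
    return pstdev(shapes_per_page)


def permutation_p_value(titles_a, titles_b, rounds=5000, seed=0):
    """Two-sided permutation test on the difference in mean variability."""
    group_a = [title_variability(pages) for pages in titles_a]
    group_b = [title_variability(pages) for pages in titles_b]
    observed = abs(mean(group_a) - mean(group_b))

    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    pooled = group_a + group_b
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        shuffled_a = pooled[: len(group_a)]
        shuffled_b = pooled[len(group_a):]
        if abs(mean(shuffled_a) - mean(shuffled_b)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (rounds + 1)


if __name__ == "__main__":
    p_value = permutation_p_value(SINGLE_AUTHOR, CO_AUTHORED)
    print(f"estimated p-value: {p_value:.4f}")
```

A permutation test makes no assumption about how the variability scores are distributed, which suits small groups of titles. With a larger corpus, a standard rank-based or t-test from a statistics library would be a reasonable alternative.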
Figure 3.7 uses the same two measures but places individual book titles in a scatter plot. Two contrasting perspectives command attention here. The first focuses on extremes: titles that stand out as particularly diverse or consistent. Thus, Brown’s memoir I Never Liked You shows up as the most internally consistent title in our sample. Given its black-and-white pages and what we noted earlier about Brown’s drawing style, this may come as little surprise. The same goes for Marjane Satrapi’s Persepolis, another graphic memoir drawn in black and white and in a pared-down visual style. Frank Miller’s installment Sin City: The Hard Goodbye may be a more surprising find at this end of the scale. Black-and-white pages play their part. More decisively, perhaps, Miller’s jagged edges and reduced settings are uniformly so, evoking a world that is as bleak as it is repetitively brutal. At the other end, featuring extremely diverse graphic narratives, Neil Gaiman and Dave McKean’s Signal to Noise lives up to its name.

Another perspective may prove equally if not more interesting. Exceptions are one thing, and the humanities have long focused on what they deem to be extraordinary cultural artifacts. Yet our plot shows that some of the most successful authors in the business, those who have managed to publish and sell several graphic narratives, congregate in a relatively tight cluster. Examples include Will Eisner, Alan Moore, and Craig Thompson. Thus, it might be at the center of these two scales, at a distance from the extremes of internal variety or uniformity, that graphic narratives function most successfully: varied enough to retain the interest of comics readers, sufficiently consistent not to disrupt narrative flow.

Figure 3.7 Standard Deviation of Mean Brightness vs. Standard Deviation of Number of Shapes by Book Titles.
Thus, the process of diversification that we discerned in book covers seems not to extend to graphic storytelling. Where covers must arouse interest, the narratives that follow need to retain readers’ attention, apparently at the cost of too much variety.

4.c Publication Format

Finally, our measurements show significant differences between publication formats. The measures entropy and number of shapes did not give significant results. However, graphic narratives that were originally issued as book publications, either as single-issue volumes or together with other installments, are much brighter than titles that first appeared as limited or unlimited series (Figure 3.8). One reason for this result lies in the impact of genre: graphic memoirs and graphic novels are groups of texts that are generally characterized by higher brightness values. At the same time, prominent exceptions featured in our corpus are not difficult to find: Spiegelman’s Maus, Daniel Clowes’s Ghost World, and Charles Burns’s Black Hole all appeared in serial form first.

Figure 3.8 Mean Brightness by Publication Format: Book Publication - Limited Series (p < 0.001), Book Publication - Unlimited Series (p < 0.004), Book Series - Unlimited Series (p < 0.02).

5. Conclusion & Outlook

In a famous essay originally published in 1958, the French historian Fernand Braudel foresaw “the advent of a quantitative history” (“History”). According to Braudel, this new historiography would “get past superficial observation in order to reach the zone of unconscious or barely conscious elements, and then to reduce that reality to tiny elements, minute identical sections, whose relations can be precisely analyzed” (44). Sixty years later, such a quantitative history is appearing on the horizon for the humanities. As a medium, comics have only recently garnered attention from cultural historians and are only starting to interest digital humanists.
Still, with every step forward additional possibilities become imaginable. Here, we have presented an initial study, which functions as proof of concept for a stylometric approach that analyzes graphic narratives based on their image content alone. Our corpus will soon grow from around 200 to 250 full-length graphic narratives and supply reference corpora of German graphic novels, Franco-Belgian bande dessinée, and Japanese manga. These numbers may seem small in comparison to some literary corpora. Conceivably, a single person may read all of them over a couple of months. Yet even 250 books stretch the human capacity for synthesis. And no human eye or brain matches a household computer when it comes to data retention, arithmetic precision, or even pattern recognition.

As we add texts, we will also introduce new measurements to supplement our stylistic analysis, starting with colorfulness, and extending to algorithms that distinguish between different kinds of drawn edges and lines. Most importantly, we need to improve automatic text location and OCR capacities to open the treasure trove of textual stylometry. Only combinatory methods of visual and textual analysis will allow us to study a large number of comics in their full complexity and understand the minute interaction between those levels.

Distinguishing genre and author signals constitutes another area for improvement. This essay has demonstrated that texts grouped together under the mantle of authorship or genre affiliation share certain stylistic traits. With a sufficiently large data set, we may be able to train the computer to identify other members of these groups, or to establish where and how they overlap. Yet given that artists repeatedly publish in one genre, it is very likely that author signals are interfering with what we take to be the characteristics of genre. Digital literary studies offers examples of how these categories can be disentangled.
We need to adopt its methods with care, and adapt and invent wherever necessary.

Given his focus on socioeconomic data, Braudel’s essay does not engage with the potential of experimental science. Nonetheless, reception data adds a decisive element to quantitative analysis. DH methods tend to examine the formal characteristics of cultural objects in isolation, continuing a long tradition that reduces reception to the intuitions of the expert reader. The successful combination of quantitative and cognitive methods will depend on an additional, third, element. Quantitative, as much as qualitative, methods need a strong grounding in theory: a cultural theory that lays the groundwork for operationalizing its concepts and connects different media and aesthetic structures. Only then will we be able to shape a truly novel method of analyzing comics.

Acknowledgements

This work was funded by the German Federal Ministry of Education and Research (BMBF) as part of an early-career research group on digital and cognitive approaches to graphic narrative. For a project description and other information see: graphic-literature.upb.de. We are also grateful to our research assistants Volker Deppe and Svitlana Zarytska for data processing and database entry in preparation of this essay.

Notes

1 Beaty similarly speaks of the graphic novel as a “gentrifying term” (“Introduction” 108).

2 Subgenres were grouped as follows: graphic novel (action/adventure, coming of age, crime, fiction, historical fiction, romance, comedy, satire); graphic memoir (autobiography, memoir); graphic fantasy (fairy tale, fantasy, horror, mystery, science fiction, superhero); graphic non-fiction (biography, educational, historical non-fiction, graphic journalism, travel writing).

Works Cited

Beaty, Bart. “Introduction.” Cinema Journal, vol. 50, no. 3, 2011, pp. 106–10.

Beaty, Bart. Comics versus Art. U of Toronto P, 2012.

Braudel, Fernand. “History and the Social Sciences: The Longue Durée.” On History, translated by Sarah Matthews, U of Chicago P, 1980, pp. 25–54.

Chute, Hillary. “Comics as Literature? Reading Graphic Narrative.” PMLA, vol. 123, no. 2, 2008, pp. 452–465.

Cutting, James E., Caitlin L. Brunick, Jordan E. DeLong, Catalina Iricinschi, and Ayse Candan. “Quicker, Faster, Darker: Changes in Hollywood Film over 75 Years.” i-Perception, vol. 2, 2011, pp. 569–576.

Dunst, Alexander, Rita Hartel, and Jochen Laubrock. “The Graphic Narrative Corpus: Design, Annotation, and Analysis for the Digital Humanities.” Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition (ICDAR 2017), 13–15 Nov., 2018, Kyoto, pp. 15–20.

Dunst, Alexander, Rita Hartel, Sven Hohenstein, and Jochen Laubrock. “The Corpus Analysis of Multimodal Narrative: The Example of Graphic Novels.” Digital Humanities 2016, 11–16 July 2016, Cracow. Conference Paper.

English, James F., and Ted Underwood. “Shifting Scales: Between Literature and Social Science.” Modern Language Quarterly, vol. 77, no. 3, 2016, pp. 277–295.

Fujimoto, Azuma, Toru Ogawa, Kazuyoshi Yamamoto, Yusuke Matsui, Toshihiko Yamasaki, and Kiyoharu Aizawa. “Manga109 Dataset and Creation of Metadata.” MANPU 2016: Proceedings of the 1st International Workshop on Comics Analysis, Processing and Understanding, 4 Dec 2016, Cancun, ACM, 2016, pp. 1–5.

Guérin, Clément, Christophe Rigaud, Antoine Mercier, Farid Ammar-Boudjelal, Karell Bertet, Alain Bouju…, and Arnaud Revel. “EBDtheque: A Representative Database of Comics.” ICDAR 2013: Proceedings. 12th International Conference on Document Analysis and Recognition, 25–28 Aug. 2013, Washington DC, HAL, 2013, pp. 1145–1149. Hal.archives-ouvertes, hal-00914860.

Iyyer, Mohit, Varun Manjunatha, Anupam Guha, Yogarshi Vyas, Jordan Boyd-Graber, Hal Daumé III, and Larry Davis. “The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives.” Proceedings of the 29th IEEE Conference on Computer Vision and Pattern Recognition, 26 June–1 July 2016, Las Vegas, CPS, 2016, pp. 1–10.

Jockers, Matthew L. Macroanalysis: Digital Methods and Literary History. U of Illinois P, 2013.

Manovich, Lev. “Style Space: How to Compare Image Sets and Follow their Evolution.” Manovich, 2011, manovich.net/index.php/projects/style-space. Accessed 25 June 2017.

Moretti, Franco. Graphs, Maps, Trees: Abstract Models for a Literary Theory. Verso, 2005.

Spiegelman, Art, and W. J. T. Mitchell. “Public Conversation: What the %$#! Happened to Comics.” Critical Inquiry, vol. 40, no. 3, 2014, pp. 20–35.

Stein, Daniel, and Jan-Noël Thon. “Introduction: From Comic Strips to Graphic Novels.” From Comic Strips to Graphic Novels: Contributions to the Theory and History of Graphic Narrative, edited by Daniel Stein and Jan-Noël Thon, DeGruyter, 2013, pp. 1–23.

Underwood, Ted. “The Life Cycles of Genres.” The Journal of Cultural Analytics, 2016, culturalanalytics.org/2016/05/the-life-cycles-of-genres. Accessed 25 Oct 2016.
