Empirical Comics Research
Digital, Multimodal, and Cognitive
Methods
Edited by
Alexander Dunst, Jochen Laubrock,
and Janina Wildfeuer
First published 2019
by Routledge
711 Third Avenue, New York, NY 10017
and by Routledge
2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN
Routledge is an imprint of the Taylor & Francis Group, an
informa business
© 2019 Taylor & Francis
The right of editors Alexander Dunst, Jochen Laubrock, and
Janina Wildfeuer to be identified as the authors of the editorial
material, and of the authors for their individual chapters, has
been asserted in accordance with sections 77 and 78 of the
Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this book may be reprinted
or reproduced or utilised in any form or by any electronic,
mechanical, or other means, now known or hereafter invented,
including photocopying and recording, or in any information
storage or retrieval system, without permission in writing from
the publishers.
Trademark notice: Product or corporate names may be
trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
Library of Congress Cataloging-in-Publication Data
Names: Dunst, Alexander, 1980– editor. | Laubrock, Jochen,
editor. | Wildfeuer, Janina, 1984– editor.
Title: Empirical comics research: digital, multimodal, and
cognitive methods / edited by Alexander Dunst, Jochen
Laubrock, and Janina Wildfeuer.
Description: New York: Routledge, 2019. |
Series: Routledge advances in comics studies; 6 |
Includes bibliographical references and index.
Identifiers: LCCN 2018014754 |
Subjects: LCSH: Comic books, strips, etc.—History and
criticism. | Graphic novels—History and criticism.
Classification: LCC PN6714 .E48 2019 | DDC 741.5/9—dc23
LC record available at https://lccn.loc.gov/2018014754
ISBN: 978-1-138-73744-0 (hbk)
ISBN: 978-1-315-18535-4 (ebk)
Typeset in Sabon
by codeMantra
Contents
List of Figures and Tables
Acknowledgements
1 Comics and Empirical Research: An Introduction
Alexander Dunst, Jochen Laubrock, and Janina Wildfeuer
PART I
Digital Approaches to Comics Research
2 Two Per Cent of What? Constructing a Corpus of Typical American Comic Books
Bart Beaty, Nick Sousanis, and Benjamin Woo
3 The Quantitative Analysis of Comics: Towards a Visual Stylometry of Graphic Narrative
Alexander Dunst and Rita Hartel
4 “The Spider’s Web”: An Analysis of Fan Mail from Amazing Spider-Man, 1963–1995
John A. Walsh, Shawn Martin, and Jennifer St. Germain
5 Crowdsourcing Comics Annotations
Mihnea Tufis and Jean-Gabriel Ganascia
6 Computer Vision Applied to Comic Book Images
Christophe Rigaud and Jean-Christophe Burie
PART II
Linguistics and Multimodal Analysis
7 From Empirical Studies to Visual Narrative Organization: Exploring Page Composition
John A. Bateman, Annika Beckmann, and Rocío Inés Varela
8 Character Developments in Comics and Graphic Novels: A Systematic Analytical Scheme
Chiao-I Tseng, Jochen Laubrock, and Jana Pflaeging
9 How Informative are Information Comics in Science Communication? Empirical Results from an Eye-Tracking Study and Knowledge Testing
Hans-Jürgen Bucher and Bettina Boy
10 The Interpretation of an Evolving Line Drawing
Pascal Lefèvre and Gert Meesters
PART III
Cognitive Processing and Comprehension
11 Viewing Static Visual Narratives through the Lens of the Scene Perception and Event Comprehension Theory (SPECT)
Lester C. Loschky, John P. Hutson, Maverick E. Smith, Tim J. Smith, and Joseph P. Magliano
12 Attention to Comics: Cognitive Processing During the Reading of Graphic Literature
Jochen Laubrock, Sven Hohenstein, and Matthias Kümmerer
13 Reading Words and Images: Factors Influencing Eye Movements in Comic Reading
Clare Kirtley, Christopher Murray, Phillip B. Vaughan, and Benjamin W. Tatler
14 Detecting Differences between Adapted Narratives: Implication of Order of Modality on Exposure
Joseph P. Magliano, James A. Clinton, Edward J. O’Brien, and David N. Rapp
15 Visual Language Theory and the Scientific Study of Comics
Neil Cohn
Glossary
List of Contributors
Index
3 The Quantitative Analysis of Comics: Towards a Visual Stylometry of Graphic Narrative
Alexander Dunst and Rita Hartel
1. Introduction: The Increasing Diversity
of Comics Studies
For most of its relatively brief history, academic comics research was
dominated by methods drawn from literary studies. A distinct form of
close reading characterized this growing field of inquiry: Focusing on a
small, emerging canon, scholars offered reflections on selected comics
series, graphic novels, and manga that sought to historicize the development of the medium or put forward versions of ideology critique inspired
by cultural studies. As Bart Beaty has argued, this methodological focus
tended to neglect the visual aspects of comics insofar as it read them
as a form of literature rather than art (Comics). More recently, a wider
range of empirical and formalist approaches has supplemented literary
methods—a consequence in part of the rising cultural prestige of comics.
Multimodal research, the first forays by cognitive scientists, an emerging
transmedia narratology, computer science, and the digital humanities
(DH) now all contribute to our understanding of comics.
This chapter adds a decidedly quantitative approach to the study of
comics by introducing a stylistic analysis of a large corpus of book-length comics, or graphic narratives. As the next section explains in
more detail, even empirical research in the field has yet to move beyond
the analysis of a limited canon. To date, humanities scholars interested
in comics have lacked the means to study large numbers of images given
the combined challenges of digitization, annotation, and computation.
In the few cases where computer scientists have assembled or analyzed
large collections of comics, their work has focused on the automatic
recognition of formal features such as panels, speech balloons, or characters, rather than researching stylistic or narrative features (Fujimoto
et al., Iyer et al., Guerin et al.). In this essay, we begin to remedy this situation and introduce methods that distinguish between different genres,
detail the historical evolution of graphic novels and memoirs, and shed
light on central aspects of authorship and publication formats. Our visual analysis is based on the first representative corpus of graphic narrative, which we understand to be book-length comics that are aimed at
an adult readership and tell continuous, or closely related, stories. Thus,
we follow scholars such as Hillary Chute, Daniel Stein, and Jan-Noël
Thon in distinguishing between graphic narrative as long-form comics
that tell both fictional and non-fictional stories, and individual genres
such as the graphic novel and graphic memoir (Chute 453 and Stein &
Thon 4–7). While many publishers, literary critics, and even comics artists casually refer to the graphic novel in this earlier sense as an umbrella
term, this usage is highly misleading. Neither Art Spiegelman’s Maus
nor Joe Sacco’s Palestine are graphic novels in any meaningful sense of
the word but constitute examples of graphic non-fiction—a memoir in
the first and journalistic reportage in the second instance. These theoretical distinctions take on practical importance in empirical research and
will form the conceptual foundation for the automated genre distinction
we present here.
In this chapter, an introduction to our research question and hypotheses is followed by sections on method, corpus design, and data analysis.
For readers unaccustomed to technical discussions, these subchapters
may present a challenge. However, comprehension of the chapter does
not require the understanding of equations or statistics. Two final sections analyze results and offer an outlook on future work.
2. Rationale and Hypotheses: Moving from
Qualitative to Quantitative Research
A number of roadblocks have impeded quantitative research on comics.
Until recently, studying comics was the domain of individual fans or
scholars who often lacked access to technical infrastructure and expertise. The large-scale digitization of comics by publishers, researchers,
or on crowdsourced online databases represents a recent phenomenon
that faces severe copyright restrictions. Crowdsourced materials add
further challenges, as their quality may differ drastically from one
scanned image to another. Skewed or yellowed pages, blurred captions
or other scanning artifacts will impede text and image recognition. In
turn, copyright restrictions may not only restrict digitization but also
prevent the sharing of corpora that already exist. Like other researchers
in DH, comics scholars also face methodological challenges and increasingly need expertise in statistics and practical programming skills in
addition to disciplinary knowledge. Where these hurdles are overcome,
annotation adds another, if frequently necessary, bottleneck. Even semi-automated mark-up remains a cost-intensive and time-consuming task,
proving prohibitive for studying large data sets.
Overcoming these challenges becomes an even more urgent task given
the necessary limitations of qualitative research. Experimental studies of
cognitive processes add an invaluable perspective to our understanding
of comics narratives. For the first time, methods such as eye tracking
and EEG allow for the direct study of reception processes and brain
functions. However, experimental set-ups remain restricted to studying
excerpts, shorter texts such as comic strips and brief stories, or smaller
numbers of long-form comics. Similarly, narratology and multimodal
approaches currently lack the powers of automation that would enable
their application to large corpora of visual narratives. This restriction to
excerpts and small samples risks obscuring both the historical evolution
and aesthetic properties of the medium. Individual case studies, however pertinent the insights they may produce, must be checked against
quantitative analyses of comics as a cultural system. Ultimately, cognitive and literary case studies may form one element within a tripartite
research program. Ideally, corpus research will eventually complement
experimental studies and close readings (Beaty, Sousanis & Woo; Cohn,
this volume). Both, in turn, may then be embedded in a cultural sociology that mediates between different levels of analysis and provides
a media-specific theory of production, circulation, and consumption
(Underwood & English).
Automated stylistic measures mark an important step in making this
ambitious research program a reality. In what follows, we introduce several measurements that describe the visual style of graphic narrative.
This focus on visual properties means that text enters into our analysis
only as image, not linguistic, data. We maintain this focus for pragmatic
as well as methodological reasons. The automatic detection of text on
comics pages remains a work in progress, and optical character recognition (OCR) still struggles with handwritten or quasi-handwritten comics fonts. As was mentioned earlier, manual mark-up remains extremely
time-intensive. This means that we cannot easily annotate the text contained in hundreds of book-length comics. At the same time, there exist
a wide range of methods for analyzing literary texts that may be applied
to comics once we successfully adapt OCR to the latter’s idiosyncrasies.
In contrast, our research for the first time describes and differentiates
between some of the central concepts that underpin the study of comics
on the basis of visual style.
The work we present in this chapter draws its inspiration from
several developments in DH. Issues of genre and authorship have been
at the center of digital literary studies recently, drawing on stylometry, topic modeling, and social network analysis (Moretti, Jockers, and
Underwood). These methods will likely gain importance for comics
research in the years ahead. However, comics are also, and primarily,
a visual medium whose verbal components are presented as written
text, rather than given auditorily. Any systematic study of comics must
therefore engage with their image content and the issue of artistic style.
One of the first DH scholars to address this question was Lev Manovich,
whose research presented exploratory visualizations of manga and
modernist painting and led to the development of a software tool for
large image sets (“Style Space”). Manovich’s methods were highly suggestive. However, they proved mostly illustrative when applied to specific media and offered little insight into aesthetic form. James Cutting’s
quantitative studies of Hollywood film, which attend to aspects such as
shot length, movement, and color to trace cinema’s formal development,
proved another inspiration. Cutting’s work, based in psychology, shares
our own interest in popular culture and visual style and presents opportunities for comparing two related media.
In formulating our main research questions and hypotheses, we drew
on the approaches elaborated above: Stylometry and literary theory, digital art history, and empirical film studies. We were also guided by the
experience of assembling and studying a growing corpus of graphic narratives. Thus, it seemed to us that graphic memoirs were often brighter
than other examples of long-form comics. In one of his papers, Cutting
described the development of Hollywood cinema with the terms “quicker,
faster, darker” (“Quicker” 569). Quicker and faster—a decrease in the
average shot length and an increase in motion and movement—seemed
specific to film. Brightness, however, constitutes an important aspect of
comics as well. If Hollywood cinema was becoming darker (noticeable
in the action films that generate a large share of industry profits), would
we find a similar change in graphic narratives?
Another intuition concerned the visual regularity of graphic novels and
memoirs. Prominent examples of the two genres, as different as Alison
Bechdel’s Fun Home and Chris Ware’s Jimmy Corrigan, eschewed the
visual spectacle of contemporary Marvel and DC comics to tell their stories with the help of regular panel grids and restrained color palettes.
This observation gave rise to our second research question: Were graphic
narratives becoming less animated in their visual style? If so, did this development affect all genres within that broad category? Earlier studies, in
which we analyzed over a hundred cover images, suggested a different hypothesis (Dunst et al. “Corpus Analysis”). The increasing stylistic variety
of covers seemed to suggest a process of internal diversification. Artists
and publishers, we speculated, might be trying to distinguish themselves
in a crowded market by designing increasingly varied covers. Would we
be able to discern a similar process at work within graphic narratives?
Thinking about genre and the dynamics of the still-developing format
of graphic narrative led us to the nexus of authorship and publishing.
Historically, most comics were published serially, often anonymously, and
by teams of writers, illustrators, letterers, and inkers. Serial publication
remains relatively common for graphic narratives, a characteristic they
share with the literary novel at an earlier stage of its development in the
eighteenth and nineteenth centuries. Most typically, however, graphic narratives are a one-shot form, published as novel-length books. Similarly, a
team of creatives collaborating on a book is no rarity. However, the preponderance of single authors means that a version of auteur theory has
grown around research into graphic novels. Did these configurations of authorship and publication leave artistic traces that we could detect with the
help of automatic measures? In other words, and with the proviso that we
were just beginning this kind of work: Are concepts such as the single author and the book format meaningful for the study of graphic narratives?
3. Method: Automatic Measurements and
Artistic Style in Graphic Narrative
This section describes the motivations for building a corpus of graphic
narrative, presents a brief overview of the principles underlying the corpus design, and details the basic measurements and procedures on which
our analysis of comics images is based.
3.a Material: The Graphic Narrative Corpus
In a recent interview, Art Spiegelman described the graphic novel as the
“dominant form” of contemporary comics, defining it simply as the “single book that tells a story” (“Public Conversation” 35). Our interest in
graphic narrative—what Spiegelman termed, following popular convention, comics novels—stems from a similar observation: The rapid rise of
long-form comics in the United States and internationally, and its transformative effect on the medium as a whole. Studying the evolution of
graphic narratives thus offers the opportunity to do cultural history in
situ: to explore the dynamics of a popular form during a period when it
continues to morph into high, or at least middlebrow, culture, in a process
we might term aesthetic gentrification.1 The fact that graphic narratives
are highly labor-intensive and constitute a niche product in market terms,
often devised by a single author over a period of several years, means that
they are published in far smaller numbers than literary novels or feature
films. Graphic narrative’s brief history and comparatively small numbers
enable even a mid-sized corpus to provide insight into an entire cultural
form. Their appropriation of aspects of the novel and cinema also makes
graphic narratives an ideal test case for inter- and transmedial research.
Motivated by these research interests, the graphic narrative corpus
(GNC) collects fictional and non-fictional texts, including graphic novels and memoirs, as well as graphic journalism (Dunst et al., “Graphic
Narrative Corpus”). An additional element is what we refer to as graphic
fantasy, which includes examples of the fantasy, fairy tale, horror, mystery, superhero, and science fiction genres. In our definition, the term
graphic narrative refers to book-length comics that exceed 64 pages in
length, tell one continuous story or several closely related stories, are aimed primarily
at an adult readership, and form one single volume or a limited series (such
as a trilogy, etc.). Historically, our corpus stretches from the mid-1970s,
when the graphic novel came into its own, to the present. At the time of
writing, we include around 240 titles. This collection of texts has been
conceived as a stratified monitor corpus. This means that we keep adding
new titles to increase representativeness and aim to balance aspects like
genres or the gender of our authors within the overall corpus.
Due to their pop-cultural status and the recent advent of systematic
research on the topic, it is impossible to know exactly how many graphic
narratives exist. Most libraries, if they collect graphic narratives at all,
have only recently begun to do so and lack systematic focus in their acquisitions policy. Established online databases such as the Grand Comics
Database (GCD) theoretically provide a more complete overview but do
not always distinguish graphic narratives from serial comics. Therefore,
we have drawn on a wide range of sources in constructing our corpus.
These are: international comics prizes (Eisner, Ignatz, Harvey, and
the British Comics Award), academic databases (JSTOR, MLA, Bonn
Online Bibliography of Comics Research), Amazon.com bestseller lists,
online bibliographies (Grand Comics Database, Comicvine), library collections (Library of Congress, Billy Ireland Cartoon Library at Ohio
State University), literary histories, a survey of international comics
experts, and media articles about graphic narratives (The Guardian,
Time, etc.). By casting our net widely, we aim to balance pop-cultural
narratives drawn from genres such as superhero and horror comics with
more prestigious forms such as graphic memoirs and offset the biases of
individual sources.
3.b Design and Procedure: Basic Measurements
This section provides a brief overview and explanation of the stylistic
measurements we used in our research for this chapter. At the time of
analysis in November 2017, 209 full-length books in the corpus running to nearly 50,000 pages had been scanned by a commercial provider
at 600dpi and saved in PNG format. Scans were checked manually for
quality, and pages containing scanning artifacts were replaced with new
scans. For each page of each graphic narrative in our corpus, we collected the following three basic measurements:
Brightness
In order to measure the mean brightness of a page, we transformed the
page into a grayscale image by computing the luma of each pixel, i.e.,
the weighted sum of the pixel’s gamma-compressed RGB values,
which can be viewed as a linear approximation of the pixel’s luminance.
The resulting matrix contained one brightness value for each pixel of
the image. The mean brightness of a page can then be calculated as the
mean of the brightness values of all pixels of the page. In addition,
we calculated the standard deviation of all brightness values to obtain a
measurement of the page’s diversity in brightness.
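The brightness measurement can be illustrated with a minimal Python sketch. It assumes that a page is available as rows of (R, G, B) tuples with values between 0 and 255. The Rec. 601 luma weights, the use of the population standard deviation, and all function names are assumptions made for illustration; the chapter does not specify them.

```python
from statistics import fmean, pstdev

# Rec. 601 luma weights: an assumption, the chapter does not name its coefficients.
LUMA_WEIGHTS = (0.299, 0.587, 0.114)

def luma(pixel):
    """Weighted sum of the gamma-compressed R, G, B values of one pixel (0-255)."""
    r, g, b = pixel
    wr, wg, wb = LUMA_WEIGHTS
    return wr * r + wg * g + wb * b

def page_brightness(page):
    """Return (mean, standard deviation) of luma over all pixels of a page.

    `page` is a list of rows, each row a list of (R, G, B) tuples.
    The population standard deviation is used here (an assumption).
    """
    values = [luma(px) for row in page for px in row]
    return fmean(values), pstdev(values)

# Hypothetical 2x2 page: two white and two black pixels.
page = [[(255, 255, 255), (0, 0, 0)],
        [(0, 0, 0), (255, 255, 255)]]
mean_brightness, brightness_sd = page_brightness(page)
```

In this toy page, half the pixels are white and half are black, so the mean lies midway in the brightness range and the standard deviation is as large as it can be for that mean.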
Entropy
In information theory, entropy (also known as Shannon entropy) may be
defined as the expected value of the information contained in a message.
The entropy H(X) of a message X = (x1,…,xn) of length n is defined to be
$$H(X) := -\sum_{i=1}^{n} P(x_i) \cdot \log_2 P(x_i).$$
To calculate the entropy of an image, the message X is the
list of the brightness values of all pixels, with each xi ranging between 0
and 255, and n is the total number of pixels of the image. As P(xi)
denotes the probability or relative frequency of item xi, we can compute
P(xi) for a given xi by P(xi) := (number of pixels having value xi)/n.
The lowest entropy H(X) = 0 will be measured for an
image of a single color, as this image does not contain any unpredictable
information. A black-and-white image that contains only two colors will
have an entropy between zero and one: it reaches a value close to
zero if one color clearly dominates the other, and a value approximating
1 if both colors occur equally often within the image. The
highest possible entropy will be achieved in cases where each level of
brightness occurs in a color image and all levels of brightness are equally
distributed. As soon as one brightness value dominates the others, the
entropy becomes smaller (as the image becomes more predictable).
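The following Python sketch reproduces the boundary cases just described. It assumes that the sum is taken over the distinct brightness values that occur in the image, as is usual for a histogram-based entropy; the function name and the toy pixel lists are hypothetical.

```python
from collections import Counter
from math import log2

def shannon_entropy(values):
    """Shannon entropy (in bits) of a sequence of brightness values.

    P(x) is the relative frequency c/n of value x. The sum runs over the
    distinct values that occur, and P * log2(1/P) equals -P * log2(P).
    """
    n = len(values)
    counts = Counter(values)
    return sum((c / n) * log2(n / c) for c in counts.values())

single_color = [0] * 100              # one brightness value only
balanced = [0] * 50 + [255] * 50      # two values, equally frequent
skewed = [0] * 99 + [255]             # one value clearly dominates
uniform = list(range(256)) * 4        # all 256 levels equally frequent

h_single = shannon_entropy(single_color)
h_balanced = shannon_entropy(balanced)
h_skewed = shannon_entropy(skewed)
h_uniform = shannon_entropy(uniform)
```

With 256 possible brightness levels, the maximum reached by the uniform case is log2(256) = 8 bits.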
Number of Shapes
Similar to entropy in that it measures the unpredictability of an image,
the number of shapes contained in an image describes its agitation. In
order to yield normalized values, and thus values that are comparable
within a graphic narrative as well as between them, we scaled the image
to a height of 250 pixels. We then split grayscale images into five binary
sub-images of different equal-sized brightness intervals, where each
sub-image contains a 1-bit for a pixel p at coordinate (x,y) if the pixel of
the original grayscale image belongs to the brightness interval assigned
to this sub-image. Next, we filled small holes of 0-bits in the image up
to a diameter of four pixels. For each sub-image we then counted the
number of connected components, i.e., we collected areas of 1-bits that are
next to each other. The number of shapes of an image thus amounts to
the sum of connected components of all its sub-images. In a final step,
we discounted components that came to fewer than 10 pixels in size. We
did so in order to avoid counting individual letters of the comics text as separate shapes. Figure 3.1 shows this process in simplified fashion. First,
the picture is split into five sub-images. In this case, the darkest part
contains the outlines of a woman and a book, the second incorporates
the woman’s shirt and hair, as well as the book cover, and the third
just contains visual noise. The fourth sub-image contains the woman’s
face, neck, arm, and trousers plus some noise, and the brightest part encompasses the background. Staying with our example, we would count
four connected components (face, neck, arm, trousers) for the fourth
sub-image but ignore the noise, as these components are too small. For the
third sub-image we would not count any shapes at all, as they are all too
small. Thus, in total, we would compute that this image contains 13 +
7 + 0 + 4 + 3 = 27 shapes.
Figure 3.1 Illustration of Shape definition. The image is an adapted excerpt
from https://commons.wikimedia.org/wiki/File:BD-propagande_colour_en.jpg (Licensed under CC BY).
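The shape count can be sketched in plain Python as follows. This is an illustration under several assumptions rather than the procedure actually used: the image is taken to be already scaled and given as rows of brightness values between 0 and 255, pixels count as neighbors only horizontally and vertically (4-connectivity), and a hole counts as small if it is enclosed and its bounding box fits within four by four pixels. All function names are hypothetical.

```python
from collections import deque

def components(mask, target):
    """Return the 4-connected components of cells equal to `target`."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    found = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] != target or seen[r][c]:
                continue
            seen[r][c] = True
            queue, comp = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                comp.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and mask[ny][nx] == target):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            found.append(comp)
    return found

def fill_small_holes(mask, max_diameter=4):
    """Set enclosed 0-regions that fit in a max_diameter square to 1."""
    h, w = len(mask), len(mask[0])
    for comp in components(mask, 0):
        rows = [y for y, _ in comp]
        cols = [x for _, x in comp]
        touches_border = (min(rows) == 0 or min(cols) == 0
                          or max(rows) == h - 1 or max(cols) == w - 1)
        fits = (max(rows) - min(rows) < max_diameter
                and max(cols) - min(cols) < max_diameter)
        if fits and not touches_border:
            for y, x in comp:
                mask[y][x] = 1
    return mask

def count_shapes(gray, n_bins=5, min_size=10):
    """Sum the components of at least min_size pixels over all brightness bins."""
    total = 0
    for b in range(n_bins):
        # Binary sub-image: 1 where the pixel falls into brightness interval b.
        mask = [[1 if v * n_bins // 256 == b else 0 for v in row] for row in gray]
        mask = fill_small_holes(mask)
        total += sum(1 for comp in components(mask, 1) if len(comp) >= min_size)
    return total

# Hypothetical 12x12 image: dark background, one bright 4x4 square,
# and one isolated mid-gray pixel standing in for noise.
gray = [[0] * 12 for _ in range(12)]
for y in range(2, 6):
    for x in range(2, 6):
        gray[y][x] = 255
gray[9][9] = 128
n_shapes = count_shapes(gray)
```

In this toy image the background and the bright square each count as one shape, while the single mid-gray pixel is discarded as too small, which mirrors the treatment of noise in the third sub-image of Figure 3.1.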
3.c Data Processing
Following the calculation of our images’ raw data (brightness, entropy,
and number of shapes) as described in the previous subsection, we processed these measurements for different purposes.
Median Values for Brightness, Entropy, and Shapes
First, we aggregated the per-page values of each measure into a single
value per graphic narrative by calculating the median of that measure
across all pages.
Stylistic Diversity within a Graphic Narrative
In order to measure the stylistic diversity within a text, we calculated the
standard deviation of the three basic measures: brightness, entropy, and
number of shapes for all pages of a graphic narrative.
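Both aggregation steps can be sketched in a few lines of Python. The per-page values below are invented for illustration, the dictionary keys and function name are hypothetical, and the choice of the population standard deviation is an assumption, as the chapter does not say which variant was used.

```python
from statistics import median, pstdev

MEASURES = ("brightness", "entropy", "shapes")

def summarize_book(pages):
    """Aggregate per-page measurements into per-book values.

    `pages` is a list of dicts keyed by the names in MEASURES.
    Returns {measure: (median, standard deviation)} for one graphic narrative.
    """
    summary = {}
    for measure in MEASURES:
        values = [page[measure] for page in pages]
        summary[measure] = (median(values), pstdev(values))
    return summary

# Hypothetical per-page measurements for a three-page book.
pages = [
    {"brightness": 120.0, "entropy": 6.0, "shapes": 40},
    {"brightness": 180.0, "entropy": 7.0, "shapes": 55},
    {"brightness": 150.0, "entropy": 6.5, "shapes": 46},
]
book = summarize_book(pages)
```

The median serves as the book's typical value for each measure, while the standard deviation expresses how much individual pages depart from it.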
3.d Data Analysis
We used our three basic measurements, as well as their standard deviations
as measures of internal diversity, for the analysis of the following concepts: genre, authorship, and publication format. We performed
ANOVA and Tukey’s HSD tests to distinguish individual categories, with
p < 0.05 as the threshold for statistical significance.
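To make the first of these tests concrete, the following Python sketch computes the F statistic of a one-way ANOVA by hand. The group values are invented for illustration and are not results from the corpus, and the function name is hypothetical. Turning F into a p-value requires the F distribution, which the standard library does not provide; a statistics package such as SciPy (scipy.stats.f_oneway) would normally be used for that step. Tukey’s HSD test is not sketched here.

```python
from statistics import fmean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of values per category.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = fmean([v for g in groups for v in g])
    means = [fmean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the values around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical median-brightness values for three genres (not corpus data).
fantasy = [100.0, 110.0, 105.0]
memoir = [150.0, 160.0, 155.0]
novel = [130.0, 135.0, 140.0]
f_stat, df_between, df_within = one_way_anova_f([fantasy, memoir, novel])
```

A large F value indicates that the differences between the genre means are large relative to the variation within each genre.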
Genre
All titles in the corpus were assigned one or several of a total of 23 subgenre categories. As these categories proved too fine-grained to produce
reliable results, subgenres were further grouped into four genres: graphic
novel, graphic memoir, graphic fantasy, and graphic non-fiction.2 A small
number of titles that did not fit these larger categories were collected
under the term miscellaneous. A research assistant assigned subgenre
terms on the basis of information provided by publishers, booksellers,
or, if necessary, a book’s content. Genre categories were checked for
consistency by the authors before our measurements were calculated.
Authorship
Our data set included several authors who were represented with more
than one graphic narrative. For scatter plots that aimed to distinguish
between individual artistic styles, we chose authors with three or more
titles in our sample. Another series of plots distinguished between the
following authorship categories: single author, collaboration between
one author and one illustrator, and multiple authors.
Publication Format
As we mentioned earlier, serial publication remains relatively widespread for graphic narratives. Therefore, we distinguished between four
different formats: single-issue, or so-called one-shot, book publications;
graphic narratives that were published as part of a book series, such as a
trilogy or more than three novel-length installments; graphic narratives
that originally appeared as a limited series; and titles that assemble parts
of an unlimited series in one volume.
4. Results and Discussion
Results show that all three of our analytic categories (genre, authorship, and publication format) are meaningful distinctions for the quantitative study of graphic narratives. This answers our initial research
question about the applicability of these concepts to the study of graphic
narratives in the affirmative.
4.a Genre
The genres graphic fantasy, graphic memoir, and graphic novel demonstrate significant distinctions in visual style, both synchronically and
diachronically. Graphic fantasy, in particular, is significantly darker
than other genre categories (Figure 3.2). Titles in this group are also
distinctly less regular, with graphic fantasy showing higher median entropy levels than all other genres, in particular graphic memoirs. Examples in our data set include canonical texts such as Watchmen and V
for Vendetta, the superhero arcs Batman: Year One and Batman: Dark
Victory, or the fantasy narrative Fables: 1001 Nights of Snowfall. This
result supports our anecdotal observation that the core genres of graphic
narrative, the graphic novel and the graphic memoir, are characterized
by a relatively unvarying artistic style. Titles with the lowest entropy
and lowest deviation from mean brightness are overwhelmingly graphic
memoirs and graphic novels.
Figure 3.2 Mean Brightness across genres: Graphic Memoir - Graphic Novel
(p < 0.016), Graphic Fantasy - Graphic Novel (p < 0.001).
Results did not support our broad analogy between graphic narrative
and Hollywood cinema. Instead, our measurements indicate a more precise and fascinating parallel between film and graphic fantasy. Cutting
had demonstrated a historical evolution towards ever-darker images. As
we just saw, mean brightness proved a useful category when it came to
distinguishing graphic memoirs from other genres. These distinctions are
also visible in a diachronic view. Yet, Figure 3.3 does not show a development akin to Hollywood film in graphic narrative as a whole. If we follow the curve of all graphic narratives contained in our sample, we see that
the form seems to have been at its darkest from the early to mid-1980s to
the late 1990s. This extreme was followed by a steep upswing, and mean
brightness has remained more or less stable since the 2000s. Cutting attributes the decrease in luminance in Hollywood cinema to a number
of factors, such as the greater visual range of modern film stock and then
digital video, as well as cognitive benefits. As he writes: “a darker film in
a dark theatre allows for greater dynamic contrast, which in turn allows
for better control over viewers’ attention” (574). While this development
has no equivalent in graphic narrative as a whole, it is worth noting that
graphic fantasy evinces greater brightness contrasts and darker images
than other genres. Our comparatively shorter time period shows only
a slight increase on these scores since the 1980s. Nonetheless, graphic
fantasy demonstrates clear parallels with contemporary Hollywood cinema—a form with which it shares an emphasis on visual spectacle, and
for which it serves as frequent source material. Whether these similarities
are motivated by cognitive benefits, or whether we should attribute the
darker images of graphic fantasy to its depiction of personal conflict and
dystopian worlds, remains an open question.
Figure 3.3 Year vs. Mean Brightness by Genre.
Figure 3.4 Year vs. Standard Deviation from Mean Brightness by Genre.
The diachronic view also fleshes out our characterization of the
graphic memoir as comparatively uniform in style. Figure 3.4 details
how this development has become markedly more pronounced since the
mid-2000s. These years saw the publication of well-known examples of
the genre such as Bechdel’s Fun Home, which is included in our sample.
The book’s visual regularity and monochrome color scheme contribute
to this development, but also can be seen to influence the further evolution of the graphic memoir.
4.b Authorship
As with genre, our automated measurements were able to distinguish between individual authors on the basis of visual style. Figure 3.5 visualizes
the artistic signature of authors who are present with three or more titles
in our sample. Authors such as Jason Lutes, Osamu Tezuka, and Chester
Brown, who evince a comparatively consistent style across several titles,
take up a smaller space within the overall matrix. Other authors cover a
far larger swathe of “style space” (Manovich). These include Frank Miller,
Dave McKean, and David Mazzucchelli. To anyone who has read books by
these authors, the results will not come as a surprise. At one end of the scale,
Brown favors a highly distinctive drawing style characterized by simple lines
and the autobiographical description of everyday life. At the other extreme,
Mazzucchelli’s graphic novel Asterios Polyp marks a conscious attempt to
combine radically different styles within one over-arching narrative. As an
aesthetic experiment, it is worlds apart from his other titles contained in our
sample: City of Glass, an adaptation of Paul Auster’s novel that successfully
translates a literary aesthetic into the comics format, and his work in the
superhero genre: Daredevil: Born Again and Batman: Year One.
Figure 3.5 PCA Author Style.
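How much of “style space” an author occupies can be given a number. The sketch below is one possible way to do so and is not the measure behind Figure 3.5: it takes each title as a point in a two-dimensional style space (for instance, the first two principal components) and reports the mean distance of an author's titles from their centroid. The coordinates, the author labels A, B, and C, and the function name are invented for illustration.

```python
from collections import defaultdict
from math import dist
from statistics import mean


def style_spread(titles, min_titles=3):
    """Mean distance of each author's titles from the author's centroid.

    `titles` is a list of (author, (x, y)) pairs in some style space.
    Authors with fewer than `min_titles` titles are skipped, mirroring the
    three-title threshold mentioned in the text.
    """
    by_author = defaultdict(list)
    for author, point in titles:
        by_author[author].append(point)

    spread = {}
    for author, points in by_author.items():
        if len(points) < min_titles:
            continue
        centroid = tuple(mean(coords) for coords in zip(*points))
        spread[author] = mean(dist(p, centroid) for p in points)
    return spread


# Invented coordinates: author A is consistent, author B is varied,
# author C has too few titles to be included.
titles = [
    ("A", (0.0, 0.0)), ("A", (0.1, 0.0)), ("A", (0.0, 0.1)),
    ("B", (0.0, 0.0)), ("B", (3.0, 0.0)), ("B", (0.0, 3.0)),
    ("C", (1.0, 1.0)), ("C", (2.0, 2.0)),
]
spread = style_spread(titles)
```

A small value corresponds to an author whose titles sit close together in the plot, a large value to one whose titles are scattered across it.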
Tezuka’s mingling among the crowd highlights the limits of this approach. Clearly, these stylistic measurements do not help us differentiate between individual manga titles and Western graphic narratives.
Whether they will be able to do so once we have constructed a comparison
corpus that includes a sufficient number of manga remains to be seen.
Alan Moore and Harvey Pekar present different challenges. These two
are featured on the book covers as the authors of their works. Unlike the
other authors included in this scatter plot, however, they work with professional artists to draw their comics. And yet, their titles show remarkable consistency, especially in Pekar’s case. It is tempting to attribute
these results to Moore and Pekar’s authorial influence over these graphic
narratives. Moore, for one, has been known to imagine his stories in meticulous detail that includes many of their visual features. In turn, most
of Pekar’s comics are characterized by a relatively narrow set of locales
and themes—the depiction of everyday life in Cleveland. Thus, it may be
that our stylistic measurements, despite their small number and relative
simplicity at this point, capture something of the underlying content described by their authors and realized visually by different artists. A less
optimistic take may question whether these measurements remain too
broad at this point to ground such an interpretation.
Figure 3.6 Standard Deviation from Shapes per Page 1 Author - Author + Illustrator
(p < 0.0114).
That the division of labor between writer and visual artist does lead
to measurably different results becomes clear in Figure 3.6. Titles that
are co-authored tend to be more varied stylistically, showing a higher
standard deviation in the number of shapes per page. We can interpret this as
a surplus of artistic creativity—but this surplus may also make a graphic
novel less coherent, and therefore more difficult to read. Figure 3.7 uses
the same two measures but places individual book titles in a scatter plot.
Two contrasting perspectives command attention here. The first focuses
on extremes: Titles that stand out as particularly diverse or consistent.
Thus, Brown’s memoir I Never Liked You shows up as the most internally consistent title in our sample. Given its black and white pages
and what we noted earlier about Brown’s drawing style, this may come
as little surprise. The same goes for Marjane Satrapi’s Persepolis, another graphic memoir drawn in black and white and in a pared-down visual style. Frank Miller’s installment Sin City: The Hard Goodbye may
be a more surprising find at this end of the scale. Black-and-white pages
play their part. More decisively, perhaps, Miller’s jagged edges
and reduced settings are uniformly so, evoking a world that is as bleak
as it is repetitively brutal. At the other end, featuring extremely diverse
graphic narratives, Neil Gaiman and Dave McKean’s Signal to Noise
lives up to its name.
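The comparison between single-author and co-authored titles can be illustrated with a short sketch. The chapter reports a p-value for Figure 3.6 but does not name the test behind it, so the permutation test used here is a substitute chosen for transparency, not the authors' procedure. The per-page shape counts are invented, and all function names are hypothetical.

```python
import random
from statistics import pstdev


def title_variability(shapes_per_page):
    """Standard deviation of the number of shapes across one title's pages."""
    return pstdev(shapes_per_page)


def permutation_p_value(group_a, group_b, n_resamples=10000, seed=0):
    """Two-sided permutation test on the difference between group means.

    Group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose mean gap is at least as large as the observed one.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean_gap(values):
        a, b = values[:n_a], values[n_a:]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = mean_gap(pooled)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        if mean_gap(pooled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Invented shape counts per page, four titles in each group.
single_author = [[10, 11, 10, 11], [7, 8, 7, 8], [20, 22, 20, 22], [3, 4, 3, 4]]
co_authored = [[5, 15, 5, 15], [2, 10, 2, 10], [4, 16, 4, 16], [1, 11, 1, 11]]

single_sd = [title_variability(t) for t in single_author]
co_sd = [title_variability(t) for t in co_authored]
p_value = permutation_p_value(single_sd, co_sd)
```

With real data, each inner list would hold the shape counts for every page of one title, and the two groups would follow the corpus metadata on authorship.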
Another perspective may prove equally if not more interesting.
Exceptions are one thing, and the humanities have long focused on
what they deem to be extraordinary cultural artifacts. Yet our plot shows
that some of the most successful authors in the business, those who have
managed to publish and sell several graphic narratives, congregate in a
relatively tight cluster. Examples include Will Eisner, Alan Moore, and
Craig Thompson. It might therefore be at the center of these two scales,
at a distance from the extremes of internal variety or uniformity, that
graphic narratives function most successfully: varied enough to retain
the interest of comics readers, sufficiently consistent not to disrupt narrative flow. The process of diversification that we discerned in
book covers thus seems not to extend to graphic storytelling. Where covers
must arouse interest, the narratives that follow need to retain readers’
attention, apparently at the expense of greater variety.
Figure 3.7 Standard Deviation of Mean Brightness vs. Standard Deviation of
Number of Shapes by Book Titles.
4.c Publication Format
Finally, our measurements show significant differences between publication formats, although only for brightness: entropy and number of shapes did not
yield significant results. Graphic narratives that were originally
issued as book publications, either as single-issue volumes or together
with other installments, are much brighter than titles that first appeared
as limited or unlimited series (Figure 3.8). One reason for this result lies
in the impact of genre: Graphic memoirs and graphic novels are groups
of texts that are generally characterized by higher brightness values. At
the same time, prominent exceptions featured in our corpus are not difficult to find: Spiegelman’s Maus, Daniel Clowes’s Ghost World, and
Charles Burns’s Black Hole all appeared in serial form first.
Figure 3.8 Mean Brightness by Publication Format: Book Publication - Limited
Series (p < 0.001), Book Publication - Unlimited Series (p < 0.004),
Book Series - Unlimited Series (p < 0.02).
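Of the measures named in this section, entropy is the least intuitive. The sketch below shows one common reading, the Shannon entropy of a page's grey-level histogram. It is offered only as an illustration: the chapter does not specify how entropy was computed, and the input format and function name are assumptions.

```python
from collections import Counter
from math import log2


def grey_entropy(pixels):
    """Shannon entropy, in bits, of the grey values on one page.

    `pixels` is a list of rows of grey values (assumed input format).
    A page in a single tone scores 0; the more evenly the page spreads
    over many tones, the higher the score.
    """
    values = [v for row in pixels for v in row]
    total = len(values)
    counts = Counter(values)
    return sum((n / total) * log2(total / n) for n in counts.values())


flat_page = [[7, 7], [7, 7]]            # one tone only
two_tone_page = [[0, 255], [0, 255]]    # black and white in equal parts
four_tone_page = [[0, 85], [170, 255]]  # four tones in equal parts
```

On this reading, a strictly black-and-white page with equal areas of each tone carries one bit per pixel, while richer tonal palettes score higher.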
5. Conclusion & Outlook
In a famous essay originally published in 1958, the French historian Fernand Braudel foresaw “the advent of a quantitative history”
(“History”). According to Braudel, this new historiography would “get
past superficial observation in order to reach the zone of unconscious
or barely conscious elements, and then to reduce that reality to tiny elements, minute identical sections, whose relations can be precisely analyzed” (44). Sixty years later, such a quantitative history is appearing on
the horizon for the humanities. As a medium, comics have only recently
garnered attention from cultural historians and are only starting to interest digital humanists. Still, with every step forward additional possibilities become imaginable. Here, we have presented an initial study,
which functions as proof of concept for a stylometric approach that analyzes graphic narratives based on their image content alone. Our corpus
will soon grow from around 200 to 250 full-length graphic narratives
and supply reference corpora of German graphic novels, Franco-Belgian
bande dessinée, and Japanese manga. These numbers may seem small in
comparison to some literary corpora. Conceivably, a single person may
read all of them over a couple of months. Yet even 250 books stretch
the human capacity for synthesis. And no human eye or brain matches
a household computer when it comes to data retention, arithmetic precision, or even pattern recognition. As we add texts, we will also introduce new measurements to supplement our stylistic analysis, starting
with colorfulness, and extending to algorithms that distinguish between
different kinds of drawn edges and lines. Most importantly, we need to
improve automatic text location and OCR capacities to open the treasure trove of textual stylometry. Only combinatory methods of visual
and textual analysis will allow us to study a large number of comics
in their full complexity and understand the minute interaction between
those levels.
Distinguishing genre and author signals constitutes another area for
improvement. This essay has demonstrated that texts grouped together
under the mantle of authorship or genre affiliation share certain stylistic traits. With a sufficiently large data set, we may be able to train
the computer to identify other members of these groups, or to establish
where and how they overlap. Yet given that artists repeatedly publish
in one genre, it is very likely that author signals are interfering with
what we take to be the characteristics of genre. Digital literary studies offers examples of how these categories can be disentangled. We
need to adopt its methods with care, and adapt and invent wherever
necessary.
Given his focus on socioeconomic data, Braudel’s essay does not engage with the potential of experimental science. Nonetheless, reception
data adds a decisive element to quantitative analysis. DH methods tend
to examine the formal characteristics of cultural objects in isolation,
continuing a long tradition that reduces reception to the intuitions
of the expert reader. The successful combination of quantitative
and cognitive methods will depend on an additional, third, element.
Quantitative, as much as qualitative, methods need a strong grounding
in theory—a cultural theory that lays the groundwork for operationalizing its concepts and connects different media and aesthetic structures. Only then will we be able to shape a truly novel method of
analyzing comics.
Acknowledgements
This work was funded by the German Federal Ministry of Education
and Research (BMBF) as part of an early-career research group on digital and cognitive approaches to graphic narrative. For a project description and other information see: graphic-literature.upb.de. We are also
grateful to our research assistants Volker Deppe and Svitlana Zarytska
for data processing and database entry in preparation of this essay.
Notes
1 Beaty similarly speaks of the graphic novel as a “gentrifying term” (“Introduction” 108).
2 Subgenres were grouped as follows: graphic novel (action/adventure, coming
of age, crime, fiction, historical fiction, romance, comedy, satire); graphic
memoir (autobiography, memoir); graphic fantasy (fairy tale, fantasy, horror,
mystery, science fiction, superhero); graphic non-fiction (biography, educational, historical non-fiction, graphic journalism, travel writing).
Works Cited
Beaty, Bart. “Introduction.” Cinema Journal, vol. 50, no. 3, 2011, pp. 106–10.
———. Comics versus Art. U of Toronto P, 2012.
Braudel, Fernand. “History and the Social Sciences: The Longue Durée.” On History,
translated by Sarah Matthews, U of Chicago P, 1980, pp. 25–54.
Chute, Hillary. “Comics as Literature? Reading Graphic Narrative.” PMLA,
vol. 123, no. 2, 2008, pp. 452–465.
Cutting, James E., Caitlin L. Brunick, Jordan E. DeLong, Catalina Iricinschi,
and Ayse Candan. “Quicker, Faster, Darker: Changes in Hollywood Film
over 75 Years.” i-Perception, vol. 2, 2011, pp. 569–576.
Dunst, Alexander, Rita Hartel, and Jochen Laubrock. “The Graphic Narrative
Corpus: Design, Annotation, and Analysis for the Digital Humanities.”
Proceedings of the 14th IAPR International Conference on Document Analysis
and Recognition (ICDAR 2017), 13–15 Nov., 2018, Kyoto, pp. 15–20.
Dunst, Alexander, Rita Hartel, Sven Hohenstein, and Jochen Laubrock. “The
Corpus Analysis of Multimodal Narrative: The Example of Graphic Novels.”
Digital Humanities 2016, 11–16 July 2016, Cracow. Conference Paper.
English, James F., and Ted Underwood. “Shifting Scales: Between Literature and
Social Science.” Modern Language Quarterly, vol. 77, no. 3, 2016, pp. 277–295.
Fujimoto, Azuma, Toru Ogawa, Kazuyoshi Yamamoto, Yusuke Matsui,
Toshihiko Yamasaki, and Kiyoharu Aizawa. “Manga109 Dataset and Creation
of Metadata.” MANPU 2016: Proceedings of the 1st International Workshop
on Comics Analysis, Processing and Understanding, 4 Dec 2016, Cancun,
ACM, 2016, pp. 1–5.
Guérin, Clément, Christophe Rigaud, Antoine Mercier, Farid Ammar-Boudjelal,
Karell Bertet, Alain Bouju…, and Arnaud Revel. “EBDtheque: A Representative
Database of Comics.” ICDAR 2013: Proceedings. 12th International Conference on Document Analysis and Recognition, 25–28 Aug. 2013, Washington
DC, HAL, 2013, pp. 1145–1149. Hal.archives-ouvertes, hal-00914860.
Iyyer, Mohit, Varun Manjunatha, Anupam Guha, Yogarshi Vyas, Jordan Boyd-Graber, Hal Daumé III, and Larry Davis. “The Amazing Mysteries of the
Gutter: Drawing Inferences Between Panels in Comic Book Narratives.” Proceedings of the 29th IEEE Conference on Computer Vision and Pattern Recognition, 26 June–1 July 2016, Las Vegas, CPS, 2016, pp. 1–10.
Jockers, Matthew L. Macroanalysis: Digital Methods and Literary History. U
of Illinois P, 2013.
Manovich, Lev. “Style Space: How to Compare Image Sets and Follow their
Evolution.” Manovich, 2011, manovich.net/index.php/projects/style-space.
Accessed 25 June 2017.
Moretti, Franco. Graphs, Maps, Trees: Abstract Models for a Literary Theory.
Verso, 2005.
Spiegelman, Art, and W. J. T. Mitchell. “Public Conversation: What the %$#!
Happened to Comics.” Critical Inquiry, vol. 40, no. 3, 2014, pp. 20–35.
Stein, Daniel, and Jan-Noël Thon. “Introduction: From Comic Strips to Graphic
Novels.” From Comic Strips to Graphic Novels: Contributions to the Theory
and History of Graphic Narrative, edited by Daniel Stein and Jan-Noël Thon,
DeGruyter, 2013, pp. 1–23.
Underwood, Ted. “The Life Cycle of Genres.” The Journal of Cultural Analytics,
2016, culturalanalytics.org/2016/05/the-life-cycles-of-genres. Accessed 25
Oct 2016.