Chapter 3
Introduction to Segmenting
In verbal data analysis, a stream of language is first segmented and then, in a
separate and independent step, each segment is selected and coded. The unit
used for segmentation is explicitly defined and often linguistic in nature. As
we will discuss later in this chapter, this unit may be the sentence, the clause,
the t-unit, the topic, or any other unit appropriate to but distinct from the
code that is eventually assigned to it. This approach to segmentation has two
distinctive features that set it apart from other analytic techniques.
Figure 3.2: The unit of segmentation too small for the phenomenon of interest.
Figure 3.3: The unit of segmentation too big for the phenomenon of interest.
Another option, if we really want to make sure we see all of the cats in
our analysis, would be to include a rule such as, “If you see a cat in the unit,
code it as cat no matter what else you may see there.” The result of using a
rule like this is to underreport dogs and other pets in favor of identifying all
of the cats. While this might be an acceptable outcome for some purposes,
we usually try to avoid this situation by getting the best match of unit to
phenomenon of interest that we can at the segmentation stage.
Of course, in verbal data analysis, we are not coding for non-verbal phe-
nomena like cats and dogs. But these instances are intended to illustrate
conceptually some of the potential problems that can later arise from inap-
propriate segmenting, so that you can try to avoid them at this early stage of
analysis.
Segmenting the Data 73
T-Units
Syntactically, a stream of language is structured as a set of t-units, the smallest
groups of words that can make a move in language. By move, we mean the
work that a piece of language does to advance communication. In traditional
argument, rhetorical moves advance the reader or listener along the path the
speaker or writer is constructing (Kaufer & Geisler, 1991). More generally, a
rhetorical move can be understood as the work done by language to fulfill any
communicative purpose (Biber & Kanoksilapatham, 2007, p. 23).
A t-unit consists of a principal clause and any subordinate clauses or
non-clausal structures attached to or embedded in it. The following are all
t-units:
I ran to the store because we needed flour for the cake for Martha’s birthday.
Jen is the mail carrier who replaced the one we liked.
Walking is my favorite form of exercise, the one with the least
impact.
T-units are one of the most basic units of language. If the phenomenon in
which you are interested is associated with the moves that a speaker or writer
makes, the t-unit may be the most appropriate unit for your segmentation.
You might, for example, look at each t-unit in the transcript of an engineering
design meeting for the kind of move it makes: descriptions, proposals,
questions, evaluations, and so forth. You could also look only at t-units
that make proposals. The length of t-units has also been used as a measure of
syntactic maturity (Kellog, 1978).
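If you take this route, mean t-unit length is easy to compute once the stream has been segmented. The sketch below is illustrative only: it assumes segments have already been placed one per line (as described later in this chapter) and simply averages words per line.

```python
# Illustrative sketch: mean t-unit length (words per unit) as a rough
# index of syntactic maturity. Assumes the stream has already been
# segmented by hand, one t-unit per line.

def mean_t_unit_length(segmented_text: str) -> float:
    """Average number of words per t-unit in a pre-segmented stream."""
    units = [line for line in segmented_text.splitlines() if line.strip()]
    if not units:
        return 0.0
    return sum(len(u.split()) for u in units) / len(units)

stream = """I ran to the store because we needed flour for the cake for Martha's birthday.
Jen is the mail carrier who replaced the one we liked."""
print(mean_t_unit_length(stream))  # → 13.0
```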
Clauses
Clauses are the smallest units of language that make a claim—that predicate
something—about an entity in the world. A clause is a group of words con-
taining a subject—the entity—and a predicate—the claim being made about
the subject. When clauses stand alone, they are said to be independent. When
they make sense only in conjunction with an independent clause, they are said
to be dependent. As we have already seen, an independent clause with all of
its dependent clauses makes up the t-unit. The following are all independent
clauses:
the committee requested the prior report from the president
once upon a time two children were lost in the woods
The underlined portions of the following are all dependent clauses:
He refused when the committee requested the prior report from
the president.
She told us that two children were lost in the woods.
If your phenomenon of interest is related to the claims that a speaker or writer
makes about the world, the clause may be the right unit of analysis for you.
To segment a stream of language into clauses, begin as with t-units by
finding the first inflected verb that has a subject. Then look for the next in-
flected verb with a subject. Segment the stream at the most logical place
between them:
He refused
when the committee requested the prior report from the president
The resulting segmented stream will consist of a mix of independent clauses
and dependent clauses. Any stream of language that has been segmented using
t-units can easily be further subdivided into clauses.
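If you are working with a large data set, a rough first pass at flagging clause boundaries can be automated. The sketch below is only a heuristic: the subordinator list is illustrative, not exhaustive, and every flagged boundary still needs the human judgment call described above.

```python
import re

# Heuristic sketch: flag candidate clause boundaries by splitting a
# t-unit just before common subordinating conjunctions and relative
# pronouns. The list is illustrative, and each flagged boundary still
# needs to be confirmed by hand.
SUBORDINATORS = "when|because|although|if|since|while|that|who|which"

def candidate_clauses(t_unit: str) -> list[str]:
    parts = re.split(rf"(?=\b(?:{SUBORDINATORS})\b)", t_unit)
    return [p.strip() for p in parts if p.strip()]

print(candidate_clauses(
    "He refused when the committee requested the prior report from the president."))
```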
Noun Phrases
Noun phrases are the units of language that pick out objects in the world,
both concrete and abstract. In clauses, noun phrases
can serve as subjects, but they may take other roles as well, as the following
examples suggest:
That cat is obnoxious.
The day I was born was cold.
June is a hot month in Kentucky.
If your analysis is concerned with what is being spoken or written about,
you may want to use the noun phrase as your unit of analysis. Noun phrases
can help you identify the domains of knowledge from which speakers or writ-
ers draw, the worlds of discourse through which they move.
Choosing to analyze noun phrases is inherently selective—you make the
decision not to look at the predicates that make up the clauses that, in their
turn, make up the stream of verbal data. If you are going to look at noun
phrases, you will probably want to choose some slightly larger unit (such as
the clause) by which to segment the data, and then look at each noun phrase
within that segment.
The easiest way to segment your discourse by noun phrase is to break the
discourse up by clause, and then underline each noun phrase you find within
each clause.
Verbals
Verbals are the units of language that convey action, emotion, and existence.
Inflected verbals—those with tense—fill the predicate slot in clauses, both
independent and dependent, as in the following:
When you back up your hard drive regularly, you prevent data
loss.
Other verbals come as reduced verb phrases:
The purpose for backing up your hard drive should be obvious.
In this example, “backing up” is actually serving as part of a noun phrase, but it
is clearly a reduced form of the verbal “back up” used in the previous example.
Some verbals like “back up” are idiomatic combinations of verb forms with
prepositions, back + up, where the meaning is quite different from the sum of
the parts. In these cases, the verbal is the entire idiom.
If you are interested in the schema being invoked by your speakers or writ-
ers, you will want to select the verbal as your unit of analysis. The verbal, “back
up your hard drive regularly,” for example, invokes a schema related to com-
puter use in the same way that “went on a date” invokes a courtship schema.
Using verbals, you can track the way schemas shift through your data set.
Because of their relationship to schemata, verbals are often indicative of
genre choices, or shifts within genres from one part to another. In news re-
ports, for example, hot news is often presented using present perfect tense:
The government has announced support for the compromise
bill.
Details are then presented in simple past tense:
that are desired.” In identifying the verbals, underline any phrase that can be
expanded to a full verb. And remember, when you give coders data with this
kind of selective segmentation, make sure that they understand that they are
to code only the language that is underlined.
Topical Chains
Topical chains in both spoken and written interactions are what allow par-
ticipants to understand their discourse as being about something. To some
extent, the topic of a discourse can be established by how it points to or index-
es objects in the world; listeners and readers will understand language which
points to the same object to be somewhat cohesive. But the true workhorses of
cohesion are the topical chains that writers and speakers establish.
Topical chains are constructed out of continuous units—either t-units
(frequent in formal discourse) or clauses (not uncommon in informal
discourse); each continuous unit may either start a new topic or continue the
same topic as the one before it. When a writer constructs a long topical chain,
or when two interlocutors work together to extend one another’s thoughts
through a long topic chain, they develop the complexity of the topic.
Topical chains are often held together by the following kinds of referentials:
• personal pronouns: it, they, he, she, we, them, us, his, her, my, me
• demonstrative pronouns: this and that, these and those
• definite articles: the
• other expressions: such, one
• ellipses and repetition
In oral discourse, the boundaries of topical chains are often marked (OK, well).
If you are interested in the conceptual complexity of discourse, the extent
to which a topic is developed, the depth of interaction on a topic, you may
want to use the topical chain as your unit of analysis. You may also want to use
the topical chain as a unit of analysis if you wish to do a selective analysis of
discussions that concern a specific topic. Next to the t-unit, the topical chain is
one of the most useful units for segmenting language.
Speaker B:
22 So you actually print them [PAPERS] out
23 after you have a look at the abstract [ABSTRACT] on the web
24 and then print it [PAPER] out
25 if you think it’s [PAPER] interesting
26 and you file it [PAPER]?
Speaker A:
27 Right—
28 read it [PAPER] and
29 file it [PAPER]
Speaker B:
30 Read it [PAPER]
31 and file it [PAPER]
Speaker A:
32 And I often hand them [PAPERS] to my students
By using referentials to track and label the topics, as in Figure 3.4, we can
clearly see where breaks in the topical chain occur. When you first begin to
segment by topical chain, you may find it useful to track and label topics as we
have done. With some practice, however, you will be able to sense the breaks
without this kind of extensive annotation.
Units in Conversation
In addition to the basic units that characterize all verbal data described above,
specific kinds of verbal data have additional regularities that can be exploited
as units of segmentation. In this section, we look at regularities in conversa-
tion that suggest a variety of units of segmentation.
Conversational Turns
Conversations are made up of turns. For the most part, only one speaker
talks at a time, although there is often some overlap at the edges. Much can
be learned from looking at the turns in a conversation, particularly by speak-
er. How turns are allocated among possible speakers tells a great deal about
relative power in a conversation: Who speaks most often? Whose turns are
longest? Whose turns initiate new topics? If you are interested in phenomena
of power, you may well want to look at the turn as a unit of segmentation.
To segment a stream of language into turns, simply segment at the borders
between speakers.
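Because turn boundaries coincide with speaker labels in most transcripts, this segmentation is easy to script. The sketch below assumes each new turn starts with a one-word speaker label followed by a colon; multi-word labels (e.g., “Speaker A”) would need an adjusted pattern.

```python
import re

# Sketch: segment a transcript into turns by breaking at the borders
# between speakers. Assumes each new turn starts with a one-word
# speaker label followed by a colon; continuation lines are folded
# into the current turn.

def split_turns(transcript: str) -> list[tuple[str, str]]:
    turns, speaker, lines = [], None, []
    for line in transcript.splitlines():
        m = re.match(r"^(\w+):\s*(.*)", line)
        if m:
            if speaker is not None:
                turns.append((speaker, " ".join(lines).strip()))
            speaker, lines = m.group(1), [m.group(2)]
        elif speaker is not None:
            lines.append(line.strip())
    if speaker is not None:
        turns.append((speaker, " ".join(lines).strip()))
    return turns

transcript = """P: okay, where do you move from here
J: which is going to be a chore
considering
P: it's going to take a while, right"""
for speaker, turn in split_turns(transcript):
    print(speaker, "|", turn)
```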
Conversational Sequences
Conversation does not take place through the random accumulation of
speakers’ turns. Instead, it is organized by its participants into sequences. A
conversational sequence can be thought of as a joint project undertaken by
two or more speakers, using language. It is made up of the following com-
ponents.
speaker’s turn as rephrasings of each other, then treat them as a single initia-
tion. Mark the segmenting boundary prior to the initiation either immediately
before it, or, if the previous t-units were used by the same speaker to introduce
the initiation, then right before these introductory t-units.
To locate the segmenting boundary following the initiation, examine the
nature of the response:
1. No Response: If the initiation does not receive a response from a second
speaker, divide after the initiation when silence is noted in the transcript.
2. Response Only: If a response is given by the second speaker and not
commented on by any other speakers, then divide after the response.
Responses can take more than one t-unit.
3. Response + Comment: If a response is given by a second speaker and
then commented on either by the first speaker or by other speakers,
then divide after these comments. In general, all comments by speak-
ers on previous speakers’ turns should be included in the sequence.
A comment consists of related material, but it must bear on what comes
immediately before it.
Interview Responses
Often implicitly, interviewers select the response as their unit of segmenta-
tion. That is, they look only at what is said in response to interview questions
rather than at the interview as a whole. Such a move often helps to focus on
the situation or person of interest, but it should never be done without con-
sidering the extent to which these responses have been shaped by conversa-
tional imperatives set up by the interviewer’s questions.
Interviews are often structured according to an interview schedule—that
is, with certain fixed questions that are asked of all those interviewed or are
asked repeatedly of the same person over time. In such situations, it is possible
to use the question as a unit of segmentation: to look at all responses to the
same question. Since questions often direct respondents to particular topics,
this unit of segmentation will help to focus on phenomena related to topic.
No guarantee exists, however, that the same topic will not have come up
Units in Text
Written texts have a variety of characteristics, some associated with conven-
tions of publication, others with conventions of typography, and still others
associated with the rhetorical interactions with readers at a distance. All of
the following can serve useful purposes as units of analysis for textual data.
The Text
Perhaps the most obvious unit for analyzing textual data is the text itself. Un-
like the stream of conversational data which must often be bounded in some
arbitrary fashion for the purposes of analysis, written texts often have well-es-
tablished boundaries. In a classroom, for example, students generally write
and bind (with staples or paperclips) individual texts separately: the boundar-
ies of individual student “papers” are seldom hard to determine. In published
formats, conventions exist for separating individual texts: the chapters of an
edited volume, the articles in a magazine or journal, the stories in a newspaper.
From the writer’s point of view, many phenomena occur at the level of the
text: the quality of the text, the genre of the text, the implied audience for the
text. From the reader’s point of view, texts also have a variety of characteristics
that can be examined: their persuasiveness, their familiarity, their importance,
and so on. If you are concerned with any of these or similar phenomena, from
either the writers’ or the readers’ point of view, you may find the text itself a
good unit with which to segment textual data.
Genre Elements
Most texts belong to families of texts we call genres. While genres are not
rigid, texts in certain genres do tend to share common features and common
Typographical Units
Texts are structured by their layout into a variety of typographical units,
any of which can be used for segmentation. Such units can serve useful
purposes as ways of selecting data when the phenomenon of interest is as-
sumed to be regularly distributed through the text and you simply need some
way of selecting part of the data.
You might choose, for instance, every third sentence, every fifth para-
graph, every other page, or the first ten lines of each section. Keep in mind
These indexicals can also give you a handle on the extent to which interlocu-
tors are coordinating with each other.
If your analysis is concerned with understanding the development and
nature of the common ground that speakers or writers create with their in-
terlocutors, you may want to use one or more of the indexicals as your unit
of segmentation:
1. Pronouns: I, he, she, it
2. Demonstratives: this and that, these and those
3. Adverbs: here, now, today, yesterday, tomorrow
4. Adjectives: my, his, her
The easiest way to segment your discourse by indexical is to break the dis-
course up by clause, and then underline any indexical you find within each
clause.
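A first pass at marking indexicals can also be sketched in a few lines of code. The word list below mirrors the four categories above but is illustrative only, and it overgenerates: the complementizer that is flagged just like the demonstrative, so hand-checking is still required.

```python
# Sketch: flag indexicals within clauses that have been segmented one
# per line. The word list mirrors the four categories above but is
# illustrative, and it overgenerates (e.g., the complementizer "that"
# in "told us that ..." is flagged just like the demonstrative).

INDEXICALS = {
    "i", "he", "she", "it",                            # pronouns
    "this", "that", "these", "those",                  # demonstratives
    "here", "now", "today", "yesterday", "tomorrow",   # adverbs
    "my", "his", "her",                                # adjectives
}

def indexicals_in(clause: str) -> list[str]:
    words = (w.strip(".,?!;:").lower() for w in clause.split())
    return [w for w in words if w in INDEXICALS]

print(indexicals_in("She told us that two children were lost here yesterday."))
# → ['she', 'that', 'here', 'yesterday']
```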
Personal Pronouns
Personal pronouns—I, me, you, he, she, him, her, they, and them—point to the
world of interlocutors that a speaker or writer takes as common ground.
As we have already seen, pronouns are indexical. Focusing on personal
pronouns as a specific kind of indexical can give you clues about the scope
of the human world in which writers or speakers see themselves as acting.
Looking specifically at first person pronouns (I, me) can help you to examine
the agency of the speaker or writer. Personal pronouns can be looked at for
themselves (how many times did the speaker use I?), for what they refer to
(where did the speaker talk about her family?), or they can be used to select
other phenomena for analysis (what kind of verbals does the speaker attribute
to herself?).
To use personal pronouns to segment your discourse, underline each pro-
noun and then break the discourse right before each one. Or you may choose
to segment your discourse first by some larger comprehensive unit such as
the t-unit or clause, and then underline each personal pronoun within each of
these larger units.
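If you take the second route—segmenting by a larger comprehensive unit and then marking pronouns—a simple count can be sketched as follows. The pronoun list is the one given above, and the count is a first pass only (it cannot, for instance, tell possessive her from objective her).

```python
from collections import Counter

# Sketch: count personal pronouns in a stream, e.g. to ask how many
# times a speaker used "I". A first pass only: the lookup cannot
# distinguish possessive "her" from objective "her".

PRONOUNS = {"i", "me", "you", "he", "she", "him", "her", "they", "them"}

def pronoun_counts(text: str) -> Counter:
    words = (w.strip(".,?!;:\"'").lower() for w in text.split())
    return Counter(w for w in words if w in PRONOUNS)

print(pronoun_counts("I ran to the store because we needed flour. She told me."))
```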
Modals
Modals provide language users with a way to indicate the attitude or stance of
the writer or speaker toward the message he or she is conveying. The stance
can range from bald assertion:
Sally will leave tomorrow.
to assertions with less definite status:
Sally might leave tomorrow.
Sally could leave tomorrow.
Sally will probably leave tomorrow.
Sally will certainly leave tomorrow.
In general, modality can communicate probability (she might go tomorrow),
advisability (she ought not go tomorrow), or conditionality (she would have
gone yesterday). Modality is often conveyed through the modal auxiliary
verbs: might, may, and must, can and could, will and would, shall and should,
ought. Modality can be conveyed in many other ways, however, as the following
lists suggest:
1. Modal auxiliaries: might, may and must, can and could, will and would,
shall and should, ought
2. Conditionals: if, unless
3. Idioms: have to, need to, ought to, have got to, had better
4. Adverbials: probably, certainly, most assuredly
5. Verbs: appear, assume, doubt, guess, looks like, suggest, think, insist,
command, request, ask
All modals convey information about the level of obligation or certainty
that speakers or writers associate with the content of what they are say-
ing. If you are interested in tracking the degree of certainty with which
interlocutors assess their claims, you may want to use modals as a unit of
segmentation.
The easiest way to segment your discourse by modality is to break the dis-
course up by clause, and then underline any modal you find within each clause.
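As with indexicals, a first pass at flagging modals can be automated. The sketch below checks only the single-word auxiliaries from list 1; idioms like have to and adverbials like probably would need multi-word matching.

```python
# Sketch: flag modal auxiliaries within clauses segmented one per
# line. Only the single-word auxiliaries from list 1 are checked;
# idioms ("have to") and adverbials ("probably") would need
# multi-word matching.

MODALS = {"might", "may", "must", "can", "could", "will", "would",
          "shall", "should", "ought"}

def modals_in(clause: str) -> list[str]:
    words = (w.strip(".,?!;:").lower() for w in clause.split())
    return [w for w in words if w in MODALS]

for clause in ["Sally might leave tomorrow.", "Sally will certainly leave tomorrow."]:
    print(clause, "->", modals_in(clause))
```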
Metadiscourse
Metadiscourse is the part of discourse that talks about the discourse itself.
If you can imagine that a text has a primary channel in which information
is conveyed, the metadiscourse forms a background channel through which
the writer talks to the readers to tell them how to understand and interpret
the text.
There are two primary kinds of metadiscourse. Textual metadiscourse di-
rects the reader in understanding the text. Textual connectives such as first,
next, and however help readers recognize how the text is organized. Illocution
markers like in summary, we suggest, and our point is point to the kind of work
the writer is trying to do. Narrators such as according to, many people believe
that, and so-and-so argues that let readers know to whom to attribute a claim.
Textual metadiscourse is directly related to the rhetorical awareness exhibited
in the text, and can be used as a unit of segmentation when you are concerned
with rhetorical sophistication.
A second kind of metadiscourse is interpersonal, and serves to develop a
relationship between writer and reader. Validity markers such as hedges (might,
perhaps), emphatics (clearly, obviously), and narrators (according to) give the
reader guidance on how much face value to give to the claim with which they are
associated. Other attitude markers like surprisingly and unfortunately commu-
nicate the writer’s attitude toward the situation and invite the reader to share the
same stance. Commentaries such as as we’ll see in the following section and read-
ers are invited to peruse the appendix are more extended directions to the reader.
Interpersonal metadiscourse is directly related to the degree to which a text
shows evidence of audience awareness. Interpersonal metadiscourse can vary
by genre, by rhetorical sophistication, and by the degree of comfort a writer
has with the audience addressed. If you are interested in phenomena related
to audience, you may well wish to look at interpersonal metadiscourse as a
unit of segmentation.
The easiest way to segment your discourse by metadiscourse is to break the
discourse up by t-unit, and then underline any metadiscourse you find within
each t-unit.
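Because many metadiscourse markers are multi-word phrases, flagging them takes phrase matching rather than word-by-word lookup. The marker list in the sketch below is a small illustrative subset of those discussed above, and the crude substring test will overgenerate (first also matches inside firstly).

```python
# Sketch: flag textual metadiscourse in a t-unit with simple phrase
# matching, since many markers are multi-word. The marker list is a
# small illustrative subset, and plain substring tests overgenerate
# ("first" also matches inside "firstly").

MARKERS = ["first", "next", "however", "in summary", "we suggest",
           "our point is", "according to"]

def metadiscourse_in(t_unit: str) -> list[str]:
    low = t_unit.lower()
    return [m for m in MARKERS if m in low]

print(metadiscourse_in("In summary, we suggest that the plan, according to our data, works."))
# → ['in summary', 'we suggest', 'according to']
```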
Often the physician will infuse several drugs into the patient to
control these states close to the desired values.¶
For example, in the case of critical care patients with congestive
heart failure, measured variables that are of primary importance
include mean arterial pressure (MAP) and cardiac output (CO).¶
Secondary variables which are monitored, but not regulated as
tightly as the primary variables, include heart rate and pulmo-
nary capillary wedge pressure.¶
The physician uses her/his own senses for other variables that
are not easily measured, such as depth of anesthesia, and often
infers them from a number of measurements and patient re-
sponses to surgical procedures.
See Procedure 3.1 for more information on segmenting using comprehensive
units.
https://goo.gl/1jf8Up
1. Working with a single stream of language in a word processing program, place your cursor at the break
between one segment and the next.
2. Hit enter.
Speakers’ names (P and J) are in the first column. In the second column, each
clause appears, one line at a time.
See Procedure 3.2 for more information on segmenting conversational data.
Figure 3.5: Conversational data, segmented by clause and moved into Excel.
https://goo.gl/1jf8Up
1. Working with your stream of conversation in a word processing program, turn off Autocorrect in your
Preferences under the Word menu item.
2. Replace the colons after speaker names with colon + tab.
3. Place your cursor at the break between one segment and the next.
4. Hit enter to insert a carriage return and then add a tab before the second unit.
The results should look like the following, with -> representing tabs:
P:-> okay ... ah ... in terms of your overall plan ... then ... where do you move from here ...
-> after you finish extracting information ...
J:-> which is going to be a chore ... considering ...
P:-> it’s going to take a while right ...
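These replacements can also be scripted rather than done by hand in Word. The sketch below assumes the raw transcript already has one segment per line, with new turns marked by a one-word speaker label and a colon at the start of the line.

```python
import re

# Sketch: script steps 2-4 of the procedure above, producing the
# tab-delimited layout shown. Assumes one segment per line, with new
# turns marked "Name:" at the start of the line.

def to_tab_layout(lines: list[str]) -> list[str]:
    out = []
    for line in lines:
        m = re.match(r"^(\w+):\s*(.*)", line)
        if m:
            out.append(f"{m.group(1)}:\t{m.group(2)}")   # speaker line
        else:
            out.append("\t" + line.strip())              # continuation line
    return out

raw = ["P: okay ... in terms of your overall plan ... then ...",
       "after you finish extracting information ...",
       "J: which is going to be a chore ... considering ..."]
print("\n".join(to_tab_layout(raw)))
```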
This way of segmenting for selective units not only clearly communicates to
coders which text is supposed to be considered in coding (the underlined
words), but also supplies them with the full context to support that coding.
As you can see from this example, however, the text shown on each line is
rather arbitrary and meaningless. And we will often encounter discourse that
has long passages without a selective unit, as in this passage further along in
Steves’ article:
3. I now realize that this rule of thumb does not actually work for coding cats and
dogs, but I was taught it when I was very young, and it does illustrate the point nicely.
Not only is this first way of segmenting selective units unrelated to meaning,
but it presents problems for understanding the frequency of your phenom-
enon. In this example, the length of the segments is arbitrary, ranging from
four words in segment 23 to 106 words in segment 24. Thus to give a sense of
the relative frequency of modals, we could not rely on the number of segments
as a basis for our analysis; that is, there is no communicative value in saying
that there was on average one modal per segment, since definitionally we have
ensured that there will be one modal per segment. To give a better sense of
frequency, then, we will have to use some other base metric, saying, for example,
that there were five modals in 185 words, or an average of 2.7 modals per 100
words (5 ÷ 185 × 100).
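This normalization is simple to compute. The sketch below reproduces the arithmetic just described for the five modals in 185 words.

```python
# Sketch of the normalization just described: selective units per
# 100 words rather than per segment.

def rate_per_100_words(count: int, total_words: int) -> float:
    return round(count / total_words * 100, 1)

print(rate_per_100_words(5, 185))  # the five modals in 185 words → 2.7
```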
A second problem with this first kind of segmentation arises if you should
wish to code your data in a second way, a not uncommon strategy as we’ll see
in the next chapter. Going back to our cats and dogs example, suppose we
decide we want not only to code for cats and dogs that are in our stream, but
also for the kinds of flowers we see. Since the natural segmentation unit for
flowers is the plant, we could go back and resegment our stream in an entirely
different way than by the nose in order to code for flowers, but this would be a
tremendous amount of work.
As this metaphorical example suggests, a second and simpler approach to
segmenting for selective units is often called for. This second approach in-
volves picking a comprehensive unit to start with and then using underlining
to identify the selective units. In our cats and dogs and plants example, this
might mean segmenting our stream by property lot, and then coding each
lot first for pets and second for plants in bloom. We might have to deal with
the problem of a few lots that had both cats and dogs (perhaps by adding a
category for both), but this approach to segmenting would allow us to look for
relationships between pet ownership and the state of the lot’s landscape.
Coming back to the Rick Steves example, this second approach to selective
segmentation would involve segmenting the discourse first by clause, and
then underlining each modal within the clause:
Now we can say that the same five modals occur over 25 clauses or at a rate of
one every five clauses.
When you use this kind of combination of comprehensive and selective
segmentation, you may find that more than one example of the selective unit
occurs within the comprehensive unit. The following passage, for example,
has been segmented by clause and then for noun phrases. Each of the clauses
contains more than one noun phrase:
1 Critical care patients have often suffered a “disturbance” to
the normal operation of their physiological system;
2 this disturbance could have been generated by surgery or
some sort of trauma.
Our interest here is not, of course, whether the clauses have noun phrases,
but what kind of noun phrases; perhaps we want to code each noun phrase for
the use of everyday language or medical jargon. This would require us to pull
out each noun phrase on a separate line for coding. Ideally, our data would
look like that shown in Figure 3.6 once in Excel:
A          B               C                                                D        F
clause #   noun phrase #   Clause/Noun phrase                               Code 1   Code 2
1                          Critical care patients have often suffered a
                           “disturbance” to the normal operation of their
                           physiological system;
           1a              Critical care patients
           1b              a “disturbance” to the normal operation of
                           their physiological system
2                          this disturbance could have been generated by
                           surgery or some sort of trauma
           2a              this disturbance
           2b              surgery
           2c              some sort of trauma
Figure 3.6: Data first segmented comprehensively by
clause and then selectively by noun phrase.
This data is set up so that 1) the noun phrases are numbered in column B
(1a, 1b, 2a, 2b, 2c) and can be coded using column D; and 2) the clauses are
numbered in column A (1, 2) and can be coded in column F. The greyed-out
cells in each column help to tell the coder what data not to code. We will
describe the procedure for formatting this kind of data before moving it into
Excel in the section on Moving the Segmented Data.
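The numbering scheme in Figure 3.6 (1, 1a, 1b, 2, 2a, . . .) can also be generated programmatically once each clause has been paired with its hand-identified noun phrases. The sketch below is a minimal illustration; the clause and noun-phrase text must still come from your own segmentation.

```python
from string import ascii_lowercase

# Sketch: generate the Figure 3.6 numbering (1, 1a, 1b, 2, 2a, ...)
# from clauses paired with their hand-identified noun phrases. This
# only produces labeled rows; segmentation itself is done by hand.

def number_segments(clauses: list[tuple[str, list[str]]]) -> list[tuple[str, str]]:
    rows = []
    for i, (clause, nps) in enumerate(clauses, start=1):
        rows.append((str(i), clause))                 # comprehensive unit
        for letter, np in zip(ascii_lowercase, nps):  # selective units
            rows.append((f"{i}{letter}", np))
    return rows

data = [("this disturbance could have been generated by surgery or some sort of trauma",
         ["this disturbance", "surgery", "some sort of trauma"])]
for label, text in number_segments(data):
    print(label, text)
```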
https://goo.gl/1jf8Up
To create a new style in Microsoft Word:
1. Select a segment.
2. Format it in the way you want.
For example, you might increase the spacing after a segment to 6 points by placing the cursor in the seg-
ment, invoking the Paragraph command on the Format menu, and increasing the spacing after the para-
graph to 6.
3. Click on the Style Panes icon to open the Style Pane.
4. Click on the New Style button.
5. Name your new style.
To apply this style to other segments:
6. Select the other segments to which you want to apply the new style.
7. Then choose the new style from the drop down Style menu.
To change a style:
8. Change the style the way you want in one segment.
9. Then in the Style Pane, hover over the style name until you see the drop down menu to the right.
10. From that drop down menu, choose Update to Match Selection.
Word will automatically apply the new style changes to every segment with that style in your file.
https://goo.gl/1jf8Up
1. Select and copy the data to be moved from Word.
2. Paste it into a worksheet, starting with Column C, leaving Columns A and B free for the labels you will
insert later.
Generally speaking, each data stream (interview, transcript, text) should be placed on its own worksheet.
Make sure to label the worksheets as you go.
After segments are moved into Excel, you should label them:
3. In Column B, insert numbers starting with 1 and continuing for 3 or 4 segments.
4. Select these cells and drag down to fill the column.
5. In Column A, type a label next to the first segment.
6. Copy the label and select the rest of the cells next to the rest of the data and issue the paste command.
Numbering and labeling segments in this way will ensure that each segment has a unique identity in analysis.
If you have conversational data, you will also want to ensure that each segment is labeled for speaker.
https://goo.gl/1jf8Up
Moving comprehensively coded data into MAXQDA is very easy.
1. In a new project in MAXQDA, from the Documents menu, select the segmented files you want to import.
2. Click Open.
Generally speaking, each data stream (interview, transcript, text) should be placed on its own document.
Make sure to label the documents with identifying information as you go.
Each file will be imported and automatically numbered by segment. MAXQDA will also automatically keep
track of the source of each segment. Thus, you do not need to do any additional numbering or labelling.
https://goo.gl/1jf8Up
1. Working with a stream of language in Microsoft Word, underline the selective segments in your compre-
hensive unit:
Critical care patients have often suffered a “disturbance” to the normal operation of their physio-
logical system
2. Create a copy of the comprehensive segment below the comprehensive unit and edit it so that each
selective unit is located on a separate line beneath the comprehensive unit:
Continued . . .
https://goo.gl/1jf8Up
Critical care patients have often suffered a “disturbance” to the normal operation of their physio-
logical system
Critical care patients
a “disturbance” to the normal operation of their physiological system
After you have edited all of your segments:
3. Select all of the segments you want to number, both comprehensive and selective.
4. Select Outline Numbered from the Bullets and Numbering option under the Format menu.
5. Select the third format option (1. 1.1. 1.1.1) and click Customize.
6. With Level 1 selected, add a second period to the number format so that it is a number followed by two
periods (1..).
7. Select Level 2 and change the Number Style to a, b, c
8. Edit the number format so that it is a period followed by a number followed by a letter followed by a
period (.1a.) and click OK.
9. To move the selective segments to level 2, select and indent them.
10. Check to make sure that the text looks appropriately numbered, with comprehensive segments numbered 1, 2, and so on followed by two periods, and selective segments numbered under their comprehensive segments as 1a, 1b, and so on with a period before and after the numbering.
11. Save your file as a text file (.txt).
To import your data:
12. From within Excel, put your cursor in cell B2 and invoke the File > Import command.
13. Select text file from the file types and click Import.
14. Select the file you want to import and click Get Data.
15. In Step 1 of the Text Import Wizard, select Delimited as your file type and click Next.
16. In Step 2 of the Text Import Wizard, uncheck all delimiters and type a period (.) in the box following
Other:.
17. In Step 3 of the Text Import Wizard, click Finish.
18. In the Import Data dialogue box, make sure the data will go into an existing worksheet with =$B$2 as
the destination and click OK.
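To see why the two-period and period-letter-period numbering matters, consider what the Text Import Wizard does with a period as the Other delimiter. The Python sketch below mimics that split on simplified sample lines (it is not part of the procedure, and real segments containing periods of their own would split into further columns):

```python
# Sample lines following the numbering scheme above: comprehensive
# units as "1..", selective units as ".1a.".
lines = [
    "1..Critical care patients have often suffered a disturbance",
    ".1a.Critical care patients",
    ".1b.a disturbance to the normal operation of their physiological system",
]

# Splitting on every period, as the wizard does, sends comprehensive
# numbers and selective labels into different columns.
rows = [line.split(".") for line in lines]
for row in rows:
    print(row)
```

The comprehensive number lands in the first column and the selective label in the second, which is what keeps the two levels of segmentation distinct in the spreadsheet.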
Issues in Segmenting
As you segment verbal data, a few issues will arise that may require special
handling. In closing this chapter, we call your attention to some of these.
Fragments
Particularly if you are working with oral or online conversations, you may
need to deal with fragments of language that don’t quite add up to the full unit
you are using for segmentation. Not only do we encounter the uhms and ohs
with which people fill their speech, but we also hear the fits and starts of unfinished ideas, a clause started and left hanging. You will need to decide how
https://goo.gl/1jf8Up
1. In Word, underline the selective segments in your comprehensive unit:
Critical care patients have often suffered a “disturbance” to the normal operation of their physiological system
2. Create a copy of the segment and place the selective units beneath the comprehensive unit:
Critical care patients have often suffered a “disturbance” to the normal operation of their physiological system
Critical care patients
a “disturbance” to the normal operation of their physiological system
3. Select each of the selective units and change the font color:
Critical care patients have often suffered a “disturbance” to the normal operation of their physiological system
Critical care patients
a “disturbance” to the normal operation of their physiological system
When you import the data into MAXQDA, it will preserve the font colors, and you will be able to tell a coder
to code just those segments in black (the comprehensive units).
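The font-color technique amounts to tagging each unit with its level and then filtering. A conceptual Python sketch of that filtering, using the example segments above (the tagging labels are our own, not a MAXQDA feature):

```python
# Tag each unit as comprehensive (black) or selective (colored).
comprehensive = ("Critical care patients have often suffered a disturbance "
                 "to the normal operation of their physiological system")

units = [
    (comprehensive, "comprehensive"),
    ("Critical care patients", "selective"),
    ("a disturbance to the normal operation of their physiological system",
     "selective"),
]

# A coder told to "code just the black segments" sees only the
# comprehensive units.
to_code = [text for text, kind in units if kind == "comprehensive"]
print(to_code)
```

Whether you use color or an explicit tag, the effect is the same: the coder's attention is directed to one level of segmentation at a time.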
Center Embedding
Another issue that you may encounter involves center-embedded clauses.
Most of the clauses in English come one right after another and can easily be
segmented as in this earlier example from Rick Steves:
1 Many laptops have a file-sharing option.
2 Though this setting is likely turned off by default,
3 it’s a good idea to check
4 that this option is not activated on your computer
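When clauses do follow one another linearly like this, numbering them is mechanical once the boundaries have been marked by hand. A small Python sketch of that step, using the Rick Steves passage with "|" as a hypothetical hand-inserted boundary marker (center-embedded clauses, by contrast, resist this linear treatment):

```python
# The Rick Steves passage with clause boundaries marked by hand ("|").
passage = ("Many laptops have a file-sharing option. | "
           "Though this setting is likely turned off by default, | "
           "it's a good idea to check | "
           "that this option is not activated on your computer")

# Splitting on the hand-marked boundaries yields the numbered clauses.
clauses = passage.split(" | ")
for number, clause in enumerate(clauses, start=1):
    print(number, clause)
```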
Pronouns
A final issue you may encounter in preparing data for coding involves pronouns. Pronouns pose more difficulties in interpretation than the noun phrases to which they refer. Many references are vague; others refer to persons or things outside of the text rather than within it. And, of course, anytime you ask people to take the additional step of deciding what a pronoun refers to before they decide how to code it, there will be increased variation.
To manage this referential complexity, you can take one of two approaches as you segment the data. The first is simply to remove pronouns from coding even when your unit of analysis would otherwise indicate that they should be coded. So, for example, if you are planning to code all nominals, you might decide not to code any nominal that is a pronoun. While such a decision might seem to eliminate a lot of data, if the elimination is spread proportionately through your coding categories, the overall patterns will be preserved.
Sometimes, however, you will not be able to eliminate the pronouns be-
cause they contain important information about the phenomenon of interest.
If, for example, you are looking at references to human agents, you may not
want to eliminate pronouns because they carry a disproportionate amount of
information about agency in verbal data.
In this case, you may pre-process the verbal data to insert the noun to which
each pronoun should be understood to refer. So, for example, if the pronoun
“he” is used to refer to Harry, we could insert the referent as follows:
He [Harry] was taking his dog for a walk.
Resolving pronominal reference in this way in advance of coding will allow
the data to be coded with greater consistency. But this technique is inherently
tricky: if the referent is unclear or vague, you may need to read too much into
the data to resolve it. Thus you may find it best to resolve only those references
about which there is no ambiguity.
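This pre-processing step can be sketched in Python. The sketch below is only an illustration of the bracketed-referent convention, assuming the analyst supplies a table of unambiguous pronoun-to-referent resolutions; the function name is our own.

```python
import re

def insert_referents(text, resolutions):
    """Insert [referent] after each pronoun listed in resolutions.

    Only pronouns the analyst has judged unambiguous should appear
    in the resolutions table.
    """
    for pronoun, referent in resolutions.items():
        # Match the pronoun as a whole word, case-sensitively, so that
        # "He" does not match inside "his" or a lowercase "he".
        pattern = rf"\b{re.escape(pronoun)}\b"
        text = re.sub(pattern, f"{pronoun} [{referent}]", text)
    return text

print(insert_referents("He was taking his dog for a walk.", {"He": "Harry"}))
# He [Harry] was taking his dog for a walk.
```

Note that this sketch resolves every occurrence of a listed pronoun the same way; in real data, where the same pronoun shifts referents, the analyst would need to resolve each occurrence individually.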
Kuhn, D., Hemberger, L., & Khait, V. (2015). Tracing the development of argumenta-
tive writing in a discourse-rich context. Written Communication, 33(1), 92-121. (By
idea unit.)
Ngai, C. S. B., & Jin, Y. (2016). The effectiveness of crisis communication strategies on
Sina Weibo in relation to Chinese publics’ acceptance of these strategies. Journal
of Business and Technical Communication, 30(4), 451-494. (By genre element,
response.)
Shin, W., Pang, A., & Kim, H-Y. (2015). Building relationships through integrated
online media: Global organizations’ use of brand web sites, Facebook, and Twitter.
Journal of Business and Technical Communication, 29(2), 184-220. (By genre ele-
ment—website, Facebook profile, wall post, Twitter profile, tweet).
Swarts, J. (2015). Help is in the helping: An evaluation of help documentation in a
networked age. Technical Communication Quarterly, 24(2), 164-187. (By t-unit.)
Walker, K. C. (2016). Mapping the contours of translation: Visualized un/certainties
in the ozone hole controversy. Technical Communication Quarterly, 25(2), 104-120.
(By sentence.)
Morris, J. S. (2009). The Daily Show with Jon Stewart and audience attitude change
during the 2004 party conventions. Political Behavior, 31(1), 79-102.
Neuendorf, K. (2016). The content analysis guidebook (2nd ed.). London: Sage Publi-
cations. Kindle Edition.
Saldaña, J. (2016). The coding manual for qualitative researchers. London: Sage Publi-
cations.