Izydorczak, K., Grzyb, T., & Dolinski, D. (2022). Ascent of Humans: Investigating
Methodological and Ethical Concerns About the Measurement. Collabra: Psychology,
8(1). https://doi.org/10.1525/collabra.33297
Social Psychology
Ascent of Humans: Investigating Methodological and Ethical
Concerns About the Measurement
Kamil Izydorczak 1 a, Tomasz Grzyb 1, Dariusz Dolinski 1
Faculty of Psychology in Wrocław, SWPS University of Social Sciences and Humanities, Wrocław, Poland
Keywords: dehumanisation, prejudice, Ascent of Humans, Blatant dehumanisation, methodology, ethics, measurement, validity
Collabra: Psychology
Vol. 8, Issue 1, 2022
Introduction
Since the Ascent of Humans (AoH) scale was introduced
in 2015, it has been used in 16 published studies and mentioned in 389 articles (based on Google Scholar citations
of Kteily et al., 2015 as of August 2, 2021). Findings based
on these methods have been cited by the Washington Post
(Kteily & Bruneau, 2015) and numerous online media
sources. Considering its impact, novelty, and unorthodox
approach to measuring dehumanisation, a critical analysis of
this method by an independent research team could be a
valuable contribution, as no such analysis has been published yet.
This study investigates whether results obtained by this
scale could be biased and whether the measurement could
impact views toward an out-group, rather than simply measuring them.
Dehumanisation and Its Measurement
Defining and measuring the degree of humanity attributed to groups and individuals is a goal of social and scientific importance. Categorising individuals as ‘human beings’ is a predicate of their inclusion in a circle of moral
consideration (Leyens et al., 2003) and in a group of privileged legal status (Bastian et al., 2011). The dynamics of
humanisation and dehumanisation could also shape state
policy regarding the expansion or limitation of rights and
inclusion/exclusion from mainstream society and culture
(Esses et al., 2008; Tileagă, 2007).
Researchers’ interest in dehumanisation is also sparked
by its historical importance. It is evident that dehumanisation accompanies the horrors of intergroup and international conflicts that we most certainly strive to avoid. Research often invokes examples of Tutsi and Hutu, German
Nazis (Haslam, 2006), or more recent examples, such as
the ongoing Israeli-Palestinian conflict (Bruneau & Kteily,
2017; Kteily et al., 2015). Although it is still unknown
whether dehumanisation leads to aggression or vice versa,
the co-occurrence is clear. Therefore, researchers hope that
examining intergroup dehumanisation will lead to the understanding and prevention of intergroup atrocities.
In summary, there are many reasons why researchers
seek to measure dehumanisation. Nonetheless, addressing
the question of how to do it is complicated, and the history
of such endeavours is brief: the field of social psychology
has been empirically measuring dehumanisation for less
than two decades (Castano & Kofta, 2009).
When discussing the measurement of dehumanisation,
two distinctive approaches (indirect and direct) can be distinguished, each of which comes with benefits and risks.
The indirect approach appeared first. The pioneering and
influential work of Leyens and colleagues (2000) on emotional infrahumanisation established the field of empirical
studies and measurements. In infrahumanisation, the degree of humanness is defined through differences in the attribution of secondary emotions between the in-group and
the out-group (Leyens et al., 2007). A subsequent indirect
approach was introduced in the concepts of mechanistic
Correspondence concerning this article should be addressed to Kamil Izydorczak, SWPS University of Social Sciences and Humanities,
Faculty of Psychology in Wroclaw, Aleksandra Ostrowskiego 30b, 50-505 Wroclaw, Poland. Contact: kizydorczak@swps.edu.pl
Downloaded from http://online.ucpress.edu/collabra/article-pdf/8/1/33297/498131/collabra_2022_8_1_33297.pdf by guest on 23 March 2022
In this pre-registered study on a representative Polish sample (n = 1751), we aimed to test
two potential critical issues with the Ascent of Humans scale. First, we tested whether the
scores may be influenced by peripheral and previously undiscussed properties of the
measurement: position of the slider-scale dot and the pattern of groups’ display. Second,
we tested whether participation in Ascent of Humans measurement may influence the
attitudes towards out-groups, making participants more prejudiced. All our predictions
were conclusively disconfirmed. Additionally, we explored the distribution of Ascent of
Humans, discovering large inflation of scores indicating the absence of dehumanisation.
We discuss implications of our findings for improving theoretical grounds of
dehumanisation and its measurement.
Ascent of Humans—Methodological and Ethical
Aspects
The AoH measurement is preferable over subtle measurements because researchers are not forced to make arbitrary decisions about what makes someone ‘human’. Moreover, the measurement provides an opportunity to examine
previously under-researched, overt forms of dehumanisation. However, it has limitations.
By allowing humanness to be freely interpreted by
the respondents, researchers limit their ability to understand the exact substance of the attitude
that respondents express. This poses a particular problem
in the case of dehumanisation measurement since ‘humanness’ is especially prone to distinct interpretations (Giner-Sorolla et al., 2021).
This leads to questioning how results generated by AoH
should be interpreted. It is assumed that results reflect existing and consciously held beliefs about lesser degrees of
humanness. However, the possibility that, besides respondents’ beliefs, the social situation of the measurement
and its features can affect the results remains unexamined.
According to the tacit, but fundamental, assumption of
classical test theory (Novick, 1966), the measurement
process does not influence the measured variable. It reflects
the ‘real result’ with a smaller or larger margin of error, but
it does not make the real result itself smaller or larger. Unfortunately, in the domain of psychological questionnaires,
such consequences cannot be excluded.
When asked about certain matters, respondents form an
opinion even though they have no real interest or knowledge of the topic, and such opinions may easily shift (Sigelman & Thomas, 1984). Furthermore, they may also express
‘opinions on non-existent topics’, a phenomenon known in
political science and consumer research as ‘pseudo-opinions’ (Bishop et al., 1980).
This does not mean that participants draw their responses from a vacuum. They base them on general convictions or political stances (Sturgis & Smith, 2010). Questionnaires that produce pseudo-opinions do not measure
‘nothing,’ nor do they measure what they overtly inquire
about.
Another problem with measurements in psychological
research is the dependence of results on circumstantial
variables created by the measurement situation itself. Measurement, just like any other research procedure, is a social
situation in which people do not simply express their inner,
authentic, and spontaneous tendencies. Each time people
are asked about something, they do not merely respond
to the stated question. They also respond to imagined or
actual expectations of social situations (Rosenthal, 1963).
Although the researcher or developer of the method may
strive to avoid suggesting the hypotheses or expressing any
expectations, participants may subjectively perceive them
and act accordingly.
Another means by which measurement can result in
much more than just capturing an existing state is the anchoring mechanism. When we are asked to make a statement or guess about something, our judgments are unconsciously affected by the subtle cues provided in the
question (Tversky & Kahneman, 1974). Most typically, a cue
can be an initial reference point ‘X’ given in a question
such as: ‘Is it more than “X” or less than “X”?’ There is a
great deal of evidence indicating that people tend to evaluate close to ‘X,’ even if ‘X’ is markedly distant from the true
value (Furnham & Boo, 2011).
Anchors can also be more subtle, even subliminal (Re-
and animalistic dehumanisation (Haslam, 2006), where the
degree of humanness was defined by traits the general public believes to be ‘uniquely human’ (not shared with animals) and characteristic of ‘human nature’ (absent in automata).
Under the indirect approach, respondents do not explicitly evaluate how human-like an individual or group seems.
Instead, researchers identify and develop a list of traits they
believe are qualities of human beings. Respondents are then
asked to rate individuals based on the degree to which they believe
someone possesses them.
Researchers are able to understand exactly what concept
of humanity respondents are invoking as it is the same one
that the researchers developed. This makes the measurement more reliable and valid. Nonetheless, there is a major drawback: it is up to the researcher to establish what it
means to be human. There is a possibility that responding
to the listed traits or properties does not equate to concluding humanness as a whole. Even if certain participants evaluate a group to be very low on each of the qualities, they
might disagree if asked whether they considered a group
non-human.
The AoH scale (Kteily et al., 2015) is the latest development in measuring dehumanisation and represents a direct
approach. Researchers allow respondents to formulate their
own definition of humanity and directly ask them how human they think a given group is.
The AoH measurement was introduced in response to the
need to investigate the most blatant forms of dehumanisation. While straightforward, aggressive forms of dehumanisation spark interest in the topic, most studies investigate
its subtle forms (Kteily et al., 2015). Subtle measurements
are valid, reliable, and theoretically well-grounded; however, they miss a crucial element in intergroup hostility:
overtly thinking about others as animals. To address this
gap, Kteily and colleagues (2015) proposed a one-item
scale. It includes a direct question about the degree of humanity/animality. Responses are indicated using a slider
scale located below a schematic illustration of human evolution. The proposed method is ‘brief, face-valid and intuitive and it theoretically (…) captures a number of important characteristics of blatant dehumanization’ (Kteily et
al., 2015, p. 4).
Extensive research following the AoH approach, some of which has garnered increased public attention, has demonstrated that the method addresses a theoretically and socially salient issue. As it turns out, blatant dehumanisation
not only remains prevalent among many societies but also
predicts violent attitudes better than subtle measurements
(Kteily et al., 2015). Multiple studies have demonstrated a
correlation between results of the AoH scale and theoretically expected beliefs, opinions, and traits (e.g. Kteily et al.,
2015; Kteily & Bruneau, 2017; Bruneau et al., 2018).
can make them more cognitively available, which may affect
subsequent processes of judgement.
We assume that anchoring, implicit assumptions, and associative priming may impact results of questionnaires because respondents are subjected to the immanent processes
of social and cognitive information processing, not because
they are directly affected by the researchers’ intentions. All
these features may not be consciously or intentionally introduced by researchers; however, as they are subjectively
perceived, they play an important role.
Research Problem
We argue that the peripheral features of the AoH scale,
which are not theoretically justified, may substantially affect results. If this is the case, it could be problematic to
identify the degree to which results generated by the measurement reflect the ‘real level’ of a latent value, and to
what degree they are a by-product of a complex measurement situation encompassing cognitive and social features.
First, we would like to note the issue of the initial placement of the dot on the slider scale below the AoH illustration. According to the illustration in the paper introducing
the method (Kteily et al., 2015), the dot is placed on the extreme left, under the picture of the least developed creature – a quadrupedal monkey. The same dot position was
used in the questionnaire file for online research, which was
shared with us by courtesy of Nour S. Kteily (private correspondence, 2018), and in many subsequent illustrations
from papers using the AoH scale.
While the authors of the first paper describing the
method discuss some peripheral elements of the measurement (such as instruction), they do not discuss the position
of the dot, which may also be an important feature. We theorise that the choice of initial dot position may have nontrivial, theoretically important consequences for the measurement through changes in the implicit premises about
the level of humanity and changes in the meaning of moving the dot.
Placing the dot at the extreme right would create a default ‘100%’ level of humanity, which could reflect the
premise that all groups are biologically complete human beings. In this case, moving the dot would mean diminishing
the initial full humanity of the group, ergo dehumanising it.
Placing the dot on the extreme left, chosen by the authors, sets the default level of humanity as “0%”, which
could suggest a different theoretical assumption (e.g. that
humanity is a “hard to earn” status). In this case, the respondent decides how much humanity to add above the initial ‘zero’, and therefore moving the dot means humanisation.
It can be argued that the dot should be placed in the
middle, as this gives respondents the same degree of choice
when moving left and right, or that there should be no dot
on the screen at all, which seems the most theoretically neutral option.
Whatever position is chosen, this property of the measurement could benefit from theoretical reflection and justification. Moreover, important empirical consequences of
the extreme left position could be suspected. Through anchoring mechanisms, such a placement could lower the
itsma-van Rooijen & L. Daamen, 2006); thus, it is reasonable to suspect that the type or presentation of the research
topic can form a reference point that helps people to find
‘the right answer’ (Strack et al., 2016). For the AoH measurement, the following questions could be posed: What is
the influence of the initial position indicator on the slider
scale? What is the influence of the combined display of an
in-group and out-groups on a single screen?
Moreover, we would like to challenge the implicit assumption that asking whether people are fully human is
harmless and morally neutral. This issue is most important
from an ethical perspective.
It is possible that, at least partly, awareness of social
norms is what keeps people from endorsing and expressing
prejudice. When these norms are dismantled, for example,
through the influence of an authority figure or a shift in
political discourse, prejudice intensifies among members
of a given society, and they re-evaluate their views. When
norms about prejudice seem to be more permissive, individuals think of themselves as less prejudiced, as they compare
themselves to a more bigoted ‘average citizen’ (Crandall et
al., 2018).
We argue that posing a question about the degree of humanity can signal norms, as it provides clear permission to
think about others in a blatantly dehumanising way. By asking this question, the questioner establishes a premise that
differences in humanness may exist, and that expressing
views about them is reasonable and appropriate. Notably,
the AoH scale does not provide the respondent with an opportunity to become aware of this premise and respond to it
(e.g. in the form of a pre-question ‘Do you believe that there
are differences in the level of humanness among groups of
people?’). Instead, the scale follows the default implicit assumption that the respondent subscribes to the notion of
varying degrees of humanness.
Theoretically, respondents can express a view indicating
no differences in the degree of humanness, but the presented default assumption may lead them away from this
view. The influence of ‘defaults’ has been demonstrated in
critically important decisions with real-life consequences,
such as organ donation (Johnson & Goldstein, 2004). Similar patterns are expected in less engaging situations, such
as the anonymous completion of an online questionnaire.
Furthermore, as mentioned earlier, it has been empirically
demonstrated that people can act in accordance with implicit assumptions of questionnaires, for example, by stating opinions about non-existent topics or presenting
knowledge about matters of which they had previously declared a
lack of knowledge.
Another reason why we believe that the AoH measurement can affect respondents’ views on an out-group is the
phenomenon of associative/context priming (DeCoster &
Claypool, 2004; Zeelenberg et al., 2003).
It has been demonstrated that when two stimuli are presented simultaneously, one can prime associations with the
other. The associations between derogated out-groups and
different animals are common. They are constrained by social norms, but individuals can easily encounter them outside the mainstream media, even if they may not endorse
them. Hence, animal-out-group associations are present in the memory and displaying a visual that links them
To test this hypothesis, we introduce two conditions. In
the control condition, the display pattern from the original
study is retained, which means that the groups are presented simultaneously, one below the other, in random order. In the experimental condition, the random sequence of
groups is retained, but every group is displayed separately
with no possibility of seeing previously given scores.
Third, we examine the impact of participating in the AoH
measurement on attitudes toward out-groups. We suppose
that participating in the AoH scale can shift beliefs about an
out-group, such that after responding to the scale, individuals may hold more dehumanising views (H4) and more prejudice (H5) toward the groups which they were asked about.
To test these hypotheses, we measure the level of prejudice and infrahumanisation at the end of all AoH trials.
Scores for prejudice and infrahumanisation obtained
after completing the AoH scale are compared with those of the control group, who respond to a bogus questionnaire of
similar length and structure, free of intergroup and human/
animal connotations.
In addition to the third research problem, we address
how the impact of the AoH scale can be compared to the impact of a similar prejudice-related scale. If the AoH can influence attitudes toward groups, can the same be said about
other, similar measurements? To test this, we introduce another condition with a ‘Feeling thermometer’ scale. The
‘Feeling thermometer’ is similar to the AoH scale. It utilises
a slider scale and encompasses a metaphorical way of expressing a positive or negative attitude. It differs in that it
does not lift any social taboo, and neither image nor instruction contains any suggestion of generic, essential differences between social groups. Therefore, we suppose that
infrahumanisation of out-groups would be greater after responding to the AoH than the ‘Feeling thermometer’ scale
(H6).
The results of this study are valuable, regardless of
whether hypotheses were confirmed. Every instance in
which the hypotheses are proven wrong could be interpreted in favour of robustness and ethical feasibility of the
method. Note that if the measure proves to be unaffected
by the anchoring effect or by cognitive clues suggesting the
researchers’ expectations, it could be treated as evidence in
favour of both the reality of blatant dehumanisation and
the reliability of the method. If all hypotheses are proven
wrong, it could mean that the AoH measurement follows
assumptions of the classical test theory in the sense that
it does not influence the measured variable. It could also
mean that the measured disposition towards a group is generally well established so that it manifests itself in the same
way regardless of changes in the measurement situation.
Method
To test the hypotheses, we conducted an experimental
study involving participants via an online panel. The analysis was performed using the Bayesian approach, with all hypotheses pre-registered via the Open-Science Framework
using the template by van’t Veer and Giner-Sorolla (2016).
All materials and data are freely available through an online
repository (https://osf.io/c5k8q/).
score, as the initial position of the dot can serve as an anchoring point in the evaluation process (Furnham & Boo,
2011; Reitsma-van Rooijen & L. Daamen, 2006; Tversky &
Kahneman, 1974).
Another potential issue is the display of the groups. In
the original method, all evaluated groups were displayed on
the same page. This feature of the measurement situation
has also been left undiscussed, while we argue that it may
be important for results.
Considering measurement as a social situation in which
participants may seek to guess hidden expectations and
rules, we argue that displaying the groups together along
with the instructions which read: ‘Some people think that
people can vary in how human-like they seem (…)’ can result in the impression that the expectation of the task is to
indicate the differences. First, such instruction can serve as
social proof for the validity of the idea that people present
different levels of humanness. Second, when all the groups
are presented together, participants can more easily diversify their responses without having to remember them. In sum,
a display pattern in which respondents can easily see
all their answers, along with instructions encouraging them to indicate differences, could result in increased variability of
scores.
Considering all these reasons, we argue that participating in the AoH measurement can affect views about others.
By removing a social taboo, introducing the premise that
differences in degrees of humanness exist, and strengthening and invoking associative primes between humans and
animals, the AoH measurement could induce dehumanisation rather than just measure it.
To address these concerns, we investigate three research
problems.
First, we evaluate whether the initial placement of the
dot affects scores obtained by the AoH scale. To do so, we
manipulate the dot’s position, creating three conditions. In
the control condition, the dot is placed where it appears
originally, on the extreme left. In the two experimental conditions, the dot is placed in the middle and extreme right.
We hypothesise that because of the anchoring-adjustment heuristic (Furnham & Boo, 2011; Tversky & Kahneman, 1974), the middle position will result in substantially
higher scores than the left position (H1), while the right
position will yield higher results than the middle position
(H2). We suppose that this effect will occur only with respect to highly derogated out-groups because of the ceiling
effect: scores for a favoured out-group may be too high to
be heightened further. From recent public opinion polls, we
conclude that the most disregarded out-groups for the intended population are Muslim refugees, Arabs, Roma, and
Russians (Omyła-Rudzka, 2019; Stefaniak et al., 2017).
Therefore, we propose the first two hypotheses with respect
to them.
Second, we investigate the role of the display pattern
of groups in creating variability among results for different
groups. Due to the perceived social expectation mechanism
and cognitive availability, combined with anchoring heuristics, we expect that the mean within-subject variance will
be higher when groups are displayed all at once. We hypothesise that the scores for different groups will be more differentiated when groups are displayed together (H3).
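The quantity behind H3 can be illustrated with a minimal sketch, assuming each participant's AoH ratings are stored as a list of 0-100 scores, one per evaluated group. The function name and the example ratings are hypothetical and are not taken from the study's data or analysis scripts; the sketch only shows how a mean within-subject variance could be computed per display condition.

```python
from statistics import mean, variance


def mean_within_subject_variance(participants):
    """Average, over participants, of the sample variance of one
    participant's ratings across the evaluated groups."""
    return mean(variance(ratings) for ratings in participants)


# Hypothetical ratings (0-100) of eight groups by two participants per condition.
joint_display = [
    [100, 90, 70, 60, 55, 50, 95, 92],
    [100, 100, 80, 75, 70, 65, 100, 98],
]
separate_display = [
    [100, 98, 95, 92, 90, 90, 100, 99],
    [100, 100, 100, 100, 100, 100, 100, 100],
]

print("joint:", mean_within_subject_variance(joint_display))
print("separate:", mean_within_subject_variance(separate_display))
```

Under H3, the value for the joint display would be expected to exceed the value for the separate display; a participant who gives every group the same score contributes a variance of zero.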
Deviations from Pre-registered Protocol
We decided to deviate from the pre-registered protocol in our handling of missing data. Our
pre-registered criteria for data exclusion proved inadequate
for meaningfully detecting low-effort and suspicious responses, and better alternatives were available. The
changes, along with their justifications, are listed below:
Participants and Data Gathering
Measurements
Participants constitute a sample of the Polish population, representative in terms of age, gender, and educational attainment. The population structure was sourced from the government’s statistical office and representativeness was
obtained through targeted sampling. Participants were recruited via an online panel (‘Ariadna’). All participants received reward points from the panel and provided informed
consent. The sample composition and recruitment method
reflect the design in Bruneau et al. (2018).
The desired sample was estimated using Bayes factor design analysis with fixed ‘n,’ described by Schönbrodt and
Wagenmakers (2017). We planned to recruit 200 partici-
We used three questionnaires: AoH, Infrahumanisation
and Feeling thermometer. These methods were used to
evaluate eight groups: Poles (in-group), Germans, Russians,
Roma, Arabs, Muslim refugees, Czechs, and Americans.
Additionally, we created a bogus measurement which
was intended to serve as a control condition task in place of
the AoH scale.
Ascent of humans. The measurement of blatant dehumanisation was first introduced in a study by Kteily et al.
(2015). Since then, it has been used in various forms and
under different names. Originally the scale was dubbed the
‘Ascent of man’, although most recent papers refer to it as
1. Instead of using open questions to screen out suspicious responses, we used “ExpertReview”, a quality-check tool provided by Qualtrics. This tool analyses reCAPTCHA scores, time of completion, duplicate
responses and patterns of missing responses to identify low-quality data. We decided that this automatic
tool would serve our goal much better than our arbitrary, qualitative analysis. At the time of pre-registration, we were not aware that this tool was available.
2. We decided to drop the initial idea of “forcing” responses because of the panel’s recommendation
against such measures. Instead, we opted for “requesting response”: if a participant left an item
unanswered, they saw a completion request. The respondent could ignore the message and proceed, consciously leaving some questions unanswered. We decided that in such a case, responses could be
reasonably treated as low-effort and dropped from
the analysis.
3. We decided to drop the exclusion criterion regarding
“(…) participants whose time of completing the questionnaire is extremely above or below typical (under
and above 3 SD)”. After inspecting our results, we
found around 50 unevenly distributed outliers, some
of them very extreme, which clearly indicated breaks
in survey completion. The standard deviation
proved to be so high that it could not form meaningful cut-off points. Furthermore, we found no unrealistically fast answers, and extremely long answers
did not differ in quality as judged by other criteria
(missing answers, ExpertReview). We concluded that,
since breaks do not indicate low-quality answers and
cut-off criteria would be either meaningless (3 SD) or
too arbitrary (an alternative method chosen after data inspection), we should not use time-based criteria for
data exclusion at all. We included completion time in
our database to allow independent evaluation if desired.
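The reasoning in point 3 can be illustrated with a small sketch. The completion times below are invented for illustration and are not the study's data; the helper name `sd_cutoffs` is likewise hypothetical. The sketch shows how a few very long sessions can inflate the standard deviation so that mean ± 3 SD stops working as an exclusion rule.

```python
from statistics import mean, stdev


def sd_cutoffs(times, k=3.0):
    """Return the (lower, upper) cut-offs at mean - k*SD and mean + k*SD."""
    m = mean(times)
    s = stdev(times)
    return m - k * s, m + k * s


# Hypothetical completion times in minutes: twelve typical sessions ...
typical = [10, 12, 11, 14, 9, 13, 15, 12, 11, 10, 16, 12]
# ... plus two sessions left open during a long break (10 hours, 2 days).
with_breaks = typical + [600, 2880]

low, high = sd_cutoffs(with_breaks)
flagged = [t for t in with_breaks if t < low or t > high]

print(f"lower cut-off: {low:.1f} min, upper cut-off: {high:.1f} min")
print("flagged:", flagged)
```

With these invented numbers the lower cut-off is negative, so no response could ever be flagged as too fast, and the upper cut-off is so large that a ten-hour session still counts as typical.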
pants for each of the seven conditions. The hypotheses tests
were assumed to be conclusive when BF ≥ 6. This value was
chosen as it is commonly interpreted as moderate support
for a hypothesis (van Doorn et al., 2019), which we find to
be conclusive enough to achieve the scientific goals of the
study.
To compute the probability of obtaining compelling evidence given BF = 6, n = 200, and ES = 0.4, we performed
a Monte Carlo simulation using the R-package ‘BFDA’
(Schönbrodt & Stefan, 2018). The simulation was repeated
10,000 times, with the default Cauchy prior (zero-centred,
r = 0.707). We chose an effect size of 0.4 because the mean
effect size of the difference between the ingroup and outgroups in the Bruneau et al. (2018) study was ES = 0.61.
We decided that detecting an effect of peripheral properties
more than half the size of the effect of the focal
test would be a significant finding from a theoretical and
practical perspective.
Under H1, the probability of a false negative result was
< 0.01%, while that of inconclusive results was 8.7%. Under
H0, the probability of a false-positive result was 0.5%, while
that of inconclusive results was 31.8%. Note that the actual
n was higher for testing hypotheses 1–3, as we used two or
three conditions per side, resulting in 400/400 and 600/600
comparisons (see Figures 2 and 3).
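The logic of a fixed-n Bayes factor design analysis can be sketched in a few lines of Python. This is not the BFDA package and will not reproduce the percentages reported above. It assumes a two-sided independent-samples t-test, normally distributed data with unit variance, and the JZS Bayes factor in the integral form usually attributed to Rouder and colleagues (2009), evaluated here with a simple numerical sum. All function names are hypothetical.

```python
import math
import random
from statistics import mean, variance


def jzs_bf10(t, n1, n2, r=0.707):
    """Two-sided JZS Bayes factor (H1 over H0) for an independent-samples
    t statistic, integrated numerically over x = log(g)."""
    n_eff = n1 * n2 / (n1 + n2)
    nu = n1 + n2 - 2
    log_null = -(nu + 1) / 2 * math.log1p(t * t / nu)
    step = 0.02
    total = 0.0
    for i in range(1901):  # x runs from -8 to 30
        x = -8.0 + i * step
        g = math.exp(x)
        a = 1.0 + n_eff * g * r * r
        log_f = (-0.5 * math.log(a)
                 - (nu + 1) / 2 * math.log1p(t * t / (a * nu))
                 - 0.5 * math.log(2.0 * math.pi)
                 - 1.5 * x - 1.0 / (2.0 * g)
                 + x)  # + x is the Jacobian of g = exp(x)
        total += math.exp(log_f - log_null) * step
    return total


def simulate_t(n, effect_size, rng):
    """Draw two groups of size n and return the pooled-variance t statistic."""
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treated = [rng.gauss(effect_size, 1.0) for _ in range(n)]
    pooled_var = (variance(control) + variance(treated)) / 2.0
    return (mean(treated) - mean(control)) / math.sqrt(pooled_var * 2.0 / n)


def design_analysis(n=200, effect_size=0.4, threshold=6.0, reps=200, seed=1):
    """Proportion of simulated studies that support H1, support H0,
    or stay inconclusive at the given Bayes factor threshold."""
    rng = random.Random(seed)
    counts = {"support_h1": 0, "support_h0": 0, "inconclusive": 0}
    for _ in range(reps):
        bf = jzs_bf10(simulate_t(n, effect_size, rng), n, n)
        if bf >= threshold:
            counts["support_h1"] += 1
        elif bf <= 1.0 / threshold:
            counts["support_h0"] += 1
        else:
            counts["inconclusive"] += 1
    return {key: value / reps for key, value in counts.items()}


if __name__ == "__main__":
    print("true effect 0.4:", design_analysis(effect_size=0.4))
    print("true effect 0.0:", design_analysis(effect_size=0.0))
```

Running the sketch with `effect_size=0.4` corresponds to simulating under H1, and `effect_size=0.0` to simulating under H0. The default of 200 repetitions keeps the run short; the analysis reported above used 10,000.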
Our final sample was larger than we planned because of
the additional volume added from the research panel. We
decided to include additional participants to maximise the
utility of the used resources.
We excluded 49 participants with missing answers to
non-demographic questions. Additionally, we excluded one
participant with a suspicious ID that did not match the pattern of the panel’s IDs. The Qualtrics ExpertReview quality detection system flagged eight records as possibly coming from bots,
but these records also contained missing answers, so no
respondents were excluded solely on this criterion.
The final sample consisted of 1751 participants (927 females, 810 males, 14 missing answers; M_age = 42.65, SD_age
= 14.13, ranging from 17 to 85, 14 missing answers). The
participants’ levels of education were: primary – 11.5%, vocational – 19.9%, secondary – 33.5%, higher – 34.3%, 14
missing answers. The participants’ places of residence were:
village – 39.7%, small city (up to 20k residents) – 9.3%,
medium city (20k-99k residents) – 18.3%, large city (100k or
more) – 31.9%, 14 missing answers.
Figure 1. Illustration above the slider scale in
“Ascent of Humans” measurement.
in-group. The particular version of the method used in this
study follows the prejudice measurement from Bruneau,
Kteily, and Laustsen (2018).
Infrahumanisation. Infrahumanisation was measured
by the list of emotions originally developed by Demoulin et
al. (2004) and adapted and normalised by Bilewicz, Mikołajczak, Kumagai, and Castano (2010). Based on ratings given
by the respondents in the adaptation study, the research
team, assisted by expert judges, chose 20 emotions, with
5 for each category: high humanity/low desirability, high
humanity/high desirability, low humanity/low desirability,
and low humanity/high desirability. The list was chosen
with consideration to humanity/desirability scores, but also
so that it does not contain redundant or obscure words.
Respondents rated the extent to which they believed the
members of the group ‘X’ are, in general, likely to feel the
given emotion, on a seven-point scale. The full list of emotions and the list chosen for this study are available at
https://osf.io/c5k8q/.
Bogus scale. To draw conclusions about the influence of evaluating
groups via the AoH scale, the participants in control conditions needed to be engaged in a task similar to AoH, but free
of in-group/out-group and low/high humanity associations.
In the control condition, participants were asked to evaluate eight different brands of mobile phones (Samsung, Apple, Huawei, LG, Alcatel, HTC, Sony Ericsson, Motorola) in
terms of how innovative and modern they seemed. The instructions read:
‘Some people think that brands of a mobile phone vary
in how innovative and modern they seem. According to this
view, some brands seem highly innovative, whereas others
seem to be derivative and archaic. Using the sliders below,
indicate how innovative you consider the brand to be’.
Participants then saw an image of five mobile phones
presented from the oldest to the most contemporary smartphone, and they were asked to evaluate the eight mobile
phone brands (see Supplementary Materials or OSF repository: https://osf.io/c5k8q/).
Research Design
We randomly assigned participants to one of the eight
experimental conditions.
In six (3×2) conditions, participants first completed the
AoH scale with one of three dot positions (left, middle,
right) combined with one of two display patterns (joint,
separated). Subsequently, participants completed the ‘feeling thermometer’ and ‘infrahumanisation’ measurements
AoH, following recommendations from Kteily and Bruneau
(2017) to make the name more inclusive.
Using Google Scholar, EBSCO, ResearchGate, and
Mendeley search engines, we identified 16 works published
between 2015 and 2019 using a version of the AoH scale.
Of these, 12 studies were peer-reviewed papers, one was a
doctoral dissertation, one was a working paper, and one was
a research report from an academic research centre, while
one was a conference paper announced as scheduled for
publishing in a peer-reviewed journal (a list of the considered works is to be found in the Supplementary Materials
and OSF repository: https://osf.io/c5k8q/).
After reviewing the sources, we concluded that the studies varied in the details of the measurement. For instance,
some used reference points underneath the slider scale,
while others did not. Differences were also found in the instructions presented. Most often, none of the measurement
properties were directly described in full detail. They had
to be deduced from presented pictures, examined from uploaded research materials, or confirmed via contact with the
authors. To reach conclusions about what the most ‘standard’ method would look like, we combined our insights
from the source review with information from direct contact
and the obtained study materials.
We concluded that although there is no precise, full consensus regarding the design of the Ascent of Humans scale,
the most common features have been: lack of a reference
point underneath the slider scale, initial position of the dot
at the extreme left, multiple groups per screen display, randomised group display order, and instructions which read:
‘Some people think that people can vary in how humanlike they seem. According to this view, some people seem
highly evolved, whereas others seem no different than lower
animals. Using the sliders below, indicate how evolved you
consider the group of people to be.’
What remains unchanged throughout all investigated
studies is the picture used. To the best of our knowledge, it
has always been the same black-and-white graphic, depicting five silhouettes ranging from a quadrupedal monkey to
an anatomically contemporary human (see Figure 1).
The dehumanisation score for each group was obtained
by subtracting the rating of an out-group from the rating of
the in-group.
Based on these facts, we established the AoH scale with
all of the properties described above as our reference point
for experimental manipulations.
In our analysis, we used two types of AoH scores. The relative AoH score (AoHrel) was computed by subtracting the
score of the outgroup from that of the in-group. A higher
AoHrel value indicates stronger dehumanisation. The absolute score (AoHabs) is the degree of humanity attributed
to the group, and it can assume values from 0 to 100 (full
humanity).
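The two scores defined above can be illustrated with a minimal sketch. The function name and the respondent's ratings are hypothetical and serve only to show the arithmetic; they are not taken from the study's data or analysis scripts.

```python
def aoh_relative(in_group_rating: float, out_group_rating: float) -> float:
    """Relative AoH score (AoHrel): in-group rating minus out-group rating.

    Both inputs are absolute scores (AoHabs) on the 0-100 slider.
    Positive values mean the out-group was rated as less 'evolved'.
    """
    for rating in (in_group_rating, out_group_rating):
        if not 0 <= rating <= 100:
            raise ValueError("AoH slider ratings must lie between 0 and 100")
    return in_group_rating - out_group_rating


# Hypothetical respondent (illustrative numbers, not from the data set).
ratings = {"Poles": 95, "Germans": 100, "Roma": 70}
in_group = "Poles"

relative_scores = {
    group: aoh_relative(ratings[in_group], rating)
    for group, rating in ratings.items()
    if group != in_group
}
print(relative_scores)
```

A negative AoHrel, as for Germans in this made-up example, means the out-group was placed higher on the slider than the in-group.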
Prejudice. Prejudice was assessed using a feeling thermometer, a commonly used method in which participants
are asked ‘How warm (favourable) or cold (unfavourable) do
you feel towards the following groups?’ Answers are given
on a 5-point scale (with two presented anchors: 1 = very unfavourable, 5 = very favourable; Haddock et al., 1993).
Relative prejudice toward each group was computed by
subtracting the score of an out-group from the score of the
in a randomised order.
In the seventh condition, participants first completed a
bogus scale measurement followed by the ‘feeling thermometer’ and ‘infrahumanisation’ scale in a randomised
order.
In the eighth condition, participants first completed the
‘feeling thermometer’ scale followed by ‘infrahumanisation’ scale.
The order of groups was randomized across all conditions and scales.
The numbers of participants in each group were: AoH
joint display/left dot (n = 239), AoH joint display/middle dot
(n = 221), AoH joint display/right dot (n = 222), AoH separate display/left dot (n = 223), AoH separate display/middle
dot (n = 217), AoH separate display/right dot (n = 225), ‘bogus scale’ (n = 225), ‘feeling thermometer’ (n = 229).
The research plan for each group is summarized in Figure
2.
Data Analysis
Data analysis was conducted using the Bayesian approach. Due to the absence of previous related studies, we
used default priors with a zero-centred Cauchy distribution,
r = .707. As previously mentioned, a Bayesian factor of six in
favour of either null or alternative hypotheses was considered conclusive. See Figure 3 for a detailed list of statistical
procedures and key variables in all hypotheses.
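To make the default prior more concrete, the following sketch evaluates a zero-centred Cauchy distribution with scale r = .707 using its closed-form density and distribution function. The function names are illustrative assumptions. The directional hypotheses presumably restrict this prior to one side of zero, and that truncation is not shown here.

```python
import math

PRIOR_SCALE = 0.707  # scale parameter r reported in the text


def cauchy_pdf(delta: float, scale: float = PRIOR_SCALE) -> float:
    """Density of a zero-centred Cauchy prior on the effect size delta."""
    return scale / (math.pi * (scale ** 2 + delta ** 2))


def cauchy_cdf(delta: float, scale: float = PRIOR_SCALE) -> float:
    """Cumulative distribution function of the same prior."""
    return 0.5 + math.atan(delta / scale) / math.pi


def prior_mass_within(limit: float, scale: float = PRIOR_SCALE) -> float:
    """Prior probability that |delta| is smaller than `limit`."""
    return cauchy_cdf(limit, scale) - cauchy_cdf(-limit, scale)


# By construction, half of the prior mass lies within one scale unit of zero.
print(prior_mass_within(PRIOR_SCALE))
```

With this prior, half of the prior probability is placed on effect sizes smaller than .707 in absolute value, and the heavy tails leave room for larger effects.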
No outliers were identified in terms of the time of response or any otherwise suspicious answers. 49 respondents were removed due to missing answers in dependent
variable measures, and one respondent was removed from the
database due to atypical respondent ID and unusual order
of question display, which suggested an error in Qualtrics
engine or online panel software.
Results
Here, we present the analyses of the pre-registered hypotheses along with non-pre-registered exploratory analyses. All analyses of pre-registered hypotheses are supplemented with a Bayesian factor robustness check – a method
that allows testing the sensitivity of the Bayesian factor
to different widths of priors distributions. Plots for these
checks can be found in the OSF repository (https://osf.io/
c5k8q/).
Pre-registered Analyses
All the pre-registered analyses were Bayesian Mann-Whitney U tests for independent samples (van Doorn et al.,
2019). In accordance with the pre-registered plan, we decided to use ‘U’ tests due to discrepancies between distributions of all dependent variables and the normal distribution. Specifically, all distributions were extremely
left-skewed, with the mode being equal to the maximum
score of the scale (100). In Figure 4, we present the combined distribution of AoHabs scores for all four tested groups
(Arabs, Muslim refugees, Roma, Russians).
The distributions of AoHabs scores for each group followed roughly the same shape.
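As a point of reference for the rank-based tests used here, the sketch below computes the classical Mann-Whitney U statistic by counting pairs. It is not the Bayesian procedure itself, which relies on a data augmentation algorithm and is not reproduced. The function name and the two small samples are hypothetical.

```python
def mann_whitney_u(sample_a, sample_b):
    """Classical Mann-Whitney U statistic for sample_a.

    Counts the pairs (a, b) in which a exceeds b; ties count as one half.
    Only the ordering of scores matters, so the statistic does not assume
    normality and tolerates heavy skew.
    """
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical slider scores with many values at the scale maximum (100).
group_one = [100, 100, 90, 50]
group_two = [100, 80, 75, 20]

u_one = mann_whitney_u(group_one, group_two)
u_two = mann_whitney_u(group_two, group_one)
print(u_one, u_two)
```

The two statistics always sum to the number of pairs (here 4 × 4), which offers a quick sanity check on any implementation.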
To formally confirm or reject hypotheses, we used the
pre-registered criteria of BF > 6. The prior probability is a
zero-centred Cauchy distribution with a scale parameter of
.707 in all cases.
Sliders’ scale dot position and the AoH score. We hypothesised (H1) that the AoHabs score for the left dot position (n = 452) would be substantially lower than that for the
middle (n = 419). The null hypothesis was δ = 0, and the alternative was directional: δ < 0. We obtained the following
results:
• Inconclusive results for Roma: BF01= 2.44, posterior
effect size distribution was centred around Glass’s δ =
-.11, 95% CI [-.24, -.01]
Figure 2. Diagram of experimental conditions and procedure sequence.
Figure 3. Summary of hypotheses with corresponding groups, variables and planned analyses.
• Inconclusive results for Russians: BF01 = 4.04, posterior effect size distribution was centred around
Glass’s δ = -.09, 95% CI [-.22, -.01]
• Data in favour of the H0 for Arabs: BF01 = 7.15, posterior effect size distribution was centred around
Glass’s δ = -.07, 95% CI [-.20, -.01]
• Data in favour of the H0 for Muslim refugees: BF01
= 10.17, posterior effect size distribution was centred
around Glass’s δ = -.06, 95% CI [-.17, - <.01]
Analogously, we expected (H2) that the AoHabs score for
the middle (n = 419) dot would be substantially lower than
the score for the right dot (n = 421). The null hypothesis was
δ = 0, and the alternative hypothesis was directional: δ < 0.
We obtained the following results:
• Data in favour of the H0 for Roma: BF01 = 19.89, posterior effect size distribution was centred around
Glass’s δ = -.03, 95% CI [-.13, - <.01]
• Data in favour of the H0 for Russians: BF01 = 23.61,
posterior effect size distribution was centred around
Glass’s δ = -.03, 95% CI [-.12, - <.01]
• Data in favour of the H0 for Arabs: BF01 = 18.11, posterior effect size distribution was centred around
Glass’s δ = -.04, 95% CI [-.14, - <.01]
• Data in favour of the H0 for Muslim refugees: BF01
= 15.79, posterior effect size distribution was centred
around Glass’s δ = -.04, 95% CI [-.14, - <.01]
Given our criteria, both hypotheses regarding the influence of dot position were either disconfirmed or inconclusive.
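The decision rule applied to these results can be written out as a short sketch. The function name and verdict labels are illustrative assumptions; the threshold of 6 and the BF01 values for H1 are the ones reported above.

```python
THRESHOLD = 6.0  # pre-registered criterion: BF > 6 counts as conclusive


def classify_bf01(bf01: float, threshold: float = THRESHOLD) -> str:
    """Classify a Bayes factor expressed in favour of the null (BF01)."""
    if bf01 <= 0:
        raise ValueError("A Bayes factor must be positive")
    if bf01 > threshold:
        return "supports H0"
    if 1.0 / bf01 > threshold:  # BF10 is the reciprocal of BF01
        return "supports H1"
    return "inconclusive"


# BF01 values reported for H1 (left versus middle dot position).
h1_bf01 = {
    "Roma": 2.44,
    "Russians": 4.04,
    "Arabs": 7.15,
    "Muslim refugees": 10.17,
}

verdicts = {group: classify_bf01(bf) for group, bf in h1_bf01.items()}
for group, verdict in verdicts.items():
    print(f"{group}: {verdict}")
```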
Figure 4. Distribution of absolute AoH score for all
groups combined.
Group display pattern and the within-subject variance of AoHabs score. We verified the hypothesis that
when groups are displayed on a single screen, one below
the other, the AoHabs scores will be more varied than when
each group is displayed on a separate screen (H3).
To test this, we computed the within-subject variance for
all the groups’ scores and then tested the difference in variances between the joint display (n = 651) and separate-display groups (n = 649). The null hypothesis was δ = 0, and the
alternative was directional: δ < 0.
The data was strongly in favour of the null hypothesis:
BF01 = 51.84, posterior effect size distribution was centred
around Glass’s δ = .02, 95% CI [.00, .05].
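The within-subject variance used for H3 can be sketched as follows. This is a minimal illustration that assumes the sample variance (n - 1 denominator); the text does not state which variance estimator was used, and the two participants are hypothetical.

```python
from statistics import variance


def within_subject_variance(scores):
    """Sample variance of one participant's AoHabs scores across groups.

    Assumption: the n - 1 denominator is used; the source does not say
    whether the sample or the population variance was computed.
    """
    if len(scores) < 2:
        raise ValueError("At least two group ratings are needed")
    return variance(scores)


# Hypothetical participants, each rating four groups on the 0-100 slider.
participants = {
    "p1": [100.0, 100.0, 100.0, 100.0],  # rates every group as fully human
    "p2": [100.0, 80.0, 60.0, 40.0],     # differentiates between groups
}

variances = {
    pid: within_subject_variance(scores)
    for pid, scores in participants.items()
}
print(variances)
```

Each participant contributes one variance value, and these values are then compared between the joint-display and separate-display conditions.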
Impact of participating in the AoH measurement on
attitudes toward out-groups. With respect to the second
problem, we verified three hypotheses:
Table 1. Bayesian Mann-Whitney U Test for comparison of AoH and bogus group on infrahumanisation (H4).
Group             BF₊₀    BF₀₊    W           Rhat    Posterior median effect size (δ)    Lower 95% CI    Upper 95% CI
Arabs             0.13    7.47    25571.50    1.00    0.08                                < 0.01          0.24
Roma              0.33    3.04    27217.50    1.00    0.12                                0.01            0.30
Russians          0.15    6.74    26018.00    1.00    0.08                                < 0.01          0.24
Muslim refugees   0.23    4.40    27103.00    1.00    0.10                                0.01            0.28
Note. For all tests, the alternative hypothesis specifies that group AoH is greater than group bogus.
Note. Result based on data augmentation algorithm with 5 chains of 1000 iterations.
Table 2. Bayesian Mann-Whitney U Test for comparison of AoH and bogus group on feeling thermometer (H5).
Group             BF₊₀    BF₀₊     W           Rhat    Posterior median effect size (δ)    Lower 95% CI    Upper 95% CI
Arabs             0.06    17.80    24337.50    1.00    0.04                                < 0.01          0.16
Roma              0.14    7.14     26240.00    1.00    0.08                                < 0.01          0.24
Russians          0.09    11.05    24956.00    1.00    0.06                                < 0.01          0.20
Muslim refugees   0.08    11.71    25161.50    1.00    0.05                                < 0.01          0.19
Note. For all tests, the alternative hypothesis specifies that group AoH is greater than group bogus.
Note. Result based on data augmentation algorithm with 5 chains of 1000 iterations.
• H4: Participating in AoH measurement will result in
higher infrahumanisation scores toward out-groups
when compared with participating in the bogus scale
measurement.
• H5: Participating in AoH measurement will result in
higher feeling thermometer scores toward out-groups
when compared with participating in the bogus scale
measurement (note that a higher feeling thermometer score indicates more prejudice toward out-group).
• H6: Participating in AoH measurement will result in
higher infrahumanisation scores toward out-groups
when compared with participating in feeling thermometer measurements.
We tested a group of participants previously engaged in
the standard AoH measurement (left dot, joint display) versus the group who completed a bogus scale (see p. 21 and
Figure 2) or feeling thermometer scale.
For all three hypotheses, the null hypothesis was δ = 0,
and the alternative was δ > 0.
Infrahumanisation scores for all four out-groups proved
to be marginally influenced or independent of prior engagement in the AoH measurement. The Bayesian factor in
favour of the null hypothesis ranged from BF01 = 3.04 for
Roma to BF01 = 7.47 for Arabs. This indicates that evidence from the data ranged from inconclusiveness to moderate support for the null hypothesis (Table 1).
Feeling thermometer scores were also unaffected by
prior engagement in the AoH versus the bogus scale. The
Bayesian Factor in favour of the null hypothesis ranged
from BF01 = 7.14 for Roma to BF01 = 17.80 for Arabs,
which provided moderate to strong support for the null hypothesis (Lee & Wagenmakers, 2013; Table 2).
The last pre-registered hypothesis stated that participating in AoH measurement will have a stronger influence
on out-group derogation than participating in a somewhat
similar slider-based measurement: the feeling thermometer. The null hypothesis was δ = 0 and the alternative was δ
> 0.
In all four tested out-groups, the Bayesian Factor
favoured the null hypothesis, but it reached the
conclusiveness threshold in only two of them (BF01 = 8.79 for Muslim
refugees and BF01 = 13.01 for Roma). The Bayesian factors
for Russians and Arabs were inconclusive.
In summary, evidence suggests that we should shift our
beliefs towards the notion that participants previously engaged in AoH measurement are just as likely to infrahumanise as those who responded to the feeling thermometer
scale (Table 3).
Notably, owing to the sample plan analysis (see section
Participants and Data Gathering), we know that inconclusiveness is substantially more probable under the true null
hypothesis than the alternative. Another plausible interpretation for the inconclusive results is that some effects may
exist, but their sizes are below the minimum effect of interest.
Exploratory Analyses
In addition to the pre-registered analysis, we decided to
explore the database in search of additional valuable insights and inspiration for future research. We decided to explore three areas: (1) relationships between AoH, prejudice,
and infrahumanisation, (2) the prevalence of blatant dehumanisation of various out-groups, and (3) the distribution
of AoH scores.
Table 3. Bayesian Mann-Whitney U Test for comparison of AoH and ‘thermo’ group on infrahumanisation.
Group             BF₊₀    BF₀₊     W           Rhat    Posterior median effect size (δ)    Lower 95% CI    Upper 95% CI
Arabs             0.99    1.01     28428.00    1.01    0.17                                0.02            0.35
Roma              0.08    13.01    25712.50    1.00    0.05                                0.00            0.19
Russians          0.19    5.18     27060.00    1.00    0.09                                0.01            0.26
Muslim refugees   0.11    8.97     26468.00    1.00    0.06                                0.00            0.22
Note. For all tests, the alternative hypothesis specifies that group AoH is greater than group bogus.
Note. Result based on data augmentation algorithm with 5 chains of 1000 iterations.
Table 4. Mean relative AoH scores (mean AoHrel) in the current study versus the study by Bruneau et al. (2018).
Study                                             Germans    Muslim refugees    Roma     Russians
Current study (Poland)                            -2.07      18.57              13.41    8.69
Bruneau et al., 2018, Study 1 (Czech Republic)    .5         37.5               38.7     11.8
Bruneau et al., 2018, Study 2 (Hungary)           0.0        26.0               27.6     --
Relationship between Blatant Dehumanisation, Infrahumanisation and Prejudice. Measures of blatant dehumanisation, infrahumanisation, and prejudice proved to
be interrelated. Due to the highly skewed distribution of all
variables, we used non-parametric Kendall’s tau-b coefficients with the default prior distribution (zero-centred, beta
= 1). The strongest relationship was between blatant dehumanisation (AoHrel) and prejudice (feeling thermometer).
The correlation for all out-groups combined was rτ(9008)
= .36, 95% CI [.35, .37], BF10 > 1000. The correlation between AoHrel and infrahumanisation was also significant,
but much smaller, rτ(9008) = .06, 95% CI [.05, .07], BF10 >
1000.
These results replicate the pattern identified in previous
studies, in which AoH scores proved to be highly correlated
with measurements of explicit prejudice and mildly correlated with other measurements of dehumanisation (Kteily
et al., 2015; Kteily & Bruneau, 2017). Moreover, the infrahumanisation score was correlated with the feeling thermometer scale: rτ(9008) = .11, 95% CI [.10, .12], BF10 > 1000.
Interestingly, the more negatively the out-group was
perceived, the stronger the association between blatant dehumanisation and prejudice. For the most disfavoured
groups, Muslim refugees, Arabs, and Roma, the correlations
were rτ(1283) = .40, 95% CI [.37, .44], BF10 > 1000; rτ(1287)
= .34, 95% CI [.31, .38], BF10 > 1000; and rτ(1285) = .33, 95%
CI [.30, .37], BF10 > 1000, respectively. For the most favourably
viewed group, Americans, this effect was about half the size:
rτ(1291) = .18, 95% CI [.14, .21], BF10 > 1000.
Prevalence of blatant dehumanisation of various
out-groups and distribution of scores. Our choice of outgroups, population, and measurement methods was based
on the study by Bruneau et al. (2018). Thus, we compare
our results with those of this work. We present two types
of AoH scores: relative and absolute. The relative AoH score
(AoHrel) was computed by subtracting the score of the outgroup from that of the in-group. A higher AoHrel value indicates stronger dehumanisation. The absolute score
(AoHabs) is the degree of humanity attributed to the group,
and it can assume values from 0 to 100 (full humanity).
In accordance with our expectations, the four groups that
we assumed to be negatively perceived stood out from other
groups in AoHrel scores. Similar to the results obtained by
Bruneau et al. (2018) on Central European samples (Hungary and the Czech Republic), Muslim refugees (M = 18.6,
SD = 28.86), and Roma (M = 13.46, SD = 25.16) proved to
be most blatantly dehumanised. However, the degree of dehumanisation was smaller than that in the original study
(Table 4).
Regarding groups which we assumed to be positively perceived (Czechs, Germans, and Americans), we found no substantial evidence for widespread dehumanisation. Moreover, Germans and Czechs were estimated to be even
slightly more human than the in-group (AoHrel for Germans: M = -2.07, SD = 20.01; Czechs: M = -.15, SD = 19.24).
We examined the average scores, but a quick glimpse at
the distribution plots led us to the conclusion that the mean, or
any other single measure of central tendency, neglects important
information.
Figure 5 shows the distribution of AoHabs scores. The
panels are sorted in descending order of the mean AoHabs.
The top-left panel displays the distribution for the most humanised group (Germans) and the bottom-right, the least
humanised (Muslim refugees). Most noteworthy, we observed extreme inflation of the ‘100’ and adjacent scores for
each group. Even for the most dehumanised group (Muslim refugees), 29.84% of all scores equalled 100. For the ingroup (Poles), 48.74% of scores equalled 100, and for the
most humanised group (Germans), 51.44%.
Beside the highly inflated peak at ‘100’, the distribution
was close to uniform, with some small peaks at values: ‘0’,
‘25’, ‘50’ and ‘75’.
In summary, we can identify three distinctive features of
the AoHabs distribution:
1. The scores are always strongly concentrated on the
highest possible value
2. Lower values are distributed along minimally left-skewed, almost horizontal lines
3. There are small peaks at the four evenly spaced areas
We suppose that these peaks are caused by silhouettes
above those areas (see Figure 4). These pictures may serve
as distinct, visible cues. After all, the anchoring mechanism
may have been in play, but the anchors turned out to be pictures rather than slider-dots.
Figure 6 shows the violin plots of the distribution of the
AoHrel scores sorted by the increasing mean AoHrel score.
The plots do not resemble ‘violins’, because they represent a peculiar distribution. What is striking is the completely different shape of the distribution for positively perceived (Germans, Czechs, and Americans) and negatively
perceived out-groups (Russians, Arabs, Roma, and Muslim
refugees). For the first three out-groups, we can see a massive concentration of the results around ‘0’. These ‘disks’ in
the middle represent a large portion of scores showing virtually no relative dehumanisation.
When it comes to four negatively perceived out-groups, we
can see that AoHrel = 0 is only mildly dominant and scores
slightly below and above zero are quite common as well.
Furthermore, one can notice that even in the case of the
highest mean AoHrel score (represented by the dots), the
cluster of central-tendency scores remains in the same place
(around 0). It is the shape of this cluster and the small
amount of the above-central tendency scores that make the
difference in the mean score.
What theoretical insights can be obtained from this visual analysis?
The first and most important information is that a low
average AoH score for an out-group does not indicate a general consensus about their lower degree of “humanity” - it
indicates less universal agreement that they are fully human. While full humanity was always the most common
score, the difference between the more and less dehumanised out-groups was due to the proportion of in-group
members who do not express this dominant view.
The second insight is that the complete lack of discrimination of the outgroups is not uncommon. Even in the case
of most unfavourably viewed groups, there is still a significant proportion of people who do not dehumanise them.
Furthermore, the in-group is also subjected to absolute dehumanisation (more than 50% of the respondents viewed
their in-group as less than fully human).
Discussion
This study aimed to address the methodological and ethical issues associated with the AoH measurement through
a transparent, pre-registered experimental procedure. The
results of these tests were overwhelmingly disproving when
it came to our concerns.
First, we hypothesised that the raw score of the AoH
measurement can be substantially influenced by the sliderscale dot position or by the pattern of the group display.
Had our hypothesis been confirmed, we would conclude that
the AoH score may create a specific impression rather than
capture pre-existing beliefs. Consequently, we interpret the
falsification of our hypotheses as a reason to shift our beliefs toward the notion that the results of AoH measurement stem from sources other than the peripheral properties of the measurement. Overall, these results should be
interpreted as evidence against the notion that AoH scores
are just artefacts of a particular measurement method.
Figure 5. Distribution of absolute AoH scores for all groups.
Second, and perhaps more importantly, we found a
strong, conclusive disproval of our ethical and methodological concerns regarding the influence of participating in
AoH measurement. We hypothesised that participating in
AoH measurement can strengthen prejudice, resulting in a
more negative perception of the out-group in the following
measurements. If the hypothesis was confirmed, it would
pose serious ethical concerns and cast doubt on the pre-existing body of theoretical validity evidence.
After filling out the AoH questionnaire, respondents did
not express a more negative and dehumanising view of the
out-groups. This discovery weakens our main ethical concern: by giving such questionnaires to the public, we might
induce prejudice. Furthermore, this study increases
confidence that AoH scores are a good predictor of
multiple negative attitudes toward out-groups. Our results indicate
that correlations between AoH scores and other prejudice-related measurements do not stem from an uncontrolled
causal effect, but rather from underlying relationships.
In addition to our main pre-registered hypothesis, we
share novel insights into many characteristics of the measurement. Above all, we were able to systematically evaluate the prevalence of blatant dehumanisation in a given
population.
We conclude that despite dehumanisation being visible
on the mean scores for out-groups, a substantial fraction
of the respondents did not dehumanise out-groups at all.
After inspecting the distribution of results, it may be observed that scores indicating full humanity were massively
inflated. Such a point-inflated distribution indicates the
dual mechanism of responses – one mechanism accounts for
the difference between the inflated score and the rest of the
distribution and the second mechanism underlies the variability within the rest of the distribution. For instance, investigating cigarette smoking habits by asking ‘how many
cigarettes do you smoke weekly?’ would obtain a technically
continuous variable, however, analysing it just as such
would be incomplete. The difference between ‘0’ and ‘1’ is
the difference between a non-smoker and a regular smoker,
and a massive inflation of ‘0’ scores in the population may
be observed.
The best approach would be to treat the difference between ‘0’ and ‘1’, and the variance in the rest of the scale
as two separate phenomena. This will allow us to include
qualitative differences between dehumanising and non-dehumanising individuals (analogous to ‘smokers’ and ‘non-smokers’), which will not only reflect AoH scores more accurately but also provide a better insight into the
relationships with other variables. There are statistical
techniques that allow the modelling of such variables in a
dual way (e.g. hurdle models or zero-inflated Poisson; see
Green, 2021).
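A descriptive first step in this direction is to summarise the two parts separately. The sketch below is only an illustration under stated assumptions: it is not a fitted hurdle or zero-inflated model, the function name and scores are hypothetical, and the inflated point is taken to be the scale maximum (100) rather than zero.

```python
from statistics import mean

FULL_HUMANITY = 100  # inflated point of the AoHabs distribution


def two_part_summary(scores, ceiling=FULL_HUMANITY):
    """Summarise a point-inflated variable in two parts.

    Part 1: share of respondents at the ceiling (no dehumanisation).
    Part 2: mean of the remaining scores (degree of dehumanisation
            among respondents who did not choose the ceiling).
    """
    if not scores:
        raise ValueError("No scores supplied")
    below = [s for s in scores if s < ceiling]
    return {
        "share_at_ceiling": (len(scores) - len(below)) / len(scores),
        "mean_below_ceiling": mean(below) if below else None,
    }


# Hypothetical ratings of one out-group (illustrative, not study data).
scores = [100, 100, 100, 100, 75, 50, 25, 0]
summary = two_part_summary(scores)
print(summary)
```

Reporting both numbers keeps the qualitative distinction (dehumanising versus not) apart from the quantitative one (how strongly), which a single mean would blur.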
Apart from methodological aspects, the distribution of
the scores provides valuable theoretical information. The
percentage of respondents displaying no out-group derogation was substantially higher with AoH measurement than
with other measurements from this domain. This implies
that this prejudicial view is comparatively rare. Perhaps the
central claim behind the development of the AoH scale –
that blatant dehumanisation is still prevalent in contemporary society – needs an important complement.
Blatant dehumanisation is present, yes, but is not universal, and not nearly as common as more subtle prejudice.
We believe that this may be the reason why AoH is a better
predictor of out-group aggression or discrimination. Out of
all widely used methods, AoH may be the best at capturing
a firm, consciously held prejudice. In that respect, AoH may
bridge an important gap by examining blatant dehumanisation. Recent research on prejudice is often said to concentrate too much on the subtle, unconscious biases at the expense of overtly hurtful, self-conscious, and active racism,
sexism, etc., which are still an important social issue.
Figure 6. Violin plots of relative AoH score for all tested out-groups. Out-groups are presented in the order of
ascending mean score.
Limitations and Future Directions
Contributions
Contributed to conception and design: KI, TG, DD.
Contributed to acquisition of data: KI.
Contributed to analysis and interpretation of data: KI.
Drafted and/or revised the article: KI, TG, DD.
Approved the submitted version for publication: KI, TG,
DD.
Competing Interests
The authors declare no competing interests regarding the presented work.
Acknowledgements
We would like to thank Michał Bilewicz for providing
valuable insights and suggestions regarding the design of
the study and the measurements we could use.
Funding Information
The study has been funded by SWPS University of Social
Sciences and Humanities, Faculty of Psychology (grant
competition nr 1/2019/2020) from the subvention of the
Ministry of Science and Higher Education, Republic of
Poland.
Data Accessibility Statement
All data, reproducible files for data analyses and experimental materials are publicly accessible via Open Science
Currently, the line of research on dehumanisation has
been questioned (Over, 2021). The main concerns are theoretical: How exactly is dehumanisation defined? To what
extent could it drive inter-group violence? Are the comparisons to animals universally derogative and specifically attributed to out-groups? Over (2021) argues that the proponents of dehumanisation research do not provide enough
evidence to support the notion that dehumanisation was a
driving factor for violence and discrimination or that historically persecuted out-groups were consequently perceived
as less human. Over (2021) suggests that the main driving
force behind inter-group atrocities is an extremely negative
out-group perception, often focused on the arcs which make
sense only when applied to human beings (traitors,
schemers).
Over (2021) argues that comparisons to animals are present only when they serve to enhance and consolidate these
negative connotations. Consequently, when individuals associate certain out-groups with animals, it may not necessarily mean that they think of the members as less human.
This may mean that they hold strong, negative views about
these out-groups and that they often came across messages
that embed these views in some animal metaphors, which
have now become a part of an association-net around this
out-group.
Therefore, does AoH measurement provide evidence that
a substantial portion of individuals think of others as not
fully, biological humans? We believe that this is not necessarily the case.
Our findings refute a critical point whose confirmation
would indicate that AoH scores and correlations with related concepts are largely artefacts. In this sense, we have
provided evidence that AoH scores represent a certain psychological reality. However, the question remains as to what
exactly this method measures.
The first paper by Kteily and colleagues (2015) examined
only convergent and predictive validity, and to the best of
our knowledge, no published, peer-reviewed work since the
method’s introduction has addressed measurement validity
and reliability. Our work has significant limitations when
examining the accuracy of the AoH scale as well. First, we
used only a self-report questionnaire and did not control
for or mitigate the social desirability of the responses. Secondly, other possible problems and important questions
about the scale were not addressed, e.g. could it confuse
perceptions of humanity with perceptions of ‘ape-ness’ or
masculinity? (the pictures only depict human males, and
being human is directly juxtaposed with being an ape).
Another limitation of the conclusions of our study is the
dependent variables used. To maintain comparability, we
chose two methods (feeling thermometer and infrahumanisation) that have been widely used in conjunction with the
AoH.
However, these methods also have their limitations. The
validity of the ‘feelings thermometer’ as a measure of prejudice is not a topic widely discussed in the literature - it is
much more often used to validate other scales than in the
context of testing its own validity.
The infrahumanisation index on the other hand has been
shown to have moderately low test-retest reliability (r = .46,
Kteily et al. 2015, p. 910). This latter point may not be crucial in the context of our results, as we were more interested
in infrahumanisation as a state than a trait, but it may limit
the interpretation of the infrahumanisation score as a measure of entrenched attitudes towards outgroups.
Summing up, the next important topic regarding Ascent
of Humans scale is establishing whether it examines actual
views of non-metaphorical, biological inferiority, or is it a
well-calibrated, one-item measurement of extreme prejudice. In both cases, the method may be a valuable tool,
but we believe that more research is needed to establish
whether results can be interpreted at face value.
One such crucial research could be testing the predictive,
discriminant validity of the blatant dehumanisation construct. If this theoretical construct is substantially different
from negative attitudes, it should be possible to name an
outcome that is different for highly dehumanised outgroups than for extremely negatively perceived ones. Such a
study, especially with pre-registered plans and predictions,
could be an important input to the current discussion regarding dehumanisation.
Ascent of Humans: Investigating Methodological and Ethical Concerns About the Measurement
Framework.
(https://osf.io/c5k8q/)
Submitted: September 29, 2021 PDT, Accepted: March 04, 2022 PDT
Ethics Approval Statement
The study was approved by the SWPS University of Social Sciences and Humanities ethics review board.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License
(CCBY-4.0). View this license’s legal deed at http://creativecommons.org/licenses/by/4.0 and legal code at http://creativecommons.org/licenses/by/4.0/legalcode for more information.
References
Green, J. A. (2021). Too many zeros and/or highly
skewed? A tutorial on modelling health behaviour as
count data with Poisson and negative binomial
regression. Health Psychology and Behavioral Medicine,
9(1), 436–455. https://doi.org/10.1080/21642850.2021.1920416
Haddock, G., Zanna, M. P., & Esses, V. M. (1993).
Assessing the structure of prejudicial attitudes: The
case of attitudes toward homosexuals. Journal of
Personality and Social Psychology, 65(6), 1105–1118. https://doi.org/10.1037/0022-3514.65.6.1105
Haslam, N. (2006). Dehumanization: An Integrative
Review. Personality and Social Psychology Review,
10(3), 252–264. https://doi.org/10.1207/s15327957pspr1003_4
Johnson, E. J., & Goldstein, D. G. (2004). Defaults and
Donation Decisions. Transplantation, 78(12),
1713–1716. https://doi.org/10.1097/01.tp.0000149788.10382.b2
Kteily, N., & Bruneau, E. (2015, September 18).
Americans see Muslims as less than human. No
wonder Ahmed was arrested. Washington Post. https://www.washingtonpost.com/posteverything/wp/2015/09/18/americans-see-muslims-as-less-than-human-no-wonder-ahmed-was-arrested/
Kteily, N., & Bruneau, E. (2017). Darker demons of our
nature: The need to (Re)focus attention on blatant
forms of dehumanization. Current Directions in
Psychological Science, 26(6), 487–494. https://doi.org/10.1177/0963721417708230
Kteily, N., Bruneau, E., Waytz, A., & Cotterill, S. (2015).
The ascent of man: Theoretical and empirical
evidence for blatant dehumanization. Journal of
Personality and Social Psychology, 109(5), 901–931. https://doi.org/10.1037/pspp0000048
Lee, M. D., & Wagenmakers, E.-J. (2013). Bayesian
cognitive modeling: A practical course. Cambridge
University Press. https://doi.org/10.1017/cbo9781139087759
Leyens, J.-P., Cortes, B., Demoulin, S., Dovidio, J. F.,
Fiske, S. T., Gaunt, R., Paladino, M.-P., Rodriguez-Perez, A., Rodriguez-Torres, R., & Vaes, J. (2003).
Emotional prejudice, essentialism, and nationalism:
The 2002 Tajfel Lecture. European Journal of Social
Psychology, 33(6), 703–717. https://doi.org/10.1002/ejsp.170
Leyens, J.-P., Demoulin, S., Vaes, J., Gaunt, R., &
Paladino, M. P. (2007). Infra-humanization: The Wall
of Group Differences. Social Issues and Policy Review,
1(1), 139–172. https://doi.org/10.1111/j.1751-2409.2007.00006.x
Leyens, J.-P., Paladino, P. M., Rodriguez-Torres, R., Vaes,
J., Demoulin, S., Rodriguez-Perez, A., & Gaunt, R.
(2000). The Emotional Side of Prejudice: The
Attribution of Secondary Emotions to Ingroups and
Outgroups. Personality and Social Psychology Review,
4(2), 186–197. https://doi.org/10.1207/s15327957pspr0402_06
Bastian, B., Laham, S. M., Wilson, S., Haslam, N., &
Koval, P. (2011). Blaming, praising, and protecting our
humanity: The implications of everyday
dehumanization for judgments of moral status.
British Journal of Social Psychology, 50(3), 469–483. https://doi.org/10.1348/014466610x521383
Bilewicz, M., Mikołajczak, M., Kumagai, T., & Castano,
E. (2010). Which emotions are uniquely human?
Understanding of emotion words across three
cultures. In B. Bokus (Ed.), Studies in the Psychology of
Language and Communication (pp. 275–285). Matrix.
Bishop, G. F., Oldendick, R. W., Tuchfarber, A. J., &
Bennett, S. E. (1980). Pseudo-Opinions on Public
Affairs. Public Opinion Quarterly, 44(2), 198–209. https://doi.org/10.1086/268584
Bruneau, E., & Kteily, N. (2017). The enemy as animal:
Symmetric dehumanization during asymmetric
warfare. PLOS ONE, 12(7), e0181422. https://doi.org/10.1371/journal.pone.0181422
Bruneau, E., Kteily, N., & Laustsen, L. (2018). The
unique effects of blatant dehumanization on attitudes
and behavior towards Muslim refugees during the
European ‘refugee crisis’ across four countries.
European Journal of Social Psychology, 48(5), 645–662.
https://doi.org/10.1002/ejsp.2357
Castano, E., & Kofta, M. (2009). Dehumanization:
Humanity and its Denial. Group Processes &
Intergroup Relations, 12(6), 695–697. https://doi.org/10.1177/1368430209350265
Crandall, C. S., Miller, J. M., & White, M. H., II. (2018).
Changing Norms Following the 2016 U.S. Presidential
Election: The Trump Effect on Prejudice. Social
Psychological and Personality Science, 9(2), 186–192. https://doi.org/10.1177/1948550617750735
DeCoster, J., & Claypool, H. M. (2004). A Meta-Analysis
of Priming Effects on Impression Formation
Supporting a General Model of Informational Biases.
Personality and Social Psychology Review, 8(1), 2–27. https://doi.org/10.1207/s15327957pspr0801_1
Demoulin, S., Leyens, J., Paladino, M.,
Rodriguez‐Torres, R., Rodriguez‐Perez, A., & Dovidio,
J. (2004). Dimensions of “uniquely” and
“non‐uniquely” human emotions. Cognition &
Emotion, 18(1), 71–96. https://doi.org/10.1080/02699930244000444
Esses, V. M., Veenvliet, S., Hodson, G., & Mihic, L.
(2008). Justice, Morality, and the Dehumanization of
Refugees. Social Justice Research, 21(1), 4–25. https://doi.org/10.1007/s11211-007-0058-4
Furnham, A., & Boo, H. C. (2011). A literature review of
the anchoring effect. The Journal of Socio-Economics,
40(1), 35–42. https://doi.org/10.1016/j.socec.2010.10.008
Giner-Sorolla, R., Burgmer, P., & Demir, N. (2021).
Commentary on Over (2021): Well-Taken Points
About Dehumanization, but Exaggeration of
Challenges. Perspectives on Psychological Science,
16(1), 24–27. https://doi.org/10.1177/1745691620953788
Strack, F., Bahník, Š., & Mussweiler, T. (2016).
Anchoring: Accessibility as a cause of judgmental
assimilation. Current Opinion in Psychology, 12,
67–70. https://doi.org/10.1016/j.copsyc.2016.06.005
Sturgis, P., & Smith, P. (2010). Fictitious Issues
Revisited: Political Interest, Knowledge and the
Generation of Nonattitudes. Political Studies, 58(1),
66–84. https://doi.org/10.1111/j.1467-9248.2008.00773.x
Tileagă, C. (2007). Ideologies of moral exclusion: A
critical discursive reframing of depersonalization,
delegitimization and dehumanization. British Journal
of Social Psychology, 46(4), 717–737. https://doi.org/10.1348/014466607x186894
Tversky, A., & Kahneman, D. (1974). Judgment under
uncertainty: Heuristics and biases. Science, 185(4157),
1124–1131. https://doi.org/10.1126/science.185.4157.1124
van Doorn, J., van den Bergh, D., Bohm, U., Dablander,
F., Derks, K., Draws, T., Etz, A., Evans, N. J., Gronau,
Q. F., Haaf, J. M., Hinne, M., Kucharský, Š., Ly, A.,
Marsman, M., Matzke, D., Raj, A., Sarafoglou, A.,
Stefan, A., Voelkel, J. G., & Wagenmakers, E.-J.
(2019). The JASP Guidelines for Conducting and
Reporting a Bayesian Analysis. Preprint. https://doi.org/10.31234/osf.io/yqxfr
van ’t Veer, A. E., & Giner-Sorolla, R. (2016). Preregistration in social psychology—A discussion and
suggested template. Journal of Experimental Social
Psychology, 67, 2–12. https://doi.org/10.1016/j.jesp.2016.03.004
Zeelenberg, R., Pecher, D., & Raaijmakers, J. G. W.
(2003). Associative repetition priming: A selective
review and theoretical implications. In J. S. Marsolek
& J. Chad (Eds.), Rethinking implicit memory (pp.
261–283). https://dare.uva.nl
Novick, M. R. (1966). The axioms and principal results
of classical test theory. Journal of Mathematical
Psychology, 3(1), 1–18. https://doi.org/10.1016/0022-2496(66)90002-2
Omyła-Rudzka, M. (2019). Stosunek do innych narodów.
Komunikat z badań, [Attitude towards other nations.
Research report.] (No. 17/2019). Centrum Badania
Opinii Społecznej. https://cbos.pl/SPISKOM.POL/2019/K_017_19.PDF
Over, H. (2021). Seven Challenges for the
Dehumanization Hypothesis. Perspectives on
Psychological Science, 16(1), 3–13. https://doi.org/10.1177/1745691620902133
Reitsma-van Rooijen, M., & Daamen, D. D. L. (2006).
Subliminal anchoring: The effects of subliminally
presented numbers on probability estimates. Journal
of Experimental Social Psychology, 42(3), 380–387. https://doi.org/10.1016/j.jesp.2005.05.001
Rosenthal, R. (1963). On the social psychology of the
psychological experiment: The experimenter’s
hypothesis as unintended determinant of
experimental results. American Scientist, 51(2),
268–283. http://www.jstor.org/stable/27838693
Schönbrodt, F. D., & Stefan, A. M. (2018). BFDA: An R
package for Bayes factor design analysis (version 0.3). https://github.com/nicebread/BFDA
Schönbrodt, F. D., & Wagenmakers, E.-J. (2017). Bayes
Factor Design Analysis: Planning for compelling
evidence. Psychonomic Bulletin & Review, 25(1),
128–142. https://doi.org/10.3758/s13423-017-1230-y
Sigelman, L., & Thomas, D. (1984). Opinion Leadership
& the Crystallization of Nonattitudes: Some
Experimental Results. Polity, 16(3), 484–493. https://doi.org/10.2307/3234561
Stefaniak, A., Malinowska, K., & Witkowska, M. (2017).
Kontakt międzygrupowy i dystans społeczny w Polskim
Sondażu Uprzedzeń [Intergroup contact and social
distance in Polish Prejudice Survey], 3, 25. http://cbu.psychologia.pl
Supplementary Materials
S1. The List of Considered Works Using Ascent of Humans
Download: https://collabra.scholasticahq.com/article/33297-ascent-of-humans-investigating-methodological-and-ethical-concerns-about-the-measurement/attachment/84708.docx?auth_token=CxKZL3LLQoufT42MuhaD
S2. The Illustration of the Bogus Scale (Evolution of Mobile Phones)
Download: https://collabra.scholasticahq.com/article/33297-ascent-of-humans-investigating-methodological-and-ethical-concerns-about-the-measurement/attachment/85022.jpg?auth_token=CxKZL3LLQoufT42MuhaD
Peer Review History
Download: https://collabra.scholasticahq.com/article/33297-ascent-of-humans-investigating-methodological-and-ethical-concerns-about-the-measurement/attachment/85023.docx?auth_token=CxKZL3LLQoufT42MuhaD