
Proposal of an Automated Approach to Support the Systematic Review of Literature Process



Jefferson Seide Molléri
UNIVALI - Universidade do Vale do Itajaí
Rua Uruguai, 458, Centro
Itajaí, Brazil
+55 47 30452242
jefferson.molleri@univali.br

Luiz Eduardo da Silva
UNIVALI - Universidade do Vale do Itajaí
Rua Uruguai, 458, Centro
Itajaí, Brazil
+55 47 30452242
linkz.ns@univali.br

Fabiane Barreto Vavassori Benitti
UNIVALI - Universidade do Vale do Itajaí
Rua Uruguai, 458, Centro
Itajaí, Brazil
+55 47 33417544 (#8057)
fabiane.benitti@univali.br

ABSTRACT
Context: A Systematic Literature Review (SLR) is a scientific
method to identify and assess all available studies related to a
specific research topic. Due to its characteristics, an SLR is a
time-consuming, difficult process that requires a properly
documented protocol for scientific acknowledgment.
Objectives: In this context, this paper aims to propose a
business model to support automation of the systematic review
method, contributing to the productivity and quality of the
process.
Method: Based on the results of a previous review, we identified
several contributions to the systematic review process. We
define the process using the Business Process Model and Notation
(BPMN) and relate possible tool contributions to the proposed
model.
Results: The model and contributions proposed in this paper
can be used to guide the development of computational tools to
support the SLR process and the proper execution of its
methodology.
Conclusions: The implementation of tools supporting SLR
processes seems relevant to reduce effort and to ensure the
quality of the application of the methodology.

Keywords
Systematic Literature Review, Computational Tool, Business
Process Model.

1. INTRODUCTION
Systematic Literature Review (SLR) is an empirical
methodology of research which aims to gather and evaluate all
the available evidence about a specific research topic [1]. SLRs
are a key tool for evidence-based research and practice as they
combine the results of multiple studies. These literature reviews
are important, as the volume of studies to be considered by the
researchers is constantly expanding [2].
Despite its importance, conducting an SLR is not an easy task,
as it uses specific concepts usually unknown to researchers
familiar with traditional literature reviews. Even when
conducted according to good practice rules, SLRs often
suffer from a lack of scientific rigor in their several steps. An
automated systematic review process aims at a stricter and better
managed way of conducting this methodology, avoiding the
biases of the unsystematic review [1].

Further, systematic reviews require considerably more
effort than traditional reviews, as they provide additional data on
variations in the primary studies [3]. Therefore, some systematic
reviews strongly depend on a computerized infrastructure to
support their processes [4]. In view of the difficulties in applying
this empirical research method, there is a need to invest effort
in support tools for planning and conducting systematic
literature reviews [1]. An automated or semi-automated tool is
essential to provide quality and productivity to the process [5].
Based on the systematic review process proposed by
Kitchenham [6] and later improved by Kitchenham and Charters
[2], we developed a business model to support the systematic
review process through automated tools. Through a prior systematic
review of the literature [7], we also identified several proposed
contributions to this automated process.
This paper is organized as follows: section 2 presents an
overview of the model for an automated systematic review
process; sections 3 to 5 present the proposed contributions to
each phase of the process; and section 6 discusses the results
obtained by relating the proposed model to the identified
contributions, as well as future work on the implementation of
the automated process.

2. PROPOSED MODEL
As described by Kitchenham [6] and later by Biolchini et al. [1], the
systematic review process consists of three main phases, each
one containing a number of discrete activities and specific stages
leading to its conclusion. The process is sequential, and later
stages depend on the results of their predecessors.
Thus, we present an overview of the process with a focus
on the automation of its activities, described in the Business
Process Modeling Notation (BPMN), as detailed by the Object
Management Group (footnote 1). BPMN provides a graphical notation
for specifying processes based on a flow-charting technique very
similar to the activity diagram of the Unified Modeling Language
(UML) [8]. It can be used for modeling workflow
processes and for representing the semantics of complex processes.
The business model proposed in this paper involves the three
phases of the systematic review process presented by Kitchenham
[6], as illustrated in Figure 1. Each phase and its stages are
described below, and the identified contributions are related
to each one. We further detail the discrete activities, artifacts
produced, participants involved, and communication
mechanisms. The process is complete when all the steps in this
model have been systematically finalized.

The following elements are used to represent the SLR process
in a BPMN context: (i) events, representing initial and end
states; (ii) activities, as phases and stages; (iii) gateways,
representing conditions and convergence actions; (iv) sequence
flows, which integrate events and activities; (v) communication
flows, which represent the interaction among participants; (vi)
data objects, representing the artifacts of the process; (vii)
messages, which detail the contents of a communication; and
(viii) associations, which connect actions and artifacts.

Figure 1. Systematic Review Process

Footnote 1: Available at http://www.bpmn.org/

3. CONTRIBUTIONS TO PLANNING THE REVIEW PHASE

The SLR process starts with a planning phase, consisting of five
distinct stages, as illustrated in Figure 2. The first stage,
identification of the need for a review, aims either to ensure the
originality of the work by consulting knowledge repositories in
the study field, as conceived by Lopes and Travassos [9], or to
identify an SLR to be repeated in a new context or iteration that
contributes to previously obtained knowledge. Repeatability is one
of the basic premises for assuring the quality of an SLR [10].

The commissioning the review stage is suggested when an
organization requires information about a specific topic but does
not have the time or expertise to perform a systematic review
itself. The proposed model allows the management of stakeholders
in this stage (footnote 2). Stakeholders can assume different roles
in the process: (i) the research coordinator; (ii) the researchers;
(iii) independent reviewers and moderators; and (iv) potential
observers with no assigned tasks in the process. Even when the
review is conducted by a single researcher, stakeholder management
makes it possible to invite field experts as observers or
moderators in the process.

The specifying the research question(s) stage is based on
the PICOC (Population, Intervention, Comparison, Outcome,
Context) criteria to frame research questions, as described by
Pai et al. [11] and expanded by Petticrew and Roberts [12].
To improve the quality of the review, the researcher is encouraged
to fill out all the criteria, but it is possible to perform an SLR
with less formalism by leaving blank the comparison criterion in
characterization reviews [13] or the context criterion when the
factors for success are not relevant to the research [12].

The proposed model also suggests generating a search
string by concatenating the terms proposed for each criterion of
the PICOC acronym, using the logical operator 'AND' to associate
criteria. The logical operator 'OR' can be used to concatenate
various terms within the same criterion of the research question.

Developing a review protocol consists of documenting the
protocol elements and some additional planning information
of a systematic literature review, as proposed by Kitchenham and
Charters [2]: (i) background; (ii) the research questions; (iii)
research strategy; (iv) study selection criteria; (v) study selection
procedures; (vi) study quality assessment procedures; (vii) data
extraction strategy; (viii) synthesis of the extracted data; (ix)
dissemination strategy; and (x) project timetable.

Figure 2. Planning review phase


Footnote 2: Available at http://goo.gl/PHXZu

Elements of the protocol should be stored in a
computational artifact that can be accessed later. Selection
criteria, electronic database references, extraction data and the
timetable must also be stored in a suitable data structure, as
presented in the process model (footnote 3). After the protocol
is documented, stakeholders should be informed of their
assigned tasks and dates. The stage further comprises exporting
a review protocol document supported by guidelines such as
those of Kitchenham and Charters [2] or templates as detailed by
Biolchini et al. [1].
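The "computational artifact" for the protocol is not specified further in the text. As a minimal sketch, assuming the ten protocol elements listed by Kitchenham and Charters [2] are kept as plain fields and serialized to JSON, such an artifact could look as follows. The class name, field names and the JSON format are illustrative assumptions, not part of the proposed model.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class ReviewProtocol:
    """Hypothetical container for the ten protocol elements (i) to (x)."""
    background: str = ""
    research_questions: List[str] = field(default_factory=list)
    research_strategy: str = ""
    # Inclusion and exclusion criteria, reusable in the selection stage.
    selection_criteria: Dict[str, List[str]] = field(
        default_factory=lambda: {"inclusion": [], "exclusion": []})
    selection_procedures: str = ""
    quality_assessment_procedures: str = ""
    data_extraction_strategy: str = ""
    synthesis_strategy: str = ""
    dissemination_strategy: str = ""
    # Each timetable entry assigns a task and a due date to a stakeholder.
    timetable: List[Dict[str, str]] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the protocol so it can be stored and accessed later."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "ReviewProtocol":
        """Rebuild a protocol from its stored JSON form."""
        return cls(**json.loads(text))


protocol = ReviewProtocol(
    background="Tool support for systematic reviews",
    research_questions=["Which tools support the SLR process?"],
)
protocol.selection_criteria["inclusion"].append("Study proposes a tool")
protocol.timetable.append(
    {"task": "Evaluate protocol", "assignee": "moderator", "due": "2012-08-01"})

stored = protocol.to_json()
restored = ReviewProtocol.from_json(stored)
print(restored.research_questions[0])
```

Keeping the criteria and the timetable as structured fields, rather than free text, is what would let later stages (selection, notification of stakeholders) read them back programmatically.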
The evaluating the review protocol stage is optionally
conducted to ensure the quality of the process. The protocol
documentation is reviewed by moderators, who submit
comments to the research coordinator, who then performs the
necessary changes.
This proposed model suggests several contributions in
automated support to the planning review phase, described in
Table 1. Identified contributions aim to address all stages in the
planning phase, automating a number of activities through
specific methods and techniques.
Table 1. Contributions to Planning Review Phase

Stage 1.1 Identification of the need for a review
- Consult a knowledge repository, as in Lopes and Travassos [9].

Stage 1.2 Commissioning the review
- Manage stakeholders, according to communication strategies [14].

Stage 1.3 Specifying the research question(s)
- Address the research question by filling in the five criteria of
  the PICOC acronym [11][12].
- Concatenate the terms proposed for each criterion of the PICOC
  acronym to automatically generate a search string.

Stage 1.4 Developing a review protocol
- Input the SLR protocol elements and additional information for
  planning the review in a computational artifact.
- Document the SLR protocol elements and additional
  information for planning the review in a computational artifact.
- Export a review protocol document in rich text format, containing
  all its elements and additional information for planning the
  review [2].
- Manage inclusion and exclusion criteria in a computational
  structure that can be used during the selection of primary
  studies stage.
- Manage the process timetable, assigning tasks to the stakeholders
  [2][20].
- Inform stakeholders of tasks and dates, according to communication
  strategies [14].
- Use guidelines such as Kitchenham and Charters [2], or protocol
  templates, as in Biolchini et al. [1], to support the SLR process.

Stage 1.5 Evaluating the review protocol
- Manage comments from moderators in the review protocol artifact.
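
The search string contribution of the specifying the research question(s) stage (terms of one PICOC criterion joined with 'OR', criteria joined with 'AND') can be illustrated with a short sketch. The function name, the dictionary layout and the quoting of multi-word terms are assumptions made for this example; the paper only fixes the use of the two logical operators.

```python
from typing import Dict, List

# Order in which the PICOC criteria are concatenated.
PICOC_ORDER = ["population", "intervention", "comparison", "outcome", "context"]


def build_search_string(picoc: Dict[str, List[str]]) -> str:
    """Join terms of a criterion with OR and the criteria with AND."""
    groups = []
    for criterion in PICOC_ORDER:
        terms = [t.strip() for t in picoc.get(criterion, []) if t.strip()]
        if not terms:
            continue  # a criterion left blank is simply skipped
        # Quote multi-word terms so they are searched as phrases (assumption).
        quoted = ['"%s"' % t if " " in t else t for t in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


example = {
    "population": ["systematic review", "systematic literature review"],
    "intervention": ["tool", "automation"],
    "comparison": [],  # left blank, as in a characterization review
    "outcome": ["productivity"],
}
print(build_search_string(example))
```

Skipping empty criteria mirrors the less formal reviews described in the text, where the comparison or context criterion may be left blank.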

4. CONTRIBUTIONS TO CONDUCTING THE REVIEW PHASE
The conducting the review phase also involves five stages, as
detailed in Figure 3. Not all stages in this phase
are performed sequentially, but there is still a strong dependence
on the earlier stages.
During the identification of research stage (footnote 4), search
strings are automatically generated for each electronic database, as
suggested by Brereton et al. [15]. Search strings are created by
applying the specific requirements of each electronic database to
the generic search string (defined at the specifying the research
question(s) stage). Automatically generated strings can also be
reformulated by researchers to suit specific formats or rules.
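
A minimal sketch of this adaptation step is given below, assuming each database is described by a template into which the generic string is inserted. The database names and template syntaxes are invented for illustration only; the real query syntax of each digital library would have to be taken from its own documentation. Manual reformulations by researchers are modeled as overrides.

```python
from typing import Dict, Optional

# Hypothetical templates: "{query}" marks where the generic string goes.
DATABASE_TEMPLATES: Dict[str, str] = {
    "library_a": "{query}",
    "library_b": "abstract:({query})",
    "library_c": "title-or-keywords({query})",
}


def adapt_search_string(generic: str, database: str,
                        manual_overrides: Optional[Dict[str, str]] = None) -> str:
    """Return the search string for one database.

    A string reformulated by a researcher takes precedence over the
    automatically generated one.
    """
    if manual_overrides and database in manual_overrides:
        return manual_overrides[database]
    return DATABASE_TEMPLATES[database].replace("{query}", generic)


generic = "(review OR survey) AND (tool)"
overrides = {"library_c": "title-or-keywords(review AND tool)"}
strings = {db: adapt_search_string(generic, db, overrides)
           for db in DATABASE_TEMPLATES}
for db, query in strings.items():
    print(db, "->", query)
```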
Based on that, the proposed model suggests preliminary
searches on electronic databases to check the volume and accuracy
of the identified studies, as well as possible biases. Researchers
can also include comparison studies that, if identified in
preliminary searches, could be analyzed by query expansion
techniques to suggest new terms for the search string [16].
Results of preliminary searches may cause changes in the
search string and methods documented in the review protocol.
Once the changes are finalized, preliminary searches can be
repeated, refining the search string until it satisfies the
researchers. To resolve divergences about changes to search
strings, the proposed business model also suggests communication
mechanisms between stakeholders. Following the activity of
preliminary searches, the references of the studies should be
collected and managed.

Figure 3. Conducting the review phase

Footnote 3: Available at http://goo.gl/GKh9b
Footnote 4: Available at http://goo.gl/1C40T

The proposed model also includes reference management of
collected studies, by manual input of specific references or by
importing references through common formats such as EndNote
and BibTeX. Automation can be achieved by integration with the
Application Programming Interfaces (APIs) of the electronic
databases. Details of the studies should also be stored in the
search process documentation, an artifact containing a list of the
primary studies collected, which serves as input for the subsequent
stages in the conducting the review phase [6].
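
As an illustration of importing references, the sketch below reads a deliberately restricted subset of BibTeX: one field per line and the closing brace of each entry on its own line. It is not a full BibTeX parser, and the function and field names are assumptions for this example; a real tool would more likely rely on an existing bibliography library.

```python
import re
from typing import Dict, List

# Matches the first line of an entry, e.g. "@article{brereton2007,".
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,?$")
# Matches one field per line, e.g. 'year = {2007},' or 'year = "2007",'.
FIELD_LINE = re.compile(r'(\w+)\s*=\s*[{"](.*?)[}"]\s*,?$')


def parse_bibtex(text: str) -> List[Dict[str, str]]:
    """Parse simple BibTeX entries into a list of dictionaries."""
    entries: List[Dict[str, str]] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("@"):
            match = ENTRY_START.match(line)
            if match:
                current = {"type": match.group(1).lower(),
                           "key": match.group(2)}
            continue
        if line == "}":  # end of the current entry
            if current is not None:
                entries.append(current)
                current = None
            continue
        if current is not None:
            match = FIELD_LINE.match(line)
            if match:
                current[match.group(1).lower()] = match.group(2)
    return entries


SAMPLE = """
@article{brereton2007,
  author = {Brereton, P. and Kitchenham, B. A.},
  title = {Lessons from Applying the {SLR} Process},
  year = {2007},
}
"""
references = parse_bibtex(SAMPLE)
print(references[0]["key"], references[0]["year"])
```

The resulting list of dictionaries could then feed the search process documentation described above.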
The selection of primary studies stage assesses the list of
collected studies according to the selection criteria defined in the
review protocol. This assessment can be done in three different
ways, as illustrated in the process model (footnote 5). Selection
by multiple researchers involves every research team member
assessing all the studies individually. Divergences found should be
resolved by consensus or by the intervention of a mediator. For
this purpose, communication mechanisms about the divergences are
part of the proposed model. Statistical coefficients, such as the
Cohen Kappa statistic [17], should also be available as supporting
tools, providing helpful information for measuring the agreement
between researchers.
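
For reference, the agreement between two researchers can be computed as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e the agreement expected by chance. The sketch below implements the plain (unweighted) Cohen's kappa for two raters; the weighted variant described in [17] is not covered, and the function name and decision labels are assumptions for this example.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters judging the same studies."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must judge the same non-empty list")
    n = len(rater_a)
    # Observed agreement: share of studies with identical decisions.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single, identical category
    return (observed - expected) / (1.0 - expected)


researcher_1 = ["include"] * 5 + ["exclude"] * 5
researcher_2 = ["include"] * 4 + ["exclude"] * 5 + ["include"]
kappa = cohen_kappa(researcher_1, researcher_2)
print(round(kappa, 2))
```

In this example the researchers agree on 8 of 10 studies (p_o = 0.8) while chance agreement is p_e = 0.5, so kappa is 0.6.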
Another selection method involves inter-rater reliability
tests, as cited by Khan, Niazi and Ahmad [18], in which a
researcher assesses all collected studies, and later a secondary
reviewer assesses only a sample of these. The secondary reviewer
can be another member of the research team or the research
coordinator. If the assessment of the sample matches the primary
researcher's, the selection is considered valid. Otherwise, the
selection of the studies should be redone to ensure the reliability
of the study selection criteria applied.
The last selection method involves the automatic
assessment of collected studies based on a minimum rating of
acceptable quality. Details regarding the quality assessment are
provided in the description of the next stage.
The study quality assessment stage consists of creating a list
of assessment criteria to be answered, ensuring the quality of the
selected studies. Studies with a higher ranking are more
significant for the systematic review than the others. This
questionnaire can also be applied as a method for the selection of
primary studies, or to score the primary studies suitable for the
data synthesis stage. The research coordinator can define a
specific questionnaire, or choose study quality procedures such
as the CRD Guidelines [18] and the Cochrane Reviewers'
Handbook [20] as assessment criteria. At the end of the stage,
the search process documentation is updated with the scores of
the selected studies.
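
A possible way to turn questionnaire answers into a ranking, and to apply the minimum-rating selection mentioned above, is sketched here. The three-level answer scale, its numeric values and the threshold are assumptions chosen for illustration; the paper does not prescribe a scoring scheme.

```python
from typing import Dict, List

# Hypothetical answer scale for each quality criterion.
ANSWER_SCORES = {"yes": 1.0, "partially": 0.5, "no": 0.0}


def quality_score(answers: List[str]) -> float:
    """Average score of one study over all quality criteria."""
    if not answers:
        raise ValueError("at least one quality criterion is required")
    return sum(ANSWER_SCORES[a] for a in answers) / len(answers)


def select_by_quality(assessments: Dict[str, List[str]],
                      minimum: float) -> List[str]:
    """Keep studies reaching the minimum score, best ranked first."""
    scores = {study: quality_score(ans) for study, ans in assessments.items()}
    selected = [study for study, score in scores.items() if score >= minimum]
    return sorted(selected, key=lambda study: scores[study], reverse=True)


assessments = {
    "S1": ["yes", "yes", "partially", "yes"],       # score 0.875
    "S2": ["no", "partially", "no", "yes"],         # score 0.375
    "S3": ["yes", "partially", "partially", "yes"], # score 0.75
}
print(select_by_quality(assessments, minimum=0.5))
```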
The data extraction and monitoring stage involves collecting
the specific information needed to address the review questions.
Extraction forms are designed based on the review protocol and
filled in several ways: some data can be automatically
extracted; other data should be collected from the study
references; finally, context-specific data require that the
researcher read the selected studies and interpret their results.

The data synthesis stage summarizes the extracted data as tables
and charts that present the results of the selected studies in a
more natural way. The proposed model includes the application of
statistical models such as the random effects model or the fixed
effects model [12], forest plots and/or other appropriate formats
for graphical representation of data [2], and the selection of
prominent sentences to supply an informative summary [20] for
an automated systematic review process.
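
To make the weighted average behind these statistical models concrete, the sketch below pools study effect sizes with inverse-variance weights. The fixed effects estimate uses weights 1/v_i; the random effects estimate adds a between-study variance tau^2, computed here with the DerSimonian-Laird estimator. Choosing that estimator is an assumption of this example, since the paper only names the two model families, and the function names are illustrative.

```python
from typing import List, Tuple


def fixed_effect(effects: List[float], variances: List[float]) -> Tuple[float, float]:
    """Inverse-variance weighted mean and its variance (fixed effects model)."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, 1.0 / total


def random_effects(effects: List[float],
                   variances: List[float]) -> Tuple[float, float, float]:
    """Pooled effect, its variance and tau^2 (DerSimonian-Laird estimator)."""
    k = len(effects)
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled_fixed, _ = fixed_effect(effects, variances)
    # Cochran's Q: weighted squared deviations from the fixed effects mean.
    q = sum(w * (y - pooled_fixed) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w * w for w in weights) / total
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random effects weights add the between-study variance to each study.
    adjusted = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(adjusted, effects)) / sum(adjusted)
    return pooled, 1.0 / sum(adjusted), tau2


effects = [0.0, 1.0]       # effect sizes extracted from two primary studies
variances = [0.01, 0.01]   # their sampling variances
print(fixed_effect(effects, variances))
print(random_effects(effects, variances))
```

The pooled estimates and their variances are also the quantities a forest plot would display, one row per primary study plus the summary row.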
Finally, contributions to the model in conducting the
review phase are presented in Table 2.
Table 2. Contributions to Conducting the Review Phase

Stage 2.1 Identification of research
- Formulate and maintain specific search strings for electronic
  databases.
- Generate search strings for electronic databases, as in Brereton
  et al. [15].
- Include known primary studies for comparison.
- Suggest new terms for the search string composition by applying
  text mining techniques to comparative studies [16].
- Conduct preliminary searches on electronic databases to check the
  volume and accuracy of the identified studies.
- Identify and highlight comparative studies during preliminary
  searches.
- Facilitate communication among stakeholders to resolve divergences
  concerning search string and/or search method changes.
- Manage references of studies identified in preliminary searches.
- Import references of the primary studies identified through common
  formats such as EndNote and BibTeX.
- Import references of the primary studies identified through
  integration with the APIs of electronic databases.
- Document the search process and electronic database information.

Stage 2.2 Selection of primary studies
- Select primary studies according to the criteria for inclusion and
  exclusion of studies defined in the review protocol.
- Manage divergences between researchers in the selection of primary
  studies.
- Inform the research coordinator/moderators about divergences in the
  selection of primary studies.
- Apply statistical coefficients, such as the Cohen Kappa statistic
  [17], to assist in resolving divergences.
- Conduct inter-rater reliability tests between researchers for
  quality assurance in the selection of studies [18].

Stage 2.3 Study quality assessment
- Develop and apply study quality assessment lists.
- Assess primary studies through the quality criteria list.
- Consult quality assessment support procedures, such as the CRD
  Guidelines [18] and the Cochrane Reviewers' Handbook [20].
- Use templates of quality assessment lists [19].

Stage 2.4 Data extraction and monitoring
- Maintain data extraction forms.
- Automatically generate data extraction forms from data defined in
  the review protocol.
- Record data specific to the selected studies in data extraction
  forms.
- Automatically extract data from the primary study references, such
  as title, authors, journal, publication details [2].

Stage 2.5 Data synthesis
- Simple tabulation of numerical data extracted from the primary
  studies.
- Calculating the weighted average by the application of statistical
  models: the random effects model or the fixed effects model [12].
- Summarizing the extracted data in graph form.
- Automatically generating graphical representations of data in
  forest plots and/or other appropriate formats [2].
- Obtaining prominent sentences through text mining techniques to
  provide an informative summary of the review [14].
- Selecting prominent sentences to enrich the data synthesis and the
  main report.

Footnote 5: Available at http://goo.gl/EP2ts

5. CONTRIBUTIONS TO REPORTING THE REVIEW PHASE
The last phase of the SLR process involves writing a report of the
systematic review, and later its dissemination and evaluation, as
illustrated in Figure 4. Its first stage, specifying dissemination
mechanisms, consists of defining the format of the main report, such
as a technical report or a journal or conference paper. The proposed
model provides templates for the automatic generation of reports,
such as those of Kitchenham and Charters [2] or Biolchini et al. [1].
Knowledge repositories in software engineering could also be used to
record and disseminate the research [9].
The formatting the main report stage is done by combining
the appropriate report template with the data synthesis results.
Researchers must be able to write specific sections of the report
in a similar way to the review protocol, and later export this
document.
After the report formatting is finished, moderators can evaluate
the report, adding comments as in the evaluating the review
protocol stage. Quality assessment for secondary studies [19]
and an efficient communication mechanism among stakeholders
are the proposed support tools for this stage.
Table 3 shows the contributions intended for the reporting the
review phase in the proposed model.

Table 3. Contributions to Reporting the Review Phase

Stage 3.1 Specifying dissemination mechanisms
- Define the publication format.
- Automatically generate the report through report templates.
- Record the main report in a knowledge repository, as proposed by
  Lopes and Travassos [9].

Stage 3.2 Formatting the main report
- Document the main report as a computational artifact.
- Export the main report in rich text format, containing all the
  documented sections combined with the specified template.

Stage 3.3 Evaluating the report
- Include comments from moderators in the main report artifact.
- Use lists of quality templates for assessing the quality of
  secondary studies [19].
- Submit the main report to a knowledge repository, as conceived by
  Lopes and Travassos [9].

After the report is approved by the moderators, the
systematic review is considered finished. Nevertheless, the data
and documents created in the process should remain available for
further evaluation, acting as a validation mechanism of the
methodology and sustaining the quality of the SLR produced. These
data could also serve as a basis for repeating the systematic review
in a new context or iteration.

Figure 4. Reporting the review phase

6. CONCLUSIONS

The process of conducting systematic reviews involves a series
of discrete activities, arranged in specific stages and phases
identified by Kitchenham [6], later by Biolchini et al. [1] and
finally by Kitchenham and Charters [2]. In addition to these
activities, several authors provide further guidelines that
contribute to the quality of the process or reduce the effort by
automating the more exhausting tasks.
This paper enhances these guidelines with the proposal of a
business model that addresses and supports the SLR process. The
proposed model is detailed in BPMN and includes all
phases and stages of the process, providing support for the
proper execution of its methodology. The contributions identified in
a previous paper, with their supporting methods and techniques, were
related to the specific stages of the model, forming a basis for
the development of computational tools to support the process.
As future work, we plan to implement the features that
comprise the proposed model, and to plan and conduct an
evaluation of the software against productivity and quality
criteria. The proposed model seems to be relevant in the context of
empirical studies, reducing effort through automated
tasks and supporting quality through the systematic
conduct of the methodology.

7. REFERENCES

[1] Biolchini, J., Gomes, P., Cruz, A., & Travassos, G. 2005.
Systematic review in software engineering. Technical Report.
Universidade Federal do Rio de Janeiro.

[2] Kitchenham, B., Charters, S. 2007. Guidelines for performing
systematic literature reviews in software engineering (version
2.3). Technical Report. University of Durham.

[3] Shull, F., Carver, J., and Travassos, G. H. 2001. An empirical
methodology for introducing software processes. In Proceedings
of 8th European Software Engineering Conference (Vienna,
Austria, September, 2001). ACM. 288-296.

[4] Zamboni, A., Thommazo, A., Hernandes, E. C. M., Fabbri, S. C.
P. F. 2010. StArt - Uma Ferramenta Computacional de Apoio à
Revisão Sistemática. In Proceedings of CBSOFT - Congresso
Brasileiro de Software: Teoria e Prática (Salvador, Brazil,
September, 2010). SBC. 1-6.

[5] Ali, S., Briand, L. C., Hemmati, H., Panesar-Walawege, R. K.
2010. A Systematic Review of the Application and Empirical
Investigation of Search-Based Test Case Generation. IEEE
Transactions on Software Engineering, 36, 6 (Nov.-Dec. 2010),
742-762.

[6] Kitchenham, B. 2004. Procedures for Performing Systematic
Reviews. Technical Report. Keele University Joint Technical
Report.

[7] Molléri, J. S., Benitti, F. B. V. 2012. Automated Approaches to
Support Secondary Study Processes: a Systematic Review. In
Proceedings of 24th SEKE International Conference on
Software Engineering & Knowledge Engineering (San Francisco,
USA, July, 2012). Knowledge Systems Institute Graduate School.
143-147.

[8] OBJECT MANAGEMENT GROUP. Business Process Model and
Notation. Available at: <http://www.bpmn.org/>. Accessed: Jun.
2012.

[9] Lopes, V. P., Travassos, G. H. 2008. Infra-estrutura Conceitual
para Ambientes de Experimentação em Engenharia de Software. In
Proceedings of ESELAW'08 - Experimental Software Engineering
Latin American Workshop (Salvador, Brazil, November, 2008).
SBC. 34-44.

[10] Biolchini, J., Gomes, P., Cruz, A., Uchôa, T., & Travassos, G.
2007. Scientific research ontology to support systematic review in
software engineering. Advanced Engineering Informatics, 21, 2
(Apr. 2007), 133-151.

[11] Pai, M., McCulloch, M., Gorman, J. D., Pai, N., Enanoria, W.,
Kennedy, G., Tharyan, P., Colford, J. M. 2004. Systematic reviews
and meta-analyses: an illustrated, step-by-step guide. National
Medical Journal of India, 17, 2 (Mar.-Apr. 2004), 89-95.

[12] Petticrew, M., Roberts, H. 2006. Systematic Reviews in the Social
Sciences: A Practical Guide. Blackwell Publishing, Malden, MA.

[13] Travassos, G. H., dos Santos, P. S. M., Neto, P. G. M., &
Biolchini, J. 2008. An environment to support large scale
experimentation in software engineering. In IEEE International
Conference on Engineering of Complex Computer Systems
ICECCS (Belfast, Northern Ireland, March, 2008). IEEE. 193-202.

[14] Felizardo, K. R., Andery, G. F., Maldonado, J. C., Minghim, R.
2009. Uma Abordagem Visual para Auxiliar a Revisão da Seleção
de Estudos Primários na Revisão Sistemática. In Proceedings of VI
Experimental Software Engineering Latin American Workshop
Eselaw, 6. (São Carlos, Brazil, November 11-13, 2009). SBC.
83-133.

[15] Brereton, P., Kitchenham, B. A., Budgen, D., Turner, M., Khalil,
M. 2007. Lessons from Applying the Systematic Literature Review
Process within the Software Engineering Domain. Journal of
Systems and Software, 80, 4 (Apr. 2007), 571-583.

[16] Ananiadou, S., Okazaki, N., Procter, R., Rea, B., Thomas, J. 2009.
Supporting Systematic Reviews using Text Mining. Social
Science Computer Review, 27, 4 (Nov. 2009), 509-523.

[17] Cohen, J. 1968. Weighted Kappa: nominal scale agreement with
provision for scaled disagreement or partial credit. Psychological
Bulletin, 70, 4 (Oct. 1968), 213-220.

[18] Khan, K. S., Ter Riet, G., Glanville, J., Sowden, A. J., Kleijnen, J.
2000. Undertaking Systematic Review of Research on
Effectiveness. CRD's Guidance for those Carrying Out or
Commissioning Reviews, 2.ed. CRD Report n.4. York: NHS
Centre for Reviews and Dissemination, University of York.

[19] Dybå, T., Dingsøyr, T. 2008. Strength of evidence in systematic
reviews in software engineering. In Proceedings of ESEM'08 -
ACM-IEEE International Symposium on Empirical Software
Engineering and Measurement (Kaiserslautern, Germany, October
9-10, 2008). ACM New York. 178-187.

[20] Alderson, P., Green, S., & Higgins, J. P. T. 2004. Cochrane
Reviewers' Handbook 4.2.2. In: The Cochrane Library, 1 (2004).
Chichester: John Wiley & Sons Ltd.
