PLOS COMPUTATIONAL BIOLOGY
EDUCATION
A mentorship and incubation program using
project-based learning to build a professional
bioinformatics pipeline in Kenya
Ruth Nanjala1,2, Festus Nyasimi1,3, Daniel Masiga1, Caleb Kipkurui Kibet1*
OPEN ACCESS
Citation: Nanjala R, Nyasimi F, Masiga D, Kibet CK
(2023) A mentorship and incubation program
using project-based learning to build a professional
bioinformatics pipeline in Kenya. PLoS Comput Biol
19(3): e1010904. https://doi.org/10.1371/journal.pcbi.1010904
Editor: Francis Ouellette, McGill University,
CANADA
Published: March 2, 2023
Copyright: © 2023 Nanjala et al. This is an open
access article distributed under the terms of the
Creative Commons Attribution License, which
permits unrestricted use, distribution, and
reproduction in any medium, provided the original
author and source are credited.
Funding: The authors gratefully acknowledge the
financial support for this research by the following
organizations and agencies: U.S. National Institutes
of Health (NIH) and the National Human Genome
Research Institute (NHGRI) grant number
U24HG006941 to DM; the Fogarty International
Center of the National Institutes of Health under
Award Number U2RTW010677 to DM. In addition,
we gratefully acknowledge core financial assistance
to icipe provided by the Swedish International
Development Cooperation Agency (Sida); the
Swiss Agency for Development and Cooperation
(SDC); the Australian Centre for International
Agricultural Research (ACIAR); the Federal
Democratic Republic of Ethiopia; and the Government of the Republic of Kenya. The views
expressed herein do not necessarily reflect the official opinion of the donors. The funders had no
role in study design, data collection and analysis, decision to publish, or preparation of the
manuscript.
Competing interests: The authors have declared that no competing interests exist.
1 International Centre of Insect Physiology and Ecology, Nairobi, Kenya, 2 Kennedy Institute for
Rheumatology, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences,
University of Oxford, United Kingdom, 3 The University of Chicago, Chicago, Illinois, United States of
America
* ckibet@icipe.org
Abstract
The demand for well-trained bioinformaticians to support genomics research continues to
rise. Unfortunately, undergraduate training in Kenya does not prepare students for specialization in bioinformatics. Graduates are often unaware of the career opportunities in bioinformatics, and those who are aware may lack mentors to help them choose a specialization. The
Bioinformatics Mentorship and Incubation Program seeks to bridge the gap by laying the
foundation for a bioinformatics training pipeline using project-based learning. Through an
intensive open recruitment exercise, the program selects six highly competitive students to
join for four months. The six interns undergo intensive training
within the first one and a half months before being assigned to mini-projects. We track the
progress of the interns through weekly code review sessions and a final presentation at the
end of the four months. We have trained five cohorts of interns, most of whom have secured master’s
scholarships within and outside the country or job opportunities. We demonstrate the benefit of structured mentorship using project-based learning in filling the training gap after
undergraduate programs to generate well-trained bioinformaticians who are competitive in
graduate programs and bioinformatics jobs.
This is a PLOS Computational Biology Methods paper.
Introduction
The field of bioinformatics has grown substantially since the completion of the first draft of the
human genome in the 1990s [1]. The tremendous increase in the volume of data generated
from sequencing has increased the demand for bioinformaticians who can skillfully interpret
large and complex datasets [2]. Although bioinformatics is increasingly crucial for life sciences
research, undergraduate biology education is not structured adequately to incorporate bioinformatics skills and knowledge. Curriculum gaps can prevent biology students from reaching
their full educational potential, restrict their job options, and hinder research progress [3].
PLOS Computational Biology | https://doi.org/10.1371/journal.pcbi.1010904 March 2, 2023
The countries of the global north have incorporated the core competencies of bioinformatics into undergraduate life sciences courses [3], including fully approved graduate degree
training programs in bioinformatics. Therefore, the transition to research and graduate school
is seamless, as students already have basic research and computational biology skills. Short bioinformatics courses that last only a few days to a few weeks fill the skill gap in resource-constrained settings. The introductory [4] and intermediate [5] bioinformatics training of Human
Heredity and Health in Africa Bioinformatics Network (H3ABioNet) [6] has successfully used
African bioinformatics experts to reach large cohorts of trainees through blended learning.
However, these courses lack structured mentorship and project-based learning critical for
skills retention and bioinformatics career specialization [2,7].
Africa, regarded as the cradle of humankind [8], has a high genetic diversity and a high burden of infectious diseases. Bioinformatics capacity is required to improve research in Africa
and address the burden of disease [9]. Although advanced bioinformatics training remains a
privilege for countries with cutting-edge scientific resources [10], some African countries,
including South Africa [11], Mali [12], Nigeria [13], Ghana [10], and Uganda [7], have worked to close
such gaps through different bioinformatics capacity-building programs. In Kenya, there is no
structured bioinformatics mentorship.
Few organizations in Kenya, namely the International Centre of Insect Physiology and Ecology
(icipe), the International Livestock Research Institute (ILRI), and the Kenya Medical Research Institute-Wellcome Trust (KEMRI-WT), play a crucial role in closing this gap through short-term training and workshops, but this is not enough. In the absence of ongoing support, those
who receive training cannot continue learning or apply the skills gained in research
projects. Furthermore, short-term training is not sustainable because trainees do not retain their
bioinformatics skills.
The Human Heredity and Health in Africa (H3Africa) project led to the demand for
bioinformaticians who could analyze genomic data generated from H3Africa projects [14].
The Fogarty International Center, which supports H3Africa projects, supported bioinformatics capacity development by sponsoring master’s and PhD students through projects such as
the Eastern Africa Network for Bioinformatics Training program (EANBiT) [15] and Nurturing Genomics and Bioinformatics Capacity in Africa [16]. However, there remains a gap in
how to attract highly motivated graduates interested in pursuing bioinformatics as a career.
Short-term training fills additional gaps in niche areas such as metagenomics, but structured
project-based immersive training and mentoring offer the best impact in establishing sustained interest [17].
Bioinformatics Incubation and Mentorship Program Design
We established the Bioinformatics Incubation and Mentorship Program to fill the gap by creating a foundation for bioinformatics career development. The program recruits six highly
motivated students interested in bioinformatics every four months to participate in structured
training and mentorship. The aim is to generate a pool of competitive bioinformatics trainees
who can proceed to master’s degree training or support ongoing research.
The mentorship program is based on a project-based learning approach to increase bioinformatics capacity. We modeled the incubation program on the key competencies that ISCB defines
for bioinformatics scientists [18], with clear learning objectives and results.
The program aims to provide knowledge and skills in the following bioinformatics competencies: data management; bioinformatics tools, resources, and their use; the scientific discovery process and the role
of bioinformatics in it; computer science systems basics; scripting and programming; and open
source and version control tools, all of which are relevant to the bioinformatics discipline.
Project-based learning approach
The selected interns undergo rigorous training within the first one and a half months. The
training covered by different tutors is meant to introduce the field of bioinformatics, guide the
interns in designing their purpose road maps, teach key people skills, and introduce them to
data management, open science, and reproducible research (Table 1). In addition to bioinformatics skills, the program includes information literacy workshops on how to access subscribed electronic resources, publish in credible journals, maintain a scholarly presence online,
and manage references.
The project-based learning approach is implemented in two phases: reproducing the bioinformatics analysis in a published paper, and analyzing data from ongoing projects at icipe. In the last two
weeks of the second month, the interns reproduce the methods of a paper that applied
techniques taught in the first one and a half months (sequence analysis, phylogenetics,
and NGS) and shared its data. The interns are assessed through a final presentation at the
end of the month; the strengths of each intern are evaluated to facilitate appropriate assignment
to projects in the second phase. In the second phase, the interns carry out mini-projects over
two months that relate to ongoing genomics work at icipe. In the first week, the interns present
how they would tackle the project, how they would collaborate, and the tools they would use.
Weekly code review sessions help track the progress of interns and enable code debugging and
documentation using GitHub wikis and GitHub projects.
Methods of program delivery
The various fields of bioinformatics are condensed into a progression of self-contained modules taught in logical order (Fig 1). The curriculum design also includes a practical set of abilities that are crucial to putting that information into practice. Essential topics are taught
before advanced ones.
With seven contact hours per day for six weeks, the course is delivered through theory
sessions in the morning and practical hands-on sessions in the afternoon, which are handled
by the local senior faculty, the program coordinator, and MSc fellows as teaching assistants.
When mentees need support outside the scheduled times, the program coordinator and
teaching assistants provide additional practical sessions and office hours. The interns are
tested at the end of each module through regular reading and practical exercises.
Once the key concepts have been delivered in the first six weeks, the interns are put into
groups of 2 or 3 based on shared interests in readiness for the mini-projects phase. First, the
interns are tasked with reanalyzing data from published papers and comparing their results with
those of the published study. The interns are then assigned projects from ongoing research at the
Centre, where they work closely with the scientists to analyze and interpret the data. During
this phase, they receive help from the local senior faculty, program coordinator, and MSc fellows through weekly code review sessions. They then make a final presentation to summarize
their mini-projects on the last day of the internship.
In addition to standard bioinformatics topics (sequence analysis, phylogenetics, NGS, and
workflows), journal club meetings, soft skills sessions, and the information literacy workshop
are part of the internship program. These sessions are included because they are crucial to the
professional development of the mentees. We also guide interns in job and MSc applications
through mock reviews and interviews in the last month. The cover letters, CVs, and motivations of the fellows are reviewed, and feedback is provided based on the skills gained by the
interns.
Mini-projects focused on open and reproducible genomic analysis (Fig 1).
Table 1. Bioinformatics Incubation Training Program. The topics covered in each module and the resources used for the training are available in the GitHub repo:
https://github.com/mbbu/training-materials-and-resources.
Time period | Course structure | ISCB core competencies | Bloom’s taxonomy
Weeks 1–5 | Onboarding and general overview of the Molecular Biology and Bioinformatics Unit | O–Effective teamwork to accomplish a common scientific goal | Knowledge
 | Introduction to collaborative documents (Slack, HackMD) | I–GUI/Web-based computing skills appropriate to the discipline | Application
 | Developing the purpose roadmap, objective, and goals | P–Engage in continuing professional development in bioinformatics | Evaluation
 | Open Science and Data Management, and Reproducibility (introductory lecture on open science, FAIR data, and data management plans) | P–Engage in continuing professional development in bioinformatics | Application
 | Version Control with Git and GitHub | J–Command line and scripting-based computing skills appropriate to the discipline | Application
 | | I–GUI/Web-based computing skills appropriate to the discipline | Application
 | Introduction to Unix Shell–BASH scripting | J–Command line and scripting-based computing skills appropriate to the discipline | Application
 | Brief introduction of the bioinformatics field | F–Bioinformatics tools and their usage | Evaluation
 | Introduction to Next-Generation Sequencing and file formats | C–Biological data generation technologies | Knowledge to analysis
 | Soft skills (Personality, Motivation, Leadership attributes and styles) | N–Effective communication of bioinformatics and genomics topics with a wide range of audiences | Application
 | Python programming (basic data types and operations, string manipulation, data structures: lists, tuples, sets and dictionaries, control statements, functions, scripting with Python) | J–Command line and scripting-based computing skills appropriate to the discipline | Analysis
 | Soft skills (Communication and presentation skills, giving feedback and Conflict resolution) | N–Effective communication of bioinformatics and genomics topics with a wide range of audiences | Application
 | R Programming language (Introduction to R and RStudio, data structures in R, exploring data frames, subsetting data) | J–Command line and scripting-based computing skills appropriate to the discipline | Analysis
 | | E–Statistical research methods in the context of molecular biology, genomics, medical, and population genetics research | Application
 | Introduction to phylogenetics | B–Depth in at least one area of biology | Application
 | Workflow languages (Nextflow and Snakemake) and Containerization (Docker and Singularity) | F–Bioinformatics tools and resources and their usage | Application
 | Soft skills (Career planning and peer mentoring) | N–Effective communication of bioinformatics and genomics topics with a wide range of audiences | Analysis
 | Data Science and Machine learning (Machine learning Concepts, Introduction to Linear Regression, Random Forest and Decision Trees, Feature Engineering in Genomics) | K–Construction of software systems of varying complexity based on design and development principles | Knowledge
 | Introduction to HPC and Cloud Computing (SSH, moving data, Slurm scheduler, lecture on AWS) | H–Computing requirements appropriate to solve a given scientific problem | Application
Weeks 6–7 | Mini-project on reproducing bioinformatics data analysis from published papers | B–Depth in at least one area of biology | Application
 | | I–GUI/Web-based computing skills appropriate to the discipline | Analysis
Week 8 | Information literacy workshop (Publishing online, maintaining online scholarly presence, Reference management) | N–Effective communication of bioinformatics and genomics topics with a wide range of audiences | Knowledge
 | Molecular Biology Laboratory tour | NA | NA
Weeks 9–17 | Mini-projects; journal club presentations | B–Depth in at least one area of biology | Application
 | | O–Effective teamwork to accomplish a common scientific goal | Analysis
 | | F–Bioinformatics tools and resources and their usage | Application
 | | E–Statistical research methods in the context of molecular biology, genomics, medical, and population genetics research | Application
 | | N–Effective communication of bioinformatics and genomics topics with a wide range of audiences | Analysis
 | | A–General Biology | Comprehension
https://doi.org/10.1371/journal.pcbi.1010904.t001
Recruitment
The Bioinformatics Incubation and Mentorship Program is designed for highly motivated students interested in bioinformatics, who are recruited through widely advertised calls and direct recommendations from relevant university departments. It is hosted at the icipe Molecular Biology
and Bioinformatics Unit (MBBU). To be selected for the program, interested students from
different universities in Kenya submit applications to icipe through a web-accessible form
developed in REDCap [19,20]. The application includes a motivation letter and a curriculum
vitae detailing their background.
For each round, we shortlist at least 15 applications for an interview. The applications are
critically reviewed through an initial long list to ensure completeness and essential qualifications. We are interested in highly motivated students who have completed their undergraduate
degrees. The desire to pursue bioinformatics should not be an afterthought, and applicants must have put
effort into writing a motivation letter that makes clear how the internship fits their career plan. At
least two reviewers then score the long-listed applications based on whether the applicant is motivated to pursue a career in bioinformatics, the clarity of how the internship fits their career
plan, and the quality of the application.
Fig 1. Mini-projects design and sample projects. The project-based learning employed in the incubation program is centered on open and reproducible
research.
https://doi.org/10.1371/journal.pcbi.1010904.g001
Fig 2. Recruitment process.
https://doi.org/10.1371/journal.pcbi.1010904.g002
The rigorous selection process is vital for success, as we work with highly motivated and
curious students. Interview questions aim to reveal what makes the applicant uniquely qualified for the internship based on prior interests, problem-solving skills, curiosity, and how the
training aligns with their career goals. The questions were as follows: (1) For this internship,
tell us who you are and what makes you uniquely qualified for the position. (2) You are faced
with a challenge you have not encountered before; briefly tell us about your thought process
and the approach to solving it. (3) What are your expectations from the internship and how
does it align with your goals? Three interviewers select six interns to join the program through
independent ranking. A summary of the recruitment process is described in Fig 2.
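As an illustration only, the independent rankings of the three interviewers could be combined as in the following Python sketch. The text does not state how the rankings are merged, so the rule used here (summing rank positions, a Borda-style count), the candidate labels, and the function name select_interns are all assumptions made for this example.

```python
from collections import defaultdict


def select_interns(rankings, n_select=6):
    """Combine independent rankings by summing rank positions (lower is better).

    rankings: one list per interviewer, each ordering the same candidates
    from most to least preferred. The aggregation rule is an assumption
    for illustration, not the program's documented method.
    """
    totals = defaultdict(int)
    for ranking in rankings:
        for position, candidate in enumerate(ranking, start=1):
            totals[candidate] += position
    # Sort by total rank; break ties alphabetically so the result is deterministic.
    ordered = sorted(totals, key=lambda c: (totals[c], c))
    return ordered[:n_select]


# Hypothetical shortlist of eight candidates ranked by three interviewers.
interviewer_a = ["C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8"]
interviewer_b = ["C2", "C1", "C4", "C3", "C6", "C5", "C8", "C7"]
interviewer_c = ["C1", "C3", "C2", "C5", "C4", "C7", "C6", "C8"]

selected = select_interns([interviewer_a, interviewer_b, interviewer_c])
print(selected)
```

A rank-sum rule is only one option; a panel could equally discuss borderline candidates or weight interviewers differently.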
Impact of the program
The internship program has been run successfully for five iterations since October 2020, mentoring 27 students. The program has generated a pool of competitive students who have
secured master’s degree placements within and outside the country, as well as job opportunities. Some interns
continued various ongoing projects at icipe and with our collaborators. Table 2 documents the
post-internship positions secured by some of the interns. Those not included [10] are exploring their next steps in the bioinformatics field or serving as teaching assistants for H3ABioNet courses and the Carpentries organization, while others have taken short courses and mini-projects to perfect their bioinformatics skills.
Feedback from the participants
We assess and monitor the progress and impact of the bioinformatics mentorship and incubation program through pre- and post-internship surveys, weekly updates, and code reviews.
Table 2. Post-internship progression.
Intern | Cohort | Background | Internship period | Student’s current position
Intern 1 | 1 | BSc Molecular Biology | October 2020–January 2021 | MSc Bioinformatics candidate, Kenya
Intern 2 | 1 | BSc Microbiology and Biotechnology | October 2020–January 2021 | Software Engineer, Kenya
Intern 3 | 1 | BSc Biotechnology | October 2020–January 2021 | MSc Bioinformatics (the Netherlands, Europe)
Intern 4 | 1 | BSc Microbiology | October 2020–January 2021 | MSc Genetics (Brazil, South America)
Intern 5 | 2 | BSc Molecular Biology | February 2021–May 2021 | MSc Bioinformatics candidate, Kenya
Intern 6 | 2 | BSc Applied Bioengineering | February 2021–May 2021 | MSc Bioinformatics candidate, Kenya
Intern 7 | 2 | BSc Medical Biology and Chemistry | February 2021–May 2021 | Assistant Research Officer, Research Organization, Kenya
Intern 8 | 2 | BSc Biochemistry and Molecular Biology | February 2021–May 2021 | Masters in Life Sciences and Health (Europe)
Intern 9 | 2 | BSc Marine Biology | February 2021–May 2021 | MSc Life Science (Belgium, Europe)
Intern 10 | 3 | BSc Botany | June 2021–September 2021 | Scientist at a Pharmaceutical company
Intern 12 | 3 | BSc Biochemistry | June 2021–September 2021 | Bioinformatician, Research Organization, Kenya
Intern 13 | 3 | BSc Biochemistry | June 2021–September 2021 | MSc Bioinformatics candidate, Kenya
Intern 14 | 4 | BSc Statistics | October 2021–January 2022 | Intern, Industry, Kenya
Intern 15 | 4 | BSc Medical Biochemistry | October 2021–January 2022 | Software developer, Industry, Kenya
Intern 16 | 4 | BSc Genomic Sciences | October 2021–January 2022 | Intern, Research Organization, Kenya
Intern 17 | 5 | BSc Biochemistry and Molecular Biology | February 2022–May 2022 | Research Assistant, Bioinformatics, Germany
https://doi.org/10.1371/journal.pcbi.1010904.t002
The interns join the program with the expectations of learning bioinformatics in depth, improving presentation skills, learning to code, creating meaningful friendships and networks, being
mentored, interacting with leading scientists, and gaining in-depth knowledge of genomics and
genetics. Before the program, the research interests of the interns were more generalized, but
after the internship, their interests became more focused. Some examples are as follows:
“Research has always been my interest, and training has shaped me in an interesting field of
metagenomics. An added value was that I got to do metagenomics analysis as a mini-project. Yes, my research interests have changed.”
“Before the internship, I wasn’t quite sure what I wanted to venture into, but now I am
interested in using bioinformatics tools to study infectious and non-infectious diseases.”
“Before training, I wanted to be a genomics researcher. Going out, I am still interested in
genomics research, but the internship program opened my mind to find a field to specialize
in, and that’s what I’m still trying to figure out.”
The interns improved their competencies in command line (Fig 3A), Git and GitHub
(Fig 3B), working with the HPC (Fig 4A), building bioinformatics pipelines and workflows
(Fig 4B), Python (Fig 5A), and R and Rmarkdown (Fig 5B).
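As a minimal sketch of how paired pre- and post-internship self-ratings on the 1 (never used) to 5 (advanced) scale might be summarized for one module, consider the Python example below. The scores are hypothetical and are not the survey data behind Figs 3 to 5; the function name summarize_change and the choice of summary statistics are likewise assumptions for illustration.

```python
from statistics import mean, median


def summarize_change(pre, post):
    """Summarize paired pre/post self-ratings on the 1-5 scale."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain one score per intern")
    # Per-intern gain: post-internship rating minus pre-internship rating.
    diffs = [after - before for before, after in zip(pre, post)]
    return {
        "pre_median": median(pre),
        "post_median": median(post),
        "mean_gain": mean(diffs),
        "improved": sum(d > 0 for d in diffs),
    }


# Hypothetical self-ratings for six interns in a single module.
pre_scores = [1, 2, 1, 3, 2, 1]
post_scores = [4, 4, 3, 5, 4, 3]

summary = summarize_change(pre_scores, post_scores)
print(summary)
```

Medians are reported alongside the mean gain because the ratings are ordinal, so a mean alone can overstate how evenly the improvement is spread.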
As part of the post-internship survey, the interns highlighted their training experience, as
shown in Fig 6. They noted that the topics, content, and materials were helpful and sufficient
but highlighted that the time was not always enough.
In addition to quantitative feedback, the interns reflected on their experience through a
blog post published on our program page.
We highlight one representative example in the following:
“Coding has always fascinated me. Being from an applied bioengineering (genomics-based)
background, I actively sought a bridge between the two. The internship allowed me to learn
and polish my programming skills under the guidance of a tutor and to see its application
in multi-omics data processing of real-life problems. In addition, it has cultivated a self-learning discipline that has played a key role in solidifying and expanding my bioinformatics knowledge. The skills and experience gained during the internship gave me a competitive advantage when applying for bioinformatics opportunities, paving the way to obtain
the EANBiT Master’s fellowship. The internship increased my resilience and self-discipline,
which still plays a key role in my MSc Bioinformatics journey.”
Fig 3. The figures summarize the interns’ competencies in command line and Git acquired from the modules pre- and post-internship. Scores: 1 = Never
used, 5 = Advanced.
https://doi.org/10.1371/journal.pcbi.1010904.g003
Successes, challenges, and future improvements
Key achievements. We designed and implemented a project-based learning bioinformatics training curriculum. Using the curriculum, we have mentored and trained 27 interns, generating a pool of fellows to join the bioinformatics training pipeline and contribute to the
bioinformatics capacity on the continent.
Fig 4. The figures summarize the interns’ competencies in HPC, and Pipelines acquired from the modules pre- and post-internship. Scores: 1 –Never
used, 5 –Advanced.
https://doi.org/10.1371/journal.pcbi.1010904.g004
Fig 5. The figures summarize the competencies of the interns in Python and R acquired from the modules pre- and post-internship. Scores: 1 –Never used, 5 –
Advanced.
https://doi.org/10.1371/journal.pcbi.1010904.g005
Reproducibility. A key component of the mentorship program is to enhance collaboration and reproducible research. All mini-projects have been properly documented on the organization’s GitHub account, enhancing reproducibility. The curriculum and the resources used
for the internship program are publicly available on the MBBU GitHub pages and can be used
to establish a similar program.
Fig 6. A summary of the responses on the relevance of the topics covered, content organization, material distribution, and
time allocation.
https://doi.org/10.1371/journal.pcbi.1010904.g006
Sustainability through the Mentorship Series. To continue mentoring interns beyond
the internship period, we launched the Science Journey Seminar series, where interns interact
with scientists worldwide through seminars. These seminars highlight paths and journeys of
established and upcoming scientists to motivate, enlighten, and guide them on potential career
paths in bioinformatics.
Challenges. The fast pace and turnaround of the program meant that we had to continuously design projects that fit the interests of the interns. Sometimes, however, the interests of some
interns did not align with the work at the Centre, or data-rich projects requiring their
expertise were not readily available within the host institute. We addressed this by forging collaborations outside the institute to ensure that the interests of the fellows were catered to. The
short duration of the mentorship program, though by design, may not allow the completion of
some of the projects. In these cases, we have had fellows retained by the projects after the
internship.
Discussion and conclusions
A substantial increase in genomics and proteomics data has catalyzed the demand for well-trained bioinformaticians [21], but undergraduate courses do not prepare students for specialization in bioinformatics, and training opportunities are insufficient [22]. We established a
structured project-based bioinformatics incubation and mentorship program to prepare
undergraduate students for a career in bioinformatics. The entire program is a mentoring and
incubation program rather than a typical internship program. Interns are taught, supervised,
and mentored by researchers and MSc bioinformatics students in the Molecular Biology
and Bioinformatics Unit at icipe. Mentorship is vital as the research mentor influences the professional growth of the intern [23]. The curriculum covers key bioinformatics competencies
and critical soft skills that culminate in mini-projects tailored to meet the interests of the
interns and facilitate the retention of skills. By reproducing research articles after structured
training, interns can grasp how experts have applied the skills and tools they have learned.
Together with the presentations at the journal club, they can improve their ability to critically
judge the validity and rigor of scientific approaches and findings [24] and be exposed to the
rapidly changing literature [25]. Reproducing research articles is thus an essential component
of the training as the interns get to understand the benefits practically.
Structured training and dedicating the last two months of the internship program to a
mini-project contributed to the success of the program. The projects span various topics based
on the interests of the interns, available data, and ongoing research within icipe. The emphasis
on project-based learning (PBL) is vital in cementing the knowledge and skills learned, an integral approach yet to be widely implemented in bioinformatics curricula [20]. PBL increases
the degree of participation of the trainees [26], as previously demonstrated through a five-week
summer school [23]. Students become protagonists in the teaching–learning process and
develop a more critical problem-solving mindset as we challenge them to apply skills to real
problems, attaching purpose and benefit to the skills acquired, thus ensuring retention [27].
During the mini-projects, we train the interns to think critically and computationally, as
this helps them understand research problems and code debugging. “Confidence in dealing
with complexity, persistence in working with difficult problems, tolerance of ambiguity, the
ability to deal with open-ended problems, and the ability to communicate and work with others to achieve a common goal” are all skills that computational thinking helps students develop
[28]. The program builds collaboration skills through group tasks and mini-projects and
advocates for Open Science and reproducible analysis, demonstrated by documenting their
work in public GitHub repositories. Documenting is key as it helps create code that is tidy,
error-free, and reusable [29].
Weekly code review sessions facilitate continuous feedback and help uncover bugs and
code issues with a fresh set of eyes. They also provide room for new ideas and easier ways to
perform tasks. In addition, they help track the progress of a project and identify gaps and ways
to improve the project. Bioinformatics research and practice have significant implications for
life sciences, and effective quality assurance techniques such as code reviews and testing are
vital to ensuring software quality [30]. We advocate for open science during the internship.
The collaborative and reproducibility components of Open Science are critical to their development as scientists because they lead to more citations, media attention, possible collaborators, career prospects, and funding opportunities [31].
The interns give feedback at three points: through pre-internship surveys, verbal feedback
after every taught module, and post-internship surveys. The pre-internship surveys highlight
the training needs of the interns and point out areas that require more emphasis during training. They also highlight the research interests of the interns so that we can help them find
suitable mentors. Verbal feedback after every taught module is essential in gauging whether
the interns understood the topic of interest. The post-internship survey highlights the success
of the internship journey and the areas that need improvement.
Figs 3, 4 and 5 show a tremendous improvement in the bioinformatics capacity of the
interns after the internship considering they had a short time to grasp and practice the modules taught. The Bioinformatics Incubation and Mentorship Program has increased the number of well-trained bioinformaticians and helped build the capacity for bioinformatics trainers;
some interns have become certified Carpentries Instructors.
We have established a structured project-based mentorship and incubation program that
fills the bioinformatics training gap between undergraduate and graduate programs or job
opportunities. The program trainees are highly competitive for MSc programs and job placements within and outside Kenya, demonstrating the benefit of mentorship in filling the transition gap. The program thus creates a pool of highly motivated trainees for recruitment to graduate school
and supports genomic data analysis in Africa.
Acknowledgments
We thank the former and current EANBiT MSc students Eric, Pauline, Winfred, Brian, Alex,
Ken, Laurah, and Gladys for their support in reviewing the applications, supporting the training, and providing valuable feedback. We also acknowledge the scientists who contributed
mini-projects: Nelly, Kiatoko, and Merid.
Author Contributions
Conceptualization: Daniel Masiga, Caleb Kipkurui Kibet.
Formal analysis: Ruth Nanjala, Festus Nyasimi, Caleb Kipkurui Kibet.
Funding acquisition: Daniel Masiga.
Investigation: Festus Nyasimi, Caleb Kipkurui Kibet.
Methodology: Caleb Kipkurui Kibet.
Project administration: Ruth Nanjala, Caleb Kipkurui Kibet.
Resources: Daniel Masiga.
Supervision: Ruth Nanjala, Festus Nyasimi, Daniel Masiga, Caleb Kipkurui Kibet.
Validation: Caleb Kipkurui Kibet.
Writing – original draft: Ruth Nanjala.
Writing – review & editing: Ruth Nanjala, Festus Nyasimi, Daniel Masiga, Caleb Kipkurui
Kibet.
References
1.
Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, et al. Initial sequencing and analysis
of the human genome. Nature. 2001; 409(6822):860–921. https://doi.org/10.1038/35057062 PMID:
11237011
2.
Bishop ÖT, Adebiyi EF, Alzohairy AM, Everett D, Ghedira K, Ghouila A, et al. Bioinformatics education - perspectives and challenges out of Africa. Brief Bioinform. 2015 Mar; 16(2):355–364. https://doi.org/10.1093/bib/bbu022 PMID: 24990350
3.
Wilson Sayres MA, Hauser C, Sierk M, Robic S, Rosenwald AG, Smith TM, et al. Bioinformatics core
competencies for undergraduate life sciences education. PLoS ONE. 2018 Jun; 13(6). https://doi.org/10.1371/journal.pone.0196878 PMID: 29870542
4.
Ahmed AE, Awadallah AA, Tagelsir M, Suliman MA, Eltigani A, Elsafi H, et al. Delivering blended bioinformatics training in resource-limited settings: A case study on the University of Khartoum H3ABioNet
node. Brief Bioinform. 2020; 21(2):719–728. https://doi.org/10.1093/bib/bbz004 PMID: 30773584
5.
Ras V, Botha G, Aron S, Lennard K, Allali I, Weitz SC, et al. Using a multiple-delivery-mode training
approach to develop local capacity and infrastructure for advanced bioinformatics in Africa. PLoS Comput Biol. 2021; 17(2):1–11. https://doi.org/10.1371/journal.pcbi.1008640 PMID: 33630830
6.
Mulder NJ, Adebiyi E, Alami R, Benkahla A, Brandful J, Doumbia S, et al. H3ABioNet, a sustainable
pan-African bioinformatics network for human heredity and health in Africa. Genome Res. 2016; 26(2):271–277. https://doi.org/10.1101/gr.196295.115 PMID: 26627985
7.
Jjingo D, Mboowa G, Sserwadda I, Kakaire R, Kiberu D, Amujal M, et al. Bioinformatics mentorship in a
resource limited setting. Brief Bioinform. 2022 Jan; 23(1):1–8.
8.
Brunet M. Short note: The track of a new cradle of mankind in Sahelo-Saharan Africa (Chad, Libya,
Egypt, Cameroon). J African Earth Sci. 2010 Nov; 58(4):680–683.
9.
Rotimi C, Abayomi A, Abimiku A, Adabayeri VM, Adebamowo C, Adebiyi E, et al. Research capacity.
Enabling the genomic revolution in Africa. Science. 2014; 344(6190):1346–1348.
10.
Karikari TK. Bioinformatics in Africa: The Rise of Ghana? PLoS Comput Biol. 2015;11(9).
11.
Mulder NJ, Christoffels A, de Oliveira T, Gamieldien J, Hazelhurst S, Joubert F, et al. The Development
of Computational Biology in South Africa: Successes Achieved and Lessons Learnt. PLoS Comput Biol.
2016; 12(2). https://doi.org/10.1371/journal.pcbi.1004395 PMID: 26845152
12.
Shaffer JG, Mather FJ, Wele M, Li J, Tangara CO, Kassogue Y, et al. Expanding research capacity in
sub-Saharan Africa through informatics, bioinformatics, and data science training programs in Mali.
Front Genet. 2019; 10(APR):1–13. https://doi.org/10.3389/fgene.2019.00331 PMID: 31031807
13.
Fatumo S, Ebenezer TE, Ekenna C, Isewon I, Ahmad U, Adetunji C, et al. The Nigerian Bioinformatics
and Genomics Network (NBGN): A collaborative platform to advance bioinformatics and genomics in
Nigeria. Glob Health Epidemiol Genom. 2020;5. https://doi.org/10.1017/gheg.2020.3 PMID: 32742665
14.
Adoga MP, Fatumo SA, Agwale SM. H3Africa: A tipping point for a revolution in bioinformatics, genomics and health research in Africa. Source Code Biol Med. 2014; 9(1). https://doi.org/10.1186/1751-0473-9-10 PMID: 24829612
15.
EANBiT. Eastern Africa Network for Bioinformatics Training. [cited 2022 Sep 8]; Available from: https://eanbit.icipe.org/
16.
Makerere University. Nurturing Genomics & Bioinformatics Research Capacity in Africa [Internet]. Available from: https://h3africa.org/index.php/consortium/nurturing-genomics-and-bioinformatics-research-capacity-in-africa-breca/ [cited 2020 Sep 23].
17.
Emery LR, Morgan SL. The application of project-based learning in bioinformatics training. PLoS Comput Biol. 2017; 13(8):e1005620. https://doi.org/10.1371/journal.pcbi.1005620 PMID: 28817584
18.
Fernandes P. International Society for Computational Biology (ISCB). Dictionary of Bioinformatics and
Computational Biology. 2004.
19.
Harris PA, Taylor R, Minor BL, Elliott V, Fernandez M, O’Neal L, et al. The REDCap consortium: Building an international community of software platform partners. J Biomed Inform [Internet]. 2019 Jul 1
[cited 2023 Jan 21];95. Available from: https://pubmed.ncbi.nlm.nih.gov/31078660/ https://doi.org/10.1016/j.jbi.2019.103208 PMID: 31078660
20.
Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture
(REDCap)-A metadata-driven methodology and workflow process for providing translational research
informatics support. J Biomed Inform. 2009 Apr; 42(2):377–381. https://doi.org/10.1016/j.jbi.2008.08.010 PMID: 18929686
21.
Yu U, Lee SH, Kim YJ, Kim S. Bioinformatics in the Post-genome Era. J Biochem Mol Biol. 2004; 37(1):75–82. https://doi.org/10.5483/bmbrep.2004.37.1.075 PMID: 14761305
22.
Anupama J, Francescatto M, Rahman F, Fatima N, DeBlasio D, Shanmugam AK, et al. The ISCB Student Council Internship Program: Expanding computational biology capacity worldwide. PLoS Comput
Biol. 2018; 14(1). https://doi.org/10.1371/journal.pcbi.1005802 PMID: 29346365
23.
Guston DH. Mentorship and the Research Training Experience. Responsible Science: Ensuring the
Integrity of the Research Process: Volume II [Internet]. 1993. Available from: https://www.ncbi.nlm.nih.gov/books/NBK236193/
24.
Peng RD. Reproducible research in computational science. Science. 2011 Dec; 334(6060):1226–1227. https://doi.org/10.1126/science.1213847 PMID: 22144613
25.
Bhattacharya S. Journal club and post-graduate medical education. Indian J Plast Surg. 2017 Sep; 50(3):302–305. https://doi.org/10.4103/ijps.IJPS_222_17 PMID: 29618866
26.
Bell S. Project-Based Learning for the 21st Century: Skills for the Future. Clear House A J Educ Strateg
Issues Ideas [Internet]. 2010; 83(2):39–43. Available from: https://www.tandfonline.com/action/journalInformation?journalCode=vtch20.
27.
de la Torre-Neches B, Rubia-Avi M, Aparicio-Herguedas JL, Rodrı́guez-Medina J. Project-based learning: an analysis of cooperation and evaluation as the axes of its dynamic. Humanit Soc Sci Commun.
2020 Dec; 7(1):1–7.
28.
Kafai YB, Proctor C. A Revaluation of Computational Thinking in K–12 Education: Moving Toward
Computational Literacies. Educ Res. 2022; 51(2):146–151.
29.
Dudley JT, Butte AJ. A quick guide for developing effective bioinformatics programming skills. PLoS
Comput Biol. 2009; 5(12):e1000589. https://doi.org/10.1371/journal.pcbi.1000589 PMID: 20041221
30.
Umarji M, Seaman C, Koru AG, Liu H. Software engineering education for bioinformatics. Proceedings 22nd Conference on Software Engineering Education and Training, CSEET 2009. IEEE Computer
Society; 2009. p. 216–23.
31.
McKiernan EC, Bourne PE, Brown CT, Buck S, Kenall A, Lin J, et al. How open science helps researchers succeed. eLife. 2016; 5:e16800. https://doi.org/10.7554/eLife.16800 PMID: 27387362