Overview
The msigdbr R package provides Molecular Signatures Database (MSigDB) gene sets typically used with the Gene Set Enrichment Analysis (GSEA) software:
- in an R-friendly “tidy” format with one gene pair per row
- for multiple frequently studied model organisms, such as mouse, rat, pig, zebrafish, fly, and yeast, in addition to the original human genes
- as gene symbols as well as NCBI Entrez and Ensembl IDs
- without accessing external resources or requiring an active internet connection
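To make the "tidy" layout concrete, here is a minimal sketch that uses a small hand-built data frame in place of real package output. The column names (gs_name, gene_symbol, entrez_gene, ensembl_gene) follow those used by msigdbr, but the gene set names and memberships below are made up for illustration and are not real MSigDB content.

```r
# Toy stand-in for msigdbr output. Gene set names and memberships are
# hypothetical; only the column layout is meant to mirror the package.
toy_genesets <- data.frame(
  gs_name      = c("TOY_SET_A", "TOY_SET_A", "TOY_SET_B"),
  gene_symbol  = c("TP53", "MYC", "BRCA1"),
  entrez_gene  = c(7157L, 4609L, 672L),
  ensembl_gene = c("ENSG00000141510", "ENSG00000136997", "ENSG00000012048"),
  stringsAsFactors = FALSE
)

# Each row records one gene's membership in one gene set,
# so the size of a gene set is simply its number of rows.
set_sizes <- table(toy_genesets$gs_name)
print(set_sizes)
```

Because every identifier type sits in its own column, switching from gene symbols to Entrez or Ensembl IDs only means selecting a different column.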
Installation
The package can be installed from CRAN.
{r} install.packages("msigdbr")
Releases that are not available on CRAN can be installed from GitHub (a specific release or version can be selected with the ref argument):
{r} remotes::install_github("igordot/msigdbr", ref = "v2022.1.1")
Usage
The package data can be accessed using the msigdbr() function, which returns a data frame of gene sets and their member genes. For example, you can retrieve mouse genes from the C2 (curated) CGP (chemical and genetic perturbations) gene sets.
{r} library(msigdbr)
genesets = msigdbr(species = "mouse", category = "C2", subcategory = "CGP")
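Downstream enrichment tools often expect gene sets either as a named list of gene vectors or as a two-column table of set and gene. The sketch below shows both conversions in base R. It runs on a toy data frame (genesets_toy, a hypothetical stand-in with made-up set names and memberships) so that it does not depend on the package being installed; with real data, the genesets object from the example above would take its place.

```r
# Toy stand-in for the data frame returned by msigdbr().
# Gene set names and memberships are hypothetical.
genesets_toy <- data.frame(
  gs_name     = c("TOY_SET_A", "TOY_SET_A", "TOY_SET_B", "TOY_SET_B"),
  gene_symbol = c("Trp53", "Myc", "Brca1", "Myc"),
  stringsAsFactors = FALSE
)

# Named list: one character vector of gene symbols per gene set.
genesets_list <- split(x = genesets_toy$gene_symbol, f = genesets_toy$gs_name)

# Two-column table of (gene set, gene) pairs with duplicate rows removed.
term2gene <- unique(genesets_toy[, c("gs_name", "gene_symbol")])

str(genesets_list)
```

Which form to use depends on the analysis package, so check that tool's documentation for the expected input.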
Check the documentation website for more information.