A from-scratch rewrite of the Drug-Gene Interaction Database.
First, make sure you have all of the following installed:
Clone and enter the repository:
git clone https://github.com/dgidb/dgidb-v5
cd dgidb-v5
First, you may need to switch your Ruby version with RVM to match the version declared in the first few lines of the Gemfile. For example, to switch to version 3.1.0:
rvm install 3.1.0
rvm 3.1.0
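Rather than copying the version by hand, it can be read from the Gemfile. The following is a minimal sketch: `gemfile_ruby_version` is a hypothetical helper, not part of the repository, and it assumes the Gemfile declares an exact version in the form `ruby "3.1.0"`. A constraint such as `"~> 3.1"` will not match.

```shell
# Hypothetical helper: print the Ruby version declared in a Gemfile.
# Defaults to ./Gemfile; assumes a line of the form: ruby "3.1.0"
# (single or double quotes). Prints nothing if no such line exists.
gemfile_ruby_version() {
  gemfile="${1:-Gemfile}"
  sed -n "s/^[[:space:]]*ruby[[:space:]]*[\"']\([0-9][0-9.]*\)[\"'].*/\1/p" "$gemfile" | head -n 1
}
```

Assuming the Gemfile lives in the server subdirectory, `rvm install "$(gemfile_ruby_version server/Gemfile)"` from the repo root would then install the declared version.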
From the repo root, enter the server subdirectory:
cd server
If RVM is properly installed, expect a warning message like this:
RVM used your Gemfile for selecting Ruby, it is all fine - Heroku does that too,
you can ignore these warnings with 'rvm rvmrc warning ignore ./Gemfile'.
To ignore the warning for all files run 'rvm rvmrc warning ignore allGemfiles'.
Next, install Rails and other required gems with bundle:
bundle install
The server will need a running Postgres instance. Postgres start commands may vary based on your OS and processor type. The following should work on M1 Macs:
pg_ctl -D /opt/homebrew/var/postgres start
# On older Macs you may need to use a different path instead, e.g. "pg_ctl -D /usr/local/var/postgres start"
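Because the data directory differs between machines, one option is to pick whichever of the two paths above exists. This is only a sketch: `first_existing_dir` is a hypothetical helper, and your Postgres install may use a different location entirely.

```shell
# Hypothetical helper: print the first argument that is an existing directory.
# Returns non-zero (and prints nothing) if none of the candidates exist.
first_existing_dir() {
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}
```

With that defined, `pg_ctl -D "$(first_existing_dir /opt/homebrew/var/postgres /usr/local/var/postgres)" start` would cover both Mac layouts mentioned here.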
The database must be created manually. The exact command will also vary, but it should look something like this:
createdb -U postgres dgidb
Next, back in the main shell, import a database dump file (ask on Slack if you need the latest file):
psql -d dgidb -f dgidb_dump_20220526.psql # provide path to data dump
That should take a few minutes. Finally, start the Rails server:
rails s
Navigate to localhost:3000/api/graphiql in your browser. If the example query provided runs successfully, then you're all set.
To perform a data load from scratch, first run the reset task to provide a clean, seeded DB:
rake db:reset
Some Python libraries are required for importing data. From the repo root, create a Python virtual environment and install required dependencies:
python3 -m venv .venv
source .venv/bin/activate
pip install -r scripts/requirements.txt
A Python script is supplied to ensure that primary source data is available. This can also be used to acquire new versions of data that supply discrete releases (like ChEMBL):
python3 scripts/download_files.py
Then, load claims:
rake dgidb:import:all
Then, run grouping. See documentation for the therapy and gene normalizers for more.
By default, the groupers will expect a normalizer service to be running locally on port 8000; use the THERAPY_HOSTNAME and GENE_HOSTNAME environment variables to specify alternate hosts:
export THERAPY_HOSTNAME=http://localhost:7999
rake dgidb:group:drugs
export GENE_HOSTNAME=http://localhost:7998
rake dgidb:group:genes
rake dgidb:group:interactions
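Note that `export` keeps the override set for the rest of the shell session. If you would rather scope it to a single command, something like the following may help. `with_host` is a hypothetical wrapper around the standard `env` utility, not something the repository provides.

```shell
# Hypothetical wrapper: run one command with an environment variable set
# only for that command, leaving the current shell untouched.
# Usage: with_host VAR_NAME URL command [args...]
with_host() {
  var="$1"
  url="$2"
  shift 2
  env "$var=$url" "$@"
}
```

For example, `with_host THERAPY_HOSTNAME http://localhost:7999 rake dgidb:group:drugs` runs the drug grouper against the alternate host without changing the host seen by later commands.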
Finally, normalize remaining metadata:
rake dgidb:normalize:drug_approval_ratings
rake dgidb:normalize:drug_types
rake dgidb:normalize:populate_source_counters
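The rake steps from claim loading onward can be chained so that a failure stops the run early. The sketch below is one way to do that; `run_step`, `load_and_group`, and the `DRY_RUN` variable are hypothetical names introduced here. It assumes you run it from the directory where the rake tasks are available, that the database has already been reset and the source files downloaded, and that any normalizer host overrides are already exported.

```shell
# Print the command when DRY_RUN=1; otherwise execute it.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper: run the documented rake tasks in order,
# returning non-zero as soon as one of them fails.
load_and_group() {
  for task in \
    dgidb:import:all \
    dgidb:group:drugs \
    dgidb:group:genes \
    dgidb:group:interactions \
    dgidb:normalize:drug_approval_ratings \
    dgidb:normalize:drug_types \
    dgidb:normalize:populate_source_counters
  do
    run_step rake "$task" || return 1
  done
}
```

Running `DRY_RUN=1 load_and_group` first prints the seven commands without executing them, which is a cheap way to confirm the order before starting a long import.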
Navigate to the /client directory:
# from dgidb-v5 root
cd client
Install dependencies with yarn:
yarn install
Start the client:
yarn start
Frontend style is enforced by ESLint and Prettier, and conformance is checked by pre-commit. Before your first commit, run:
pre-commit install
In practice, Prettier does most of the formatting work needed to satisfy ESLint. Run the following to autoformat a file:
yarn run prettier --write path/to/file