- PyTorch V1.7
- DGL V0.5
- numpy
- sklearn
- drug.txt: list of drug names.
- protein.txt: list of protein names.
- disease.txt: list of disease names.
- se.txt: list of side effect names.
- drug_dict_map: a complete ID mapping between drug names and DrugBank ID.
- protein_dict_map: a complete ID mapping between protein names and UniProt ID.
- mat_drug_se.txt : Drug-SideEffect association matrix.
- mat_protein_protein.txt : Protein-Protein interaction matrix.
- mat_drug_drug.txt : Drug-Drug interaction matrix.
- mat_protein_disease.txt : Protein-Disease association matrix.
- mat_drug_disease.txt : Drug-Disease association matrix.
- mat_protein_drug.txt : Protein-Drug interaction matrix.
- mat_drug_protein.txt : Drug-Protein interaction matrix.
- Similarity_Matrix_Drugs.txt : Drug & compound similarity scores based on chemical structures of drugs ([0,708) are drugs, the rest are compounds).
- Similarity_Matrix_Proteins.txt : Protein similarity scores based on primary sequences of proteins.
- mat_drug_protein_homo_protein_drug.txt: Drug-Protein interaction matrix, in which DTIs with similar drugs (i.e., drug chemical structure similarities > 0.6) or similar proteins (i.e., protein sequence similarities > 40%) were removed (see the paper).
- DTI_newly.csv: 1040 new DTIs related to this dataset from the latest approved DTI dataset (V5.1.8, 2021-01-03) .