Drugs and Targets for Training

To train DrugE-Rank, 1324 human protein targets and 1242 FDA-approved drugs were collected from DrugBank as of the end of 2015, together with 5845 known interactions.


DrugE-Rank: Improving Drug-Target Interaction Prediction of New Candidate Drugs or Targets by Ensemble Learning to Rank

Learning to rank (LTR) is known as the most powerful of the feature-based methods, while similarity-based methods are well accepted because they connect the chemical and genomic spaces, represented by drug and target similarities, respectively. We propose a new method, DrugE-Rank, which improves drug-target interaction prediction by combining the advantages of these two types of methods. Specifically, DrugE-Rank applies LTR as an ensemble learner whose components are multiple well-known similarity-based methods.
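The ensemble idea can be illustrated with a minimal sketch: the scores produced by each similarity-based component become the feature vector of a candidate, and a ranking model orders the candidates. The function names, the example scores, and the fixed linear weights below are all assumptions made for illustration. A real LTR model would learn its scoring function from training data, and this section does not specify which LTR algorithm DrugE-Rank uses.

```python
def build_features(component_scores):
    """Turn per-method score lists into one feature vector per candidate."""
    return [list(row) for row in zip(*component_scores)]


def rank_candidates(features, weights):
    """Score candidates with a linear model and return indices, best first."""
    scores = [sum(w * x for w, x in zip(weights, row)) for row in features]
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return order, scores


# Hypothetical outputs of three component methods for four candidate
# targets of one new drug (values invented for illustration).
knn = [0.9, 0.1, 0.4, 0.3]
blm = [0.8, 0.2, 0.6, 0.1]
gip = [0.7, 0.3, 0.5, 0.2]

features = build_features([knn, blm, gip])
weights = [0.5, 0.3, 0.2]  # assumed; stands in for a learned ranking model
order, scores = rank_candidates(features, weights)
print(order)  # candidate indices, most likely interaction first
```

The linear scorer is only a stand-in. The point of the sketch is the data flow: component predictions go in as features, and a ranked candidate list comes out.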

kNN: K-Nearest Neighbors

The basic idea of the kNN model is to use the closest data points to make an estimate for each new instance. It therefore takes advantage of local information and forms non-linear, adaptive classification boundaries around each new data point at prediction time.
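As a minimal sketch of this idea for a new drug, the code below scores every target from the interaction rows of the k most similar known drugs. The similarity-weighted average, the function name, and the toy data are assumptions for illustration and are not necessarily the exact scheme used in DrugE-Rank. The sketch also assumes the selected similarities are positive.

```python
def knn_scores(sims, Y, k=2):
    """Predict target scores for a new drug.

    sims: similarity of the new drug to each known drug.
    Y:    known interaction matrix, one row per known drug,
          one column per target (1 = interaction, 0 = none/unknown).
    """
    # Indices of the k known drugs most similar to the new drug.
    nearest = sorted(range(len(sims)), key=lambda i: -sims[i])[:k]
    total = sum(sims[i] for i in nearest)
    n_targets = len(Y[0])
    # Similarity-weighted average of the neighbours' interaction rows.
    return [sum(sims[i] * Y[i][j] for i in nearest) / total
            for j in range(n_targets)]


# Toy data: three known drugs, two targets.
sims = [0.9, 0.2, 0.6]
Y = [[1, 0],
     [0, 1],
     [1, 1]]
scores = knn_scores(sims, Y, k=2)
print(scores)
```

Only the two nearest drugs (the first and the third) contribute here, so the prediction depends purely on local information around the new drug.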

BLM-svc/svr: Bipartite Local Models with C-SVC and epsilon-SVR SVM (Support Vector Machine)

With BLM, prediction for new drugs/targets is transformed into multiple binary classification/regression problems, one for each label. The predictions from these SVMs are then combined into the final score. The SVMs used here are implemented with LibSVM.

(Net)LapRLS: (Interaction Network based) Laplacian Regularized Least Squares

This is a semi-supervised learning method that also makes use of unknown drug-target pairs during training, which gives it strong generalization capability. In NetLapRLS, a kernel based on known drug-target interactions is additionally introduced into LapRLS, so that global information in the interaction network is exploited.

WNN-GIP: Weighted Nearest Neighbor based Gaussian Interaction Profile classifier

The interaction score profile of a new drug/target is computed by the weighted nearest neighbor algorithm. It is then used to construct a Gaussian interaction profile kernel for classification.
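The two steps can be sketched as follows. The rank-based decay weights and the bandwidth normalized by the mean squared profile norm follow our reading of the van Laarhoven papers listed in the references; the function names, the decay value of 0.5, and the toy data are assumptions for illustration only.

```python
import math


def wnn_profile(sims, Y, decay=0.5):
    """Infer an interaction profile for a new drug.

    Known drugs are sorted by similarity to the new drug; the r-th
    nearest one (r = 0, 1, ...) contributes its profile with weight
    decay**r.
    """
    order = sorted(range(len(sims)), key=lambda i: -sims[i])
    profile = [0.0] * len(Y[0])
    for rank, i in enumerate(order):
        weight = decay ** rank
        for j, y in enumerate(Y[i]):
            profile[j] += weight * y
    return profile


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian kernel between interaction profiles."""
    # Bandwidth is normalised by the mean squared norm of the profiles.
    mean_sq = sum(sum(v * v for v in p) for p in profiles) / len(profiles)
    gamma = gamma_prime / mean_sq

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [[math.exp(-gamma * sq_dist(a, b)) for b in profiles]
            for a in profiles]


# Toy data: two known drugs, three targets, one new drug.
Y = [[1, 0, 1],
     [0, 1, 0]]
sims = [0.2, 0.8]  # similarity of the new drug to each known drug
profile = wnn_profile(sims, Y)
K = gip_kernel(Y + [profile])
```

The last row of K holds the kernel values between the new drug and every drug, which is what a kernel classifier would consume in the classification step.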


If any problem or bug occurs during use, please contact Qing-Jun Yuan (13210240043@fudan.edu.cn).

We highly appreciate your support and kindness.


Bleakley, K. and Yamanishi, Y. (2009). Supervised prediction of drug-target interactions using bipartite local models. Bioinformatics, 25(18), 2397-2403.

van Laarhoven, T. and Marchiori, E. (2013). Predicting drug-target interactions for new drug compounds using a weighted nearest neighbor profile. PLoS ONE, 8(6), e66952.

van Laarhoven, T., Nabuurs, S. B., and Marchiori, E. (2011). Gaussian interaction profile kernels for predicting drug-target interaction. Bioinformatics, 27(21), 3036-3043.

Ding, H., Takigawa, I., Mamitsuka, H., and Zhu, S. (2014). Similarity-based machine learning methods for predicting drug-target interactions: a brief review. Briefings in Bioinformatics, 15(5), 734-747.

Zheng, X., Ding, H., Mamitsuka, H., and Zhu, S. (2013). Collaborative matrix factorization with multiple similarities for predicting drug-target interactions. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1025-1033. ACM.

Law, V., Knox, C., Djoumbou, Y., Jewison, T., Guo, A. C., Liu, Y., Maciejewski, A., Arndt, D., Wilson, M., Neveu, V., et al. (2014). DrugBank 4.0: shedding new light on drug metabolism. Nucleic Acids Research, 42(D1), D1091-D1097.

Xia, Z., Zhou, X., Sun, Y., and Wu, L. (2009). Semi-supervised drug-protein interaction prediction from heterogeneous spaces. In The Third International Symposium on Optimization and Systems Biology, volume 11, pages 123-131. Citeseer.

Liu, T.-Y. (2009). Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3), 225-331.