Drugs and Targets for Training

DrugE-Rank is trained on 1324 human protein targets and 1242 FDA-approved drugs, taken from DrugBank as of the end of 2015, with 5845 known interactions between them.

Methods

DrugE-Rank: Improving Drug-Target Interaction Prediction of New Candidate Drugs or Targets by Ensemble Learning to Rank

Learning to rank (LTR) is the most powerful known technique among feature-based methods, while similarity-based methods are well accepted because they connect the chemical and genomic spaces, represented by drug and target similarities, respectively. We propose a new method, DrugE-Rank, that improves drug-target interaction prediction by combining the advantages of these two types of methods. Specifically, DrugE-Rank applies LTR, with multiple well-known similarity-based methods serving as the components of ensemble learning.
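As a rough illustration of how the component methods could feed a ranking model, the sketch below collects the scores of several components into one feature vector per drug-target pair and ranks the candidate targets of a single drug. The function names, the example scores, and the fixed linear weights are all hypothetical; the weights only stand in for a trained LTR model and are not DrugE-Rank's actual ranker.

```python
def build_feature_vectors(component_scores, drug, targets):
    """Return one feature vector per candidate target of `drug`.

    component_scores maps a method name to {(drug, target): score}.
    Features are ordered by sorted method name so the layout is stable.
    """
    names = sorted(component_scores)
    return {t: [component_scores[m][(drug, t)] for m in names] for t in targets}


def rank_targets(features, weights):
    """Rank targets by a linear score (placeholder for a learned LTR model)."""
    def score(target):
        return sum(w * x for w, x in zip(weights, features[target]))
    return sorted(features, key=score, reverse=True)


# Hypothetical component outputs for one new drug "d1" and three targets.
component_scores = {
    "kNN":     {("d1", "t1"): 0.9, ("d1", "t2"): 0.2, ("d1", "t3"): 0.4},
    "LapRLS":  {("d1", "t1"): 0.7, ("d1", "t2"): 0.1, ("d1", "t3"): 0.8},
    "WNN-GIP": {("d1", "t1"): 0.6, ("d1", "t2"): 0.3, ("d1", "t3"): 0.5},
}

features = build_feature_vectors(component_scores, "d1", ["t1", "t2", "t3"])
ranking = rank_targets(features, weights=[1.0, 1.0, 1.0])
print(ranking)
```

In a real setting the weights would be replaced by a model trained on known interactions, with each drug (or target) treated as one query and its candidate partners as the items to rank.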

kNN: K-Nearest Neighbors

The basic idea of the kNN model is to use the closest data points to make an estimate for each new instance. It therefore exploits local information and forms non-linear, adaptive classification boundaries for each new data point at prediction time.
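The idea can be sketched for a single new drug as follows. This is a minimal illustration, assuming a similarity-weighted vote over the k most similar known drugs; the function name, the weighting scheme, and the toy data are assumptions, not the exact kNN variant used in DrugE-Rank.

```python
def knn_scores(sim_to_known, interactions, k=3):
    """Score every target for one new drug from its k nearest known drugs.

    sim_to_known[i]    : similarity between the new drug and known drug i
    interactions[i][j] : 1 if known drug i interacts with target j, else 0
    """
    # Indices of the k known drugs most similar to the new drug.
    neighbors = sorted(range(len(sim_to_known)),
                       key=lambda i: sim_to_known[i], reverse=True)[:k]
    total = sum(sim_to_known[i] for i in neighbors)
    n_targets = len(interactions[0])
    if total == 0:
        return [0.0] * n_targets
    # Similarity-weighted vote of the neighbors for each target.
    return [sum(sim_to_known[i] * interactions[i][j] for i in neighbors) / total
            for j in range(n_targets)]


sim = [0.9, 0.1, 0.6, 0.3]          # new drug vs. four known drugs
Y = [[1, 0, 0],
     [0, 1, 0],
     [1, 1, 0],
     [0, 0, 1]]                     # known drug-target interactions
scores = knn_scores(sim, Y, k=2)
print(scores)
```

Only the two most similar drugs contribute here, so a target that neither of them hits receives a score of zero regardless of the rest of the data.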

BLM-svc/svr: Bipartite Local Models with C-SVC and e-SVR SVM (Support Vector Machine)

BLM transforms the task into multiple binary classification/regression problems, one for each label, in order to make predictions for new drugs/targets. The predictions from these SVMs are then combined into the final estimation score. The SVM implementation used here is LIBSVM.
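The decomposition into one local model per label can be sketched as below for a new drug. To keep the example dependency-light, the local learner is a small kernel ridge regressor written in NumPy; this is a swapped-in stand-in for the C-SVC/e-SVR models from LIBSVM, not the learner DrugE-Rank uses. All function names and the toy data are assumptions.

```python
import numpy as np


def kernel_ridge(K, y, k_new, lam=1.0):
    """Stand-in local model: kernel ridge regression on a precomputed kernel."""
    alpha = np.linalg.solve(K + lam * np.eye(len(K)), y)
    return float(k_new @ alpha)


def blm_new_drug_scores(K_train, k_new, Y, fit_predict=kernel_ridge):
    """Train one local model per target and score the new drug against each.

    K_train : (n, n) drug similarity kernel among known drugs
    k_new   : (n,)   similarity of the new drug to each known drug
    Y       : (n, m) known interactions (rows: drugs, columns: targets)
    """
    scores = np.zeros(Y.shape[1])
    for j in range(Y.shape[1]):
        # The labels of local problem j are the interactions of target j.
        scores[j] = fit_predict(K_train, Y[:, j], k_new)
    return scores


K = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
Y = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
k_new = np.array([0.7, 0.9, 0.1])
blm_scores = blm_new_drug_scores(K, k_new, Y)
print(blm_scores)
```

The same loop applies to a new target by swapping the roles of drugs and targets, that is, by using the target similarity kernel and the transposed interaction matrix.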

(Net)LapRLS: (Interaction Network based) Laplacian Regularized Least Squares

This is a semi-supervised learning method that makes use of unknown drug-target pairs during training, which gives it strong generalization capability. In NetLapRLS, a kernel based on known drug-target interactions is additionally introduced into LapRLS, so that global information in the interaction network is exploited.
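A minimal NumPy sketch of plain LapRLS follows, assuming the closed-form solution usually given for it, F = S (S + beta * L * S)^-1 Y, where L is the normalized graph Laplacian of the similarity matrix S, and assuming the drug-side and target-side predictions are averaged. The function names, the value of beta, and the toy matrices are assumptions; the network-based kernel of NetLapRLS is not included.

```python
import numpy as np


def normalized_laplacian(S):
    """L = I - D^(-1/2) S D^(-1/2), with D the diagonal degree matrix of S."""
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1))
    return np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]


def laprls_scores(S, Y, beta=0.3):
    """Assumed closed-form LapRLS prediction for one side (drugs or targets)."""
    L = normalized_laplacian(S)
    alpha = np.linalg.solve(S + beta * L @ S, Y)
    return S @ alpha


S_drug = np.array([[1.0, 0.8, 0.1],
                   [0.8, 1.0, 0.2],
                   [0.1, 0.2, 1.0]])
S_target = np.array([[1.0, 0.3],
                     [0.3, 1.0]])
Y = np.array([[1.0, 0.0],
              [0.0, 0.0],      # drug 2 has no known interactions
              [0.0, 1.0]])

F_drug = laprls_scores(S_drug, Y)            # smooth over the drug space
F_target = laprls_scores(S_target, Y.T).T    # smooth over the target space
F = (F_drug + F_target) / 2                  # combine both views
print(F)
```

Because the Laplacian term penalizes scores that differ between similar drugs, the unlabeled second drug inherits part of the score of the first drug, to which it is most similar.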

WNN-GIP: Weighted Nearest Neighbor based Gaussian Interaction Profile classifier

The interaction score profile for a new drug/target is computed by a weighted nearest neighbor algorithm. This profile is then used to construct a Gaussian interaction profile kernel for classification.
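The two steps can be sketched in plain Python as below. The rank-based decay weights in the first function and the bandwidth normalization by the mean squared profile norm in the second are assumptions about the details, and the function names and toy data are hypothetical.

```python
import math


def wnn_profile(sim_to_known, profiles, decay=0.8):
    """Infer an interaction profile for a new drug.

    Known drugs are ranked by similarity to the new drug; the drug at
    rank r (0-based) contributes its profile with weight decay ** r.
    """
    order = sorted(range(len(sim_to_known)),
                   key=lambda i: sim_to_known[i], reverse=True)
    profile = [0.0] * len(profiles[0])
    for rank, i in enumerate(order):
        weight = decay ** rank
        for j, value in enumerate(profiles[i]):
            profile[j] += weight * value
    return profile


def gip_kernel(profiles, gamma=1.0):
    """Gaussian kernel between interaction profiles."""
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in p) for p in profiles) / n
    g = gamma / mean_sq_norm if mean_sq_norm > 0 else gamma

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [[math.exp(-g * sq_dist(profiles[a], profiles[b])) for b in range(n)]
            for a in range(n)]


Y = [[1, 0, 0],
     [0, 1, 0],
     [1, 1, 0]]                       # profiles of three known drugs
sim = [0.2, 0.9, 0.5]                 # new drug vs. the known drugs
new_profile = wnn_profile(sim, Y, decay=0.5)
K = gip_kernel(Y + [new_profile])     # kernel now includes the new drug
print(new_profile)
```

With the inferred profile in place, the new drug gets a row in the kernel matrix like any known drug, so a kernel classifier can score it directly.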

Contact

If you encounter any problem or bug while using DrugE-Rank, please contact Qing-Jun Yuan (13210240043@fudan.edu.cn).

We highly appreciate your support and kindness.

References

Bleakley, K. and Yamanishi, Y. (2009). Supervised prediction of drug-target interactions using bipartite local models. Bioinformatics, 25(18), 2397-2403.

van Laarhoven, T. and Marchiori, E. (2013). Predicting drug-target interactions for new drug compounds using a weighted nearest neighbor profile. PLoS ONE, 8(6), e66952.

van Laarhoven, T., Nabuurs, S. B., and Marchiori, E. (2011). Gaussian interaction profile kernels for predicting drug-target interaction. Bioinformatics, 27(21), 3036-3043.

Ding, H., Takigawa, I., Mamitsuka, H., and Zhu, S. (2014). Similarity-based machine learning methods for predicting drug-target interactions: a brief review. Briefings in Bioinformatics, 15(5), 734-747.

Zheng, X., Ding, H., Mamitsuka, H., and Zhu, S. (2013). Collaborative matrix factorization with multiple similarities for predicting drug-target interactions. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1025-1033. ACM.

Law, V., Knox, C., Djoumbou, Y., Jewison, T., Guo, A. C., Liu, Y., Maciejewski, A., Arndt, D., Wilson, M., Neveu, V., et al. (2014). DrugBank 4.0: shedding new light on drug metabolism. Nucleic Acids Research, 42(D1), D1091-D1097.

Xia, Z., Zhou, X., Sun, Y., and Wu, L. (2009). Semi-supervised drug-protein interaction prediction from heterogeneous spaces. In The Third International Symposium on Optimization and Systems Biology, volume 11, pages 123-131. Citeseer.

Liu, T.-Y. (2009). Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3), 225-331.