Welcome

Pan-specific methods are those that, in principle, can make predictions for any allele with a known protein sequence. In other words, they learn from experimental data covering a limited set of alleles, yet can accurately predict peptide binding to many more alleles, including those with little or no experimental binding data. As a result, accurate predictions for novel alleles become available simply and cheaply. Given the high diversity of MHC molecules, this breakthrough in allele coverage, combined with reasonable prediction accuracy, makes pan-specific methods the most promising solutions for peptide binding prediction.
Some studies have also addressed the prediction of other events in the antigen processing and presentation pathway, such as proteasomal cleavage and TAP transport for MHC-I. Generally speaking, binding prediction by itself can identify possible epitopes quite accurately, but higher accuracy can be achieved by incorporating predictions that model these two pathway events. One example is NetCTLpan, which improves on the original NetCTL and on MHC-pathway, another state-of-the-art pathway prediction method. It is an MHC-I pathway epitope discovery tool that integrates predictions of cleavage sites, TAP transport efficiency and MHC binding.
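
As a rough illustration of how such pathway integration can work, the sketch below combines three per-peptide scores into a single ranking score by a weighted sum. The function names, the weights and the example scores are hypothetical placeholders chosen for illustration; they are not the published NetCTLpan parameters, and real tools may rescale each component before combining them.

```python
def combined_epitope_score(binding, cleavage, tap, w_cleavage=0.2, w_tap=0.05):
    """Weighted sum of MHC binding, cleavage and TAP transport scores.

    The weights are illustrative placeholders, not published values.
    """
    return binding + w_cleavage * cleavage + w_tap * tap


def rank_candidates(candidates):
    """Sort (peptide, binding, cleavage, tap) tuples by combined score, best first."""
    return sorted(
        candidates,
        key=lambda c: combined_epitope_score(c[1], c[2], c[3]),
        reverse=True,
    )


# Hypothetical scores: PEPTIDE_A binds slightly better, but PEPTIDE_B is
# much more likely to be cleaved and transported.
candidates = [
    ("PEPTIDE_A", 0.60, 0.10, 0.20),
    ("PEPTIDE_B", 0.55, 0.90, 0.80),
]

for peptide, binding, cleavage, tap in rank_candidates(candidates):
    print(peptide, combined_epitope_score(binding, cleavage, tap))
```

With these made-up inputs, ranking by binding alone would place PEPTIDE_A first, whereas the combined score favours PEPTIDE_B, which is the kind of reordering that pathway integration is meant to capture.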

PanReview: Zhang L, Udaka K, Mamitsuka H, Zhu S. Toward More Accurate Pan-Specific MHC-Peptide Binding Prediction: A Review of Current Methods and Tools. Brief Bioinform. 2012 May;13(3):350-64. Epub 2011 Sep 22. (Please cite it if you find it useful.)

Pan-Specific Tools

Table 1. Overview of MHC-I Pan-Specific Tools

WebServer | Output Score | Model/Method | Coverage | Ref
ADT | Lower, Binder | Adaptive Double Threading (ADT) | HLA-A,B,C | [1]
KISS | Higher, Binder | Kernel-based SVM | HLA-A,B,C | [2]
NetMHCpan | Higher (t=0.426), Binder | ANN | HLA-A,B,C and non-humans | [3,4]
PickPocket | Higher, Binder | PSSM | HLA-A,B,C and non-humans | [5]
MULTIPRED2-I | Higher, Binder | Integration & NetMHCpan | HLA-A,B,C | [6]

Table 2. Overview of MHC-II Pan-Specific Tools

WebServer | Output Score | Model/Method | Coverage | Ref
TEPITOPE/Propred | Higher, better | PSSM | 50 HLA-DRs | [7,8]
TEPITOPEpan | Higher, better | PSSM | 700 HLA-DRs | [21]
SIADT | Lower, Binder | Shift-Invariant ADT | HLAs and non-humans | [9]
MultiRTA | Higher, Binder | RTA | HLA-DRs,DPs | [10]
NetMHCIIpan-1.0 | Higher (t=0.426), Binder | ANN & SMM-align | HLA-DRs | [11]
NetMHCIIpan-2.0 | Higher (t=0.426), Binder | ANN & NN-align | HLA-DRs | [12]
NetMHCIIpan-3.0 | Higher (t=0.426), Binder | ANN & NN-align | HLA-DRs,DPs,DQs,H-2 | [20]
MULTIPRED2-II | Higher, Binder | Integration & NetMHCIIpan | HLA-DRs | [6]
MHC2SKpan | IC50, Lower, Binder | Kernel-based method | HLA-DRs | [22]

The Output Score column indicates whether a higher or a lower prediction score denotes a stronger binder. Where given, (t=0.426) is the score threshold separating binders from non-binders. In addition, methods such as MultiRTA and NetMHCIIpan (1.0, 2.0) also report IC50 binding affinities directly. The commonly used IC50 threshold is 500 nM, and a lower value indicates a stronger binder.
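
The score threshold t=0.426 and the 500 nM cutoff describe the same boundary if one assumes the log-transform commonly described for the NetMHC family of tools, score = 1 - log(IC50)/log(50000). The following is a minimal sketch under that assumption; the function names are illustrative, and other tools in the tables above may scale their scores differently.

```python
import math

# Assumed upper bound (in nM) of the NetMHC-style log-transform.
MAX_IC50_NM = 50000.0


def ic50_to_score(ic50_nm):
    """Map an IC50 in nM to a 0..1 score; higher means stronger binding."""
    clipped = min(max(ic50_nm, 1.0), MAX_IC50_NM)
    return 1.0 - math.log(clipped) / math.log(MAX_IC50_NM)


def score_to_ic50(score):
    """Inverse of ic50_to_score: map a 0..1 score back to an IC50 in nM."""
    return MAX_IC50_NM ** (1.0 - score)


def is_binder(ic50_nm, threshold_nm=500.0):
    """Apply the commonly used 500 nM cutoff; lower IC50 means stronger binder."""
    return ic50_nm <= threshold_nm


print(round(ic50_to_score(500.0), 3))  # 0.426
```

Under this transform, comparing a score against 0.426 (higher is a binder) is equivalent to comparing the IC50 against 500 nM (lower is a binder).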

Major Data Sources

A. Databases

Database | Brief Description
IMGT/HLA | HLA sequences
Immune Epitope Database (IEDB) | MHC-binding peptides, ligands and epitopes
SYFPEITHI | MHC ligands and peptide motifs
IPD-MHC | MHC sequences of diverse non-human species
Protein Data Bank (PDB) | Experimentally determined protein and complex structures
MHCBN | MHC/TAP binding peptides and T cell epitopes
EpiMHC | MHC ligands
UniProt | Protein sequence and functional information
LANL-HIV | MHC/TAP binding peptides and T cell epitopes

B. Datasets

Dataset | Brief Description | Class | Year | Ref
Zhang Datasets (local) | HLA binding and ligand datasets from IEDB and SYFPEITHI | I | 2009 | [13]
Hoof Datasets (local) | Non-human binding dataset and HLA ligand datasets from in-house data, IEDB and SYFPEITHI | I | 2008 | [4]
Peters Datasets (local) | MHC-I binding data from IEDB for five species besides human | I | 2006 | [14]
Heckerman Datasets (local) | Binary data from SYFPEITHI, MHCBN, LANL | I | 2006 | [15]
Wang-1 Datasets (local) | MHC-II binding peptides, PDB structures, Th cell epitopes | II | 2008 | [16]
Wang-2 Datasets (local) | Experimentally measured binding affinities covering 26 HLA and mouse alleles | II | 2010 | [17]
Nielsen-1.0 Datasets (local) | Quantitative peptide binding data used in NetMHCIIpan-1.0 | II | 2008 | [11]
Nielsen-2.0 Datasets (local) | Quantitative peptide binding data, SYFPEITHI ligand data and IEDB epitope data used in NetMHCIIpan-2.0 | II | 2010 | [12]
DFRMLI Datasets (local) | Preprocessed and scaled immunological binding and ligand datasets for machine learning | I and II | 2008 | [18,19]

The plural "Datasets" in each entry of table B above denotes that more than one group or type of data was compiled; e.g., Zhang Datasets incorporates Zhang's human dataset and Zhang's non-human dataset.

Pan-I and Pan-II Datasets

S1. The Pan-I dataset contains 208 epitopes of 28 HLA-I alleles with their source proteins. For each ligand, we give its restricting MHC allele, amino acid sequence, original GenBank ID and source protein.

S2. The Pan-II independent dataset compiles close to 6000 data points (5997) covering 17 HLA-DR alleles, retrieved from IEDB.

References

  1. ADT: Jojic N, et al. 2006. Learning MHC-I Peptide Binding. Bioinformatics, 22:e227-235.
  2. KISS: Jacob L, Vert JP. 2007. Efficient peptide-MHC-I binding prediction for alleles with few known binders. Bioinformatics, 24(3):358-66.
  3. NetMHCpan: Nielsen M, et al. 2007. NetMHCpan, a method for quantitative predictions of peptide binding to any HLA-A and -B locus protein of known sequence. PLoS One, 2(8):e796.
  4. Hoof I, et al. 2008. NetMHCpan, a method for MHC class I binding prediction beyond humans. Immunogenetics, 61(1):1-13.
  5. PickPocket: Zhang H, Lund O, Nielsen M. 2009. The PickPocket method for predicting binding specificities for receptors based on receptor pocket similarities: application to MHC-peptide binding. Bioinformatics, 25(10):1293-9.
  6. MULTIPRED2: Zhang GL, et al. 2010. MULTIPRED2: A computational system for large-scale identification of peptides predicted to bind to HLA supertypes and alleles. J Immunol Methods, [Epub ahead of print].
  7. TEPITOPE: Sturniolo T, Bono E, Ding J, et al. 1999. Generation of tissue-specific and promiscuous HLA ligand database using DNA microarrays and virtual HLA class II matrices. Nat Biotechnol, 17:555-561.
  8. Propred: Singh H, Raghava GP. 2001. ProPred: prediction of HLA-DR binding sites. Bioinformatics, 17(12):1236-7.
  9. SIADT: Zaitlen N, et al. 2008. Shift-invariant adaptive double threading: learning MHC II-peptide binding. Journal of Computational Biology, 15(7):927-42.
  10. MultiRTA: Bordner AJ, Mittelmann HD. 2010. MultiRTA: a simple yet reliable method for predicting peptide binding affinities for multiple class II MHC allotypes. BMC Bioinformatics, 11:482.
  11. NetMHCIIpan-1.0: Nielsen M, et al. 2008. Quantitative predictions of peptide binding to any HLA-DR molecule of known sequence: NetMHCIIpan. PLoS Comput Biol., 4(7):e1000107.
  12. NetMHCIIpan-2.0: Nielsen M, et al. 2010. NetMHCIIpan-2.0 - Improved pan-specific HLA-DR predictions using a novel concurrent alignment and weight optimization training procedure. Immunome Res., 6:9.
  13. Zhang H, Lundegaard C, Nielsen M. 2008. Pan-specific MHC class I predictors: a benchmark of HLA class I pan-specific prediction methods. Bioinformatics, 25(1):83-9.
  14. Peters B, et al. 2006. A community resource benchmarking predictions of peptide binding to MHC-I molecules. PLoS Comput Biol., 2(6):e65.
  15. Heckerman D, et al. 2006. Leveraging information across HLA alleles/super-types improves HLA-specific epitope prediction. J Comput Biol., 14(6):736-46.
  16. Wang P, et al. 2008. A systematic assessment of MHC class II peptide binding predictions and evaluation of a consensus approach. PLoS Comput Biol., 4(4):e1000048.
  17. Wang P, et al. 2010. Peptide binding predictions for HLA DR, DP and DQ molecules. BMC Bioinformatics, 11:568.
  18. Lin HH, et al. 2008. Evaluation of MHC class I peptide binding prediction servers: applications for vaccine research. BMC Immunol, 9:8.
  19. Lin HH, et al. 2008. Evaluation of MHC-II peptide binding prediction servers: applications for vaccine research. BMC Bioinformatics, 9 Suppl 12:S22.
  20. NetMHCIIpan-3.0: Karosiene E, et al. 2013. NetMHCIIpan-3.0, a common pan-specific MHC class II prediction method including all three human MHC class II isotypes, HLA-DR, HLA-DP and HLA-DQ. Immunogenetics, 65(10):711-24.
  21. TEPITOPEpan: Zhang L, et al. 2012. TEPITOPEpan: extending TEPITOPE for peptide binding prediction covering over 700 HLA-DR molecules. PLoS One, 7(2):e30483.
  22. MHC2SKpan: Guo, et al. 2013. MHC2SKpan: a novel kernel based approach for pan-specific MHC class II peptide binding prediction. BMC Genomics, 14(Suppl 5):S11.