Datasets

Metal ion  Training dataset  Independent testing dataset (holo structures)  Testing dataset (apo structures)
Ca         Ca_training       Ca_independent_testing                         Ca_apo_testing
Co         Co_training       Co_independent_testing                         Co_apo_testing
Cu         Cu_training       Cu_independent_testing                         Cu_apo_testing
Fe         Fe_training       Fe_independent_testing                         Fe_apo_testing
Mg         Mg_training       Mg_independent_testing                         Mg_apo_testing
Mn         Mn_training       Mn_independent_testing                         Mn_apo_testing
Ni         Ni_training       Ni_independent_testing                         Ni_apo_testing
Zn         Zn_training       Zn_independent_testing                         Zn_apo_testing

Download all datasets (in .zip format)

Raw dataset download (only PDB identifiers and information)

Source Code

Please download the following two source code files for generating the model.

  • Framework (all_features.pl)
  • Generation of random forest (metal_predict.R)
  • Instructions

        1. MetalExplorer is written in R and Perl. To run it, please make sure that both R and Perl are installed on your computer.
        2. Third-party software required by MetalExplorer:
  • DSSP for structure analysis;
  • PSSM (position-specific scoring matrix) for conservation scores;
  • NACCESS for calculation of solvent accessibility;
  • DISOPRED for protein disorder prediction;
  • Biopython for calculation of solvent exposure;
  • Residue-residue contact for calculation of residue contacts and the contact network.
        3. Running MetalExplorer:
  • Before running MetalExplorer, please put all_features.pl, metal_predict.R and your input files in the same folder;
  • The input file should be named 'predict_PDB_SEQ'. It may contain multiple test examples, one per line. For example, a line can look like this:
    "Ca 1APN A ALA 1 26.967 23.385 25.386 1"
    The line above has nine columns separated by space characters: ion, PDB identifier, chain, amino acid, position of the residue in the original PDB file, X, Y and Z orthogonal coordinates (in Angstroms) of the residue's CA atom, and position of the residue in the PDB FASTA-formatted sequence. An example input file can be downloaded here.
  • Run 'perl all_features.pl' followed by the path of the folder where the result files should be written. For example, run 'perl all_features.pl ./' to write the result files to the current folder.
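
The nine-column input format described above can be checked before running the Perl script. The following is a minimal Python sketch, not part of MetalExplorer: the names InputRecord, parse_line and METAL_IONS are hypothetical, and restricting the ion column to the eight metals listed in the Datasets table is an assumption, as is keeping the PDB position as text.

```python
from typing import NamedTuple

# Assumption: only the eight ions listed in the Datasets table are accepted.
METAL_IONS = {"Ca", "Co", "Cu", "Fe", "Mg", "Mn", "Ni", "Zn"}


class InputRecord(NamedTuple):
    """One line of a 'predict_PDB_SEQ' input file (hypothetical helper type)."""
    ion: str
    pdb_id: str
    chain: str
    residue: str
    pdb_position: str  # assumption: kept as text, since PDB numbering may not be a plain integer
    ca_x: float        # CA atom X coordinate, in Angstroms
    ca_y: float        # CA atom Y coordinate, in Angstroms
    ca_z: float        # CA atom Z coordinate, in Angstroms
    seq_position: int  # position in the PDB FASTA-formatted sequence


def parse_line(line: str) -> InputRecord:
    """Split one input line on whitespace and validate its nine columns."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, found {len(fields)}: {line!r}")
    ion, pdb_id, chain, residue, pdb_pos, x, y, z, seq_pos = fields
    if ion not in METAL_IONS:
        raise ValueError(f"unsupported metal ion: {ion!r}")
    # float() and int() raise ValueError themselves on malformed numbers.
    return InputRecord(ion, pdb_id, chain, residue, pdb_pos,
                       float(x), float(y), float(z), int(seq_pos))


record = parse_line("Ca 1APN A ALA 1 26.967 23.385 25.386 1")
print(record.ion, record.pdb_id, record.ca_x)  # prints: Ca 1APN 26.967
```

Applying parse_line to every non-empty line of 'predict_PDB_SEQ' would catch missing columns or malformed coordinates before the feature-generation step is started.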