(Note: These tutorials are meant to provide illustrative examples of how to use the AMBER software suite to carry out simulations that can be run on a simple workstation in a reasonable period of time. They do not necessarily provide the optimal choice of parameters or methods for the particular application area.)

Copyright Ross Walker 2006

AMBER ADVANCED TUTORIALS
TUTORIAL 3
MM-PBSA
Perl Version By Ross Walker & Thomas Steinbrecher
Python Version By Dwight McGee, Bill Miller III, & Jason Swails

In this tutorial we will use the MM-PBSA method to calculate the binding free energy for the association of two proteins. The overall objective of the MM-PBSA method and its complementary MM-GBSA method is to calculate the free energy difference between two states, which most often represent the bound and unbound states of two solvated molecules, or alternatively to compare the free energies of two different solvated conformations of the same molecule.

Ideally we would like to calculate this free energy of binding directly as shown in the figure below:

However, in such a simulation of these solvated states the majority of the energy contributions would come from solvent-solvent interactions, and the fluctuations in total energy would be an order of magnitude larger than the binding energy. Thus the calculation would take an inordinate amount of time to converge.
A more effective method is therefore to divide up the calculation according to the following thermodynamic cycle:

From this diagram, the binding free energy delta-G bind,solv can be calculated by:

In the MM-PBSA approach the different contributions to the binding free energy above are calculated in various ways:

Solvation free energies are calculated by solving either the linearised Poisson-Boltzmann or the Generalized Born equation for each of the three states (this gives the electrostatic contribution to the solvation free energy) and adding an empirical term for hydrophobic contributions:

delta-G vacuum is obtained by calculating the average interaction energy between receptor and ligand and, if necessary or desired, taking the entropy change upon binding into account:

The entropy contribution can be found by performing normal mode analysis on the three species, but in practice entropy contributions can be neglected if only a comparison of states of similar entropy is desired, such as two ligands binding to the same protein. The reason for this is that normal mode analysis calculations are computationally expensive and tend to have a large margin of error that introduces significant uncertainty in the result.

The average interaction energies of receptor and ligand are usually obtained by performing calculations on an ensemble of uncorrelated snapshots collected from an equilibrated molecular dynamics (MD) simulation.

In this tutorial we will demonstrate the use of the MM/PB(GB)SA scripts included with Amber and AmberTools to automatically perform all the necessary steps to estimate the binding free energy of a protein-protein complex (RAS and RAF) and a protein-ligand complex (Estrogen Receptor and Raloxifene) using both the MM-GBSA and MM-PBSA methods, in serial and in parallel. Furthermore, we will demonstrate the use of Alanine Scanning and Normal Mode entropy calculations using the script.
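Written out explicitly, the relations described in words above take the following commonly used form. This is a sketch in notation chosen to match the wording of this tutorial; the subscript names, the dielectric constants (80 for water, 1 for vacuum) and the symbol for the normal mode entropy are assumptions, not a transcription of the tutorial's figures.

```latex
\begin{align}
\Delta G_{\text{bind,solv}} &= \Delta G_{\text{bind,vacuum}}
    + \Delta G_{\text{solv,complex}}
    - \left( \Delta G_{\text{solv,ligand}} + \Delta G_{\text{solv,receptor}} \right) \\
\Delta G_{\text{solv}} &= G_{\text{electrostatic},\,\epsilon=80}
    - G_{\text{electrostatic},\,\epsilon=1}
    + \Delta G_{\text{hydrophobic}} \\
\Delta G_{\text{vacuum}} &= \Delta E_{\text{MM}} - T\,\Delta S_{\text{NMA}}
\end{align}
```

The first line is the thermodynamic cycle itself, the second is the solvation free energy of any one of the three species, and the third is the gas phase term, where the entropy contribution may be dropped when only relative binding energies are of interest.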
In principle, the calculation of the binding free energy described above would require three independent MD simulations: one of the complex and one of each individual protein. However, one typically makes the approximation that no significant conformational changes occur upon binding, so that the snapshots for all three species can be obtained from a single trajectory. This is the 'single trajectory approach' and is what we will use in this tutorial.

- Section 1 : Build the starting structure and run a simulation to obtain an equilibrated system.
- Section 2 : Run the production simulation and obtain an ensemble of snapshots.
- Section 3 : Calculate the binding free energy and analyse the results (tutorial forks here between different versions of MM/PBSA).

AMBER ADVANCED TUTORIALS
TUTORIAL 3 - SECTION 1
MM-PBSA
By Ross Walker & Thomas Steinbrecher

1) Build the starting structure and run a simulation to obtain an equilibrated system.

The system we will model in this simulation is the complex between the human H-Ras protein and the Ras-binding domain of C-Raf1 (Ras-Raf), which is central to the signal transduction cascade.

Here is a partially equilibrated, pre-prepared pdb file of the RAS-RAF complex: ras-raf.pdb

This structure contains the ras and raf proteins and also a physiologically necessary GTP nucleotide, as illustrated in the figure below:

For the purposes of this tutorial and for the sake of simplicity we will avoid treating the GTP molecule in the calculation, since this would require the setup of new parameters for this compound and is beyond the scope of this tutorial.
Thus we will simply remove it from the calculation by erasing it from the pdb file. While not strictly correct, this approximation is somewhat reasonable since, as can be seen from the figure above, GTP is not directly involved in the binding interface. There is also a magnesium ion in this protein that is essentially bound to the GTP molecule, so we will remove this as well. Hence you should remove residues 243 and 244 from the pdb file.

The next step is to split this pdb file into the two separate structures such that you have a ras-raf.pdb, a ras.pdb and a raf.pdb. We will use these three structures to create three gas phase prmtop and inpcrd file pairs for the MM-PBSA calculation, as well as one for the solvated complex which will be used to run the MD simulations:

    > $AMBERHOME/bin/tleap -s -f $AMBERHOME/dat/leap/cmd/leaprc.ff99

Caution: For AMBER 14 please use -f $AMBERHOME/dat/leap/cmd/oldff/leaprc.ff99 in the tleap call for loading the ff99 force field.

    com = loadpdb ras-raf.pdb
    ras = loadpdb ras.pdb
    raf = loadpdb raf.pdb

Make sure you select the correct radii for the calculation method you intend to use. For details see the set PBRadii paragraph in the LEaP section of the manual and the recommendations provided here.

    set default PBRadii mbondi2
    saveamberparm com ras-raf.prmtop ras-raf.inpcrd
    saveamberparm ras ras.prmtop ras.inpcrd
    saveamberparm raf raf.prmtop raf.inpcrd

Now, before you quit tleap, you should create the solvated complex for running the MD simulation:

    charge com
    > Total unperturbed charge: -0.000000
    > Total perturbed charge: -0.000000

(Hence there is no need to add counter ions.)

    solvatebox com TIP3PBOX 12.0
    saveamberparm com ras-raf_solvated.prmtop ras-raf_solvated.inpcrd
    quit

Here are the files: ras-raf.prmtop, ras-raf.inpcrd, ras.prmtop, ras.inpcrd, raf.prmtop, raf.inpcrd, ras-raf_solvated.prmtop, ras-raf_solvated.inpcrd

1.1) Equilibrate the solvated complex

We will equilibrate the solvated complex by carrying out a short minimisation, 50 ps of heating and 50 ps of density equilibration with weak restraints on the complex, followed by 500 ps of constant pressure equilibration at 300 K. All simulations will be run with SHAKE on hydrogen atoms, a 2 fs time step and Langevin dynamics for temperature control.
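The stage lengths quoted above translate directly into step counts once the 2 fs time step is fixed. The following is a minimal sketch of that arithmetic; the helper names are hypothetical, and `nstlim` and `ntwx` are used here only as labels for the number of MD steps and the trajectory write interval that appear in the sander input files.

```python
def steps_for(duration_ps, dt_ps=0.002):
    """Number of MD steps (nstlim) needed to cover duration_ps at time step dt_ps."""
    return round(duration_ps / dt_ps)


def frames_written(nstlim, ntwx):
    """Number of trajectory frames saved when coordinates are written every ntwx steps."""
    return nstlim // ntwx


if __name__ == "__main__":
    # Stage lengths in ps, as stated in the text above.
    for stage, duration_ps in [("heating", 50), ("density", 50), ("equilibration", 500)]:
        print(stage, "nstlim =", steps_for(duration_ps))
```

With a 0.002 ps time step the 50 ps stages need 25000 steps each and the 500 ps stage needs 250000 steps, so a run of 250000 steps that writes coordinates every 1000 steps produces 250 frames.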
The input files are as follows:

min.in

    minimise ras-raf
     &cntrl
      imin=1,maxcyc=1000,ncyc=500,
      cut=8.0,ntb=1,
      ntc=2,ntf=2,
      ntpr=100,
      ntr=1, restraintmask=':1-242',
      restraint_wt=2.0
     /

heat.in

    heat ras-raf
     &cntrl
      imin=0,irest=0,ntx=1,
      nstlim=25000,dt=0.002,
      ntc=2,ntf=2,
      cut=8.0, ntb=1,
      ntpr=500, ntwx=500,
      ntt=3, gamma_ln=2.0,
      tempi=0.0, temp0=300.0,
      ntr=1, restraintmask=':1-242',
      restraint_wt=2.0,
      nmropt=1
     /
     &wt TYPE='TEMP0', istep1=0, istep2=25000,
      value1=0.1, value2=300.0, /
     &wt TYPE='END' /

density.in

    heat ras-raf
     &cntrl
      imin=0,irest=1,ntx=5,
      nstlim=25000,dt=0.002,
      ntc=2,ntf=2,
      cut=8.0, ntb=2, ntp=1, taup=1.0,
      ntpr=500, ntwx=500,
      ntt=3, gamma_ln=2.0,
      temp0=300.0,
      ntr=1, restraintmask=':1-242',
      restraint_wt=2.0,
     /

equil.in

    heat ras-raf
     &cntrl
      imin=0,irest=1,ntx=5,
      nstlim=250000,dt=0.002,
      ntc=2,ntf=2,
      cut=8.0, ntb=2, ntp=1, taup=2.0,
      ntpr=1000, ntwx=1000,
      ntt=3, gamma_ln=2.0,
      temp0=300.0,
     /

Caution: In the examples in this tutorial we do not change the value of the random seed used for the random number generator. This is controlled by the namelist variable ig. This is largely for reproducibility of the results within a tutorial setting. However, when running production simulations, especially when using ntt=2 or 3 (Andersen or Langevin thermostats), it is essential that you change the random number seed from the default on EVERY MD restart. If you are using AMBER 10 (bugfix.26 or later) or AMBER 11 or later you can do this automatically by setting ig=-1 in the cntrl namelist. Otherwise you can specify a positive random number of your choosing for ig each time you restart a calculation.
For more details on the pitfalls of not doing this you should refer to the following publication:

Cerutti DS, Duke, B., et al., "A Vulnerability in Popular Molecular Dynamics Packages Concerning Langevin and Andersen Dynamics", JCTC, 2008, 4, 1669-1680

You should run all 4 of these simulations using commands along the lines of:

    $AMBERHOME/bin/sander -O -i min.in -o min.out -p ras-raf_solvated.prmtop -c ras-raf_solvated.inpcrd \
        -r min.rst -ref ras-raf_solvated.inpcrd

    $AMBERHOME/bin/sander -O -i heat.in -o heat.out -p ras-raf_solvated.prmtop -c min.rst \
        -r heat.rst -x heat.mdcrd -ref min.rst
    gzip -9 heat.mdcrd

    $AMBERHOME/bin/sander -O -i density.in -o density.out -p ras-raf_solvated.prmtop -c heat.rst \
        -r density.rst -x density.mdcrd -ref heat.rst
    gzip -9 density.mdcrd

    $AMBERHOME/bin/sander -O -i equil.in -o equil.out -p ras-raf_solvated.prmtop -c density.rst \
        -r equil.rst -x equil.mdcrd
    gzip -9 equil.mdcrd

This takes approximately 5 hours on 16 processors of a 1.7GHz IBM P690.

Here are the output files: equil.tar.gz

Before we proceed with the MM-PBSA production MD run we need to verify that the system has equilibrated. For this we will look at temperature, density, total energy and RMSD. We can start by using the following perl script (process_mdout.pl), which will extract useful information from the output files.

    ./process_mdout.pl heat.out density.out equil.out

Since the first heating run was performed under constant volume conditions, the density data is not recorded for it. Hence you will need to edit the summary.DENSITY file and remove the first 50 lines (xmgrace gets confused by them otherwise).

    xmgrace summary.DENSITY
    xmgrace summary.TEMP
    xmgrace summary.ETOT

Additionally we will examine the protein backbone RMSD with respect to the minimized structure to see if conformational stability has been achieved during the equilibration.
This can be done using ptraj or cpptraj with the following script:

measure_equil_rmsd.ptraj

    trajin equil.mdcrd.gz 1 250 1
    reference ras-raf_solvated.inpcrd
    rms reference out equil.rmsd @CA,C,N

    xmgrace equil.rmsd

[Figures: DENSITY, TEMPERATURE, TOTAL ENERGY, BACKBONE RMSD]

The density, temperature and total energy plots have all clearly converged by the end of our equilibration period. The RMSD, while it seems to begin to level off, does not appear to have completely converged, but for the purposes of this tutorial it is acceptable. In a real calculation you would want to run, depending on your system, significantly more equilibration time. We are now ready to perform the production runs.

AMBER ADVANCED TUTORIALS
TUTORIAL 3 - SECTION 2
MM-PBSA
By Ross Walker & Thomas Steinbrecher

2) Run the production simulation and obtain an ensemble of snapshots.

The production phase of the simulation should be run using the same conditions as the final phase of equilibration to prevent an abrupt jump in the potential energy due to a change in simulation conditions. We will run a total of 2 ns of production, recording the coordinates every 10 ps. This should be sufficiently far apart that the structures are uncorrelated. Depending on your system you might obtain good results with snapshots taken closer together. As long as all the structures you obtain are uncorrelated, the more snapshots you have the lower the statistical error of your results should be. Note that for a system such as the RAS-RAF complex we have here, a simulation time of 2 ns is most likely too short to obtain a set of uncorrelated snapshots that adequately sample the equilibrium ensemble.
A value of 20 ns or so would probably be more appropriate. However, this will suffice for the purposes of this tutorial. Here is the input file:

prod.in

    prod ras-raf
     &cntrl
      imin=0,irest=1,ntx=5,
      nstlim=250000,dt=0.002,
      ntc=2,ntf=2,
      cut=8.0, ntb=2, ntp=1, taup=2.0,
      ntpr=5000, ntwx=5000,
      ntt=3, gamma_ln=2.0,
      temp0=300.0,
     /

This should then be run 4 times to obtain 2 ns of simulation time. Since this is a simple periodic boundary PME simulation, one can use PMEMD to do the simulation if required. This will typically offer better performance and scaling in parallel. Below is an example script I used on San Diego Supercomputer Center's Teragrid Cluster to run this job on 96 processors. The calculation took a total of 10 hours.

run.x

    #SDSC Teragrid PBS Script
    #PBS -j oe
    #PBS -l nodes=48:ppn=2
    #PBS -l walltime=12:00:00
    #PBS -q dque
    #PBS -V
    #PBS -M [email protected]
    #PBS -A account_no
    #PBS -N run_pmemd_96

    cd /gpfs/projects/prod/

    mpirun -v -machinefile $PBS_NODEFILE -np 96 /usr/local/apps/amber9/exe/pmemd -O -i prod.in -o prod1.out \
        -p ras-raf_solvated.prmtop -c equil.rst -r prod1.rst -x prod1.mdcrd

    mpirun -v -machinefile $PBS_NODEFILE -np 96 /usr/local/apps/amber9/exe/pmemd -O -i prod.in -o prod2.out \
        -p ras-raf_solvated.prmtop -c prod1.rst -r prod2.rst -x prod2.mdcrd

    mpirun -v -machinefile $PBS_NODEFILE -np 96 /usr/local/apps/amber9/exe/pmemd -O -i prod.in -o prod3.out \
        -p ras-raf_solvated.prmtop -c prod2.rst -r prod3.rst -x prod3.mdcrd

    mpirun -v -machinefile $PBS_NODEFILE -np 96 /usr/local/apps/amber9/exe/pmemd -O -i prod.in -o prod4.out \
        -p ras-raf_solvated.prmtop -c prod3.rst -r prod4.rst -x prod4.mdcrd

    gzip -9 prod*.mdcrd

Here are the output files: prod.tar.gz (84.8 MB)

It is essential, for good results, that our system still be exploring equilibrium phase space during the production phase. We will check this in the same fashion as we did for the last equilibration step, by plotting the density, temperature, total energy and backbone RMSD.
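The four production runs in the script above form a chain: each run restarts from the restart file written by the previous one, beginning with equil.rst. A minimal sketch of how such a chain of argument lists could be generated is given below. The function name and its defaults are hypothetical; only the file names and the pmemd flags (-O, -i, -o, -p, -c, -r, -x) are taken from the script above, and the mpirun wrapper is deliberately left out.

```python
def production_commands(n_runs, exe="pmemd", prmtop="ras-raf_solvated.prmtop",
                        first_restart="equil.rst", mdin="prod.in"):
    """Build one argument list per production run, chaining the restart files."""
    commands = []
    previous_restart = first_restart
    for i in range(1, n_runs + 1):
        commands.append([
            exe, "-O",
            "-i", mdin,
            "-o", f"prod{i}.out",
            "-p", prmtop,
            "-c", previous_restart,      # start from the previous run's restart file
            "-r", f"prod{i}.rst",
            "-x", f"prod{i}.mdcrd",
        ])
        previous_restart = f"prod{i}.rst"
    return commands


if __name__ == "__main__":
    for command in production_commands(4):
        print(" ".join(command))
```

Generating the commands rather than copying them by hand makes it harder to mistype a restart file name when the number of production runs is increased.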
[Figures: DENSITY, TEMPERATURE, TOTAL ENERGY, BACKBONE RMSD]

Note that the production RMSD does not look to be truly equilibrated, while the other properties are essentially constant (note the small scales). Ideally we should probably run a much longer production run (ca. 20 ns). For the purposes of this tutorial, however, we will continue with what we have.

We can now proceed to section 3 where we will calculate the binding free energy. The first link will take you to the instructions for using (and installing) the Python script MMPBSA.py. The second link will take you to the instructions for using the Perl script mm_pbsa.pl.

Copyright Dwight McGee, Bill Miller III, and Jason Swails 2009

AMBER ADVANCED TUTORIALS
TUTORIAL 3
Python Script MMPBSA.py
Dwight McGee, Bill Miller III, & Jason Swails

In this tutorial we will demonstrate the use of the MM-PBSA method as released in AmberTools to calculate binding free energy, run alanine scanning, and calculate normal modes for entropy calculations. The tutorial is broken down as follows:

- Section 3.1 : Calculate the binding free energy of a protein-protein complex (Ras-Raf).
- Section 3.2 : Calculate the binding free energy of a protein-ligand complex (Estrogen Receptor and Raloxifene).
- Section 3.3 : Calculate the binding free energy of Ras-Raf and use Alanine Scanning to compare to the binding energy of a mutant Ras-Raf complex that has had a residue mutated to alanine and analyze the results.
- Section 3.4 : Calculate the binding free energy of Ras-Raf in parallel using three processors.
- Section 3.5 : Calculate the entropy of the Estrogen Receptor and Raloxifene complex using Normal Mode Analysis (Nmode).
- Section 3.6 : Decompose the binding free energy of Ras-Raf into per-residue or pairwise per-residue contributions.

AMBER ADVANCED TUTORIALS
TUTORIAL 3 - SECTION 3.1
Python Script MMPBSA.py
Dwight McGee, Bill Miller III, and Jason Swails

The important files for calculating the binding free energy using MMPBSA.py are the topology files and the mdcrd file (ras-raf_top_mdcrd.tgz).

Calculate the binding free energy of Ras-Raf.

We will now calculate the interaction energy and solvation free energy for the complex, receptor and ligand and average the results to obtain an estimate of the binding free energy. Please note that we will not perform a calculation of the entropy contribution to binding in this part of the tutorial, and so strictly speaking our result will not be a true free energy but could be used to compare against similar systems. See Section 3.5 for an example of using Normal Mode Analysis (Nmode) to calculate the entropy contribution for a system, or uncomment the last line in the &general namelist of the input file below to perform a Quasi-Harmonic entropy calculation using the ptraj module in AMBER. We will carry out the binding energy calculation using both the MM-GBSA method and the MM-PBSA method for comparison. This is accomplished with the following input file for MMPBSA.py:

mmpbsa.in

    Input file for running PB and GB
    &general
       endframe=50, verbose=1,
    #  entropy=1,
    /
    &gb
      igb=2, saltcon=0.100
    /
    &pb
      istrng=0.100,
    /

The input files for MMPBSA.py are designed to be similar to the setup of an mdin file used in the sander module of AMBER.
The start of each namelist is designated by an ampersand (&) followed by the name of the namelist. A forward slash (/) or '&end' can be used to end the namelist. For a complete list of all variables please see the User's Manual here. This input file is divided into three namelists: general, pb, and gb. The general namelist specifies variables that are not specific to a particular part of the calculation but apply to all parts. In this setup we have defined RAS to be the receptor and RAF to be the ligand. The 'endframe' variable sets the frame of the mdcrd on which to stop. The '&gb' and '&pb' namelist markers tell the script to perform MM-GBSA and MM-PBSA calculations with the values defined within those namelists. The 'verbose' variable allows the user to specify how much output is written to the output file.

The four python scripts (MMPBSA.py, utils.py, alamdcrd.py, and inputparse.py) should have been placed in $AMBERHOME/bin/ during installation. The script can be started (using the above input file) with command-line flags analogous to those used by sander and pmemd.

    $AMBERHOME/bin/MMPBSA.py -O -i mmpbsa.in -o FINAL_RESULTS_MMPBSA.dat -sp ras-raf_solvated.prmtop -cp ras-raf.prmtop -rp ras.prmtop -lp raf.prmtop -y *.mdcrd

This will run the script interactively and print the progress of the calculation to STDOUT and any errors or warnings to STDERR. Finally, timings will be printed once the calculation has completed, showing the time taken during each step of the calculation. Command-line arguments can be given with shell-recognized wildcards (e.g. * and ? for bash). For example, the '-y *.mdcrd' on the command line tells the script to read in all files in the working directory that end in '.mdcrd' and use them as the trajectories to be analyzed.

Here are all the output files created by this script: pb_gb_output1.tgz.
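When the command above is launched from a Python driver instead of a shell, the '*.mdcrd' wildcard is not expanded automatically and has to be resolved explicitly. The following is a minimal sketch of one way to assemble the argument list; the function names and their defaults are hypothetical, while the flags (-O, -i, -o, -sp, -cp, -rp, -lp, -y) and file names are the ones used in the command above.

```python
import glob


def find_trajectories(pattern="*.mdcrd"):
    """Expand a shell-style wildcard into a sorted list of trajectory files."""
    return sorted(glob.glob(pattern))


def mmpbsa_command(trajectories, exe="MMPBSA.py", input_file="mmpbsa.in",
                   output_file="FINAL_RESULTS_MMPBSA.dat",
                   solvated_prmtop="ras-raf_solvated.prmtop",
                   complex_prmtop="ras-raf.prmtop",
                   receptor_prmtop="ras.prmtop",
                   ligand_prmtop="raf.prmtop"):
    """Return the MMPBSA.py argument list for the given trajectory files."""
    if not trajectories:
        raise ValueError("no trajectory files given")
    return [exe, "-O",
            "-i", input_file,
            "-o", output_file,
            "-sp", solvated_prmtop,
            "-cp", complex_prmtop,
            "-rp", receptor_prmtop,
            "-lp", ligand_prmtop,
            "-y", *trajectories]
```

A driver could then call `subprocess.run(mmpbsa_command(find_trajectories()), check=True)`, passing the full path to the script as `exe` if $AMBERHOME/bin is not on the search path.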
The script creates three unsolvated mdcrd files (complex, receptor, and ligand) using ptraj; these are the coordinates analyzed during the GB and PB calculations. The *.mdout files contain the energies for all frames specified. A PDB file of the average structure is created to align (via RMS) all snapshots in preparation for a quasi-harmonic entropy calculation with ptraj, if one is requested. All files created by MMPBSA.py should begin with the prefix '_MMPBSA_' except for the final output file, FINAL_RESULTS_MMPBSA.dat.

FINAL_RESULTS_MMPBSA.dat

| Run on Thu Feb 11 12:18:37 EST 2010
|Input file:
|--------------------------------------------------------------
|Input file for running PB and GB
|&general
|   endframe=50, verbose=1,
|#  entropy=1,
|/
|&gb
|  igb=2, saltcon=0.100
|/
|&pb
|  istrng=0.100,
|/
|--------------------------------------------------------------
|Solvated complex topology file:  ras-raf_solvated.prmtop
|Complex topology file:           ras-raf.prmtop
|Receptor topology file:          ras.prmtop
|Ligand topology file:            raf.prmtop
|Initial mdcrd(s):                prod.mdcrd
|
|Best guess for receptor mask:   ":1-166"
|Best guess for ligand mask:     ":167-242"
|Calculations performed using 50 frames.
|Poisson Boltzmann calculations performed using internal PBSA solver in sander.
|
|All units are reported in kcal/mole.
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

GENERALIZED BORN:

Complex:
Energy Component           Average       Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                 -1863.7944         17.1704              2.4283
EEL                    -17200.7297         75.9366             10.7391
EGB                     -3249.6511         65.2075              9.2217
ESURF                      91.3565          1.3938              0.1971

G gas                  -19064.5240         77.8536             11.0102
G solv                  -3158.2946         65.2224              9.2238

TOTAL                  -22222.8186         51.0216              7.2155

Receptor:
Energy Component           Average       Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                 -1268.1888         14.2342              2.0130
EEL                    -11557.0773         71.7127             10.1417
EGB                     -2532.0669         57.7003              8.1600
ESURF                      64.2843          1.1143              0.1576

G gas                  -12825.2661         73.1118             10.3396
G solv                  -2467.7826         57.7110              8.1616

TOTAL                  -15293.0487         35.3527              4.9996

Ligand:
Energy Component           Average       Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                  -529.3090          9.4198              1.3322
EEL                     -4684.4720         36.1449              5.1117
EGB                     -1688.9631         26.5353              3.7527
ESURF                      37.0493          0.6185              0.0875

G gas                   -5213.7811         37.3522              5.2824
G solv                  -1651.9138         26.5425              3.7537

TOTAL                   -6865.6949         25.8878              3.6611

Differences (Complex - Receptor - Ligand):
Energy Component           Average       Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                   -66.2966          4.2751              0.6046
EEL                      -959.1803         34.9190              4.9383
EGB                       971.3789         33.0497              4.6739
ESURF                      -9.9770          0.3759              0.0532

DELTA G gas             -1025.4769         35.1797              4.9752
DELTA G solv              961.4018         33.0518              4.6742

DELTA G binding =         -64.0750 +/-      6.3729              0.9013
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

POISSON BOLTZMANN:

Complex:
Energy Component           Average       Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                 -1863.7944         17.1704              2.4283
EEL                    -17200.7297         75.9366             10.7391
EPB                     -3207.7160         66.4023              9.3907
ECAVITY                    67.8762          0.7818              0.1106

G gas                  -19064.5240       6061.1875            857.1813
G solv                  -3139.8399         66.4069              9.3914

TOTAL                   -7686.8660         52.5400              7.4303

Receptor:
Energy Component           Average       Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                 -1268.1888         14.2342              2.0130
EEL                    -11557.0773         71.7127             10.1417
EPB                     -2483.7242         56.4551              7.9840
ECAVITY                    47.1495          0.4737              0.0670

G gas                  -12825.2661       5345.3320            755.9441
G solv                  -2436.5747         56.4571              7.9842

TOTAL                   -5250.2060         38.5188              5.4474

Ligand:
Energy Component           Average       Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                  -529.3090          9.4198              1.3322
EEL                     -4684.4720         36.1449              5.1117
EPB                     -1670.4169         27.6694              3.9131
ECAVITY                    28.0328          0.4133              0.0584

G gas                   -5213.7811       1395.1865            197.3092
G solv                  -1642.3841         27.6725              3.9135

TOTAL                   -2350.3020         25.1197              3.5525

Differences (Complex - Receptor - Ligand):
Energy Component           Average       Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                   -66.2966          4.2751              0.6046
EEL                      -959.1803         34.9190              4.9383
EPB                       946.4251         34.5128              4.8808
ECAVITY                    -7.3062          0.3004              0.0425

DELTA G gas             -1025.4769       1237.6138            175.0250
DELTA G solv              939.1189         34.5141              4.8810

DELTA G binding =         -86.3579 +/-      8.3264              1.1775
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
WARNINGS:
igb=2 should be used with mbondi2 pbradii set. Yours are modified Bondi radii (mbondi)

The beginning of the statistics file includes the date/time, any warnings based on the values and files given, the mmpbsa.in text, the files used by the script, the number of frames analyzed, and which PB solver (if any) was used. The rest of the statistics file includes all the average energies, standard deviations, and standard errors of the mean for GB followed by PB. After each section, the ΔG of binding is given along with the error values.

The meaning of the different terms in this file is as follows:

VDWAALS = van der Waals contribution from MM.
EEL = electrostatic energy as calculated by the MM force field.
EPB/EGB = the electrostatic contribution to the solvation free energy calculated by PB or GB respectively.
ECAVITY = nonpolar contribution to the solvation free energy calculated by an empirical model.
DELTA G binding = final estimated binding free energy calculated from the terms above (kcal/mol).

Note that the total gas phase energy has not been reported because, with the single trajectory approach, the values of the bonded potential terms for the receptor and ligand should exactly cancel those for the complex. An error message will result if the energies do not cancel within an allowed tolerance.

One would typically expect to find an extremely favorable electrostatic energy and an unfavorable solvation free energy. The latter reflects the energy required to de-solvate the binding partners and to align their binding interfaces.

From the negative total binding free energy of -86.36 kcal/mol we clearly see that this is a favorable protein-protein complex in pure water, but keep in mind that the result does not equal the real binding free energy since we did not estimate the (unfavorable) entropy contribution to binding. Note that the GB approach gives a less negative binding energy but still suggests that this is a favorable bound state.

AMBER ADVANCED TUTORIALS
TUTORIAL 3 - SECTION 3.2
Python Script MMPBSA.py
Dwight McGee, Bill Miller III, and Jason Swails

1) Build the starting structure and run a simulation to obtain an equilibrated system.

The system we will model in this simulation is the protein-ligand complex between the Estrogen Receptor protein and the Raloxifene ligand.
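As an aside on the Ras-Raf results of Section 3.1: the way the terms in the "Differences (Complex - Receptor - Ligand)" tables combine can be checked with a few lines of arithmetic. The sketch below is not part of MMPBSA.py and its function names are hypothetical. It assumes that DELTA G gas is the sum of VDWAALS and EEL, that DELTA G solv is the sum of the polar (EGB or EPB) and nonpolar (ESURF or ECAVITY) terms, and that the reported standard error of the mean is the standard deviation divided by the square root of the number of frames (50 here).

```python
import math


def binding_terms(vdwaals, eel, polar_solv, nonpolar_solv):
    """Return (gas, solv, binding) from the four difference terms, in kcal/mol."""
    gas = vdwaals + eel
    solv = polar_solv + nonpolar_solv
    return gas, solv, gas + solv


def std_err_of_mean(std_dev, n_frames):
    """Standard error of the mean for n_frames snapshots."""
    return std_dev / math.sqrt(n_frames)


if __name__ == "__main__":
    # Values copied from the Section 3.1 difference tables.
    gb = binding_terms(-66.2966, -959.1803, 971.3789, -9.9770)
    pb = binding_terms(-66.2966, -959.1803, 946.4251, -7.3062)
    print("GB gas/solv/binding:", gb, "SEM:", std_err_of_mean(6.3729, 50))
    print("PB gas/solv/binding:", pb, "SEM:", std_err_of_mean(8.3264, 50))
```

Under these assumptions the sums agree with the reported DELTA G gas, DELTA G solv and DELTA G binding values to within rounding of the last printed digit, and the standard errors agree with the reported 0.9013 and 1.1775.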
Here is a pre-prepared pdb file of the complex: Estrogen_Receptor-Raloxifene.pdb

This structure contains the Estrogen Receptor protein along with a ligand called Raloxifene that has been previously docked to the protein, as illustrated in the figure below:

This system was constructed in a similar manner to the Ras-Raf system in Section 1. For directions on how to build the starting structure and run a simulation to obtain an equilibrated system, please refer to Section 1 and Section 2. Note that you must also use antechamber to get the correct parameters for raloxifene. See the Sustiva Tutorial (Basic 4) for detailed instructions.

The important files from the MD simulation for calculating the binding free energy using MMPBSA.py are the topology files and the mdcrd file (Est_Rec_top_mdcrd.tgz).

2) Calculate the binding free energy of the Estrogen Receptor and Raloxifene

We will now calculate the interaction energy and solvation free energy for the complex, receptor and ligand and average the results to obtain an estimate of the binding free energy. Please note that we will not perform a calculation of the entropy contribution to binding in this part of the tutorial, and so strictly speaking our result will not be a true free energy but could be used to compare against similar systems. See Section 3.5 for an example of using Normal Mode Analysis (Nmode) to calculate the entropy contribution for a system. We will carry out the binding energy calculation using both the MM-GBSA method and the MM-PBSA method for comparison. This is accomplished with the following input file for MMPBSA.py:

mmpbsa.in

    Input file for running PB and GB
    &general
       endframe=50, keep_files=2,
    /
    &gb
      igb=2, saltcon=0.100,
    /
    &pb
      istrng=0.100,
    /

The input files for MMPBSA.py are designed to be similar to the setup of an mdin file used in the sander module of AMBER. The start of each namelist is designated by an ampersand (&) followed by the name of the namelist. A forward slash (/) or '&end' can be used to end the namelist. For a complete list of all variables please see the User's Manual here. This input file is divided into three namelists: general, pb, and gb. The general namelist specifies variables that are not specific to a particular part of the calculation but rather apply to all parts. In this setup the Estrogen Receptor is the receptor and Raloxifene is the ligand. The 'endframe' variable sets the frame of the mdcrd on which to stop. The presence of the '&gb' and '&pb' namelists tells the script to perform MM-GBSA and MM-PBSA calculations with the given input values.

The four python scripts (MMPBSA.py, utils.py, alamdcrd.py, and inputparse.py) should have been placed in $AMBERHOME/bin/ during installation. The script can be started (using the above input file) with command-line flags analogous to those used by sander and pmemd.

    $AMBERHOME/bin/MMPBSA.py -O -i mmpbsa.in -o FINAL_RESULTS_MMPBSA.dat -sp 1err.solvated.prmtop -cp complex.prmtop -rp receptor.prmtop -lp ligand.prmtop -y *.mdcrd

This will run the script interactively and print the progress of the calculation to STDOUT and any errors or warnings to STDERR. Finally, timings will be printed once the calculation has completed, showing the time taken during each step of the calculation. Command-line arguments can be given with shell-recognized wildcards (e.g. * and ? for bash). For example, the '-y *.mdcrd' on the command line tells the script to read in all files in the working directory that end in '.mdcrd' and use them as the trajectories to be analyzed.

With 'keep_files=2', here are all the output files: pb_gb_output2.tgz. The script creates three unsolvated mdcrd files (complex, receptor, and ligand) using ptraj; these are the coordinates analyzed during the GB and PB calculations.
The *.mdout files contain the energies for all frames specified. A PDB file of the average structure is created by aligning (via RMS fit) all snapshots, in preparation for a quasi-harmonic entropy calculation with ptraj if one is requested. All files created by MMPBSA.py should begin with the prefix '_MMPBSA_' except for the final output file, FINAL_RESULTS_MMPBSA.dat.

FINAL_RESULTS_MMPBSA.dat

| Run on Thu Feb 11 12:44:26 EST 2010
|Input file:
|--------------------------------------------------------------
|Input file for running PB and GB
|&general
|   endframe=50, keep_files=2,
|/
|&gb
|   igb=2, saltcon=0.100,
|/
|&pb
|   istrng=0.100,
|/
|--------------------------------------------------------------
|Solvated complex topology file: 1err.solvated.prmtop
|Complex topology file: complex.prmtop
|Receptor topology file: receptor.prmtop
|Ligand topology file: ligand.prmtop
|Initial mdcrd(s): 1err_prod.mdcrd
|
|Best guess for receptor mask: ":1-240"
|Best guess for ligand mask: ":241"
|Ligand residue name is "RAL"
|
|Calculations performed using 50 frames.
|Poisson Boltzmann calculations performed using internal PBSA solver in sander.
|
|All units are reported in kcal/mole.
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

GENERALIZED BORN:

Complex:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -2013.3801     20.3021      2.8712
EEL              -16938.6450     85.7631     12.1287
EGB               -3507.0086     67.7839      9.5861
ESURF                97.5448      1.3301      0.1881
G gas            -18952.0251     88.1333     12.4639
G solv            -3409.4639     67.7969      9.5879
TOTAL            -22361.4889     47.1982      6.6748

Receptor:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -1955.2272     19.2311      2.7197
EEL              -16895.0354     85.5797     12.1028
EGB               -3528.7276     68.3585      9.6673
ESURF               101.2613      1.3071      0.1849
G gas            -18850.2626     87.7138     12.4046
G solv            -3427.4663     68.3710      9.6691
TOTAL            -22277.7288     48.1057      6.8032

Ligand:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS              -1.8595      2.0516      0.2901
EEL                  -5.5796      2.0333      0.2876
EGB                 -28.4863      0.6040      0.0854
ESURF                 4.4326      0.0462      0.0065
G gas                -7.4391      2.8885      0.4085
G solv              -24.0538      0.6058      0.0857
TOTAL               -31.4929      5.0748      0.7177

Differences (Complex - Receptor - Ligand):
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS             -56.2934      2.9265      0.4139
EEL                 -38.0300      3.2114      0.4542
EGB                  50.2053      2.5869      0.3658
ESURF                -8.1491      0.2589      0.0366
DELTA G gas         -94.3234      4.3449      0.6145
DELTA G solv         42.0562      2.5999      0.3677
DELTA G binding =   -52.2672 +/-  2.4568      0.3475
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

POISSON BOLTZMANN:

Complex:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -2013.3801     20.3021      2.8712
EEL              -16938.6450     85.7631     12.1287
EPB               -3329.1708     67.0354      9.4802
ECAVITY              68.2656      0.5195      0.0735
G gas            -18952.0251   7767.4837   1098.4881
G solv            -3260.9052     67.0374      9.4805
TOTAL             -5265.0831     49.0426      6.9357

Receptor:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -1955.2272     19.2311      2.7197
EEL              -16895.0354     85.5797     12.1028
EPB               -3355.4746     67.3299      9.5219
ECAVITY              70.1184      0.5285      0.0747
G gas            -18850.2626   7693.7163   1088.0558
G solv            -3285.3562     67.3320      9.5222
TOTAL             -5279.4509     50.4067      7.1286

Ligand:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS              -1.8595      2.0516      0.2901
EEL                  -5.5796      2.0333      0.2876
EPB                 -31.3364      0.6953      0.0983
ECAVITY               3.1896      0.0288      0.0041
G gas                -7.4391      8.3434      1.1799
G solv              -28.1468      0.6959      0.0984
TOTAL                56.0934      5.0476      0.7138

Differences (Complex - Receptor - Ligand):
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS             -56.2934      2.9265      0.4139
EEL                 -38.0300      3.2114      0.4542
EPB                  57.6402      3.0642      0.4333
ECAVITY              -5.0423      0.1683      0.0238
DELTA G gas         -94.3234     18.8778      2.6697
DELTA G solv         52.5978      3.0688      0.4340
DELTA G binding =   -41.7256 +/-  2.9618      0.4189
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

The beginning of the statistics file includes the date/time, any warnings based on the values and files given, the mmpbsa.in text, the files used by the script, the number of frames analyzed, and which PB solver (if any) was used. The rest of the statistics file includes all the average energies, standard deviations, and standard errors of the mean for GB followed by PB. After each section, the ΔG of binding is given along with the error values. The meaning of the different terms in this file is as follows:

VDWAALS = van der Waals contribution from MM.
EEL = electrostatic energy as calculated by the MM force field.
EPB/EGB = the electrostatic contribution to the solvation free energy calculated by PB or GB respectively.
ESURF/ECAVITY = nonpolar contribution to the solvation free energy calculated by an empirical model (listed as ESURF in the GB section and ECAVITY in the PB section).
DELTA G binding = final estimated binding free energy calculated from the terms above (kcal/mol).

Note that the bonded contributions to the gas phase energy have not been reported because the values of the bonded potential terms for the receptor and ligand should exactly cancel those for the complex using the single trajectory approach. An error message will result if the energies do not cancel within an allowed tolerance. One would typically expect to find an extremely favorable electrostatic energy and an unfavorable solvation free energy. This represents the energy that one has to spend to de-solvate the binding partners and to align their binding interfaces. From the negative total binding free energy of -41.73 kcal/mol (MM-PBSA) we clearly see that this is a favorable protein-ligand complex in implicit solvent at the 0.100 M ionic strength set in the input file, but keep in mind that the result does not equal the real binding free energy since we did not estimate the (unfavorable) entropy contribution to binding. Note that the GB approach gives a more negative binding energy (-52.27 kcal/mol), which likewise suggests that this is a favorable bound state.

Copyright McGee, Miller, and Swails 2009

AMBER ADVANCED TUTORIALS
TUTORIAL 3 - SECTION 3.3
Python Script MMPBSA.py
Dwight McGee, Bill Miller III, and Jason Swails

1) Set up the pdb files to be leap-ready.

The system we will model in this simulation is the complex between the human H-Ras protein and the Ras-binding domain of C-Raf1 (Ras-Raf), which is central to the signal transduction cascade. Here is a partially equilibrated, pre-prepared pdb file of the RAS-RAF complex: ras-raf.pdb

This structure contains the ras and raf proteins and also a physiologically necessary GTP nucleotide, as illustrated in the figure below:

We must now prepare the mutant pdb files to be read by tleap. We strongly suggest preparing the mutant pdb and topology files along with the initial topology files created in Section 1, prior to running any simulations, in order to guarantee consistent prmtop files. For this tutorial we have chosen to mutate residue 21 (Isoleucine, I21) to alanine because this residue is found at the interface between the receptor and ligand and should have a noticeable effect on the binding energy. Note that the current version of the code supports mutations to alanine only. Since I21 is found only in the receptor, we do not need to make a mutant ligand pdb file. Thus, we only need to change ras-raf.pdb and ras.pdb.

To do so, you must know something about the structure of the amino acids that are involved. The side chain of isoleucine is -CH(CH3)CH2CH3. The side chain of alanine is -CH3. Since the side chain of isoleucine has more atoms than the side chain of alanine, we must remove atoms and their corresponding information (name, number, coordinates, etc.) from the pdb files. This mutation involves removing all side chain atoms except the beta-carbon (CB). For I21, this means that we must remove the lines in the Ras-Raf and Ras pdb files corresponding to atoms 294 through 305. We do not need to add the beta-hydrogen (HB) atoms because tleap will add those in the proper locations based on the particular library files you choose for your system. Finally, change the residue name from "ILE" to "ALA" for all remaining atoms of residue 21. This procedure will yield two mutant pdb files: ras-raf_mutant.pdb and ras_mutant.pdb. The I21A mutation in RAS-RAF is depicted below.
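The manual edit described above (delete every side chain atom beyond CB, then rename the residue) can also be scripted. The following is only a sketch under stated assumptions: it relies on the fixed-column PDB ATOM record layout, on standard AMBER/PDB atom names for the backbone (N, H, CA, HA, C, O), and on residue numbers being unique within the file. The function name mutate_to_alanine is hypothetical, and atom serial numbers are left untouched.

```python
# Atoms that survive a mutation to alanine: the backbone plus the beta-carbon.
# Assumption: standard atom names; terminal residues (H1/H2/H3, OXT) are not handled.
KEEP_ATOMS = {"N", "H", "CA", "HA", "CB", "C", "O"}


def mutate_to_alanine(pdb_lines, resnum):
    """Return a copy of pdb_lines with residue `resnum` truncated to alanine.

    Side chain atoms beyond CB are dropped and the residue name is set to
    ALA; the missing hydrogens are expected to be rebuilt later by tleap.
    Assumes residue numbers are unique in the file (no chain handling).
    """
    mutated = []
    for line in pdb_lines:
        is_atom = line.startswith(("ATOM", "HETATM"))
        if is_atom and int(line[22:26]) == resnum:
            atom_name = line[12:16].strip()      # PDB columns 13-16
            if atom_name not in KEEP_ATOMS:
                continue                         # drop the rest of the side chain
            line = line[:17] + "ALA" + line[20:]  # PDB columns 18-20: residue name
        mutated.append(line)
    return mutated


# Example usage (hypothetical, commented out to avoid file access here):
#   with open("ras-raf.pdb") as src:
#       new_lines = mutate_to_alanine(src.readlines(), 21)
#   with open("ras-raf_mutant.pdb", "w") as dst:
#       dst.writelines(new_lines)
```

For an isoleucine this removes 12 atoms (HB, the CG1/CG2/CD1 carbons and their hydrogens), which matches the count of atoms 294 through 305 quoted above.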
Other mutations can follow a similar procedure, in which the group of atoms after CB but before the carbonyl carbon (C) is removed from the pdb file. Note that only one mutation can be performed during a single calculation.

2) Build the starting topology and coordinate files and run a simulation to obtain an equilibrated system.

Now that the pdb files have been made, we need to make the corresponding topology and coordinate files for these structures using tleap. First, we will make the files corresponding to the non-mutant complex:

> $AMBERHOME/exe/tleap -s -f $AMBERHOME/dat/leap/cmd/leaprc.ff99SB
com = loadpdb ras-raf.pdb
ras = loadpdb ras.pdb
raf = loadpdb raf.pdb
saveamberparm com ras-raf.prmtop ras-raf.inpcrd
saveamberparm ras ras.prmtop ras.inpcrd
saveamberparm raf raf.prmtop raf.inpcrd

You should also create the solvated complex for running the MD simulation:

charge com
> Total unperturbed charge: -0.000000
> Total perturbed charge: -0.000000
(Hence there is no need to add counter ions)
solvatebox com TIP3PBOX 12.0
saveamberparm com ras-raf_solvated.prmtop ras-raf_solvated.inpcrd

Now, before you quit tleap, you should create the topology and coordinate files from the mutant pdb files you just created:

com_mut = loadpdb ras-raf_mutant.pdb
ras_mut = loadpdb ras_mutant.pdb
saveamberparm com_mut rasraf_mutant.prmtop rasraf_mutant.inpcrd
saveamberparm ras_mut ras_mutant.prmtop ras_mutant.inpcrd
quit

We just created 12 files (six .prmtop files and six .inpcrd files). The non-mutant .prmtop and .inpcrd files have been used to run a Molecular Dynamics (MD) simulation to obtain an equilibrated system using the procedures outlined in Section 1 and Section 2. The important files for calculating the binding free energy using MMPBSA.py are the topology files (non-mutant and mutant) and the mdcrd file generated using the non-mutant topology and coordinate files (ras-raf_alascan.tgz).

3) Perform an alanine scanning calculation on the binding free energy of Ras-Raf.
We will now calculate the interaction energy and solvation free energy for the complex, receptor and ligand and average the results to obtain an estimate of the binding free energy. Then, for comparison to the "wild-type" structure, the same calculation will be performed on the mutated structure after the coordinates in the mdcrd file(s) have been mutated. Please note that we will not calculate the entropy contribution to binding in this part of the tutorial, so strictly speaking our result will not be a true free energy, but it can be used to compare against similar systems. See Section 3.5 for an example of using Normal Mode Analysis (Nmode) to calculate the entropy contribution for a system, or uncomment the last line in the general section to perform a Quasi-Harmonic entropy calculation using the ptraj module in AMBER. We will carry out the binding energy calculation using both the MM-GBSA method and the MM-PBSA method for comparison. This is accomplished with the following input file for MMPBSA.py:

mmpbsa.in

sample input file for running alanine scanning
&general
   startframe=1, endframe=50, interval=1,
   verbose=1,
/
&gb
   saltcon=0.1
/
&pb
   istrng=0.100
/
&alanine_scanning
/

The input files for MMPBSA.py are designed to be similar to the setup of an mdin file used in the sander module of AMBER. The start of each namelist is designated by an ampersand (&) followed by the name of the namelist. A forward slash (/) or '&end' ends the namelist. For a complete list of all variables please see the User's Manual. This input file is divided into four namelists: general, pb, gb, and alanine_scanning. The general namelist specifies variables that are not specific to a particular part of the calculation, but rather apply to all parts. In this setup we have defined RAS to be the receptor and RAF to be the ligand. The 'verbose' variable controls how much detail is printed in the final output file.
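To make the namelist layout concrete, the following is a minimal sketch of a reader for this kind of input file. It is not the parser used by MMPBSA.py (that lives in inputparse.py and is not reproduced here); it only follows the rules stated above: '&name' opens a namelist, '/' or '&end' closes it, and variables are comma-separated 'key=value' pairs. Treating '#' as a comment marker and ignoring text outside namelists (such as the title line) are assumptions, and quoted values containing commas or '#' are not handled.

```python
def parse_namelists(text):
    """Parse sander-style namelists into {namelist: {variable: value_string}}.

    Illustrative only: values are kept as strings, and quoted values that
    contain commas or '#' characters are not supported.
    """
    namelists = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # assumption: '#' starts a comment
        if not line:
            continue
        if line.startswith("&") and line.lower() != "&end":
            current = line[1:].strip().lower()  # '&gb' -> 'gb'
            namelists[current] = {}
        elif line == "/" or line.lower() == "&end":
            current = None                      # end of the open namelist
        elif current is not None:
            for item in line.split(","):
                if "=" in item:
                    key, value = item.split("=", 1)
                    namelists[current][key.strip()] = value.strip()
        # Lines outside any namelist (e.g. the title line) are ignored.
    return namelists


EXAMPLE_INPUT = """sample input file for running alanine scanning
&general
   startframe=1, endframe=50, interval=1,
   verbose=1,
/
&gb
   saltcon=0.1
/
&pb
   istrng=0.100
/
&alanine_scanning
/
"""

if __name__ == "__main__":
    for name, variables in parse_namelists(EXAMPLE_INPUT).items():
        print(name, variables)
```

Note that an empty namelist such as &alanine_scanning still appears in the result; its mere presence is what switches that part of the calculation on.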
For more information on the mpi commands, please see the manual or Section 3.4. The '&gb' and '&pb' namelist markers let the script know to perform MM-GBSA and MM-PBSA calculations with the given values defined within those namelists. The 'alanine_scanning' namelist marker initializes alanine scanning in the script. The only recognized input variable in the &alanine_scanning namelist is "mutant_only", which is described in more detail in the manual.

The four python scripts (MMPBSA.py, utils.py, alamdcrd.py, and inputparse.py) should be placed in $AMBERHOME/bin/. After doing this, the script can be started (using the above input file) using the same flags as in AMBER 9 or 10:

$AMBERHOME/bin/MMPBSA.py -O -i mmpbsa.in -sp ras-raf_solvated.prmtop -cp rasraf.prmtop -rp ras.prmtop \
   -lp raf.prmtop -y *.mdcrd -mc rasraf_mutant.prmtop -mr ras_mutant.prmtop

This will run the script interactively and print the progress of the calculation to STDOUT and any errors or warnings to STDERR. Finally, timings will be printed once the calculation has completed, showing the time taken during each step of the calculation. The '-y *.mdcrd' on the command line tells the script to read in all files in the working directory that end in '.mdcrd' and use them as the trajectories to be analyzed.

Here are all the output files: ALASCAN_output.tgz. The script creates three unsolvated mdcrd files (complex, receptor, and ligand) using ptraj; these are the coordinates analyzed during the GB and PB calculations. The *.mdout files contain the energies for all frames specified. An average pdb file is created as a structure for minimization if entropy calculations are performed.
All files created by MMPBSA.py should begin with the prefix '_MMPBSA_' except for the final output file, FINAL_RESULTS_MMPBSA.dat.

FINAL_RESULTS_MMPBSA.dat

| Run on Thu Feb 11 13:11:48 EST 2010
|Input file:
|--------------------------------------------------------------
|sample input file for running alanine scanning
| &general
|   startframe=1, endframe=50, interval=1,
|   verbose=1,
|/
|&gb
|   saltcon=0.1
|/
|&pb
|   istrng=0.100
|/
|&alanine_scanning
|/
|--------------------------------------------------------------
|Solvated complex topology file: ras-raf_solvated.prmtop
|Complex topology file: rasraf.prmtop
|Receptor topology file: ras.prmtop
|Ligand topology file: raf.prmtop
|Initial mdcrd(s): bigprod.mdcrd
|Mutant complex topology file: rasraf_mutant.prmtop
|Mutant receptor topology file: ras_mutant.prmtop
|Mutant ligand topology file: raf.prmtop
|
|Best guess for receptor mask: ":1-166"
|Best guess for ligand mask: ":167-242"
|Calculations performed using 50 frames.
|Poisson Boltzmann calculations performed using internal PBSA solver in sander.
|
|All units are reported in kcal/mole.
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

GENERALIZED BORN:

Complex:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -1863.7944     17.1704      2.4283
EEL              -17200.7297     75.9366     10.7391
EGB               -3142.2247     63.1977      8.9375
ESURF                91.3565      1.3938      0.1971
G gas            -19064.5240     77.8536     11.0102
G solv            -3050.8682     63.2131      8.9397
TOTAL            -22115.3922     51.5332      7.2879

Receptor:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -1268.1888     14.2342      2.0130
EEL              -11557.0773     71.7127     10.1417
EGB               -2444.8629     54.9156      7.7662
ESURF                64.2843      1.1143      0.1576
G gas            -12825.2661     73.1118     10.3396
G solv            -2380.5786     54.9269      7.7678
TOTAL            -15205.8447     36.8422      5.2103

Ligand:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS            -529.3090      9.4198      1.3322
EEL               -4684.4720     36.1449      5.1117
EGB               -1661.8286     26.5442      3.7539
ESURF                37.0493      0.6185      0.0875
G gas             -5213.7811     37.3522      5.2824
G solv            -1624.7794     26.5514      3.7549
TOTAL             -6838.5604     25.6515      3.6277

Differences (Complex - Receptor - Ligand):
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS             -66.2966      4.2751      0.6046
EEL                -959.1803     34.9190      4.9383
EGB                 964.4668     32.9201      4.6556
ESURF                -9.9770      0.3759      0.0532
DELTA G gas       -1025.4769     35.1797      4.9752
DELTA G solv        954.4898     32.9223      4.6559
DELTA G binding =   -70.9871 +/-  6.6875      0.9457
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

I21A MUTANT:
GENERALIZED BORN:

Complex:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -1855.4226     17.0765      2.4150
EEL              -17210.2882     75.8866     10.7320
EGB               -3145.1010     63.2477      8.9446
ESURF                91.8639      1.3913      0.1968
G gas            -19065.7108     77.7842     11.0003
G solv            -3053.2370     63.2630      8.9467
TOTAL            -22118.9478     50.9582      7.2066

Receptor:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -1261.9126     14.1817      2.0056
EEL              -11566.4419     71.5475     10.1183
EGB               -2447.4831     55.0008      7.7783
ESURF                64.5090      1.1105      0.1570
G gas            -12828.3545     72.9394     10.3152
G solv            -2382.9741     55.0120      7.7799
TOTAL            -15211.3287     36.2055      5.1202

Ligand:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS            -529.3090      9.4198      1.3322
EEL               -4684.4720     36.1449      5.1117
EGB               -1661.8286     26.5442      3.7539
ESURF                37.0493      0.6185      0.0875
G gas             -5213.7811     37.3522      5.2824
G solv            -1624.7794     26.5514      3.7549
TOTAL             -6838.5604     25.6515      3.6277

Differences (Complex - Receptor - Ligand):
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS             -64.2010      4.0841      0.5776
EEL                -959.3742     34.9114      4.9372
EGB                 964.2108     32.9092      4.6541
ESURF                -9.6943      0.3800      0.0537
DELTA G gas       -1023.5752     35.1495      4.9709
DELTA G solv        954.5164     32.9114      4.6544
DELTA G binding =   -69.0588 +/-  6.5302      0.9235
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
RESULT OF ALANINE SCANNING: (I21A MUTANT:) DELTA DELTA G binding = 1.9283 +/- 9.3470
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

POISSON BOLTZMANN:

Complex:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -1863.7944     17.1704      2.4283
EEL              -17200.7297     75.9366     10.7391
EPB               -3227.2145     64.4523      9.1149
ECAVITY              68.4754      0.7567      0.1070
G gas            -19064.5240   6061.1875    857.1813
G solv            -3158.7391     64.4568      9.1156
TOTAL             -7522.2032     51.2973      7.2545

Receptor:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -1268.1888     14.2342      2.0130
EEL              -11557.0773     71.7127     10.1417
EPB               -2485.3559     54.5638      7.7165
ECAVITY              47.5088      0.4610      0.0652
G gas            -12825.2661   5345.3320    755.9441
G solv            -2437.8471     54.5658      7.7168
TOTAL             -5118.8075     38.9610      5.5099

Ligand:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS            -529.3090      9.4198      1.3322
EEL               -4684.4720     36.1449      5.1117
EPB               -1684.5802     28.2572      3.9962
ECAVITY              28.1687      0.3939      0.0557
G gas             -5213.7811   1395.1865    197.3092
G solv            -1656.4114     28.2599      3.9966
TOTAL             -2313.4381     24.9082      3.5225

Differences (Complex - Receptor - Ligand):
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS             -66.2966      4.2751      0.6046
EEL                -959.1803     34.9190      4.9383
EPB                 942.7215     33.8861      4.7922
ECAVITY              -7.2022      0.3069      0.0434
DELTA G gas       -1025.4769   1237.6138    175.0250
DELTA G solv        935.5194     33.8875      4.7924
DELTA G binding =   -89.9575 +/-  8.2480      1.1664
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

I21A MUTANT:
POISSON BOLTZMANN:

Complex:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -1855.4226     17.0765      2.4150
EEL              -17210.2882     75.8866     10.7320
EPB               -3229.1405     64.8100      9.1655
ECAVITY              68.5521      0.7596      0.1074
G gas            -19065.7108   6050.3755    855.6523
G solv            -3160.5884     64.8144      9.1661
TOTAL             -7520.1586     50.6710      7.1660

Receptor:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -1261.9126     14.1817      2.0056
EEL              -11566.4419     71.5475     10.1183
EPB               -2487.5603     54.6289      7.7257
ECAVITY              47.6466      0.4632      0.0655
G gas            -12828.3545   5320.1609    752.3844
G solv            -2439.9137     54.6309      7.7260
TOTAL             -5118.8820     38.4370      5.4358

Ligand:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS            -529.3090      9.4198      1.3322
EEL               -4684.4720     36.1449      5.1117
EPB               -1684.5802     28.2572      3.9962
ECAVITY              28.1687      0.3939      0.0557
G gas             -5213.7811   1395.1865    197.3092
G solv            -1656.4114     28.2599      3.9966
TOTAL             -2313.4381     24.9082      3.5225

Differences (Complex - Receptor - Ligand):
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS             -64.2010      4.0841      0.5776
EEL                -959.3742     34.9114      4.9372
EPB                 942.9999     34.0350      4.8133
ECAVITY              -7.2632      0.3107      0.0439
DELTA G gas       -1023.5752   1235.4872    174.7243
DELTA G solv        935.7367     34.0364      4.8135
DELTA G binding =   -87.8385 +/-  8.0665      1.1408
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
RESULT OF ALANINE SCANNING: (I21A MUTANT:) DELTA DELTA G binding = 2.1190 +/- 11.5368
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

The beginning of the statistics file includes the date/time, any warnings based on the values and files given, the mmpbsa.in text, the files used by the script, the number of frames analyzed, and which PB solver (if any) was used. The rest of the statistics file includes all the average energies, standard deviations, and standard error of the mean for GB followed by PB. After each section, the ΔG of binding is given along with the error values.
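The relationships between the numbers in the output above can be checked by hand. The sketch below recomputes a few of them from the GB and PB "Differences" sections. The formulas are inferred from the printed values rather than taken from the MMPBSA.py source, so treat them as assumptions: the standard error is the standard deviation divided by the square root of the number of frames, DELTA G binding is DELTA G gas plus DELTA G solv, DELTA DELTA G is mutant minus wild type, and its uncertainty is the two standard deviations added in quadrature.

```python
import math

N_FRAMES = 50  # "Calculations performed using 50 frames."


def std_err(std_dev, n_frames=N_FRAMES):
    """Standard error of the mean for n_frames snapshots (assumed formula)."""
    return std_dev / math.sqrt(n_frames)


def delta_delta_g(wild_type, mutant):
    """Return (ddG, uncertainty) from (mean, std_dev) pairs.

    ddG = mutant - wild type; the uncertainty is the quadrature sum of the
    two standard deviations (inferred from the printed output).
    """
    ddg = mutant[0] - wild_type[0]
    uncertainty = math.hypot(wild_type[1], mutant[1])
    return ddg, uncertainty


# (DELTA G binding, Std. Dev.) copied from the output file above.
WT_GB, MUT_GB = (-70.9871, 6.6875), (-69.0588, 6.5302)
WT_PB, MUT_PB = (-89.9575, 8.2480), (-87.8385, 8.0665)

if __name__ == "__main__":
    print("GB binding from parts:", round(-1025.4769 + 954.4898, 4))
    print("GB std. err.:", round(std_err(WT_GB[1]), 4))
    for label, wt, mut in (("GB", WT_GB, MUT_GB), ("PB", WT_PB, MUT_PB)):
        ddg, unc = delta_delta_g(wt, mut)
        print(f"{label}: DELTA DELTA G binding = {ddg:.4f} +/- {unc:.4f}")
```

With these assumptions the recomputed values agree with the file to within rounding of the last printed digit. The uncertainty on ΔΔG (about 9 to 12 kcal/mol) is several times larger than ΔΔG itself (about 2 kcal/mol), which is worth keeping in mind when interpreting the I21A result.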
After each method, the ΔΔG of binding is reported, which shows the relative effect the mutation has on the ΔG of binding for the complex. The specific mutation is also printed at the end of the file. In this case, we mutated residue 21 from an isoleucine to an alanine (i.e. I21A). The meaning of the different energy terms in this file is as follows:

VDWAALS = van der Waals contribution from MM.
EEL = electrostatic energy as calculated by the MM force field.
EPB/EGB = the electrostatic contribution to the solvation free energy calculated by PB or GB respectively.
ESURF/ECAVITY = nonpolar contribution to the solvation free energy calculated by an empirical model (listed as ESURF in the GB section and ECAVITY in the PB section).
DELTA G binding = final estimated binding free energy calculated from the terms above (kcal/mol).

Note that the bonded contributions to the gas phase energy have not been reported because the values of the bonded potential terms for the receptor and ligand should exactly cancel those for the complex using the single trajectory approach. An error message will result if the energies do not cancel within an allowed tolerance. One would typically expect to find an extremely favorable electrostatic energy and an unfavorable solvation free energy. This represents the energy that one has to spend to de-solvate the binding partners and to align their binding interfaces.

Copyright McGee, Miller, and Swails 2009

AMBER ADVANCED TUTORIALS
TUTORIAL 3 - SECTION 4
MMPBSA.py
Dwight McGee, Bill Miller III, and Jason Swails

1) Build the starting structure and run a simulation to obtain an equilibrated system.
The system we will model in this simulation is the complex between the human H-Ras protein and the Ras-binding domain of C-Raf1 (Ras-Raf), which is central to the signal transduction cascade. Here is a partially equilibrated, pre-prepared pdb file of the RAS-RAF complex: ras-raf.pdb

This structure contains the ras and raf proteins and also a physiologically necessary GTP nucleotide, as illustrated in the figure below:

For directions on how to build the starting structure and run a simulation to obtain an equilibrated system, please refer to Section 1 and Section 2. The important files for calculating the binding free energy using MMPBSA.py are the topology files and the mdcrd file (ras-raf_top_mdcrd.tgz).

2) Calculate the binding free energy of Ras-Raf in parallel.

We will now calculate the interaction energy and solvation free energy for the complex, receptor and ligand and average the results to obtain an estimate of the binding free energy. Please note that we will not calculate the entropy contribution to binding in this part of the tutorial, so strictly speaking our result will not be a true free energy, but it can be used to compare against similar systems. See Section 5 for an example of using Normal Mode Analysis (Nmode) to calculate the entropy contribution for a system, or uncomment the last line in the general section to perform a Quasi-Harmonic entropy calculation using the ptraj module in AMBER. We will carry out the binding energy calculation using both the MM-GBSA and the MM-PBSA methods in parallel for comparison with the results obtained from Section 3.1.

MMPBSA.py.MPI parallelizes the calculation by assigning an equal number of frames to each thread (process). Thus, it operates most efficiently when the number of frames processed is a multiple of the number of threads started. However, this is not a requirement. If the number of frames is not a multiple of the number of threads, the "leftover" frames will be evenly distributed amongst a subset of the threads started. For example, running 50 frames on 3 threads will cause 2 threads to calculate 17 frames and the last thread to calculate only 16. Thus, the third thread will finish first and sit idle while the other two complete their calculations. For this reason, 5 threads, for instance, would be a wiser choice (as each thread takes 10 frames). However, the number of threads cannot exceed the number of frames processed, or MMPBSA.py.MPI will terminate with an error message.

The input file for MMPBSA.py.MPI is identical to the input file for MMPBSA.py:

mmpbsa.in

Input file for running PB and GB
&general
   endframe=50, verbose=1,
#  entropy=1,
/
&gb
   igb=2, saltcon=0.100
/
&pb
   istrng=0.100,
/

The input files for MMPBSA.py.MPI are designed to be similar to the setup of an mdin file used in the sander module of AMBER. The start of each namelist is designated by an ampersand (&) followed by the name of the namelist. A forward slash (/) or '&end' ends the namelist. For a complete list of all variables please see the User's Manual. This input file is divided into three namelists: general, pb, and gb. The general namelist specifies variables that are not specific to a particular part of the calculation, but instead apply to all parts. In this setup we have defined RAS to be the receptor and RAF to be the ligand. The 'endframe' variable sets which frame of the mdcrd to stop on. The '&gb' and '&pb' namelist markers let the script know to perform MM-GBSA and MM-PBSA calculations with the given values defined within those namelists.
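The frame distribution described above for MMPBSA.py.MPI is easy to reproduce for planning purposes. This is a sketch of the arithmetic only, not the code used by MMPBSA.py.MPI; in particular, which threads receive the leftover frames (here, the first ones) is an assumption chosen to match the 50-frames-on-3-threads example. The function name frames_per_thread is hypothetical.

```python
def frames_per_thread(n_frames, n_threads):
    """Return a list with the number of frames handled by each thread.

    Every thread gets n_frames // n_threads frames, and the leftover frames
    are handed out one each to the first threads (assumed ordering).
    """
    if n_threads < 1:
        raise ValueError("at least one thread is required")
    if n_threads > n_frames:
        # Mirrors the rule that threads cannot outnumber frames.
        raise ValueError("number of threads cannot exceed number of frames")
    base, leftover = divmod(n_frames, n_threads)
    return [base + 1 if rank < leftover else base for rank in range(n_threads)]


if __name__ == "__main__":
    for threads in (3, 4, 5):
        print(threads, "threads:", frames_per_thread(50, threads))
```

The wall-clock time is set by the busiest thread, so comparing max(frames_per_thread(50, n)) for a few values of n is a quick way to pick a thread count that wastes no processors.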
The parallel calculation can be started either directly from the command line:

mpirun -np 4 $AMBERHOME/bin/MMPBSA.py.MPI -O -i mmpbsa.in -o FINAL_RESULTS_MMPBSA.dat -sp ras-raf_solvated.prmtop \
   -cp ras-raf.prmtop -rp ras.prmtop -lp raf.prmtop -y *.mdcrd > progress.log 2>&1

or by submitting a script to a queuing system, such as PBS, with

qsub parallel.job

The script, parallel.job, would look something like this using a bash shell:

parallel.job

#!/bin/sh
#PBS -N rasraf_parallel
#PBS -o parallel.out
#PBS -e parallel.err
#PBS -m abe
#PBS -M [email protected]
#PBS -q brute
#PBS -l nodes=1:node:ppn=4
#PBS -l pmem=900mb

cd $PBS_O_WORKDIR
mpirun -np 4 MMPBSA.py.MPI -O -i mmpbsa.in -o FINAL_RESULTS_MMPBSA.dat -sp ras-raf_solvated.prmtop \
   -cp ras-raf.prmtop -rp ras.prmtop -lp raf.prmtop -y bigprod.mdcrd > progress.log 2>&1

This will print the progress of the calculation to the file progress.log. All errors during the calculation will be printed to this file as well (that is the purpose of 2>&1). Finally, timings will be printed once the calculation has completed, showing the time taken during each step of the script.

progress.log

MMPBSA.py.MPI being run on 4 processors
ptraj found! Using /scr/arwen_3/swails/i686/amber11/exe/ptraj
sander found! Using /scr/arwen_3/swails/i686/amber11/exe/sander

Preparing trajectories with ptraj...
50 frames were read in and processed by ptraj for use in calculation.

Starting calculations

Starting gb calculation...

Starting pb calculation...

Calculations complete. Writing output file(s)...

Timing:
Processing Trajectories With Ptraj:    0.126 min.
Total GB Calculation Time (sander):    4.782 min.
Total PB Calculation Time (sander):   28.407 min.
Output File Writing Time:              0.053 min.
Total Time Taken:                     33.379 min.

MMPBSA Finished. Thank you for using. Please send any bugs/suggestions/comments to [email protected]

The '-y *.mdcrd' on the command line tells the script to read in all files in the working directory that end in '.mdcrd' and use them as the trajectories to be analyzed.
With 'keep_files' set to its default value of 1, here are all the output files: Parallel_output.tgz. The script creates three unsolvated mdcrd files (complex, receptor, and ligand) using ptraj; these are the coordinates analyzed during the GB and PB calculations. The *.mdout files contain the energies for all frames specified. A pdb file of the average structure is created if quasi-harmonic entropy calculations are performed with ptraj. All files created by MMPBSA.py.MPI should begin with the prefix '_MMPBSA_' except for the final output file, FINAL_RESULTS_MMPBSA.dat.

FINAL_RESULTS_MMPBSA.dat

| Run on Sun Feb 14 19:10:43 EST 2010
|Input file:
|--------------------------------------------------------------
|Input file for running PB and GB in serial
|&general
|   endframe=50, verbose=1,
|   mpi_cmd='mpirun -np 3', nproc=3
|/
|&gb
|   igb=2, saltcon=0.100
|/
|&pb
|   istrng=0.100,
|/
|--------------------------------------------------------------
|Solvated complex topology file: ras-raf_solvated.prmtop
|Complex topology file: ras-raf.prmtop
|Receptor topology file: ras.prmtop
|Ligand topology file: raf.prmtop
|Initial mdcrd(s): bigprod.mdcrd
|
|Best guess for receptor mask: ":1-166"
|Best guess for ligand mask: ":167-242"
|Calculations performed using 50 frames.
|Poisson Boltzmann calculations performed using internal PBSA solver in sander.
|
|All units are reported in kcal/mole.
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

GENERALIZED BORN:

Complex:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -1863.7944     17.1704      2.4283
EEL              -17200.7297     75.9366     10.7391
EGB               -3249.6511     65.2075      9.2217
ESURF                91.3565      1.3938      0.1971
G gas            -19064.5240     77.8536     11.0102
G solv            -3158.2946     65.2224      9.2238
TOTAL            -22222.8186     51.0216      7.2155

Receptor:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -1268.1888     14.2342      2.0130
EEL              -11557.0773     71.7127     10.1417
EGB               -2532.0669     57.7003      8.1600
ESURF                64.2843      1.1143      0.1576
G gas            -12825.2661     73.1118     10.3396
G solv            -2467.7826     57.7110      8.1616
TOTAL            -15293.0487     35.3527      4.9996

Ligand:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS            -529.3090      9.4198      1.3322
EEL               -4684.4720     36.1449      5.1117
EGB               -1688.9631     26.5353      3.7527
ESURF                37.0493      0.6185      0.0875
G gas             -5213.7811     37.3522      5.2824
G solv            -1651.9138     26.5425      3.7537
TOTAL             -6865.6949     25.8878      3.6611

Differences (Complex - Receptor - Ligand):
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS             -66.2966      4.2751      0.6046
EEL                -959.1803     34.9190      4.9383
EGB                 971.3789     33.0497      4.6739
ESURF                -9.9770      0.3759      0.0532
DELTA G gas       -1025.4769     35.1797      4.9752
DELTA G solv        961.4018     33.0518      4.6742
DELTA G binding =   -64.0750 +/-  6.3729      0.9013
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

POISSON BOLTZMANN:

Complex:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -1863.7944     17.1704      2.4283
EEL              -17200.7297     75.9366     10.7391
EPB               -3207.7160     66.4023      9.3907
ECAVITY              67.8762      0.7818      0.1106
G gas            -19064.5240   6061.1875    857.1813
G solv            -3139.8399     66.4069      9.3914
TOTAL             -7686.8660     52.5400      7.4303

Receptor:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS           -1268.1888     14.2342      2.0130
EEL              -11557.0773     71.7127     10.1417
EPB               -2483.7242     56.4551      7.9840
ECAVITY              47.1495      0.4737      0.0670
G gas            -12825.2661   5345.3320    755.9441
G solv            -2436.5747     56.4571      7.9842
TOTAL             -5250.2060     38.5188      5.4474

Ligand:
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS            -529.3090      9.4198      1.3322
EEL               -4684.4720     36.1449      5.1117
EPB               -1670.4169     27.6694      3.9131
ECAVITY              28.0328      0.4133      0.0584
G gas             -5213.7811   1395.1865    197.3092
G solv            -1642.3841     27.6725      3.9135
TOTAL             -2350.3020     25.1197      3.5525

Differences (Complex - Receptor - Ligand):
Energy Component     Average   Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS             -66.2966      4.2751      0.6046
EEL                -959.1803     34.9190      4.9383
EPB                 946.4251     34.5128      4.8808
ECAVITY              -7.3062      0.3004      0.0425
DELTA G gas       -1025.4769   1237.6138    175.0250
DELTA G solv        939.1189     34.5141      4.8810
DELTA G binding =   -86.3579 +/-  8.3264      1.1775
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
WARNINGS:
igb=2 should be used with mbondi2 pbradii set. Yours are modified Bondi radii (mbondi)

The beginning of the statistics file includes the date/time, a copy of the input file, the best guess for the ligand and receptor masks, the files used by the script, the number of frames analyzed, and which PB solver (if any) was used. The rest of the statistics file includes all the average energies, standard deviations, and standard error of the mean for GB followed by PB. After each section, the ΔG of binding is given along with the error values. The meaning of the different terms in this file is as follows:

VDWAALS = van der Waals contribution from MM.
EEL = electrostatic energy as calculated by the MM force field.
EPB/EGB = the electrostatic contribution to the solvation free energy calculated by PB or GB respectively.
ESURF/ECAVITY = nonpolar contribution to the solvation free energy calculated by an empirical model (ESURF in the GB section, ECAVITY in the PB section).
DELTA G binding = final estimated binding free energy calculated from the terms above (kcal/mol).

Note that the total gas phase energy has not been reported because the values of the bonded terms for the receptor and ligand should exactly cancel those for the complex using the single trajectory approach. An error message will result if the energies do not cancel to within an allowed tolerance (0.001 kcal/mol).

One would typically expect to find an extremely favorable electrostatic energy and an unfavorable solvation free energy. This reflects the energy that one has to spend to desolvate the binding partners and to align their binding interfaces. From the negative total binding free energy of -86.36 kcal/mol (PB) we see that this is a favorable protein-protein complex in pure water, but keep in mind that the result does not equal the real binding free energy since we did not estimate the (unfavorable) entropy contribution to binding. Note that the GB approach gives a less negative binding energy (about -64.1 kcal/mol) but still suggests that this is a favorable bound state.

Copyright McGee, Miller, and Swails 2009

AMBER ADVANCED TUTORIALS
TUTORIAL 3 - SECTION 3.5
MMPBSA.py
Dwight McGee, Bill Miller III, and Jason Swails

0) Setting up for normal mode calculations

Note that as of May, 2010, normal mode calculations are done via a nab program, mmpbsa_py_nabnmode, compiled by nab during the normal installation process described here.
This program must be present in the PATH or in $AMBERHOME/exe to run calculations. The main difference between mmpbsa_py_nabnmode and nmode is that the nab program can calculate normal modes in Generalized Born solvent, so two more input variables have been added (see nmode_igb and nmode_istrng in the manual). This implementation should also reduce the minimization convergence problems seen in earlier versions of the script.

1) Build the starting structure and run a simulation to obtain an equilibrated system.

The system we will model in this simulation is the protein-ligand complex between the Estrogen Receptor protein and the Raloxifene ligand. Here is a pre-prepared pdb file of the complex: Estrogen_Receptor-Raloxifene.pdb

This structure contains the Estrogen Receptor protein along with a ligand called Raloxifene that has been previously docked to the protein as illustrated in the figure below:

This system was constructed in a similar manner to the Ras-Raf system in Section 1. For directions on how to build the starting structure and run a simulation to obtain an equilibrated system, please refer to Section 1 and Section 2. Note that you must also use antechamber to get the correct parameters for raloxifene. See the Sustiva Tutorial (Basic 4) for detailed instructions. The important files from the MD simulation for calculating the binding free energy using MMPBSA.py are the topology files and the mdcrd file (Est_Rec_top_mdcrd.tgz).

2) Calculate the binding entropy using Normal Mode Analysis (normal mode)

We will now calculate the normal modes for the complex, receptor, and ligand and average the results to obtain an estimate of the binding entropy. Please note that the entropy contribution for a system can also be estimated with a quasi-harmonic entropy calculation, performed by the ptraj program in AmberTools, by uncommenting the last line in the &general section.
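Before starting a normal mode run, the requirement from step 0 (that mmpbsa_py_nabnmode be found in the PATH or in $AMBERHOME/exe) can be checked with a few lines of Python. This is only a minimal sketch: the helper name find_nabnmode is hypothetical and is not part of MMPBSA.py, and the lookup order (PATH first, then $AMBERHOME/exe) is an assumption made for illustration.

```python
import os
import shutil


def find_nabnmode(name="mmpbsa_py_nabnmode"):
    """Return the path to the normal mode program, or None if it is not found.

    Hypothetical helper for illustration; MMPBSA.py performs its own search.
    """
    # First look through the directories listed in PATH.
    found = shutil.which(name)
    if found:
        return found
    # Fall back to $AMBERHOME/exe, the other location named in the tutorial.
    amberhome = os.environ.get("AMBERHOME")
    if amberhome:
        candidate = os.path.join(amberhome, "exe", name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


location = find_nabnmode()
print(location or "mmpbsa_py_nabnmode not found in PATH or $AMBERHOME/exe")
```

If the program is reported as missing, revisit the installation step before submitting a long normal mode job.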
We will carry out the normal mode calculation using the following input file for MMPBSA.py:

mmpbsa.in

Input file for running entropy calculations using NMode
&general
   endframe=50, keep_files=2,
/
&nmode
   nmstartframe=1, nmendframe=50,
   nminterval=5, nmode_igb=1, nmode_istrng=0.1,
/

The input files for MMPBSA.py are designed to be similar to the setup of an mdin file used in the sander module of AMBER. The start of each section is designated by an ampersand (&) followed by the name of the section. Furthermore, a forward slash (/) or '&end' can be used to end the section. For a complete list of all variables please see the User's Manual here.

This input file is divided into two namelists: &general and &nmode. The &general namelist specifies variables that are not specific to any particular calculation, but apply to all calculations. In this setup the Estrogen Receptor is the receptor and Raloxifene is the ligand. The 'endframe' variable sets the frame of the mdcrd at which to stop when making the dry mdcrd files using ptraj. The 'keep_files' variable allows the user to specify which files are removed at the end of the calculation. The &nmode namelist is used to define the variables specific to the normal mode calculation. The 'nmstartframe' variable defines the frame from the dry mdcrd where normal mode analysis begins. The 'nmendframe' and 'nminterval' variables set the end frame and interval for normal mode analysis of the dry mdcrd, respectively. Note that the nmstartframe/nmendframe/nminterval variables refer to the 'trajectory' of snapshots extracted by startframe, endframe, and interval defined in &general.
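To make the frame selection concrete, the following sketch lists which snapshots of the dry trajectory the &nmode settings above would pick. It assumes that frames are numbered from 1, that nmendframe is inclusive, and that every nminterval-th frame is taken starting at nmstartframe; the function name selected_frames is hypothetical. Under these assumptions the result is 10 frames, which agrees with the 10 frames quoted for this calculation elsewhere in this section.

```python
def selected_frames(start, end, interval):
    """Return 1-based frame numbers from start to end (inclusive), every interval-th frame.

    Assumption: inclusive end frame and stepping from the start frame.
    """
    return list(range(start, end + 1, interval))


# Values taken from the &nmode namelist in mmpbsa.in.
frames = selected_frames(start=1, end=50, interval=5)
print(frames)
print(len(frames), "frames selected for normal mode analysis")
```

Because each normal mode calculation is expensive, nminterval is the main lever for trading statistical sampling against run time.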
Thus, only a subset of these frames will be chosen for normal mode calculations (at most, each frame in _MMPBSA_complex.mdcrd).

The calculation can be started directly from the command line:

$AMBERHOME/bin/MMPBSA.py -O -i mmpbsa.in -o FINAL_RESULTS_MMPBSA.dat -sp 1err.solvated.prmtop -cp complex.prmtop -rp receptor.prmtop -lp ligand.prmtop -y *.mdcrd > progress.log

or by submitting a script to a queuing system, such as PBS, with

qsub nmode.job

The script, nmode.job, would look something like this:

nmode.job

#!/bin/sh
#PBS -N nmode
#PBS -o nmode.out
#PBS -e nmode.err
#PBS -m abe
#PBS -M [email protected]
#PBS -q brute
#PBS -l nodes=1:surg:ppn=3
#PBS -l pmem=1450mb

cd $PBS_O_WORKDIR
mpirun -np 3 MMPBSA.py.MPI -O -i mmpbsa_nm.in -o FINAL_RESULTS_MMPBSA.dat -sp 1err.solvated.prmtop -cp complex.prmtop \
-rp receptor.prmtop -lp ligand.prmtop -y 1err_prod.mdcrd > progress.log

This will print the progress of the calculation to the log file named after the '>' redirection (an example log, progress_nm.log, is shown below). All errors during the calculation will be printed to this file as well. Finally, timings will be printed once the calculation has completed, showing the time taken during each step of the script.

Running this calculation will take approximately 40 hours on 10 frames if run in serial. Note that MMPBSA.py.MPI can be used to run in parallel, which divides the frames evenly among the threads started (see Section 3.4 for details). The size of your system significantly affects the computation time and required memory (RAM). Segmentation faults (segfaults) are usually caused by insufficient RAM in your system.

progress_nm.log

ptraj found! Using /share/local/lib/amber10/i686/bin/ptraj
sander found! Using /share/local/lib/amber10/i686/bin/sander (serial only!)
nmode found! Using /share/local/lib/amber10/i686/bin/nmode

Preparing trajectories with ptraj...
50 frames were read in and processed by ptraj for use in calculation.

Starting sander calls

Starting nmode calculations...
Timing:
Processing Trajectories With Ptraj:        0.240 min.
Total Harmonic nmode Calculation Time:  2363.906 min.
Output File Writing Time:                  0.018 min.
Total Time Taken:                       2364.165 min.

MMPBSA Finished. Thank you for using. Please send any bugs/suggestions/comments to [email protected]

The '-y *.mdcrd' on the command line tells the script to read in all files in the working directory that end in '.mdcrd' and use them as the trajectories to be analyzed.

With 'keep_files' set to 2, here are all the output files: Nmode_output.tgz. The script creates three unsolvated mdcrd files (complex, receptor, and ligand) using ptraj that are the coordinates analyzed during the entropy calculations. Normal mode calculations are performed using PDB files for each snapshot, so 10 PDB files are created from the dry mdcrd file for the complex, receptor, and ligand. An average pdb file is created as an average structure for aligning the trajectory for a quasi-harmonic entropy calculation performed by ptraj if entropy == 1. All files created by MMPBSA.py should begin with the prefix '_MMPBSA_' except for the final output file: FINAL_RESULTS_MMPBSA.dat

FINAL_RESULTS_MMPBSA.dat

| Run on Thu Feb 11 12:44:26 EST 2010
|Input file:
|--------------------------------------------------------------
|Input file for running entropy calculations using NMode
|&general
|   endframe=50, keep_files=2,
|/
|&nmode
|   nmstartframe=1, nmendframe=50,
|   nminterval=5,
|/
|--------------------------------------------------------------
|Solvated complex topology file:  1err.solvated.prmtop
|Complex topology file:           complex.prmtop
|Receptor topology file:          receptor.prmtop
|Ligand topology file:            ligand.prmtop
|Initial mdcrd(s):                1err_prod.mdcrd
|
|Best guess for receptor mask:   ":1-240"
|Best guess for ligand mask:     ":241"
|Ligand residue name is "RAL"
|
|Calculations performed using 50 frames.
|Poisson Boltzmann calculations performed using internal PBSA solver in sander.
|
|All units are reported in kcal/mole.
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

ENTROPY RESULTS (HARMONIC APPROXIMATION) CALCULATED WITH NMODE:

Complex:
Entropy Term              Average      Std. Dev.
-----------------------------------------------------------
Translational:            16.9389         0.0000
Rotational:               17.3953         0.0038
Vibrational:            2784.7967         2.3982
Total:                  2819.1307         2.3964

Receptor:
Entropy Term              Average      Std. Dev.
-----------------------------------------------------------
Translational:            16.9233         0.0000
Rotational:               17.3911         0.0045
Vibrational:            2755.5693         3.0352
Total:                  2789.8840         3.0342

Ligand:
Entropy Term              Average      Std. Dev.
-----------------------------------------------------------
Translational:            13.2972         0.0000
Rotational:               11.4991         0.0496
Vibrational:              33.7549         0.0442
Total:                    58.5511         0.0058

DELTA S total=           -30.0192 +/-     3.4064

NOTE: All entropy results have units kcal/mol. (Temperature has already been multiplied in as 300. K)
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

The beginning of the output file includes various details about the calculation. The rest of the output file includes the averages and standard deviations for each of the translational, rotational, vibrational, and total entropy contributions. After those sections, the ΔS is given along with its error value. One would typically expect to find a negative ΔS value for a biological complex. This reflects the decrease in available microstates as the protein and ligand bind to make the complex. The decrease in available microstates mainly arises from the ligand being trapped and having limited mobility while bound to the protein.
From the negative ΔS value of -30.02 kcal/mol (the temperature of 300 K has already been multiplied in) we see that binding is entropically unfavorable for this protein-ligand complex in pure water, but keep in mind that the result does not equal the real binding free energy since we did not estimate the (favorable) enthalpic contribution to binding.

Copyright McGee, Miller, and Swails 2009

AMBER ADVANCED TUTORIALS
TUTORIAL 3 - SECTION 3.6
Python Script MMPBSA.py
Dwight McGee, Bill Miller III, and Jason Swails

The important files for calculating the binding free energy using MMPBSA.py are the topology files and the mdcrd file (ras-raf_top_mdcrd.tgz).

Decomposing the free energy contributions to the binding free energy of Ras-Raf on a per-residue or pairwise per-residue basis (amber11 only!)

We will now perform free energy decomposition on the Ras-Raf system demonstrated in Section 3.1. Amber supports two types of decomposition: pairwise and per-residue. Per-residue decomposition calculates the energy contribution of a single residue by summing its interactions over all residues in the system. Pairwise decomposition calculates the interaction energy between pairs of residues in the system. We will carry out examples of both types below. Note that obtaining DELTA contributions on a per-residue basis will ONLY work if MMPBSA.py correctly guesses your mask. You will have to manually add the residue cards to the input files if you input your own masks.

a) Per-residue free energy decomposition

To run decomposition, the &decomp namelist must be specified in the input file for MMPBSA.py. Furthermore, the variable idecomp must be specified (there is no default value).
Failure to assign a value to this variable will result in the program terminating with an informative error message. There are 4 allowed values for idecomp, two of them for per-residue decomposition and the other two for pairwise decomposition. The values 1 and 2 result in a per-residue decomposition scheme (the values 3 and 4 select the corresponding pairwise schemes). Selecting 1 will add the 1-4 non-bonded interaction energies (1-4 EEL and 1-4 VDW) to the internal potential terms. Selecting 2 will add the 1-4 EEL interaction energies to the electrostatic potential term and the 1-4 VDW interaction energies to the van der Waals potential term. The following MMPBSA.py input file will be used to perform per-residue decomposition using both PB and GB implicit solvent models (NOTE: PB nonpolar solvation energies are currently not decomposable):

mmpbsa_per_res_decomp.in

Per-residue GB and PB decomposition
&general
   endframe=50, verbose=1,
/
&gb
   igb=5, saltcon=0.100,
/
&pb
   istrng=0.100,
/
&decomp
   idecomp=1, print_res="5; 30-40; 170-200"
   dec_verbose=1,
/

The input files for MMPBSA.py are designed to be similar to the setup of an mdin file used in the sander module of AMBER. The start of each namelist is designated by an ampersand (&) followed by the name of the namelist. Furthermore, a forward slash (/) or '&end' can be used to end the namelist. For a complete list of all variables please see the User's Manual here.

This input file is divided into four namelists: &general, &gb, &pb, and &decomp. The &general namelist is designed to specify variables that are not specific to a particular part of the calculation, but apply to all parts. In this setup we have defined RAS to be the receptor and RAF to be the ligand. The 'endframe' variable sets the frame of the mdcrd at which to stop. The '&gb' and '&pb' namelist markers let the script know to perform MM-GBSA and MM-PBSA calculations with the given values defined within those namelists.
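The print_res string in the &decomp namelist above selects the residues that are reported. The sketch below expands such a string into an explicit residue list so the size of the selection can be checked before a run. It assumes that items are separated by semicolons and that a range such as 30-40 is inclusive, which is consistent with the residues that appear in the decomposition output for this input; the helper expand_print_res is hypothetical and is not part of MMPBSA.py, and the manual remains the authority on print_res syntax.

```python
def expand_print_res(spec):
    """Expand a print_res-style string such as "5; 30-40; 170-200" into residue numbers.

    Assumptions: semicolon-separated items, inclusive "first-last" ranges.
    """
    residues = []
    for item in spec.split(";"):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            first, last = (int(part) for part in item.split("-"))
            residues.extend(range(first, last + 1))
        else:
            residues.append(int(item))
    return residues


residues = expand_print_res("5; 30-40; 170-200")
print(len(residues), "residues selected")
# For pairwise decomposition every selected residue is paired with every
# selected residue (including itself), so the number of rows grows as n squared.
print(len(residues) ** 2, "residue pairs for a pairwise run")
```

Keeping this count small matters most for the pairwise decomposition in part b), where the number of evaluated terms scales with the square of the number of selected residues.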
The 'verbose' variable allows the user to specify how much output is written to the output file, while the 'dec_verbose' variable allows the user to specify how much output is written to the decomp output file.

The calculation is started with the following command:

$AMBERHOME/bin/MMPBSA.py -O -i mmpbsa.in -o FINAL_RESULTS_MMPBSA.dat -do FINAL_DECOMP_MMPBSA.dat -sp ras-raf_solvated.prmtop -cp ras-raf.prmtop -rp ras.prmtop -lp raf.prmtop -y *.mdcrd

Note that this can be run in parallel using MMPBSA.py.MPI. See Section 3.4 for more details.

This will run the script interactively and print the progress of the calculation to STDOUT and any errors or warnings to STDERR. Finally, timings will be printed once the calculation has completed, showing the time taken during each step of the calculation. Command-line arguments can be given with shell-recognized wildcards (e.g. * and ? for bash). For example, the '-y *.mdcrd' on the command line tells the script to read in all files in the working directory that end with '.mdcrd' and use them as the trajectories to be analyzed.

Here are all the output files created by this script: per_res_output.tgz. The script creates three unsolvated mdcrd files (complex, receptor, and ligand) using ptraj that are the coordinates analyzed during the GB and PB calculations. The *.mdout files contain the energies for all frames specified. An average pdb file is created as an average structure for minimization if entropy calculations are performed.
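When the command is launched from a Python driver script rather than typed into a shell, the '*.mdcrd' wildcard is not expanded automatically. The sketch below assembles the same argument list as the command shown above and expands the wildcard with glob. It is an illustration only: the function names are hypothetical, the flags and file names are copied from this tutorial's command line, and nothing is executed.

```python
import glob
import shlex


def build_mmpbsa_command(input_file, traj_pattern="*.mdcrd"):
    """Assemble the MMPBSA.py argument list, expanding the trajectory wildcard.

    Hypothetical helper; flags and file names mirror the tutorial command.
    """
    # A shell would expand '*.mdcrd' itself; glob does the same job here.
    trajectories = sorted(glob.glob(traj_pattern))
    if not trajectories:
        raise FileNotFoundError("no trajectory file matches " + traj_pattern)
    return [
        "MMPBSA.py", "-O",
        "-i", input_file,
        "-o", "FINAL_RESULTS_MMPBSA.dat",
        "-do", "FINAL_DECOMP_MMPBSA.dat",
        "-sp", "ras-raf_solvated.prmtop",
        "-cp", "ras-raf.prmtop",
        "-rp", "ras.prmtop",
        "-lp", "raf.prmtop",
        "-y", *trajectories,
    ]


def as_shell_line(args):
    """Quote each argument so the list can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    try:
        print(as_shell_line(build_mmpbsa_command("mmpbsa.in")))
    except FileNotFoundError as err:
        print(err)
```

Sorting the matched files keeps the trajectory order reproducible, which matters when several mdcrd files are concatenated for analysis.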
All files created by MMPBSA.py should begin with the prefix '_MMPBSA_' except for the final output files: FINAL_RESULTS_MMPBSA.dat and FINAL_DECOMP_MMPBSA.dat

FINAL_RESULTS_MMPBSA.dat

| Run on Thu May 20 14:55:43 EDT 2010
|Input file:
|--------------------------------------------------------------
|Per-residue GB and PB decomposition
|&general
|   endframe=50, verbose=1,
|/
|&gb
|   igb=5, saltcon=0.100,
|/
|&pb
|   istrng=0.100,
|/
|&decomp
|   idecomp=1, print_res="5; 30-40; 170-200"
|   dec_verbose=1,
|/
|--------------------------------------------------------------
|Complex topology file:           ras-raf.prmtop
|Receptor topology file:          ras.prmtop
|Ligand topology file:            raf.prmtop
|Initial mdcrd(s):                prod.mdcrd
|
|Best guess for receptor mask:   ":1-166"
|Best guess for ligand mask:     ":167-242"
|Calculations performed using 50 frames.
|Poisson Boltzmann calculations performed using internal PBSA solver in sander.
|
|All units are reported in kcal/mole.
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

GENERALIZED BORN:

Complex:
Energy Component            Average      Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                  -1863.7944        16.9979              2.4039
EEL                     -17200.7297        75.1734             10.6311
EGB                      -2918.9628        65.1000              9.2065
ESURF                       92.2138         0.9782              0.1383

G gas                   -19064.5240        77.0712             10.8995
G solv                   -2826.7490        65.1073              9.2076

TOTAL                   -21891.2730        52.3724              7.4066

Receptor:
Energy Component            Average      Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                  -1268.1888        14.0912              1.9928
EEL                     -11557.0773        70.9920             10.0398
EGB                      -2314.8693        56.2410              7.9537
ESURF                       64.4513         0.6128              0.0867

G gas                   -12825.2661        72.3770             10.2356
G solv                   -2250.4181        56.2443              7.9542

TOTAL                   -15075.6842        36.8322              5.2089

Ligand:
Energy Component            Average      Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                   -529.3090         9.3251              1.3188
EEL                      -4684.4720        35.7816              5.0603
EGB                      -1587.3051        26.8494              3.7971
ESURF                       38.5992         0.5158              0.0730

G gas                    -5213.7811        36.9768              5.2293
G solv                   -1548.7058        26.8544              3.7978

TOTAL                    -6762.4869        26.1943              3.7044

Differences (Complex - Receptor - Ligand):
Energy Component            Average      Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                    -66.2966         4.2321              0.5985
EEL                       -959.1803        34.5681              4.8887
EGB                        983.2116        33.0175              4.6694
ESURF                      -10.8367         0.3832              0.0542

DELTA G gas              -1025.4769        34.8262              4.9252
DELTA G solv               972.3749        33.0197              4.6697

DELTA G binding =          -53.1020 +/-     6.8437              0.9678
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

POISSON BOLTZMANN:

Complex:
Energy Component            Average      Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                  -1863.7944        16.9979              2.4039
EEL                     -17200.7297        75.1734             10.6311
EPB                      -3216.4587        65.8638              9.3146
ECAVITY                     67.8762         0.7739              0.1094

G gas                   -19064.5240        77.0712             10.8995
G solv                   -3148.5825        65.8684              9.3152

TOTAL                   -22213.1066        51.7402              7.3172

Receptor:
Energy Component            Average      Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                  -1268.1888        14.0912              1.9928
EEL                     -11557.0773        70.9920             10.0398
EPB                      -2489.5955        55.9343              7.9103
ECAVITY                     47.1495         0.4689              0.0663

G gas                   -12825.2661        72.3770             10.2356
G solv                   -2442.4460        55.9363              7.9106

TOTAL                   -15267.7121        38.0243              5.3774

Ligand:
Energy Component            Average      Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                   -529.3090         9.3251              1.3188
EEL                      -4684.4720        35.7816              5.0603
EPB                      -1673.2574        27.4055              3.8757
ECAVITY                     28.0328         0.4091              0.0579

G gas                    -5213.7811        36.9768              5.2293
G solv                   -1645.2246        27.4085              3.8761

TOTAL                    -6859.0057        24.7882              3.5056

Differences (Complex - Receptor - Ligand):
Energy Component            Average      Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                    -66.2966         4.2321              0.5985
EEL                       -959.1803        34.5681              4.8887
EPB                        946.3942        34.1674              4.8320
ECAVITY                     -7.3062         0.2973              0.0420

DELTA G gas              -1025.4769        34.8262              4.9252
DELTA G solv               939.0881        34.1687              4.8322

DELTA G binding =          -86.3888 +/-     8.1817              1.1571
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

WARNINGS:
igb=5 should be used with either mbondi2 or bondi pbradii set. Yours are modified Bondi radii (mbondi)

FINAL_DECOMP_MMPBSA.dat

| Run on Thu May 20 14:55:43 EDT 2010
idecomp = 1: Decomposition per-residue adding 1-4 interactions added to Internal.

Energy Decomposition Analysis (All units kcal/mol): Generalized Born solvent

DELTAS:
Total Energy Decomposition:
Residue | Location | Internal | van der Waals | Electrostatic | Polar Solvation | Non-Polar Solv. | TOTAL
-------------------------------------------------------------------------------------------------------------------------------------------------------
LYS 5 | R LYS 5 | 0.000 +/- 4.870 | -0.156 +/- 1.465 | 69.267 +/- 9.154 | -67.061 +/- 9.601 | -0.009 +/- 0.156 | 2.040 +/- 14.208
ASP 30 | R ASP 30 | 0.000 +/- 5.623 | -0.065 +/- 0.961 | -52.559 +/- 11.072 | 52.622 +/- 9.530 | 0.000 +/- 0.084 | -0.003 +/- 15.684
GLU 31 | R GLU 31 | 0.000 +/- 5.174 | -0.247 +/- 1.099 | -79.946 +/- 10.550 | 80.630 +/- 9.693 | -0.227 +/- 0.125 | 0.210 +/- 15.272
TYR 32 | R TYR 32 | 0.000 +/- 4.615 | -0.290 +/- 1.515 | 0.639 +/- 4.431 | -0.076 +/- 3.229 | -0.012 +/- 0.175 | 0.261 +/- 7.327
ASP 33 | R ASP 33 | 0.000 +/- 4.464 | -0.556 +/- 1.073 | -103.116 +/- 5.820 | 103.788 +/- 5.821 | -0.459 +/- 0.094 | -0.343 +/- 9.426
PRO 34 | R PRO 34 | 0.000 +/- 3.388 | -1.829 +/- 0.869 | -3.383 +/- 2.647 | 3.854 +/- 1.944 | -0.308 +/- 0.130 | -1.666 +/- 4.800
THR 35 | R THR 35 | 0.000 +/- 5.702 | -1.829 +/- 1.049 | 0.376 +/- 4.365 | 0.947 +/- 2.028 | -0.204 +/- 0.070 | -0.709 +/- 7.536
ILE 36 | R ILE 36 | 0.000 +/- 4.650 | -2.987 +/- 1.918 | 0.092 +/- 2.149 | 0.991 +/- 0.697 | -0.377 +/- 0.072 | -2.282 +/- 5.515
GLU 37 | R GLU 37 | 0.000 +/- 5.221 | -1.627 +/- 1.388 | -126.728 +/- 6.441 | 120.528 +/- 4.686 | -0.745 +/- 0.048 | -8.573 +/- 9.624
ASP 38 | R ASP 38 | 0.000 +/- 3.750 | -1.583 +/- 1.560 | -104.899 +/- 6.925 | 99.370 +/- 5.710 | -0.254 +/- 0.037 | -7.367 +/- 9.852
SER 39 | R SER 39 | 0.000 +/- 3.447 | -2.184 +/- 1.086 | -13.696 +/- 3.959 | 8.918 +/- 1.800 | -0.504 +/- 0.035 | -7.466 +/- 5.655
TYR 40 | R TYR 40 | 0.000 +/- 4.687 | -4.403 +/- 1.682 | -3.076 +/- 2.884 | 1.652 +/- 1.092 | -0.366 +/- 0.042 | -6.193 +/- 5.858
ARG 170 | L ARG 4 | 0.000 +/- 4.987 | -0.094 +/- 1.646 | -86.951 +/- 10.352 | 82.074 +/- 6.005 | -0.147 +/- 0.073 | -5.118 +/- 13.069
VAL 171 | L VAL 5 | 0.000 +/- 3.812 | -0.183 +/- 1.390 | 2.128 +/- 2.460 | -2.010 +/- 0.555 | 0.000 +/- 0.008 | -0.065 +/- 4.778
PHE 172 | L PHE 6 | 0.000 +/- 4.289 | -0.217 +/- 0.944 | 0.037 +/- 1.743 | 0.132 +/- 0.939 | 0.000 +/- 0.064 | -0.048 +/- 4.818
LEU 173 | L LEU 7 | 0.000 +/- 4.907 | -0.398 +/- 1.241 | -0.940 +/- 3.050 | 1.683 +/- 1.446 | 0.000 +/- 0.022 | 0.345 +/- 6.084
PRO 174 | L PRO 8 | 0.000 +/- 3.433 | -0.188 +/- 1.422 | 2.303 +/- 3.219 | -2.589 +/- 1.289 | 0.000 +/- 0.051 | -0.474 +/- 5.083
ASN 175 | L ASN 9 | 0.000 +/- 4.796 | -1.671 +/- 1.017 | -1.833 +/- 4.740 | 4.535 +/- 2.409 | -0.354 +/- 0.119 | 0.678 +/- 7.233
LYS 176 | L LYS 10 | 0.000 +/- 4.403 | -1.848 +/- 0.810 | -33.879 +/- 7.269 | 36.798 +/- 6.704 | -0.315 +/- 0.107 | 0.756 +/- 10.856
GLN 177 | L GLN 11 | 0.000 +/- 4.261 | -3.791 +/- 1.560 | -1.910 +/- 3.338 | 4.530 +/- 2.016 | -0.359 +/- 0.050 | -1.530 +/- 5.983
ARG 178 | L ARG 12 | 0.000 +/- 6.180 | -2.462 +/- 1.321 | -77.671 +/- 6.496 | 73.669 +/- 4.608 | -0.386 +/- 0.076 | -6.850 +/- 10.167
THR 179 | L THR 13 | 0.000 +/- 4.716 | -1.277 +/- 1.200 | -10.976 +/- 3.020 | 9.344 +/- 0.977 | -0.158 +/- 0.031 | -3.068 +/- 5.810
VAL 180 | L VAL 14 | 0.000 +/- 4.196 | -3.837 +/- 1.389 | -3.014 +/- 2.541 | 2.972 +/- 0.804 | -0.501 +/- 0.041 | -4.379 +/- 5.161
VAL 181 | L VAL 15 | 0.000 +/- 4.333 | -1.791 +/- 1.119 | -3.565 +/- 2.809 | 3.472 +/- 0.656 | -0.155 +/- 0.055 | -2.039 +/- 5.324
ASN 182 | L ASN 16 | 0.000 +/- 4.282 | -1.978 +/- 0.859 | -3.199 +/- 5.507 | 3.645 +/- 2.886 | -0.369 +/- 0.085 | -1.900 +/- 7.598
VAL 183 | L VAL 17 | 0.000 +/- 4.088 | -0.187 +/- 1.149 | 1.057 +/- 4.557 | -0.672 +/- 2.388 | 0.000 +/- 0.073 | 0.199 +/- 6.671
ARG 184 | L ARG 18 | 0.000 +/- 4.797 | -0.183 +/- 1.450 | -90.812 +/- 7.977 | 87.306 +/- 6.336 | -0.335 +/- 0.109 | -4.023 +/- 11.353
ASN 185 | L ASN 19 | 0.000 +/- 5.744 | -0.018 +/- 0.966 | -0.268 +/- 7.498 | 0.303 +/- 4.029 | 0.000 +/- 0.099 | 0.017 +/- 10.315
GLY 186 | L GLY 20 | 0.000 +/- 2.371 | -0.008 +/- 0.701 | -0.334 +/- 2.324 | 0.379 +/- 1.810 | 0.000 +/- 0.057 | 0.037 +/- 3.846
MET 187 | L MET 21 | 0.000 +/- 3.770 | -0.156 +/- 1.254 | -1.692 +/- 3.999 | 1.697 +/- 2.588 | -0.031 +/- 0.089 | -0.181 +/- 6.204
SER 188 | L SER 22 | 0.000 +/- 5.828 | -0.013 +/- 1.008 | 2.808 +/- 4.910 | -2.793 +/- 1.893 | 0.000 +/- 0.061 | 0.002 +/- 7.917
LEU 189 | L LEU 23 | 0.000 +/- 4.943 | -0.021 +/- 1.312 | 1.683 +/- 2.195 | -1.464 +/- 0.671 | 0.000 +/- 0.013 | 0.197 +/- 5.606
HIP 190 | L HIP 24 | 0.000 +/- 5.252 | -0.024 +/- 1.131 | -43.617 +/- 5.567 | 43.652 +/- 4.925 | 0.000 +/- 0.083 | 0.011 +/- 9.172
ASP 191 | L ASP 25 | 0.000 +/- 3.724 | -0.058 +/- 0.723 | 62.413 +/- 8.165 | -61.719 +/- 8.199 | 0.000 +/- 0.107 | 0.636 +/- 12.178
CYS 192 | L CYS 26 | 0.000 +/- 5.318 | -0.098 +/- 1.398 | 1.937 +/- 3.894 | -1.552 +/- 1.485 | 0.000 +/- 0.042 | 0.287 +/- 6.900
LEU 193 | L LEU 27 | 0.000 +/- 4.324 | -0.108 +/- 1.390 | 0.884 +/- 2.119 | -0.740 +/- 0.648 | 0.000 +/- 0.006 | 0.036 +/- 5.054
... cut off 250 lines

The beginning of the output file lists general details about the calculation. The rest of the output file includes all the average energies, standard deviations, and standard error of the mean for GB followed by PB. After each section, the ΔG of binding is given along with the error values. The meaning of the different terms in this file is as follows:

VDWAALS = van der Waals contribution from MM.
EEL = electrostatic energy.
EPB/EGB = the electrostatic contribution to the solvation free energy calculated by PB or GB respectively.
ESURF/ECAVITY/ENPOLAR = nonpolar contribution to the solvation free energy calculated by an empirical model.
DELTA G binding = final estimated binding free energy calculated from the terms above (kcal/mol).

The FINAL_DECOMP_MMPBSA.dat output file contains information regarding the interaction of each residue with the rest of the system broken down into component parts: internal (potential terms consisting of bond, angle, dihedral, and 1-4 interactions for idecomp=1), van der Waals (VDW and 1-4 VDW for idecomp=2), electrostatic (EEL and 1-4 EEL for idecomp=2), polar solvation, and non-polar solvation. This file is broken down into several sections as described below:

The decomposition energies for each residue in the complex, receptor, ligand, and DELTA (defined by complex - receptor - ligand) are printed in their own section. Furthermore, each of these is further broken down into backbone, sidechain, and total contributions to the decomposition energies. "Backbone" is the interaction energy between the backbone atoms and every other atom in the system. "Sidechain" is the interaction energy between the sidechain atoms and every other atom in the system. "Total" is the interaction energy between every atom in the residue and every other atom in the system (and is thus the sum of the Backbone and Sidechain values for that residue). Each residue term is broken down into its component parts, described above, with the average value of the interaction +/- the standard deviation of that term. The DELTA section contains an extra column, called "Location", that lists where the specific residue in the complex is found ('R' for receptor, 'L' for ligand). The variable dec_verbose controls how much is printed to the decomp output file (see the manual for details).

b) Pairwise per-residue free energy decomposition

NOTE: In our experience, pairwise decomposition analysis done with the PB implicit solvent model takes a very long time to complete. The analysis below, of 50 frames using both GB and PB, took 61 hours on 9 processors (9 separate, 32-bit, single-core 2.8 GHz Xeon processors).
The GB analysis took 3 minutes, so if you choose to do PB pairwise decomposition, be prepared for a lengthy run time.

In this section, we will modify the input file to perform pairwise per-residue energy decomposition. This is mostly the same as the per-residue section above, with slight differences. The pairwise decomposition input file is shown below:

mmpbsa_pairwise_decomp.in

    Pairwise GB and PB decomposition
    &general
      endframe=50, verbose=1,
    /
    &gb
      igb=5, saltcon=0.100,
    /
    &pb
      istrng=0.100,
    /
    &decomp
      idecomp=3, print_res="5; 30-40; 170-200"
      dec_verbose=0,
    /

The same command is used to start MMPBSA.py as was used for per-residue decomposition. However, more care must be taken in defining print_res in the &decomp namelist for pairwise decomposition. The number of terms that need to be evaluated for pairwise decomposition scales as n^2, where n is the number of residues specified by print_res. By default, print_res corresponds to every residue in the complex, which for Ras-Raf will create a decomp output file of around 65 MB (over 450,000 lines). Moreover, the mdout files created by sander will also be very large (several GB, depending on how many frames and pairs are analyzed), and the memory and time requirements for the parser become substantial (i.e. it may take several minutes just to parse the output). The pairs calculated are those formed by each residue specified in print_res with every residue specified in print_res, including itself. See the manual for a description of the print_res syntax. Part of the output file is shown below:

FINAL_DECOMP_MMPBSA.dat

| Run on Sun May 23 05:36:28 EDT 2010
idecomp = 3: Pairwise decomposition adding 1-4 interactions added to Internal.
Pairwise Energy Decomposition Analysis (All units kcal/mol): Generalized Born solvent

DELTAS:
Total Energy Decomposition:
Resid 1 | Resid 2 | Internal | van der Waals | Electrostatic | Polar Solvation | Non-Polar Solv. | TOTAL
------------------------------------------------------------------------------------------------------------------------
LYS 5 | LYS 5 | 0.000 +/- 0.000 | 0.000 +/- 1.075 | 0.000 +/- 3.408 | 1.601 +/- 7.229 | 0.000 +/- 0.051 | 1.601 +/- 8.064
LYS 5 | ASP 30 | 0.000 +/- 0.000 | 0.000 +/- 0.000 | 0.000 +/- 0.341 | -0.000 +/- 0.339 | 0.000 +/- 0.000 | -0.000 +/- 0.480
LYS 5 | GLU 31 | 0.000 +/- 0.000 | 0.000 +/- 0.000 | 0.000 +/- 0.350 | 0.000 +/- 0.348 | 0.000 +/- 0.000 | 0.000 +/- 0.494
LYS 5 | TYR 32 | 0.000 +/- 0.000 | 0.000 +/- 0.001 | 0.000 +/- 0.071 | 0.000 +/- 0.070 | 0.000 +/- 0.000 | 0.000 +/- 0.099
LYS 5 | ASP 33 | 0.000 +/- 0.000 | 0.000 +/- 0.000 | 0.000 +/- 0.434 | 0.000 +/- 0.431 | 0.000 +/- 0.000 | 0.000 +/- 0.612
LYS 5 | PRO 34 | 0.000 +/- 0.000 | 0.000 +/- 0.000 | 0.000 +/- 0.064 | 0.000 +/- 0.064 | 0.000 +/- 0.000 | 0.000 +/- 0.090
LYS 5 | THR 35 | 0.000 +/- 0.000 | 0.000 +/- 0.000 | 0.000 +/- 0.189 | 0.008 +/- 0.183 | 0.000 +/- 0.000 | 0.008 +/- 0.262
LYS 5 | ILE 36 | 0.000 +/- 0.000 | 0.000 +/- 0.001 | 0.000 +/- 0.120 | 0.021 +/- 0.130 | 0.000 +/- 0.000 | 0.021 +/- 0.177
... cut 1800 lines
ARG 200 | LEU 197 | 0.000 +/- 0.000 | 0.000 +/- 0.423 | 0.000 +/- 0.779 | -0.226 +/- 0.442 | 0.000 +/- 0.022 | -0.226 +/- 0.991
ARG 200 | LYS 198 | 0.000 +/- 0.000 | 0.000 +/- 0.284 | 0.000 +/- 0.793 | 0.237 +/- 0.572 | 0.000 +/- 0.011 | 0.237 +/- 1.018
ARG 200 | VAL 199 | 0.000 +/- 0.000 | 0.000 +/- 0.291 | 0.000 +/- 0.832 | 1.460 +/- 0.587 | 0.000 +/- 0.019 | 1.460 +/- 1.059
ARG 200 | ARG 200 | 0.000 +/- 0.000 | 0.000 +/- 0.562 | 0.000 +/- 3.888 | 14.394 +/- 2.388 | -0.000 +/- 0.035 | 14.394 +/- 4.598

idecomp = 3: Pairwise decomposition adding 1-4 interactions added to Internal.
Pairwise Energy Decomposition Analysis (All units kcal/mol): Poisson Boltzmann solvent

DELTAS:
Total Energy Decomposition:
Resid 1 | Resid 2 | Internal | van der Waals | Electrostatic | Polar Solvation | Non-Polar Solv. | TOTAL
------------------------------------------------------------------------------------------------------------------------
LYS 5 | LYS 5 | 0.000 +/- 0.000 | 0.000 +/- 1.075 | 0.000 +/- 3.408 | 0.670 +/- 7.128 | 0.000 +/- 0.000 | 0.670 +/- 7.974
LYS 5 | ASP 30 | 0.000 +/- 0.000 | 0.000 +/- 0.000 | 0.000 +/- 0.341 | 0.007 +/- 0.334 | 0.000 +/- 0.000 | 0.007 +/- 0.477
LYS 5 | GLU 31 | 0.000 +/- 0.000 | 0.000 +/- 0.000 | 0.000 +/- 0.350 | 0.009 +/- 0.344 | 0.000 +/- 0.000 | 0.009 +/- 0.491
LYS 5 | TYR 32 | 0.000 +/- 0.000 | 0.000 +/- 0.001 | 0.000 +/- 0.071 | 0.002 +/- 0.069 | 0.000 +/- 0.000 | 0.002 +/- 0.098
LYS 5 | ASP 33 | 0.000 +/- 0.000 | 0.000 +/- 0.000 | 0.000 +/- 0.434 | 0.007 +/- 0.423 | 0.000 +/- 0.000 | 0.007 +/- 0.606
LYS 5 | PRO 34 | 0.000 +/- 0.000 | 0.000 +/- 0.000 | 0.000 +/- 0.064 | -0.000 +/- 0.063 | 0.000 +/- 0.000 | -0.000 +/- 0.090
LYS 5 | THR 35 | 0.000 +/- 0.000 | 0.000 +/- 0.000 | 0.000 +/- 0.189 | 0.024 +/- 0.175 | 0.000 +/- 0.000 | 0.024 +/- 0.257
LYS 5 | ILE 36 | 0.000 +/- 0.000 | 0.000 +/- 0.001 | 0.000 +/- 0.120 | 0.009 +/- 0.102 | 0.000 +/- 0.000 | 0.009 +/- 0.158
LYS 5 | GLU 37 | 0.000 +/- 0.000 | 0.000 +/- 0.008 | 0.000 +/- 1.649 | -0.223 +/- 1.624 | 0.000 +/- 0.000 | -0.223 +/- 2.315
LYS 5 | ASP 38 | 0.000 +/- 0.000 | 0.000 +/- 0.009 | 0.000 +/- 1.802 | -0.191 +/- 1.383 | 0.000 +/- 0.000 | -0.191 +/- 2.272
... cut 1800 lines

Note that the FINAL_RESULTS_MMPBSA.dat file will be exactly the same as the one for the per-residue decomposition, since the energy decomposition scheme does not affect the total values. Thus, to avoid redundancy, that file is omitted here.
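The n^2 scaling of pairwise decomposition can be estimated before launching a run. The following is a minimal sketch in Python, not part of MMPBSA.py: the helper names expand_print_res and pairwise_terms are hypothetical, and the parser assumes print_res is a semicolon- or comma-separated list of single residues and dash-delimited ranges, as in the example input above (consult the manual for the authoritative syntax).

```python
def expand_print_res(spec):
    """Expand a print_res-style string such as "5; 30-40; 170-200"
    into a list of residue numbers (assumed syntax, see lead-in)."""
    residues = []
    for field in spec.replace(",", ";").split(";"):
        field = field.strip()
        if not field:
            continue
        if "-" in field:
            first, last = (int(x) for x in field.split("-"))
            residues.extend(range(first, last + 1))  # ranges are inclusive
        else:
            residues.append(int(field))
    return residues


def pairwise_terms(spec):
    """Number of residue pairs evaluated per section: n * n,
    counting both orderings and each residue paired with itself."""
    n = len(expand_print_res(spec))
    return n * n


spec = "5; 30-40; 170-200"
n_residues = len(expand_print_res(spec))
n_pairs = pairwise_terms(spec)
print(n_residues, "residues ->", n_pairs, "pairs per section")
```

For the print_res string used in this tutorial the sketch gives 43 residues and 43 x 43 = 1849 pairs for each solvent model, which shows how quickly the output grows when more residues are selected.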
Copyright Ross Walker 2006

AMBER ADVANCED TUTORIALS
TUTORIAL 3 - SECTION 3
Perl Script mm_pbsa.pl
By Ross Walker & Thomas Steinbrecher

3) Calculate the binding free energy and analyse the results.
(All calculations here are performed using mm_pbsa.pl.)

We now need to extract snapshots (without the water) from our production runs for use in the MM-PBSA calculation. The mm_pbsa.pl script (in the $AMBERHOME/bin directory) automates this extraction process for us. Note that if you don't have gzcat installed, you will need to unzip the trajectory files before running mm_pbsa.pl. We have to provide the following input file (see the linked file for an explanation of each of the various terms):

extract_coords.mmpbsa

    @GENERAL
    PREFIX                snapshot
    PATH                  ./
    COMPLEX               1
    RECEPTOR              1
    LIGAND                1
    COMPT                 ./ras-raf.prmtop
    RECPT                 ./ras.prmtop
    LIGPT                 ./raf.prmtop
    GC                    1
    AS                    0
    DC                    0
    MM                    0
    GB                    0
    PB                    0
    MS                    0
    NM                    0
    @MAKECRD
    BOX                   YES
    NTOTAL                42193
    NSTART                1
    NSTOP                 200
    NFREQ                 1
    NUMBER_LIG_GROUPS     1
    LSTART                2622
    LSTOP                 3862
    NUMBER_REC_GROUPS     1
    RSTART                1
    RSTOP                 2621
    @TRAJECTORY
    TRAJECTORY            ./prod1.mdcrd
    TRAJECTORY            ./prod2.mdcrd
    TRAJECTORY            ./prod3.mdcrd
    TRAJECTORY            ./prod4.mdcrd
    @PROGRAMS

This just specifies which atoms are part of the receptor, ligand, and complex, as well as the prmtop files corresponding to the unsolvated structures, the total number of snapshots in the trajectories, the stride length, and the names of the trajectory files. We have specified that each of the output files should be written with the prefix 'snapshot'. In this setup we have defined RAS to be the receptor and RAF to be the ligand. This is purely a naming convention.

Run the extraction with:

    $AMBERHOME/bin/mm_pbsa.pl extract_coords.mmpbsa > extract_coords.log

This will take a few minutes to run. Here are the relevant output files: extract_coords.tar.gz (14 MB)

You should check the log file for any errors. Also make sure the box sizes look reasonable.
If there are any strange box sizes, this is likely a problem with the atom numbers you selected or a corruption in one of your trajectory files.

Starting with the snapshots we just extracted, we will now calculate the interaction energy and solvation free energy for the complex, receptor, and ligand and average the results to obtain an estimate of the binding free energy. Please note that we will not calculate the entropy contribution to binding in this tutorial, so strictly speaking our result will not be a true free energy, but it could be used to compare against similar systems. For example, one could analyse the effect of amino acid point mutations along the binding interface; a common approach to this is referred to as alanine scanning.

For demonstration purposes we will carry out the binding energy calculation using both the MM-PBSA method and the MM-GBSA method. This is accomplished with the following input file for mm_pbsa.pl (see the linked file for comments):

binding_energy.mmpbsa

    VERBOSE               0
    PARALLEL              0
    PREFIX                snapshot
    PATH                  ./
    START                 1
    STOP                  200
    OFFSET                1
    COMPLEX               1
    RECEPTOR              1
    LIGAND                1
    COMPT                 ./ras-raf.prmtop
    RECPT                 ./ras.prmtop
    LIGPT                 ./raf.prmtop
    GC                    0
    AS                    0
    DC                    0
    MM                    1
    GB                    1
    PB                    1
    MS                    1
    NM                    0
    @PB
    PROC                  2
    REFE                  0
    INDI                  1.0
    EXDI                  80.0
    SCALE                 2
    LINIT                 1000
    ISTRNG                0.0
    RADIOPT               0
    ARCRES                0.0625
    INP                   1
    SURFTEN               0.005
    SURFOFF               0.00
    IVCAP                 0
    CUTCAP                -1.0
    XCAP                  0.0
    YCAP                  0.0
    ZCAP                  0.0
    @MM
    DIELC                 1.0
    @GB
    IGB                   2
    GBSA                  1
    SALTCON               0.00
    EXTDIEL               80.0
    INTDIEL               1.0
    SURFTEN               0.005
    SURFOFF               0.00
    @MS
    PROBE                 0.0
    @PROGRAMS

The various portions of this input file specify which calculations to run, on which files to run them, and any special parameters necessary to calculate the different contributions to the binding free energy. If you open the linked file above you will find explanations of the different terms. Values for the different parameters are determined based on empirical data and are subject to ongoing research.
Please find here a list of the currently recommended settings for the calculation of the nonpolar solvation free energy contribution. Example input files for binding free energy calculations with common parameter settings can be found in the $AMBERHOME/AmberTools/src/mm_pbsa/Examples/TEMPLATE_INPUT_SCRIPTS directory. Refer to publications in the field for more information.

Note that early versions of AMBER required an external Poisson-Boltzmann solver such as Delphi, but as of AMBER 8 the internal PBSA program can be used. This program provides a significant speedup and easier integration into the AMBER framework.

You can run the calculation with:

    $AMBERHOME/bin/mm_pbsa.pl binding_energy.mmpbsa > binding_energy.log

You can watch the progress of your calculation with:

    tail -f binding_energy.log

This calculation will take around 2 hours to run (P4 3.2 GHz). The PBSA part of the calculation on each of the 600 snapshots (200 each for the complex, receptor, and ligand) is what takes the majority of the time. The GBSA part of the calculation is done in seconds. To speed up the calculation you can also request parallel execution of the mm_pbsa analyses for more than one snapshot at a time by specifying the number of available processors under PARALLEL.

When finished you should find the following output files: binding_energy.log, snapshot_statistics.out, snapshot_com.all.out, snapshot_rec.all.out, snapshot_lig.all.out

The log file just shows whether the calculation completed successfully.
The all.out files give the individual energy contributions for each of the snapshots for each of the species, while the statistics.out file contains the final averaged binding free energy results:

snapshot_statistics.out

    #              COMPLEX                RECEPTOR                 LIGAND
    #      ----------------------- ----------------------- -----------------------
    #             MEAN         STD        MEAN         STD        MEAN         STD
    #      ======================= ======================= =======================
    ELE       -8656.78       70.18    -5602.09       63.10    -2102.25       52.57
    VDW        -984.99       24.34     -661.18       20.33     -256.02       12.93
    INT        5085.33       50.22     3449.57       38.65     1635.76       29.42
    GAS       -4556.44       75.96    -2813.70       65.21     -722.52       53.50
    PBSUR        65.09        1.05       45.25        0.64       27.24        0.46
    PBCAL     -3223.64       58.68    -2490.86       48.73    -1671.27       47.46
    PBSOL     -3158.55       58.26    -2445.62       48.45    -1644.03       47.31
    PBELE    -11880.42       34.25    -8092.96       29.34    -3773.52       17.30
    PBTOT     -7714.99       48.25    -5259.32       36.97    -2366.55       26.61
    GBSUR        65.09        1.05       45.25        0.64       27.24        0.46
    GB        -3407.82       58.49    -2631.83       50.08    -1731.06       47.68
    GBSOL     -3342.74       58.15    -2586.58       49.83    -1703.82       47.55
    GBELE    -12064.60       26.94    -8233.92       23.57    -3833.31       13.40
    GBTOT     -7899.17       47.07    -5400.28       35.65    -2426.34       26.80

    #               DELTA
    #      -----------------------
    #             MEAN         STD
    #      =======================
    ELE        -952.43       44.10
    VDW         -67.79        5.18
    INT          -0.00        0.00
    GAS       -1020.22       44.58
    PBSUR        -7.40        0.41
    PBCAL       938.50       42.51
    PBSOL       931.09       42.31
    PBELE       -13.94        9.43
    PBTOT       -89.13        7.94
    GBSUR        -7.40        0.41
    GB          955.07       41.30
    GBSOL       947.66       41.10
    GBELE         2.63        7.41
    GBTOT       -72.56        6.40

The meaning of the different terms in this file is as follows:

ELE = electrostatic energy as calculated by the MM force field.
VDW = van der Waals contribution from MM.
INT = internal energy arising from bond, angle, and dihedral terms in the MM force field (this term always amounts to zero in the single trajectory approach).
GAS = total gas phase energy (sum of ELE, VDW, and INT).
PBSUR/GBSUR = non-polar contribution to the solvation free energy calculated by an empirical model.
PBCAL/GB = the electrostatic contribution to the solvation free energy calculated by PB or GB, respectively.
PBSOL/GBSOL = sum of non-polar and polar contributions to solvation.
PBELE/GBELE = sum of the electrostatic solvation free energy and MM electrostatic energy.
PBTOT/GBTOT = final estimated binding free energy calculated from the terms above.
(All values are in kcal/mol.)

There are several things to note here. Normally the ELE, PBCAL/GB, and VDW parts make up the major contributions to the binding free energy, and the first two of these terms tend to approximately cancel each other out. This can be checked with the value of PBELE/GBELE, which should be much smaller than the contributions to it. One would typically expect to find an extremely favourable electrostatic energy and an unfavourable solvation free energy. The latter reflects the energy that one has to expend to desolvate the binding partners and to align their binding interfaces.

From the negative total binding free energy of -89.13 kcal/mol (PBTOT) we clearly see that this is a favourable protein-protein complex in pure water, but keep in mind that the result does not equal the real binding free energy, since we did not estimate the (unfavourable) entropy contribution to binding. Note that the GB approach gives a binding energy that is somewhat smaller in magnitude (GBTOT = -72.56 kcal/mol) but still suggests that this is a favourable bound state.

To extend this tutorial you could investigate what happens if you change the salt content of the solvent, modify specific residues, or choose different starting geometries. You could also look into running the nmode calculations to obtain the entropy, but be aware that for a complex of this size this will be extremely expensive in both memory and computation time.
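The term definitions above imply simple additive relationships that can be checked against the DELTA block of snapshot_statistics.out. The following Python sketch is illustrative only and not part of mm_pbsa.pl: the numbers are copied from the table in this section, the relation PBTOT = GAS + PBSOL (and likewise for GB) is an assumption inferred from the definitions, and the tolerance of 0.02 kcal/mol is chosen to absorb rounding of the printed means to two decimals.

```python
# DELTA means copied from snapshot_statistics.out (kcal/mol).
delta = {
    "ELE": -952.43, "VDW": -67.79, "INT": -0.00, "GAS": -1020.22,
    "PBSUR": -7.40, "PBCAL": 938.50, "PBSOL": 931.09,
    "PBELE": -13.94, "PBTOT": -89.13,
    "GBSUR": -7.40, "GB": 955.07, "GBSOL": 947.66,
    "GBELE": 2.63, "GBTOT": -72.56,
}

# Each composite term and the terms it is expected to be the sum of.
# The *TOT relations are inferred, not taken from the mm_pbsa.pl source.
relations = {
    "GAS": ("ELE", "VDW", "INT"),
    "PBSOL": ("PBSUR", "PBCAL"),
    "PBELE": ("PBCAL", "ELE"),
    "PBTOT": ("GAS", "PBSOL"),
    "GBSOL": ("GBSUR", "GB"),
    "GBELE": ("GB", "ELE"),
    "GBTOT": ("GAS", "GBSOL"),
}

TOLERANCE = 0.02  # allows for rounding of the printed values


def check_relations(values, rules, tol=TOLERANCE):
    """Return {term: (reported, recomputed)} for every rule that fails."""
    failures = {}
    for term, parts in rules.items():
        recomputed = sum(values[p] for p in parts)
        if abs(recomputed - values[term]) > tol:
            failures[term] = (values[term], recomputed)
    return failures


failures = check_relations(delta, relations)
print("inconsistent terms:", failures)
```

With the values from this tutorial every relation holds within the tolerance. The same check makes the cancellation discussed above explicit: ELE (-952.43) and PBCAL (938.50) nearly offset each other, leaving PBELE at only -13.94 kcal/mol.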
