Tutorial: protein

Step 2. Generate topology file of protein

In the previous step, the CG PDB file (system.cg.pdb) was obtained from the AA PDB file (step5_assemply.pdb) using a Python script and a JSON file. As in the tutorial for lipid membrane systems, we have to prepare topology files in order to generate LAMMPS input files with the spica-tools command setup_lmp. Lipid topology files are either included in the force field file used in the lipid membrane tutorial or generated with the spica-tools command json2top. Protein topology files, by contrast, are prepared with another spica-tools command, ENM, which applies an elastic network to maintain the secondary structures of proteins in CG-MD simulations.

To run the ENM command, we must prepare a CG PDB file that contains only the protein from the CG initial configuration. For example, Linux commands can extract the protein records from the CG initial configuration (system.cg.pdb), which here contains DOPC lipids, TIP3 water, and CLA chloride ions in addition to the protein:

$ grep -v DOPC system.cg.pdb | grep -v TIP3 | grep -v CLA > protein.cg.pdb
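The same extraction can be sketched in Python. This is only an illustration, not part of spica-tools: the function names are hypothetical, and the sketch assumes the CG PDB file follows the standard PDB column layout, with the residue name in columns 18-21. Unlike the grep pipeline, it matches only the residue-name field, so a line is not dropped just because "CLA" or "TIP3" happens to appear elsewhere in it.

```python
# Residue names to drop; these are the ones named in this tutorial.
NON_PROTEIN_RESNAMES = {"DOPC", "TIP3", "CLA"}


def extract_protein_lines(lines, drop=NON_PROTEIN_RESNAMES):
    """Return PDB lines, omitting ATOM/HETATM records whose residue name is in `drop`."""
    kept = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            # Assumed layout: columns 18-21 (1-based) hold the residue name.
            # Column 21 is included because four-letter names such as DOPC
            # extend into it.
            resname = line[17:21].strip()
            if resname in drop:
                continue
        kept.append(line)
    return kept


def extract_protein(src="system.cg.pdb", dst="protein.cg.pdb"):
    """Write the protein-only records of `src` to `dst` (hypothetical helper)."""
    with open(src) as fin:
        kept = extract_protein_lines(fin)
    with open(dst, "w") as fout:
        fout.writelines(kept)
```

Calling extract_protein() with its default arguments reads system.cg.pdb and writes protein.cg.pdb, mirroring the grep command above.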

An example of the CG protein PDB file (protein.cg.pdb) is given here. Because the backbone bead types in SPICA FF ver2 depend on the secondary structure of the protein, we must also prepare the all-atom PDB file of the protein (protein.aa.pdb). We also need to install the DSSP program, which assigns secondary structures (see here for installation), and add the dssp binary to PATH. With the CG and AA protein PDB files prepared, we apply the ENM command to generate the topology file of the CG protein. The command line is:

$ cg_spica ENM -aapdb protein.aa.pdb protein.cg.pdb protein.cg.top
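Before running the command, it can help to confirm that a DSSP executable is actually reachable through PATH. The following is a minimal sketch using only the Python standard library. The helper name is hypothetical, and the candidate executable names are an assumption: this tutorial refers to a "dssp" binary, while some DSSP installations name it "mkdssp". Which name cg_spica itself looks for is not stated here.

```python
import shutil


def find_dssp(names=("dssp", "mkdssp")):
    """Return the full path of the first DSSP executable found on PATH, or None.

    The candidate names are assumptions; adjust them to match your installation.
    """
    for name in names:
        path = shutil.which(name)  # None if `name` is not on PATH
        if path is not None:
            return path
    return None
```

If find_dssp() returns None, the directory containing the DSSP binary still needs to be added to PATH.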

protein.cg.top is the CG topology file of the protein. It includes elastic network bonds whose equilibrium lengths are taken from the input PDB file. The file should look like the following:

atom 1 VAL GBTL GBTL 56.0385 0.1118 P
atom 2 VAL VAL VAL 43.0883 0.0000 P
atom 3 ARG GBML GBML 56.0385 0.0000 P
...

bondparam 3 14 1.195000 8.852091 # GBM-GBM
bondparam 9 20 1.195000 8.033982 # GBM-GBM
...

bond 1 3 # GBM-GBM
bond 3 6 # GBM-GBM
...

angleparam 1 3 6 10.0 130.0000# GBM GBM GBM
...

dihedralparam 10 11 12 13 50.0 1 180 0.0 # PH1-PH2-PH3-PH4
...

For rows whose first column is bondparam, the second and third columns give the indices of the backbone pair joined by the elastic network bond, and the fourth and fifth columns give the force constant and the equilibrium length of the bond, respectively. An example topology file can be downloaded here.
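To inspect the elastic network, the bondparam rows can be read back with a few lines of Python. This is a sketch rather than a spica-tools feature: the function name is hypothetical, and it assumes the whitespace-separated column layout described above, with an optional trailing "#" comment.

```python
def read_elastic_bonds(lines):
    """Collect (i, j, force_constant, eq_length) tuples from 'bondparam' rows."""
    bonds = []
    for line in lines:
        # Drop the trailing comment (e.g. "# GBM-GBM") before splitting.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 5 and fields[0] == "bondparam":
            i, j = int(fields[1]), int(fields[2])        # backbone pair indices
            k, r0 = float(fields[3]), float(fields[4])   # force constant, equilibrium length
            bonds.append((i, j, k, r0))
    return bonds
```

For example, `read_elastic_bonds(open("protein.cg.top"))` returns one tuple per elastic network bond, which makes it easy to count the bonds or to look for unusually long ones.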

To prepare the topology file of the protein for SPICA FF ver1, in which the backbone bead types do not depend on the secondary structure of the protein, add the flag "-v1" to the command line:

$ cg_spica ENM -v1 protein.cg.pdb protein.cg.v1.top

An example topology file of ver1 is given here.

Note that the ENM command generates a topology file for the entire protein (backbone segment) contained in the input PDB files. The input PDB file should therefore contain only a single protein molecule (monomer). If the system contains several different types of protein molecules, the command must be applied to each protein molecule separately to obtain its topology file. Rename each topology file once it is obtained, because the script writes to "protein.cg.top" and overwrites any existing file of that name. If the input PDB file contains multiple protein molecules, the script generates a topology file with a single elastic network spanning all of them, including intermolecular elastic bonds. (You may do this on purpose, though it is not the typical case.)
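When a system contains several protein types, one way to keep the topology files from overwriting each other is to give each monomer its own output name. The sketch below only builds the command lines and does not run them. It assumes, as in the ver2 command shown earlier, that the last argument names the output topology file. The function name and the `<name>.aa.pdb` / `<name>.cg.pdb` naming scheme are hypothetical.

```python
def enm_commands(names):
    """Build one `cg_spica ENM` argument list per monomer, each with its own output file."""
    cmds = []
    for name in names:
        cmds.append([
            "cg_spica", "ENM",
            "-aapdb", f"{name}.aa.pdb",  # all-atom PDB of this monomer (assumed name)
            f"{name}.cg.pdb",            # CG PDB of this monomer (assumed name)
            f"{name}.cg.top",            # distinct output avoids overwriting
        ])
    return cmds


# Possible usage (requires spica-tools and the input files to exist):
#   import subprocess
#   for cmd in enm_commands(["proteinA", "proteinB"]):
#       subprocess.run(cmd, check=True)
```

Building the argument lists separately from running them makes it easy to print and check the commands before anything is written to disk.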

SPICA Force Field

Okayama University

SPICA: Surface Property fItting Coarse grAined model
