Step 2. Generate topology file of protein
In the previous step, we obtained the CG PDB (system.cg.pdb) from the AA PDB (step5_assemply.pdb) using the Python script and the JSON file.
As in the tutorial for lipid membrane systems, we have to prepare topology files so that the tool "setup_lammps" can generate the LAMMPS input files.
Lipid topology files are included in the force field file used in the tutorial for lipid membrane systems, but protein topology files are usually generated with the Python script "gen_top_elastic_network.py".
The script applies an elastic network that holds the secondary structures of the protein in place during CG-MD simulations. The script works with Python 2.
Download the Python script to generate the protein topology file
To run the Python script, we must prepare a CG PDB file that contains only the protein from the CG initial configuration. For example, we can use Linux commands to extract the protein data from the CG initial configuration (system.cg.pdb):
$ grep -v DOPC system.cg.pdb | grep -v TIP3 | grep -v CLA > protein.cg.pdb
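The same filtering can be sketched in Python for cases where matching on the residue-name field is preferable to matching a string anywhere in the line. This is a minimal sketch, not part of the tutorial's tools: the function name extract_protein_lines is hypothetical, and it assumes the residue name occupies columns 18-21 of ATOM/HETATM records, as in CHARMM-style PDB files. Unlike the grep command, it keeps all non-atom records unchanged.

```python
# Residue names to drop; the same three names used in the grep command above.
EXCLUDED_RESNAMES = {"DOPC", "TIP3", "CLA"}


def extract_protein_lines(lines, excluded=EXCLUDED_RESNAMES):
    """Return the PDB lines whose residue name is not in `excluded`.

    Hypothetical helper: only ATOM/HETATM records are filtered;
    every other record (CRYST1, END, ...) is passed through.
    """
    kept = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            # Assumption: residue name sits in columns 18-21
            # (0-based slice 17:21), which fits 4-letter names like DOPC.
            resname = line[17:21].strip()
            if resname in excluded:
                continue
        kept.append(line)
    return kept
```

To use it, read system.cg.pdb into a list of lines, pass the list to extract_protein_lines, and write the returned lines to protein.cg.pdb.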
An example of the CG protein PDB file (protein.cg.pdb) is given here. After preparing the CG protein PDB file, we run the Python script to generate the topology file of the CG protein. The command is:
$ gen_top_elastic_network.py protein.cg.pdb protein.cg.top
The file "protein.cg.top" is the CG topology file of the protein. It contains elastic network bonds whose equilibrium lengths are taken from the input PDB file. The file should look like this:
atom 1 VAL GBT GBT 56.0385 0.1118 P
atom 2 VAL VAL VAL 43.0883 0.0000 P
atom 3 ARG GBM GBM 56.0385 0.0000 P
bondparam 3 14 1.195000 8.852091 # GBM-GBM
bondparam 9 20 1.195000 8.033982 # GBM-GBM
bond 1 3 # GBM-GBM
bond 3 6 # GBM-GBM
angle 1 3 6 # GBM GBM GBM
dihedralparam 10 11 12 13 50.0 1 180 0.0 # PH1-PH2-PH3-PH4
In the lines whose first column is "bondparam", the second
and third columns specify the backbone pair indices of the applied elastic
network bond. The fourth and fifth columns give the force constant and the
equilibrium length of the bond. An example topology file can be downloaded here.
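To make the column layout concrete, the following minimal Python sketch reads "bondparam" lines into named fields. The function name parse_bondparams and the dictionary keys are hypothetical; the column meanings follow the description above, and the sample lines are copied from the excerpt shown earlier.

```python
def parse_bondparams(lines):
    """Collect elastic-network bonds from 'bondparam' lines of a topology file."""
    bonds = []
    for line in lines:
        # Drop the trailing "# ..." comment, then split on whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 5 and fields[0] == "bondparam":
            bonds.append({
                "i": int(fields[1]),                 # first backbone index
                "j": int(fields[2]),                 # second backbone index
                "force_constant": float(fields[3]),  # fourth column
                "eq_length": float(fields[4]),       # fifth column
            })
    return bonds


# Sample lines taken from the topology excerpt above.
sample = [
    "bondparam 3 14 1.195000 8.852091 # GBM-GBM",
    "bondparam 9 20 1.195000 8.033982 # GBM-GBM",
    "bond 1 3 # GBM-GBM",
]

for bond in parse_bondparams(sample):
    print(bond["i"], bond["j"], bond["force_constant"], bond["eq_length"])
```

Lines that do not start with "bondparam", such as the "bond" line in the sample, are ignored.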
Note that the Python script generates the topology file for the whole protein (all backbone segments) in the input PDB file. The input PDB file should therefore contain only a single protein molecule (a monomer), even if your system has several different types of protein molecules. In that case, apply the script separately to each different protein molecule to obtain its topology file. You should rename each topology file afterwards, because this script always overwrites "protein.cg.top".
If the input PDB contains multiple protein molecules, the script will
generate a topology file with an elastic network spanning all of them,
including intermolecular elastic network bonds. (You may do this on
purpose, though it is not the typical case.)
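Before running the script, it may help to check whether the input PDB looks like a single molecule. The sketch below is only a heuristic under stated assumptions: it assumes the CG PDB still carries chain IDs (column 22) or segment IDs (columns 73-76) that distinguish the protein molecules, which may not hold for every conversion. The function names are hypothetical.

```python
def molecule_ids(lines):
    """Return the sets of chain IDs and segment IDs found in ATOM/HETATM records."""
    chains, segids = set(), set()
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            chain = line[21:22].strip()   # chain ID: column 22
            segid = line[72:76].strip()   # segment ID: columns 73-76
            if chain:
                chains.add(chain)
            if segid:
                segids.add(segid)
    return chains, segids


def looks_like_single_molecule(lines):
    """Heuristic: at most one distinct chain ID and one distinct segment ID."""
    chains, segids = molecule_ids(lines)
    return len(chains) <= 1 and len(segids) <= 1
```

If looks_like_single_molecule returns False for protein.cg.pdb, the file probably holds more than one molecule and should be split before generating topologies, unless intermolecular elastic network bonds are intended.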