How To Make Multiple FASTA-->PDB Conversions For Short Peptides (4 Or 5 Amino Acids Long)
12.1 years ago
Onat • 0

Hi, I would like to perform a virtual screening of tetramers against a protein of interest. For this purpose, I randomly generate thousands of different tetramer peptide sequences in FASTA format, and I need to convert them into PDB format. With Swiss PDB Viewer, the FASTA-->PDB conversion can only be done one sequence at a time, because the program does not allow loading more than one sequence at once. I am looking for a script or Unix commands to perform multiple FASTA-->PDB conversions, but I have not found a solution yet. Is it possible to convert FASTA files into PDB format from a script, or can I call the "load sequence from aminoacids" and "save current layer" functions from, for instance, a shell.exec() function? I appreciate your help. Thank you. Best regards
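
For readers who want a comparable input file to experiment with, here is a minimal sketch of generating random tetramers and formatting them as FASTA. It is an illustration only, not the poster's own procedure: the function names `random_peptides` and `to_fasta` and the `pep_0001_SLGQ` header style are assumptions made up for this example.

```python
import random

# The 20 standard amino acids as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def random_peptides(count, length=4, seed=None):
    """Return `count` distinct random peptides of the given length (hypothetical helper)."""
    if count > len(AMINO_ACIDS) ** length:
        raise ValueError("more peptides requested than distinct sequences exist")
    rng = random.Random(seed)
    seen = set()
    while len(seen) < count:
        seen.add("".join(rng.choice(AMINO_ACIDS) for _ in range(length)))
    return sorted(seen)

def to_fasta(sequences, prefix="pep"):
    """Format sequences as FASTA text; the header layout is an arbitrary choice."""
    lines = []
    for i, seq in enumerate(sequences, start=1):
        lines.append(f">{prefix}_{i:04d}_{seq}")
        lines.append(seq)
    return "\n".join(lines) + "\n"

# Print a small sample instead of writing a file.
print(to_fasta(random_peptides(3, length=4, seed=1)), end="")
```

Writing the returned text to a file such as file.fasta gives one record per peptide, which is the input the conversion step needs.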

fasta pdb conversion short • 14k views

I know this conversation happened a long time ago, but this is the first time I have ever used Swiss PDB Viewer, and I would like to know how to take a FASTA sequence and convert it to a PDB file. I realize you are all on a totally different level, lol, given this comment: "With Swiss PDB Viewer, it is only possible to perform the FASTA-->PDB conversion one by one as the program does not allow to upload more than one sequence at once." That is exactly what I want to do: I just want to convert ONE FASTA sequence to a PDB file so I can run a simulation with it. I have no idea how, though. Any help would be appreciated.


Hi cpapamit, please open a new question; it is easier to answer and easier for others to find later. Also, use the search function in the forum to find older questions with information related to your problem. Try looking for 'protein structure prediction'.

12.1 years ago
João Rodrigues ★ 2.5k

You can use Pymol with the build_seq.py script.

EDIT: Actually, you can easily build a pipeline using Biopython to parse the FASTA files (here), then pass the sequences to PyMOL and create the 3D structure in whatever secondary structure you want (coil for you, I guess) using the build_seq.py script. This should be a script of 20 lines or so; we use it in the lab to generate peptide structures from sequence.


Thanks a lot. But I am wondering what the output is. I need PDB files to do the virtual screening.


PDB files of course. Pymol will build you the PDB file.


I could not load the FASTA files into PyMOL. How can I do this? Thank you.


Using the script, I can generate secondary structures by typing "build_seq SLGQ, ss=helix", for example, but I could not load FASTA files into PyMOL. I need to do this for thousands of different tetramers.


Use Biopython to parse the FASTA files, pass the sequence information to PyMOL, and then output the PDB file. It's not a one-program solution, but it's very good and very simple.
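
To make the parsing step concrete, here is a minimal sketch of what it yields: an identifier and a sequence string per record. The `read_fasta` function is a hypothetical standard-library stand-in written for this example, not Biopython's API; in practice `Bio.SeqIO.parse` does the same job more robustly.

```python
import io

def read_fasta(handle):
    """Yield (record_id, sequence) pairs from an open FASTA text handle."""
    record_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if record_id is not None:
                yield record_id, "".join(chunks)
            # Use the first whitespace-delimited token after '>' as the id.
            fields = line[1:].split()
            record_id = fields[0] if fields else ""
            chunks = []
        elif record_id is not None:
            chunks.append(line)  # sequence may span several lines
    if record_id is not None:
        yield record_id, "".join(chunks)

example = io.StringIO(">pep1 first tetramer\nSLGQ\n>pep2\nAC\nDE\n")
for record_id, sequence in read_fasta(example):
    print(record_id, sequence)
```

Each sequence string is what gets handed to the structure-building step, and the id is a natural basis for the output file name.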


Hello again. I have managed to parse the sequences line by line, and I am wondering how to do the save-as-PDB step. Manually it is just "save .pdb, first_residue", but I could not do it inside my script. How can I access the objects created by the build_seq command so that I can save them as PDB files?


I would refer to the PyMOLWiki for these kinds of questions. For a simple introduction to the PyMOL API, check this post I made a while ago. It should get you on the right path.

http://www.pymolwiki.org/index.php/Save#PYMOL_API

http://doeidoei.wordpress.com/2009/02/11/pymol-api-simple-example/


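The following script works for me to convert a multi-sequence FASTA file to PDB files; I hope it works for someone else. The file.fasta contains all the FASTA sequences. The script is saved with a .pym extension. The build_seq.py and seq_convert.py files can be saved in the PyMOL directory.

from pymol import cmd  # explicit import, so the script does not rely on PyMOL's implicit namespace
import build_seq
from Bio import SeqIO

selection = "_document"  # temporary selection name
for seq_record in SeqIO.parse("file.fasta", "fasta"):
    # Pass the sequence as a plain string of one-letter codes
    build_seq.build_seq(str(seq_record.seq))
    cmd.select(selection, "all")
    # state=-1 saves the current state; the file is named after the FASTA record id
    cmd.save(seq_record.id + ".pdb", selection, -1, "pdb")
    # Clear everything before building the next peptide
    cmd.delete("all")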
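
One caveat with naming files after the FASTA record id: ids can contain characters that are awkward in file names (for example `|`), and duplicate ids would overwrite earlier files. A minimal sketch of one way to guard against this follows; the helper name `pdb_filename` and the `_2`, `_3` suffix scheme are assumptions made up for this example.

```python
import re

def pdb_filename(record_id, used):
    """Return a filesystem-safe, unique '<id>.pdb' name; `used` holds names issued so far."""
    # Replace anything other than letters, digits, dot, underscore, or hyphen.
    stem = re.sub(r"[^A-Za-z0-9._-]", "_", record_id) or "peptide"
    name = f"{stem}.pdb"
    counter = 1
    while name in used:
        # On a clash, append a running number to the stem.
        counter += 1
        name = f"{stem}_{counter}.pdb"
    used.add(name)
    return name

used_names = set()
for rid in ["sp|P12345|TEST", "SLGQ", "SLGQ"]:
    print(pdb_filename(rid, used_names))
```

The returned name could then be passed to the save call in place of the raw id.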
12.1 years ago
Onat ▴ 40

But PyMOL can create the PDB file after it gets a command such as "build_seq <sequence>, ss=helix", and I need to do this inside a loop for thousands of sequences. I also need to save them as separate PDB files.


OK, now I am becoming familiar with Python. I hope I can write the necessary script soon. Thank you.


Pymol allows for python scripting. Therefore, as I said before, it is very easy to feed sequences in a for loop to pymol and have it output the corresponding structures. The first answer I gave you had all the links necessary to write the script you want.

12.1 years ago
Woa ★ 2.9k

I'm just curious about one thing: do you need to generate all 20^4 (160,000) possible tetrapeptides and get their structures from Seq->PDB (possibly with an energy minimization), or do you wish to get all the tetrapeptide structures that can be found in the protein structures available in the (non-redundant, high-resolution) PDB?
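
As a quick check of that count, here is a minimal sketch that enumerates every tetrapeptide over the 20 standard amino acids. The generator name `all_peptides` is made up for this example.

```python
from itertools import islice, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes

def all_peptides(length=4):
    """Lazily yield every peptide of the given length in lexicographic order."""
    for residues in product(AMINO_ACIDS, repeat=length):
        yield "".join(residues)

total = len(AMINO_ACIDS) ** 4
print(total)  # 160000
print(list(islice(all_peptides(4), 3)))  # ['AAAA', 'AAAC', 'AAAD']
```

Because the generator is lazy, the full set never has to be held in memory at once, which matters less for tetramers than for longer peptides.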


On topic, Molsoft ICM can do that along with energy minimization with its scripting engine, but that's commercial, and I think you've already figured it out with Pymol


I want to have the structures, and thus PDB files, for each of the tetramers. PyMOL is actually capable of doing this job. Thanks.


Hi Onat, I'm facing the same situation you were in when you asked here about building 3D structures from short peptide sequences. I'm a beginner in programming (Python) and bioinformatics. I have some questions about the details of the operations described here:

What type of output should the parsing produce? I think I must separate the sequences from the headers and save the sequences in a new file. Is this right?

After parsing, do I have to write the command before each sequence, as in "build_seq SLGQ,ss=helix"? Is this right?

Then must I copy all the text in the file and paste it into the PyMOL command line? Is that it?

Thanks in advance
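
One way to avoid pasting commands by hand is to generate the whole command list as a PyMOL script file. The sketch below is only an illustration under assumptions: it reuses the "build_seq SLGQ, ss=helix" command form quoted in this thread, assumes build_seq.py has already been loaded in PyMOL, and names each output file after its sequence. The function name `pymol_script` is made up for this example.

```python
def pymol_script(sequences, ss="helix"):
    """Return PyMOL command-script text that builds and saves one PDB per sequence."""
    lines = []
    for seq in sequences:
        lines.append(f"build_seq {seq}, ss={ss}")  # command form quoted in this thread
        lines.append(f"save {seq}.pdb, all")       # write everything currently loaded
        lines.append("delete all")                 # clear before the next peptide
    return "".join(line + "\n" for line in lines)

print(pymol_script(["SLGQ", "ACDE"]), end="")
```

Saved to a file such as build_all.pml (a hypothetical name), the text can be run from the PyMOL command line with `@build_all.pml`, so there is nothing to copy and paste.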

8.2 years ago
Onat • 0

Hi Jgribeiro3,

Sorry for the late reply; I have only just seen your question. First, I generated random short peptide sequences in R and wrote them to a .txt file. Then, using the "build_seq" script, I built the short peptides listed in the .txt file and saved a PDB structure for each sequence. So you can write a short Python script that reads the sequences from the .txt file and runs the "build_seq" script automatically for each peptide sequence.

I hope this answer is helpful.
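
A minimal sketch of such a script's control flow is given here, assuming one peptide per line in the text file. The names `read_peptides` and `build_all` are hypothetical, and the build, save, and clear steps are passed in as callables so the loop can be tried outside PyMOL; inside PyMOL they would be replaced by the real build_seq and cmd calls, as noted in the comments.

```python
import os
import tempfile

VALID = set("ACDEFGHIKLMNPQRSTVWY")  # 20 standard one-letter codes

def read_peptides(path):
    """Read one peptide per line, skipping blank lines and lines with unknown residues."""
    peptides = []
    with open(path) as handle:
        for line in handle:
            seq = line.strip().upper()
            if seq and set(seq) <= VALID:
                peptides.append(seq)
    return peptides

def build_all(peptides, build, save, clear):
    """For each peptide call build(seq), save('<seq>.pdb'), clear(); return the file names."""
    written = []
    for seq in peptides:
        build(seq)
        name = f"{seq}.pdb"
        save(name)
        clear()
        written.append(name)
    return written

# Inside PyMOL the callables might be (an assumption, mirroring the script in this thread):
#   build = build_seq.build_seq
#   save  = lambda name: cmd.save(name, "all")
#   clear = lambda: cmd.delete("all")
# Outside PyMOL, stand-ins that only record the calls show the control flow.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "peptides.txt")
    with open(path, "w") as handle:
        handle.write("SLGQ\n\nacde\nXX1\n")
    log = []
    names = build_all(read_peptides(path), log.append, log.append,
                      lambda: log.append("clear"))
    print(names)
```

Keeping the PyMOL-specific calls behind three small callables is a design choice for testability; a one-off script could just as well call them directly in the loop.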


Hello Onat, I'm new to Python. I have about 17000 peptide sequences, and I need a separate PDB structure for each, the same as you. All of my sequences are in a file named seq.fasta. Could you please explain in more detail how to do this with PyMOL? Thanks.
