Question

I need help to create 2 SNPEFF databases (v1.01/v2.01) of Pinus taeda

5.9 years ago

nie shuai • 0

Over the past few days I have been trying to build two SnpEff databases (v1.01 and v2.01) for Pinus taeda, but both attempts failed and I don't know why. I need help creating these databases.

The required information is as follows:

v1.01:
Name of the species: Pinus taeda L.
Reference genome version: v101
Link to the reference genome sequence: https://treegenesdb.org/FTP/Genomes/Pita/v1.01/ptaeda.v1.01.scaffolds.fasta.gz
Links to the gene and transcript definition files:
low quality: https://treegenesdb.org/FTP/Genomes/Pita/v1.01/annotation/pita.LQgenes.gff3.gz
high quality: https://treegenesdb.org/FTP/Genomes/Pita/v1.01/annotation/pita.HQgenes.gff3.gz
Links to the protein sequences:
low quality: https://treegenesdb.org/FTP/Genomes/Pita/v1.01/annotation/pita.LQgenes.proteins.fasta.gz
high quality: https://treegenesdb.org/FTP/Genomes/Pita/v1.01/annotation/pita.HQgenes.proteins.fasta.gz
Codon table information: The Standard Code

v2.01:
Name of the species: Pinus taeda L.
Reference genome version: v201
Link to the reference genome sequence: https://treegenesdb.org/FTP/Genomes/Pita/v2.01/genome/Pita.2_01.fa.gz
Link to the gene and transcript definition file: https://treegenesdb.org/FTP/Genomes/Pita/v2.01/annotation2.0/pita2.01_allGeneModels_renamed.gff3
Link to the protein sequences: https://treegenesdb.org/FTP/Genomes/Pita/v2.01/annotation2.0/pita2.01_allGeneModels_prots_renamed.fasta
Codon table information: The Standard Code

This is the code I used to build the databases. I don't know why it failed:

1.pita_101

1.1. Configure

vim snpEff.config

# pita, version pita_101
pita_101.genome : pita
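
The config entry above can also be added from a script instead of by hand in vim. This is only a sketch: the function name `add_genome_entry` is made up for illustration, and it assumes the usual `<genome>.genome : <name>` entry format in `snpEff.config`. It skips the append when an entry for that genome is already there, so running it twice does not duplicate the line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a genome entry to a SnpEff config file
# unless an entry for that genome ID already exists.
# Arguments: config file, genome ID (e.g. pita_101), display name (e.g. pita).
add_genome_entry() {
    local config="$1" genome="$2" name="$3"
    # An entry looks like "<genome>.genome : <name>"; do nothing if one is present.
    if grep -Eq "^[[:space:]]*${genome}\.genome[[:space:]]*[:=]" "$config" 2>/dev/null; then
        echo "entry for ${genome} already present in ${config}"
        return 0
    fi
    {
        echo "# ${name}, version ${genome}"
        echo "${genome}.genome : ${name}"
    } >> "$config"
}
```

For example, `add_genome_entry ~/src/snpEff/snpEff/snpEff.config pita_101 pita` would add the two lines shown in step 1.1.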

1.2. Build path

mkdir -p ~/src/snpEff/snpEff/data/genomes
mkdir -p ~/src/snpEff/snpEff/data/pita_101

1.3. Get the reference genome sequence (the file is so large that I used a symbolic link)

cd ~/src/snpEff/snpEff/data/genomes
ln -s mypath/ptaeda.v1.01.scaffolds.fasta pita_101.fa
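
One thing worth ruling out with the link: `ln -s` stores the target exactly as typed, and a relative target is resolved relative to the directory that holds the link, so a relative `mypath/...` can leave a dangling link. The sketch below is a hypothetical helper (`link_genome` is not part of SnpEff) that resolves the source to an absolute path first and then checks that the link can actually be read.

```shell
#!/usr/bin/env bash
# Hypothetical helper: symlink a genome FASTA using an absolute target
# and verify that the resulting link is readable.
# Arguments: source FASTA, link path (e.g. data/genomes/pita_101.fa).
link_genome() {
    local src="$1" dest="$2" src_abs
    [ -f "$src" ] || { echo "source FASTA not found: $src" >&2; return 1; }
    # Turn the source into an absolute path so the link does not depend
    # on the directory it is created in.
    src_abs="$(cd "$(dirname "$src")" && pwd)/$(basename "$src")"
    mkdir -p "$(dirname "$dest")"
    ln -sf "$src_abs" "$dest"
    # -r follows the link, so this fails if the link is dangling.
    [ -r "$dest" ] || { echo "dangling link: $dest" >&2; return 1; }
}
```

Usage would look like `link_genome mypath/ptaeda.v1.01.scaffolds.fasta ~/src/snpEff/snpEff/data/genomes/pita_101.fa`, with `mypath` standing in for the real location as in the post.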

1.4. Get genome annotations. We merged HQgenes and LQgenes into pita_gene.

# gff

cd ~/src/snpEff/snpEff/data/pita_101
cp mypath/pita_gene.gff3 ~/src/snpEff/snpEff/data/pita_101
mv pita_gene.gff3 genes.gff

# protein.fa

cp mypath/pita.proteins.fasta ~/src/snpEff/snpEff/data/pita_101
mv pita.proteins.fasta protein.fa
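
Before running the build it may help to confirm that the files are where SnpEff looks for them. The sketch below assumes the standard layout: the genome at `data/genomes/<genome>.fa` (or `data/<genome>/sequences.fa`), the annotation at `data/<genome>/genes.gff`, and optionally `protein.fa`, which SnpEff uses to sanity-check the built database. The function name `check_genome_layout` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check that the input files for a SnpEff build exist
# and are non-empty. Arguments: SnpEff data directory, genome ID.
check_genome_layout() {
    local data_dir="$1" genome="$2" status=0
    # Genome sequence: data/genomes/<genome>.fa or data/<genome>/sequences.fa.
    # -s follows symlinks, so a dangling link counts as missing.
    if [ ! -s "$data_dir/genomes/$genome.fa" ] && [ ! -s "$data_dir/$genome/sequences.fa" ]; then
        echo "missing genome FASTA for $genome" >&2
        status=1
    fi
    # Annotation file is required for a GFF3-based build.
    if [ ! -s "$data_dir/$genome/genes.gff" ]; then
        echo "missing or empty: $data_dir/$genome/genes.gff" >&2
        status=1
    fi
    # protein.fa is treated as optional here: report it, but do not fail.
    if [ ! -s "$data_dir/$genome/protein.fa" ]; then
        echo "note: no protein.fa for $genome (protein check will not be possible)" >&2
    fi
    return "$status"
}
```

For this post that would be `check_genome_layout ~/src/snpEff/snpEff/data pita_101`; a non-zero exit status means at least one required file is missing.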

1.5. Create the database

cd ~/src/snpEff/snpEff
java -jar snpEff.jar build -gff3 -v pita_101

1.6. Run

java -Xmx30g -jar snpEff.jar pita_101

ERROR while connecting to http://downloads.sourceforge.net/project/snpeff/databases/v4_3/snpEff_v4_3_pita_101.zip
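
As far as I know, SnpEff only tries to download a database when it cannot find one locally, so this error suggests the build step did not leave a usable database behind. A successful build normally writes `snpEffectPredictor.bin` into `data/<genome>/`. The sketch below just checks for that file; `database_built` is a made-up name, and the exact cause of a missing file would still have to be read from the verbose (`-v`) build output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if a built SnpEff database file exists
# and is non-empty. Arguments: SnpEff data directory, genome ID.
database_built() {
    local data_dir="$1" genome="$2"
    [ -s "$data_dir/$genome/snpEffectPredictor.bin" ]
}
```

For example: `database_built ~/src/snpEff/snpEff/data pita_101 || echo "no local database for pita_101, check the build log"`.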

2.pita_201

2.1. Configure

cd ~/src/snpEff/snpEff/
vim snpEff.config

# pita, version pita_201
pita_201.genome : pita

2.2. Build path

mkdir -p ~/src/snpEff/snpEff/data/pita_201

2.3. Get the reference genome sequence (the file is so large that I used a symbolic link)

cd ~/src/snpEff/snpEff/data/genomes
ln -s mypath/pita_2.0/Pita.2_01.fa pita_201.fa

2.4. Get genome annotations. We merged HQgenes and LQgenes into pita_gene.

# gff

cd ~/src/snpEff/snpEff/data/pita_201
wget http://treegenesdb.org/FTP/Genomes/Pita/v2.01/annotation/Pita.2_01.gff.gz
gzip -d Pita.2_01.gff.gz
mv Pita.2_01.gff genes.gff
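
A common reason for a GFF-based build to come out empty is that the sequence names in column 1 of the GFF do not match the FASTA headers. Whether that is the problem here is not known from the post; the sketch below is just one way to check. It assumes the sequence ID is the first word after `>` in the FASTA and column 1 of each feature line in the GFF3, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print sequence IDs used in a GFF3 file that have
# no matching header in the FASTA file. Arguments: FASTA file, GFF3 file.
gff_ids_missing_from_fasta() {
    local fasta="$1" gff="$2" fa_ids gff_ids
    fa_ids="$(mktemp)"
    gff_ids="$(mktemp)"
    # FASTA ID: first whitespace-delimited word after '>'.
    grep '^>' "$fasta" | sed 's/^>//' | awk '{print $1}' | LC_ALL=C sort -u > "$fa_ids"
    # GFF3 seqid: column 1 of feature lines; stop at an embedded ##FASTA section.
    awk -F'\t' '/^##FASTA/ {exit} !/^#/ && NF >= 9 {print $1}' "$gff" | LC_ALL=C sort -u > "$gff_ids"
    # comm -13 keeps lines that appear only in the second (GFF) list.
    LC_ALL=C comm -13 "$fa_ids" "$gff_ids"
    rm -f "$fa_ids" "$gff_ids"
}
```

Running `gff_ids_missing_from_fasta ~/src/snpEff/snpEff/data/genomes/pita_201.fa genes.gff` should print nothing when every annotated sequence is present in the genome. On a genome of this size the sort step can take a while.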

2.5. Create the database

cd ~/src/snpEff/snpEff
java -Xmx64g -jar snpEff.jar build -gff3 -v pita_201

2.6. Run

java -Xmx30g -jar snpEff.jar pita_201

ERROR while connecting to http://downloads.sourceforge.net/project/snpeff/databases/v4_3/snpEff_v4_3_pita_201.zip

SNPEFF • 1.7k views

updated 5.8 years ago by Biostar 20 • written 5.9 years ago by nie shuai • 0

Please, add proper formatting to your post, in particular (but not limited to) by using the code button. I've done some formatting for you, but your post still needs attention.

Be aware BioStars uses markdown, so some characters add formatting to the post - as you may have noted. For example, # at the beginning of a line is not a comment, it formats text as heading:

bigger text here (here I wrote `# bigger text here`)

And not:

# comment (here I wrote \# comment)

P.S.: You are trying to perform the same task twice and failing both times for the same reason. You could keep your post shorter and easier to read by posting the problem for only one database; once you solve one case, you will have solved the other.

5.9 years ago by h.mon 35k

Try addressing:

ERROR while connecting to http://downloads.sourceforge.net/project/snpeff/databases/v4_3/snpEff_v4_3_pita_101.zip

5.8 years ago by cpad0112 21k