Get part of sequence from genome, given a start and stop position with Java.
5.0 years ago

I've got VCF-like files with start, stop, REF and ALT columns. I need to check that the REF allele of each variant matches the genome at that position, to confirm that the variants and the genome come from the same build. I also need the nucleotides surrounding the given position. Some of the REF columns are empty, so these are not valid VCF files.

I've got a FASTA file containing the genome sequence for chromosome 1, and I was wondering whether there's a library that returns part of the genome as nucleotides, given a start and stop position. For example, for the genome AACCGGTT, a start position of 1 and a stop position of 4 should return AACC. I could write such a parser myself, but I'd rather use a library that has the edge cases covered.
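To pin down the coordinate convention in that example, here is a minimal sketch in plain Java. It assumes the sequence is already in memory as a String and that positions are 1-based and inclusive on both ends; the class and method names (`RegionSlice`, `subsequence`) are made up for illustration and do not come from any library.

```java
final class RegionSlice {
    private RegionSlice() {}

    /**
     * Returns the bases from start to stop of the given sequence.
     * Assumption: positions are 1-based and inclusive on both ends.
     */
    static String subsequence(String sequence, int start, int stop) {
        if (start < 1 || stop < start || stop > sequence.length()) {
            throw new IllegalArgumentException(
                "Region " + start + "-" + stop + " is outside 1-" + sequence.length());
        }
        // 1-based inclusive [start, stop] maps to 0-based half-open [start - 1, stop).
        return sequence.substring(start - 1, stop);
    }
}
```

With this convention, `RegionSlice.subsequence("AACCGGTT", 1, 4)` gives AACC, matching the example. A whole chromosome is large, so an indexed reader is likely a better fit than holding the sequence in a String.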

I'd rather have something locally than use the API of NCBI, which also makes this possible.

genome java vcf • 1.1k views

Hi, you can use bedtools getfasta.

Best


samtools faidx, pyfaidx and bedtools getfasta can all retrieve part of a FASTA sequence given a start and stop position. They are not Java libraries, but they may be options to consider.

@Pierre has his jvarkit, which may have something that will work (if you must use Java): http://lindenb.github.io/jvarkit/


If it's anything like BioPython and you absolutely must use Java, there's no doubt something in BioJava which you could use.

I know less than nothing about Java specifically though so can't offer any practical code for this.

5.0 years ago

Use the htsjdk library and its class IndexedFastaSequenceFile: https://samtools.github.io/htsjdk/javadoc/htsjdk/htsjdk/samtools/reference/IndexedFastaSequenceFile.html

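(...)
// Requires a FASTA index next to the file (e.g. created with `samtools faidx`).
IndexedFastaSequenceFile faidx = new IndexedFastaSequenceFile(fastaFile);
// Coordinates are 1-based and inclusive on both ends.
String sub = faidx.getSubsequenceAt("chr1", 10, 20).getBaseString();
(...)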
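A lookup like this could be wired into the REF check and the flanking-sequence lookup from the question roughly as follows. This is only a sketch under assumptions: `RefCheck`, `RegionFetcher`, `refMatches` and `context` are hypothetical names, not part of htsjdk, and the contig length is supplied by the caller. The fetcher is a small interface so that the htsjdk call can be plugged in as a lambda.

```java
final class RefCheck {
    private RefCheck() {}

    /** Hypothetical lookup: returns the bases of a 1-based, inclusive region. */
    interface RegionFetcher {
        String fetch(String contig, int start, int stop);
    }

    /**
     * True if the REF allele equals the genome sequence starting at start.
     * An empty or missing REF cannot be verified, so it is reported as false.
     */
    static boolean refMatches(RegionFetcher genome, String contig, int start, String ref) {
        if (ref == null || ref.isEmpty()) {
            return false;
        }
        String observed = genome.fetch(contig, start, start + ref.length() - 1);
        // Ignore case so soft-masked (lowercase) reference bases still match.
        return observed.equalsIgnoreCase(ref);
    }

    /**
     * Returns start..stop plus up to flank bases on each side,
     * clamped to the contig boundaries 1..contigLength.
     */
    static String context(RegionFetcher genome, String contig,
                          int start, int stop, int flank, int contigLength) {
        int from = Math.max(1, start - flank);
        int to = Math.min(contigLength, stop + flank);
        return genome.fetch(contig, from, to);
    }
}
```

With htsjdk, the fetcher could be written as `(c, s, e) -> faidx.getSubsequenceAt(c, s, e).getBaseString()`. Rows with an empty REF column would need separate handling, since there is nothing to compare against.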

Yes, thank you! I was just looking at this library, but couldn't find the right function.


If an answer was helpful, you should upvote it; if it resolved your question, you should mark it as accepted.
