Obtaining Random Sequences From Genbank
13.3 years ago
Anima Mundi ★ 2.9k


I would like to know whether there is a way to obtain random sequences from GenBank's RefSeqs. I would also like to know whether a list of valid IDs of the different classes is available somewhere.

random sequence genbank refseq list • 2.8k views

Please clarify "random sequences". Do you want to retrieve sequences at random? If so, why? Or do you want to generate a random sequence using a RefSeq sequence as a seed?

13.3 years ago
Bio_X2Y ★ 4.4k

The most comprehensive way of getting a full list of RefSeq IDs and sequences would be to download the large release files from RefSeq via their FTP site. Beware - the file structure is complex, and you will need to do some background reading to figure it out. You will also need to decide things like:

  • do you want all sequences, or just one species, e.g. human?
  • do you care which version of RefSeq you are looking at?

A simpler (but less comprehensive) method is to download a file from the UCSC Table Browser. For example, select "RefSeq Genes" in the track field and "refGene" in the table field, then click "get output". This will generate a file listing about 40,000 human NM and NR sequences. (You can get other species by selecting different options.) This list will be a subset of the full RefSeq release, but should be good enough for most purposes.

Once you have the list of sequence IDs, you can look up the corresponding sequences via the RefSeq interface.
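As a rough illustration of the Table Browser route, the sketch below pulls the unique accessions out of a tab-separated table. It assumes the downloaded file starts with a header line beginning with `#` and keeps the accession in a column called `name`; check your own file, since the layout depends on the options you selected. The function name `read_accessions` and the example rows are made up for illustration.

```python
import csv


def read_accessions(lines, column="name"):
    """Return the sorted unique values found in `column` of a tab-separated table.

    Assumes the first line is a header, possibly prefixed with '#'.
    """
    rows = csv.reader(lines, delimiter="\t")
    header = [h.lstrip("#") for h in next(rows)]
    idx = header.index(column)
    # A set removes accessions that appear on more than one row.
    return sorted({row[idx] for row in rows if len(row) > idx})


# Tiny made-up example in the assumed layout (not real Table Browser output).
example = [
    "#bin\tname\tchrom\tstrand",
    "585\tNM_000001\tchr1\t+",
    "585\tNR_000002\tchr1\t-",
    "586\tNM_000001\tchr2\t+",
]
accessions = read_accessions(example)
```

With a real download you would pass an open file handle instead of the `example` list. Removing duplicates first matters if you later sample from the list, because an accession listed on several rows would otherwise be more likely to be picked.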

13.2 years ago
Anima Mundi ★ 2.9k

Yes, I want to retrieve random sequences. By random, I mean sequences chosen from a group of RefSeq genes in as unbiased a way as possible. I am interested in generating lists of randomly grouped genes to use as a control.

The Table Browser method helps, thank you. Indeed, I am interested only in species-specific sequences, and I find it easy to repeat the download periodically.
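For the random control lists described above, one possible approach is uniform sampling without replacement from the ID list, using Python's standard `random` module. This is only a sketch: the function name `random_control_groups`, the placeholder IDs, and the group sizes are assumptions for illustration, not part of any RefSeq or UCSC tool.

```python
import random


def random_control_groups(ids, group_size, n_groups, seed=None):
    """Draw `n_groups` random groups of `group_size` distinct IDs each.

    Every ID in the de-duplicated pool is equally likely to be picked.
    Groups are drawn independently, so two groups may share IDs.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    pool = sorted(set(ids))    # de-duplicate and fix the order before sampling
    return [rng.sample(pool, group_size) for _ in range(n_groups)]


# Placeholder IDs standing in for a real accession list.
ids = [f"GENE_{i:03d}" for i in range(100)]
groups = random_control_groups(ids, group_size=10, n_groups=5, seed=42)
```

Recording the seed together with the date of the download lets you regenerate the same control groups later, which helps if the underlying list changes between downloads.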


Hi wiee, two quick tips - if you want to clarify a question, you should either (a) edit your original question, or (b) post a comment under your question. You should not post your clarification as an answer since this is confusing for other readers. Also, I see you have now opened at least 6 accounts in BioStar. The intended way to use BioStar is to reuse a single account - this way you accumulate reputation, and other people can get to know you as part of the community.


I never meant to cause trouble here. The reason I have several accounts is that I never actually registered as a user; every time my cookies change, the account changes too. I thought this was legitimate: I just filled in the "Ask question" form, and the multiple accounts result from the forum's technical organization. For the same reason, when the site does not recognise me as wiee I lose the ability to comment, and that forces me to reply to others' comments in a new answer.


I really appreciate this site, but if you think that registration is a duty, then in my opinion you should not permit guests to post. By using the same nick repeatedly, I intended to be somehow recognisable, but I consider that a courtesy, not a duty. Sorry for the frankness, and thank you anyway for the time you spent helping me.

