I need to download all (or many) sequences of a specific bacterial gene from the GenBank nuccore database, restricted to entries that are complete genome sequences. I would prefer to use R. The query 'Bacteria[ORGN] AND gyrB[GENE] AND complete genome[TI]' in the web interface returns >10k hits. I do not want to download whole genome sequences, only the extracted gyrB sequences, to build a local database. I tried:
library(rentrez)
db = "nuccore"
query = "Bacteria[ORGN] AND gyrB[GENE] AND complete[TI]"
found = entrez_search(db, query, config = NULL, retmode = "xml", use_history = FALSE, retmax = 90000)
but this only fetches the IDs of whole genome sequences. Is it possible to get FASTA sequences for the gyrB genes, or at least the gyrB coordinates, without downloading the whole genome sequences of thousands of genomes?