Last step of metagenome analysis before visualization
13 days ago
Ayda Ecem • 0

I am trying to run a metagenome analysis for plant species. Since QIIME 2 uses the SILVA database, and that database is mostly used for bacteria, I wrote custom code for all of my steps. Right now I have approximately 11k rows of taxon IDs that I got from the NCBI database, but I am having trouble matching those taxonomy IDs to taxonomy names. I need to match the taxonomy, filter for the plant species, and plot a pie chart for them. I have been told that NCBI does not have an API that I could use to get the taxonomy names.

How can I solve my problem? Also, my code can be found below:

import pandas as pd

from Bio import Entrez, SeqIO

# Contact email sent with Entrez requests (NCBI asks for one)
Entrez.email = "email_address"

def get_taxonomic_info(accession_number):
    """
    Queries the NCBI database for the taxon ID of a given accession number.

    Parameters:
    - accession_number (str): The NCBI accession number.

    Returns:
    - str: The NCBI taxon ID as a string (empty if none is found).
    """
    handle = Entrez.efetch(db="nuccore", id=accession_number, rettype="gb", retmode="text")
    record = SeqIO.read(handle, "genbank")
    handle.close()

    # Extract the taxon ID from the "source" feature (db_xref="taxon:<id>")
    taxon_id = ""
    for feature in record.features:
        if feature.type == "source":
            # A source feature can carry several db_xref entries; keep the taxon one
            for xref in feature.qualifiers.get("db_xref", []):
                if xref.startswith("taxon:"):
                    taxon_id = xref.split(":")[1]
            break

    return taxon_id

def main():
    # Load the Excel file
    df = pd.read_excel(r"file_path")

    # Extract the accession numbers
    accession_numbers = df.iloc[:, 1].tolist()  # Assuming the accession numbers are in the second column

    # Write one "accession<TAB>taxon ID" line per accession number
    with open(r"output_path", "w") as outfile:
        for accession_number in accession_numbers:
            taxonomic_info = get_taxonomic_info(accession_number)
            outfile.write(f"{accession_number}\t{taxonomic_info}\n")

if __name__ == "__main__":
    main()
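
For the matching step itself, one direction that might work is sketched below. It assumes the Entrez `taxonomy` database can be queried by taxon ID through `Entrez.efetch`, in the same way `nuccore` is queried above, and that each returned record exposes `TaxId`, `ScientificName` and `Lineage` fields. The helper names (`chunked`, `fetch_lineages`, `is_plant`, `plant_counts`) and the batch size of 200 are made up for illustration, and I have not run this against the full 11k IDs.

```python
from collections import Counter


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_lineages(taxon_ids, email, batch_size=200):
    """Map each taxon ID to (scientific name, lineage string) via Entrez.

    Assumption: records from db="taxonomy" carry "TaxId", "ScientificName"
    and "Lineage". IDs that NCBI has merged may come back under a different
    TaxId and would then be missing from the returned dict.
    """
    # Imported here so the pure helpers below also work without Biopython.
    from Bio import Entrez

    Entrez.email = email
    lineages = {}
    for batch in chunked([str(t) for t in taxon_ids], batch_size):
        handle = Entrez.efetch(db="taxonomy", id=",".join(batch), retmode="xml")
        records = Entrez.read(handle)
        handle.close()
        for rec in records:
            lineages[str(rec["TaxId"])] = (
                str(rec["ScientificName"]),
                str(rec["Lineage"]),
            )
    return lineages


def is_plant(lineage):
    """True if the semicolon-separated lineage contains Viridiplantae."""
    return "Viridiplantae" in [name.strip() for name in lineage.split(";")]


def plant_counts(taxon_ids, lineages):
    """Count occurrences of each plant species among `taxon_ids`."""
    counts = Counter()
    for tid in taxon_ids:
        name, lineage = lineages.get(str(tid), (None, ""))
        if name and is_plant(lineage):
            counts[name] += 1
    return counts
```

With the counts in hand, something like `pd.Series(plant_counts(ids, lineages)).plot.pie()` should give the pie chart, although with many species the smaller slices would probably need to be grouped into an "other" category first.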

metagenome python analyis • 100 views