Question

Off topic: IndexError: list index out of range

4.9 years ago

flogin ▴ 280

Hello guys, I have this file:

Input Data File: GSTE1_EXON.AB.02.fas
   Variable (polymorphic) sites: 0   (Total number of mutations: 0)
Input Data File: GSTE1_EXON.AB.02.fas
 Total number of mutations, Eta: 0
 Theta (per site) from S, Theta-W: 0,0000000000
    Variance of theta (no recombination): 0,0000000
    Variance of theta (free recombination): 0,0000000
 Average number of nucleotide differences, k: 0,000
 Theta (per sequence) from S, Theta-W: 0,000
    Variance of theta (no recombination): 0,000
    Variance of theta (free recombination): 0,000
 Input Data File: GSTE1_EXON.AB.02.fas
 Number of pairwise comparisons: 0
 Number of significant pairwise comparisons by Fisher's exact test: 0
    Number of significant comparisons using the Bonferroni procedure: 0
 Number of significant pairwise comparisons by chi-square test: 0
    Number of significant comparisons using the Bonferroni procedure: 0
Fu and Li's test
 Input Data File: GSTE1_EXON.AB.02.fas
 Total number of mutations, Eta: 0
 Fu and Li's D* test statistic: 0,00000
     Statistical significance: Not significant, P > 0.10
 Fu and Li's F* test statistic: 0,00000
     Statistical significance: Not significant, P > 0.10
 Input Data File: GSTE1_EXON.AB.02.fas
 Total number of mutations, Eta: 0
 Input Data File: GSTE1_EXON.AR.01.fas
   Variable (polymorphic) sites: 0   (Total number of mutations: 0)
 Input Data File: GSTE1_EXON.AR.01.fas
 Total number of mutations, Eta: 0
 Theta (per site) from S, Theta-W: 0,0000000000
    Variance of theta (no recombination): 0,0000000
    Variance of theta (free recombination): 0,0000000
 Average number of nucleotide differences, k: 0,000
 Theta (per sequence) from S, Theta-W: 0,000
    Variance of theta (no recombination): 0,000
    Variance of theta (free recombination): 0,000
 Input Data File: GSTE1_EXON.AR.01.fas
 Number of pairwise comparisons: 0
 Number of significant pairwise comparisons by Fisher's exact test: 0
    Number of significant comparisons using the Bonferroni procedure: 0
 Number of significant pairwise comparisons by chi-square test: 0
    Number of significant comparisons using the Bonferroni procedure: 0
Fu and Li's
 Input Data File: GSTE1_EXON.AR.01.fas
 Total number of mutations, Eta: 0
 Fu and Li's D* test statistic: 0,00000
     Statistical significance: Not significant, P > 0.10
 Fu and Li's F* test statistic: 0,00000
     Statistical significance: Not significant, P > 0.10
 Input Data File: GSTE1_EXON.AR.01.fas
 Total number of mutations, Eta: 0
 Input Data File: GSTE1_EXON.CA.02.fas
   Variable (polymorphic) sites: 0   (Total number of mutations: 0)
 Input Data File: GSTE1_EXON.CA.02.fas
 Total number of mutations, Eta: 0
 Theta (per site) from S, Theta-W: 0,0000000000
    Variance of theta (no recombination): 0,0000000
    Variance of theta (free recombination): 0,0000000
 Average number of nucleotide differences, k: 0,000
 Theta (per sequence) from S, Theta-W: 0,000
    Variance of theta (no recombination): 0,000
    Variance of theta (free recombination): 0,000
 Input Data File: GSTE1_EXON.CA.02.fas
 Number of pairwise comparisons: 0
 Number of significant pairwise comparisons by Fisher's exact test: 0
    Number of significant comparisons using the Bonferroni procedure: 0
 Number of significant pairwise comparisons by chi-square test: 0
    Number of significant comparisons using the Bonferroni procedure: 0
 Input Data File: GSTE1_EXON.CA.02.fas
 Total number of mutations, Eta: 0
 Fu and Li's D* test statistic: 0,00000
     Statistical significance: Not significant, P > 0.10
 Fu and Li's F* test statistic: 0,00000
     Statistical significance: Not significant, P > 0.10
 Input Data File: GSTE1_EXON.CA.02.fas
 Total number of mutations, Eta: 0

And I wrote this script:

# -*- coding: utf-8 -*-
import pandas as pd # to convert the list of dictionaries to a data frame
file = open("GSTE1.txt.add","r")
# creating a list of terms that I want
keys_order = ["Input Data File","Average number of nucleotide differences, k","Total number of mutations, Eta","Theta (per sequence) from S, Theta-W","Theta (per site) from S, Theta-W","Variance of theta (no recombination)","Variance of theta (free recombination)","Theta (per sequence) from S, Theta-W","Fu and Li's D* test statistic, FLD*","Fu and Li's F* test statistic, FLF*","Fu and Li's D* test statistic","Fu and Li's F* test statistic","Number of pairwise comparisons","Number of significant pairwise comparisons by Fisher's exact test","Number of significant pairwise comparisons by chi-square test","Number of significant comparisons using the Bonferroni procedure"]
dic = {} #creating a dictionary to use in first list
dictio = {} # creating a dictionary to use in second list
big_list = [] # creating a list to put dictionaries in a first search
list_dictio = [] # creating a list to put dictionaries in a second search
aux = "" 

for line in file:
    if line.strip().split(":")[0] == "Input Data File": 
        atrib = line.strip().split(":")[1]
    if atrib == aux:
        dic[line.strip().split(":")[0]] = line.strip().split(":")[1].strip()
    else: 
        big_list.append(dic)
        aux = atrib
        dic = {} 

for fasta in big_list: 
    for i in keys_order:
        if i in fasta:
            dictio[i]=fasta[i]
        else:
            dictio[i] ='-'
    list_dictio.append(dictio) 
    dictio = dictio.fromkeys(dictio,0) 
table = pd.DataFrame.from_records(list_dictio) 
export_csv = table.to_csv(r'/home/user/Dropbox/jupyter/output.csv', index = None, header=True)

The output:

Traceback (most recent call last):
  File "DNAsp2.py", line 16, in <module>
    dic[line.strip().split(":")[0]] = line.strip().split(":")[1].strip()
IndexError: list index out of range

What doesn't make sense to me is that I ran this script on other files and everything worked fine.

Can anyone help me?
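
A likely cause, judging only from the input shown above: two lines in this file ("Fu and Li's test" and "Fu and Li's") contain no colon, so `split(":")` returns a one-element list and indexing it with `[1]` raises the IndexError. Files without such header lines would run fine. The following is a minimal sketch of a more tolerant parser, not the original script: the function name `parse_blocks` and the shortened sample lines are hypothetical, and it assumes that lines without a colon can simply be skipped.

```python
def parse_blocks(lines):
    """Group 'key: value' lines into one dict per input file (illustrative sketch)."""
    records = []
    current = {}
    current_file = None
    for raw in lines:
        # partition() always returns three parts; sep is "" when no colon is present
        key, sep, value = raw.strip().partition(":")
        if not sep:
            # Header-only line such as "Fu and Li's test": nothing to store
            continue
        key, value = key.strip(), value.strip()
        if key == "Input Data File" and value != current_file:
            # A new file name starts a new record
            if current:
                records.append(current)
            current = {}
            current_file = value
        # Repeated keys within one file overwrite earlier values
        current[key] = value
    if current:
        # Keep the record for the last file as well
        records.append(current)
    return records


# Hypothetical, shortened sample in the same layout as the file above
sample = [
    "Input Data File: A.fas",
    " Total number of mutations, Eta: 0",
    "Fu and Li's test",
    " Input Data File: A.fas",
    " Fu and Li's D* test statistic: 0,00000",
    " Input Data File: B.fas",
    " Total number of mutations, Eta: 0",
]
records = parse_blocks(sample)
```

Using `str.partition` instead of `split(":")[1]` means a line without a colon yields an empty separator rather than an exception, so the check can be made explicit. The resulting list of dictionaries could then be passed to `pd.DataFrame.from_records` as in the script above.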

python dictionary • 846 views
