4.9 years ago
flogin
Hello guys, I have this file:
Input Data File: GSTE1_EXON.AB.02.fas
Variable (polymorphic) sites: 0 (Total number of mutations: 0)
Input Data File: GSTE1_EXON.AB.02.fas
Total number of mutations, Eta: 0
Theta (per site) from S, Theta-W: 0,0000000000
Variance of theta (no recombination): 0,0000000
Variance of theta (free recombination): 0,0000000
Average number of nucleotide differences, k: 0,000
Theta (per sequence) from S, Theta-W: 0,000
Variance of theta (no recombination): 0,000
Variance of theta (free recombination): 0,000
Input Data File: GSTE1_EXON.AB.02.fas
Number of pairwise comparisons: 0
Number of significant pairwise comparisons by Fisher's exact test: 0
Number of significant comparisons using the Bonferroni procedure: 0
Number of significant pairwise comparisons by chi-square test: 0
Number of significant comparisons using the Bonferroni procedure: 0
Fu and Li's test
Input Data File: GSTE1_EXON.AB.02.fas
Total number of mutations, Eta: 0
Fu and Li's D* test statistic: 0,00000
Statistical significance: Not significant, P > 0.10
Fu and Li's F* test statistic: 0,00000
Statistical significance: Not significant, P > 0.10
Input Data File: GSTE1_EXON.AB.02.fas
Total number of mutations, Eta: 0
Input Data File: GSTE1_EXON.AR.01.fas
Variable (polymorphic) sites: 0 (Total number of mutations: 0)
Input Data File: GSTE1_EXON.AR.01.fas
Total number of mutations, Eta: 0
Theta (per site) from S, Theta-W: 0,0000000000
Variance of theta (no recombination): 0,0000000
Variance of theta (free recombination): 0,0000000
Average number of nucleotide differences, k: 0,000
Theta (per sequence) from S, Theta-W: 0,000
Variance of theta (no recombination): 0,000
Variance of theta (free recombination): 0,000
Input Data File: GSTE1_EXON.AR.01.fas
Number of pairwise comparisons: 0
Number of significant pairwise comparisons by Fisher's exact test: 0
Number of significant comparisons using the Bonferroni procedure: 0
Number of significant pairwise comparisons by chi-square test: 0
Number of significant comparisons using the Bonferroni procedure: 0
Fu and Li's
Input Data File: GSTE1_EXON.AR.01.fas
Total number of mutations, Eta: 0
Fu and Li's D* test statistic: 0,00000
Statistical significance: Not significant, P > 0.10
Fu and Li's F* test statistic: 0,00000
Statistical significance: Not significant, P > 0.10
Input Data File: GSTE1_EXON.AR.01.fas
Total number of mutations, Eta: 0
Input Data File: GSTE1_EXON.CA.02.fas
Variable (polymorphic) sites: 0 (Total number of mutations: 0)
Input Data File: GSTE1_EXON.CA.02.fas
Total number of mutations, Eta: 0
Theta (per site) from S, Theta-W: 0,0000000000
Variance of theta (no recombination): 0,0000000
Variance of theta (free recombination): 0,0000000
Average number of nucleotide differences, k: 0,000
Theta (per sequence) from S, Theta-W: 0,000
Variance of theta (no recombination): 0,000
Variance of theta (free recombination): 0,000
Input Data File: GSTE1_EXON.CA.02.fas
Number of pairwise comparisons: 0
Number of significant pairwise comparisons by Fisher's exact test: 0
Number of significant comparisons using the Bonferroni procedure: 0
Number of significant pairwise comparisons by chi-square test: 0
Number of significant comparisons using the Bonferroni procedure: 0
Input Data File: GSTE1_EXON.CA.02.fas
Total number of mutations, Eta: 0
Fu and Li's D* test statistic: 0,00000
Statistical significance: Not significant, P > 0.10
Fu and Li's F* test statistic: 0,00000
Statistical significance: Not significant, P > 0.10
Input Data File: GSTE1_EXON.CA.02.fas
Total number of mutations, Eta: 0
And I wrote this script:
# -*- coding: utf-8 -*-
import pandas as pd # to convert the list of dictionaries to a data frame
file = open("GSTE1.txt.add","r")
# creating a list of terms that I want
keys_order = ["Input Data File","Average number of nucleotide differences, k","Total number of mutations, Eta","Theta (per sequence) from S, Theta-W","Theta (per site) from S, Theta-W","Variance of theta (no recombination)","Variance of theta (free recombination)","Theta (per sequence) from S, Theta-W","Fu and Li's D* test statistic, FLD*","Fu and Li's F* test statistic, FLF*","Fu and Li's D* test statistic","Fu and Li's F* test statistic","Number of pairwise comparisons","Number of significant pairwise comparisons by Fisher's exact test","Number of significant pairwise comparisons by chi-square test","Number of significant comparisons using the Bonferroni procedure"]
dic = {} #creating a dictionary to use in first list
dictio = {} # creating a dictionary to use in second list
big_list = [] # creating a list to put dictionaries in a first search
list_dictio = [] # creating a list to put dictionaries in a second search
aux = ""
for line in file:
    if line.strip().split(":")[0] == "Input Data File":
        atrib = line.strip().split(":")[1]
    if atrib == aux:
        dic[line.strip().split(":")[0]] = line.strip().split(":")[1].strip()
    else:
        big_list.append(dic)
        aux = atrib
        dic = {}
for fasta in big_list:
    for i in keys_order:
        if i in fasta:
            dictio[i]=fasta[i]
        else:
            dictio[i] ='-'
    list_dictio.append(dictio)
    dictio = dictio.fromkeys(dictio,0)
table = pd.DataFrame.from_records(list_dictio)
export_csv = table.to_csv(r'/home/user/Dropbox/jupyter/output.csv', index = None, header=True)
The output:
Traceback (most recent call last):
  File "DNAsp2.py", line 16, in <module>
    dic[line.strip().split(":")[0]] = line.strip().split(":")[1].strip()
IndexError: list index out of range
What doesn't make sense to me is that I ran this script on other files and everything ran fine.
Can anyone help me?
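One possible cause, offered as a sketch rather than a confirmed diagnosis: this file contains section-header lines such as "Fu and Li's test" that have no colon. For such a line, `split(":")` returns a one-element list, so indexing `[1]` raises the `IndexError` shown in the traceback. Files without such lines would not trigger it, which may be why the script ran fine elsewhere. The example below uses `str.partition` to skip colon-free lines. The names `parse_blocks`, `sample`, and `records` are hypothetical and not part of the original script, and keeping the first value for a repeated key is an assumption about the desired behavior.

```python
def parse_blocks(lines):
    """Group 'key: value' lines into one dict per input file (hypothetical helper)."""
    records = []
    current = None
    for raw in lines:
        # partition never raises: sep is '' when the line has no colon
        key, sep, value = raw.strip().partition(":")
        if not sep:
            # Header lines such as "Fu and Li's test" have no colon; skip them
            continue
        key, value = key.strip(), value.strip()
        if key == "Input Data File" and (current is None or current[key] != value):
            # A new file name starts a new record
            current = {key: value}
            records.append(current)
        elif current is not None:
            # Assumption: keep the first value seen for a repeated key
            current.setdefault(key, value)
    return records


sample = [
    "Input Data File: GSTE1_EXON.AB.02.fas",
    "Total number of mutations, Eta: 0",
    "Fu and Li's test",
    "Fu and Li's D* test statistic: 0,00000",
    "Input Data File: GSTE1_EXON.AR.01.fas",
    "Total number of mutations, Eta: 0",
]

# The colon-free line splits into a single element, so [1] would fail
print("Fu and Li's test".split(":"))

records = parse_blocks(sample)
print(records)
```

Because each record is appended as soon as its file name is first seen, the last file in the input is kept as well, and no empty dictionary ends up at the start of the list.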