Biostar Beta. Not for public use.
Question: How to convert NGS .tabular file to excel file?
0
Entering edit mode

I am using Galaxy for the NGS analysis. When I use tools like featureCounts, DESeq, etc., it gives output file in .tabular format. I need to see this data in a well-organized table in Microsoft Excel (xlsx) format. Could anyone help, please?

ADD COMMENTlink 17 months ago Glory Basumata • 140 • updated 15 months ago writetoroopali • 0
Entering edit mode
2

try importing it just the way you import a tab separated file in excel and also post few lines from example file, from next time Glory Basumata

ADD REPLYlink 17 months ago
cpad0112
11k
Entering edit mode
6

enter image description here

ADD REPLYlink 17 months ago
Pierre Lindenbaum
120k
Entering edit mode
0

Hi all, Thanks for the instructions. I will try this.

@Pierre: I like the cartoon icon :) Will try opening the file in R instead.

ADD REPLYlink 17 months ago
Glory Basumata
• 140
Entering edit mode
0

Look at this post to see a live example of what Excel does to some gene names.

ADD REPLYlink 17 months ago
h.mon
25k
Entering edit mode
0

Hi All, Thank you for your responses to my question. I am sharing the DESeq2 result output file so that you could try opening it. I am trying to filter out top 10 to 50 genes which are either upregulated or downregulated for this file.

Link to file: https://drive.google.com/file/d/1ItlSqE8jH645z_REd2gc_oVJo6_CaHPf/view?usp=sharing

ADD REPLYlink 17 months ago
Glory Basumata
• 140
Entering edit mode
1

Hello Glory Basumata ,

you should better start a new thread for this as it doesn't fit to your original question. But before doing it, try to solve this by yourself as this is a trivial task with unix commands sort and heador tail.

fin swimmer

ADD REPLYlink 17 months ago
finswimmer
11k
Entering edit mode
0

Hey can you please tell me how did you open the .tabular file in R or excel?

ADD REPLYlink 15 months ago
writetoroopali
• 0
Entering edit mode
0

What have you tried? Did you check with a simple read.table?

ADD REPLYlink 15 months ago
RamRS
21k
2
Entering edit mode

Hello Glory Basumata ,

first of all: You don't realy want to do this. It is very likely, that at a later points you will convert it back to something, that a too, which you like to use for further analysis, needs and then the nightmare begins ...

If you don't take care of my warning than you simply can open those files in excel. It will ask you how to convert and you have to select that those file are tab delimited. In the next step you can choose what is the decimal seperator.

But you have to be aware that excel also might try to convert some values to something it believe it could be. So e.g. a gene called SEP09can become 01.09.2009.

So again: DON'T DO THIS!

fin swimmer

ADD COMMENTlink 17 months ago finswimmer 11k
Entering edit mode
0

Hi finswimmer,

Thank you for the warning. I appreciate your concern. I understand that excel is not the right tool for NGS study. My main intention to open/visualize tabular file is to check the DESeq result and filter out the differentially expressed gene. I also want to filter out the top 10 or 20 or 50 upregulated and downregulated genes. Furthermore, I would like to do GO (Gene Ontology) enrichment using GOseq. Kindly suggest a tool to study such tabular files. Thank you.

ADD REPLYlink 17 months ago
Glory Basumata
• 140
Entering edit mode
1

All of that can and should be done in R. It has a steeper learning curve than just using Excel, but investing time now will definitely pay off in the long run.

ADD REPLYlink 17 months ago
WouterDeCoster
39k
Entering edit mode
1

You can use any tool that treats the file as a plain text file and the data in it as text. Excel does not do the latter implicitly. Read this paper to see a few examples of what can happen.

ADD REPLYlink 17 months ago
RamRS
21k
Entering edit mode
0

Thanks for sharing the paper. I'll add it to my knowledge and personal folder.

ADD REPLYlink 17 months ago
Glory Basumata
• 140
2
Entering edit mode

Hi,

here is an python script which takes as first argument TSV and second argument creates xlsx file.

import csv
import sys
reload(sys)
sys.setdefaultencoding('utf8')
from xlsxwriter.workbook import Workbook

# Add some command-line logic to read the file names.
tsv_file = sys.argv[1]
xlsx_file = sys.argv[2]

# Create an XlsxWriter workbook object and add a worksheet.
workbook = Workbook(xlsx_file)
worksheet = workbook.add_worksheet()

# Create a TSV file reader.
tsv_reader = csv.reader(open(tsv_file, 'rb'), delimiter='\t')

# Read the row data from the TSV file and write it to the XLSX file.
for row, data in enumerate(tsv_reader):
    worksheet.write_row(row, 0, data)

# Close the XLSX file.
workbook.close()

You can save this code to file as conversion.py and make executable:

chmod +x conversion.py

Then run it as:

./conversion.py input.tsv output.xlsx
ADD COMMENTlink 17 months ago Paul ♦ 1.3k
Entering edit mode
2

What is the advantage of this script over a plain import? Does this bypass the auto datatype assignment that Excel does?

ADD REPLYlink 17 months ago
RamRS
21k
Entering edit mode
0

Hi Paul, thanks for sharing the script. However, does it work with the format (.tabular)? Also how useful is this?

ADD REPLYlink 17 months ago
Glory Basumata
• 140
Entering edit mode
2

.tabular is a file extension, not a data format. Your data format is probably tab-separated values, and this script should work on any tab-separated dataset.

ADD REPLYlink 17 months ago
RamRS
21k
2
Entering edit mode

Have you tried just opening the output in Excel? It's likely just tab delimited text, Excel shouldn't have an issue with that. Just pay attention to what Excel does to your gene names.

ADD COMMENTlink 17 months ago swbarnes2 5.7k
Entering edit mode
0

Unless of course, the data has an unusable number of rows or columns. Or the process of Excel-opening it crashes the app or worse, the computer. Better off opening a 10kb chunk of it.

ADD REPLYlink 17 months ago
RamRS
21k
Entering edit mode
0

In which case, converting it's format won't help. The poster in that case will need to use R, or a database program. But the poster almost certainly does not need to actually convert their file to .xlsx format.

ADD REPLYlink 17 months ago
swbarnes2
5.7k
Entering edit mode
0

I'm not saying "converting format will help". I'm saying "Don't use Excel even for opening/viewing the file"

ADD REPLYlink 17 months ago
RamRS
21k
Entering edit mode
0

Hi swbarnes2, I tried opening it with excel but it just refuses to recognize the file extension .tabular. Howeve, this file type/extention could be imported in the program R and viewed.

ADD REPLYlink 17 months ago
Glory Basumata
• 140

Login before adding your answer.

Similar Posts
Loading Similar Posts
Powered by the version 2.0