Combining different tasks in one line of code in terminal
5.9 years ago
Za ▴ 140

Hi,

I have many .xls files containing my read counts.

I need to combine these files, and while doing so remove every row that contains "exon", remove columns 1, 2, 3, 5, 6 and 8,

and remove _mRNA from the end of each entry in column 4.

How can I do that in the terminal?

I think doing this manually would take too much time and would not be accurate enough.

sequencing Terminal • 1.8k views

Good description of data. Please post example data and the expected output.

5.9 years ago

Take a look at csvtk, a cross-platform, efficient and practical CSV/TSV toolkit in Golang. It provides many subcommands for lots of csv/tsv manipulations:

  • csvtk xlsx2csv: converts .xlsx (not .xls) to CSV.
  • csvtk grep: searches specific columns using a regular expression (invert the match with -v).
  • csvtk cut: selects or unselects columns, e.g., csvtk cut -f -1,-2,-3,-5,-6,-8.
  • csvtk replace: edits the values of selected columns using a replacement expression, e.g., csvtk replace -f 4 -p "_mRNA$".
  • csvtk join: merges files according to key column(s).

csvtk supports reading from stdin, so you can chain these subcommands with pipes into a single one-line command that does all of the cleaning.
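To make the filter, cut and replace steps concrete, here is a minimal sketch that uses plain awk instead of csvtk, so it runs without installing anything. It rests on several assumptions that are not in the question: the .xls files have already been exported to tab-separated text, there is no header row, and "exon" may appear anywhere in a row. The function name clean_counts is hypothetical.

```shell
# clean_counts: read tab-separated rows from the given files (or stdin) and
#   1. drop every row that contains "exon" (assumption: match anywhere in the row),
#   2. strip a trailing _mRNA from column 4,
#   3. drop columns 1, 2, 3, 5, 6 and 8, keeping column 4, column 7 and any column from 9 on.
clean_counts() {
  awk -F'\t' -v OFS='\t' '
    /exon/ { next }
    {
      sub(/_mRNA$/, "", $4)
      out = $4
      for (i = 7; i <= NF; i++)
        if (i != 8) out = out OFS $i
      print out
    }' "$@"
}

# Two made-up rows with 8 columns each; the second one is an exon row.
printf 'chr1\tsrc\tmRNA\tgeneA_mRNA\t1\t100\t25\t+\nchr1\tsrc\texon\tgeneA_mRNA\t1\t50\t10\t+\n' |
  clean_counts
# prints one line: geneA<TAB>25
```

If the real files have a header row or use a different delimiter, the awk options would need adjusting accordingly.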

Combined with a shell for loop or parallel, this makes it easy to process multiple files in parallel.
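As a rough sketch of the shell for loop idea, the function below cleans each file in turn and concatenates the results. It again swaps csvtk for plain awk and assumes 8-column, tab-separated, header-less text files with a .tsv extension, so that only columns 4 and 7 remain after the cut. The function name combine_counts and the extra leading column holding the file name are illustrative choices, not something the question asked for.

```shell
# combine_counts: for every file given, drop "exon" rows, strip a trailing
# _mRNA from column 4, and print: <file name without .tsv>, column 4, column 7.
# Assumption: 8 tab-separated columns, so 4 and 7 are all that is left
# once columns 1, 2, 3, 5, 6 and 8 are removed.
combine_counts() {
  for f in "$@"; do
    sample=$(basename "$f" .tsv)
    awk -F'\t' -v OFS='\t' -v s="$sample" '
      !/exon/ { sub(/_mRNA$/, "", $4); print s, $4, $7 }' "$f"
  done
}

# Demo on two made-up files in a temporary directory.
tmp=$(mktemp -d)
printf 'chr1\tsrc\tmRNA\tgeneA_mRNA\t1\t100\t25\t+\n' > "$tmp/s1.tsv"
printf 'chr1\tsrc\texon\tgeneB_mRNA\t1\t50\t10\t+\nchr2\tsrc\tmRNA\tgeneB_mRNA\t5\t90\t7\t-\n' > "$tmp/s2.tsv"
combine_counts "$tmp"/*.tsv
rm -r "$tmp"
```

The loop runs the files one after another; the sample-name column is there only so that rows can still be traced back to their source file after concatenation.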

It's better if you can provide some sample data.

5.9 years ago

Of course, if you want to keep your data in the Excel format and still work with it programmatically (that is, sensibly :D), then you could use a file parser designed specifically for manipulating Excel files, e.g. this one.

The next option is to convert all the data into simple text files, and perhaps post a sample here so that others can understand what you have and what you want. But before posting anything, try your best!
