Question

How can I extract each possible APA site from this table?


3.5 years ago

gandrescabrera ▴ 80

Hello,

I have output from the APATrap package, which reports the predicted APA sites for each sample. I need to list every possible APA site so that I can cross-reference the list against a miRNA database and find out whether there is a match. The problem is that I have huge lists from 16 samples, and I need a way to automate this task for reproducibility.

I have this summary table:

Predicted_APA.TOTAL         Loci.TOTAL
41867799,41867388,41866990  chr17:41866927-41867904

And I need something like this:

CHR     Loci.UTR.TOTAL      Locus initial   Locus Final
chr17   41866927-41867799   41866927        41867799
chr17   41866927-41867388   41866927        41867388
chr17   41866927-41866990   41866927        41866990

Do you have any tips or clues on how to do this for 16 samples with 1000 results each?

Thanks in advance, and sorry if my question isn't appropriate. I will answer any questions from other people in the same situation.

RNA-Seq R genome


score 3 · Accepted Answer · 2020-11-11

input:

$ cat test.txt

Predicted_APA.TOTAL Loci.TOTAL
41867799,41867388,41866990  chr17:41866927-41867904

output:

$ awk -v FS='[-:\t]' -v OFS="\t" 'NR != 1 {n = split($1, a, ","); for (i = 1; i <= n; i++) print $2, $3 "-" a[i], $3, a[i]}' test.txt

chr17   41866927-41867799   41866927    41867799
chr17   41866927-41867388   41866927    41867388
chr17   41866927-41866990   41866927    41866990
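
To cover the 16 samples mentioned in the question, the same awk program can be wrapped in a shell function and run over every file. This is only a sketch under some assumptions: each sample's table is tab-separated with a single header line, and the files are named like sample_*.txt. The function name expand_apa, the glob, and the .expanded.tsv suffix are made up for illustration. It uses an indexed loop over the result of split(), because awk does not guarantee the traversal order of `for (i in a)`.

```shell
#!/bin/sh
# expand_apa FILE: print one row per predicted APA site.
# Assumes a tab-separated table with one header line, where column 1 is a
# comma-separated list of sites and column 2 looks like chr:start-end.
expand_apa() {
    awk -v FS='[-:\t]' -v OFS='\t' '
        NR > 1 {
            n = split($1, sites, ",")      # n = number of APA sites on this row
            for (i = 1; i <= n; i++)       # indexed loop keeps the input order
                print $2, $3 "-" sites[i], $3, sites[i]
        }' "$1"
}

# Hypothetical file names: adjust the glob to match your own sample files.
for f in sample_*.txt; do
    [ -e "$f" ] || continue                # skip if the glob matched nothing
    expand_apa "$f" > "${f%.txt}.expanded.tsv"
done
```

Each sample then gets its own output file next to the input, so the whole run can be repeated with one command. Note that the field separator splits on every "-" and ":", so this would need adjusting if a chromosome name contained either character.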