Sort and merge FASTA files by chromosome number
4.8 years ago
woojoy14 ▴ 10

I have been working through the GATK Best Practices and got stuck on this problem.

I downloaded the hg19 chromFa files from UCSC and am now trying to sort and merge them by chromosome number.

I used the command below, taken from this blog post: https://digibio.blogspot.com/2014/07/sort-and-merge-fasta-files-by.html

cat chrM.fa `ls  *.fa | sort -V | grep -i -v chrM `  > hg19.fa


$ grep chr hg19.fa

>chrM
>chr1
>chr1_gl000191_random
>chr1_gl000192_random
>chr2
>chr3
>chr4
>chr4_ctg9_hap1
>chr4_gl000193_random
>chr4_gl000194_random
>chr5
>chr6
>chr6_apd_hap1
>chr6_cox_hap2
>chr6_dbb_hap3
>chr6_mann_hap4
>chr6_mcf_hap5
>chr6_qbl_hap6
>chr6_ssto_hap7
>chr7
>chr7_gl000195_random
>chr8
>chr8_gl000196_random
>chr8_gl000197_random
>chr9
>chr9_gl000198_random
>chr9_gl000199_random
>chr9_gl000200_random
>chr9_gl000201_random
>chr10
>chr11
>chr11_gl000202_random
>chr12
>chr13
>chr14
>chr15
>chr16
>chr17
>chr17_ctg5_hap1
>chr17_gl000203_random
>chr17_gl000204_random
>chr17_gl000205_random
>chr17_gl000206_random
>chr18
>chr18_gl000207_random
>chr19
>chr19_gl000208_random
>chr19_gl000209_random
>chr20
>chr21
>chr21_gl000210_random
>chr22
>chrX
>chrY

However, as you can see above, it didn't do what I wanted: the merged file still contains the 'hap' and 'random' sequences.

How can I exclude every file with 'hap', 'random', or 'chrM' in its name?

Thank you for your help in advance!

4.8 years ago

How can I get rid of every file with 'hap' and 'random' and 'chrM'?

ls *.fa | grep -vE '_|chrM' | sort -V | xargs  cat > hg19.fasta
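
If you want to check the filter before running it on the real files, here is a rough sketch on dummy data. Everything in it is made up for illustration: the scratch directory, the tiny placeholder sequences, and the stale hg19.fa file. The extra `^hg19` pattern is my own addition, on the assumption that the hg19.fa from the earlier attempt is still in the directory; without it that file would match `*.fa` and be merged in as well. It also assumes a `sort` that supports `-V` (GNU coreutils does).

```shell
#!/bin/sh
# Illustration only: tiny dummy FASTA files stand in for the real chromFa files.
set -eu
export LC_ALL=C

# Work in a throwaway scratch directory (hypothetical, not part of the real data).
workdir=$(mktemp -d)
cd "$workdir"

# One placeholder record per file; the header matches the file name.
for name in chrM chr1 chr1_gl000191_random chr2 chr6_apd_hap1 chr10 chrX chrY; do
    printf '>%s\nACGT\n' "$name" > "$name.fa"
done

# Simulate a leftover merged file from an earlier attempt.
printf '>stale\nACGT\n' > hg19.fa

# Drop names containing '_' (hap/random contigs), chrM, and the leftover hg19.fa,
# then version-sort what remains and concatenate it.
ls *.fa | grep -vE '_|chrM|^hg19' | sort -V | xargs cat > hg19.fasta

# List the headers of the merged file to check the order.
grep '^>' hg19.fasta
```

With GNU sort this should list only >chr1, >chr2, >chr10, >chrX and >chrY, in that order. Matching on '_' works here because, in the listing above, every hap and random file has an underscore in its name and the main chromosomes do not.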

Thank you very much!