6 Reading frames in R for protein from DNA sequence
8.9 years ago

Can anyone help me get the protein translations of all 6 reading frames of a DNA sequence in R?

Thanks

Bioconductor R • 4.0k views

My solution:

The getTrans function from the seqinr package is useful for this.
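As a minimal sketch of how getTrans can cover all six frames: the helper name six_frames and the toy sequence below are made up for illustration, and the code assumes seqinr is installed. It loops over the three frame offsets on the forward strand (sens = "F") and on the reverse complement (sens = "R").

```r
library(seqinr)

# Hypothetical helper: translate a DNA string in all six reading frames.
six_frames <- function(dna) {
  seq_chars <- s2c(dna)                      # split the string into single characters
  frames <- list()
  for (sens in c("F", "R")) {                # forward strand, then reverse complement
    for (frame in 0:2) {                     # frame offsets 0, 1 and 2
      aa <- getTrans(seq_chars, frame = frame, sens = sens)
      frames[[paste0(sens, frame + 1)]] <- paste(aa, collapse = "")
    }
  }
  frames
}

# Toy sequence (not taken from the FASTA file in this thread)
res <- six_frames("ATGGCCAAATAG")
print(res)
```

The result is a named list with entries F1, F2, F3, R1, R2 and R3; incomplete trailing codons are simply not translated.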

7.9 years ago

Read a fasta file

library(seqinr) 
fast_file <- read.fasta("E:\\SAS\\Nuc\\t_GCA_000536675.1.fa", as.string = TRUE, seqtype = "DNA", forceDNAtolower = FALSE)

Sample header of the fasta file

>adpmb-supercont1.5 dna:supercontig supercontig:GCA_000536675.1:adpmb-supercont1.5:1:221212:1
CCTTGTTAATTTTTATTTAAAGTAAAATATCCCTTTAATATCTCCTCTTAAACTACTTTA
ACTACTATTATTTATTATACTATGGTTAATACATCACCGGATGGATGATTATGAACTGCG
ATGATTGCATTGGCATTTTCTCTCACCGCAATACTAAAAATTTCACGTGGATGTACAATC
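The function further down reads the strand from the last character of this header. A small base R sketch of that idea follows; it assumes the header ends in colon-separated start, end and strand fields, as the sample does, and the variable names are illustrative only.

```r
# Header line copied from the sample above (without the leading ">")
annot <- "adpmb-supercont1.5 dna:supercontig supercontig:GCA_000536675.1:adpmb-supercont1.5:1:221212:1"

# Split on ":" and read the trailing fields (assumed layout: ...:start:end:strand)
fields <- strsplit(annot, ":", fixed = TRUE)[[1]]
n <- length(fields)
contig_start  <- as.numeric(fields[n - 2])
contig_end    <- as.numeric(fields[n - 1])
contig_strand <- as.numeric(fields[n])

print(c(start = contig_start, end = contig_end, strand = contig_strand))
```

Splitting on the separator is a little more robust than taking the last character, for example if a header ever carries a strand of -1.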

Input for the function

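  • contig = Name of the sequence (contig) in the FASTA file
  • file = FASTA file already loaded with read.fasta (the list of sequences, not a file path)
  • from = Start position of the region within the contig (1-based)
  • to = End position of the region within the contig
  • strand = Expected strand of the contig; it is compared with the last character of the FASTA header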

Examples:

res <- cont2nuc(contig="adpmb-supercont1.5",file=fast_file,from=116271,to=122902,strand=1)

Function in R :

- Name : Convert DNA sequence to protein sequence (6 reading frames)

- R-code

cont2nuc <- function(contig = "adpmb-supercont1.5", file, from = 116271, to = 122902, strand = 1)
        {
        library(seqinr)

        # Pick the requested contig out of the list returned by read.fasta()
        fasta <- file[which(contig == names(file))]
        if (length(fasta) == 0) stop("contig not found in file: ", contig)

        # The second attribute of the entry is its annotation (header) line
        att <- attributes(fasta[[1]])
        att2 <- att[[2]]

        # Region of interest, and the same region reversed (not yet complemented)
        sequ <- substr(fasta[[1]][1], from, to)
        strReverse <- function(x) sapply(lapply(strsplit(x, NULL), rev), paste, collapse = "")
        sequ1 <- strReverse(sequ)

        # Strand of the contig = last character of the header line
        contig_strand <- as.numeric(substr(att2, nchar(att2), nchar(att2)))

        # Strand mismatch: return a flag instead of translating
        if (contig_strand != strand)
            {
            return("CHECK")
            }

        # Translate one frame (0, 1 or 2) on the forward ("F") or reverse ("R") strand
        tr <- function(frame, sens) paste(getTrans(s2c(sequ), frame = frame, sens = sens), collapse = "")
        # Complement a string; applied to the reversed region it gives the reverse complement
        co <- function(x) paste(comp(s2c(x), forceToLower = FALSE), collapse = "")

        # Each protein is followed by the DNA it was translated from
        res <- list(
            for_1 = tr(0, "F"), for_1_DNA = sequ,
            for_2 = tr(1, "F"), for_2_DNA = substr(sequ, 2, nchar(sequ)),
            for_3 = tr(2, "F"), for_3_DNA = substr(sequ, 3, nchar(sequ)),
            rev_1 = tr(0, "R"), rev_1_DNA = co(sequ1),
            rev_2 = tr(1, "R"), rev_2_DNA = co(substr(sequ1, 2, nchar(sequ1))),
            rev_3 = tr(2, "R"), rev_3_DNA = co(substr(sequ1, 3, nchar(sequ1)))
            )
        return(res)
        }
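The rev_*_DNA entries are the reverse complement of the region with the first 0, 1 or 2 bases dropped. A base R sketch of that step, independent of seqinr, is below; the helper names rev_comp and frame_dna and the toy sequence are hypothetical, and only unambiguous A/C/G/T bases are handled.

```r
# Reverse complement of a DNA string using base R only
rev_comp <- function(dna) {
  reversed <- paste(rev(strsplit(dna, NULL)[[1]]), collapse = "")
  chartr("ACGTacgt", "TGCAtgca", reversed)   # swap A<->T and C<->G, keeping case
}

# DNA that a given frame offset (0, 1 or 2) starts from
frame_dna <- function(dna, frame) substr(dna, frame + 1, nchar(dna))

rc <- rev_comp("ATGGCCAAATAG")               # toy sequence, not from the thread's file
rev_frames <- sapply(0:2, function(f) frame_dna(rc, f))
print(rev_frames)
```

Comparing these strings with the rev_1_DNA, rev_2_DNA and rev_3_DNA entries returned for the same input is a quick way to check that the reverse frames line up.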