Split FASTQ File Into Smaller FASTQ Files - Windows
5
11.0 years ago
deepbiofever ▴ 10

Hi All,

Is there any way I can split large FASTQ files into smaller FASTQ files with a defined number of reads in a Windows environment? I know there are multiple options for Unix, but I did not find anything for Windows.

Best, Deep

split fastq windows • 9.3k views
0

See also the duplicate thread on SEQanswers: http://seqanswers.com/forums/showthread.php?t=28989

3
11.0 years ago

Or use some freeware like GSplit. Remember to set the split size in lines, in multiples of 4 (FASTQ stores each read on 4 lines); e.g., 4,000,000 lines per piece gives 1 million reads per file.

0

The output comes out in GSD format.

1
11.0 years ago

Install Cygwin (http://www.cygwin.org) and use split; an example invocation is sketched below.
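A minimal sketch of the split invocation, assuming the GNU coreutils split shipped with Cygwin; the names reads.fastq and chunk_ are placeholders. For pieces of one million reads, split every 4,000,000 lines, since FASTQ stores each read on 4 lines:

# write pieces of 4,000,000 lines (1M reads) named chunk_aa, chunk_ab, ...
split -l 4000000 reads.fastq chunk_

Because 4,000,000 is a multiple of 4, each piece is itself a valid FASTQ file.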

0
11.0 years ago
seidel 11k

If you're familiar with R, you can use the ShortRead library to break the file up into smaller files. It's only a few lines of code. The example below takes a fastq file, breaks it up into sets of 1 million reads, writing the results to incrementally named smaller files:

library(ShortRead)

# set the file (.gz files also work)
yourFile <- "foo.fastq"
fileBaseName <- sub("\\.fastq$", "", yourFile)

# stream over the fastq file, 1 million reads at a time
f <- FastqStreamer(yourFile, 1000000)
file_index <- 0
while (length(fq <- yield(f))) {
  newName <- paste0(fileBaseName, "_", file_index, ".fastq")
  # compress=FALSE writes plain text; writeFastq gzips by default
  writeFastq(fq, file = newName, compress = FALSE)
  file_index <- file_index + 1
}
close(f)
0

This is good if the file will ultimately be processed in R, but for a series of large FASTQ files it won't be as efficient as split.

0
11.0 years ago

You should be able to do this with PowerShell if you don't want to install Cygwin. Use a line counter and a modulus operation to take the lines in groups of four, as Sukhdeep suggests; a sketch follows below.
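A minimal sketch of how that could look, assuming PowerShell's Get-Content with its -ReadCount parameter, which emits lines in fixed-size batches (the same effect as counting lines modulo the chunk size); reads.fastq and the chunk_N.fastq naming are placeholders:

$chunk = 0
# -ReadCount 4000000 sends the file down the pipeline in batches of
# 4,000,000 lines (1M reads); each batch is written to its own file
Get-Content reads.fastq -ReadCount 4000000 | ForEach-Object {
    $_ | Set-Content ("chunk_{0}.fastq" -f $chunk)
    $chunk++
}

Note that each 4,000,000-line batch is held in memory at once, so use a smaller -ReadCount if memory is tight.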
