Tutorial: how to combine multiple RNA-seq count files into a single data frame in R and Unix
9 days ago
Ming Tommy Tang ★ 3.9k

Hello all, I made two videos on this:

and

Happy Learning! Tommy

Unix RNAseq R • 556 views

Thank you, csvtk spread is super useful. I usually import the files into R with lapply and then combine them with Reduce and merge, but this might be easier.
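
For anyone without csvtk installed, here is a rough coreutils-only sketch of the same idea. It is not the csvtk approach from the videos. It assumes GNU join (for the --header option), tab-separated files that share a first column called name, and data rows already sorted by that column. The scratch directory and file names are made up for illustration.

```shell
set -eu

# Hypothetical scratch directory and file names, for illustration only.
tmp=$(mktemp -d)
printf 'name\tsample_1\nA\t1\nB\t2\nC\t3\n' > "$tmp/file_1.tsv"
printf 'name\tsample_2\nA\t4\nB\t5\nC\t6\n' > "$tmp/file_2.tsv"
printf 'name\tsample_3\nA\t7\nB\t8\nC\t9\n' > "$tmp/file_3.tsv"

tab=$(printf '\t')

# Start from the first file, then join each remaining file onto the running result.
cp "$tmp/file_1.tsv" "$tmp/combined.tsv"
for f in "$tmp/file_2.tsv" "$tmp/file_3.tsv"; do
  # --header (GNU join) treats line 1 of each input as a header row;
  # the data rows must already be sorted by the key column.
  join --header -t "$tab" "$tmp/combined.tsv" "$f" > "$tmp/next.tsv"
  mv "$tmp/next.tsv" "$tmp/combined.tsv"
done

cat "$tmp/combined.tsv"
```

By default join keeps only the keys present in both inputs, so a gene missing from one file is dropped. The -a option can be used to keep unpaired lines if that matters for your data.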


Hi,

Just thought I'd share two methods for when each file contains multiple samples. Thanks to Ram for providing the lapply solution.

# load libraries
library(readr)
library(dplyr)
library(purrr) # for reduce()

# create sample data
file_1 <- tibble(name = c("A", "B", "C"),
             sample_1 = c(1, 2, 3),
             sample_2 = c(4, 5, 6),
             sample_3 = c(7, 8, 9))

file_2 <- tibble(name = c("A", "B", "C"),
             sample_4 = c(10, 20, 30),
             sample_5 = c(40, 50, 60),
             sample_6 = c(70, 80, 90))

file_3 <- tibble(name = c("A", "B", "C"),
             sample_7 = c(100, 200, 300),
             sample_8 = c(400, 500, 600),
             sample_9 = c(700, 800, 900))

directory <- "path/to/files" # this directory must already exist

write_tsv(file_1, file.path(directory, "file_1.tsv"))
write_tsv(file_2, file.path(directory, "file_2.tsv"))
write_tsv(file_3, file.path(directory, "file_3.tsv"))

# List the files to combine (pattern is a regular expression, not a glob)
files <- list.files(path = directory, pattern = "\\.tsv$", full.names = TRUE)
files

Method 1: For Loop

# Import first file
data <- read_tsv(files[1])
data

# Join remaining files; files itself is left unchanged so it can be reused below
for (x in files[-1]) {
  y <- read_tsv(x)
  data <- full_join(data, y, by = join_by(name))
}
data

Method 2: lapply

data <- lapply(files, read_tsv) # change to read_csv as needed
data <- data |> reduce(full_join, by = "name")
data

Simple:

data <- lapply(files, read_tsv)
data <- Reduce(merge, data) # merge() keeps only rows found in both inputs by default; or use Reduce(full_join, data) to keep all rows


Thanks Ram. Using merge maxed out my RAM on a large list, but your suggestion pointed me to purrr::reduce, which works great.


I need assistance. I don't know what to do next after signing up.
