Question

Clustering with Trinity suite changes order of sample names

7.0 years ago

ando.kelli ▴ 60

Hi all,

I've done some clustering using Trinity's define_clusters_by_cutting_tree.pl script, and am having a bit of an issue.

The data I'm analysing is from a time series: D1, D2, D3, D4, D6, D8, D10, D12, D14

When I first went through the Trinity pipeline, the sample order was mixed up in the matrices it produced, so I went back and redid it with zero-padded names: D01, D02, D03, D04, D06, D08, D10, D12, D14
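A minimal sketch of why the renaming helps, assuming the original mix-up came from the sample names being sorted as plain strings (the post does not confirm this, but the zero-padding fix is consistent with it). The helper name zero_pad is hypothetical and not part of Trinity:

```python
import re

# Sample names from the time series as originally labelled.
days = ["D1", "D2", "D3", "D4", "D6", "D8", "D10", "D12", "D14"]


def zero_pad(name, width=2):
    """Pad the numeric part of a sample name, e.g. D1 -> D01."""
    prefix, number = re.fullmatch(r"([A-Za-z]+)(\d+)", name).groups()
    return f"{prefix}{int(number):0{width}d}"


# Plain string sorting compares character by character,
# so D10, D12 and D14 land directly after D1.
lexicographic = sorted(days)

# After zero-padding, string order and chronological order agree.
padded = sorted(zero_pad(d) for d in days)

print(lexicographic)
print(padded)
```

With the unpadded names, string sorting puts D10, D12 and D14 between D1 and D2; with the padded names the sorted order matches the time course.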

Now the sample order is correct in the matrices produced. However, the samples are still shuffled when I run the cluster analysis on the corrected matrix.

Any way I can do the cluster analysis and have the samples in the correct order? It really doesn't make sense to have day 2 next to day 10 for example... It produces patterns that aren't intuitive....

Cheers,

Kelli

Trinity Clustering rna-seq DE • 1.4k views

score 0 · Answer 1 · 2017-04-17

FYI

There's a parameter, --no_column_reordering, in recent versions of that script that keeps the samples ordered according to how you have them listed in your samples file. As I've found, this flag also works if you don't use a samples file: in that case it keeps the samples in the order they appear in the matrix.
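Since the flag preserves whatever column order the matrix already has, it may help to put the columns in the desired order first. Below is a rough sketch, assuming a tab-delimited matrix whose first column holds the feature IDs and whose header row names the samples. The function reorder_columns and the toy matrix are hypothetical and not part of Trinity:

```python
def reorder_columns(matrix_text, desired_order):
    """Return the tab-delimited matrix with its sample columns rearranged.

    Assumes column 0 holds feature IDs and row 0 holds sample names.
    """
    lines = matrix_text.rstrip("\n").split("\n")
    header = lines[0].split("\t")

    # Map each sample name to its column index (column 0 is the ID column).
    index = {name: i for i, name in enumerate(header) if i > 0}
    missing = [s for s in desired_order if s not in index]
    if missing:
        raise ValueError(f"samples not found in matrix header: {missing}")

    # Always keep the ID column first, then the samples in the requested order.
    picks = [0] + [index[s] for s in desired_order]

    reordered = []
    for line in lines:
        fields = line.split("\t")
        reordered.append("\t".join(fields[i] for i in picks))
    return "\n".join(reordered) + "\n"


# Toy matrix with the columns out of chronological order.
toy_matrix = "\tD10\tD01\tD02\ngeneA\t5\t1\t2\ngeneB\t50\t10\t20\n"
result = reorder_columns(toy_matrix, ["D01", "D02", "D10"])
print(result)
```

This only rearranges the input; whether the clustering output keeps that order still depends on running the script with --no_column_reordering.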

There is also --order_columns_by_samples_file, which orders the columns according to the samples file instead of clustering samples or replicates hierarchically based on gene expression patterns. Note that you need to use it beforehand, with the analyze_diff_expr.pl script.

K