Question

Difference between tSNE and PCA analysis

9

6.2 years ago

Qingyang Xiao ▴ 160

Hello! As clustering methods, what's the main difference between tSNE and PCA analysis?

next-gen rna-seq • 24k views

ADD COMMENT • link updated 8 weeks ago by Jeremy Leipzig 22k • written 6.2 years ago by Qingyang Xiao ▴ 160

1

Other than the math?

ADD REPLY • link 6.2 years ago by Devon Ryan 104k

1

Yes, other than math, mainly for biological applications

ADD REPLY • link 6.2 years ago by Qingyang Xiao ▴ 160

1

Both are used for dimensionality reduction; which one is appropriate depends on your variables and on the kind of inference you are interested in. PCA has so far been a favorite tool for RNA-Seq, ChIP-Seq and also WES data, but with the arrival of scRNA-Seq and of large-scale SNP data for population-genetic inference, t-SNE is coming in handy as well. The links I have already given in my answer below should suffice. I will also post one more here on human genetic data. Your question is too broad, so you probably need to do some background study. Beyond that, it all depends on the data you will be using, which determines which dimensionality-reduction methods come into consideration.

ADD REPLY • link 6.2 years ago by ivivek_ngs ★ 5.2k

2

Nonsense - the question isn't too broad. If someone just asks for the "main difference" you should be able to explain it in a sentence or two instead of bombarding them with links.

ADD REPLY • link 6.2 years ago by Jeremy Leipzig 22k

0

I guess you have to check what the OP wrote in the comments: they asked about biological applications.

ADD REPLY • link 6.2 years ago by ivivek_ngs ★ 5.2k

0

Many thanks for these fascinating answers!

ADD REPLY • link 6.2 years ago by Qingyang Xiao ▴ 160

score 9 · Answer 1 · 2018-01-25

9

6.2 years ago

Jeremy Leipzig 22k

The main difference between t-SNE (or other manifold learning methods) and PCA is that t-SNE tries to deconvolute relationships between neighbors in high-dimensional data.

A classic example is the "swiss roll". To put the difference in layman's terms: t-SNE attempts to understand the underlying structure of the swiss roll. It does this by prioritizing neighboring points. PCA doesn't get what's going on - it doesn't see that the points are actually a line that's been rolled up.

Original data:

[image]

This PCA sucks (it thinks yellow is close to blue when in fact they are far away):

In contrast, see how t-SNE seems to understand what's going on with this 'S'?

[image]
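A minimal sketch of this comparison, not the code behind the original images: it assumes scikit-learn is installed and uses its `make_swiss_roll` generator. The helper `neighbor_overlap` is a hypothetical name introduced here for illustration. It measures how many of each point's nearest neighbors in the original 3-D data are still its nearest neighbors in the 2-D embedding, which is the property t-SNE prioritizes. The sample size, perplexity and `k` are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# X: 3-D points on a rolled-up sheet; t: position of each point along the roll
X, t = make_swiss_roll(n_samples=300, noise=0.05, random_state=0)

# Linear projection onto the two directions of largest variance
pca_2d = PCA(n_components=2).fit_transform(X)

# Non-linear embedding that tries to keep neighboring points together
tsne_2d = TSNE(n_components=2, perplexity=30, init="pca",
               random_state=0).fit_transform(X)


def neighbor_overlap(high, low, k=10):
    """Mean fraction of each point's k nearest neighbors in `high`
    that are also among its k nearest neighbors in `low`."""
    def knn(a):
        # Pairwise Euclidean distances; a point is never its own neighbor
        d = np.linalg.norm(a[:, None, :] - a[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        return np.argsort(d, axis=1)[:, :k]

    nh, nl = knn(high), knn(low)
    return float(np.mean([len(set(h) & set(l)) / k for h, l in zip(nh, nl)]))


print("PCA   neighbor overlap:", neighbor_overlap(X, pca_2d))
print("t-SNE neighbor overlap:", neighbor_overlap(X, tsne_2d))
```

Coloring a scatter plot of `pca_2d` and `tsne_2d` by `t` gives pictures of the kind described above. The exact numbers depend on the random seed and on the t-SNE perplexity.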

ADD COMMENT • link 8 weeks ago by Jeremy Leipzig 22k

0

Jeremy Leipzig Are you able to re-upload the t-SNE picture?

Looks like the link has been changed and hence, the image is missing.

ADD REPLY • link 15 months ago by Yogi ▴ 70

1

Fixed this, thanks.

ADD REPLY • link 8 weeks ago by Jeremy Leipzig 22k

score 3 · Answer 2 · 2018-01-25

3

6.2 years ago

ivivek_ngs ★ 5.2k

I can suggest some links that will give you a flavor of both methods, which are used for dimensionality reduction.

1. Link1
2. Link2
3. For scRNA-Seq, check here
4. For bulk RNA-Seq, check here
5. If you are a fan of Kaggle, this link is also pretty fun for understanding usage.

ADD COMMENT • link 6.2 years ago by ivivek_ngs ★ 5.2k

1

Thanks- I think in your point 5 you forgot to actually put the hyperlink.

ADD REPLY • link 6.2 years ago by dariober 14k

0

Updated. Thanks for pointing it out.

ADD REPLY • link 6.2 years ago by ivivek_ngs ★ 5.2k

score 3 · Answer 3 · 2018-01-25

3

6.2 years ago

dariober 14k

Just a couple of comments... Neither tSNE nor PCA is a clustering method, even if in practice you can use them to see if/how your data form clusters. In many implementations, tSNE works downstream of PCA: the first n principal components are computed first, and these n dimensions are then mapped to a 2D space. The original paper on tSNE is relatively accessible and, if I remember correctly, it has some discussion of PCA vs tSNE. Also, this post on tSNE is quite good, although not really about tSNE vs PCA.
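The two-step workflow described above (principal components first, then t-SNE on those components) can be sketched as follows. This is only an illustration under stated assumptions: it uses scikit-learn, and the matrix `X` is simulated random data standing in for a samples-by-features expression matrix, not real RNA-seq data. The group structure, the 50 components and the perplexity are arbitrary choices made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Hypothetical stand-in for an expression matrix: 300 samples x 1000 features,
# with three groups of 100 samples that differ on the first 20 features only
groups = np.repeat([0, 1, 2], 100)
X = rng.normal(size=(300, 1000))
X[:, :20] += 3.0 * groups[:, None]

# Step 1: reduce to the first n principal components (n = 50 here)
pcs = PCA(n_components=50, random_state=0).fit_transform(X)

# Step 2: map those n dimensions to a 2-D space with t-SNE
emb = TSNE(n_components=2, perplexity=30, init="pca",
           random_state=0).fit_transform(pcs)

print(pcs.shape, emb.shape)
```

Neither step assigns samples to clusters. `emb` is just a set of 2-D coordinates for plotting, and a separate clustering algorithm would be needed to produce cluster labels.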