My aim is to evaluate the TCR status of cancer patients. Here is what I have gathered after doing some research:
Going by the formulas, the Shannon index (α=1) weights each TCR subclone in proportion to its relative frequency, whereas the Simpson index (α=2) puts more weight on dominant subclones.
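To make the comparison concrete, here is a minimal Python sketch. It assumes α refers to the order of the Rényi entropy, where α=1 gives the Shannon entropy and α=2 gives the negative log of the Simpson concentration (sum of squared frequencies). The function name `renyi_entropy` and the read counts are made up for illustration.

```python
import math

def renyi_entropy(counts, alpha):
    """Rényi entropy of order alpha for a list of clone read counts."""
    total = sum(counts)
    freqs = [c / total for c in counts if c > 0]
    if math.isclose(alpha, 1.0):
        # The limit alpha -> 1 is the Shannon entropy.
        return -sum(p * math.log(p) for p in freqs)
    # General case: log(sum p^alpha) / (1 - alpha); alpha = 2 is Simpson-based.
    return math.log(sum(p ** alpha for p in freqs)) / (1.0 - alpha)

# Hypothetical repertoire: one expanded clone plus 100 singletons.
counts = [900] + [1] * 100
shannon = renyi_entropy(counts, 1)
simpson_order = renyi_entropy(counts, 2)
print(f"alpha=1: {shannon:.3f}  alpha=2: {simpson_order:.3f}")
```

With these made-up counts, the α=2 value is driven almost entirely by the expanded clone, while the singletons still contribute noticeably at α=1.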
I initially leaned toward Shannon, as I couldn't think of any reason not to treat each subclone proportionally. However, I now think the Simpson index may have its own merits: 1: Very rare TCR sequences may be false positives caused by technical errors during TCR-Seq (different TCR subclones may differ by only one nucleotide). 2: Dominant subclones may deserve more attention, since they are the ones most likely to have been activated by a neoantigen.
Adding to the points above, TCR subclones that differ by only one nucleotide may target the same neoantigen, so they should not be treated separately (I guess this is the concept of functional diversity).
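One way to try this idea out is to merge sequences that differ by a single nucleotide before computing any diversity index. The sketch below is only an illustration under strong assumptions: equal-length sequences, substitutions only, and single-linkage grouping. The function names and the toy sequences are hypothetical, and sequence similarity alone does not prove shared antigen specificity.

```python
from collections import defaultdict
from itertools import combinations

def hamming_one(a, b):
    """True if a and b have equal length and differ at exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

def merge_near_identical(clone_counts):
    """Group sequences linked by single-nucleotide differences (union-find).

    Returns summed counts per group, keyed by the most abundant member.
    Single linkage means chained neighbours end up in the same group.
    """
    parent = {s: s for s in clone_counts}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    # Pairwise comparison is quadratic; fine for a sketch, slow for real data.
    for a, b in combinations(clone_counts, 2):
        if hamming_one(a, b):
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for s in clone_counts:
        groups[find(s)].append(s)

    merged = {}
    for members in groups.values():
        rep = max(members, key=lambda s: clone_counts[s])
        merged[rep] = sum(clone_counts[s] for s in members)
    return merged

# Toy data: two low-count variants sit one mismatch away from a big clone.
clones = {"TGTGCCAGC": 500, "TGTGCCAGT": 3, "TGTGCAAGC": 2, "AAACCCGGG": 40}
merged = merge_near_identical(clones)
print(merged)
```

The merged counts can then be fed into whichever diversity index is chosen, which makes it easy to see how much the result depends on the rare near-duplicates.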
I'm not sure whether my reasoning makes sense. Could someone share some comments? Thanks in advance.
What are you trying to achieve when you say "evaluate TCR status"? If you're worried about technical errors, try running technical replicates and see what the error rates look like.
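As a rough sketch of that replicate check, assuming each replicate has been summarised as a dict of sequence to read count, something like the following could be used. The function name, the `min_count` threshold, and the data are hypothetical. Note that a clonotype missing from the second replicate may reflect undersampling rather than a sequencing error, so this gives a concordance figure, not a true error rate.

```python
def replicate_concordance(rep_a, rep_b, min_count=1):
    """Fraction of clonotypes with >= min_count reads in rep_a also seen in rep_b."""
    selected = [s for s, c in rep_a.items() if c >= min_count]
    if not selected:
        return float("nan")
    return sum(s in rep_b for s in selected) / len(selected)

# Made-up replicates of the same sample; keys stand in for TCR sequences.
rep_a = {"A": 100, "B": 50, "C": 1, "D": 1}
rep_b = {"A": 90, "B": 60, "C": 2}

all_clones = replicate_concordance(rep_a, rep_b)
abundant = replicate_concordance(rep_a, rep_b, min_count=10)
print(all_clones, abundant)
```

Comparing the concordance at different `min_count` values shows whether the disagreement between replicates is concentrated among the rare sequences.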