Question

TCR sequence analysis


8.8 years ago

binfeng ▴ 60

Hi,

I'm doing single-cell TCR sequencing using Mark Davis' protocol. Here is the link to the Davis paper: http://www.nature.com/nbt/journal/v32/n7/abs/nbt.2938.html

The paper used vdjfasta for TCR analysis, but vdjfasta was set up for antibody sequence analysis. When I use it on TCR sequences, I don't get exactly what is described in the paper. Can anyone tell me how to adapt vdjfasta for TCR analysis (probably by changing the database or HMM file?) or point me to another tool?

Thank you so much in advance!

Bin

sequence • 6.9k views

ADD COMMENT • link updated 14 months ago by mizraelson ▴ 60 • written 8.8 years ago by binfeng ▴ 60

Ram · Answer 1 · 2015-06-26


8.8 years ago

mikhail.shugay 3.5k

Hello!

I can recommend one of our tools, MIGEC/CdrBlast or MITCR. The latter is slightly less accurate but much faster.

Here is the full list of immune repertoire sequencing analysis tools: http://omictools.com/rep-seq-c424-p1.html

ADD COMMENT • link updated 16 months ago by Ram 43k • written 8.8 years ago by mikhail.shugay 3.5k


Thank you for your recommendation! I was searching online and just downloaded your Nature Methods paper on MIGEC :) I'll test both.

ADD REPLY • link 8.8 years ago by binfeng ▴ 60


Never mind. Found it

ADD REPLY • link 8.8 years ago by binfeng ▴ 60


Could you add some test input files for MIGEC?

Thank you!

Bin

ADD REPLY • link 8.8 years ago by binfeng ▴ 60


A real-world example would be quite large, as it needs to cover each molecular tag (roughly one cDNA molecule; see the original paper) 10-20 times for each of several clonotypes. You can still use the full dataset, see http://www.ncbi.nlm.nih.gov/sra?term=PRJNA239303 .

I've just uploaded a single TCR beta sample from http://www.jimmunol.org/content/early/2015/05/08/jimmunol.1500215 to S3. Feel free to download the read1 and read2 files; they contain the unique molecular tag sequence in the header in case you need it.

ADD REPLY • link updated 16 months ago by Ram 43k • written 8.8 years ago by mikhail.shugay 3.5k


Thank you so much! I'll test it tomorrow morning.

Bin

ADD REPLY • link 8.8 years ago by binfeng ▴ 60


I just downloaded read1 as a test input file. It worked! But I still have a couple of questions. Here is the command I used:

java -jar migec-1.2.1a.jar CdrBlast -R TRB S2-1-beta_R1.fastq.gz r1.out.txt

My questions are:

1. The output file (r1.out.txt) contains all the information I need except identifiers for each TCR. In other words, there is no UMI information in the output file. Did I use your tool correctly?
2. Should I use Checkout, and if so, how do I use it?

Thank you so much!

Bin

ADD REPLY • link updated 16 months ago by Ram 43k • written 8.8 years ago by binfeng ▴ 60


Dear Bin,

Sorry for the late reply.

You do not need to use Checkout: the files are already split by sample barcode, and the UMI info is already in the read header.

You need to run Assemble to assemble consensuses and then CdrBlast. It will process the UMI info from header.
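To make the idea of per-UMI consensus assembly concrete, here is a minimal Python sketch. It is not MIGEC's actual algorithm, only an illustration: it assumes the UMI is stored in the read header as `UMI:<sequence>:<quality>`, that all reads sharing a UMI have the same length, and the function names and toy reads are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumed header layout: "@<read id> UMI:<sequence>:<quality>"
UMI_PATTERN = re.compile(r"UMI:([ACGTN]+):")


def group_reads_by_umi(fastq_lines):
    """Group read sequences by the UMI found in each FASTQ header."""
    groups = defaultdict(list)
    # A FASTQ record is four lines: header, sequence, '+', quality.
    for i in range(0, len(fastq_lines) - 3, 4):
        header, seq = fastq_lines[i], fastq_lines[i + 1]
        match = UMI_PATTERN.search(header)
        if match:
            groups[match.group(1)].append(seq.strip())
    return groups


def consensus(reads):
    """Majority vote per column (simplification: reads are equal length)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))


def assemble(fastq_lines, min_reads=2):
    """Return one consensus per UMI covered by at least min_reads reads."""
    groups = group_reads_by_umi(fastq_lines)
    return {
        umi: consensus(reads)
        for umi, reads in groups.items()
        if len(reads) >= min_reads
    }


# Toy input: three reads for UMI AAAC (one with an error), one for GGTT.
EXAMPLE = [
    "@r1 UMI:AAAC:IIII", "TGTGCCAGC", "+", "IIIIIIIII",
    "@r2 UMI:AAAC:IIII", "TGTGCCAGC", "+", "IIIIIIIII",
    "@r3 UMI:AAAC:IIII", "TGTGACAGC", "+", "IIIIIIIII",
    "@r4 UMI:GGTT:IIII", "TGTGCCTGG", "+", "IIIIIIIII",
]

if __name__ == "__main__":
    print(assemble(EXAMPLE))
```

In this toy input the single error in the third read is outvoted, and the UMI seen only once is dropped, which is why each tag needs to be covered several times.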

ADD REPLY • link 8.8 years ago by mikhail.shugay 3.5k


Could you upload the barcode.txt file for read1/read2? It would be convenient for users to test Checkout etc.

Thank you so much in advance!

Bin

ADD REPLY • link updated 16 months ago by Ram 43k • written 8.8 years ago by binfeng ▴ 60


You don't need those barcodes: the files are already de-multiplexed, and the adapter sequences containing the sample barcode have been cropped. Still, if you download the raw files from SRA, you can use them. Here are the barcodes for alpha and beta chains:

ADD REPLY • link 8.8 years ago by mikhail.shugay 3.5k


Thank you!

ADD REPLY • link 8.7 years ago by binfeng ▴ 60

Ram · Answer 2 · 2015-06-27


8.8 years ago

Charles Plessy ★ 2.9k

I would be very interested in your feedback on the tool I wrote, clonotypeR. Note, however, that for the moment it lacks a data file for human TCR analysis (draft instructions are available to make one).

ADD COMMENT • link 8.8 years ago by Charles Plessy ★ 2.9k


Thank you! I'll take a look.

ADD REPLY • link 8.8 years ago by binfeng ▴ 60


I'd like to test clonotypeR for human TCR analysis, but your "draft instructions" link points to SeaView. Could you provide instructions on how to create the data file for human? Or just provide one :)

Bin

ADD REPLY • link 8.8 years ago by binfeng ▴ 60


Indeed, that was not the best link to start with; sorry. Have a look at http://clonotyper.branchable.com/references/README/. It briefly summarises how I prepared the reference alignments for mouse. Basically, you need to align the DNA sequence of the V segments on the conserved cysteine and the DNA sequence of the J segments on the FGxG motif. I could only do it by hand (for instance, for some V segments the conserved cysteine is not the last one), and I found that SeaView was the best tool for this.
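As a rough illustration of the J-segment anchor only, the Python sketch below translates a DNA sequence in all three forward frames and reports where an FGxG motif starts. It is not part of clonotypeR, the function names and the toy J-like sequence are made up for the example, and it does not replace the manual curation described above (it says nothing about the V-segment cysteine).

```python
import re

# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

FGXG = re.compile(r"FG.G")


def translate(dna, frame):
    """Translate dna starting at offset frame (0, 1 or 2)."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def find_fgxg(dna):
    """Return (frame, nucleotide offset of the F codon), or None."""
    dna = dna.upper()
    for frame in range(3):
        match = FGXG.search(translate(dna, frame))
        if match:
            return frame, frame + 3 * match.start()
    return None


# Toy J-like sequence (illustrative); frame 2 reads NTEAFFGQGTRLTVV.
TOY_J = "TGAACACTGAAGCTTTCTTTGGACAAGGCACCAGACTCACAGTTGTAG"

if __name__ == "__main__":
    print(find_fgxg(TOY_J))
```

The returned offset could serve as the column on which to line up J segments, but any hit should still be checked by eye in an alignment viewer.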

Also, if you contact me by email, I can send you the draft paper that I wrote to describe clonotypeR in more detail.

ADD REPLY • link updated 16 months ago by Ram 43k • written 8.8 years ago by Charles Plessy ★ 2.9k


My email address is bin@enumeral.com

I would really appreciate it if you could send me your manuscript.

Bin

ADD REPLY • link updated 16 months ago by Ram 43k • written 8.8 years ago by binfeng ▴ 60


Hi Bin,

I am a bioinformatician in the cancer immunotherapy lab at Duke University and have just started working on a single-cell TCR sequencing project. I was also trying to follow the Mark Davis paper that you referenced. It seems they modified vdjfasta for TCR, but I am not sure whether the modified version is available. It would be very helpful if you could share your experience and tell me which tools you used for your analysis, along with any related details. Also, do you know whether the raw data from the Mark Davis paper is available? I have written to him to find out but have not heard back yet.

Thanks,
- Pankaj

ADD REPLY • link updated 16 months ago by Ram 43k • written 7.5 years ago by bioinformatics.cancer ▴ 260


Hi, I would recommend using MiXCR for the repertoire data, as it has a specific command designed to work with the library structure from this paper:

mixcr analyze han-et-al-2014-bcr \
  --species hsa \
  input_R1.fastq.gz \
  input_R2.fastq.gz \
  result

Using that command you can easily get the results for repertoire data analysis.

You can read more about this here:

https://docs.milaboratories.com/mixcr/reference/overview-built-in-presets/#han-et-al-2014

Also, feel free to contact us on GitHub if you have any questions or suggestions: https://github.com/milaboratory/mixcr

ADD REPLY • link 14 months ago by mizraelson ▴ 60