Getting a more informative annotation file
13 months ago
F ♦ 3.4k
Iran

Hello there,

I downloaded a CSV annotation file from Affymetrix, but when I used it with my citrus arrays the resulting network contains only gene symbols: no gene names, no UniProt IDs, nothing else. I also downloaded a text annotation file from PLEXdb, but my workbench rejects the format. Do you know of a more informative CSV annotation file, or of a way to reformat my text file to CSV?


Doesn't the Affymetrix file contain gene information? See https://www.biostars.org/p/151108/#151118


Thank you. Yes, it does, but I don't know how to combine the gene names with the annotation from Affymetrix.


Thank you, Alolex. I have a text annotation file that my workbench can't read, because it accepts only CSV.

Here is part of the text annotation file I downloaded from PLEXdb:

Annotation for selected probe sets. downloaded from PLEXdb on Jul 12 2015

Probeset Annotation_Date Consensus_ID GeneBank_Accession Blast_Date Blast_Program Ref_Desc E-value Perc_Identity
-- -- -- -- -- -- -- -- --
-- -- -- -- -- -- -- -- --
-- -- -- -- -- -- -- -- --
Cit.10074.1.S1_s_at Mar 11, 2009 CV884880 . 2010-12-10 blastx | Symbols: SIR | sulfite reductase | chr5:1319404-1322298 FORWARD LENGTH=643 7e-17 66.1
Cit.10074.1.S1_s_at Mar 11, 2009 CV884880 . 2009-09-13 blastn 0 99.9
Cit.10074.1.S1_s_at Mar 11, 2009 CV884880 . 2008-09-02 blastx 2e-51 85.2
Cit.10074.1.S1_s_at Mar 11, 2009 CV884880 . 2010-12-10 blastx PREDICTED: hypothetical protein [Vitis vinifera] 2e-51 85.2
Cit.10074.1.S1_s_at Mar 11, 2009 CV884880 . 2007-11-07 blastn Sulfite reductase [Prunus armeniaca (Apricot)] 0 100
Cit.10074.1.S1_s_at Mar 11, 2009 CV884880 . 2012-01-23 blastx Sulfite reductase [ferredoxin] n=2 Tax=Synechocystis sp. PCC 6803 RepID=SIR_SYNY3 4e-08 46.9
Cit.10076.1.S1_s_at Mar 11, 2009 CV709816 . 2010-12-10 blastx | Symbols: SIR | sulfite reductase | chr5:1319404-1322298 FORWARD LENGTH=643 3e-27 70.1
Cit.10076.1.S1_s_at Mar 11, 2009 CV709816 . 2009-09-13 blastn similar to UniRef100_A7NZP8 Cluster: Chromosome chr6 scaffold_3, whole genome shotgun sequence; n=1; Vitis vinifera|Rep: Chromosome chr6 scaffold_3, whole genome shotgun sequence - Vitis vinifera (Grape), partial (15%) 0 97.1
Cit.10076.1.S1_s_at Mar 11, 2009 CV709816 . 2008-09-02 blastx 2e-60 85.4
Cit.10076.1.S1_s_at Mar 11, 2009 CV709816 . 2010-12-10 blastx PREDICTED: hypothetical protein [Vitis vinifera] 2e-60 85.4
Cit.10076.1.S1_s_at Mar 11, 2009 CV709816 . 2007-11-07 blastn Sulfite reductase [Prunus armeniaca (Apricot)] 0 97.6
Cit.10076.1.S1_s_at Mar 11, 2009 CV709816 . 2012-01-23 blastx Sulfite reductase [ferredoxin] n=2 Tax=Synechocystis sp. PCC 6803 RepID=SIR_SYNY3 1e-16 52.5
Cit.10084.1.S1_at Mar 11, 2009 CF838391 . 2010-12-10 blastx | Symbols: ATRAB5A, ATRABF2A, RABF2A, RAB5A, RHA1, ATRAB-F2A, RAB-F2A | RAB homolog 1 | chr5:18244495-18246060 FORWARD LENGTH=201 1e-98 89.5
Cit.10084.1.S1_at Mar 11, 2009 CF838391 . 2009-09-13 blastn homologue to UniRef100_Q40570 Cluster: Ras-related GTP-binding protein; n=1; Nicotiana tabacum|Rep:

I am performing network analysis. When I upload the normalized arrays, I also need to upload an annotation file. I downloaded such a file (CSV) from Affymetrix, but after the network is created the nodes have no UniProt ID, RefSeq ID, gene name, or anything else; they are labelled only with the gene symbol corresponding to each probe set. I then have to run GO annotation on the symbols, and all the results are non-significant. I need a more informative annotation file. I found one, but it is in a text format that my workbench does not accept.
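
The PLEXdb sample above lists several BLAST hits per probe set, whereas an annotation CSV normally has one row per probe set. Here is a minimal Python sketch of one way to collapse it. It assumes the downloaded file is tab-delimited and uses the column names shown in the sample (Probeset, Ref_Desc, E-value). The function names best_hits and write_csv, the default output columns, and the "lowest E-value wins" rule are illustrative choices, not something PLEXdb or geWorkbench prescribes, and whether geWorkbench accepts the resulting layout is not confirmed in this thread.

```python
import csv

def best_hits(lines, delimiter="\t"):
    """Return {probeset: row dict}, keeping the described hit with the lowest E-value.

    Assumes a header row whose first field is "Probeset" and that also has
    "Ref_Desc" and "E-value" columns, as in the sample above.
    """
    rows = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    header = None
    best = {}
    for row in rows:
        if header is None:
            # Skip the title line and blank lines until the header row appears.
            if row and row[0] == "Probeset":
                header = row
            continue
        # Skip blank or malformed rows and the "--" separator rows.
        if len(row) != len(header) or set(row) <= {"--"}:
            continue
        rec = dict(zip(header, row))
        if not rec["Ref_Desc"].strip():
            continue  # a hit without a description adds no annotation
        try:
            evalue = float(rec["E-value"])
        except ValueError:
            continue
        key = rec["Probeset"]
        # Ties keep the first hit encountered.
        if key not in best or evalue < float(best[key]["E-value"]):
            best[key] = rec
    return best

def write_csv(best, dst_path,
              columns=("Probeset", "GeneBank_Accession", "Ref_Desc", "E-value")):
    """Write one CSV row per probe set; columns missing from the input stay empty."""
    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(best.values())
```

A typical call would open the downloaded text file, pass the file object to best_hits, and hand the result to write_csv. The csv module quotes any description that contains commas, so the output columns stay aligned.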


Out of the columns shown, do you need all of them or just a select few? If a few, which ones? Also, do you have a linux/unix/mac or are you working on Windows? Finally, what program are you loading this file into? You have said workbench a few times, but I'm not clear on what program that is. The answer to these questions will help me figure out a solution that might work for you. Oh, and one more question--does the program you are using provide a sample input file? If yes can you post a few lines of that so I can see what the end result should be?


Thanks for your attention, Alolex.

I am on Windows, working with geWorkbench and running ARACNe. I normalized GSE63706 as input, and the tool then asked for a CSV annotation file, which I downloaded from Affymetrix. I used some probe sets as hubs, and the tool created a network whose nodes are shown only by gene symbol, for example LOC102577933. With those I can't perform promoter analysis or anything else, so I need to convert them. In addition, I don't think GO analysis based on probe sets as nodes can be trusted, because a probe set can map to more than one gene. I then downloaded a text annotation file from PLEXdb (the sample above shows only some of its columns), which I think contains the essential information I need, such as gene name, Entrez ID, Consensus_ID, GeneBank_Accession and so on, but the workbench accepts only CSV. I described some further doubts in https://www.biostars.org/t/myposts/


I looked up the geWorkbench tool you are using, as I am not familiar with it and have not used it. From the documentation here ( http://wiki.c2b2.columbia.edu/workbench/index.php/ARACNe#Setting_up_an_ARACNe_run ) it seems the program is doing what it is supposed to do. It looks like it requires the Affy CSV file if you are using an Affy array (follow the directions in the "Example of running ARACNe" section). The documentation also says the following about how it uses the annotation file with Affy arrays. From your explanation I am guessing that you are selecting the "Merge multiple probesets" option. In that case you will get only one node per gene. If you need the probe sets kept intact, you need to unselect this option (see the last part below). I think displaying just gene symbols is what the program is designed to do. If you need it to provide more information and can't find what you need in the help documents, I would suggest posting this question on the geWorkbench end-user forum ( http://wiki.c2b2.columbia.edu/workbench/index.php/Community ), as I'm not familiar with the application. Hopefully this helps you somewhat.

"Load the microarray dataset into the Workspace. If available, associate a gene annotation file with the dataset. This will allow the results to be displayed in consolidated fashion in Cytoscape by gene rather than by marker (individual probeset) name.

......

"Merge multiple probesets

Checking this box will cause interactions to be summarized at the gene level for each hub marker. The links to individual probesets will not be retained. Thus when this option is selected, the adjacency matrix will contain a single line per hub gene. This option depends on an annotation file being loaded along with the microarray dataset.

....

On a microarray analysis platform, genes may be represented by more than one marker (probeset). The mapping between markers and genes is specified in the annotation file, if it is read in at the time that the data is loaded. The ARACNe analysis in geWorkbench is performed at the level of probesets. In some cases, an interaction between two genes may be represented by more than one edge, each such edge involving an alternate probeset for at least one of the genes.

When the "Merge multiple probesets" option is not chosen, the full ARACNe adjacency matrix, as calculated at the probeset level, will be retained and placed as a data node in the Workspace."


Thank you very much, Alolex.

14 months ago
alolex • 890
United States

CSV just means "comma-separated values", so take your text file (tab-delimited, maybe?) and replace all the column separators with commas, then change the file extension to .csv. Note that any field which itself contains a comma has to be wrapped in double quotes, otherwise the columns will shift. Also, if you have your annotations in Excel, I think you can just do "Save As" and select the CSV format for the current workbook.
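
For a file too large to edit by hand, the conversion can be scripted. The following is a minimal Python sketch that assumes the input really is tab-delimited (not verified for the PLEXdb download). The function name tsv_to_csv and the file names in the usage note are placeholders. It uses the standard csv module rather than a plain find-and-replace so that fields containing commas, such as "Mar 11, 2009" or the symbol lists in Ref_Desc, are quoted automatically.

```python
import csv

def tsv_to_csv(src_path, dst_path, delimiter="\t"):
    """Copy a delimited text file to CSV and return the number of rows written.

    Fields that contain commas or double quotes are quoted by csv.writer,
    so the column layout survives the conversion.
    """
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        # QUOTE_NONE: treat quote characters in the input as ordinary text.
        reader = csv.reader(src, delimiter=delimiter, quoting=csv.QUOTE_NONE)
        writer = csv.writer(dst)  # default dialect: comma-separated, minimal quoting
        rows = 0
        for row in reader:
            writer.writerow(row)
            rows += 1
    return rows
```

Usage would look like tsv_to_csv("annotation.txt", "annotation.csv"). This only changes the delimiter; it does not rename or reorder columns, so the target program may still expect a different layout.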


Thank you, Alolex. The file is too dense for me to reformat by hand, and it is not in Excel.


Can you post a sample of the file you are working with and then explain what you want extracted or reformatted? If you can't open it in a text editor and you have a Mac or Linux machine, you can run the command "head myfile.txt" and copy and paste its output here.
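
On a machine without the head command (Windows, for example), the same peek can be done with a few lines of Python. This is a minimal sketch: the function name head and the file name myfile.txt are placeholders, and UTF-8 encoding is an assumption.

```python
from itertools import islice

def head(path, n=10, encoding="utf-8"):
    """Return the first n lines of a text file without loading the whole file."""
    with open(path, encoding=encoding, errors="replace") as fh:
        return [line.rstrip("\n") for line in islice(fh, n)]

# Example (not run here):
#   for line in head("myfile.txt"):
#       print(repr(line))   # repr() shows tabs as \t, which reveals the delimiter
```

Printing with repr() makes it easy to see whether the columns are separated by tabs or by spaces, which determines how the file should be converted.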
