Generating A Citation Graph From A Set Of PDFs
4
12
13.0 years ago

I have a set of articles in Zotero, stored as PDFs, and would like to generate a citation graph from them, ideally with some statistics about how often each article was cited in total and the option to fill in missing links.

Is there any tool that can do this? I've heard that Mendeley is (or was) capable of extracting references from PDFs, but I don't know what the status of this feature is. Other suggestions are welcome.

Alternatively, are there tools that just extract references from text, e.g. a collection of regular expressions for different journals? I have a little experience in Cytoscape plugin development and could code the visualization myself (if anyone is interested in this and would like to help, that is also welcome).

Related: this question and Chris Miller on FriendFeed, both without a satisfying answer. Maybe even Maltego could be used for this, but I don't know much about the software.

visualization • 6.8k views
4
13.0 years ago

There are tools like cbib for parsing references from PDF files. The real challenge, however, is proper record linkage and de-duplication: identifying which records point to the same publication. That is an open research problem with few, if any, easily applicable tools.
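
To make the record linkage problem concrete, here is a minimal sketch of one naive heuristic: normalize the titles, then group records whose titles are nearly identical and whose years do not conflict. The record layout (dicts with `title` and `year` keys), the function names and the 0.9 threshold are all assumptions made for illustration, not something cbib or any other tool produces. Real corpora need much more than this (author matching, journal abbreviations, DOIs).

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize_title(title):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def same_publication(a, b, threshold=0.9):
    """Heuristic match: years must not conflict, titles must be near-identical.

    The threshold is an arbitrary assumption and would need tuning.
    """
    if a.get("year") and b.get("year") and a["year"] != b["year"]:
        return False
    ratio = SequenceMatcher(
        None, normalize_title(a["title"]), normalize_title(b["title"])
    ).ratio()
    return ratio >= threshold


def deduplicate(records, threshold=0.9):
    """Greedy clustering: each record joins the first cluster it matches."""
    clusters = []
    for rec in records:
        for cluster in clusters:
            if same_publication(cluster[0], rec, threshold):
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters
```

Each returned cluster is a list of records assumed to describe the same publication. The greedy comparison is quadratic in the number of records, which is fine for a personal library but not for a large corpus.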

0

Thanks for the cbib link. For the "real challenge", I think PubMed or Google Scholar would already have solved that for me (?)

0

It should help to some extent. I think it will all depend on the size, quality and diversity of the corpus.

3
13.0 years ago

I don't know of a tool that does it all. So +1 for your question. I would do it as follows:

  1. I would use HubMed's citation finder (http://www.hubmed.org/citation.htm) to extract the references from the PDF.
  2. Subsequently I would use Graphviz dot or Cytoscape to draw the citation networks by providing the linked PMIDs in a text file. It would be very interesting if this were possible through a Cytoscape plugin. Please mention your Cytoscape plugin once it is finished.
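
The second step could be sketched roughly as follows, assuming the text file holds one `citing_pmid cited_pmid` pair per line. That file layout and the function names are my own assumptions for illustration. The script emits Graphviz DOT and labels each node with how often it was cited within the set.

```python
from collections import Counter


def parse_edges(lines):
    """Parse 'citing_pmid cited_pmid' pairs, skipping blanks and # comments."""
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        citing, cited = line.split()[:2]
        edges.append((citing, cited))
    return edges


def to_dot(edges):
    """Return a Graphviz DOT digraph; node labels show PMID and times cited."""
    edges = sorted(set(edges))  # ignore duplicate citation pairs
    cited_counts = Counter(cited for _, cited in edges)
    nodes = sorted({pmid for edge in edges for pmid in edge})
    out = ["digraph citations {", "  rankdir=LR;"]
    for pmid in nodes:
        # "\\n" writes a literal \n, which DOT renders as a line break
        out.append(f'  "{pmid}" [label="PMID {pmid}\\ncited: {cited_counts[pmid]}"];')
    for citing, cited in edges:
        out.append(f'  "{citing}" -> "{cited}";')
    out.append("}")
    return "\n".join(out)
```

Writing the result of `to_dot(parse_edges(open("edges.txt")))` to a file such as `citations.dot` (both file names are placeholders) and running `dot -Tpdf citations.dot -o citations.pdf` should render the network. The counts only reflect citations between articles in your own set, not total citations.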
3
13.0 years ago

If you have access to Thomson's Web of Science (especially the API), you might be able to use that, unless you want your results to be public, I guess. They know not only how often publications were cited but also from where, so they must already have collected the information you need.

0
11.3 years ago
sahar • 0

Hi Michael, I have the same problem: I want to generate a citation graph for my PDFs. Have you found a solution?
