PDF cataloging
Posted by CRC
Aug 18, 2016 at 06:21 PM
Acrobat Pro can build a fully searchable full text index for a group of PDFs. It does a good job and the searches are fast and complete.
Posted by Dr Andus
Aug 18, 2016 at 09:06 PM
You could check whether any of these academic software packages could be adapted to your needs:
“Comparison of Docear with Zotero and Mendeley”
http://www.docear.org/software/details/#Comparison_with_Zotero_and_Mendeley
Posted by Graham Rhind
Aug 19, 2016 at 12:55 PM
Thanks Dr Andus.
It looks like Mendeley and Zotero only allow searching across documents, which is not ideal, as explained previously. Docear looks interesting - that’s much closer to what I’m thinking of - but it doesn’t allow collaboration.
Posted by dan7000
Aug 20, 2016 at 12:34 AM
I think Qiqqa does what you want. I used it on a project for a while. You can add an annotation either to a whole document or to a specific location in a document, and then search those annotations. You can also tag both documents and locations in documents if you prefer tags to free-text annotations.
I used a Qiqqa “local library” for security reasons - I could not put the material on the cloud. Unfortunately, the software was painfully, painfully slow for me. It might have been my machine or it might have been the local database. I would be interested to know if it performs better with a cloud-based library.
Graham Rhind wrote:
>Thanks Madaboutdana.
>
>What I don’t see Reader helping with (I haven’t looked at Qiqqa yet) is
>the metadata issue.
>
>If I have a dozen pdfs each with 1000 pages, results of searches for
>almost anything are going to be full of noise. I want a solution that
>starts off being useful enough for people to use. So I want the pdfs
>themselves to be accompanied by a searchable metadata catalogue.
>
>Let’s say there’s an annual sales document and between pages 230
>and 240 there’s a section related to widget sales in Antarctica in 1989.
>I would like that information to get stored in a catalogue which would
>then allow people to search, for example, for widget sales figures for
>Antarctica but not South Africa in 1989, and this to bring up the “index
>card” with a link to that section of the pdf, wherever it may be. If I
>were just to search all the pdfs for “Antarctica and 1989” there would
>probably be too many results in the collection of pdfs, each of which would
>need checking to ascertain its relevance, and would put people off using
>the system.
>
>This is basic database stuff which I can do, but only if each pdf gets
>read (or at least skimmed) and the relevant information entered into a
>catalogue.
>
>I wondered if there was an easier way.
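
For what it's worth, the "index card" catalogue described in the quote could be sketched roughly as below. This is only an illustration under stated assumptions, not how Qiqqa or any other tool mentioned in this thread works: the table names (cards, card_tags), the function names and the tag-based matching are all made up for the example, and the "#page=N" link assumes a PDF viewer that honours that open parameter. Someone still has to skim each pdf and enter the cards by hand.

```python
import sqlite3


def build_catalogue(conn):
    """Create the (hypothetical) catalogue tables if they do not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS cards (
               id INTEGER PRIMARY KEY,
               pdf TEXT NOT NULL,
               first_page INTEGER NOT NULL,
               last_page INTEGER NOT NULL,
               summary TEXT NOT NULL)"""
    )
    conn.execute(
        """CREATE TABLE IF NOT EXISTS card_tags (
               card_id INTEGER NOT NULL REFERENCES cards(id),
               tag TEXT NOT NULL)"""
    )


def add_card(conn, pdf, first_page, last_page, summary, tags):
    """Store one index card covering a page range, plus its tags."""
    cur = conn.execute(
        "INSERT INTO cards (pdf, first_page, last_page, summary) "
        "VALUES (?, ?, ?, ?)",
        (pdf, first_page, last_page, summary),
    )
    card_id = cur.lastrowid
    # Tags are stored lower-cased so that matching is case-insensitive.
    conn.executemany(
        "INSERT INTO card_tags (card_id, tag) VALUES (?, ?)",
        [(card_id, tag.lower()) for tag in tags],
    )
    return card_id


def search(conn, include, exclude=()):
    """Return cards carrying every tag in include and no tag in exclude."""
    has_tag = (
        "EXISTS (SELECT 1 FROM card_tags t "
        "WHERE t.card_id = c.id AND t.tag = ?)"
    )
    clauses = [has_tag for _ in include]
    clauses += ["NOT " + has_tag for _ in exclude]
    params = [tag.lower() for tag in list(include) + list(exclude)]

    sql = "SELECT c.pdf, c.first_page, c.last_page, c.summary FROM cards c"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY c.pdf, c.first_page"

    return [
        {
            "pdf": pdf,
            "pages": (first_page, last_page),
            "summary": summary,
            # Assumption: the viewer jumps to the page named in the fragment.
            "link": f"{pdf}#page={first_page}",
        }
        for pdf, first_page, last_page, summary in conn.execute(sql, params)
    ]
```

With a card entered for pages 230 to 240 of a (hypothetical) sales_1989.pdf and tagged "widgets", "Antarctica" and "1989", calling search(conn, ["widgets", "Antarctica", "1989"], exclude=["South Africa"]) would return that card with a link to page 230, and would skip any card that also carries the "South Africa" tag. Because the search runs over the cards and not over the full text, the results stay short.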