CRIMP Alert: A Compiled List of PDF Managing and Search Tools
Posted by Dominik Holenstein
Feb 7, 2008 at 09:46 AM
Derek,
Thank you for your kind words!
I will go for the ‘combined approach’:
Storing pdfs and other files in the appropriate place in UltraRecall where necessary. Thanks to logical linking I can ‘store’ or, better, ‘link’ the files to different places. This allows me to keep the original pdf files in one folder on my USB drive.
After evaluating different tools for searching text in files I decided to buy Archivarius 3000. I think it offers the best ratio of price to functionality.
Dominik
Posted by Derek Cornish
Feb 7, 2008 at 06:13 PM
Dominik,
I have tended to use a combined approach, too, although I can’t claim that the resulting way of working is as efficient as it probably could be. It looks more like a series of workarounds developed over time in order to cope with the limitations of the software I have used.
Yes, Archivarius 3000 looks like a good buy. If I had alx’s language requirements, and didn’t already use dtSearch (and Wilbur for some tasks), I’d probably choose it. As it is, I am tempted by its claims to index Zoot’s zot files - but not quite enough - yet - to buy it.
Derek
Posted by Derek Cornish
Feb 8, 2008 at 01:46 AM
Alexander Deliyannis wrote:
>I have little
>difficulty in deciding myself. I have always opted for leaving them in (a few)
>permanent windows folders and indexing/linking to them from database programs such
>as UltraRecall or whatever. The sheer size of the files is such that it would make no
>sense to include them _within_ a file.
The difficulty in deciding doesn’t arise in connection with the notes/ideas manager I use. For example, I use Zoot as my ideas database and as a capture tool for text snippets extracted from the web and from files. Like you, I have tended to store pdf, doc, htm files in organized Windows folders and link them to Zoot items. (This is in any case not a matter of choice as, unlike UltraRecall, Zoot cannot store these types of files as it currently only accepts plain text. OTOH, using Zoot avoids the temptation of loading one’s notes/idea manager with wodges of irrelevant information.)
Storing the ‘real’ files in the Windows folder system also has the advantage of keeping them available for indexing and searching using any competent desktop search engine. Zoot databases can themselves be indexed and searched by the same software, although they first have to be converted into (large) htm files - unless one is using Archivarius, apparently. (Does Archivarius index and search UR files yet?)
For me, the difficulty in deciding arises at the point of downloading the pdf, htm, etc. files from the web, and this is where the question of whether to store these types of files in the Windows folder system or in dedicated web capture database software comes in. For a long time I used Net Snippets, which uses the Windows folder system to store its files and so allows desktop search engines to index and search them. But if you want to organize your downloaded files in more complex ways - for example, by using keywords or multiple categories - then you may have to look to dedicated web capturing tools like Surfulater or Web Research - even though their content may not be easily accessed by desktop search engines, nor easily linked to one’s chosen information manager.
Web Research (WR) is currently my main web capture tool for certain purposes - e.g., for pdf, htm, doc, and image files connected with particular projects, and for files I am keeping for semi-permanent reference purposes (e.g., software specifications, manuals, and so on). I can hyperlink from WR to Zoot and vice versa, so from that point of view it works well. The downside is that my search engine of choice, dtSearch, can only index and search the htm files stored in WR. Maybe I could persuade the Archivarius developers to take a look at WR’s database file format…
Given their potential drawbacks, why would one ever want to use dedicated web capture tools in preference to simply downloading files into Windows folders and linking to Zoot? I think there are a number of reasons: (1) Quicker real-time saving and categorizing - or dumping first and classifying later; (2) Easy re-organization of imported files via categories/keywords when necessary; (3) Highlighting, metadata; (4) An intermediate store for files en route to the Windows folder system; (5) A useful place in which to browse through files.
>Apart from the size issue, I
>believe that files as such will be accessible for quite some time, whereas database
>programs come and go. Think of the time involved in importing such files to a database
>and then exporting them to one’s next information manager.
I think this is a good argument for not downloading files into one’s notes/ideas manager - where, in any case, they may just clog things up - but less valid in the case of web capture tools as these have some value as both temporary and permanent storage sites for particular projects or purposes - and usually offer quite good bulk exporting features these days.
>That said, information
>is not knowledge. A library of references makes little sense unless one invests in
>slowly building their comprehension of the ideas within that material, i.e. their
>knowledge, whether visually (with mind maps etc) or textually (with a classic
>outliner). For this I find many of the tools we discuss in this forum absolutely
>invaluable.
Absolutely agree on this.
>
>An indexing program complements the building of such an ‘idea
>structure’ by helping reference and support themes, once one knows what they are
>after. Personally, I was attracted to Archivarius by its support for an amazing
>multitude of file formats, as well as for my own working language which is often
>unsupported by anglosaxon made/oriented software.
Can’t argue with that :-).
Derek
Posted by Ike Washington
Feb 8, 2008 at 01:42 PM
Derek
I index my Zoot databases using dtSearch directly, without converting them first into html files. Works okay - some garbage indexed too. Having searched within dtSearch, I launch the file containing my search term from there; the correct Zoot database opens; I search within it to locate exactly whatever I’m looking for.
Where to store data? I switched from Net Snippets to Scrapbook/Firefox. A real delight to use - html, pdf, txt, doc, jpg, gif. One of the main reasons why I use Firefox. And dtSearch indexes Scrapbook files perfectly, and Zoot links to them perfectly.
Ike
Derek Cornish wrote:
.... Zoot databases can themselves be indexed and searched by the
>same software, although they first have to be converted into (large) htm files -
>unless one is using Archivarius, apparently.
.... For a long time I used Net Snippets, which uses the Windows folder
>system to store its files and so allows desktop search engines to index and search
>them. But if you want to organize your downloaded files in more complex ways - for
>example, by using keywords or multiple categories - then you may have to look to
>dedicated web capturing tools like Surfulater or Web Research - even though their
>content may not be easily accessed by desktop search engines, nor easily linked to
>one’s chosen information manager.
>
>Web Research (WR) is currently my main web
>capture tool for certain purposes - e.g., for pdf, htm, doc, and image files connected
>with particular projects, and for files I am keeping for semi-permanent reference
>purposes (e.g., software specifications, manuals, and so on). I can hyperlink from
>WR to Zoot and vice versa, so from that point of view it works well. The downside is that
>my search engine of choice, dtSearch, can only index and search the htm files stored in
>WR. Maybe I could persuade the Archivarius developers to take a look at WR’s database
>file format…
Posted by Stephen Zeoli
Feb 8, 2008 at 07:33 PM
I use OneNote to archive my PDFs. This is doable for me for two reasons: I don’t have hundreds of PDF files to worry about, and most of the ones I do want to store are single pages (price quotes from printers, for instance). So OneNote is a great solution because I can drop a particular PDF into the appropriate notebook. ON stores the original and optionally includes a searchable printout of the PDF in the notebook. Very handy.
Steve z.