Large databases in UltraRecall?
Posted by Jon Polish
Aug 30, 2010 at 12:54 PM
I don’t know if this helps, but I have no trouble with my almost 8GB database. It has 12,364 items, a large percentage of which are PDF files. Most are stored in UR, but some are linked. All non-image pdf files are indexed (I differentiate because pdf’s that are scans (no OCR performed) are not indexable). I do not have performance problems other than when importing large amounts of data all at once. I have experienced slow imports on both small and large databases, so I don’t think size is a factor.
Jon
Posted by Chris Thompson
Aug 30, 2010 at 03:14 PM
Thanks for the feedback. It sounds like it’s doable, though perhaps with a very large database file.
To answer Quant’s question, I’m doing some consulting for an organization that has a large, baroque paper filing system (more than 250,000 documents, though I’m primarily concerned with a smaller core set). To handle what PIM users would consider “cloning”, they copy a document and file it in multiple places, sometimes five or six (occasionally a dozen). The locations of those “clones” are themselves metadata, because at some point someone inspected the document and found it relevant to the given categories. This system is effective for storage, but it makes extracting meaning from the document set inefficient. One can browse files and end up looking at the same document multiple times in multiple locations. Thus, search is essential, but I’d also like a PIM that supports cloning, so that it can mirror the underlying physical filing system.
—Chris
Posted by Alexander Deliyannis
Aug 30, 2010 at 06:59 PM
Jon Polish wrote:
>All non-image pdf files are indexed (I differentiate because
>pdf’s that are scans (no OCR performed) are not indexable).
A very important point.
For the record, Archivarius does not index image PDFs either. However, the Evernote Premium service does, as does OneNote (which I don’t use, so I can’t speak to its performance).
Regarding Chris’ cloning approach, I assume an alternative is to use keywords/tags to replicate the folder structure, as entries can be tagged under multiple categories.
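To make the tagging idea concrete, here is a minimal sketch in Python. It is not how UltraRecall or any other PIM stores data; the file names, folder paths, and the `build_index` helper are all made up for illustration. Each document is kept once, and every physical filing location becomes a tag on it.

```python
from collections import defaultdict

# Hypothetical sample data: (document, physical filing location) pairs.
# A document filed in several places simply appears in several pairs.
filings = [
    ("contract-017.pdf", "Legal/Contracts"),
    ("contract-017.pdf", "Vendors/Acme"),
    ("memo-203.pdf", "Vendors/Acme"),
]


def build_index(filings):
    """Build two lookups: tags per document and documents per tag."""
    tags_by_doc = defaultdict(set)
    docs_by_tag = defaultdict(set)
    for doc, location in filings:
        tags_by_doc[doc].add(location)   # the location acts as a tag
        docs_by_tag[location].add(doc)   # browsing a "folder" lists its docs
    return tags_by_doc, docs_by_tag


tags_by_doc, docs_by_tag = build_index(filings)

# Each document exists once, however many places it was filed.
for doc in sorted(tags_by_doc):
    print(doc, "->", sorted(tags_by_doc[doc]))
```

Browsing by tag still shows a document under every category it was filed in, but a search over `tags_by_doc` returns it only once, with all of its locations attached.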
Posted by Chris Thompson
Aug 30, 2010 at 10:13 PM
Unfortunately, OneNote doesn’t index PDF attachments. If you place a PDF on a page, it *will* index the bitmap page images that it creates, but then there’s no way to actually get the original PDF out again. This is probably my biggest frustration with OneNote.
—Chris
Alexander Deliyannis wrote:
>For the record, Archivarius does not index image PDFs either. However, the
>Evernote Premium service does, as does OneNote (which I don’t use, so I can’t speak
>to its performance).