File Searching Blues
Posted by ureadit
on 9/15/2003
ureadit
9/15/2003 7:35 pm
All,
I expect that, like me, some of you use text searching software (MS or third-party) to search through multiple files. Recently, many PIMs have incorporated file compression to reduce the size of files, particularly if they contain graphics.
As I just discovered, even if a search engine can read zipped files, none that I am aware of can read automatically compressed PIM files. Even if you use only one PIM, its built-in search function cannot simultaneously search multiple files. (InfoSelect may be an exception.) This is a disaster for those of us who have a LOT of data in MANY files.
The only solution I can think of is some software standardization. In particular, if vendors of software used for text information storage could create plug-in files (DLLs, for example) that would "tell" search programs how to read their compressed files, and if file search vendors would modify their software to accept such plug-ins, then the problem would be solved.
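To make the plug-in idea concrete, here is a minimal sketch of how a search program might accept format readers from other vendors. Everything in it is an assumption for illustration: the registry, the function names, the made-up ".pimz" extension, and the use of zlib as a stand-in for whatever compression a real PIM uses. No existing product is known to work this way.

```python
import zlib
from typing import Callable, Dict, List

# Hypothetical registry: file extension -> function that turns raw bytes into text.
READERS: Dict[str, Callable[[bytes], str]] = {}


def register_reader(extension: str, reader: Callable[[bytes], str]) -> None:
    """Called by a (hypothetical) vendor plug-in to announce a format it can read."""
    READERS[extension.lower()] = reader


def read_plain(raw: bytes) -> str:
    # Plain text needs no special handling.
    return raw.decode("utf-8", errors="replace")


def read_compressed_pim(raw: bytes) -> str:
    # Stand-in for a vendor-supplied reader; assumes the file is zlib-compressed text.
    return zlib.decompress(raw).decode("utf-8", errors="replace")


def search_files(files: Dict[str, bytes], term: str) -> List[str]:
    """Return the names of files whose decoded text contains the term (case-insensitive)."""
    hits = []
    for name, raw in files.items():
        ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
        reader = READERS.get(ext)
        if reader is None:
            continue  # no plug-in for this format, so the file is skipped
        if term.lower() in reader(raw).lower():
            hits.append(name)
    return hits


register_reader("txt", read_plain)
register_reader("pimz", read_compressed_pim)
```

With both readers registered, one call searches plain and compressed files together. A file in a format with no registered reader is simply skipped, which is roughly the situation we are in today.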
Anyway, I intend to inform all of my search and PIM software vendors of this problem and my potential "solution." But first, I thought I'd give you an opportunity to comment and make suggestions--please!
Thanks,
-Steve C.
100341.2151
9/25/2003 10:33 pm
Steve,
I don't know how much time I've spent fruitlessly (well, laboriously) trying to get around these problems. Proprietary formats make many PIM and free-form database programs - and many other programs - hard or impossible to index or to view. Nowadays I take a good look at such factors before deciding whether or not to buy them.
And it's not just PIMs, etc. E-mail programs' mailboxes can also be a problem.
For a long time I continued using Lotus Magellan - a DOS program - simply because it was so good at indexing and viewing the other programs I used (and still use) daily - Grandview, Agenda, Tapcis, etc. For those files it couldn't handle - htm and pdf, for example - I just converted to text and indexed that.
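For anyone who wants to try the convert-to-text-and-index route today, here is a rough sketch using only the Python standard library. It is not how Magellan worked; the class and function names are made up for illustration, and it assumes the htm files are already loaded as strings.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser
from typing import Dict, List, Set


class TextExtractor(HTMLParser):
    """Collects the text between tags and ignores the tags themselves.

    Note: the contents of <script> and <style> elements are collected too.
    """

    def __init__(self) -> None:
        super().__init__()
        self.parts: List[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def html_to_text(html: str) -> str:
    """Convert an htm page to plain text by dropping the markup."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lower-cased word to the set of document names containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index
```

Convert each page with html_to_text, pass the results to build_index, and a lookup such as index["agenda"] gives the names of the files that mention the word.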
Now I'm in an all-Windows environment where there is quite a lot of choice for (indexed) searchers, and I use Wilbur (free) and Search32 ($$). They can, at least, handle htm, and their use of pdf2txt to index and view pdf files is much quicker than the clunking invocation of the Acrobat Reader used by dtSearch.
But as you say, the problem of proprietary formats remains a real stumbling-block. I got used to having to do without a Magellan-type viewer for my old Agenda files; they have a lot of text in them and no compression - just extraneous garbage. Zoot and askSam files, too, can usually be indexed and viewed after a fashion. But programs like ContentSaver, MyBase, Outlook, OE, etc. are pretty much impossible to index unless the search program has special facilities to do so (X1, for example, with Outlook Express).
For sound commercial reasons, program designers like to make it easy to get your data into their programs, and jolly hard to get it out again. This, I guess, is part of the problem. I like your idea, though, which harks back to the good old days of Magellan's and XTree's viewers.
Incidentally, the file manager Total Commander (which I love) uses plug-ins contributed by users to handle some of the problems of proprietary formats. Plug-ins do seem to be the way to go...
Derek
