Global Searching Across Databases Question
Posted by MadaboutDana
Nov 2, 2011 at 12:43 PM
Have to agree that Agent Ransack / FileLocator Lite is very capable - the Pro version is also very capable, incidentally. But even the Lite version can search through MS Office files, and supports UTF-8 text (Unicode).
Posted by Fredy
Nov 2, 2011 at 09:15 PM
Madaboutdana, following your general assertion / assumption that AR processes UTF-8 files well: I made my last search with it only yesterday, with the usual phenomenon that it does NOT find European chars, neither in UR nor in AO, and that it presents its finds within their context (which is outstanding) containing the usual “chaos” chars wherever there are “European chars”. I checked for updates but didn’t find any.
So let's not create hopes that ain’t fulfilled yet. It’s the finest program of its kind even without treating UTF-8 encoding well in all circumstances; it’s very well possible that it processes it correctly in SOME file formats, while not being able to do so in others. (BTW, some progs have a box to check for that UTF-8, whilst AR and its paid brother don’t have such boxes yet; and btw, it’s not the UTF-8 format the important thing, but the correct rendering of “European chars”, or the other way round, why UTF-8 if “European chars” ain’t correctly processed ? Again, in UR and in AO AR fails for that, but in your files it certainly does well with this format.)
Posted by Fredy
Nov 2, 2011 at 09:17 PM
so, “and supports Unicode THERE” - agreed. They all run after the industry leaders, whilst we poor “other prog users” are left behind… ;-)
Posted by Alexander Deliyannis
Nov 2, 2011 at 09:38 PM
Fredy wrote:
>it’s not the UTF-8 format the
>important thing, but the correct rendering of “European chars”, or the other way
>round, why UTF-8 if “European chars” ain’t correctly processed ?
I haven’t tried Agent Ransack, but I will. However, I use some European character sets all the time and can say this, without becoming too technical (which I can’t): UTF/Unicode is becoming more and more the standard in anything from documents to HTML to email; if a program can’t read it (it can use two or more bytes per character instead of always one as in ANSI) then that’s that. If a program _can_ read it, then there may still be issues in displaying, but at least the program ‘comprehends’ (processes) the info correctly.
For proper display, a Unicode font like Arial Unicode MS must be used. Now, here’s the catch: for aesthetic reasons, a program may have hard-coded a font which is not Unicode, and then that’s that once again.
I can think of several programs all the way back to Idealist (long before Unicode) that I was never able to use because they assumed that knowledge in this world is written exclusively in the Latin character set…
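To make the byte-count point concrete, here is a small Python sketch (an illustration only, not anything taken from Agent Ransack). It assumes “ANSI” means the Windows-1252 code page, and it shows how one accented character is stored under different encodings, and how the “chaos” chars appear when UTF-8 bytes are read as if they were ANSI.

```python
# Illustration only: how one "European char" is stored under different encodings.
# Assumption: "ANSI" here means the Windows-1252 code page (cp1252).
text = "é"

plain_ascii = "e".encode("utf-8")       # plain ASCII letters stay 1 byte in UTF-8
utf8_bytes = text.encode("utf-8")       # 2 bytes: 0xC3 0xA9
ansi_bytes = text.encode("cp1252")      # 1 byte:  0xE9
utf16_bytes = text.encode("utf-16-le")  # 2 bytes: 0xE9 0x00

# Reading the UTF-8 bytes with the wrong (ANSI) decoder yields the "chaos" chars.
garbled = utf8_bytes.decode("cp1252")

print(len(plain_ascii), len(utf8_bytes), len(ansi_bytes), len(utf16_bytes))
print(garbled)
```

So a program that silently assumes one byte per character will neither match nor display such text correctly, whatever font it uses.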
Posted by MadaboutDana
Nov 3, 2011 at 04:04 PM
Hi Fredy,
Sorry, just caught sight of your post about UTF-8: interesting point. And in fact, to an extent, you’re right - Agent Ransack doesn’t “see” European characters correctly in e.g. RTF files (despite the fact I’ve got iFilters installed, as recommended by the developers). I’m not sure if that’s still true when you use the FileLocator Pro version, which appears to have its own preview (hence presumably also filtering) facility.
BUT Agent Ransack DOES see/find European characters correctly in HTML files (I’ve just run searches across e.g. Spanish, French and Russian). That's why I was pleased: I hold 99% of my important foreign-language material in HTML (or PDF) files, and by no means all search engines can find them properly (due to various encoding issues).
So yes, Agent Ransack/FileLocator Lite CAN find Unicode - but so far I’ve only absolutely confirmed it can in HTML files. I’ll need to run more tests to find out how well it handles other files - or indeed if changing to an alternative font (such as Arial Unicode) might make a difference (yes, Agent Ransack does support that!).
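For anyone wanting to run similar tests by hand, here is a rough Python sketch of why the same word can be found in one file and missed in another. It is not how Agent Ransack works internally (I don't know that); the function name `find_term` and the list of encodings are my own assumptions. It simply encodes the search term under several encodings and looks for each byte pattern in the raw file contents.

```python
def find_term(data: bytes, term: str, encodings=("utf-8", "utf-16-le", "cp1252")):
    """Return the encodings under which `term` occurs in the raw bytes `data`.

    Hypothetical helper for illustration; a naive byte-level search.
    """
    hits = []
    for name in encodings:
        try:
            needle = term.encode(name)
        except UnicodeEncodeError:
            continue  # the term cannot be written in this encoding at all
        if needle in data:
            hits.append(name)
    return hits


# The same HTML snippet saved two different ways.
html_utf8 = "<p>señor</p>".encode("utf-8")
html_ansi = "<p>señor</p>".encode("cp1252")

print(find_term(html_utf8, "señor"))
print(find_term(html_ansi, "señor"))
```

A search tool that only tries one of these encodings will find the word in one file and miss it in the other. Plain ASCII words match under both UTF-8 and cp1252, which is why English-only searches rarely show the problem.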
FWIW, I’ve never had any difficulties finding European characters in either UltraRecall or Smereka TreeProjects: both of them have very powerful search engines which appear to be fully Unicode-compliant. However, just because it works for me doesn’t necessarily mean it works for you ;-)
Another very promising application I’ve just found is MetaProducts Inquiry, which appears to have a very good search engine (there’s a nice free version for those who wish to try it out).