Which is best at saving Web pages?

Started by Cassius on 7/10/2007
Cassius 7/10/2007 5:13 am
From what I know, there are three products for saving Web (& other HTML, etc.) pages that also have some PIMmish properties: MyBase, Surfulater, and Web Research. I own the first two and am playing with a trial copy of the third.

I assume that people who use these in place of or in addition to other PIM products do so because of their ability to accurately save Web pages. It seems reasonable to me that the ability to accurately save Web pages would rank high in the choice of which of these three products one might use.

I have just begun a partial test of this. To wit, if MyBase+WebCollect [my current choice] does not save a page properly, I try saving it using both Surfulater and Web Research. I also try using both Firefox and IE7. (In my first test, WR worked best.)

This test is only partial, because it does not test cases in which Surfulater or WR might not properly save a page, but MB might properly save it.

I would like to propose a small, minimal-effort experiment. During the next 2-4 weeks, could each of us who uses one of these packages (or others), and who is willing, do the following for those pages that our package of choice does not save properly:

1. Copy the URL of the mis-saved page
2. Note whether the copying was from Firefox or IEx AND the product (MB, Surf, WR) used
3. In 2-3 words, note what the problem was (e.g., "multiple copies of images")
4. After acquiring several examples, post the information (1,2 & 3) for each mis-saved result under this topic.
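
For anyone who would rather keep these notes in a consistent form, the four steps above could be sketched roughly as follows in Python. This is only an illustration: the `MisSaveReport` class, its field names, the `|` separator, and the example URL are all hypothetical, not something any of these products provides.

```python
from dataclasses import dataclass


@dataclass
class MisSaveReport:
    """One mis-saved page, following steps 1-3 (hypothetical structure)."""
    url: str      # step 1: URL of the mis-saved page
    browser: str  # step 2: "Firefox" or "IE"
    product: str  # step 2: "MB", "Surf", or "WR"
    problem: str  # step 3: the problem in two or three words

    def as_post_line(self) -> str:
        # One line per report, ready to paste into a forum post (step 4).
        return f"{self.url} | {self.browser} | {self.product} | {self.problem}"


def format_reports(reports):
    """Join several reports into a single block of text for posting."""
    return "\n".join(r.as_post_line() for r in reports)


if __name__ == "__main__":
    # Placeholder example; the URL is made up for illustration.
    reports = [
        MisSaveReport("http://example.com/page", "Firefox", "MB",
                      "multiple copies of images"),
    ]
    print(format_reports(reports))
```

A plain text file with one line per page would do just as well; the point is only that every report carries the same three pieces of information.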

Each of us who uses another product can then test these URLs with our product of choice and report on the results.

Perhaps one product will prove "best," and, if a developer is following this forum, our results may encourage it to improve its product.

-c

Jan Rifkinson 7/10/2007 1:04 pm
How about Net Snippets?
Cassius 7/10/2007 1:26 pm


Jan Rifkinson wrote:
How about Net Snippets?

By all means! Glad to have you join the effort!

Thanks!

-c
Jan Rifkinson 7/10/2007 2:14 pm
and I guess there's also Scrapbook if one uses FF
JJ 7/10/2007 2:17 pm


Jan Rifkinson wrote:
How about Net Snippets?


FYI... NetSnippets has been discontinued. I would not invest time in a product that is no longer available.

I have had great success with Web Research.

I have found that it does the best job at capturing web pages.

Here is just one example of a test I conducted:

Go to orbitz.com and make a reservation.

Using Web Research, the page was captured perfectly.

I then tried UltraRecall, which only captured parts of the page.

Try this and you can see for yourself.

NOTE: I've been trying to move to 1 product (see my thread "Web Research -->> WOW!")
The more I use WR the more I like it BUT, now that I'm "pushing" the product, I'm finding its weaknesses. I'll be starting a new thread soon with my thoughts.

For just web clipping, I believe WR is great... (ymmv) (The main weakness I'm finding is the importing of other files into WR, e.g. avi files...)

-jj




Cassius 7/10/2007 3:10 pm
JJ wrote:

I have had great success with Web Research.

I have found that it does the best job at capturing web pages.

Here is just one example of a test I conducted:

Go to orbitz.com and make a reservation.

Using Web Research, the page was captured perfectly.

I then tried UltraRecall, which only captured parts of the page.

Try this and you can see for yourself.

-jj

Well, I tried it, except that I didn't go all the way...I didn't want to actually make a reservation and Southwest is cheaper and is nicer to its customers (except for no in-flight meals).

I stopped at the page where I was to enter my name, card number etc. I filled in a name and then saved the page using Web Research, Surfulater, and MyBase.

MB & WR saved the page perfectly, EXCEPT the filled-in blanks were blank. That is, information entered into the blanks was not saved.

Surfulater saved a version of a DIFFERENT page--a somewhat mangled version of the opening ORBITZ page. This is more than strange, as I have had another page that Surfulater saved better than did MB.

Apparently S does worse on some pages and better on others than does MB.

So, we now have a little info, but hardly enough to make an intelligent decision on which is best at saving pages. Hope more will join this effort.

-c


Jan Rifkinson 7/10/2007 5:10 pm
How does Scrapbook stack up with these results?

Yes, it's true that Net Snippets is not being developed any more but it works quite well for what is being discussed in this thread.
Jan Rifkinson 7/10/2007 5:22 pm
BTW, I found the following from the Macropool website somewhat amusing:

Tip: For better results, additionally install the Scrapbook Firefox Extension, which the Web Research Firefox extension will then utilize for saving pages.
Ike Washington 7/10/2007 11:54 pm
Jan Rifkinson wrote:
How does Scrapbook stack up with these results?

Yes, it's true that Net Snippets is not being developed any more but it works quite well for what is being discussed in this thread.

From the right-hand panel in my Firefox browser, Scrapbook displays a perfect copy of an Orbitz.com search result.

I can add notes and highlights to the page. I can get rid of sections I don't want. I can combine the page with other pages, move it to other Scrapbooks. Not bad for a free application. The search is so-so, getting slower as the data increases since it doesn't index first. I use dtSearch here so this isn't an issue for me.

Net Snippets - never really used it to save whole pages. I remember some pages looking not quite right. I used it to store extracts taken from files already on my hard drive. Since development stopped, I've switched to Surfulater and EverNote for this.

I do miss Net Snippets Pro's output abilities: add your own logo, create design templates, get html documents of your research project which look good in a business setting.

And since it launched as a left-hand panel on my screen, I had a pretty nifty work flow: copy articles, pdfs into Firefox Scrapbook on the left, work in a wiki in the center, send finished work over to Net Snippets on the right. Pity it's not being developed any further; but it's fine for storing odds and ends.

Ike
Ike Washington 7/11/2007 12:05 am


Cassius wrote:
JJ wrote:
>
>I have had great success with Web Research.
>
>I have found that it does the best job at capturing web pages.
>
>Here is just one example of a test I conducted:
>
>Go to orbitz.com and make a reservation.
>
>Using Web Research, the page was captured perfectly.
>
>I then tried UltraRecall, which only captured parts of the page.
>
>Try this and you can see for yourself.
>
>-jj

Well, I tried it, except that I didn't go all the way...I didn't want to actually make a reservation and Southwest is cheaper and is nicer to its customers (except for no in-flight meals).

I stopped at the page where I was to enter my name, card number etc.

Sorry, I should have said that, like Cassius, I stopped short of actually booking a flight. Scrapbook saved both the Orbitz search results and the flight details page perfectly.

Ike
Ike Washington 7/11/2007 1:45 am


Ike Washington wrote:
From the right-hand panel in my Firefox browser, Scrapbook displays a perfect copy of an Orbitz.com search result.

.....snip......

And since it launched as a left-hand panel on my screen, I had a pretty nifty work flow: copy articles, pdfs into Firefox Scrapbook on the left, work in a wiki in the center, send finished work over to Net Snippets on the right. Pity it's not being developed any further; but it's fine for storing odds and ends.

Ike

For anyone confused by my confusion over what's left, what's right, I meant, of course, that Scrapbook sits in Firefox's left-hand panel and that Net Snippets has a right-hand panel.

It's late where I am :).

Ike
Derek Cornish 7/11/2007 8:10 pm
Ike,

I use dtSearch here so this isn’t an issue for me.

Yes, good point. Scrapbook and NetSnippets store their retrieved data in the Windows file system - not in proprietary databases or in forms that desktop search programs can't (or won't) currently handle. The ability to index and search files without first having to export them is a really important benefit for those who like to be able to search across all their files in one go. As frequently mentioned here, the one thing that gets in the way of working with a number of different information-capturing or note-taking programs is the difficulty of retrieving information from all of them at once. Scrapbook and NetSnippets solve at least part of that problem. dtSearch can also index and search WR's htm files (with some tinkering) - but only this file-type.

This is why I constantly dither over whether to standardize on WR (nice interface, neat features), or Scrapbook/NetSnippets - for ease of searching.

Derek
Ike Washington 7/12/2007 10:51 am
Derek Cornish wrote:
Scrapbook and NetSnippets store their retrieved data in the Windows file system - not in proprietary databases or in forms that desktop search programs can't (or won't) currently handle. The ability to index and search files without first having to export them is a really important benefit for those who like to be able to search across all their files in one go.

Another important benefit, something which makes Scrapbook "best" in my reckoning, is that any changes I make to my saved pages (highlights, notes) are saved in the actual html and so can be seen quite outside of Firefox. This means that any notes I add to a saved page are indexed by dtSearch and show up in its search results (dtSearch uses IE). Further, these pages are relatively future proof. As long as I can find a browser which reads html, I'll be able to read my modifications.

Ike
Cassius 7/15/2007 8:19 am
7/15/2007:

So far only JJ and I appear to have done any comparative testing of Web page saving.

Here are my results so far.

[NOTE: The formatting of this page may become "messed up" when I post it. Also, the Hotmail page URL may extend beyond the visible screen.]

-c

===========================================================

Hotmail article (you must have a Hotmail account to open this):

http://by124w.bay124.mail.live.com/mail/ReadMessageLight.aspx?Aux=0%2c0%2c633198934416300000&FolderID=00000000-0000-0000-0000-000000000001&InboxSortAscending=False&InboxSortBy=Date&ReadMessageId=ba5ef438-5d52-48da-bc89-5423089944f8&n=42444845

MyBase:
With FF: Could not save
With IE: Could not save

With Web Research:
With FF: Could not save and DAMAGED WR and/or ScrapBook*
With IE: Saved, but required firewall permission to access internet

With Surfulater:
With FF: Could not save
With IE: Saved

==============================================================

Originally tested by JJ using WR and UltraRecall (UR). My results follow:

www.Drudgereport.com

MyBase:
With FF: Saved, but viewing in MB caused several IE ad windows to open
With IE: Saved, but viewing in MB caused several IE ad windows to open

With Web Research:
With FF: Saved*
With IE: Saved, but required firewall permission to access internet

With Surfulater:
With FF: Saved
With IE: Saved

==============================================================

Originally tested by JJ using WR and UltraRecall (UR). My results follow:

www.Taunton.com

MyBase:
With FF: Saved
With IE: Saved

With Web Research:
With FF: Saved
With IE: Saved, but required firewall permission to access internet

With Surfulater:
With FF: Saved
With IE: Saved

==============================================================

Originally tested by JJ using WR and UltraRecall (UR). My results follow:

www.GPSlodge.com

MyBase:
With FF: Could not save properly
With IE: Could not save properly

With Web Research:
With FF: Saved
With IE: Saved, but required firewall permission to access internet and took VERY long time

With Surfulater:
With FF: Saved
With IE: Saved

==============================================================

Originally tested by JJ using WR and UltraRecall (UR). My results follow:

www.Hertz.com

MyBase:
With FF: Couldn't save
With IE: Saved, but very slow in showing part of page in MB

With Web Research:
With FF: Saved*
With IE: Saved, but required firewall permission to access internet

With Surfulater:
With FF: Saved (took several seconds)
With IE: Saved (on second try)

==============================================================

Here's one I did earlier (the Orbitz reservation page), based on a comment by JJ (I think I did these using only FF):

Well, I tried it, except that I didn’t go all the way...I didn’t want to actually make a reservation and Southwest is cheaper and is nicer to its customers (except for no in-flight meals).

I stopped at the page where I was to enter my name, card number etc. I filled in a name and then saved the page using Web Research, Surfulater, and MyBase.

MB & WR saved the page perfectly, EXCEPT the filled-in blanks were blank. That is, information entered into the blanks was not saved.

Surfulater saved a version of a DIFFERENT page—a somewhat mangled version of the opening ORBITZ page. This is more than strange, as I have had another page that Surfulater saved better than did MB.

==============================================================

* WARNING: When I tried in Firefox to save the Hotmail article:
http://by124w.bay124.mail.live.com/mail/ReadMessageLight.aspx?Aux=0%2c0%2c633198934416300000&FolderID=00000000-0000-0000-0000-000000000001&InboxSortAscending=False&InboxSortBy=Date&ReadMessageId=ba5ef438-5d52-48da-bc89-5423089944f8&n=42444845

Web Research wouldn't save it, and WR then became damaged and could not save any other pages in Firefox. I tried reinstalling the WR Firefox add-on, but that didn't help. I ended up using GoBack to return my system to a time before I first tried saving this page. WR then worked properly with all other Web saves EXCEPT when I finally tried this Hotmail page again. Again WR became damaged and wouldn't save any other Web page. I suggest not using WR to save Hotmail pages.

-c


superoutliner 7/16/2007 12:51 am
What about Evernote? It can save web clips, and does it really well. And its overall features are pretty cool also.
Cassius 7/16/2007 5:09 pm

superoutliner wrote:
What about Evernote? It can save web clips, and does it really well. And its overall features are pretty cool also.

Personally, I don't care for Evernote. However, I encourage you to test these pages (URLs) yourself with Evernote and report the results here.

Note that I only tested saving entire pages and that if I only said "Saved," I meant that the entire page was saved correctly (except for possibly changing advertisements).

-c
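
For anyone who wants to total up the 7/15/2007 results per product, here is a rough Python sketch. Several things in it are assumptions rather than anything stated in the thread: the outcome strings are abbreviated from the results post, the three groupings ("clean", "caveat", "failed") are just one way of reading them, and "Saved*" is counted as a clean save even though the asterisk points to the Hotmail warning. With only five sites tested, the totals are a summary of that one post and nothing more.

```python
from collections import Counter

# (site, product, browser, outcome), abbreviated from the 7/15/2007 post.
RESULTS = [
    ("Hotmail", "MB", "FF", "Could not save"),
    ("Hotmail", "MB", "IE", "Could not save"),
    ("Hotmail", "WR", "FF", "Could not save and damaged WR*"),
    ("Hotmail", "WR", "IE", "Saved, but required firewall permission"),
    ("Hotmail", "Surf", "FF", "Could not save"),
    ("Hotmail", "Surf", "IE", "Saved"),
    ("Drudgereport", "MB", "FF", "Saved, but opened IE ad windows"),
    ("Drudgereport", "MB", "IE", "Saved, but opened IE ad windows"),
    ("Drudgereport", "WR", "FF", "Saved*"),
    ("Drudgereport", "WR", "IE", "Saved, but required firewall permission"),
    ("Drudgereport", "Surf", "FF", "Saved"),
    ("Drudgereport", "Surf", "IE", "Saved"),
    ("Taunton", "MB", "FF", "Saved"),
    ("Taunton", "MB", "IE", "Saved"),
    ("Taunton", "WR", "FF", "Saved"),
    ("Taunton", "WR", "IE", "Saved, but required firewall permission"),
    ("Taunton", "Surf", "FF", "Saved"),
    ("Taunton", "Surf", "IE", "Saved"),
    ("GPSlodge", "MB", "FF", "Could not save properly"),
    ("GPSlodge", "MB", "IE", "Could not save properly"),
    ("GPSlodge", "WR", "FF", "Saved"),
    ("GPSlodge", "WR", "IE", "Saved, but required firewall permission, very slow"),
    ("GPSlodge", "Surf", "FF", "Saved"),
    ("GPSlodge", "Surf", "IE", "Saved"),
    ("Hertz", "MB", "FF", "Couldn't save"),
    ("Hertz", "MB", "IE", "Saved, but very slow to display"),
    ("Hertz", "WR", "FF", "Saved*"),
    ("Hertz", "WR", "IE", "Saved, but required firewall permission"),
    ("Hertz", "Surf", "FF", "Saved (took several seconds)"),
    ("Hertz", "Surf", "IE", "Saved (on second try)"),
]


def classify(outcome: str) -> str:
    """Group an outcome as 'clean', 'caveat', or 'failed' (my own grouping)."""
    text = outcome.rstrip("*").strip()  # assumption: ignore the footnote mark
    if text == "Saved":
        return "clean"
    if text.startswith("Saved"):
        return "caveat"
    return "failed"


def tally(results):
    """Count outcomes per product across all sites and browsers."""
    counts = {}
    for _site, product, _browser, outcome in results:
        counts.setdefault(product, Counter())[classify(outcome)] += 1
    return counts


if __name__ == "__main__":
    for product, c in tally(RESULTS).items():
        print(product, "clean:", c["clean"], "caveat:", c["caveat"],
              "failed:", c["failed"])
```

Under those assumptions, out of ten saves each, MB comes out at 2 clean, 3 with a caveat, and 5 failed; WR at 4, 5, and 1; Surfulater at 7, 2, and 1.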