Mining E-mail Newsletters for Interesting Topics
Posted by David Garner
Jul 20, 2020 at 08:08 PM
I’ve subscribed to a lot of e-mail newsletters through the years. Recently, I’ve discovered that a bunch of them have pointers to interesting projects on the web. Unfortunately, I am getting so many newsletters that I can’t keep up with them. I’d like to automate locating and extracting the data, but until I can get my brain around enough NLP (Natural Language Processing), ML (Machine Learning), and other AI (Artificial Intelligence) technology to create something that does this, I should probably figure out whether someone has already done it.
I’ve not seen any mention of this kind of software in the Open Source community, but that does not mean that it does not exist. Anyone know of anything that might address even part of this effort?
Thanks.
Posted by satis
Jul 21, 2020 at 02:47 AM
If you used a Mac, you could put the files into DEVONthink or use DEVONsphere Express, which stores and organizes files, e-mails, and scans, then indexes, auto-groups, and auto-classifies the information using AI: basically a sophisticated algorithm that examines the text in a given document, compares it to the text in all other documents, and mathematically computes the degree of similarity for each one.
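To make the "compare each document to all the others and compute a degree of similarity" idea concrete, here is a minimal Python sketch. It is not how DEVONthink works internally (that is not documented in this thread); it swaps in a generic, well-known technique, TF-IDF weighting with cosine similarity, using only the standard library. The function names and the sample newsletter texts are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF, so terms found in every document keep a small weight.
        vectors.append(
            {t: n * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t, n in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(docs, index):
    """Return (other_index, score) pairs for docs[index], most similar first."""
    vectors = tfidf_vectors(docs)
    scores = [
        (i, cosine(vectors[index], vec))
        for i, vec in enumerate(vectors)
        if i != index
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical newsletter snippets, purely for illustration.
newsletters = [
    "New open source project for parsing email newsletters",
    "Open source library that parses email and extracts links",
    "Recipe of the week: slow roasted tomato soup",
]

for other, score in rank_similar(newsletters, 0):
    print(other, round(score, 3))
```

With these samples, the second snippet ranks above the recipe, which shares no words with the first one. A real tool would likely add stemming, stop-word removal, and a proper index, so treat this as a starting point only.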
In another post you indicated you were using Windows, so perhaps you could check out DEVONthink, see whether it’s relevant, and if so, research Windows alternatives.
That said, this really has nothing to do with outliner software or writing productivity, which is what this forum focuses on, so I’ll stop. You might want to check out a forum more closely related to this subject, perhaps https://www.reddit.com/r/productivity/
Posted by Franz Grieser
Jul 21, 2020 at 07:35 AM
At the end of this thread someone recommended Ultrarecall for working with Outlook emails:
https://www.outlinersoftware.com/topics/viewt/9101/15
Posted by MadaboutDana
Jul 21, 2020 at 08:05 AM
UltraRecall is very capable. It’s not as sophisticated as DEVONthink, but it has certain advantages of its own.
I’d argue that e-mail clients do actually fit the definition of “outliner”; as for information mining/retrieval, that’s an area where e-mail clients (as outliners) are notoriously unreliable. Hence the recent spate of messages about transferring data out of e-mail clients and into some kind of info management app (or outliner).
There was an all-in-one app for Windows that acted as e-mail client/file manager/basic notetaker, but for the life of me I can’t remember the name. I remember experimenting with it a little while before we switched to Apple and being mildly impressed, but it doesn’t seem to have gained any traction.
I think the transformation of e-mail clients into fully fledged information managers is well overdue. Some interesting efforts have been made, but the most impressive ones all appear to be aggregating online systems (i.e. web clients that suck data from all your e-mail accounts into a central web client and then allow you to process e-mails from the individual accounts). For a wide variety of reasons, I hate that idea!
I’ve tried some of the more promising recent entries to the field (for macOS/iOS) such as Airmail, Readdle Spark, Canary Mail, etc., but even these supposed paragons of efficiency tend to fall down on the most important function of all: finding and retrieving information. The most efficient tool I’ve found for handling very large e-mail repositories is MailStore (an e-mail archival tool by a German developer; only runs on Windows). It’s an outstanding tool with an outstanding search function, and there is a free version.
Cheers,
Bill
Posted by MadaboutDana
Jul 21, 2020 at 08:06 AM
Sorry, “aggregating online systems” = “online aggregators”