Is semantic search the end of 'information organization'?
Started by Dellu
on 4/16/2018
Dellu
4/16/2018 6:07 pm
The new technology arising from the Google camp (https://books.google.com/talktobooks/) is quite astounding.
Semantic search: a new form of technology in action (it has been theorized for ages, but nobody dared to put it into practice).
With the advent of technology capable of searching your "thoughts" rather than the mere technical terms we are inclined to forget, it is becoming quite plausible to do away with the whole idea of organized knowledge.
A nice article on the technology: https://qz.com/1252664/talk-to-books-at-ted-2018-ray-kurzweil-unveils-googles-astounding-new-search-tool-will-answer-any-question-by-reading-thousands-of-books/
Could semantic search replace the whole business of 'information organization' or 'information management'?
What do you guys think?
Dellu
4/16/2018 6:16 pm
I personally tried a couple of searches with Talk to Books. I am totally impressed with the results.
The only challenge, as far as I can see, is that the technology is owned by a corporation that might have no interest in sharing it for the good of humanity. We cannot rely on a technology owned by a greedy company that would be glad to 'enslave' us if it could. I feel hopeless on that front.
But I still feel hopeful that the technology will inspire universities or other public institutions to develop a similar open system (semantic search) that we could all own and use.
Paul Korm
4/16/2018 8:01 pm
The latest incarnation of a Memex, perhaps?
Semantic search has been around for a while. It depends on the quality or robustness of the search algorithm and the quality and breadth of the corpus that is searched. Since Google has invested billions in accumulating a massive corpus, it would be hard for a "benign" academic institution to catch up.
You might be interested in some of the research published in this area through the ACM (Association for Computing Machinery), which has a number of SIGs specializing in semantic computing, search, hypertext, and related fields.
bartb
4/19/2018 4:29 pm
I don't know if it could be a replacement. But it would be a great tool to add to the KM toolbox.
Dellu
4/19/2018 6:01 pm
Paul Korm wrote:
You might be interested in some of the research published in this area
through the ACM (Association of Computing Machinery), which has a number
of SIGs specializing in semantic computing, search, hypertext, and
related fields.
Thank you. I am totally unaware of the academic literature on this. I am checking it out.
Chris Thompson
4/19/2018 6:27 pm
You'd probably enjoy taking a look at this paper: http://students.ecs.soton.ac.uk/mwra1g13/msc/comp6045/pdfs/Bernstein%20-%20Can%20We%20Talk%20about%20Spatial%20Hypertext.pdf
It's not very technical, but it's an interesting exploration of various spatial hypertext idioms from the author of Tinderbox. Tinderbox itself emerged from Bernstein's graduate work in spatial hypertext; there are tons of really cool old spatial hypertext research systems like VIKI that few people have ever heard of but which introduced a lot of interesting ideas. The list of references in that paper provides a starting point for reading about them, and there is an interesting, opinionated discussion in Bernstein's "The Tinderbox Way" book.
There's also a whole body of work on the "semantic web" (basically, hypertext data representation with labelled arrows; think arbitrary concept maps), but much of that seems to have petered out. This Google project uses "semantic" in a different sense, referring to its traditional meaning of language understanding.
--Chris
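The "labelled arrows" idea behind the semantic web can be pictured as a list of (subject, label, object) triples. What follows is only a minimal sketch under that assumption: the `Triple` type, the predicate names `is_a` and `author_of`, and the `query` helper are hypothetical names chosen for illustration, not any real semantic-web library or standard vocabulary. The three facts are the ones stated in the post above.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str  # the label written on the arrow
    obj: str


# Hand-entered facts, taken from the post above; predicate names are made up.
TRIPLES = [
    Triple("Tinderbox", "is_a", "spatial hypertext system"),
    Triple("VIKI", "is_a", "spatial hypertext system"),
    Triple("Bernstein", "author_of", "Tinderbox"),
]


def query(triples, subject: Optional[str] = None,
          predicate: Optional[str] = None, obj: Optional[str] = None):
    """Return the triples that match every field that is not None."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Which things are recorded as spatial hypertext systems?
systems = [t.subject for t in query(TRIPLES, predicate="is_a",
                                    obj="spatial hypertext system")]
print(systems)
```

Every arrow has to be entered by hand, which is the cost that a language-understanding search tries to avoid.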
Dellu
4/19/2018 7:02 pm
Chris Thompson wrote:
There's also a whole body of work on the "semantic web" (basically,
hypertext data representation with labelled arrows; think arbitrary
concept maps), but much of that seems to have petered out. This Google
project uses "semantic" in a different sense, referring to its
traditional meaning of language understanding.
--Chris
Hypertext is an interesting means of organizing information, but my understanding of "semantic search" is different from hypertext. For a Tinderbox type of system, you need to explicitly assign "tags" to give meaning to your content. "Semantic search" could make assigning tags superfluous, because the system already knows the conceptual associations (lexical synonyms, artificially intelligent correlations between concepts), so you don't need to assign key terms or hyperlinks. To my mind, this latter approach is much more potent. Compared with the semantic search implemented in the Google system, Tinderbox's type of mapping feels like "manual" organization of information, because you have to tag the attributes by hand. Probably I don't understand hypertext well, especially the 'arbitrary concept map' part.
Dellu
4/19/2018 7:26 pm
I put "Norway prosper" into the search engine. Some of the results I got are fascinating.
In Norway, after the discovery of oil in the North Sea in 1969, economic growth accelerated for the following 25 years, allowing the country to catch-up and then exceed its otherwise highly similar Scandinavian neighbours, Denmark and Sweden, in terms of GDP per capita (Larsen, 2005)
After Norway became an oil-producing nation and the national income per capita rose above that of Sweden, the self-image of the Norwegians seems to have been strengthened
None of these outputs actually contains the term "prosper" or any of its close synonyms.
If I can magically fetch the exact concepts I am looking for like this, why do I need to keep a list of quotations, summaries, or notes from the books in my note-taking software?
If this kind of powerful semantic search is in our hands, organizing knowledge by manually setting maps/tags or hyperlinking concepts is effectively outdated.
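To make the contrast with plain keyword matching concrete, here is a toy sketch. It is not how Talk to Books works; that system's internals are not described in this thread. The assumption here is a tiny hand-made lexicon (`CONCEPTS`, hypothetical) that maps words to coarse concepts, with passages scored by cosine similarity over concept counts. A real system would learn such associations from text instead of having them typed in.

```python
import math
from collections import Counter

# Hypothetical hand-made lexicon: word -> coarse concept.
CONCEPTS = {
    "norway": "norway",
    "norwegians": "norway",
    "prosper": "wealth",
    "growth": "wealth",
    "income": "wealth",
    "gdp": "wealth",
}

QUERY = "Norway prosper"
PASSAGE = ("In Norway, after the discovery of oil in the North Sea in 1969, "
           "economic growth accelerated for the following 25 years")
OTHER = "Hypertext links documents with labelled arrows"


def tokens(text):
    """Lowercase words with surrounding punctuation stripped."""
    return [w.strip('.,;()"').lower() for w in text.split()]


def keyword_match(query, passage):
    """True only if every query word literally appears in the passage."""
    passage_words = set(tokens(passage))
    return all(w in passage_words for w in tokens(query))


def concept_vector(text):
    """Count how often each known concept is mentioned in the text."""
    return Counter(CONCEPTS[w] for w in tokens(text) if w in CONCEPTS)


def semantic_score(query, passage):
    """Cosine similarity between the concept counts of query and passage."""
    q, p = concept_vector(query), concept_vector(passage)
    dot = sum(q[c] * p[c] for c in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in p.values())))
    return dot / norm if norm else 0.0


print(keyword_match(QUERY, PASSAGE))    # literal match fails: no "prosper"
print(semantic_score(QUERY, PASSAGE))   # high: same concepts, different words
print(semantic_score(QUERY, OTHER))     # zero: no shared concepts
```

The passage is found through "growth" even though "prosper" never occurs in it, which is the behaviour described above, only with the associations typed in by hand.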
Chris Thompson
4/19/2018 7:49 pm
That's true. The need to manually tag is why the "semantic web" concept ended up petering out, though it spawned piles and piles of research papers on "ontologies" and "triple stores" and a lot of other things. Intelligent untagged search ("semantic search") just ends up being more powerful because no one has the time to tag the universe or apply all these standardized ontologies.
But as a researcher/notetaker, the task is a little more focused. There's no shortcut to actually taking notes based on your own understanding in order to develop your personal understanding of a domain. It's why having three dozen research papers on your hard drive ends up not being all that enlightening--even if you had a perfect semantic search engine for those papers--while having little personal write-ups/summaries, extracted personal highlights, and being able to sketch out interesting relationships you've noticed between concepts in those papers tends to lead to more complex understanding and the sort of insights you can develop to write novel papers/books. It's this latter thing that hypertext systems like ConnectedText and spatial hypertext systems like Tinderbox are designed to help with.
--Chris
Dellu
4/19/2018 8:18 pm
Chris Thompson wrote:
But as a researcher/notetaker, the task is a little more focused.
There's no shortcut to actually taking notes based on your own
understanding in order to develop your personal understanding of a
domain. It's why having three dozen research papers on your hard drive
ends up not being all that enlightening--even if you had a perfect
semantic search engine for those papers--while having little personal
write-ups/summaries, extracted personal highlights, and being able to
sketch out interesting relationships you've noticed between concepts in
those papers tends to lead to more complex understanding and the sort of
insights you can develop to write novel papers/books. It's this latter
thing that hypertext systems like ConnectedText and spatial hypertext
systems like Tinderbox are designed to help with.
--Chris
I fully agree. I just considered these points to be part of knowledge generation (rather than knowledge organization). I do need to structure my own take on the ideas (concepts) in my notes: my points, deductions, and extensions. That is a way to construct and generate new knowledge. It is also possible that a reader constructs conceptual links or deductions that no artificial intelligence could emulate.
satis
4/20/2018 2:56 pm
Eh. I entered 'Are there alien intelligences secretly watching us?' wanting to see how many woo-woo citations I'd get. Of the top 5 results one was off-topic completely, three were science-oriented hits, and only one (Alien Abductions and UFO Sightings 5-eBook Bundle) was garbage.
But the next 5 results were not particularly edifying (although one's 'alien intelligence' was about chimps!).
Dellu
4/20/2018 5:28 pm
satis wrote:
woo-woo citations
For now, they have included only 100,000 books in their corpus. They probably selected the books on the basis of publisher (academic publishers). But as the database grows, I don't think there is a way to avoid woo-woo citations, unless we get some way of manually blacklisting or excluding junk literature.
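A manual blacklist of the kind mentioned above could be as simple as dropping hits from blocked publishers before they are shown. This is a rough sketch under that assumption only: the `Hit` record, the publisher names, the scores, and the second title are all invented for illustration, and nothing here reflects how Google actually selects or filters books.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    title: str
    publisher: str
    score: float


# Hypothetical blacklist, maintained by hand.
BLOCKED_PUBLISHERS = {"Example Paranormal Press"}

# Made-up search results; publishers and scores are placeholders.
HITS = [
    Hit("Alien Abductions and UFO Sightings 5-eBook Bundle",
        "Example Paranormal Press", 0.91),
    Hit("A Hypothetical Astronomy Textbook",
        "Example University Press", 0.88),
]


def filter_hits(hits, blocked=BLOCKED_PUBLISHERS):
    """Keep only the hits whose publisher is not on the blacklist."""
    return [h for h in hits if h.publisher not in blocked]


for hit in filter_hits(HITS):
    print(hit.title)
```

The weak point is the same as with tagging: someone has to keep the list up to date by hand.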
