The user's view would be an email or chat client without a "to" box. Your messages would be posted to a global public pool and go to anyone who cares, human or machine. From this pool you would receive messages ranked by relevance, which the system would estimate from the messages you posted in the past.
This is a distributed design based on lots of specialists and a distributed index for routing messages to the right experts. Ownership would be distributed. People will have an economic incentive to compete for reputation by providing useful, high-quality information on a hostile P2P network, because doing so will allow them to sell preferential ranking (advertising) at higher market prices. Thus, our role would be to design and code the software to build the infrastructure by which peers will communicate. This would be a tiny but influential effort relative to the enormous cost of providing the human knowledge and corresponding computing resources needed to implement AGI. It would be analogous to designing and implementing the first versions of HTTP and HTML (which took 1 person 6 weeks), as opposed to building the Web.
As the first step, we need to nail down the P2P protocol, because once we write the software and people start using it, we are stuck with it. The architecture is based on passing digitally signed and timestamped messages in natural language, which could later be extended to images, video, and other human-understandable data types. Once the infrastructure is built, we can focus on the more interesting problem of distributed indexing, which boils down to quickly estimating mutual information between messages. This is an AI problem: deciding which messages are appropriate responses to other messages, where the question and answer could contain words or pictures. Whether this is best solved by OpenCog, Genifer, NARS, TexAI, Cyc, text compression, or something else will be decided by the market.
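If the text-compression route were taken, one standard way to approximate shared information between two messages is the normalized compression distance. A minimal sketch using Python's zlib (the sample strings are made up for illustration; a real peer would use a much stronger compressor):

```python
import zlib

def csize(s: str) -> int:
    """Compressed size of a UTF-8 string, in bytes."""
    return len(zlib.compress(s.encode("utf-8"), 9))

def ncd(x: str, y: str) -> float:
    """Normalized compression distance: near 0 for very similar texts,
    closer to 1 for unrelated ones."""
    cx, cy, cxy = csize(x), csize(y), csize(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

q = "how do peers estimate the relevance of an incoming message"
a = "peers estimate relevance by comparing the incoming message with cached messages"
b = "the recipe calls for two cups of flour and a pinch of salt"

# A duplicate compresses almost for free against itself; unrelated
# text adds nearly its full compressed size.
print(ncd(q, q), ncd(q, a), ncd(q, b))
```

The intuition: if compressing x and y together costs little more than compressing either alone, they share a lot of information, so one is likely a relevant response to the other.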
Specifically I have proposed the CMR protocol described in appendix A of http://mattmahoney.net/agi2.html with HTTP handshake and Diffie-Hellman key exchange.
-- Matt Mahoney, matma...@yahoo.com
>Silly question. Will it be compatible with RSS?
Yes. The CMR protocol is for peer-to-peer communication. Peer-to-user can be anything: RSS, HTML, email, Facebook, Twitter, text message...
>(Perhaps) less silly question. In your description, information has negative value, but agents prefer to offload it to each other rather than delete it outright. Why is this? The idea that it's beneficial to keep similar information together (building specialists) makes sense, but it seems that in the cases in which we don't just want to throw information away, it still has positive value. It's just more valuable to someone who is building more expertise in the area (perhaps to the point where they won't delete it to free up room).
>
>The idea that a *lot* of information has negative value to us (such as irrelevant facebook posts) also makes sense. But, if all information had negative value, it seems like we would discard all of it.
The negative value is much greater to humans than to machines, because human time costs far more than machine storage. But in either case that is just the average. Some information has positive value to some people.
>And, finally... it makes sense to me that the mutual-information distance metric would be an ok approximation of what might interest someone, but really, interests are dependent on goals in a complex way. For example, if I type a bloglike post into mailpool, I'll be interested if it spits back a few similar posts made by other people. If it spits back hundreds, I won't be so interested. Mere similarity is not enough. I'd rather see the posts from the experts in the area, if there are any. Is there a way this is modeled in your framework?
Good point. A peer receiving lots of similar messages can store them very cheaply by compressing them. But the cost to humans is their uncompressed size.
The CMR protocol says that duplicate messages and messages with the recipient in the list of senders should be discarded. This prevents routing loops if all peers behave, but they might not. So peers need to be smart about which messages they accept. Peers will be ranked by reputation networks. You trust X and X says you can trust Y. It's how a lot of the internet works now. You trust the big websites and the links on them. CMR supports this by providing secure authentication.
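The discard rule could be sketched as follows. (The message fields here are assumptions for illustration; the actual CMR message format is in the linked appendix.)

```python
import hashlib

class Peer:
    """Minimal sketch of the CMR discard rule: drop duplicate messages
    and messages that already list this peer among the senders."""

    def __init__(self, peer_id):
        self.peer_id = peer_id
        self.seen = set()  # hashes of messages already handled

    def accept(self, body, senders):
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        if digest in self.seen:       # duplicate: discard
            return False
        if self.peer_id in senders:   # already in sender list: routing loop
            return False
        self.seen.add(digest)
        return True

p = Peer("alice")
print(p.accept("hello", ["bob"]))           # True: first copy accepted
print(p.accept("hello", ["bob", "carol"]))  # False: duplicate discarded
print(p.accept("world", ["bob", "alice"]))  # False: alice already a sender
```

A misbehaving peer can of course ignore this rule, which is why the reputation layer is still needed on top of it.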
-- Matt Mahoney, matma...@yahoo.com
From: Abram Demski <abram...@gmail.com>
To: general-in...@googlegroups.com
Sent: Tuesday, July 26, 2011 9:40 PM
Subject: Re: [GI] Mailpool/Genifer
Peers will have to be intelligent to estimate relevance to their users and the reliability of their sources. It is a hard problem. The market will determine the winners.
To: AGI <a...@listbox.com>
Sent: Wednesday, July 27, 2011 1:37 PM
Subject: [agi] Re: [GI] Mailpool/Genifer
Abram Demski <abram...@gmail.com> wrote:
>> what is the advantage over the existing internet?

We should not have to remember which website, which application, or which file to go to for different things. We should just tell the computer what we want, and it will do it.
Facebook, Twitter, blogs, and mailing lists should figure out who you know and who you would like to know, based on common interests.
A few large companies like Google and Microsoft (and the governments that regulate them) should not control which parts of the internet you see.
When you update a website, you should not have to wait for search engines to be aware of the changes.
Some of this is already happening, but I think the trend in search engines, at least, is in the wrong direction. Smaller players cannot compete with larger ones that can keep a copy of a larger piece of the internet in their cache and update it faster.
>Sounds fair, but in the scenario I wrote out, a human user would still have to decide where to direct a message -- though they will often direct it at some mixer in order to minimize the thought put into this. How can this be avoided? How does the protocol get around this?

We can't completely avoid it. The CMR protocol makes all messages available to all peers, but peers may have different user interfaces and different strategies for ranking the messages presented to the user. The user will still have to make choices.
>>Facebook, Twitter, blogs, and mailing lists should figure out who you know and who you would like to know, based on common interests.
>
>Again, fair, but I'm not sure how the protocol causes this to happen. There is an economic incentive to provide this service, but that's equally true of existing social networking sites.
Yes, and they are already doing it, for example, Facebook's news feed. A peer could have a user interface with "like" and "spam" buttons to learn your preferences. I expect there will be experimentation.
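One way a peer might learn from "like" and "spam" clicks is a simple word-frequency log-odds score. This is a toy sketch, not anything specified by CMR; the tokenization and smoothing are assumptions:

```python
import math
from collections import Counter

class PreferenceModel:
    """Toy model of 'like'/'spam' feedback: score new messages by
    comparing their words against past feedback."""

    def __init__(self):
        self.liked = Counter()
        self.spam = Counter()

    def feedback(self, text, liked):
        (self.liked if liked else self.spam).update(text.lower().split())

    def score(self, text):
        # Log-odds with add-one smoothing; higher means more like 'liked'.
        nl = sum(self.liked.values()) + 1
        ns = sum(self.spam.values()) + 1
        s = 0.0
        for w in text.lower().split():
            s += math.log((self.liked[w] + 1) / nl)
            s -= math.log((self.spam[w] + 1) / ns)
        return s

m = PreferenceModel()
m.feedback("new results on text compression benchmarks", liked=True)
m.feedback("buy cheap pills online now", liked=False)
print(m.score("compression benchmarks update") > m.score("cheap pills now"))  # True
```

A real peer would use something much stronger, but even this captures the basic idea of ranking the pool by learned preference rather than by sender.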
>>A few large companies like Google and Microsoft (and the governments that regulate them) should not control which parts of the internet you see.
>
>How does this fix that problem? In the loose description I wrote out, there were still "mixers" which would mostly be run by large companies/organizations. How can this be avoided?
Right now, indexing is centralized and content is distributed. But there is no reason that indexing can't be distributed too. Each part of the index would know about some topic and about other peers that know about similar topics. Messages would get routed to the right expert along multiple paths that can't be shut down by controlling just a few peers. Large companies would still have the advantage of higher reputations, but would no longer have the advantage of access to more content and faster updates that now shut out small competitors.
> how do we decide who to send our updates to?
If each peer used a simple strategy of sending 2 copies to other peers picked at random, then eventually the message would get to every peer in O(log n) time. But peers that use a more targeted strategy based on message content would be rewarded by preferential treatment of their messages. Peers would learn the specialties of their neighbors by caching their messages and comparing incoming messages with them.
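The two-random-copies claim can be checked with a small synchronous simulation (a sketch only; real peers are asynchronous and would not pick targets uniformly at random):

```python
import random

def rounds_to_reach_all(n, copies=2, seed=0):
    """Each informed peer forwards the message to `copies` random peers
    per round; count rounds until every one of the n peers has it."""
    rng = random.Random(seed)
    informed = {0}  # peer 0 originates the message
    rounds = 0
    while len(informed) < n:
        new = set()
        for _ in informed:
            new.update(rng.randrange(n) for _ in range(copies))
        informed |= new
        rounds += 1
    return rounds

for n in (100, 1000, 10000):
    print(n, rounds_to_reach_all(n))
```

The round count grows roughly logarithmically in n, which is the O(log n) behavior described above; targeted routing replaces this blind flood with far fewer copies.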
>How does the system encourage small players? In the scenarios I've imagined, it seems like big players are still encouraged.
Just like anyone can start a website or a blog now.
-- Matt Mahoney, matma...@yahoo.com
From: swkane <diss...@gmail.com>
To: general-in...@googlegroups.com
Sent: Wednesday, July 27, 2011 2:18 PM
Subject: Re: [GI] Mailpool/Genifer
They might, if they know where to send them. It wouldn't be hard to do using term matching. Even if peers used the very simple strategy of copying messages to everyone they know and discarding duplicates, every message would get to every router. As the network grew, this would no longer work, so peers would need to specialize by topic. Peers would tend to receive targeted messages as well as send them because other peers would learn their specialty.
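The term-matching idea might look like this: compare an incoming message's vocabulary against each neighbor's cached traffic and forward to the best match. A sketch using Jaccard overlap (the peer names and cache layout are made up for illustration):

```python
def terms(text):
    """Crude tokenizer: lowercase word set."""
    return set(text.lower().split())

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def best_neighbor(message, neighbor_cache):
    """Route to the neighbor whose cached messages share the most
    vocabulary with the incoming message."""
    mt = terms(message)
    return max(neighbor_cache,
               key=lambda peer: jaccard(mt, terms(" ".join(neighbor_cache[peer]))))

cache = {
    "compression-expert": ["text compression benchmark results",
                           "new PAQ compression release"],
    "cooking-expert": ["bread recipe with sourdough starter",
                       "how to season a cast iron pan"],
}
print(best_neighbor("which compression benchmark should I use", cache))
# prints "compression-expert"
```

As the network grows, each peer's cache drifts toward its specialty, so this same comparison both routes messages and advertises the peer's expertise.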
Peers can act as sources, sinks, routers, or all three. Peers with user interfaces are obviously useful to their users. It is perhaps less obvious why a peer should provide the service of routing messages. The reason is that routers rank messages before deciding whether to discard them or cache and relay them, and they can charge a fee for higher rankings if they want. Even if they don't charge immediately, they can invest in their reputation by providing this service, which will enable them to charge higher fees later because other peers ranked them highly.
-- Matt Mahoney, matma...@yahoo.com