Early Ideas + Feedback

Walter

Sep 5, 2008, 3:52:10 AM
to Phantom Protocol
Hi all,

Great project.

Here's a random selection of ideas that came to me while reviewing the
PPT and PDF, hope it's of some value.

- Walter

Magnus BrĂ¥ding

Sep 5, 2008, 4:07:37 AM
to phantom-...@googlegroups.com
Hi Walter,

Sounds great indeed. Unfortunately, I could not find any attached file
with your email. Maybe you forgot to attach it, or maybe it was removed
by Google Groups?

If you could please try again, and also CC my private email address this
time, that would be great!

Regards,
Magnus

walter

Sep 5, 2008, 4:18:19 AM
to phantom-...@googlegroups.com
Sorry, I wrote that email from the 'write new page' link on the groups site...

URL is http://groups.google.com/group/phantom-protocol/web/early-ideas-feedback

- Walter

Magnus BrĂ¥ding

Sep 5, 2008, 8:43:09 AM
to phantom-...@googlegroups.com
Ah, thanks, I will take a look at it soon and comment on it!

Regards,
Magnus

Walter

Sep 19, 2008, 11:50:17 PM
to Phantom Protocol
Hi Martin,

Are you still planning to comment on the notes?

Anyway, please give us an update on what's happening.

- Walter

Magnus BrĂ¥ding

Sep 20, 2008, 7:44:59 AM
to phantom-...@googlegroups.com
Hi Walter,

Yes, absolutely! Really sorry for the delay, lots of things going on at
the moment. :-/ I'll try to squeeze it in soon, hopefully this weekend!
I really appreciate your feedback and comments.

Regards,
Magnus

Magnus BrĂ¥ding

Sep 20, 2008, 7:17:09 PM
to phantom-...@googlegroups.com
Hi Walter!

Sorry again for the late reply. Below follow (inline) my full
answers/comments to your questions/comments. I hope I have not
misunderstood them too much; please let me know where I have, and I
will reply again.


> * From the user's perspective
> Until exit nodes become commonplace, people will need access to
> both Phantom and the regular internet. User-side, I'm concerned about
> difficulties in connecting to an alternate and clashing IP address
> space. If the masses can't get on board due to technical complexities,
> then the project will fail straight up. I think this has been a
> contributing factor to the failure of some previous attempts in this
> area to gain a larger userbase. Perhaps the easiest way to tie in this
> sort of functionality with the ignorant regular user's everyday
> internet experience would be to build browser plugins that force the
> browser through the Phantom network. I'm not really familiar with
> browser design but an immediate complexity here would be the potential
> DNS-cache clash between the public internet and the Phantom network.
> (Perhaps Google's multi-process, sandboxed 'Chrome' may be configured to
> offer a Phantom tab, like its incognito one?) Another approach would be
> to limit initial address allocations / generations on Phantom to
> less-used 'non-routable' address spaces reserved by IANA.

Colliding addresses won't be a problem, since an application will either
be on the Phantom network or not. Unlike TOR, you won't be able to surf
to any "normal" website/URL with a Phantom connected browser.

Simple Phantom-connection of arbitrary applications should probably be
done through a simple and cute GUI which lists each (running)
application in the system with a checkbox on the side etc, yes.


> * Centralisation
> The 'signed from a trusted authority (eg: project maintainers)'
> line in the presentation made me wince. If project maintainers can
> arbitrarily blacklist hosts or influence arbitrary (or all) nodes in the
> network, this is basically centralisation by another name. Perhaps
> implementation of such features for network resilience purposes should
> be left out of the design at first?

As mentioned in this section, it absolutely can be left out completely.
But what exactly, in your opinion, are the biggest disadvantages of this
kind of centralization? I'm just curious.


> Another issue is the following:
>
> o Address Selection and Persistence
> How does this work? If using standard IP-style routing
> tables for address allocation, where address ranges are assigned from a
> central authority, is this not inherently lending a traceable structure
> to the network by virtue of address allocation records? Or are purely
> random AP addresses pulled from the potential IPv4 space? What prevents
> the same address being selected by two nodes?

The database will only allow allocation of addresses that do not already
have a (non-expired) entry, which will be easy to check even in a
decentralized database.


> What prevents a different
> node using the same address as a former node, now disconnected, for
> future incoming communications, and advertising itself as that node?

The address entries are protected by asymmetric crypto, so you can only
modify an existing (non-expired) entry if you have the right
asymmetrical signing key.
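
For concreteness, a minimal sketch of that rule (the specification does
not mandate a particular signature algorithm; Ed25519 and all names
below are illustrative assumptions on my part, Python):

    import time
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric import ed25519

    class AddressDatabase:
        """Toy model of the decentralized AP address table."""
        def __init__(self):
            # ap_address -> (public_key, routing_info, expires_at)
            self.entries = {}

        def register(self, ap_address, public_key, routing_info, ttl, signature):
            entry = self.entries.get(ap_address)
            if entry is not None and entry[2] > time.time():
                # A non-expired entry exists: only the holder of the
                # matching signing key may modify it.
                try:
                    entry[0].verify(signature, routing_info)
                except InvalidSignature:
                    raise PermissionError("not the owner of this AP address")
                public_key = entry[0]  # key rotation elided in this sketch
            else:
                # Free or expired address: first come, first served.
                public_key.verify(signature, routing_info)
            self.entries[ap_address] = (public_key, routing_info,
                                        time.time() + ttl)

    # Usage: the owner signs its routing info before publishing it.
    key = ed25519.Ed25519PrivateKey.generate()
    db = AddressDatabase()
    info = b"entry nodes: 1.2.3.4:2000, 5.6.7.8:2000"
    db.register("10.1.2.3", key.public_key(), info, ttl=3600,
                signature=key.sign(info))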


> (Consider v0.7, p.17 - "it should be impossible for any node to know or
> conclude if it is connected directly to a node, or to a routing path
> owned by the same node".)
>
> The use of a random key for each connection would ensure
> maximum anonymity, but could also negate the possibility of
> knowing with any certainty at all the identity of an AP address holder
> across disparate connections over time (without some additional
> supporting infrastructure).
>
> If this is the case, here are some options...
>
> + Trade-off option - each node maintains a secondary
> table of 'virtual' node addresses, whereby the address of a particular
> node is a combination of '(AP address) + (exit node) + (possibly a
> random ID provided by the exit node to identify a particular version of
> an available route to the destination)' instead of just AP address.
>
> This would be quite a trade-off, having at least the
> following effects:
> # minimise the potential for subsequent
> connections to the same destination to be routed to an imposter
> advertising the same address
> # reduce randomness in routing by encouraging
> re-use of the same exit node, thereby affecting anonymity
>
> I don't really like this option as it doesn't provide
> positive node identification and it weakens anonymity in routing.
>
> + Persistent Cryptographic Node Identification
> This would prevent emulating another host with the
> same AP address, requiring second and subsequent connections to the same
> AP to be validated by a cryptographic exchange matching a stored key
> from the last transaction. (Note: After reaching v0.7 p20 I realised
> that the 'communication certificate' and 'path building certificate'
> seem to perform this function. But read on....) However, in the event
> of a failure (ie: node cannot provide correct authentication details),
> this approach poses fairly annoying user interface problems due to the
> likely requirement for immediate, interactive user intervention. Random
> idea: perhaps the initial or final bit in the address space could be
> used to identify 'randomly assigned' versus 'static'-intention
> addresses, whereby those 'randomly assigned' were not burdened with
> persistent cryptographic node identification. Outgoing requests that
> did not require a particular 'source AP' address (ie: almost all
> connections conceivable, perhaps with a manual exclusion config.) could
> have a random source AP generated in order to increase anonymity. Such
> randomly assigned AP routes could then be rapidly or instantly expired
> by nearby routing nodes after the duration of the connection.

As you mention, there is already the "communication certificate" for
positive identification of authentic AP address nodes.

About the "source APs", it's even better already, anonymous outgoing
connections don't even have a "source AP" to begin with, i.e. to be able
to communicate with outbound connections only, you don't need to
allocate an AP address at all!


> * Publishing Connection Entry Points
> Contrary to v0.7 p19, it is worth considering adding a feature
> whereby nodes can request that their IP addresses and ports for
> connection are not published by other nodes. This is important in an
> environment where connection to the network is made illegal. All of the
> text about resisting realtime traffic analysis becomes significantly
> devalued if other nodes go right ahead and publish the fact that you are
> connected.

This can be handled by the node itself, simply by refusing all routing
requests from other nodes (although such a feature might pose a risk of
everyone in general just "freeloading" the protocol in the end, which is
of course a dilemma).


> * (Geo)IP Blacklists
> Another feature worth considering early-on to significantly
> improve legal security for users would be the use of (Geo)IP databases
> to add 'banned countries' and 'banned addresses' with which to filter
> the list of potential nodes to connect to. Similar to the above case, this
> would be very useful (though naturally not 100% reliable) in an
> environment where connection to the network is made illegal by local
> authorities. As an extra step, the client software could have databases
> of hostile regulatory environments mapped to GeoIP blacklists built-in
> such that the above feature (requesting 'do not publish' for the node's
> address) could be automatically enabled if the external IP address of a
> node is detected as one that resides within a legally hostile jurisdiction.

Yes, this would be a nice feature on the client level (which is above
the actual protocol level though, since the protocol only dictates that
each node can select its own routing nodes, however it likes).
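
Such a client-level filter could be as simple as the following sketch
(lookup_country() is a stand-in for any GeoIP database; all names here
are hypothetical, Python):

    # Hypothetical client-level node filter; not part of the protocol.
    BANNED_COUNTRIES = {"XX", "YY"}        # legally hostile jurisdictions
    BANNED_ADDRESSES = {"203.0.113.7"}     # individually banned IPs

    def lookup_country(ip: str) -> str:
        raise NotImplementedError("plug in a GeoIP database here")

    def acceptable_nodes(candidates):
        """Drop candidate routing nodes in banned countries or on the
        banned address list."""
        return [ip for ip in candidates
                if ip not in BANNED_ADDRESSES
                and lookup_country(ip) not in BANNED_COUNTRIES]

    def should_hide_own_address(own_external_ip: str) -> bool:
        """Auto-enable the 'do not publish' request when the node itself
        appears to sit inside a hostile jurisdiction."""
        return lookup_country(own_external_ip) in BANNED_COUNTRIES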


> * Independence from Particular Transports aka Alternate Transport
> Layers (not plain TCP or UDP/internet)
> The great thing about layered networking is that, aside from
> requiring certain general properties in a lower-level transport layer
> (ie: high-level application-level requirements include bandwidth and
> latency, lower-level requirements include 'delivery acknowledgement'),
> individual network layers do not really care how their data gets from A
> to B. This means that it is entirely feasible that certain nodes, for
> example nodes within an environment where connection to the network is
> made illegal, could use entirely different transport layers than TCP or
> UDP over the IPv4 internet. For example, steganography-enabled hidden
> channels, or AP over non-IP transport layers (AP over PPP, AP over
> radio, etc.).
>
> It is then possible that a single node would maintain multiple
> Phantom connections over different transport layers, even prioritising
> their use based upon the application type required. Since at a glance
> it would appear that the transport layers providing the highest value
> for anonymity and deniability, along with the widest possible scope for
> implementation and use at the lowest possible transmission price would
> be those implementing steganography within the context of the public
> internet, let's look at an example there.
>
> For example, a channel implementing an image-based steganographic
> channel that uploads JPEGs to a public webserver, while slower and
> computationally more expensive than standard TCP or UDP over the public
> internet, would still be fine for store and forward protocols such as
> email or logging. It is conceivable that an AP node could route email
> across such a channel by detecting outgoing connections to an AP node on
> port 25, possibly implementing a local proxy to transcend likely
> challenges related to transparent software anonymisation (based upon
> connection timeouts built in to traditional email server applications).
>
> Steganographic channel style P2P protocols have been developed in
> the past, for example the Six/Four system by Mixter described by Oxblood
> Ruffin at http://www.cultdeadcow.com/cDc_files/cDc-0384.php and
> available for download from http://sourceforge.net/projects/sixfour/ -
> NB: I haven't actually looked at this myself.

Yes, this is indeed a lower-level feature that can be added for higher
security at a later point, once a basic implementation of the protocol
is ready. There are countless such improvements that can be
added over time too, and it will be very interesting to talk more about
them once a basic prototype is ready!


> * Route Building
> Firstly, when the selected nodes fail to build a connection (for
> example, due to firewalling between two nodes), how is this handled?
>
> o 'Transport Class' Idea to add Resiliency / Extensibility
> To make route-building more resilient and the network in
> general more flexible / adaptable while maintaining
> backwards-compatibility, I propose that the 'setup package' (v0.7 p.21)
> be enhanced to include support for alternate transport layers (described
> above), and multiple connection records for the next host in the
> sequence (item 5.2). I have called these 'transport classes', since
> similar transports may be configured under one 'transport class' using
> the mechanism proposed below.
>
> Essentially, where item 5.2 on page 21 (v0.7) currently
> allows only a single IP address and port number to define the 'next
> host', I propose that a set of one or more connection records be provided.
>
> Initial header record.
> + Record length
> This specifies the total number of bytes taken up by
> individual connection records.
>
>
>
> Individual connection records
> + Transport class identifier (2 bytes?)
> This identifies the connection mechanism on the
> lower-level network. 0 = standard TCP or UDP over IP, other values in
> future. The spare address range would be divided between an 'officially
> assigned' range and a 'do as you please' range to account for secret
> and/or steganographic transports that may be of questionable use.
> + Record length (2 bytes)
> Number of bytes used by this connection record.
> + Transport-specific data (arbitrary)
> This data would include all information necessary for
> this transport type. For instance, for transport class 0, this would
> include IP address and port number (maybe protocol) for standard TCP or
> UDP-based connections over public IPv4 infrastructure.
>
> The benefits of this approach are as follows:
> + frees Phantom from artificial limitation to IPv4
> infrastructure
> + provides real resilience to established monitoring
> technologies by enabling custom, rare or outdated connectivity options
> + failed connections can be disregarded and a list of
> alternatives tried immediately, without requiring a complete
> re-negotiation of the route involving the original source node
> + provides mechanism for expanding to IPv6
> infrastructure (already a planned future expansion)
> + allows for steganographic transports, both
> 'registered' and unregistered (if of legally questionable nature, or
> obscurity is desired), while minimising the chance of clashes by
> providing the option for transport class identifier registration
>
> The approach would also require modifying published node
> lists to include transport class identifiers, and re-structuring them to
> provide IP address and port number (and possibly protocol) as 'transport
> class identifier specific' information.
>
> Example of a possible new node list format (eg: AP/IPv4,
> AP/PPP, AP/SMS, AP/stego-email, AP/non UDP/TCP-based hidden tunnel).
> 0:1.2.3.4:tcp:222
> 18:+123456789:maybe_user:maybe_pass
> 39:+12345679023
>
> 19:steganogra...@transport-class.net.justanexample:optionalseed_or_optionalpassword
> 40:1.2.3.4
>
> Of course some of these suggested mechanisms
> (phones/mobiles) may greatly decrease anonymity in some situations... or
> may cost money ... it is up to the node or its policy to decide which to
> use.
>
> This approach would also require modifying the 'expected
> previous node' (v0.7 p.21 5.1) to be something transport class
> independent, along similar lines to the above modification.
>
> Overall though, I think heterogeneity is a good thing!

Yes, such a thing might very well be a good idea for "the next level",
after the basic protocol implementation is complete (I'm afraid that if
too many fancy/complex features are added before a basic implementation
is complete, there will never be any complete, even basic,
implementation at all, so that's why I've been wanting to keep it as
basic as possible to begin with, taking only security rather than this
kind of practical matter into account, in the first revision of the
specification).
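
For later reference though, the records you propose map naturally onto
a simple type-length-value encoding. A sketch, assuming the 2-byte
fields from your proposal, and taking the record length to count only
the transport-specific data (an assumption on my part; Python):

    import struct

    def pack_record(transport_class: int, data: bytes) -> bytes:
        # class id (2 bytes) + record length (2 bytes) + transport data
        return struct.pack("!HH", transport_class, len(data)) + data

    def unpack_records(buf: bytes):
        """Yield (transport_class, data) pairs. Unknown classes can be
        skipped thanks to the per-record length field, which is what
        makes the scheme backwards-compatible."""
        offset = 0
        while offset < len(buf):
            transport_class, length = struct.unpack_from("!HH", buf, offset)
            offset += 4
            yield transport_class, buf[offset:offset + length]
            offset += length

    # e.g. class 0 = TCP/UDP over IPv4, class 18 as in your node list example
    records = (pack_record(0, b"1.2.3.4:tcp:222")
               + pack_record(18, b"+123456789:maybe_user:maybe_pass"))
    assert [c for c, _ in unpack_records(records)] == [0, 18]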


> * Independence from Particular Encryption Mechanisms + Key Lengths
> It would probably be a good idea to build into the protocol at
> an early stage (ie: NOW) the ability to use alternative key lengths
> and/or encryption algorithms. This could be done in a similar manner to
> 'transport classes' above, but with 'encryption classes' whereby a
> particular algorithm could be specified, as well as a key length and/or
> other options.
> o If an algorithm is broken, the entire network is not
> compromised
> o When an algorithm is publicly broken, people can easily
> change over, without requiring a complete client re-install, which in
> some circumstances may not be possible or desired. (Actually, this
> would be a good scenario for the 'broadcast signed message to network'
> idea that I wasn't so keen on - see 'Centralisation' above... though I'm
> still not sure I'd agree that it's a good idea.)

Probably good for the "next level" after the basic level implementation
too. It should be noted that there are quite large non-obvious problems
related to sending non-random data (i.e. outside any encryption) in any
form between nodes before a complete path/route is formed though, due to
information piggybacking risks between non-adjacent nodes, so such
things must be very well thought through before implementation.
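
That said, the "encryption class" idea itself could stay very small;
the critical part, per the above, is that any negotiation happens
inside already-encrypted channels. A shape-only sketch (the algorithm
table and all names are illustrative assumptions, Python):

    # Hypothetical encryption class registry: an id selects an algorithm
    # and key length, so a broken algorithm can be retired without a
    # protocol redesign or a complete client re-install.
    ENCRYPTION_CLASSES = {
        0: ("AES", 128),
        1: ("AES", 256),
        # further ids reserved for future/replacement algorithms
    }

    def negotiate(peer_classes):
        """Pick the strongest class both sides support."""
        common = set(ENCRYPTION_CLASSES) & set(peer_classes)
        if not common:
            raise ValueError("no common encryption class")
        return max(common, key=lambda c: ENCRYPTION_CLASSES[c][1])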


> * Speed of Route Negotiation
> Whilst certain parts of the world have the luxury of fast
> internet connections, it may be worthwhile to consider those without at
> the protocol design stage. The time taken to sequentially communicate
> with each node in a proposed route will seriously add up for people on
> higher latency connections (satellite, phone, small countries with poor
> internet connections and overbearing governments, etc). An obvious
> consideration, then, is parallel route negotiation rather than
> sequential, which would certainly offer significant speed improvements
> for reasonable-length paths. See also 'routing tunnel failures', below.

Route establishments will only be one-time events, which makes their time
less important. There is nothing stopping parallel creation of several
paths in the current design though (it's even intended actually), you
just can't _communicate_ data from the same connection over multiple
paths. Latency is much less of a goal than throughput for the protocol
though, which also makes this less important. Please let me know if I
misunderstood your question/suggestion though.


> * Diminishing Returns on Route Length?
> Increasing the length of outbound routes increases anonymity on
> the transport infrastructure (ie: public IPv4, or alternative if
> suggestion above is taken). But it also increases the risk of
> established routes disappearing due to nodes going down, and the
> dissemination of knowledge that communication between two named nodes is
> occurring.

Yes, this is an inherent problem/tradeoff for anyone wanting higher
theoretical anonymity.


> * Routing Tunnel Failures
> If a routing tunnel fails, how easily can existing operating
> system networking stacks be told to increase timeouts or even prompted
> for retransmission in order to maintain a higher-level (eg: TCP)
> connection? If a new routing tunnel is established within a second,
> then this may 'just work' with existing stacks and higher-level
> connections may be preserved, but if it takes a long time, this could be
> a problem. This is another reason to consider parallel route
> negotiation (see above).

Please see the "high availability" routing path design info in the white
paper. Normal routing paths are not designed to handle node failures at all.


> * Possible Attacks
>
> o Use of Latency to Ascertain Information About Routing Paths
> No easy solution to this one?

Not short of adding artificial latency, which is not desired though, as
is also mentioned in the white paper. I don't think critically
conclusive evidence (the kind that would hold up in court etc.) can be
collected this way, though.


> o Transport-Layer Traffic Analysis for Suspected Node
> Identification
> The identity of a suspected node can be confirmed by
> analysing the transport infrastructure (IPv4 connections in the v0.7
> document) around the suspected node, and issuing various connection
> requests to that node. Even assuming a near-omnipotent government
> opponent, the feasibility of such attacks should be lowered as the
> network grows.

Yes, correct, and also an accepted risk.


> o Network Flooding Combined with Assuming Another AP
> As traffic increases, nodes with lower bandwidth will tend
> to reject new routing paths. Well funded attackers (eg: governments)
> could run very high bandwidth nodes, which would result in more traffic
> being routed through those nodes, and then pretend to be other AP
> addresses. This is possible due to no mechanism for securely
> identifying nodes' identities. (Consider v0.7, p.17 - "it should be
> impossible for any node to know or conclude if it is connected directly
> to a node, or to a routing path owned by the same node".). Granted, the
> public internet infrastructure has the same vulnerability, but there is
> a degree of physical security and trust assumed that increases as data
> travels towards long-distance carrier companies from the end user - in
> Phantom v0.7, this does not exist.

Yes, and this is one reason for the "centralized" banning of certain IP
addresses. It might still require quite high resources to be successful
in such an attack though.


> * Public IPv4 Infrastructure + Phantom Bridging and Security
> It should be possible to define a URL scheme for Phantom that
> allows browsers equipped with plugins to offer links between standard
> internet infrastructure and Phantom. Such a scheme essentially results
> in a 'shared secret' scenario even before nodes have communicated across
> the network, which means that:
>
> o The hash of a public key that identifies the node could be
> provided as part of the link
>
> o Additional (third-party) keyserver nodes could be specified
> as 'external references' to the public key of the target node, with
> which the cryptographic key's hash could also be verified.
>
> Armed with such a public key, when the client node attempts to
> establish a routing path to the target node, it could use this
> information to reject imposters. Ideally, decentralised and strictly
> 'opt-in' key authorities could be defined to address the same issue
> (outlined twice above, with reference to v0.7, p.17 - "it should be
> impossible for any node to know or conclude if it is connected directly
> to a node, or to a routing path owned by the same node"). This would
> require some further modifications to the protocol, and would require
> that the key authorities to which the node had 'opted in' had their own
> cryptographic identifiers stored on the client as well.
>
> o Protocol Modifications for Secure / Persistent
> Identification of AP nodes
> It appears to me that to support persistent identification
> at least the following modifications would be required.
> + Routing path establishment should include (optional?)
> provision of public keys (against which a hash check can be made)
> + Since the system is decentralised and there is no way
> to resolve AP address collisions with any authority, the hash of the
> node's cryptographic key could be included as an adjunct to the target
> AP address in all outgoing network communications, and should be stored
> with all AP routing information
> + Individual nodes should remember the hashes of public
> keys of nodes they have previously communicated with, and give the user
> some feedback if a check fails (possible user interface issue...)

I'm not sure I fully understand this suggestion, but the public keys for
each AP address are stored in the network database along with the
routing information for the AP address, so no servers should really be
required? Also, there is no bridging between the IPv4 address space and
the AP address space to begin with, which makes me a little confused?

Also, you might potentially have misunderstood the v0.7, p.17 - "it
should be impossible for any node to know or conclude if it is connected
directly to a node, or to a routing path owned by the same node" quote,
since you keep mentioning it in connection with AP address
authentication/identification. The quote refers rather to the outer
level communication, while the authentication of the endpoint nodes will
be done with separate keys over a single connection _inside_ the
combined outer connections between nodes, see what I mean?


Finally, thanks a lot for your questions and suggestions, please keep
them coming!

Regards,
Magnus (which is not the same as Martin btw ;-))

walter

Sep 21, 2008, 2:28:51 AM
to phantom-...@googlegroups.com
>> * From the user's perspective
>> Until exit nodes become commonplace, people will need access to
>> both Phantom and the regular internet. User-side, I'm concerned about
>> difficulties in connecting to an alternate and clashing IP address
>> space.
>
> Colliding addresses won't be a problem, since an application will either
> be on the Phantom network or not. Unlike TOR, you won't be able to surf
> to any "normal" website/URL with a Phantom connected browser.
>
> Simple Phantom-connection of arbitrary applications should probably be
> done through a simple and cute GUI which lists each (running)
> application in the system with a checkbox on the side etc, yes.

Hrrm, interesting approach. I had no idea how this would work by
looking at the paper and presentation. This unavoidably brings
discussion back to application-level stuff that lies outside of
the raw protocol spec... sorry :)

Potential issues with this approach:
- some applications (eg: browsers) will cache data regarding the
network (eg: DNS resolutions, actual content) across both Phantom
and non-Phantom connections, so there is still a cross-over of
data at the application level if you adopt this strategy.
- applications with their own IP-level settings (DNS servers, proxy
servers, IM servers, etc.) will suffer from the requirement to
re-configure if they change from IP to Phantom and back again.
- it's very annoying for the user to have to have two copies of an
application running to access both the regular IPv4 address
space and the phantom network (and wastes resources)

Of course there are options for minimising these issues, such as
automatic application-specific reconfiguration/cache flushing,
running Phantom-connected applications in a dedicated VM or
maintaining multiple copies of a given application, though perhaps
it is worth taking a step back and looking at other strategies.

Brain dump of some ideas that come to mind:
- extending major applications such as browsers to include some
form of Phantom awareness
- extending implementations to include a DNS server
that always resolves a particular DNS suffix (eg: .phantom)
via Phantom, then dynamically creating an application-specific
proxy port on the local interface that redirects the application
across the Phantom network. Alternatively, with some more
hacking, maybe a 127.x.x.x range address could be mapped
to route arbitrary protocols' traffic in both directions? This
would allow regular IPv4 and phantom to work from a single
instance of an arbitrary application.
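
To make the second idea a bit more concrete, a rough sketch
(start_local_proxy() is a placeholder for the actual Phantom bridging,
and the suffix is only an example; Python):

    import socket

    PHANTOM_SUFFIX = ".phantom"

    def resolve(hostname: str) -> str:
        """Names under the reserved suffix get a loopback address whose
        listener tunnels into Phantom; everything else resolves normally,
        so one application instance can reach both networks."""
        if hostname.endswith(PHANTOM_SUFFIX):
            ap_address = hostname[:-len(PHANTOM_SUFFIX)]
            return start_local_proxy(ap_address)   # e.g. "127.0.0.2"
        return socket.gethostbyname(hostname)      # untouched IPv4 path

    def start_local_proxy(ap_address: str) -> str:
        # Would bind a loopback listener and forward its traffic over a
        # Phantom routing path to ap_address; elided here.
        raise NotImplementedError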

>> * Centralisation
>> The 'signed from a trusted authority (eg: project maintainers)'
>> line in the presentation made me wince. If project maintainers can
>> arbitrarily blacklist hosts or influence arbitrary (or all) nodes in the
>> network, this is basically centralisation by another name. Perhaps
>> implementation of such features for network resilience purposes should
>> be left out of the design at first?
>
> As mentioned in this section, it absolutely can be left out completely.
> But what exactly, in your opinion, are the biggest disadvantages of this
> kind of centralization? I'm just curious.

It assumes a degree of trust in the authority. If the authority is
compromised, the network is to some extent compromised.
Also, it makes the authority a target of legal and technical attacks.

>> o Address Selection and Persistence
>> How does this work? If using standard IP-style routing
>> tables for address allocation, where address ranges are assigned from a
>> central authority, is this not inherently lending a traceable structure
>> to the network by virtue of address allocation records? Or are purely
>> random AP addresses pulled from the potential IPv4 space? What prevents
>> the same address being selected by two nodes?
>
> The database will only allow allocation of addresses that do not already
> have a (non-expired) entry, which will be easy to check even in a
> decentralized database.

Aha. Makes sense. However, it also raises the question of expiry
rules, which will be critical to prevent impersonation.

>> What prevents a different
>> node using the same address as a former node, now disconnected, for
>> future incoming communications, and advertising itself as that node?
>
> The address entries are protected by asymmetric crypto, so you can only
> modify an existing (non-expired) entry if you have the right
> asymmetrical signing key.

That's fine for existing entries for nodes with access to them, but is it
somehow possible to provide false information to new nodes with no
knowledge of the 'real' entries?

If possible and not addressed already, perhaps the protocol description
could include a recommendation that the implementation should
validate shared database information by connecting to and comparing
data from multiple sources.

>> * Independence from Particular Transports aka Alternate Transport
>> Layers (not plain TCP or UDP/internet)
>

> Yes, this is indeed a lower-level feature that can be added for higher
> security at a later point, once a basic implementation of the protocol
> is ready. There are countless such improvements that can be
> added over time too, and it will be very interesting to talk more about
> them once a basic prototype is ready!

> ( ... snip ... )


> Yes, such a thing might very well be a good idea for "the next level",
> after the basic protocol implementation is complete (I'm afraid that if
> too many fancy/complex features are added before a basic implementation
> is complete, there will never be any complete, even basic,
> implementation at all, so that's why I've been wanting to keep it as
> basic as possible to begin with, taking only security rather than this
> kind of practical matter into account, in the first revision of the
> specification).

I think leaving space for it in the original protocol design costs very
little and provides the possibility for persistence across early and later
implementations of Phantom. At least to me, lower-layer independence
is one of the most intelligent features in popular networking protocols -
why not add it now? (Do your points below about 'information
piggybacking' have some bearing here?)

>> * Independence from Particular Encryption Mechanisms + Key Lengths
>

> Probably good for the "next level" after the basic level implementation
> too. It should be noted that there are quite large non-obvious problems
> related to sending non-random data (i.e. outside any encryption) in any
> form between nodes before a complete path/route is formed though, due to
> information piggybacking risks between non-adjacent nodes, so such
> things must be very well thought through before implementation.

OK. I'm not really clear about this. Would you mind drawing out
a dumbed-down example of this risk and how it is mitigated in the
existing design?

>> * Speed of Route Negotiation

> Route establishments will only be one-time events, which makes their time
> less important. There is nothing stopping parallel creation of several
> paths in the current design though (it's even intended actually), you
> just can't _communicate_ data from the same connection over multiple
> paths.
>
> Latency is much less of a goal than throughput for the protocol
> though, which also makes this less important.

OK, point taken about this being a lesser concern; however, because
high latencies may impact the types of applications that can be
deployed on the network, it should still remain a consideration.

>> o Network Flooding Combined with Assuming Another AP
>> As traffic increases, nodes with lower bandwidth will tend
>> to reject new routing paths. Well funded attackers (eg: governments)
>> could run very high bandwidth nodes, which would result in more traffic
>> being routed through those nodes, and then pretend to be other AP
>> addresses. This is possible due to no mechanism for securely
>> identifying nodes' identities. (Consider v0.7, p.17 - "it should be
>> impossible for any node to know or conclude if it is connected directly
>> to a node, or to a routing path owned by the same node".). Granted, the
>> public internet infrastructure has the same vulnerability, but there is
>> a degree of physical security and trust assumed that increases as data
>> travels towards long-distance carrier companies from the end user - in
>> Phantom v0.7, this does not exist.
>
> Yes, and this is one reason for the "centralized" banning of certain IP
> addresses. It might still require quite high resources to be successful
> in such an attack though.

Although this sounds like a nice case for centralised network trust,
such bans would require some form of human intervention, which
inherently reduces their responsiveness and effectiveness. A 'bad
node' would succeed by default until both noticed and banned. I still
believe that including this mechanism at the protocol level is a bad
idea, as it does not really provide any quantifiable security whilst
giving the ability to manipulate traffic flow network-wide to selected
individuals.

>> * Public IPv4 Infrastructure + Phantom Bridging and Security

> I'm not sure I fully understand this suggestion

Forget it for now, this was basically a thought-train about leveraging
the possibilities of URL handlers to add a higher degree of
persistence to AP nodes than is provided by the network database,
while enabling a single browser instance to access content on
both the IPv4 network and phantom. The persistence idea could
still probably be useful but can be implemented later and does not
affect the protocol. Regarding URL handlers, I now prefer the idea
described above regarding DNS/routing hacks - it's a much cooler
way to get IPv4/Phantom access happening in single instances of
arbitrary applications (not just browsers).

> Also, you might potentially have misunderstood the v0.7, p.17 - "it
> should be impossible for any node to know or conclude if it is connected
> directly to a node, or to a routing path owned by the same node" quote,
> since you keep mentioning it in connection with AP address
> authentication/identification. The quote refers rather to the outer
> level communication, while the authentication of the endpoint nodes will
> be done with separate keys over a single connection _inside_ the
> combined outer connections between nodes, see what I mean?

This is shorter-term.

I was thinking about the larger potential issue that if the network
database loses a particular AP address for a while, and someone
later re-registers it with a different key, then it could be an
impersonator.

While cryptographically the impersonator's identity can be assured
to be that corresponding to the public key in the network database
at the time of connection, there is no persistence across multiple
registrations for an AP in the network database, and no
address-allocation hierarchy, so there is nothing stopping this
attack. Secondly, since at the application level only the
IP address (and not the corresponding public key) is provided in
order to make a connection, there is no way to ensure that the
user's intended destination is that which they are actually
communicating with.

The inclusion of a DNS server in an implementation (as outlined
above) would provide one great way to solve this problem, with
the potential for md5sumofpublickey.1.2.3.4.phantom or some
similar resolution trick to instruct the implementation to
perform a check that the current network-database has the
same public key registered for the target address as it did
when the link was created.
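
In sketch form, such a check might look like this (lookup_public_key()
is a hypothetical stand-in for the network database query; Python):

    import hashlib

    def verify_link(hostname: str, lookup_public_key) -> str:
        """Check a link of the form '<md5-of-public-key>.1.2.3.4.phantom':
        the key currently registered for the AP address must still match
        the fingerprint embedded when the link was created."""
        labels = hostname.split(".")
        fingerprint = labels[0]
        ap_address = ".".join(labels[1:5])
        current_key = lookup_public_key(ap_address)  # bytes, from the database
        if hashlib.md5(current_key).hexdigest() != fingerprint:
            raise ValueError("key changed since link creation; "
                             "possible impersonator at " + ap_address)
        return ap_address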

- Walter

PS: Sorry about messing up your name!

Magnus BrĂ¥ding

Nov 17, 2008, 8:18:09 PM
to phantom-...@googlegroups.com
Hello Walter,

Some answers for you below:

>> Simple Phantom-connection of arbitrary applications should probably be
>> done through a simple and cute GUI which lists each (running)
>> application in the system with a checkbox on the side etc, yes.
>
> Hrrm, interesting approach. I had no idea how this would work by
> looking at the paper and presentation. This unavoidably brings
> discussion back to application-level stuff that lies outside of
> the raw protocol spec... sorry :)
>
> Potential issues with this approach:
> - some applications (eg: browsers) will cache data regarding the
> network (eg: DNS resolutions, actual content) across both Phantom
> and non-Phantom connections, so there is still a cross-over of
> data at the application level if you adopt this strategy.

I have never seen DNS being cached between restarts of a browser, and
the cache collisions could be manually handled by cache flushes in the
worst case I guess.


> - applications with their own IP-level settings (DNS servers, proxy
> servers, IM servers, etc.) will suffer from the requirement to
> re-configure if they change from IP to Phantom and back again.

Reconfigure? What exactly do you mean?


> - it's very annoying for the user to have to have two copies of an
> application running to access both the regular IPv4 address
> space and the phantom network (and wastes resources)

If they want to access both networks at the same time, I think it's a
reasonable start to have one copy of the application running for each
network. It could be fixed with special tricks and solutions later on I
guess though.


> Of course there are options for minimising these issues, such as
> automatic application-specific reconfiguration/cache flushing,
> running Phantom-connected applications in a dedicated VM or
> maintaining multiple copies of a given application, though perhaps
> it is worth taking a step back and looking at other strategies.
>
> Brain dump of some ideas that come to mind:
> - extending major applications such as browsers to include some
> form of Phantom awareness

Sure, they (the browser makers etc) are indeed welcome to do this.


> - extending implementations to include a DNS server
> that always resolves a particular DNS suffix (eg: .phantom)
> via Phantom, then dynamically creating an application-specific
> proxy port on the local interface that redirects the application
> across the Phantom network.

A DNS server would make a scary "central point", both from an
availability perspective and anonymity perspective.


> Alternatively, with some more
> hacking, maybe a 127.x.x.x range address could be mapped
> to route arbitrary protocols' traffic in both directions? This
> would allow regular IPv4 and phantom to work from a single
> instance of an arbitrary application.

Yes, this is a candidate for the "special tricks and solutions" I
mention above, sure.


>>> * Centralisation
>>> The 'signed from a trusted authority (eg: project maintainers)'
>>> line in the presentation made me wince. If project maintainers can
>>> arbitrarily blacklist hosts or influence arbitrary (or all) nodes in the
>>> network, this is basically centralisation by another name. Perhaps
>>> implementation of such features for network resilience purposes should
>>> be left out of the design at first?
>> As mentioned in this section, it absolutely can be left out completely.
>> But what exactly, in your opinion, are the biggest disadvantages of this
>> kind of centralization? I'm just curious.
>
> It assumes a degree of trust in the authority. If the authority is
> compromised, the network is to some extent compromised.
> Also, it makes the authority a target of legal and technical attacks.

Legal attacks: Maybe to some extent, but as I emphasize in the paper, no
one theoretically has to know who this authority is (deniability), not
even the project maintainers; it could just be an "anonymous good
shepherd", whose public key is allowed to be left in the source code by
the project maintainers.

Technical attacks: I don't see how?


>>> o Address Selection and Persistence
>>> How does this work? If using standard IP-style routing
>>> tables for address allocation, where address ranges are assigned from a
>>> central authority, is this not inherently lending a traceable structure
>>> to the network by virtue of address allocation records? Or are purely
>>> random AP addresses pulled from the potential IPv4 space? What prevents
>>> the same address being selected by two nodes?
>> The database will only allow allocation of addresses that do not already
>> have a (non-expired) entry, which will be easy to check even in a
>> decentralized database.
>
> Aha. Makes sense. However, it also raises the question of expiry
> rules, which will be critical to prevent impersonation.

Yes, this is indeed an interesting problem to be discussed much more.
Any suggestions?


>>> What prevents a different
>>> node using the same address as a former node, now disconnected, for
>>> future incoming communications, and advertising itself as that node?
>> The address entries are protected by asymmetric crypto, so you can only
>> modify an existing (non-expired) entry if you have the right
>> asymmetrical signing key.
>
> That's fine for existing entries for nodes with access to them, but is it
> somehow possible to provide false information to new nodes with no
> knowledge of the 'real' entries?

People who want to be sure of the identity of someone they connect to
for the first time should make sure to validate the certificate (or at
least the certificate fingerprint) against one received from a trusted
source, yes.


> If possible and not addressed already, perhaps the protocol description
> could include a recommendation that the implementation should
> validate shared database information by connecting to and comparing
> data from multiple sources.

I'm not sure what you mean here? Please elucidate.


>>> * Independence from Particular Transports aka Alternate Transport
>>> Layers (not plain TCP or UDP/internet)
>> Yes, this is indeed a lower-level feature that can be added for higher
>> security at a later point, once a basic implementation of the protocol
>> is ready. There are countless such improvements that can be
>> added over time too, and it will be very interesting to talk more about
>> them once a basic prototype is ready!
>> ( ... snip ... )
>> Yes, such a thing might very well be a good idea for "the next level",
>> after the basic protocol implementation is complete (I'm afraid that if
>> too many fancy/complex features are added before a basic implementation
>> is complete, there will never be any complete, even basic,
>> implementation at all, so that's why I've been wanting to keep it as
>> basic as possible to begin with, taking only security rather than this
>> kind of practical matter into account, in the first revision of the
>> specification).
>
> I think leaving space for it in the original protocol design costs very
> little and provides the possibility for persistence across early and later
> implementations of Phantom. At least to me, lower-layer independence
> is one of the most intelligent features in popular networking protocols -
> why not add it now? (Do your points below about 'information
> piggybacking' have some bearing here?)

Sure, but I think there is already "room in the specification" for this,
through the abstraction. I only go as low as the SSL level (or even "SSL
equivalent"), and this can be implemented over UDP or whatever too.
Sorry for not making that as clear as it could have been.


>>> * Independence from Particular Encryption Mechanisms + Key Lengths
>> Probably good for the "next level" after the basic level implementation
>> too. It should be noted that there are quite large non-obvious problems
>> related to sending non-random data (i.e. outside any encryption) in any
>> form between nodes before a complete path/route is formed though, due to
>> information piggybacking risks between non-adjacent nodes, so such
>> things must be very well thought through before implementation.
>
> OK. I'm not really clear about this. Would you mind drawing out
> a dumbed-down example of this risk and how it is mitigated in the
> existing design?

If a node in a routing path can undetectedly modify or control just 32
bits of data _anywhere_ in all the data being sent to any non-adjacent
node in the routing path, it could covertly convey its IP address to any
conspiring nodes in the routing path, and thus communicate with them
out-of-band from that point on. That would make it possible for them to
combine their knowledge of the node IP addresses included in the routing
path, which would in turn reduce or even completely eliminate the
anonymizing function of the routing path. This should be described
further in the paper too, I'm almost certain.
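
To illustrate why 32 bits is exactly the dangerous threshold: an IPv4
address fits into 32 covertly controllable bits (a sketch, Python):

    import socket
    import struct

    def embed(ip: str) -> int:
        """A malicious node could hide its IPv4 address in any 32 bits
        it controls undetected, e.g. padding or 'random'-looking fields."""
        return struct.unpack("!I", socket.inet_aton(ip))[0]

    def extract(bits: int) -> str:
        return socket.inet_ntoa(struct.pack("!I", bits))

    assert extract(embed("198.51.100.23")) == "198.51.100.23"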


>>> * Speed of Route Negotiation
>> Route establishments will only be one-time events, which makes their time
>> less important. There is nothing stopping parallel creation of several
>> paths in the current design though (it's even intended actually), you
>> just can't _communicate_ data from the same connection over multiple
>> paths.
>>
>> Latency is much less of a goal than throughput for the protocol
>> though, which also makes this less important.
>
> OK, point taken about this being a lesser concern; however, because
> high latencies may impact the types of applications that can be
> deployed on the network, it should still remain a consideration.

Sure, it should indeed be optimized as long as it doesn't affect
security or throughput. All quality factors should always be taken into
consideration, some of them just must remain prioritized at all times.


>>> o Network Flooding Combined with Assuming Another AP
>>> As traffic increases, nodes with lower bandwidth will tend
>>> to reject new routing paths. Well funded attackers (eg: governments)
>>> could run very high bandwidth nodes, which would result in more traffic
>>> being routed through those nodes, and then pretend to be other AP
>>> addresses. This is possible due to no mechanism for securely
>>> identifying nodes' identities. (Consider v0.7, p.17 - "it should be
>>> impossible for any node to know or conclude if it is connected directly
>>> to a node, or to a routing path owned by the same node".). Granted, the
>>> public internet infrastructure has the same vulnerability, but there is
>>> a degree of physical security and trust assumed that increases as data
>>> travels towards long-distance carrier companies from the end user - in
>>> Phantom v0.7, this does not exist.
>> Yes, and this is one reason for the "centralized" banning of certain IP
>> addresses. It might still require quite high resources to be successful
>> in such an attack though.
>
> Although this sounds like a nice case for centralised network trust,
> such bans would require some form of human intervention, which
> inherently reduces their responsiveness and effectiveness. A 'bad
> node' would succeed by default until both noticed and banned. I still
> believe that including this mechanism at the protocol level is a bad
> idea, as it does not really provide any quantifiable security whilst
> giving the ability to manipulate traffic flow network-wide to selected
> individuals.

Sure, humans are slow, but what do you suggest would be a better
solution, other than removing the humans _too_, thus leaving _no_
solution to this problem rather than a non-optimal one?

Please see my comment on central DNS servers and the security of
previously unconnected nodes' identities above. In addition to this,
yes, the addresses could be composed like that, but the problem is that
it would break the backward compatibility with IP, which is one of the
main design objectives. I rather chose to enable people to perform the
certificate fingerprint check semi-manually themselves instead, if they
wanted (which could indeed be done by entering an address of your
suggested format "md5sumofpublickey.1.2.3.4" into any Phantom-aware
application).


Thanks again for your input and ideas Walter, please keep them coming!
(and sorry again for forgetting to reply; it's just that when emails
become too long, you put off answering them until "a better time", and
then eventually, if they are longer than your suitable "better
times" are, you easily forget them, and that's exactly what has happened
here :-/)

Regards,
Magnus

walter

Nov 19, 2008, 10:47:02 AM
to phantom-...@googlegroups.com
>>> Simple Phantom-connection of arbitrary applications should probably be
>>> done through a simple and cute GUI which lists each (running)
>>> application in the system with a checkbox on the side etc, yes.
>>
>> Hrrm, interesting approach. I had no idea how this would work by
>> looking at the paper and presentation. This unavoidably brings
>> discussion back to application-level stuff that lies outside of
>> the raw protocol spec... sorry :)
>>
>> Potential issues with this approach:
>> - some applications (eg: browsers) will cache data regarding the
>> network (eg: DNS resolutions, actual content) across both Phantom
>> and non-Phantom connections, so there is still a cross-over of
>> data at the application level if you adopt this strategy.
>
> I have never seen DNS being cached between restarts of a browser, and
> the cache collisions could be manually handled by cache flushes in the
> worst case I guess.

I'm not suggesting cross-session caching. I mean that if you start the
app, switch to the cute GUI, and click the tick, then whatever's been
accessed so far will be cached.

>> - applications with their own IP-level settings (DNS servers, proxy
>> servers, IM servers, etc.) will suffer from the requirement to
>> re-configure if they change from IP to Phantom and back again.
>
> Reconfigure? What exactly do you mean?

I mean that if you store an IP address as part of a program's
configuration file, regardless of its purpose, then in order to
cope with 'Phantom enabled' and regular internet operations
you will need to have either two instances of the program, or
two instances of the configuration and an application specific
mechanism for switching between them.

>> - it's very annoying for the user to have to have two copies of an
>> application running to access both the regular IPv4 address
>> space and the phantom network (and wastes resources)
>
> If they want to access both networks at the same time, I think it's a
> reasonable start to have one copy of the application running for each
> network. It could be fixed with special tricks and solutions later on I
> guess though.

Yes, it's a reasonable start, but it would be nice to think of some
cool hacks to make things easy for the user as soon as possible,
as I believe this is critical for widespread adoption. Greater
adoption brings greater utility, which in turn feeds adoption.
Reaching the 'critical mass' of utility (ie: peers to communicate
with / servers or data available on the Phantom network) is
one of, if not the, largest challenges for this kind of project. Things
like Freenet failed dismally here. Many smaller P2P networks
were the same.

>> Of course there are options for minimising these issues, such as
>> automatic application-specific reconfiguration/cache flushing,
>> running Phantom-connected applications in a dedicated VM or
>> maintaining multiple copies of a given application, though perhaps
>> it is worth taking a step back and looking at other strategies.
>>
>> Brain dump of some ideas that come to mind:
>> - extending major applications such as browsers to include some
>> form of Phantom awareness
>
> Sure, they (the browser makers etc) are indeed welcome to do this.

While this is possible, it is not a very elegant solution, as it kind of
short-circuits the 'just like IP' / 'drop in and run' benefit that the
protocol otherwise holds.

>> - extending implementations to include a DNS server
>> that always resolves a particular DNS suffix (eg: .phantom)
>> via Phantom, then dynamically creating an application-specific
>> proxy port on the local interface that redirects the application
>> across the Phantom network.
>
> A DNS server would make a scary "central point", both from an
> availability perspective and anonymity perspective.

Perhaps you didn't understand my suggestion - this would be a
local application on each user's system - part of a 'Phantom
network connectivity toolkit', and entirely optional. It would
simply facilitate connecting legacy applications to the Phantom
network in a reliable way, with no 'regular internet' cross-over,
by triggering the legacy applications to use a local proxy that
took care of the Phantom bridging.

>>>> * Centralisation
>>>> The 'signed from a trusted authority (eg: project maintainers)'
>>>> line in the presentation made me wince. If project maintainers can
>>>> arbitrarily blacklist hosts or influence arbitrary (or all) nodes in the
>>>> network, this is basically centralisation by another name. Perhaps
>>>> implementation of such features for network resilience purposes should
>>>> be left out of the design at first?
>>> As mentioned in this section, it absolutely can be left out completely.
>>> But what exactly, in your opinion, are the biggest disadvantages of this
>>> kind of centralization? I'm just curious.
>>
>> It assumes a degree of trust in the authority. If the authority is
>> compromised, the network is to some extent compromised.
>> Also, it makes the authority a target of legal and technical attacks.
>
> Legal attacks: Maybe to some extent, but as I emphasize in the paper, no
> one theoretically has to know who this authority is (deniability), not
> even the project maintainers; it could just be an "anonymous good
> shepherd", whose public key is allowed to be left in the source code by
> the project maintainers.

It doesn't matter who it is. Since the 'good shepherd' thing is basically
a human, it will respond slowly to issues, and only if they are
detected. Having it within the design of the protocol creates
physical, social centralisation of power and influence (which inherently
creates a degree of vulnerability). Therefore I do not support this
proposal.

> Technical attacks: I don't see how?

I'm talking about the theoretical possibility of a well-motivated (funded,
etc.) opponent deciding to hack the shepherd's systems, because
they have identified him, her or it.

Same argument as above: technical/social/physical/credibility attacks
are all equivalent ... basically, if there's a 'more powerful' party with
any 'special powers' at all, it becomes an instant target.

>>>> o Address Selection and Persistence
>>>> How does this work? If using standard IP-style routing
>>>> tables for address allocation, where address ranges are assigned from a
>>>> central authority, is this not inherently lending a traceable structure
>>>> to the network by virtue of address allocation records? Or are purely
>>>> random AP addresses pulled from the potential IPv4 space? What prevents
>>>> the same address being selected by two nodes?
>>> The database will only allow allocation of addresses that do not already
>>> have a (non-expired) entry, which will be easy to check even in a
>>> decentralized database.
>>
>> Aha. Makes sense. However, it also raises the question of expiry
>> rules, which will be critical to prevent impersonation.
>
> Yes, this is indeed an interesting problem to be discussed much more.
> Any suggestions?

I am not familiar with the network database (whatever the term used
was) ... however from a distant perspective I think that a degree of
centralisation is assumed if individual nodes trust their partners'
opinions on whether an address is taken or not. If this is the case, then
nodes with extra peers become 'more powerful' in that their version
of the truth becomes more weighty than nodes on the fringe of the
network topology.

I am not sure how to solve this issue, but here are some ideas.

1. The reason for using IP addresses is drop-in connectivity for
legacy applications.

2. IPv4 uses a hierarchical (though not entirely centralised) routing
structure. (ie: address/netmask relationship.)

3. It is not necessary to use IPv4 routing structures within the
Phantom network, as long as P2P connections between arbitrary
nodes can be established, and the identity of the peer in
each communication can be either established (if previously
communicated with) or reliably vouched for by other nodes
(if communication is occurring for the first time - this is dangerous
but seemingly necessary.) As a slight tangent, this is what
I was talking about in a previous brainstorming post about
potential use of a URL scheme whereby it would be possible
to include a cryptographic representation of a node's
identity in a link. That way, if the identity of the machine behind
the Phantom address has changed since the link was made,
those following the link will become aware of the fact.
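
A rough sketch of what such a link could look like, in Python (the
md5-fingerprint-plus-address format is purely illustrative):

    import hashlib

    def make_phantom_link(ap_address, public_key_der):
        """Hypothetical 'phantom link': a checksum of the target
        node's public key travels along with its AP address."""
        fp = hashlib.md5(public_key_der).hexdigest()
        return "%s.%s" % (fp, ap_address)

    def verify_phantom_link(link, presented_key_der):
        """True only if the key presented at connect time still
        matches the fingerprint baked into the link - i.e. the
        address has not been re-registered by someone else."""
        expected = link.split(".", 1)[0]
        return hashlib.md5(presented_key_der).hexdigest() == expected

So a link made while the entry was fresh keeps working, while one
followed after a re-registration fails the check loudly.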

>>>> What prevents a different
>>>> node using the same address as a former node, now disconnected, for
>>>> future incoming communications, and advertising itself as that node?
>>> The address entries are protected by asymmetric crypto, so you can only
>>> modify an existing (non-expired) entry if you have the right
>>> asymmetrical signing key.
>>
>> That's fine for existing entries for nodes with access to them, but is it
>> somehow possible to provide false information to new nodes with no
>> knowledge of the 'real' entries?
>
> People who want to be sure of the identity of someone they connect to
> the first time should make sure to validate their certificate (or at
> least certificate fingerprint) with one received from a trusted source, yes.

This seems like a cop-out to me. If real-world, deployable mechanisms
for node identification are not included somehow in the proposed
system, then I am skeptical that it will ever be implemented
("Encrypted Communications over IPv4 - An Ongoing Saga").

>> If possible and not addressed already, perhaps the protocol description
>> could include a recommendation that the implementation should
>> validate shared database information by connecting to and comparing
>> data from multiple sources.
>
> I'm not sure what you mean here? Please elucidate.

What I mean is that a node that blindly trusts a version of the
shared/network database coming from one other system can
be deceived (if that node passes false information).

Perhaps nodes should aim to establish geographically and/or
temporally disparate relationships with multiple nodes, then
use those nodes to cross-validate any data being received
about the identity of third party nodes.

Sorry I'm not too good at explaining things, so if this is still
not clear, let me know :)

I understand this; however, there are critical properties of the
transport layer that would be useful to understand at the
Phantom level. Things like:
- anonymity
- cost
- reliability
- latency
- security from traffic analysis
- ... etc.

To me, it makes sense to include at least adequate
placeholders for the potential addition of such information
for those nodes who choose to make use of it.

A large part of the success of IPv4 has been the
extensibility that it offers. I see this proposal as
providing a similar degree of extensibility for Phantom.

Remember, teaching all nodes to ignore or
deal with unknown values in a known fashion
is an easy thing to do. Upgrading the protocol
in the future to add space for newly desired
features is not. Particularly when you already
have a large user base. That's part of the
reason IPv6 hasn't taken off.
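
For example, a simple type-length-value layout gives you exactly this
property almost for free. A minimal sketch in Python (the type codes
are invented for illustration):

    import struct

    T_LATENCY = 1   # the only field type this node "knows"

    def encode_field(ftype, payload):
        # 2-byte type, 2-byte length, then the payload itself
        return struct.pack("!HH", ftype, len(payload)) + payload

    def parse_fields(buf):
        """Keep known fields, skip unknown ones: length-prefixing
        lets a node step over data it doesn't understand."""
        known, off = {}, 0
        while off + 4 <= len(buf):
            ftype, flen = struct.unpack_from("!HH", buf, off)
            off += 4
            value = buf[off:off + flen]
            off += flen
            if ftype == T_LATENCY:
                known[ftype] = value
        return known

    msg = encode_field(T_LATENCY, b"\x00\x32") + encode_field(99, b"??")
    parse_fields(msg)   # -> {1: b'\x00\x32'}; field 99 skipped cleanly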

>>>> * Independence from Particular Encryption Mechanisms + Key Lengths
>>> Probably good for the "next level" after the basic level implementation
>>> too. It should be noted that there are quite large non-obvious problems
>>> related to sending non-random data (i.e. outside any encryption) in any
>>> form between nodes before a complete path/route is formed though, due to
>>> information piggybacking risks between non-adjacent nodes, so such
>>> things must be very well thought through before implementation.
>>
>> OK. I'm not really clear about this. Would you mind drawing out
>> a dumbed-down example of this risk and how it is mitigated in the
>> existing design?
>
> If a node in a routing path can undetectedly modify or control just 32
> bits of data _anywhere_ in all the data being sent to any non-adjacent
> node in the routing path, it could covertly convey its IP address to any
> conspiring nodes in the routing path, thus being able to communicate
> with these out-of-band from that point on, which would make it possible
> for them to combine their knowledge of included node IP addresses in the
> routing path, which would in turn reduce or even completely eliminate
> the anonymizing function of the routing path. This should be described
> further in the paper too I'm almost certain.

Sorry I'm not very mathematical - but if the two nodes can communicate
at all (even explicitly) then it should be possible to share information
anyway?

I guess that it is impossible to design a network that allows all nodes
to communicate with each other but prohibits completely any sharing
of information between them.

It seems to me that fundamentally, the only way to mitigate risks of
multiple nodes banding together to manipulate or deceive others
is to employ on every node adequate checks against the accuracy
of information supplied by any one node. Is this correct?

>>> Route establishment will only be one-time events, which makes their time
>>> less important. There is nothing stopping parallel creation of several
>>> paths in the current design though (it's even intended actually), you
>>> just can't _communicate_ data from the same connection over multiple
>>> paths.
>>>
>>> Latency is much less of a goal than throughput for the protocol
>>> though, which also makes this less important.
>>
>> OK, point taken about this being a lesser concern, however because
>> high latencies may impact on the types of applications that may be
>> deployed on the network it should still remain a consideration.
>
> Sure, it should indeed be optimized as long as it doesn't affect
> security or throughput. All quality factors should always be taken into
> consideration, some of them just must remain prioritized at all times.

Re-reading this discussion I had the idea that communicating data
over multiple paths may actually be an EXCELLENT method of both
frustrating traffic analysis and increasing reliability, at least 'in the
cloud' (ie: for those hops that occur in the middle of the network,
rather than just before each node - since these would likely be
shared between multiple paths)

>>>> o Network Flooding Combined with Assuming Another AP
>>>> As traffic increases, nodes with lower bandwidth will tend
>>>> to reject new routing paths. Well funded attackers (eg: governments)
>>>> could run very high bandwidth nodes, which would result in more traffic
>>>> being routed through those nodes, and then pretend to be other AP
>>>> addresses. This is possible due to no mechanism for securely
>>>> identifying nodes' identities. (Consider v0.7, p.17 - "it should be
>>>> impossible for any node to know or conclude if it is connected directly
>>>> to a node, or to a routing path owned by the same node".). Granted, the
>>>> public internet infrastructure has the same vulnerability, but there is
>>>> a degree of physical security and trust assumed that increases as data
>>>> travels towards long-distance carrier companies from the end user - in
>>>> Phantom v0.7, this does not exist.
>>> Yes, and this is one reason for the "centralized" banning of certain IP
>>> addresses. It might still require quite high resources to be successful
>>> in such an attack though.

I think it's likely to be both slow and ineffective. And how could the
shepherd get to know the real IP addresses of the 'bad nodes' anyway?

I do think that the concept described above (0.7 p17) needs to change.

There should be an option (not a requirement) for nodes to securely
identify each other in the (very common) situation where the
identity of the node to be communicated with must be maintained
between sessions.

>> Although this sounds like a nice case for centralised network trust,
>> such bans would require some form of human intervention, which
>> inherently reduces their responsiveness and effectiveness. A 'bad
>> node' would succeed by default until both noticed and banned. I still
>> believe that including this mechanism at the protocol level is a bad
>> idea, as it does not really provide any quantifiable security whilst
>> giving the ability to manipulate traffic flow network-wide to selected
>> individuals.
>
> Sure, humans are slow, but what do you suggest would be a better
> solution, other than removing the humans _too_, thus leaving _no_
> solution to this problem rather than a non-optimal one?

Sorry I can't get enough context from this version of the
discussion to comment further! Let me know if you're
interested to discuss further and I'll go back and re-read.

Please see my response :) ie: This is not a 'real' (ie: publicly
accessible) DNS server, but rather a utility running on the local
host of the client system only.

> In addition to this,
> yes, the addresses could be composed like that, but the problem is that
> it would break the backward compatibility with IP, which is one of the
> main design objectives.

The above is not a suggestion for a phantom address format (which
I agree should be standard IPv4) but an optional 'phantom domain
name' equivalent, which provides the identity of the target node (in
the form of a key checksum ... or preferably two or three) to guard
against the address re-registration (and subsequent impersonation)
issue that I outlined in my previous post.

Repeated again here:
===============================================


>> While cryptographically the impersonator's identity can be assured
>> to be that corresponding to the public key in the network database
>> at the time of connection, there is no persistence across multiple
>> registrations for an AP in the network database, and no
>> address-allocation hierarchy, so there is nothing stopping this
>> attack. Secondly, since at the application level only the
>> IP address (and not the corresponding public key) is provided in
>> order to make a connection, there is no way to ensure that the
>> user's intended destination is the one they are actually
>> communicating with.

===============================================

> I rather chose to enable people to perform the
> certificate fingerprint check semi-manually themselves instead, if they
> wanted (which could though indeed be done by entering an address of your
> suggested format "md5sumofpublickey.1.2.3.4" into any Phantom-aware
> application).

I would stay away from the need to 'phantom enable' and stick
with a protocol / implementation design that does not require
any per-application 'enabling' and thereby furthers the choice
of IPv4 for legacy application integration.

> Thanks again for your input and ideas Walter, please keep them coming!
> (and sorry again for forgetting to reply, it's just that when emails
> become too long, you put it off to answer them until "a better time",
> and then eventually, if they are longer than your suitable "better
> times" are, you easily forget them, and that's exactly what has happened
> here :-/)

No problem! I'm busy too... but this concept and project
are well worth supporting.

- Walter

Magnus BrĂĽding

unread,
Nov 23, 2008, 3:36:14 PM11/23/08
to phantom-...@googlegroups.com
Hi Walter,

>>> - extending implementations to include a DNS server
>>> that always resolves a particular DNS suffix (eg: .phantom)
>>> via Phantom, then dynamically creating an application-specific
>>> proxy port on the local interface that redirects the application
>>> across the Phantom network.
>> A DNS server would make a scary "central point", both from an
>> availability perspective and anonymity perspective.
>
> Perhaps you didn't understand my suggestion - this would be a
> local application on each user's system - part of a 'Phantom
> network connectivity toolkit', and entirely optional. It would
> simply facilitate connecting legacy applications to the Phantom
> network in a reliable way, with no 'regular internet' cross-over,
> by triggering the legacy applications to use a local proxy that
> took care of the Phantom bridging.

Ok, I see. This would make deanonymization attacks from e.g. web pages
extremely simple though, since you just have to include e.g. an image in
the page that doesn't use the phantom marker, and you could have the
user's identity correlated and leaked right there. :-/ This is one of
the strong arguments to force entire applications into the network
without their explicit knowledge, as mentioned in the paper.


> It doesn't matter who it is. Since the 'good shepherd' thing is basically
> a human, it will respond slowly to issues, and only if they are
> detected. Having it within the design of the protocol creates
> physical, social centralisation of power and influence (which inherently
> creates a degree of vulnerability). Therefore I do not support this
> proposal.
>
>> Technical attacks: I don't see how?
>
> I'm talking about the theoretical possibility of a well-motivated (funded,
> etc.) opponent deciding to hack the shepherd's systems, because
> they have identified him, her or it.
>
> Same argument as above, technical/social/physical/credibility attacks
> are all equivalent ... basically if there's a 'more powerful target' with any
> 'special powers' at all, it becomes an instant target.

Yes, this is also mentioned in the report, and the countermeasure is to
adjust the privileges of such a "super user" so that they cannot do any
damage that cannot be fully reverted with the release of a new client
update (i.e. containing a new public key, if the old one has been
compromised). That's exactly why I emphasize this in the paper.

Yes, IP routing will have nothing to do with the randomization of such a
database lookup, and neither will current "peers" of a node. Voting
algorithms will be used with entirely random nodes (selected by the
nodes themselves, so no one else can influence this choice!).


>>>>> What prevents a different
>>>>> node using the same address as a former node, now disconnected, for
>>>>> future incoming communications, and advertising itself as that node?
>>>> The address entries are protected by asymmetric crypto, so you can only
>>>> modify an existing (non-expired) entry if you have the right
>>>> asymmetrical signing key.
>>> That's fine for existing entries for nodes with access to them, but is it
>>> somehow possible to provide false information to new nodes with no
>>> knowledge of the 'real' entries?
>> People who want to be sure of the identity of someone they connect to
>> the first time should make sure to validate their certificate (or at
>> least certificate fingerprint) with one received from a trusted source, yes.
>
> This seems like a cop-out to me. If real-world, deployable mechanisms
> for node identification are not included somehow in the proposed
> system, then I am skeptical that it will ever be implemented
> ("Encrypted Communications over IPv4 - An Ongoing Saga").

Wait a minute, I apparently forgot my own previous brilliant idea there
for a moment (also described in the paper already)... ;-) The public key
used for the SSL (or similar) communication will be stored in the
routing table entry for all addresses in the network database, and the
identity of nodes will always be automatically verified with this one.
That way, no one who has "stolen" someone's IP address will be able to
do anything with it.
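
In other words (a minimal sketch, with a toy dict standing in for the
distributed network database and made-up names throughout):

    import hmac

    # AP address -> public key stored in its routing table entry
    network_db = {"1.2.3.4": b"key registered by the real owner"}

    def peer_is_authentic(ap_address, presented_key):
        """A node claiming this address must present the key stored
        in the (non-expired) database entry during the SSL-style
        handshake; a node that merely advertises the address but
        cannot complete the handshake with that key is rejected."""
        registered = network_db.get(ap_address)
        return (registered is not None and
                hmac.compare_digest(registered, presented_key))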


>>> If possible and not addressed already, perhaps the protocol description
>>> could include a recommendation that the implementation should
>>> validate shared database information by connecting to and comparing
>>> data from multiple sources.
>> I'm not sure what you mean here? Please elucidate.
>
> What I mean is that a node that blindly trusts a version of the
> shared/network database coming from one other system can
> be deceived (if that node passes false information).

The entire database isn't shared (it's distributed in a partial
fashion), but rather, every single lookup is performed with randomized
voting algorithms to secure its authenticity (statistically).
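
Roughly like this (a sketch only; query_node is a hypothetical RPC
returning one node's view of the entry):

    import random
    from collections import Counter

    def voted_lookup(ap_address, known_nodes, query_node, sample=7):
        """Ask several nodes chosen at random by *this* node (so no
        one else can steer the choice), and accept an answer only if
        a clear majority agrees on it."""
        voters = random.sample(known_nodes, sample)
        votes = Counter(query_node(n, ap_address) for n in voters)
        answer, count = votes.most_common(1)[0]
        return answer if count > sample // 2 else None  # no consensus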


> Perhaps nodes should aim to establish geographically and/or
> temporally disparate relationships with multiple nodes, then
> use those nodes to cross-validate any data being received
> about the identity of third party nodes.

Yes, and this is also already described in the paper (as a suggested
strategy for nodes to use when selecting which nodes to query during
lookups). :-)


>>> I think leaving space for it in the original protocol design costs very
>>> little and provides the possibility for persistence across early and later
>>> implementations of Phantom. At least to me, lower-layer independence
>>> is one of the most intelligent features in popular networking protocols -
>>> why not add it now? (Do your points below about 'information
>>> piggybacking' have some bearing here?)
>> Sure, but I think there is already "room in the specification" for this,
>> through the abstraction. I only go as low as the SSL level (or even "SSL
>> equivalent"), and this can be implemented over UDP or whatever too.
>> Sorry for not making that as clear as it could have been.
>
> I understand this; however, there are critical properties of the
> transport layer that would be useful to understand at the
> Phantom level. Things like:
> - anonymity
> - cost
> - reliability
> - latency
> - security from traffic analysis
> - ... etc.
>
> To me, it makes sense to include at least adequate
> placeholders for the potential addition of such information
> for those nodes who choose to make use of it.

The protocol is abstracted in a way that those properties of the
underlying transport layer should be irrelevant. If you still stand by
your opinion, please present an example of exactly what these
"placeholders" would look like in the specification, since I still don't
really understand you completely on that point (which also means there
might still be some misunderstanding between us regarding this point).

>> If a node in a routing path can undetectedly modify or control just 32
>> bits of data _anywhere_ in all the data being sent to any non-adjacent
>> node in the routing path, it could covertly convey its IP address to any
>> conspiring nodes in the routing path, thus being able to communicate
>> with these out-of-band from that point on, which would make it possible
>> for them to combine their knowledge of included node IP addresses in the
>> routing path, which would in turn reduce or even completely eliminate
>> the anonymizing function of the routing path. This should be described
>> further in the paper too I'm almost certain.
>
> Sorry I'm not very mathematical - but if the two nodes can communicate
> at all (even explicitly) then it should be possible to share information
> anyway?

Yes, if they are adjacent (i.e. can communicate _directly_), and that's
why I say "_non-adjacent_ node in the routing path" above. :-)


> I guess that it is impossible to design a network that allows all nodes
> to communicate with each other but prohibits completely any sharing
> of information between them.

The goal is to prevent non-adjacent nodes in a randomly put together
routing path from communicating with each other or from being able to
conclude each other's identities (i.e. IP addresses), nothing else. This
is the entire key to the anonymity of the protocol.
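
Just to illustrate how little it takes (a toy example; the offset and
cover data are of course arbitrary):

    import socket

    def embed_ip(cover, ip, offset):
        """A malicious hop overwrites four bytes it controls
        somewhere in otherwise opaque data with its own IPv4
        address..."""
        return cover[:offset] + socket.inet_aton(ip) + cover[offset + 4:]

    def extract_ip(cover, offset):
        """...and a conspiring non-adjacent hop that knows the
        offset reads it straight back out."""
        return socket.inet_ntoa(cover[offset:offset + 4])

    leaked = embed_ip(bytes(16), "10.0.0.5", 8)
    extract_ip(leaked, 8)   # -> "10.0.0.5"

This is why any data an intermediate node can influence must look
completely random to every non-adjacent node.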


> It seems to me that fundamentally, the only way to mitigate risks of
> multiple nodes banding together to manipulate or deceive others
> is to employ on every node adequate checks against the accuracy
> of information supplied by any one node. Is this correct?

Yes, and this is done by cryptographic signatures, and voting
algorithms where necessary.


>> Sure, it should indeed be optimized as long as it doesn't affect
>> security or throughput. All quality factors should always be taken into
>> consideration, some of them just must remain prioritized at all times.
>
> Re-reading this discussion I had the idea that communicating data
> over multiple paths may actually be an EXCELLENT method of both
> frustrating traffic analysis and increasing reliability, at least 'in the
> cloud' (ie: for those hops that occur in the middle of the network,
> rather than just before each node - since these would likely be
> shared between multiple paths)

Spreading the traffic of a single connection over multiple paths will
almost certainly hurt throughput (due to ordering problems if nothing
else), but the high-reliability routing path design described in the
paper solves this while still taking advantage of the reliability of
the multiple connections. Traffic analysis shouldn't be a concern
anyway, at least not in a way that could be solved without completely
devastating the throughput of the protocol (which is also mentioned in
the paper).
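
The ordering problem in a nutshell (a minimal sketch):

    import heapq

    class ReorderBuffer:
        """Packets arrive in whatever order the paths deliver them;
        nothing can be handed to the application until the next
        expected sequence number shows up, so one slow path stalls
        the whole connection."""
        def __init__(self):
            self._heap, self._next = [], 0

        def push(self, seq, payload):
            heapq.heappush(self._heap, (seq, payload))
            ready = []
            while self._heap and self._heap[0][0] == self._next:
                ready.append(heapq.heappop(self._heap)[1])
                self._next += 1
            return ready   # in-order data released so far

    buf = ReorderBuffer()
    buf.push(1, b"b")   # -> [] (stalled: 0 hasn't arrived yet)
    buf.push(0, b"a")   # -> [b'a', b'b']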


>>>>> o Network Flooding Combined with Assuming Another AP
>>>>> As traffic increases, nodes with lower bandwidth will tend
>>>>> to reject new routing paths. Well funded attackers (eg: governments)
>>>>> could run very high bandwidth nodes, which would result in more traffic
>>>>> being routed through those nodes, and then pretend to be other AP
>>>>> addresses. This is possible due to no mechanism for securely
>>>>> identifying nodes' identities. (Consider v0.7, p.17 - "it should be
>>>>> impossible for any node to know or conclude if it is connected directly
>>>>> to a node, or to a routing path owned by the same node".). Granted, the
>>>>> public internet infrastructure has the same vulnerability, but there is
>>>>> a degree of physical security and trust assumed that increases as data
>>>>> travels towards long-distance carrier companies from the end user - in
>>>>> Phantom v0.7, this does not exist.
>>>> Yes, and this is one reason for the "centralized" banning of certain IP
>>>> addresses. It might still require quite high resources to be successful
>>>> in such an attack though.
>
> I think it's likely to be both slow and ineffective. And how could the
> shepherd get to know the real IP addresses of the 'bad nodes' anyway?
>
> I do think that the concept described above (0.7 p17) needs to change.
>
> There should be an option (not a requirement) for nodes to securely
> identify each other in the (very common) situation where the
> identity of the node to be communicated with must be maintained
> between sessions.

Yes, and there already is; please see my description of this above, and
sorry for being unclear about it previously. Oh, and the shepherd will
be able to learn the IP addresses of the bad nodes when enough nodes
report DoS-like behavior from those addresses (which must of course be
done in a really careful way so as not to make the feature itself a DoS
tool, which is also why it cannot be automated in any way).


>>> Although this sounds like a nice case for centralised network trust,
>>> such bans would require some form of human intervention, which
>>> inherently reduces their responsiveness and effectiveness. A 'bad
>>> node' would succeed by default until both noticed and banned. I still
>>> believe that including this mechanism at the protocol level is a bad
>>> idea, as it does not really provide any quantifiable security whilst
>>> giving the ability to manipulate traffic flow network-wide to selected
>>> individuals.
>> Sure, humans are slow, but what do you suggest would be a better
>> solution, other than removing the humans _too_, thus leaving _no_
>> solution to this problem rather than a non-optimal one?
>
> Sorry I can't get enough context from this version of the
> discussion to comment further! Let me know if you're
> interested to discuss further and I'll go back and re-read.

Sure, please do.


>> In addition to this,
>> yes, the addresses could be composed like that, but the problem is that
>> it would break the backward compatibility with IP, which is one of the
>> main design objectives.
>
> The above is not a suggestion for a phantom address format (which
> I agree should be standard IPv4) but an optional 'phantom domain
> name' equivalent, which provides the identity of the target node (in
> the form of a key checksum ... or preferably two or three) to guard
> against the address re-registration (and subsequent impersonation)
> issue that I outlined in my previous post.

Sure, that could indeed be a nice (optional) security feature.


>> I rather chose to enable people to perform the
>> certificate fingerprint check semi-manually themselves instead, if they
>> wanted (which could though indeed be done by entering an address of your
>> suggested format "md5sumofpublickey.1.2.3.4" into any Phantom-aware
>> application).
>
> I would stay away from the need to 'phantom enable' and stick
> with a protocol / implementation design that does not require
> any per-application 'enabling' and thereby furthers the choice
> of IPv4 for legacy application integration.

Absolutely, that's what I've been saying the whole time! :-) I just
wanted to let you know that this would be a good idea if it could be
implemented without breaking IP compatibility, e.g. as you describe
above with the hostnames.


>> Thanks again for your input and ideas Walter, please keep them coming!
>> (and sorry again for forgetting to reply, it's just that when emails
>> become too long, you put it off to answer them until "a better time",
>> and then eventually, if they are longer than your suitable "better
>> times" are, you easily forget them, and that's exactly what has happened
>> here :-/)
>
> No problem! I'm busy too... but this concept and project
> are well worth supporting.

Thanks! Oh, and you're officially added as a project member now btw,
thanks again for your support, and see you around. :-)

Regards,
Magnus
