You can make the standard a lot stronger by separating what is core from what is PT implementation. The spec conflates several separable concepts, which makes its core harder to see. What is not core can be moved to a separate specification. Furthermore, some of the concepts are doing too much, which causes friction, such as the abuse of some SOCKS fields. You can fix that by introducing new concepts to replace the overloaded ones. More details below.
Separate Dispatcher from spec
The Dispatcher is merely a PT Client and Server implementation that calls another implementation. That's a useful tool to plug together implementations in different languages, but it should not be part of the core specification. A lot of the discussion can be moved to a dispatcher user guide.
Need for new concepts
Some "PT"s are actually only transformations, while others actually establish connections. Transports with "implicit targets" (obfs4) and those that need explicit ones (meek) share the same interface. Forcing them to be treated the same creates friction.
After playing with the ideas, I settled on the following concepts, which you may want to map onto the spec somehow:
- Stream: a unidirectional flow of data. Streams may be chained, forming a new Stream (Stream + Stream = Stream).
- Adaptor: wraps a Stream to create a new Stream that speaks a different language (Stream + Adaptor = Stream). It essentially adds a Stream before the inner Stream to convert to the inner language, and a Stream after the inner Stream to convert from the inner language. Adaptors may be chained, forming a new Adaptor (Adaptor + Adaptor = Adaptor).
- Service Client: gives you a Stream to communicate with a host:port.
- Service Server: listens for connections on a host:port, and gives you a Stream for each Service Client.
Notice how those are very low level and simple building blocks. The implementation is minimal. Maybe I'm missing important things (let me know), but I realized you can do a lot with them.
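To make the algebra above concrete, here is a minimal Python sketch of Stream and Adaptor. It rests on a big simplifying assumption: a Stream is modeled as a function over whole byte buffers rather than a true incremental flow. The names `chain`, `wrap`, `then`, `to_inner` and `from_inner` are made up for illustration and come from no existing library or from the spec.

```python
import base64
from dataclasses import dataclass
from typing import Callable

# Assumption: a Stream is a whole-buffer transformation, not an incremental flow.
Stream = Callable[[bytes], bytes]


def chain(*streams: Stream) -> Stream:
    """Stream + Stream = Stream: feed the output of each stream into the next."""
    def chained(data: bytes) -> bytes:
        for stream in streams:
            data = stream(data)
        return data
    return chained


@dataclass(frozen=True)
class Adaptor:
    to_inner: Stream    # converts into the inner language
    from_inner: Stream  # converts back out of the inner language

    def wrap(self, inner: Stream) -> Stream:
        """Stream + Adaptor = Stream."""
        return chain(self.to_inner, inner, self.from_inner)

    def then(self, outer: "Adaptor") -> "Adaptor":
        """Adaptor + Adaptor = Adaptor, with `outer` applied around `self`."""
        return Adaptor(
            to_inner=chain(outer.to_inner, self.to_inner),
            from_inner=chain(self.from_inner, outer.from_inner),
        )


# Usage: an inner stream that upper-cases raw bytes, exposed to base64 speakers.
base64_adaptor = Adaptor(to_inner=base64.b64decode, from_inner=base64.b64encode)
shout_in_base64 = base64_adaptor.wrap(bytes.upper)
```

Here `shout_in_base64` accepts base64 input, upper-cases the decoded bytes, and returns base64 again; the inner stream never needs to know about the encoding.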
For example, Service Clients and Servers can run on any protocol, as long as you address targets by host:port. You can have regular TCP clients, but also an HTTP CONNECT client or a SOCKS client, both of which will give you a stream given a host:port. It can look something like this (simplified):
// This is wherever you choose the transport. Maybe in a config-based factory
client = new HttpConnectServiceClient(proxyHost, proxyPort).withAdaptor(newEncryptedAdaptor(cipher, secret));
targetStream = client.connect(targetHost, targetPort);
Notice that none of this requires hacks to the underlying HTTPS or SOCKS protocols. You may even nest transports, which seems to be a limitation of the existing spec:
// Uses innerClient to connect to the SOCKS proxy
innerClient = new HttpConnectServiceClient(httpProxyHost, httpProxyPort);
outerClient = new SocksServiceClient(socksProxyHost, socksProxyPort).withServiceClient(innerClient);
targetStream = outerClient.connect(targetHost, targetPort);
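As a rough illustration of how nesting could work, here is a Python sketch of a Service Client that speaks standard HTTP CONNECT and reaches its proxy through another Service Client. The class and method names (`Conn`, `ServiceClient`, `TcpServiceClient`, `with_service_client`) are hypothetical, and for brevity it chains two HTTP proxies rather than SOCKS over HTTP. Authentication, timeouts and Adaptors are left out.

```python
import socket
from typing import Optional, Protocol


class Conn(Protocol):
    """The minimal connection surface this sketch needs (sockets satisfy it)."""
    def sendall(self, data: bytes) -> None: ...
    def recv(self, bufsize: int) -> bytes: ...


class ServiceClient(Protocol):
    def connect(self, host: str, port: int) -> Conn: ...


class TcpServiceClient:
    """Plain TCP: the default way to reach a host:port."""
    def connect(self, host: str, port: int) -> Conn:
        return socket.create_connection((host, port))


class HttpConnectServiceClient:
    """Reaches targets through an HTTP proxy using the standard CONNECT method."""

    def __init__(self, proxy_host: str, proxy_port: int,
                 inner: Optional[ServiceClient] = None) -> None:
        self.proxy_host = proxy_host
        self.proxy_port = proxy_port
        # The inner client is how we reach the proxy itself.
        self.inner = inner if inner is not None else TcpServiceClient()

    def with_service_client(self, inner: ServiceClient) -> "HttpConnectServiceClient":
        return HttpConnectServiceClient(self.proxy_host, self.proxy_port, inner)

    def connect(self, host: str, port: int) -> Conn:
        conn = self.inner.connect(self.proxy_host, self.proxy_port)
        authority = f"{host}:{port}"
        request = f"CONNECT {authority} HTTP/1.1\r\nHost: {authority}\r\n\r\n"
        conn.sendall(request.encode("ascii"))
        # Read one byte at a time so no tunnelled payload is consumed by accident.
        response = b""
        while not response.endswith(b"\r\n\r\n"):
            chunk = conn.recv(1)
            if not chunk:
                raise ConnectionError("proxy closed the connection during CONNECT")
            response += chunk
        status = response.split(b"\r\n", 1)[0].split()
        if len(status) < 2 or not status[1].startswith(b"2"):
            raise ConnectionError(f"CONNECT to {authority} failed: {response!r}")
        return conn
```

Because the proxy hop is just another `connect(host, port)` call on the inner client, nesting needs no special support: the outer client sends its CONNECT request over whatever tunnel the inner client has already established.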
You can also have things like newAdaptorFromCommand("gzip", "gunzip"), which runs the given command for the forward or reverse streams, and talks via the standard IO of the subprocesses. The command could be any binary that is available, working similarly to the dispatcher.
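A command-backed Adaptor could be sketched in Python roughly as follows. This assumes whole-buffer processing through `subprocess.run`; a real implementation would more likely keep a long-lived subprocess and pipe data incrementally. `Adaptor`, `stream_from_command` and `new_adaptor_from_command` are illustrative names, not an existing API.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, Sequence

# Assumption: a Stream is a whole-buffer transformation, not an incremental flow.
Stream = Callable[[bytes], bytes]


@dataclass(frozen=True)
class Adaptor:
    forward: Stream  # applied to data going out
    reverse: Stream  # applied to data coming back


def stream_from_command(argv: Sequence[str]) -> Stream:
    """Builds a Stream that pipes data through a subprocess's stdin and stdout."""
    def run(data: bytes) -> bytes:
        result = subprocess.run(
            list(argv), input=data, stdout=subprocess.PIPE, check=True
        )
        return result.stdout
    return run


def new_adaptor_from_command(forward_argv: Sequence[str],
                             reverse_argv: Sequence[str]) -> Adaptor:
    """Pairs a forward command with the command that undoes it."""
    return Adaptor(
        forward=stream_from_command(forward_argv),
        reverse=stream_from_command(reverse_argv),
    )
```

On a system where both binaries are installed, `new_adaptor_from_command(["gzip", "-c"], ["gunzip", "-c"])` would give an Adaptor whose `reverse` undoes its `forward`.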
There's a lot that can be dropped by adopting these simpler concepts. For example, you can drop discussions like where to persist transport state: that's an implementation detail and up to the transport implementer.
Decouple from Tor
The dependency on TOR_PT_ environment variables is a red flag that the spec is too coupled to the implementation.
As I mentioned, that can be moved to the dispatcher implementation. It's probably also possible to add a façade that allows you to not have to use those prefixes.
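One possible shape for such a façade, sketched in Python: a single function at the edge reads the TOR_PT_ variables and returns a plain configuration object, so transport code never sees the prefix. `TransportConfig` and `config_from_tor_env` are hypothetical names, and only three of the variables are mapped here to keep the example short.

```python
import os
from dataclasses import dataclass, field
from typing import List, Mapping, Optional

_PREFIX = "TOR_PT_"


@dataclass(frozen=True)
class TransportConfig:
    """Prefix-free view of the settings (hypothetical, deliberately partial)."""
    state_location: Optional[str] = None
    client_transports: List[str] = field(default_factory=list)
    server_transports: List[str] = field(default_factory=list)


def _split(value: Optional[str]) -> List[str]:
    # Comma-separated transport names; missing or empty values yield an empty list.
    return [name for name in (value or "").split(",") if name]


def config_from_tor_env(environ: Optional[Mapping[str, str]] = None) -> TransportConfig:
    """The only place that knows about the TOR_PT_ prefix."""
    env = os.environ if environ is None else environ

    def get(name: str) -> Optional[str]:
        return env.get(_PREFIX + name)

    return TransportConfig(
        state_location=get("STATE_LOCATION"),
        client_transports=_split(get("CLIENT_TRANSPORTS")),
        server_transports=_split(get("SERVER_TRANSPORTS")),
    )
```

With this split, a dispatcher would call `config_from_tor_env()` once at startup, while an application embedding a transport directly could construct `TransportConfig` itself and skip the environment entirely.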
Decouple Server transport from Client transport
It's useful to be able to implement a client transport without having to implement a server transport (because someone else already implemented it for the server, but it's in another language). Currently the Transport specifies both. It may be a good idea to split that.
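A small Python sketch of what the split might look like, with two independent interfaces so that an implementation can satisfy one without the other. `ClientTransport`, `ServerTransport`, `dial`, `listen` and `ExampleClientOnly` are assumed names for illustration only.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ClientTransport(Protocol):
    def dial(self, host: str, port: int): ...


@runtime_checkable
class ServerTransport(Protocol):
    def listen(self, host: str, port: int): ...


class ExampleClientOnly:
    """Implements only the client half; the server half lives elsewhere."""

    def dial(self, host: str, port: int):
        raise NotImplementedError("placeholder: a real client would connect here")


transport = ExampleClientOnly()
is_client = isinstance(transport, ClientTransport)
is_server = isinstance(transport, ServerTransport)
```

The runtime check only looks at which methods are present, so `is_client` is true and `is_server` is false for this class, and nothing forces a client author to ship a server.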
Need for better deployability
Figuring out how to integrate PT into the build of an application is one of the biggest hurdles to achieving the goal “Transport users have to do the minimum amount of work to add PT support to code that uses standard networking primitives from the language or platform.” I'm not sure it belongs in this spec, but it would be great to have standard ways to integrate transport implementations.
I'm happy to discuss all of this further.
Vinicius (Vini) Fortuna