How to implement a full-fledged proxy in Vert.x?


Ranjit Kumar

May 5, 2014, 10:40:38 PM
to ve...@googlegroups.com
I would like to know how I can implement a proxy server in Vert.x (preferably using the Groovy API) that is transparent to the user, meaning that when a user enters "http://google.com" the server should be able to serve all the image, JS, and CSS files and also handle any AJAX calls. I should also be able to parse the HTML content of the page before displaying it to the user. If the implementation is too big, kindly provide pointers on how I should get started.

Norman Maurer

May 6, 2014, 4:09:20 AM
to ve...@googlegroups.com, Ranjit Kumar

Did you check the examples?

Ranjit Kumar

May 6, 2014, 5:15:06 AM
to ve...@googlegroups.com, Ranjit Kumar, norman...@googlemail.com
The proxy example given here redirects requests occurring at a specific URL (http://localhost:8080). Instead I want to redirect all URL requests to the browser. For example I have created an HTTP server on port 8080 and added localhost:8080 as proxy in firefox so all requests are directed to this server. Now I can read the requested URL from req.uri (or req.absoluteURI). How do I return the HTML at this particular URL to the browser? Also how can I add support for AJAX calls as well?

Alexander Lehmann

May 6, 2014, 7:02:42 AM
to ve...@googlegroups.com, Ranjit Kumar, norman...@googlemail.com
Hi,

I have looked at writing an HTTP proxy with Vert.x, and I think I can give some pointers on what you have to do (and what doesn't work).

First of all, to implement an HTTP proxy, you have to do the request parsing yourself, since the HttpServer currently does not provide proxy functionality (i.e. the server only understands local URLs starting with /). You can use Squid to provide the proxy functionality and a redirector script to redirect to your local HTTP server; then you can basically use the proxy example from the examples source to do the actual proxy operation. With this setup you can proxy the GET and POST methods, so any request, including AJAX calls, should work.

Implementing an HTTPS proxy is quite different. If you only want to forward the SSL requests to the correct server, you can process the CONNECT method and do simple Buffer forwarding (or use a Pump?) to the origin server. If, however, you want to look at the request or the result, or modify the request in any way, you need an SSL "man-in-the-middle" setup, which means that the browser will complain about the certificate, but that should be possible as well. In this case you also have to do the HTTP request parsing yourself if you want to change anything in the request.
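As a rough illustration of the first step of CONNECT handling, here is a minimal sketch in plain Java (JDK only, no Vert.x calls) that splits the CONNECT target into host and port. The class name `ConnectTarget` is made up for this example; the actual tunnelling of bytes between browser and origin server is not shown.

```java
// Hypothetical helper, not part of Vert.x: parses the target of a
// CONNECT request line such as "CONNECT example.com:443 HTTP/1.1".
final class ConnectTarget {
    final String host;
    final int port;

    ConnectTarget(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Expects the "host:port" form that browsers send with CONNECT.
    // IPv6 literals such as "[::1]:443" keep their brackets in host.
    static ConnectTarget parse(String authority) {
        int colon = authority.lastIndexOf(':');
        if (colon <= 0 || colon == authority.length() - 1) {
            throw new IllegalArgumentException("expected host:port, got: " + authority);
        }
        String host = authority.substring(0, colon);
        // NumberFormatException is a subclass of IllegalArgumentException.
        int port = Integer.parseInt(authority.substring(colon + 1));
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return new ConnectTarget(host, port);
    }
}
```

Once host and port are known, the proxy would open a plain TCP connection to them, answer the browser with a 200 response, and then copy bytes in both directions without looking at them.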


bye, Alexander

Tim Fox

May 6, 2014, 7:04:49 AM
to ve...@googlegroups.com
On 06/05/14 12:02, Alexander Lehmann wrote:
Hi,

I have looked at writing an HTTP proxy with Vert.x, and I think I can give some pointers on what you have to do (and what doesn't work).

First of all, to implement an HTTP proxy, you have to do the request parsing yourself, since the HttpServer currently does not provide proxy functionality (i.e. the server only understands local URLs starting with /).


Alexander, can you elaborate on this point? ^^

On Tuesday, May 6, 2014 1:39:20 PM UTC+5:30, Norman Maurer wrote:
Did you check the examples?



Arno Schulz

May 6, 2014, 7:30:31 AM
to ve...@googlegroups.com
The only limitation in terms of URI is/was in the websockets (it was present in the master branch but not available as a method for the websocket). But I last checked a month ago.

I didn't figure out a way to properly forward websockets, though (i.e. closing from the client didn't go through the proxy and vice versa).

Alexander Lehmann

May 6, 2014, 1:43:42 PM
to ve...@googlegroups.com
Sorry, I got that wrong. The request object includes the information necessary to implement an HTTP proxy directly: when you evaluate req.absoluteURI(), you know which host to connect to, etc.

If you start with the proxy example Norman mentioned, you should have most of the functionality already. You may have to implement the absolute URI connection, or you can use req.uri.
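To make "you know which host to connect to" concrete, here is a minimal sketch using only `java.net.URI` from the JDK. It assumes the absolute URI has already been obtained as a string (for instance from the request object); the class name `UpstreamTarget` and its fields are invented for this example and are not Vert.x API.

```java
import java.net.URI;

// Hypothetical helper: derives the upstream host, port and path
// from the absolute URI a browser sends to a forward proxy.
final class UpstreamTarget {
    final String host;
    final int port;
    final String pathAndQuery;

    UpstreamTarget(String host, int port, String pathAndQuery) {
        this.host = host;
        this.port = port;
        this.pathAndQuery = pathAndQuery;
    }

    static UpstreamTarget fromAbsoluteUri(String absoluteUri) {
        URI uri = URI.create(absoluteUri);
        if (uri.getHost() == null) {
            throw new IllegalArgumentException("not an absolute URI: " + absoluteUri);
        }
        int port = uri.getPort();
        if (port == -1) {
            // No explicit port: fall back to the scheme's default.
            port = "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
        }
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }
        String query = uri.getRawQuery();
        return new UpstreamTarget(uri.getHost(), port,
                query == null ? path : path + "?" + query);
    }
}
```

The host and port select the outgoing connection, and `pathAndQuery` is what would go into the request line sent to the origin server.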

kim young ill

May 7, 2014, 7:13:02 PM
to ve...@googlegroups.com

I also had a look at this and asked a question in the forum about the design of the HttpClient, but there was no answer.
The way it works, you need to create a client instance for each request (or each host), because it is bound to a host/port. That seems inefficient for a non-reverse proxy.
There are some HTTP clients (Apache or AsyncHttpClient) which might match the requirements better.

cheers


Ranjit Kumar

May 7, 2014, 10:17:13 PM
to ve...@googlegroups.com
Yes, that is what I thought. But the part I am stuck at is "implementing the absolute URI connection". I tried dumping the HTML stored at the URI to a file and then returning that file. It works as far as css/js/png/jpg files are concerned, but a page is not just these things. A lot of pages call APIs on other sites, plus there are AJAX calls etc. So I was wondering if there is a kind of catch-all solution that will eliminate the need to deal with everything individually.

Thanks

Tim Fox

May 8, 2014, 6:42:09 AM
to ve...@googlegroups.com
On 08/05/14 00:13, kim young ill wrote:

I also had a look at this and asked a question in the forum about the design of the HttpClient, but there was no answer.
The way it works, you need to create a client instance for each request (or each host), because it is bound to a host/port. That seems inefficient for a non-reverse proxy.

It's not particularly inefficient: you can just maintain a Map of HttpClient instances, one for each host/port. In Vert.x 3.0 there is a task to make a single HttpClient work for all hosts/ports.
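A minimal sketch of such a map, written in plain Java so that it does not depend on any particular Vert.x version. The type parameter `C` stands in for the client type, and `ClientCache` and its `factory` argument are names invented for this example; the factory is where a real client would be created and bound to the given host and port.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

// Hypothetical helper: keeps one client per host:port pair and
// creates it lazily the first time that pair is requested.
final class ClientCache<C> {
    private final Map<String, C> clients = new ConcurrentHashMap<>();
    private final BiFunction<String, Integer, C> factory;

    ClientCache(BiFunction<String, Integer, C> factory) {
        this.factory = factory;
    }

    C clientFor(String host, int port) {
        // Host names are case-insensitive, so normalise the key.
        String key = host.toLowerCase(Locale.ROOT) + ":" + port;
        return clients.computeIfAbsent(key, k -> factory.apply(host, port));
    }

    int size() {
        return clients.size();
    }
}
```

A real proxy would probably also want to evict and close clients that have been idle for a while, which this sketch leaves out.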

Tim Fox

May 8, 2014, 6:44:09 AM
to ve...@googlegroups.com
On 08/05/14 03:17, Ranjit Kumar wrote:
Yes, that is what I thought. But the part I am stuck at is "implementing the absolute URI connection". I tried dumping the HTML stored at the URI to a file and then returning that file. It works as far as css/js/png/jpg files are concerned, but a page is not just these things. A lot of pages call APIs on other sites, plus there are AJAX calls etc.

These are just requests/responses. If you write a proxy that passes through all requests and passes back all responses it will work in all these cases. You don't need to dump anything to a file.

Ivano Pagano

May 8, 2014, 10:20:38 AM
to ve...@googlegroups.com
Unless you want to do caching?

Tim Fox

May 8, 2014, 10:23:12 AM
to ve...@googlegroups.com
Caching does not imply you need to dump things to file.

Arno Schulz

May 8, 2014, 10:25:47 AM
to ve...@googlegroups.com
Would that work with web-sockets as well?

I was trying to get web-sockets to go through a reverse proxy and opening/communication worked fine but when either the server or client disconnected it didn't go across and just stopped at the proxy.


Tim Fox

May 8, 2014, 10:27:50 AM
to ve...@googlegroups.com

On 08/05/14 15:25, Arno Schulz wrote:
Would that work with web-sockets as well?

I was trying to get web-sockets to go through a reverse proxy and opening/communication worked fine but when either the server or client disconnected it didn't go across and just stopped at the proxy.

Sounds like a bug in the proxy implementation.

Alexander Lehmann

May 9, 2014, 7:22:43 PM
to ve...@googlegroups.com
I haven't taken a look at websockets with Vert.x yet, but I think websockets do not really work with a proxy. When a proxy is configured, the websocket client will try to use the CONNECT method to the server port and then speak the websocket protocol over the tunnel, but this isn't proper HTTP or HTTPS; it's the websocket protocol.

A reverse proxy will also have to support some kind of tunnel to the app server; this isn't really an HTTP proxy.

(A comment about the caching question: you have to take into account that a request may take a long time if a large file is requested or the connection is slow, so copying the content to a file and then serving the file after it is finished creates a long delay before the download starts. For example, assume the requested URL is an MP3 that is shaped to the bitrate of the file; the response would only start after a download lasting the whole running time has finished. If you want to store the content in a file, you need some kind of streaming that returns the data to the client while writing the file as well. And if you want to cache, you have to match different requests for the same URL, serve the beginning of the file you already have on disk, and then continue with the part that is still in the process of downloading.)
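The "return the data to the client while writing the file" idea can be sketched as follows. This is a simplified illustration with blocking JDK streams, not Vert.x code (in Vert.x the same copying would happen inside an asynchronous data handler); `TeeRelay` and its parameter names are made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: forwards each chunk to the client as soon as
// it arrives and writes the same chunk to a cache destination.
final class TeeRelay {
    // Returns the number of bytes relayed.
    static long relay(InputStream upstream, OutputStream client, OutputStream cache) {
        byte[] chunk = new byte[8192];
        long total = 0;
        try {
            int n;
            while ((n = upstream.read(chunk)) != -1) {
                client.write(chunk, 0, n);
                client.flush();            // do not hold data back from the client
                cache.write(chunk, 0, n);
                total += n;
            }
            cache.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

This only covers the streaming part; serving a second request for the same URL from a partially written file, as described above, needs additional bookkeeping.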

Alexander Lehmann

May 9, 2014, 7:41:01 PM
to ve...@googlegroups.com, khi...@googlemail.com
I wonder if this is a case of premature optimization. Most of the time spent in a proxy is network I/O; even if an HttpClient is expensive to create, most of the time will still go into the network handling, and that is done pretty well in Vert.x since it is asynchronous, including DNS. You need an HttpClient for each parallel request anyway, so you will need a pool with enough instances to handle the requests.

If you need a real proxy, the JVM may not be the adequate choice; that's what Squid is for.

Alexander Lehmann

May 9, 2014, 7:49:41 PM
to ve...@googlegroups.com
You do not need any different handling for API calls, AJAX etc.; you will get basically everything done with the proxy example from the examples project.

Alexander Lehmann

May 9, 2014, 7:54:50 PM
to ve...@googlegroups.com, khi...@googlemail.com
Come to think of it, if you want to handle proxy requests efficiently, you need to pool the outgoing connections so that keep-alive on that end is handled independently of the keep-alive connections from the clients. I'm not sure how the Vert.x HttpClient handles keep-alive; if the client is already connected, keeping the instances around for a few seconds might be a good idea.



Arno Schulz

May 9, 2014, 9:17:14 PM
to ve...@googlegroups.com
No, websocket proxying kind of works (see the other thread), and there are other implementations (nginx), but each has its limitations.

kim young ill

May 10, 2014, 8:11:11 AM
to ve...@googlegroups.com
A client instance which works with all hosts/ports would be much better, and also simpler when you need to maintain a connection pool. Maintaining a map of host:port to client is more or less a workaround.

cheers

kim young ill

May 10, 2014, 8:16:46 AM
to ve...@googlegroups.com
Normally, when the browser is configured to use a proxy, the HTTP command sent to your proxy server will contain the full host:port:url (absolute URI). It doesn't matter whether it is a plain GET from the browser's URL bar or an AJAX call. With HTTP/1.1 there is also the Host header, which can be used.

If the request line starts with / (relative), it means the request is really intended for YOUR server, not to be proxied.
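The distinction can be sketched in a few lines of plain Java. The class `RequestRouting` and its method names are invented for this example, and the request target is assumed to be available as a string exactly as it appeared in the request line; a malformed target would make `URI.create` throw an `IllegalArgumentException`.

```java
import java.net.URI;

// Hypothetical helper: decides whether a request is meant to be
// proxied and which host:port it is addressed to.
final class RequestRouting {
    // True when the request line carries an absolute URI with a host.
    static boolean isProxyRequest(String requestTarget) {
        if (requestTarget.startsWith("/")) {
            return false; // relative target: addressed to this server
        }
        URI uri = URI.create(requestTarget);
        return uri.isAbsolute() && uri.getHost() != null;
    }

    // Authority (host, optionally with port) from the absolute URI if
    // there is one, otherwise whatever the Host header said (may be null).
    static String effectiveAuthority(String requestTarget, String hostHeader) {
        if (isProxyRequest(requestTarget)) {
            return URI.create(requestTarget).getRawAuthority();
        }
        return hostHeader;
    }
}
```

With this split, a handler could serve relative targets itself and pass everything else on to the origin server.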

cheers
 