Remote library HTTP Headers

CLAYTON

Mar 18, 2016, 8:11:21 AM
to robotframework-users
Hiya (Pekka)!

We have discussed this before, so I'm sorry to bring it up again.
However, what I'd like to agree on is a way for the HTTP calls made to Remote library servers to differentiate the source of the request when executing tests in parallel (with Pabot).

Here is the scenario:
- 1 Remote keyword server running (NRobotRemote is multi-threaded, so only 1 instance is needed when using Pabot)
- Pabot launches many RF instances, each using the keyword server
- A keyword such as "Close all Applications" is called

The problem here is that the keyword server cannot differentiate the RF instances, and therefore cannot keep a session on its side.

If RF were to add an HTTP header with its process id to the HTTP requests, it would help.
If that's not possible, is there a listener interface so that Remote library calls can be intercepted and modified?

Thanks.
Clayton






Pekka Klärck

Mar 18, 2016, 9:19:35 AM
to Clayton Neal, robotframework-users
Hello,

Could you clarify why tests using special keywords to identify
themselves wouldn't work? Additionally, you should be able to use the
IP address and port at the network level to separate different
connections even without special headers.

I'm not totally against the idea of adding special headers to the
remote library interface, but someone needs to come up with a
specification for how it ought to work in detail. The solution should
also be generic, and existing remote servers cannot be mandated to
support it.

Cheers,
.peke



--
Agile Tester/Developer/Consultant :: http://eliga.fi
Lead Developer of Robot Framework :: http://robotframework.org

Kevin O.

Mar 18, 2016, 11:12:21 AM
to robotframework-users, clayto...@yahoo.co.uk
In order to use the port as part of an identifier, the port must not change, but I do not think we can count on that.
The xmlrpclib client uses persistent connections, but the server may cause a connection to close, and a new one on a different ephemeral port would then be created. The connection could also go idle and be closed. The client would then appear to be new even though it is not. Jython 2.5.x used HTTP/1.0 (no persistent connections), but thankfully we don't support it anymore.

Another solution would be to have RF start up a remote server that would only be used by that instance of RF. This is what I had in mind when I allowed jrobotremoteserver to use a random/ephemeral port.

I also had thought of other solutions long ago. At the time, my thought was to have a way for a remote library to spawn instances of other remote libraries. The new instance would be used solely in the context that created the instance and have a unique path on the remote server (much like Selenium). I didn't put functionality inside jrobotremoteserver for this specifically. It wouldn't be hard to create a remote Java library like that. jrobotremoteserver allows libraries to be added on the fly.
Yet another alternative solution would be to have a remote library that spawns new remote servers.

But there are probably scenarios where multiple instances of a remote library will not work due to resource constraints, etc. In that case, Clayton's proposed solution seems like a way forward.

Clayton,
If you want to do a custom solution, I came up with a way to send custom headers, but it is quite hacky. I tested it in RF 2.9.2:


from robot.libraries.Remote import XmlRpcRemoteClient, TimeoutTransport, Remote
from robot.utils import timestr_to_secs
import xmlrpclib
import uuid


# Use in place of the standard Remote library import, e.g.
# "Library    CustomRemote    http://server:8270" if saved as CustomRemote.py.
class CustomRemote(Remote):

    def __init__(self, uri='http://127.0.0.1:8270', timeout=None):
        if '://' not in uri:
            uri = 'http://' + uri
        if timeout:
            timeout = timestr_to_secs(timeout)
        self._uri = uri
        self._client = CustomXmlRpcRemoteClient(uri, timeout)


class CustomXmlRpcRemoteClient(XmlRpcRemoteClient):

    def __init__(self, uri, timeout=None):
        transport = CustomTimeoutTransport(timeout=timeout)
        transport.set_proxy('localhost:8888')
        self._server = xmlrpclib.ServerProxy(uri, encoding='UTF-8',
                                             transport=transport)


class CustomTimeoutTransport(TimeoutTransport):

    def __init__(self, use_datetime=0, timeout=None):
        # One UUID per transport, i.e. per imported Remote library instance.
        self._id = uuid.uuid1()
        TimeoutTransport.__init__(self, use_datetime, timeout)

    def send_request(self, connection, handler, request_body):
        connection.putrequest("POST", handler)
        # Extra header the remote server can use to tell clients apart.
        connection.putheader("Robot-Identifier", str(self._id))

Kevin O.

Mar 18, 2016, 11:13:44 AM
to robotframework-users, clayto...@yahoo.co.uk
Oops, I need to remove that set_proxy call. I had that in there for testing.

CLAYTON

Mar 18, 2016, 11:46:44 AM
to robotframework-users, clayto...@yahoo.co.uk
Hiya,

At the HTTP level the remote server just gets an HTTP request from an IP address; there is nothing else to identify the context.
The setup I use is:
- Jenkins launching Pabot
- Pabot spawns multiple Robot Framework instances
- Each Robot instance executes tests that call remote keywords
- (One) remote server executing the keywords
So essentially I have 1 remote server and many RF instances (clients) calling it, all from the same IP address (the Jenkins machine).

The remote server here is NRobotRemote, which is designed to be multithreaded, so it can have one or more Robot Framework instances calling it.
This way, when I execute tests via Pabot, I don't need to spin up more keyword servers... each keyword in NRobotRemote is executed on its own thread!
Not all remote keyword servers are like that AFAIK, and so they don't really support Pabot with one instance.

I cannot speak for all remote servers, but I'm guessing most are not looking at HTTP headers.
However, the HTTP headers I would suggest are:

NAME      VALUE
Instance  RF process id
Test      Name of the current test case
Suite     Name of the current test suite

These would be turned on by a command line option to Robot Framework (off by default), e.g. pybot --remoteheaders.
When the remote server gets these headers, it can use them to build a "session" within the keyword server.

Why do I need to build a "session" inside the keyword server?
Well, I keep track of all browsers/applications opened in the test, so that I can have a keyword such as "Close All Apps" in my teardown.
I can only "close all apps" within a test context, otherwise I'm killing apps used by tests in other Robot Framework instances.
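To make that concrete (NRobotRemote is C#, but the book-keeping is roughly this; a Python sketch with made-up names, assuming the server can read the proposed Instance header on each request):

from collections import defaultdict

class SessionStore(object):

    def __init__(self):
        # One bucket of open applications per calling RF instance,
        # keyed by the value of the proposed "Instance" header.
        self._apps = defaultdict(list)

    def register_app(self, instance_id, app):
        self._apps[instance_id].append(app)

    def close_all_apps(self, instance_id):
        # Tears down only what *this* RF instance opened, so parallel
        # Pabot processes don't kill each other's applications.
        for app in self._apps.pop(instance_id, []):
            app.close()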

Hope it helps!
Clayton

David

Mar 21, 2016, 12:39:46 PM
to robotframework-users, clayto...@yahoo.co.uk
I like Clayton's proposal. 

The session tracking reminds me of Selenium's session tracking, which is implicit between the client (driver language binding) and the (Selenium) server, but which can be exposed to do additional things in a Selenium Grid deployment (on the client side, so the client knows which Selenium Grid node it is being executed on and can do things targeted at that node).

Clayton, 

In the case of enabling the headers though, would test and suite headers be necessary? I would think those would be optional headers, and the minimal required header would be the "session ID", which in the case of Pabot would be the process ID. By convention, as the purpose of the header is to distinguish a session being tracked, I think naming it "session" or "session ID" makes more sense than "instance" or "process ID".

And on this topic, should the session identifier be generated and sent by the remote server or by the RF client (e.g. Pabot)? If by the client, the client sends it on the initial request and on every subsequent request to the remote server, similar to setting a content-type header when making HTTP POST requests with JSON data, or sending a session cookie back as a header. If by the server, like Selenium does it, the ID could be generated and sent to the client on the first request from an unrecognized client (one that calls the standard get keyword names, etc. when the remote library is first loaded/imported by RF), similar to how a web server sends a session cookie header to a new client on the first request only; thereafter the web/remote server expects the client (RF/Pabot) to send the session header back to maintain the session tracking state.

David

Mar 21, 2016, 1:22:55 PM
to robotframework-users, clayto...@yahoo.co.uk
IMHO, the implementation should have the remote server create the session ID and send it to the client as a response header, as that follows convention: Selenium/JSONWireProtocol and web servers tracking session state via HTTP cookies do it that way.

And I think this headers feature could be done without breaking backward compatibility with unsupported/older remote servers, but that should be tested and confirmed. The approach here would be that the remote server sends the session ID response header to RF on RF's initial requests/calls to get the list of keywords from the server. RF responds thereafter by sending the session ID back as a header in every keyword request to that server/library. RF may optionally send additional headers in the request, based on test configuration.

Unsupported and older versions of remote servers would simply never generate the session ID response header. And if they receive any HTTP request headers that are not part of the current remote library interface spec, they simply ignore them (I believe additional unrecognized HTTP headers do not cause the XML-RPC remote server to fail or crash, although that should be confirmed).

But for my suggested approach, it might be a bit confusing and complicated if a remote server were to "serve" multiple libraries and wanted to track the session per RF client and per library (on the remote server), in the case that a single RF client uses 2+ libraries on that remote server. But that could be an extension of the implementation, like how Selenium does it (you can launch multiple instances of Selenium on the server from the same client, and each gets its own session), and that could relate to Kevin's note about context. From the client and server perspective, though, this could work out fine if the single remote server serves the 2+ libraries on different resource paths (or different ports), so there are 2 imports of the server in RF (a different path/port for each library) and therefore 2 get keyword names calls, each causing the remote server to generate a different session for each library served. RF then tracks the session on its side per library (instance).
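A very rough sketch of the client half of that flow, reusing the RF 2.9.2 transport class from Kevin's example earlier in the thread (untested, and the Robot-Session-Id header name is just made up for illustration):

from robot.libraries.Remote import TimeoutTransport


class SessionTransport(TimeoutTransport):
    # Remembers the session id the server hands out and echoes it back
    # on every later request.
    session_id = None

    def send_request(self, connection, handler, request_body):
        TimeoutTransport.send_request(self, connection, handler, request_body)
        if self.session_id:
            connection.putheader('Robot-Session-Id', self.session_id)

    def parse_response(self, response):
        # Older servers never send the header, so session_id stays None
        # and nothing changes for them.
        if hasattr(response, 'getheader'):
            self.session_id = response.getheader('Robot-Session-Id',
                                                 self.session_id)
        return TimeoutTransport.parse_response(self, response)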

CLAYTON

Mar 21, 2016, 4:02:00 PM
to robotframework-users, clayto...@yahoo.co.uk
Hiya David

Personally I think the simplest solution is that Robot just adds its process id as an HTTP header.
True, I agree that's not really client/server, but instead think of it as Robot broadcasting its identifier (its process id), rather than expecting one in the first response.
Yep, I agree test name and suite name are optional extras (a Pabot-launched Robot process will only be running 1 suite, and 1 test, at a time!) - however I think they are useful to have.

About multiple libraries, again perhaps NRobotRemote is different from other keyword servers.
NRobotRemote can "host" multiple libraries on the same port, and uses URL-based routing to the actual library class/method (the same as a web service on port 80 routing URL requests to code).
AFAIK other remote servers expose each library on a different port (i.e. as a new service).
So for NRobotRemote the session will be across all hosted libraries.

The other reason I want this is that a keyword can need to cache an expensive resource (e.g. a database connection, an XML file in memory, etc.).
In this case I'd prefer to code the keyword class implementing IDisposable (https://msdn.microsoft.com/en-us/library/system.idisposable%28v=vs.110%29.aspx).
In NRobotRemote I can then add clean-up cycles (e.g. if no request has been received from a client in a certain time, release the expensive resources!).
This can happen if people implementing the automated tests forget to call certain keywords in their teardown - eventually the keyword server is holding resources that will never be used.
This will allow C# keyword library implementors to have resource caching... and eventual garbage collection.

Cheers.
Clayton

Pekka Klärck

Mar 21, 2016, 4:33:44 PM
to Clayton Neal, robotframework-users
Hello,

Based on the discussion, I think sending additional headers that
indicate the Robot pid, and perhaps also version info, would be a good
idea. If we used something like X-ROBOT-PID and X-ROBOT-VERSION, I
think we could send them always. Existing remote servers ought to
ignore such headers, and servers interested in that info could read
them. Is anyone interested in prototyping this?
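Conceptually it should just mean a couple of extra putheader calls in the transport, something like this (untested sketch reusing the RF 2.9.2 internals from Kevin's example above):

import os

from robot import __version__ as robot_version
from robot.libraries.Remote import TimeoutTransport


class RobotInfoTransport(TimeoutTransport):

    def send_request(self, connection, handler, request_body):
        TimeoutTransport.send_request(self, connection, handler, request_body)
        # Purely informational; servers that don't care simply ignore these.
        connection.putheader('X-ROBOT-PID', str(os.getpid()))
        connection.putheader('X-ROBOT-VERSION', robot_version)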

Cheers,
.peke

CLAYTON

Mar 21, 2016, 5:36:04 PM
to robotframework-users, clayto...@yahoo.co.uk
Hello.

That would be perfect for me!
Unfortunately I'm not a Python programmer :-(
(I work in C#, always with remote libraries.)
I could give it a go, but it might take me some time :-)
Would it be OK to add a GitHub issue?

Cheers guys.

Kevin O.

Mar 22, 2016, 1:10:27 AM
to robotframework-users, clayto...@yahoo.co.uk
I think any change should be forward-thinking. I have not seen a request for allowing a remote library to act as a listener, but it is something to consider when making a change. That would probably involve Remote asking the remote server something, which it currently does not do.

Pekka,
With local libraries, they communicate the version of the library API they support. Having RF tell the library its version seems to be the opposite. What is the reasoning for this?

I do not see a problem with the client generating an ID. Using a UUID will avoid collisions. One reason to have the server produce an ID is simply that it allows the server to respond saying, essentially: no, I cannot create a new session. If we are not requiring the remote server to support the new functionality, then having the server generate the ID seems unnecessary.

To communicate the information to the remote library, I think the only rational approach is to have the remote server call a special method if it exists - like a listener method named set_client_id or similar. This way the implementation is much like run_keyword, etc. and agnostic to the remote server implementation. set_client_id would store the ID in the current context. If the author of the remote library truly depends on having a client ID, they can do a null check and fail with a friendly error inside run_keyword.
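A rough sketch of what I mean from the remote server's side (the server class and header name here are made up; the point is only the optional-hook dispatch):

class RemoteServerCore(object):
    # Hypothetical server internals, not jrobotremoteserver or any real server.

    def __init__(self, library):
        self._library = library

    def handle_run_keyword(self, request_headers, name, args):
        client_id = request_headers.get('Robot-Identifier')  # illustrative name
        set_client_id = getattr(self._library, 'set_client_id', None)
        if client_id is not None and callable(set_client_id):
            # Only libraries that implement the hook ever see the ID;
            # everything else keeps working unchanged.
            set_client_id(client_id)
        return self._library.run_keyword(name, args)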

I sure hope extra custom headers will not cause issues in any server - that would be surprising.

The prefix of X- for custom headers used to be recommended, but it is now deprecated/discouraged - see http://tools.ietf.org/html/rfc6648.

Cheers,
Kevin

Kevin O.

Mar 22, 2016, 1:30:16 AM
to robotframework-users, clayto...@yahoo.co.uk
Some corrections to my post...
Libraries do not communicate the versions they support, but listeners do via ROBOT_LISTENER_API_VERSION. And other information, like ROBOT_LIBRARY_SCOPE, is communicated in a pull fashion.

And my bit about storing the ID in set_client_id could also apply to static API libraries, even though I mentioned run_keyword. A static API library could still have a set_client_id method and set the library context, although there is a risk of name collision with a name like set_client_id.

CLAYTON

Mar 22, 2016, 5:32:41 AM
to robotframework-users, clayto...@yahoo.co.uk
Hiya,

OK, let's sum up the options:

OPTION 1:
Robot just sends HTTP headers for its process id and version.
It does not process response HTTP headers (on any subsequent requests).

OPTION 2:
The XML-RPC interface is changed, and a new method is added (set_client_id).
Robot will call this method on the first request to the remote server.
The remote server will generate a GUID (or something else unique!) and send it back in the response to this method.
Robot will then use that GUID as an HTTP header in subsequent requests.

OPTION 3:
Robot sends HTTP headers for its process id and version.
The remote server will check if there is an HTTP header for a session id; if there isn't, it will add such a header to the response.
Robot will check any response for a session id HTTP header, and will then use it on any subsequent request.

In terms of preferred options:
1 - easy to implement, but inflexible
2 - IMHO won't work; the remote server still needs the Robot process id so it can generate unique session ids
3 - is my preferred option

Let me know your thoughts!

Cheers.

Pekka Klärck

Mar 23, 2016, 6:03:54 AM
to ombre42, robotframework-users, Clayton Neal
2016-03-22 7:10 GMT+02:00 Kevin O. <korm...@gmail.com>:
>
> Pekka,
> With local libraries, they communicate the version of the library API they
> support. Having RF tell the library its version seems to be the
> opposite. What is the reasoning for this?

I don't know whether Robot's version would be that interesting, but I
don't think having it in the headers would hurt either. Another
possibility would be changing the User-Agent accordingly. Notice also
that local libs can do `from robot import __version__ as robot_version`
if they need to.

> The prefix of X- for custom headers used to be recommended, but it is now
> depracated/discouraged - see http://tools.ietf.org/html/rfc6648.

If I've understood this correctly, the deprecation of X- headers only
affects headers that are expected to become standard: first introducing
X-Some-Header and later renaming it to Some-Header makes no sense.
Using the X- prefix with private headers is still OK in my opinion.

Cheers,
.peke

Kevin O.

Mar 25, 2016, 1:00:59 AM
to robotframework-users, korm...@gmail.com, clayto...@yahoo.co.uk
Thanks Pekka. That all makes sense and I agree with your assessments.

Clayton,
I think options 1 and 3 are fine, although I would like to know what the advantages of 3 are over option 1, as it is more complicated. Uniqueness is not a problem, and neither is rejection, unless we want to support local code being able to enforce the support of session identifiers.
I do not like option 2. The set_client_id part I mentioned was about how the remote server communicates with the library, not about communication between Remote and the server.

To all:
If we are going to introduce the concept of a session, I think users will want to have some control over the life cycle of the session. I thought maybe setting the scope dynamically as an argument to Remote would provide an easy way for the local code to control the session duration. Unfortunately, the RF runtime code looks at the scope of the class and not at the instance, so changing the library scope through a parameter to Remote's __init__ would not work.
Am I overthinking it?
A fairly easy change would be to have Remote.TestScoped and Remote.GlobalScoped libraries.
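Roughly, those would just be subclasses that pin the class-level scope (sketch only; the names are made up and nothing like this exists in RF today):

from robot.libraries.Remote import Remote


class GlobalScopedRemote(Remote):
    # One instance, and hence one session, for the whole run.
    ROBOT_LIBRARY_SCOPE = 'GLOBAL'


class TestScopedRemote(Remote):
    # A fresh instance, and hence a fresh session, per test case.
    ROBOT_LIBRARY_SCOPE = 'TEST CASE'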
Should there be a way for local code to retrieve the session ID and re-use it when creating another instance of Remote?
Thoughts?

Kevin