Tilde is removed from string that is sent via signal object


Peter Ruijter

Jun 22, 2018, 5:16:18 PM
to divolte-collector
Hi, 

We're facing a problem when we send a custom event via divolte.signal with a string containing a tilde (~) within an object member. The tilde is stripped from the string on its way to the Avro file, while this does not happen when a location containing a tilde is sent via a pageview event. Possibly this is because the tilde is not escaped in the query string, unlike other characters such as "/".

Below is an example call:
divolte.signal("linkClick", {linkPath: "/link-to-an-article~12345/"});

This results in a GET call:
https://mydomain.com/divolte/csc-event?p=0%3Ajipyz2at%3Ablc28fEZwZPiMAZaXmmDrJecckwj3ksO&s=0%3Ajiqff8a6%3AwH9r~BxmY2mRxHji4gGarDgiI54sP8Eg&v=0%3A8WPTdCCyh3uHAA_c5sczc3H0nGuK9jRo&e=0%3A8WPTdCCyh3uHAA_c5sczc3H0nGuK9jRo1&c=jiqffa4b&n=f&f=f&l=https%3A%2F%2Fwww.mydomain.com%2F&i=1hc&j=vt&k=2&w=1fr&h=mb&t=linkClick&u=(slinkPath!%2Flink-to-an-article~12345%2F!)&x=fbmljs
Inside the query string that the signal call produces, the "u" parameter still contains the tilde, as well as the "-" characters.
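
To double-check that the tilde is still present on the client side, the "u" parameter can be decoded from the request URL. The following is a minimal sketch using the standard `URL` API (Node.js or a browser console); the URL is shortened to a few of the parameters from the request above, and the variable names are only for illustration:

```javascript
// Shortened version of the request above; only t, u and x are kept.
const requestUrl =
  "https://mydomain.com/divolte/csc-event?t=linkClick" +
  "&u=(slinkPath!%2Flink-to-an-article~12345%2F!)&x=fbmljs";

// searchParams.get() percent-decodes the value of the parameter.
const u = new URL(requestUrl).searchParams.get("u");
console.log(u);               // (slinkPath!/link-to-an-article~12345/!)
console.log(u.includes("~")); // true

// encodeURIComponent escapes "/" but leaves "~" and "-" untouched,
// which matches what is visible in the raw query string.
const encodedPath = encodeURIComponent("/link-to-an-article~12345/");
console.log(encodedPath);     // %2Flink-to-an-article~12345%2F
```

This only shows what the browser sends; it says nothing about how Divolte parses the parameter on the server.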

In the Groovy mapping file this field is mapped as follows: map eventParameters().value('linkPath') onto 'linkPath'

Inside the *.avsc file that defines the fields of the Avro file, this field is defined as follows: { "name": "linkPath", "type": ["null", "string"], "default": null }

Inside the Avro file the string is written as follows:

  "linkPath" : {
    "string" : "/link-to-an-article12345/"
  }
without the tilde, but with the slashes and dashes. 

Somewhere on its way to this Avro file the tilde is removed from the string. We can escape the tilde character ourselves, but then we have to unescape it again on the server side, because Divolte doesn't do this:
  "linkPath" : {
    "string" : "link-to-an-article%7E12345/"
  }
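
A rough sketch of that workaround, for illustration only. The helper names `escapeTilde` and `unescapeTilde` are hypothetical and not part of Divolte. Note that `encodeURIComponent` cannot be used for the escaping step, because it leaves "~" untouched. The unescape step is shown in JavaScript here, but in practice it would have to run wherever the Avro data is processed:

```javascript
// Hypothetical helpers; not part of the Divolte API.

// Escape "%" first so that the round trip is unambiguous, then the tilde.
function escapeTilde(value) {
  return value.replace(/%/g, "%25").replace(/~/g, "%7E");
}

// Reverse of escapeTilde: restore tildes first, then literal "%" signs.
function unescapeTilde(value) {
  return value.replace(/%7E/g, "~").replace(/%25/g, "%");
}

const originalPath = "/link-to-an-article~12345/";
const escapedPath = escapeTilde(originalPath);
console.log(escapedPath);                // /link-to-an-article%7E12345/
console.log(unescapeTilde(escapedPath)); // /link-to-an-article~12345/

// In the browser, where the Divolte tag has defined the global `divolte`:
if (typeof divolte !== "undefined") {
  divolte.signal("linkClick", { linkPath: escapedPath });
}
```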

I would be very happy if Divolte could escape and unescape this tilde, or just keep it in the string. Then we could process and aggregate the data right away, without building a separate process for this.

Is this something you can fix in Divolte, or is there an easier solution?

Kind regards, 

Peter

andre...@godatadriven.com

Jun 23, 2018, 3:21:09 AM
to divolte-collector
Hi Peter,

Thanks for the clear explanation of what you're seeing and the steps to reproduce.

This is a bug on the server side, and is probably indirectly related to the fact that, for a while, a tilde (~) was not allowed in URLs unless it was escaped. Modern standards allow it, but URL validation and processing have a remarkably chequered history.

I've created an issue to track fixing this.

Cheers,

 - Andrew