On Thursday, April 18, 2013 3:48:37 PM UTC-4,
daniel...@elo7.com wrote:
It calls my web service, and the only hint it sends me is an HTTP parameter telling which encoding it uses for each of the other parameters.
For example:
it sends me the "subject" and "text" parameters;
then, in the "charsets" parameter, it tells me the encoding of the other ones.
[charsets] => {"subject":"UTF-8","text":"iso-8859-1"}
Is there any elegant way to "hack" Play, making it parse the parameters according to the charsets entries?
I don't understand how one parameter can describe the character set of another parameter--the "charsets" parameter needs to be decoded first, but how can you read it when it doesn't have a character set?
Are these requests double-encoded? multipart-encoded?
Could you post an example request, e.g., as text and as output of "xxd"?
I know this is not HTTP standard, but it is a requirement for my application.
There are a few approaches I'd ponder, and they depend on what Sendgrid is doing. You'll have to understand that before you proceed.
Single encoding
If you know each message will come with every parameter in the same encoding, but that's not the encoding Play is using, you can override Play's way of determining the character set. Create an app/Global.scala that looks something like this:
import play.api.GlobalSettings
import play.api.mvc.{Handler,RequestHeader}

object Global extends GlobalSettings {
  override def onRouteRequest(request: RequestHeader): Option[Handler] = {
    // Wrap the incoming request so Play sees a different character set
    val requestWithCharset = new RequestHeader {
      // Like RequestHeader.copy(), but overriding charset.
      // Every abstract member of RequestHeader must be forwarded; the exact
      // list (e.g. id, method) depends on your Play version.
      val id = request.id
      val tags = request.tags
      val uri = request.uri
      val path = request.path
      val method = request.method
      val version = request.version
      val queryString = request.queryString
      val headers = request.headers
      val remoteAddress = request.remoteAddress
      override lazy val charset : Option[String] = Some("iso-8859-1")
    }
    super.onRouteRequest(requestWithCharset)
  }
}
This method is called for every request. You can add some conditions so the override only applies to the routes that matter to you. Adjust the "lazy val charset" to the character set you want.
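One way to keep the override away from unrelated routes is to decide per path. This is only a sketch: the object name CharsetOverrides, the "/sendgrid/" prefix and the charset are all made up for illustration, so substitute whatever your routes file actually uses.

```scala
object CharsetOverrides {
  // Hypothetical route prefix -> charset to force. Replace with your own routes.
  private val overrides: Map[String, String] =
    Map("/sendgrid/" -> "iso-8859-1")

  /** The charset to force for `path`, or None to leave the request untouched. */
  def forPath(path: String): Option[String] =
    overrides.collectFirst {
      case (prefix, charset) if path.startsWith(prefix) => charset
    }
}
```

In onRouteRequest you would then match on CharsetOverrides.forPath(request.path): wrap the request when it returns Some(charset), and pass the original request straight to super.onRouteRequest when it returns None.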
Multiple encodings
As I said, I don't quite understand how the message is encoded.
It's possible the Sendgrid developers don't, either :). If that's the case, you'll have to follow *their* logic, whatever that is. The brute-force way is to parse the request as a big Array[Byte] instead of as a String. In your controller:
val MaxMemory = 1 * 1024 * 1024 // 1MB
def post = Action(parse.raw(MaxMemory)) { request =>
  // asBytes() gives None when the body is larger than MaxMemory
  val bytes : Array[Byte] = request.body.asBytes().getOrElse(throw new Exception("Need to handle case when memory exceeded? Use request.body.asFile"))
  // parse the bytes yourself, then return a Result
  Ok
}
You won't get Java's or Scala's String libraries, or any other goodies to help you.
You'll want to first split the bytes into a Map[Array[Byte],Array[Byte]] by splitting on "&" and "=" (if it's URL-encoded). If it is URL-encoded, you'll also need to undo the percent-escapes ("%XX" and "+") while you are still working with bytes. Then you'll want to call new String() on each key and value using the appropriate encodings. Finally, you can pass the resulting Map[String,String] to Play's Forms.
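Here is a rough sketch of that byte-level parsing, assuming the body really is application/x-www-form-urlencoded and that the keys themselves are plain ASCII. RawFormParser and the charsetFor function are names I made up; how you build charsetFor (for instance by first reading the "charsets" parameter and parsing its JSON) is left out, and malformed escapes are simply passed through.

```scala
import java.io.ByteArrayOutputStream
import java.nio.charset.{Charset, StandardCharsets}

object RawFormParser {
  /** Split `bytes` on every occurrence of `sep`, keeping empty pieces. */
  def split(bytes: Array[Byte], sep: Byte): List[Array[Byte]] = {
    val i = bytes.indexOf(sep)
    if (i < 0) List(bytes)
    else bytes.take(i) :: split(bytes.drop(i + 1), sep)
  }

  /** Undo URL encoding on raw bytes: "+" becomes a space, "%XX" becomes one byte. */
  def percentDecode(bytes: Array[Byte]): Array[Byte] = {
    val out = new ByteArrayOutputStream(bytes.length)
    var i = 0
    while (i < bytes.length) {
      val b = bytes(i)
      if (b == '+'.toByte) {
        out.write(' '.toInt); i += 1
      } else if (b == '%'.toByte && i + 2 < bytes.length) {
        val hi = Character.digit(bytes(i + 1) & 0xff, 16)
        val lo = Character.digit(bytes(i + 2) & 0xff, 16)
        if (hi >= 0 && lo >= 0) { out.write(hi * 16 + lo); i += 3 }
        else { out.write(b.toInt); i += 1 } // not a valid escape: keep the "%"
      } else {
        out.write(b.toInt); i += 1
      }
    }
    out.toByteArray
  }

  /** Decode each value with the charset that `charsetFor` returns for its key. */
  def parse(body: Array[Byte], charsetFor: String => Charset): Map[String, String] =
    split(body, '&'.toByte).filter(_.nonEmpty).map { pair =>
      val eq = pair.indexOf('='.toByte)
      val (rawKey, rawValue) =
        if (eq < 0) (pair, Array.empty[Byte])
        else (pair.take(eq), pair.drop(eq + 1))
      val key = new String(percentDecode(rawKey), StandardCharsets.US_ASCII)
      key -> new String(percentDecode(rawValue), charsetFor(key))
    }.toMap
}
```

For example, parsing the bytes of subject=caf%C3%A9&text=caf%E9 with UTF-8 for "subject" and iso-8859-1 for "text" gives "café" for both values, even though the two were sent as different byte sequences.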
Dummy encoding?
This is a hack, but it might help....
You could define a "dummy" encoding--that is, make a Java String whose only purpose is to carry bytes. You could convert an Array[Byte] to and from this String losslessly. Think about it:
1. Play would "decode" from the dummy charset
2. Play would do all its processing with these Strings (it would understand the ASCII characters; the rest would be garbage)
3. You would "encode" back into the dummy charset, getting the original Array[Byte]
4. You would use the charsets you learned in step 2 (from the "charsets" parameter) to properly decode the bytes
This hack could make sense, under certain conditions. But those would be wonky conditions indeed.
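To make the round trip concrete, here is a small sketch. It assumes ISO-8859-1 is used as the "dummy" charset (my choice, since Java maps each of its 256 byte values to exactly one char and back); DummyCharset, wrap and redecode are made-up names.

```scala
import java.nio.charset.{Charset, StandardCharsets}

object DummyCharset {
  // Assumption: ISO-8859-1 is the carrier. Each byte 0x00-0xFF becomes the
  // char with the same numeric value, so nothing is lost in either direction.
  val Carrier: Charset = StandardCharsets.ISO_8859_1

  /** Steps 1-2: the String the framework would hand you after "decoding". */
  def wrap(bytes: Array[Byte]): String = new String(bytes, Carrier)

  /** Steps 3-4: recover the original bytes, then decode with the real charset. */
  def redecode(carried: String, realCharset: Charset): String =
    new String(carried.getBytes(Carrier), realCharset)
}
```

The carried String looks like garbage wherever the original had non-ASCII bytes, and that is expected. It only works as long as nothing between wrap and redecode alters the String's characters.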
Whatever approach you choose, be sure you understand what's happening. (Other web frameworks might make this easier, but they might not! You need to first understand what the heck Sendgrid is doing.) Always understand when you're dealing with bytes and when you're dealing with Strings, and always understand the encoding of every array of bytes you deal with. (Strings don't have an encoding--at least not one that could possibly be useful to you here.)
Enjoy life,
Adam