Strange behaviour with invalid PDF attachments in responses

Laurent Savaëte

Aug 21, 2021, 12:41:50 PM
to alavet...@googlegroups.com

Hi everyone,

We have a strange bug with replies from at least one authority in France that contain a PDF attachment. See https://madada.fr/demande/attribution_dun_marche_public_po#incoming-968

Specifically, their emails declare the wrong MIME type for the attachment. The excerpt below comes from the message linked above, downloaded via admin / incoming emails / download.

=============================================
-------------060005060401010509060003
Content-Type: text/html;      <----- THIS LOOKS WRONG FOR A PDF
 name="RAO Drones lot 1-occ.pdf"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="RAO Drones lot 1-occ.pdf"
=============================================

The attached PDF itself is valid (I can open it), but the sender's email client somehow mislabels it in the MIME headers.
Alaveteli then trusts that Content-Type header and serves the PDF as HTML, so the browser displays garbage.
As a side note, we have a censor rule that is applied to the result, which makes things even worse. Our regex has a bug, which I'll fix, but I don't think the rule should apply here at all, at least not the way it currently does.
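
To illustrate the mismatch, here is a minimal Python sketch (not Alaveteli code, which is Ruby) that flags attachments whose declared Content-Type disagrees with their actual bytes. It relies only on the standard `email` package and on the fact that PDF files start with `%PDF-`. The function name `find_mislabelled_pdfs` is made up for this example.

```python
from email import message_from_bytes, policy

# Every PDF file starts with this signature.
PDF_MAGIC = b"%PDF-"


def find_mislabelled_pdfs(raw_email: bytes) -> list:
    """Return (filename, declared_type) for parts that are PDFs by content
    but are declared as something else in their Content-Type header."""
    msg = message_from_bytes(raw_email, policy=policy.default)
    mismatches = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        # decode=True undoes the base64 / quoted-printable transfer encoding
        payload = part.get_payload(decode=True) or b""
        declared = part.get_content_type()
        if payload.startswith(PDF_MAGIC) and declared != "application/pdf":
            mismatches.append((part.get_filename(), declared))
    return mismatches
```

Run against the raw email excerpted above, this should report the attachment together with its declared type of `text/html`.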

Any suggestions about what to do about it? We will try to mention it to the ministry in question, but I suspect our chances of success are fairly low :/

Is there some sort of process to force Alaveteli to treat an attachment as a different format from the one it automatically guesses?

Thanks!
Laurent for team MaDada

Gareth Rees

Aug 23, 2021, 6:13:05 AM
to Alaveteli Dev
> Is there some sort of process to force alaveteli to consider an attachment as a different format than what it automatically guesses?

There isn't, I'm afraid. We recently had a similar issue with a CSV [1].

I guess you might be able to update the extracted content_type value in the foi_attachments table. We tend to avoid database edits though, because a re-parse of the message would overwrite the manual edit. It's too much to keep on top of.

> Any suggestions about what to do about it?

When edge cases like this happen, we tend to extract the attachment from the raw email, upload it somewhere (we have files.whatdotheyknow.com), and then post an annotation linking to the attachment [2].
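
As a rough illustration of the extraction step only, here is a small Python sketch using the standard `email` package. It assumes the raw email has already been downloaded from the admin interface. The function name `extract_attachments` and the output directory are hypothetical, and this is not how Alaveteli itself handles attachments.

```python
from email import message_from_bytes, policy
from pathlib import Path


def extract_attachments(raw_email: bytes, out_dir: str) -> list:
    """Write every attachment in the raw email to out_dir, ignoring the
    declared Content-Type, and return the paths that were written."""
    msg = message_from_bytes(raw_email, policy=policy.default)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, part in enumerate(msg.iter_attachments()):
        # Keep only the final path component of the sender-supplied filename
        name = Path(part.get_filename() or f"attachment-{index}").name
        target = out / name
        # decode=True undoes the base64 / quoted-printable transfer encoding
        target.write_bytes(part.get_payload(decode=True) or b"")
        saved.append(target)
    return saved
```

The files are written byte for byte under the filename the sender gave, so a mislabelled PDF comes out as a normal `.pdf` that can be uploaded and linked from an annotation.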

Best,

[2] https://www.whatdotheyknow.com/request/commentsobjections_to_applicatio#comment-94109