[Deluge] #3440: charset should be ignored for application/x-bittorrent

Deluge

Nov 26, 2020, 8:26:41 AM
to delug...@googlegroups.com
#3440: charset should be ignored for application/x-bittorrent
----------------------------+-------------------
Reporter: megaksa | Type: bug
Status: new | Priority: major
Milestone: needs verified | Component: Core
Version: master | Keywords:
----------------------------+-------------------
deluge version: 2.0.4.dev38 on Arch Linux.

I use the delugesiphon Chrome plugin to add new torrents to the server.
However, after switching to Arch, which ships the latest Deluge, the plugin
no longer works for rutracker.org. After investigating, I found the issue
is in Deluge itself, in httpdownloader. When a torrent download is
requested, rutracker responds with the header:
`Content-Type: application/x-bittorrent; charset=Windows-1251`

Although providing a charset for this content type doesn't make sense IMO,
I suggest not re-encoding to UTF-8 for anything other than 'text/...'
MIME types.
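
To illustrate why this matters, here is a minimal sketch (not Deluge's
actual code) of what a decode/re-encode round trip does to binary data. It
assumes the downloader decodes the body with the advertised charset and
re-encodes it as UTF-8. The helper name `reencode_to_utf8` and the sample
bytes are made up for the example.

```python
def reencode_to_utf8(body: bytes, charset: str) -> bytes:
    """Hypothetical helper: treat body as text in `charset` and convert it to UTF-8."""
    return body.decode(charset).encode("utf-8")


# Made-up sample: a bencoded 2-byte string whose payload is not ASCII.
raw = b"2:\xd0\x91"

# "cp1251" is Python's codec name for Windows-1251.
converted = reencode_to_utf8(raw, "cp1251")

# The bytes change and the declared bencode length no longer matches.
print(len(raw), len(converted))
print(converted == raw)
```

Pure ASCII bytes survive the round trip unchanged, so the damage only shows
up in binary payloads such as the piece hashes inside a torrent file.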

Attached is a suggested fix produced by
`diff /usr/lib/python3.8/site-packages/deluge/httpdownloader.py
/usr/lib/python3.8/site-packages/deluge/httpdownloader_fixed.py`

--
Ticket URL: <https://dev.deluge-torrent.org/ticket/3440>
Deluge <https://deluge-torrent.org/>
Deluge Project

Deluge

Nov 26, 2020, 8:27:38 AM
to delug...@googlegroups.com
#3440: charset should be ignored for application/x-bittorrent
----------------------+----------------------------
Reporter: megaksa | Owner:
Type: bug | Status: new
Priority: major | Milestone: needs verified
Component: Core | Version: master
Resolution: | Keywords:
----------------------+----------------------------
Changes (by megaksa):

* Attachment "charset_fix.diff" added.

Deluge

Feb 6, 2021, 6:59:47 AM
to delug...@googlegroups.com
#3440: charset should be ignored for application/x-bittorrent
----------------------+----------------------------
Reporter: megaksa | Owner:
Type: bug | Status: new
Priority: major | Milestone: needs verified
Component: Core | Version: master
Resolution: | Keywords:
----------------------+----------------------------

Comment (by Cas):

We need a bit more information about the exact problem. Are the torrent
downloads in UTF-8, and is decoding them with Windows-1251 corrupting the
data? What is the error you are encountering?

I am wary of changing the way httpdownloader works as it could have
unintended consequences.

I would propose not re-encoding application/x-bittorrent (it should be
UTF-8...), so in request_callback don't set the encoding if the
content type is application/x-bittorrent.

{{{
# Only honor the charset when the payload is not a torrent file.
if "application/x-bittorrent" not in content_type:
    encoding = charset
}}}

--
Ticket URL: <https://dev.deluge-torrent.org/ticket/3440#comment:1>

Deluge

Feb 6, 2021, 7:00:04 AM
to delug...@googlegroups.com
#3440: charset should be ignored for application/x-bittorrent
----------------------+--------------------
Reporter: megaksa | Owner:
Type: bug | Status: new
Priority: major | Milestone: 2.0.4
Component: Core | Version: master
Resolution: | Keywords:
----------------------+--------------------
Changes (by Cas):

* milestone: needs verified => 2.0.4


--
Ticket URL: <https://dev.deluge-torrent.org/ticket/3440#comment:2>

Deluge

Feb 7, 2021, 2:22:29 AM
to delug...@googlegroups.com
#3440: charset should be ignored for application/x-bittorrent
----------------------+--------------------
Reporter: megaksa | Owner:
Type: bug | Status: new
Priority: major | Milestone: 2.0.4
Component: Core | Version: master
Resolution: | Keywords:
----------------------+--------------------

Comment (by megaksa):

Correct. Example of torrent download headers:

{{{
Content-Type: application/x-bittorrent; charset=Windows-1251
Content-Disposition: attachment;
filename="[rutracker.org].t5778456.torrent";
filename*=UTF-8''%D0%91%D0%B8%D0%B1%D0%BB%D0%B8%D0%BE%D1%82%D0%B5%D0%BA%D0%B0%20%D0%9C%D1%83%D1%80%D0%B7%D0%B8%D0%BB%D0%BA%D0%B8%20-%20%D0%92%D0%B0%D1%80%D0%BC%D1%83%D0%B6%20%D0%92.%20-%20%D0%9C%D0%BE%D1%81%D1%82%D0%BE%D1%80%D0%B3%20%5B1930%2C%20PDF%2C%20RUS%5D%20%5Brutracker-5778456%5D.torrent
}}}
AFAIR charset handling is only defined in general for textual MIME types,
i.e. text/* (RFC 6657). For all other types, the meaning of charset is
defined by the documentation of the specific type. For binary types it may
indicate, for example, an internal encoding (such as the encoding of tags
inside the binary representation, particularly inside a torrent file, or
maybe inside an mp3 file). So, unlike textual files, binary files
generally cannot be re-encoded. I think the right approach is to skip the
re-encoding unless the content is textual. Your proposed solution is
therefore less correct, although it would also work in my particular case.
I'd go the other way around: re-encode only text/* rather than exclude
application/x-bittorrent. Which types besides text/* do you know of where
content re-encoding is needed?
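
As a rough sketch of the "re-encode only text/*" rule (an illustration
under my own assumptions, not the attached patch and not Deluge's code),
the decision could look like this. The function name
`charset_for_reencoding` is hypothetical. Header parsing is done with the
standard library's `email.message.Message`.

```python
from email.message import Message
from typing import Optional


def charset_for_reencoding(content_type: str) -> Optional[str]:
    """Return the charset to re-encode from, or None to leave the body untouched."""
    msg = Message()
    msg["Content-Type"] = content_type
    # Only textual types (text/*) are candidates for re-encoding.
    if msg.get_content_maintype() != "text":
        return None
    # Gives the charset parameter lowercased, or None when it is absent.
    return msg.get_content_charset()


print(charset_for_reencoding("application/x-bittorrent; charset=Windows-1251"))
print(charset_for_reencoding("text/html; charset=Windows-1251"))
```

With this rule the rutracker response is passed through as raw bytes, while
a text/html page that declares Windows-1251 would still be converted.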

--
Ticket URL: <https://dev.deluge-torrent.org/ticket/3440#comment:3>

Deluge

Feb 20, 2021, 4:22:46 PM
to delug...@googlegroups.com
#3440: charset should be ignored for application/x-bittorrent
----------------------+--------------------
Reporter: megaksa | Owner:
Type: bug | Status: closed
Priority: major | Milestone: 2.0.4
Component: Core | Version: master
Resolution: Fixed | Keywords:
----------------------+--------------------
Changes (by Cas):

* status: new => closed
* resolution: => Fixed


Comment:

Yeah, I see what you mean, and I agree we should only be re-encoding text
content types. I have modified your patch, added a test and merged it to
develop: [4d970754a4a]

Thanks for the detailed report and suggested fix!

--
Ticket URL: <https://dev.deluge-torrent.org/ticket/3440#comment:4>
