imap library does not like foreign languages for email subjects


kepp...@networkoptix.com

Mar 6, 2018, 3:53:43 PM
to robotframe...@googlegroups.com
I am testing our sign up, reset password, etc. emails in multiple languages.  Imaplibrary has served me well in English but I get 
UnicodeEncodeError: 'ascii' codec can't encode character
when I look for an email with a non-English subject.  Even the Spanish 'ñ' throws the error.  Is this an issue with the encoding in imaplibrary?  I had to mess with encoding to get foreign languages to work in my get_variables.py file but I'm hoping I don't need to modify imaplibrary itself and can just send it what it's looking for.

Example:
    ${email}    Wait For Email    recipient=${recipient}    subject=Активируйте учетную запись   timeout=120


Thanks
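
The error itself can be reproduced without an IMAP server. The sketch below assumes that something in the call chain encodes the subject with the `ascii` codec (the thread does not confirm where that happens); `is_ascii_encodable` is a hypothetical helper name used only for illustration.

```python
def is_ascii_encodable(text):
    """Return True if `text` can be encoded with the 'ascii' codec."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        # Same exception type as reported in the question.
        return False
    return True


# Only the plain English subject survives ASCII encoding.
for subject in ("Reset your password", "contraseña", "Активируйте учетную запись"):
    print(repr(subject), is_ascii_encodable(subject))
```

Encoding the same strings with `utf-8` instead succeeds, which is why the later replies focus on sending UTF-8 bytes to the server.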

Pekka Klärck

Mar 6, 2018, 4:50:51 PM
to kepp...@networkoptix.com, robotframework-users
Hi,

I have no experience with this library, but it could really be that
it's never been tested with non-ASCII data. Could you run tests with
`--loglevel debug` to see the traceback of the error?

Cheers,
.peke

> --
> You received this message because you are subscribed to the Google Groups
> "robotframework-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to robotframework-u...@googlegroups.com.
> To post to this group, send email to robotframe...@googlegroups.com.
> Visit this group at https://groups.google.com/group/robotframework-users.
> For more options, visit https://groups.google.com/d/optout.



--
Agile Tester/Developer/Consultant :: http://eliga.fi
Lead Developer of Robot Framework :: http://robotframework.org

kepp...@networkoptix.com

Mar 6, 2018, 5:11:23 PM
to robotframework-users

Pekka Klärck

Mar 6, 2018, 5:35:35 PM
to kepp...@networkoptix.com, robotframework-users
Hi,

It really looks like the data should be encoded before sending, but it's not. You could try encoding it with `Encode String To Bytes` [1] before passing it to the keyword, but that would be just a workaround.


Cheers,
    .peke


kepp...@networkoptix.com

Mar 6, 2018, 5:40:52 PM
to robotframework-users
Ya, that's one of the first things I tried.  'Encode string to bytes' produces:
error: UID command error: BAD ['Could not parse command']



Pekka Klärck

Mar 6, 2018, 6:06:29 PM
to kepp...@networkoptix.com, robotframework-users
Sounds strange. I don't really know anything about IMAP so cannot help much more. Hopefully someone else can or you are able to figure out what's going wrong.

Cheers,
    .peke


kepp...@networkoptix.com

Mar 6, 2018, 6:09:18 PM
to robotframework-users
Thanks for the attempt.  I'm going to try upgrading to Python 3.  Perhaps Python 2 is doing some behind-the-scenes conversions I don't want.

kepp...@networkoptix.com

Mar 6, 2018, 6:49:07 PM
to robotframework-users
Python 3 does not seem to help.  Is there another email library for Robot Framework?

Alex C.

Mar 7, 2018, 2:12:45 AM
to robotframework-users
The 2nd best solution would be raising an issue here:

https://github.com/rickypc/robotframework-imaplibrary

Chris Newman

Mar 8, 2018, 2:40:38 PM
to kepp...@networkoptix.com, robotframework-users
You probably need to convert your strings to binary byte arrays before
passing them to imaplib. For the search command you're attempting, I'd
recommend specifying CHARSET UTF-8 arguments to the IMAP search command
and then using UTF-8 encoding for the byte strings you pass to imaplib.
Hopefully imaplib will do the right thing then.

https://stackoverflow.com/questions/7380460/byte-array-in-python

If you have IMAP questions, I know the protocol well. I'm not a fan of
Python imaplib; but it may be ok for simple tasks. FYI, I use Robot
Framework to test the IMAP server I work on for a living.

- Chris
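
The suggestion above (CHARSET UTF-8 plus UTF-8 encoded bytes) might look roughly like this with Python 3's `imaplib`. This is a sketch under assumptions, not a tested fix for ImapLibrary: `search_uids_by_subject` is a hypothetical helper, the host and credentials in the usage comment are placeholders, and the code relies on the undocumented `literal` attribute, which `imaplib` appends to the next command as an IMAP literal. Sending the subject as a literal avoids putting raw, unquoted bytes containing spaces into the command line, which is one plausible cause of the `BAD ['Could not parse command']` reply reported earlier.

```python
import imaplib


def search_uids_by_subject(conn, subject):
    """Return the UIDs (as bytes) of messages whose Subject contains `subject`.

    `conn` is an imaplib.IMAP4 / IMAP4_SSL connection that is already
    logged in and has a mailbox selected.
    """
    # Assumption: imaplib sends the bytes stored in `conn.literal` as an
    # IMAP literal appended to the next command. The attribute is undocumented.
    conn.literal = subject.encode("utf-8")
    # "CHARSET UTF-8" tells the server how to interpret the literal search key.
    typ, data = conn.uid("SEARCH", "CHARSET", "UTF-8", "SUBJECT")
    if typ != "OK":
        raise RuntimeError("IMAP search failed: %r" % (data,))
    # A reply with matches looks like [b"101 102"]; an empty or missing
    # reply is treated as "no matches".
    return data[0].split() if data and data[0] else []


# Usage sketch (placeholder host and credentials, not executed here):
# conn = imaplib.IMAP4_SSL("imap.example.com", 993)
# conn.login("user@example.com", "password")
# conn.select("INBOX")
# uids = search_uids_by_subject(conn, "Активируйте учетную запись")
# conn.logout()
```

Whether a given server accepts `CHARSET UTF-8` is up to the server, so the error branch is worth keeping.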


kepp...@networkoptix.com

Mar 9, 2018, 1:18:40 PM
to robotframe...@googlegroups.com
Thanks Chris, 

I have been trying to write my own keyword that just fetches the subject, but man, the decoding is driving me crazy.  Here is the code I'm using.  If you think there is a better email library to use, I am all ears.  Checking that a subject matches the expected text should not be this hard.
import email.header
import imaplib
import re

def check_email_subject(self, email_id, sub_text):
    conn = imaplib.IMAP4_SSL('imap.gmail.com', 993)
    conn.login('em...@gmail.com', 'qweasd!@#')
    conn.select()
    # Fetch only the Subject header of the message with the given UID.
    typ, data = conn.uid('fetch', email_id, '(BODY.PEEK[HEADER.FIELDS (SUBJECT)])')
    for res in data:
        if isinstance(res, tuple):
            # res[1] holds the raw header bytes.
            header = email.header.decode_header(str(res[1].strip()))
            header_str = "".join([x[0].decode('utf-8').strip() if x[1] else re.sub("(^b\'|\')", "", str(x[0])) for x in header])
            header_str = re.sub("Subject: ", "", header_str)
            header_str = re.sub("(\\\\r|\\\\n|\')", "", header_str)
            if sub_text != header_str.strip():
                raise Exception(header_str+' was not '+sub_text)
    conn.logout()


The thing is the above handles only certain situations.  I get results like this:
[(b"b'Subject: ", None), (b' R\xc3\xa9initialisez votre mot de passe', 'utf-8'), (b"'", None)]

Which boils down to:
b"b"Réinitialisez votre mot de passeb"" was not Réinitialisez votre mot de passe
Perhaps your knowledge would point me in the right direction.  Sometimes the result has 3 tuples, sometimes 5, and only some of them carry an encoding.

Thanks,
Kyle

kepp...@networkoptix.com

Mar 9, 2018, 2:32:54 PM
to robotframework-users
I think I may have solved my own problem.  Decoding the raw bytes as ASCII before passing them to decode_header saves me a lot of headache.

def check_email_subject(self, email_id, sub_text):
    conn = imaplib.IMAP4_SSL('imap.gmail.com', 993)
    conn.login('em...@gmail.com', 'qweasd!@#')
    conn.select()
    typ, data = conn.uid('fetch', email_id, '(BODY.PEEK[HEADER.FIELDS (SUBJECT)])')
    for res in data:
        if isinstance(res, tuple):
            #header = re.sub("Subject: ", "", res[1])
            header = email.header.decode_header(res[1].decode('ascii').strip())
            print(header)
            header_str = "".join([x[0].decode('utf-8').strip() if x[1] else re.sub("(^b\'|\')", "", str(x[0])) for x in header])
            print(header_str)
            print(header_str.encode('utf-8').decode('utf-8'))
            header_str = re.sub("Subject: ", "", header_str)
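
For comparison, the decoding step can also be written without regular expressions by letting the standard `email` package parse the fetched header and reassemble the decoded parts. This is a minimal sketch, assuming the FETCH response part is the raw header block as bytes (for example `b'Subject: ...\r\n\r\n'`); `decode_subject` is a hypothetical helper name, not part of ImapLibrary.

```python
import email
from email.header import decode_header, make_header


def decode_subject(raw_header):
    """Return the Subject of a raw header block (bytes) as a unicode str."""
    # Parse the bytes as a tiny message so "Subject: " and the trailing
    # line breaks are handled by the parser instead of by re.sub().
    msg = email.message_from_bytes(raw_header)
    value = msg.get("Subject", "")
    # decode_header() splits the value into (data, charset) parts;
    # make_header() joins them, using each part's own charset.
    return str(make_header(decode_header(value)))


raw = b"Subject: =?utf-8?q?R=C3=A9initialisez_votre_mot_de_passe?=\r\n\r\n"
print(decode_subject(raw))
```

Because each part is decoded with the charset it declares, the number of tuples returned by `decode_header` (3, 5, with or without an encoding) no longer matters to the caller.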