Base64 decoding failing - non-alphabet digit found

Brad O'Hearne

Feb 14, 2016, 11:20:09 PM
to elixir-lang-talk
I have a Base64 encoded string that I am trying to decode with the following line: 

 Base.url_decode64!(data)

Here is what I am receiving as output:

** (ArgumentError) non-alphabet digit found: "=" (byte 61)
    (elixir) lib/base.ex:111: Base."-do_decode64url/1-lbc$^0/2-0-"/2
    (elixir) lib/base.ex:579: Base.do_decode64url/1

Any ideas for how to get around this problem? Thanks in advance.

Brad

Brad O'Hearne

Feb 14, 2016, 11:37:13 PM
to elixir-lang-talk
I should add that I've been playing with several Base64 decoding apps, and I believe the string I am decoding is not using UTF-8, but ASCII. Is there a way to decode to ASCII rather than UTF-8? 

Brad

Ben Wilson

Feb 14, 2016, 11:37:45 PM
to elixir-lang-talk
Any reason you're using the url_ version? Does Base.decode64! work?
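
For reference, the two decoders differ only in their alphabet: the standard one uses "+" and "/", the url_ one uses "-" and "_". Both treat "=" as padding, so switching between them would not be expected to change an error about "=". A minimal sketch, where the two sample bytes are just an illustration picked to hit the differing characters:

```elixir
# Two bytes chosen so the encoding uses the characters that differ
# between the standard and URL-safe alphabets.
bytes = <<251, 255>>

standard = Base.encode64(bytes)      # "+/8="
url_safe = Base.url_encode64(bytes)  # "-_8="
IO.inspect({standard, url_safe})

# Each decoder rejects characters that belong only to the other alphabet.
# The non-bang functions return :error instead of raising.
IO.inspect(Base.url_decode64(standard))  # :error
IO.inspect(Base.decode64(url_safe))      # :error
```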

Brad O'Hearne

Feb 14, 2016, 11:58:55 PM
to elixir-lang-talk
No...the reason I copied in the url_ version is because that's what I tested last. Base.decode64! fails with exactly the same error: 

** (ArgumentError) non-alphabet digit found: "=" (byte 61)
    (elixir) lib/base.ex:111: Base."-do_decode64/1-lbc$^0/2-0-"/2
    (elixir) lib/base.ex:547: Base.do_decode64/1


I'm guessing this has something to do with UTF-8 vs. ASCII -- is there a way to change the target encoding? 


Brad

Paulo Almeida

Feb 15, 2016, 8:26:13 AM
to elixir-lang-talk
The error suggests that the "=" character appears in the middle of the encoded text. It should be used only as padding.

Can you provide a sample input that causes the error?

Tallak Tveide

Feb 15, 2016, 8:32:44 AM
to elixir-lang-talk
You could try using codepagex for ASCII -> UTF-8 conversion.

https://hex.pm/packages/codepagex

I believe there are some functions for this in the Erlang stdlib also. Probably more lightweight:

http://erlang.org/doc/man/unicode.html
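
A rough sketch of the Erlang stdlib route, using :unicode.characters_to_binary/3. Note that Base.decode64!/1 only returns raw bytes and applies no text encoding, and that pure ASCII bytes are already valid UTF-8, so a conversion step would only matter for something like Latin-1. The sample inputs below are made up for illustration:

```elixir
# Base.decode64!/1 hands back the raw bytes; no charset is involved.
decoded = Base.decode64!("SGk=")
IO.inspect(decoded)                 # "Hi"
IO.inspect(String.valid?(decoded))  # true, ASCII is a subset of UTF-8

# A single Latin-1 byte (0xE9, "é") is not valid UTF-8 on its own.
latin1 = <<0xE9>>
IO.inspect(String.valid?(latin1))   # false

# Re-encode Latin-1 bytes as UTF-8 with the Erlang unicode module.
utf8 = :unicode.characters_to_binary(latin1, :latin1, :utf8)
IO.inspect(utf8)                    # "é"
```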

Onorio Catenacci

Feb 15, 2016, 8:55:52 AM
to elixir-lang-talk
It would be a significant help (for us to help you) if you'd share the exact string you're trying to decode.  Better yet share a small snippet of code that demonstrates the issue.

--
Onorio



Duilio Ruggiero

Feb 15, 2016, 9:09:26 AM
to elixir-lang-talk
As Paulo Almeida said, the "=" sign is used for padding and should be found only at the end of the encoded string.
Check if you are concatenating encoded strings before decoding :)

Examples:

iex> txt = Base.encode64("any carnal pleas")
"YW55IGNhcm5hbCBwbGVhcw=="

iex> Base.decode64!(txt)
"any carnal pleas"

iex> Base.decode64!(txt<>txt)
** (ArgumentError) non-alphabet digit found: "=" (byte 61)
    (elixir) lib/base.ex:111: Base."-do_decode64/1-lbc$^0/2-0-"/2
    (elixir) lib/base.ex:547: Base.do_decode64/1


But it works for concatenated strings that don't contain the "=":

iex> txt = Base.encode64("any carnal pleasur")
"YW55IGNhcm5hbCBwbGVhc3Vy"

iex> Base.decode64!(txt<>txt)
"any carnal pleasurany carnal pleasur"
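
If it helps to check a large input before decoding, here is a small sketch that flags "=" anywhere other than the tail. PaddingCheck and interior_padding?/1 are hypothetical names made up for this example, and it assumes the string has no trailing whitespace or newline:

```elixir
defmodule PaddingCheck do
  # Hypothetical helper: true when "=" occurs anywhere other than
  # at the very end of the encoded string.
  def interior_padding?(encoded) do
    encoded
    |> String.trim_trailing("=")
    |> String.contains?("=")
  end
end

txt = Base.encode64("any carnal pleas")

IO.inspect(PaddingCheck.interior_padding?(txt))         # false
IO.inspect(PaddingCheck.interior_padding?(txt <> txt))  # true

# The non-bang decoder reports the same problem without raising.
IO.inspect(Base.decode64(txt <> txt))                   # :error
```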

Brad O'Hearne

Feb 15, 2016, 1:53:49 PM
to elixir-lang-talk
Thanks everyone for your replies -- very helpful. In regards to this: 


On Monday, February 15, 2016 at 7:09:26 AM UTC-7, Duilio Ruggiero wrote:
As Paulo Almeida said, the "=" sign is used for padding and should be found only at the end of the encoded string.
Check if you are concatenating encoded strings before decoding :)

Unfortunately, I am not at liberty to post the full data in question (because the encoded data is proprietary), but I can post the very end of the Base64-encoded string, which might be enough to draw a conclusion. Here's the last line: 

"MC4wCg==DQo="


I'm guessing that the problem is that "DQo" is interspersed amongst the padding, and that is causing the parser to crash. I'd be interested in others' opinions -- I'd guess that padding shouldn't have data in it. 


Brad


Mike Shapiro

Feb 16, 2016, 1:21:17 PM
to elixir-lang-talk
"DQo=" is "\r\n".

If that's interspersed throughout the text, my guess is that someone base-64 encoded all the elements of an array before joining them into a string, whereas they should have joined them into a string and then encoded.

--Mike
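
To illustrate the above: a sketch that decodes "DQo=" on its own and then splits the posted tail into separately padded chunks before decoding each one. ChunkedBase64 and decode_chunks/1 are hypothetical names, and this is only a workaround under the assumption that every piece was encoded separately. Pieces that happened to need no padding cannot be told apart from their neighbours this way.

```elixir
defmodule ChunkedBase64 do
  # Hypothetical helper: split the input into runs of Base64 alphabet
  # characters, each followed by up to two "=" signs, and decode each
  # run on its own. Raises if any single run is not valid Base64.
  def decode_chunks(data) do
    ~r/[A-Za-z0-9+\/]+={0,2}/
    |> Regex.scan(data)
    |> Enum.map(fn [chunk] -> Base.decode64!(chunk) end)
  end
end

IO.inspect(Base.decode64!("DQo="))                        # "\r\n"
IO.inspect(ChunkedBase64.decode_chunks("MC4wCg==DQo="))   # ["0.0\n", "\r\n"]
```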

Bradley O'Hearne

Feb 17, 2016, 9:33:45 AM
to elixir-l...@googlegroups.com
Yeah, there’s junk in the Base64-encoded text, confirmed as a bug with the API vendor that outputs it. During the process, I had actually dropped back to Erlang's base64, and also tried two GUI apps that do Base64 decoding, and they all choked... it’s illegal Base64.

Thanks for everyone’s help. It was greatly appreciated, and it helped me reach the solution.

Brad