thoughts on datatype for cleartext password

dar...@gmail.com

Jan 13, 2015, 5:40:30 PM

to passli...@googlegroups.com

What data type do you recommend for cleartext passwords while they reside in memory? I've seen Apache Shiro use a byte[] datatype, which it overwrites with zero bytes as soon as the application is finished with the password (and before garbage collection).

Maybe you've come across the requirements/recommendations for in-memory data handling that would make an application FIPS 140-2 compliant?

3.7.3 In-Memory Data Handling

In order to minimize in-memory data handling disclosure vulnerabilities in the application, implement the following procedures:

• Encrypt sensitive data held in physical or virtual memory when not being used.

• Clear all memory blocks used to process sensitive data prior to releasing the memory.

• Revoke access authorizations to data prior to initial assignment, allocation, or reallocation to an unused state so information produced by a previous user is not available to a subsequent user that obtains access to an object that has been released back to the system.

• Ensure that memory clearing code is not removed by the compiler when compiler optimization is selected.

• Use memory clearing in supporting development environments (e.g., SecureString in the .NET Framework Class Library).

Eli Collins

Jan 26, 2015, 12:53:50 PM

to passli...@googlegroups.com, dar...@gmail.com

That's particularly tricky under Python. The main datatype I know of is Python's bytearray(), which can be treated like a byte string in many (but not all) cases. Since it's mutable, you can then zero it out afterwards.

The problem is that any time you try to populate it, or hand it off to another function, a temporary byte copy may be made in memory, which may linger. For example, I could do ...


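buf = bytearray(6)  # six zero bytes; works on Python 2 and 3
buf[0] = ord("s")   # assign integers: Python 3 rejects b"s" here
buf[1] = ord("e")
buf[2] = ord("c")
buf[3] = ord("r")
buf[4] = ord("e")
buf[5] = ord("t")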

... which would semi-securely populate the byte array without the string "secret" ever being allocated by the Python VM (as described in http://stackoverflow.com/a/14667881/681277). Of course, you've got to figure out how to populate it from the password source without *that* code ever creating a Python string.
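As a rough sketch of that last point, assuming Python 3 and a password that lives in a file, an unbuffered readinto() can fill a preallocated bytearray directly. The helper names (read_secret_into, wipe), the 256-byte limit, and the throwaway demo file are all made up for illustration, and nothing here proves the VM or the OS keeps no other copy.

```python
import os
import tempfile


def read_secret_into(path, max_len=256):
    """Read up to max_len bytes from path straight into a bytearray.

    Opening with buffering=0 gives a raw file object, so readinto()
    fills our buffer instead of returning an intermediate bytes object.
    """
    buf = bytearray(max_len)
    with open(path, "rb", buffering=0) as f:
        n = f.readinto(buf)
    return buf, n


def wipe(buf):
    """Overwrite every byte of a mutable buffer with zeros, in place."""
    for i in range(len(buf)):
        buf[i] = 0


# Demo with a throwaway file (the literal below exists only for the demo).
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"hunter2")
os.close(fd)

secret, length = read_secret_into(demo_path)
try:
    # Index the buffer rather than slicing it; a slice would be a copy.
    first_byte_set = secret[0] != 0
finally:
    wipe(secret)
    os.remove(demo_path)
```

The try/finally is there so the buffer gets zeroed even if whatever uses the password raises.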

But that problem aside, even doing something as simple as ...

import hashlib
digest = hashlib.sha256(buf).hexdigest()

... can't be trusted, as there's no guarantee (without examining the source for the particular VM & sha256 implementation) that doing ``sha256(buf)`` doesn't create a lingering copy in memory as the Python->C interface translates your input into something it can hand off to the underlying sha256 engine.
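For what it's worth, the most that can be done from the Python side looks something like the sketch below (assuming Python 3; sha256_then_wipe is a made-up helper name). It hands the bytearray itself to hashlib and zeroes it afterwards. It avoids an explicit bytes(buf) copy, but it says nothing about what the C layer does internally.

```python
import hashlib


def sha256_then_wipe(buf):
    """Hash a bytearray through the buffer protocol, then zero it."""
    try:
        # Pass the bytearray itself; bytes(buf) would create an
        # immutable copy that cannot be cleared afterwards.
        return hashlib.sha256(buf).hexdigest()
    finally:
        for i in range(len(buf)):
            buf[i] = 0


# Populate byte by byte so the full string is never built here.
password = bytearray(6)
for i, code in enumerate((115, 101, 99, 114, 101, 116)):  # s e c r e t
    password[i] = code

digest = sha256_then_wipe(password)
```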

All that said, if I were to approach achieving that, bytearray() would probably be my preferred choice. (mmap'ing the file containing the password might also be useful).
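The mmap idea might look roughly like this, assuming Python 3 and a non-empty file holding the password. The function name and the demo file are hypothetical. The mapped object supports the buffer protocol, so hashlib can read from it without the script first calling read(); whether copies are made further down is still not guaranteed.

```python
import hashlib
import mmap
import os
import tempfile


def sha256_of_file_via_mmap(path):
    """Hash a file's contents through a read-only memory map."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; an empty file raises ValueError.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return hashlib.sha256(mapped).hexdigest()


# Demo with a throwaway file (the literal below exists only for the demo).
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"hunter2")
os.close(fd)
try:
    file_digest = sha256_of_file_via_mmap(demo_path)
finally:
    os.remove(demo_path)
```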

With specific regard to Passlib ... sadly it doesn't handle bytearrays too well just yet, and creates lingering copies of byte objects in a few places. I'm looking into fixing that for the far future, but it's never been a priority, because it's always felt like an impossible battle to strive for such a guarantee in passlib, when I can't trust the VM won't make me a liar in an unexpected place.

For what it's worth, hope that helps :)

Eli Collins

Jan 29, 2015, 5:14:08 PM

to Darin Gordon, passli...@googlegroups.com

I've thought about it for my own purposes, but don't think it's workable in passlib for a couple of reasons ...

For one, I try my best to keep passlib pure-python; both to support a large userbase (Google App Engine doesn't even allow C extensions), and because C extensions are frankly much more of a build, testing, and security headache.

More importantly though, there's the bootstrapping issue. If the passwords are provided by the user (say over the web via pyramid, or on the desktop via pyqt), they'll end up getting allocated as a string floating around in the VM's heap before they even reach passlib. What's needed is a pure-C user-to-python channel, which exposes the password as a byte array, or something with python's "buffer" object. (Once that point was reached, passlib would still need to be modified to handle it, and the VM audited, but ignoring that...)

The closest channel I can think of is the incredibly useful keyring library (https://pypi.python.org/pypi/keyring). While I don't think it's FIPS 140-2 compliant currently, a number of its backends (such as the win32 wallet) certainly have the potential to let it acquire user passwords and provide them in python in a secure way. I'm not sure how easy it would be for them to achieve it either, but they're certainly the closest project I can think of for getting there.

- Eli

On Mon, Jan 26, 2015 at 1:33 PM, Darin Gordon <dar...@gmail.com> wrote:

Thank you, Eli, for the thoughtful response. Have you considered a C extension that would handle storing passwords and clearing/collecting them? In other discussions, this was presented as an alternative.

--

Eli Collins el...@assurancetechnologies.com
Software Development & I.T. Consulting
Assurance Technologies www.assurancetechnologies.com
