Hey everybody,
Use Case
I have what I think is actually a pretty common problem. All over my code base, there are uses of inspect/2. This is a wonderful thing that helps with debugging. It is less wonderful when it spews PII all over your logs and error pages and you find yourself having sent somebody's social security number to DataDog or disclosed their home address on a 500 page. Then your lawyer starts doing that thing where their eye twitches, and you need to notify four different regulators on three continents, alert all of your customers with scary messages, attend three hours of mandatory re-education wherein you learn to recite the GDPR from memory, and eventually sacrifice three interns to appease the compliance gods.
Temporary Hack
What some co-conspirators and I have done to address this is override the Inspect protocol implementations for the built-in types so that they redact things by default, and then keep a whitelist for the bits we want to show. Given that our top-level logging metadata is a map, we can pretty much just whitelist keys there and call it a day. This works fabulously, but it violates Erlang's expectations significantly. While Erlang is probably used to that, it gets quite cross and refuses to generate a release: the redefined modules now exist twice, once in Elixir itself and once in our application, and it doesn't have a good way to know which .beam file to put in the release.
As you might imagine, I'm pretty bummed that I can't use releases and have to ignore tons of very alarming sounding warnings about redefining modules.
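For illustration, the whitelist part on its own might look roughly like the sketch below. The module name, the keys, and the placeholder string are all made up for the example, and this is only the redaction step, not the protocol override itself (the override is the part that upsets the release tooling).

```elixir
defmodule LogRedactor do
  # Hypothetical whitelist; the value of every other key is masked.
  @allowed_keys [:request_id, :status]
  @placeholder "[REDACTED]"

  # Keeps whitelisted top-level keys and replaces all other values
  # with the placeholder. Nested data is not inspected in this sketch.
  def redact(metadata, allowed_keys \\ @allowed_keys) when is_map(metadata) do
    Map.new(metadata, fn {key, value} ->
      if key in allowed_keys, do: {key, value}, else: {key, @placeholder}
    end)
  end
end

metadata = %{request_id: "abc123", status: 500, ssn: "000-00-0000"}
IO.inspect(LogRedactor.redact(metadata))
```

Redacting by default and opting keys in means a newly added field stays hidden until somebody decides it is safe to show.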
Options
Could we consider some solutions to make redaction require less unforgivable sins against code loading? To start, three directions have been proposed by the various people I've talked to:
- Instead of implementing Inspect for the built-in types, do that inspection in the implementation for Any, thereby making it easy to override the built-in types.
- Wedge something into a common entry point (maybe in Inspect.Algebra?) that allows us to specify a global redaction function. Perhaps configure this with a global config value?
- Implement some sort of overriding layer for just the Inspect protocol.
In terms of pros and cons, for #1...
- Pro: Works well for built-ins.
- Pro: Implementing this is very straightforward.
- Pro: This probably doesn't break any existing code, very small blast radius.
- Con: Doesn't work at all as soon as anything defines its own Inspect implementation.
- Con: Isn't amenable to configuration at runtime (maybe this is not an issue?).
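To make #1 a bit more concrete, here is a toy version of the idea using a stand-in protocol. ToyInspect and render/1 are invented for the example and are not how Inspect is structured today; the point is only that when built-ins are handled by the Any fallback, adding an implementation for Map defines a brand-new module instead of redefining an existing one.

```elixir
defprotocol ToyInspect do
  # Hypothetical stand-in for Inspect, used only to show the dispatch idea.
  @fallback_to_any true
  def render(term)
end

defimpl ToyInspect, for: Any do
  # Every built-in type lands here, because no type-specific
  # implementation ships with the protocol.
  def render(term), do: Kernel.inspect(term)
end

# An application can now add its own Map implementation. No module named
# ToyInspect.Map existed before, so nothing is being redefined.
defimpl ToyInspect, for: Map do
  def render(_map), do: "%{[REDACTED]}"
end

IO.puts(ToyInspect.render(%{ssn: "000-00-0000"}))
IO.puts(ToyInspect.render([1, 2]))
```

The first con still applies: a struct with its own implementation is dispatched to that implementation and never reaches the fallback.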
As for #2...
- Pro: Can be configured at runtime.
- Pro: This probably doesn't break any existing code, very small blast radius.
- Con: I have no idea how to implement this and Inspect.Algebra scares me.
- Con: Given that the Inspect protocol is pretty much "turn X into string", I'm not sure how much redaction we could really do if we allow the existing protocol to run.
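A rough sketch of what #2 could feel like from the outside, assuming a hypothetical wrapper module and a made-up :my_app / :inspect_redactor config key. A real version would live inside the common entry point rather than in a wrapper, and none of these names exist in Elixir.

```elixir
defmodule RedactingInspect do
  # Hypothetical entry point: looks up a globally configured redaction
  # function at call time and applies it before the term is inspected.
  def safe_inspect(term, opts \\ []) do
    redactor = Application.get_env(:my_app, :inspect_redactor, &Function.identity/1)
    term |> redactor.() |> Kernel.inspect(opts)
  end
end

# Configured at runtime; the app name and key are assumptions for this sketch.
Application.put_env(:my_app, :inspect_redactor, fn
  map when is_map(map) and not is_struct(map) -> Map.drop(map, [:ssn])
  other -> other
end)

IO.puts(RedactingInspect.safe_inspect(%{ssn: "000-00-0000", status: 500}))
```

This version only sees the top-level term, because the redactor runs once before the existing protocol takes over. Reaching nested values would mean running the hook wherever sub-terms get converted, which is where the con above starts to bite.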
As for #3...
- Pro: This provides a clear way to just replace the protocol for a given type.
- Pro: Implementing this is very straightforward.
- Pro: This probably doesn't break any existing code, very small blast radius.
- Con: It's all fun and games until a library does it, then you need Override2 or 3 or 4...
- Con: Probably gets redundant if there ever is a blessed way to override protocols.
- Con: I already pitched this to a few Elixir celebrities and they thought it was a bit too hacky.
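For #3, one possible shape is a small lookup that gets consulted before the normal protocol. This is purely a sketch under my own assumptions: InspectOverride, put/2, and inspect_term/2 are invented names, and storing the overrides in :persistent_term is just a convenient choice for the example.

```elixir
defmodule InspectOverride do
  # Hypothetical override layer: a type => function table checked before
  # falling back to the regular inspect.
  @key {__MODULE__, :overrides}

  # Registers an override for a type. A later call for the same type
  # silently replaces the earlier one.
  def put(type, fun) when is_atom(type) and is_function(fun, 2) do
    overrides = :persistent_term.get(@key, %{})
    :persistent_term.put(@key, Map.put(overrides, type, fun))
  end

  def inspect_term(term, opts \\ []) do
    case Map.fetch(:persistent_term.get(@key, %{}), type_of(term)) do
      {:ok, fun} -> fun.(term, opts)
      :error -> Kernel.inspect(term, opts)
    end
  end

  # Only a few types are distinguished in this sketch.
  defp type_of(%module{}), do: module
  defp type_of(term) when is_map(term), do: Map
  defp type_of(term) when is_list(term), do: List
  defp type_of(_term), do: Any
end

InspectOverride.put(Map, fn _map, _opts -> "%{[REDACTED]}" end)
IO.puts(InspectOverride.inspect_term(%{ssn: "000-00-0000"}))
```

The "Override2" con is visible here too: there is one slot per type, so if a library registers an override for Map and the application registers another, whichever runs last wins.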
In Closing
So, yeah, in the long term, maybe we'll have a blessed way to override protocols; but, short of that, there's got to be some stopgap, right? What do people think of adopting something like one of these approaches so that my PII problems evaporate and I can finally build some sweet, sweet releases? I'll even implement it myself! I promise!
Thanks,
- J