The end of FUSE


Matthew McPherrin

Jan 30, 2017, 8:29:00 PM
to keywhiz-users
Hello Keywhiz-users,

I want to give everyone a heads up about a big upcoming change.

The canonical Keywhiz client has been a FUSE filesystem for many years at Square.

Due to operational pains with FUSE filesystems, we've decided to stop using FUSE.  I've been writing a prototype of a replacement that syncs files to a tmpfs.

That (not-production-ready!) is available at


Over the next quarter, we plan to productionize that, and deprecate keywhiz-fs within our infrastructure.  At some point, probably by the middle of the year, we plan to stop supporting keywhiz-fs.

A little flavor on the pains we've run into with FUSE:

(1) The process must stay alive as long as any processes have files open.  This design has fundamentally limited our ability to deploy keywhiz-fs, as we need to run the old copy alongside the new one until all files are closed.  Our deploy system assumes old processes shut down before new ones start.

(2) Today, we have 1 process per filesystem.  As we further containerize our infrastructure, and increase the number of keywhiz clients per system, this has been an operational pain - largely due to the design of our deployment system, which doesn't really assume the number of processes you're running varies based on configuration on the host.

(3) We've had trouble with some FUSE tooling, especially around concurrent writes to /etc/mtab.

(4) The current design has file access block on network requests, which some tools have objected to.

None of these problems are insurmountable.  Keywhiz-fs is fundamentally "good enough", but we've decided to move away from what has always been a thorn in our side.
Especially as we migrated to caching more and more in the keywhiz-fs process, we've realized a tmpfs will serve us nearly as well.
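
To make the tmpfs approach concrete, here is a minimal sketch of how a sync process could place secrets into a directory so that readers never see a half-written file. It is not Keysync's actual code: the function names, the file mode, and the flat directory layout are all assumptions for illustration, and `directory` stands in for a tmpfs mount.

```python
import os
import tempfile
from typing import Mapping


def write_secret_atomically(directory: str, name: str, content: bytes,
                            mode: int = 0o440) -> str:
    """Write one secret into `directory` via a temp file plus rename.

    Hypothetical helper: the name and default mode are illustrative only.
    """
    # The temp file lives in the same directory, so the rename below stays
    # on one filesystem and replaces the target atomically.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="." + name + ".")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(content)
        os.chmod(tmp_path, mode)
        final_path = os.path.join(directory, name)
        os.replace(tmp_path, final_path)
    except BaseException:
        # Do not leave a stray temp file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path


def sync_secrets(directory: str, secrets: Mapping[str, bytes]) -> None:
    """Make `directory` hold exactly the given secrets (flat layout assumed)."""
    for name, content in secrets.items():
        write_secret_atomically(directory, name, content)
    # Remove files that are no longer present upstream.
    for existing in os.listdir(directory):
        path = os.path.join(directory, existing)
        if existing not in secrets and os.path.isfile(path):
            os.unlink(path)
```

With this shape, a process that already has a secret open keeps reading the old contents after a rename or unlink, and the syncing process can exit or be replaced at any time without affecting readers. Reads are served from the local filesystem, so they never wait on the network.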

The major downside is that we lose audit logs of which processes are reading files, which keywhiz-fs gave us.  We've decided that can be handled as a separate problem, using either auditd or other general IDS techniques.

Keysync is using the same API as keywhiz-fs today, but we'll be customizing the API for more efficient operation soon.  At some undetermined time in the future, we'll likely be moving to gRPC as well - it's the way Square is moving as a whole, away from the proprietary RPC and JSON/REST systems we have today.  This will likely come with an update to the automation APIs, which today are a confused collection that maps 1:1 to bits of our other infrastructure.
