It "follows" the SK10 algorithm in the sense of applying a single-octave transform to recursively downsampled versions of the input, but we make no guarantees about exact compatibility or numerical equivalence. Over the years, we've made several modifications, generalizations, and extensions to the core function to support various use cases. I'd be hard-pressed to enumerate all of the changes in one go, but probably the most important for checking compatibility will be basis sparsification (mentioned briefly in SK10, but not elaborated), early downsampling, and the low-pass filter (bidirectional butterworth in SK10, several options in librosa; default being an approximate continuous-time FIR filter, but other modes are supported as well). Other features like tuning adaptation, variable-Q support (added later in the CQT toolbox, see ref in librosa.vqt docstring), Q-factor scaling (filter_length), hybrid/pseudo-cqt, different basis window shapes, and normalization are all less important, and can be ignored or disabled if you just need a basic CQT.
Admittedly, the librosa implementation is probably a bit difficult to untangle, due to being factored out into a dozen or so different functions. Having spent some time looking through the CQT toolbox, I'm not sure it would be a better or simpler starting point, but it will depend on your comfort level with python vs matlab.
At a high level, getting numerical equivalence across languages and implementations for this kind of thing is going to be pretty tough. I recommend starting simple, defining an acceptable tolerance level for what you consider to be "close enough" (eg, matching magnitudes to 1e-6 or something; phases can be tricky for all kinds of reasons, but phase differentials should be easier to match), and going from there.
I hope that helps -- I'd be curious to hear how it goes!