Also, it's (marginally but consistently) faster in Julia to just call "copy(T)" rather than "full(Tridiagonal(map(i -> diag(T, i), -1:1)...))", which seems odd to me: "copy" should be O(N^2), while the other method (copying only the sub-, main, and superdiagonals) looks like it should be O(N), where N is the dimension of the square matrix.
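A quick way to check this (a sketch only — it uses post-0.7 syntax, where `Matrix(...)` replaces `full(...)`, and the size n = 1000 is illustrative):

```julia
using LinearAlgebra

n = 1000
T = Matrix(SymTridiagonal(randn(n), randn(n - 1)))  # dense array, tridiagonal in content

copy_way() = copy(T)                                # plain O(n^2) copy
diag_way() = Matrix(Tridiagonal(diag(T, -1), diag(T, 0), diag(T, 1)))

@time copy_way()
@time diag_way()
```

Note that `Matrix(Tridiagonal(...))` (like `full` before it) still has to allocate and zero-fill an n-by-n dense array, so that path is O(N^2) as well — only extracting the three diagonals is O(N). That would account for `copy` not losing, despite the intuition.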
For small arrays I would guess that hardcoded multiplication is faster than calling out to BLAS.
You can find info about your Julia installation by calling versioninfo().
Sorry if the numerical linear algebra stuff is off-topic for this mailing list.
Have you tried Golub and Van Loan's Matrix Computations? In the 3/e, it is on p. 418 ff. (Section 8.3.5). They call it zero-chasing, but it's the same thing.

> I've read about "chasing the bulge" and the implicit QR algorithm, but I haven't been able to find a description (or implementation/pseudocode) that I could easily understand. Is there any chance you could point me to an implementation, pseudocode, or a good explanation of the algorithm? Do you happen to know of an exposition of the implicit QR algorithm that uses Householder reflections rather than Givens rotations?
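For reference, the zero-chasing step is short enough to sketch. Below is a minimal, unoptimized Julia version of one implicit symmetric QR step with Wilkinson shift (after GVL 3/e, Algorithm 8.3.2), using `givens` from LinearAlgebra on a dense, symmetric, unreduced tridiagonal matrix (nonzero subdiagonal assumed) — a sketch of the idea, not a robust implementation:

```julia
using LinearAlgebra

# One implicit symmetric QR step with Wilkinson shift (cf. GVL 3/e, Alg. 8.3.2).
# T: dense, symmetric, unreduced tridiagonal matrix. Mutates T. Sketch only.
function implicit_qr_step!(T::AbstractMatrix)
    n = size(T, 1)
    # Wilkinson shift, computed from the trailing 2x2 block
    d = (T[n-1, n-1] - T[n, n]) / 2
    s = d >= 0 ? one(d) : -one(d)
    mu = T[n, n] - T[n, n-1]^2 / (d + s * hypot(d, T[n, n-1]))
    # The first rotation is determined by the shifted first column
    x, z = T[1, 1] - mu, T[2, 1]
    for k in 1:n-1
        G, _ = givens(x, z, k, k + 1)  # rotation in the (k, k+1) plane zeroing z
        T .= G * T * G'                # similarity transform: creates, then moves, the bulge
        if k < n - 1
            x, z = T[k+1, k], T[k+2, k]  # entry (k+2, k) is the bulge to chase next
        end
    end
    return T
end
```

Iterating this step (with deflation of converged trailing subdiagonal entries) yields the eigenvalues. As for Householder: I don't know of a standard exposition that uses Householder reflections here — in the 2x2 working plane a reflection and a rotation carry the same information, which may be why Givens is the conventional presentation.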