I am doing a numerical analysis that needs to compute many determinants of "huge" dense matrices. The matrices are all smaller than 100×100.
I found that computing the determinant with Matrix_generic_dense is quite slow. I followed the method from "What is the time complexity of numpy.linalg.det?" and made minor modifications to the script to benchmark SageMath:
1. np.arange(1, 10001, 100) => np.arange(1, 292, 10). Because the computation is so slow, I had to reduce the testing range to make the script finish in a reasonable time.
2. np.ones((size, size)) => ones_matrix(RR, size, size)
3. timeit('np.linalg.det(A)', globals={'np': np, 'A': A}, number=1) => timeit('A.det(algorithm="df")', globals={'A': A}, number=1)
Here is the full script I used.
from timeit import timeit
import matplotlib.pyplot as plt
import numpy as np
from sage.all import ones_matrix  # needed when the script is run outside an interactive Sage session
from sage.rings.real_mpfr import RR
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
sizes = np.arange(1, 292, 10)
times = []
for size in sizes:
    A = ones_matrix(RR, size, size)
    # number=1: time a single determinant; A is rebuilt every iteration,
    # so Sage's cached det() result is never reused.
    time = timeit('A.det(algorithm="df")', globals={'A': A}, number=1)
    times.append(time)
    print(size, time)
# Fit quadratic, cubic and quartic models to the measured timings.
sizes = sizes.reshape(-1, 1)
times = np.array(times).reshape(-1, 1)
quad_sizes = PolynomialFeatures(degree=2).fit_transform(sizes)
quad_times = LinearRegression().fit(quad_sizes, times).predict(quad_sizes)
cubic_sizes = PolynomialFeatures(degree=3).fit_transform(sizes)
cubic_times = LinearRegression().fit(cubic_sizes, times).predict(cubic_sizes)
quartic_sizes = PolynomialFeatures(degree=4).fit_transform(sizes)
quartic_times = LinearRegression().fit(quartic_sizes, times).predict(quartic_sizes)
plt.scatter(sizes, times, label='Data', color='k', alpha=0.5)
plt.plot(sizes, quad_times, label='Quadratic', color='r')
plt.plot(sizes, cubic_times, label='Cubic', color='g')
plt.plot(sizes, quartic_times, label='Quartic', color='b')
plt.xlabel('Matrix Dimension n')
plt.ylabel('Time (seconds)')
plt.legend()
plt.show()
This is the time-complexity plot from the linked post, generated from the NumPy measurements: it only took about 20 seconds to compute the determinant of a 10000×10000 matrix.
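For reference, the NumPy side of the comparison is just the timing call from the linked post. A minimal sketch for reproducing a single-size measurement (the size 1000 here is an arbitrary choice of mine):

from timeit import timeit
import numpy as np

size = 1000  # arbitrary example size
A = np.ones((size, size))
# np.linalg.det works on machine doubles and uses an LAPACK LU factorization.
print(timeit('np.linalg.det(A)', globals={'np': np, 'A': A}, number=1))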
This is the result of the same test with a matrix over RR and the "df" algorithm.
I also tested the "hessenberg" algorithm, which performed even worse.
Compared to NumPy, SageMath spent 800 seconds computing the determinant of a 300×300 matrix. It seems that SageMath still runs an O(n^3) algorithm (the cubic and quartic model curves overlap) but with a huge constant factor. Why is SageMath so slow? Is it possible to speed it up?
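Would something like the following be a reasonable workaround, or is there a way to make the determinant over RR itself faster? (This is only a sketch; I am assuming that RDF dense matrices are numpy-backed and that machine precision is acceptable for my matrices, neither of which I have verified yet.)

import numpy as np
from sage.all import RDF, RR, ones_matrix

A = ones_matrix(RR, 100, 100)

# Option 1: change the base ring to machine doubles before taking the
# determinant; my understanding is that RDF dense matrices dispatch to
# numpy/LAPACK internally.
d1 = A.change_ring(RDF).det()

# Option 2: convert to a numpy array explicitly and use numpy's det.
d2 = np.linalg.det(A.numpy(dtype=float))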