Matrix inversion with Cholesky decomposition

12 views
Skip to first unread message

henry wasker

unread,
Nov 14, 2021, 5:50:57 PM11/14/21
to MAGMA User
Hello !

I have a first version of a function that inverses a matrix of size m and using magma_dgetrf_gpu and magma_dgetri_gpulike this :

  // Inversion
  magma_dgetrf_gpu( m, m, d_a, m, piv, &info);
  magma_dgetri_gpu( m, d_a, m, piv, dwork, ldwork, &info);

Now, I would like to also inverse but using the Cholesky decomposition. The function looks like the first version one,except the functions used which are :

  // Inversion
  magma_dpotrf_gpu( MagmaLower, m, d_a, m, &info);
  magma_dpotri_gpu( MagmaLower, m, d_a, m, &info);

Here is the entire function that inverses :

// ROUTINE MAGMA TO INVERSE WITH CHOLESKY
void matrix_inverse_magma(vector<vector<double>> const &F_matrix, vector<vector<double>> &F_output) {

  // Index for loop and arrays
  int i, j, ip, idx;

  // Start magma part
  magma_int_t m = F_matrix.size();
  if (m) {
  magma_init (); // initialize Magma
  magma_queue_t queue=NULL;
  magma_int_t dev=0;
  magma_queue_create(dev ,&queue );
  double gpu_time , *dwork; // dwork - workspace
  magma_int_t ldwork; // size of dwork
  magma_int_t *piv, info; // piv - array of indices of inter -
  // changed rows; a - mxm matrix
  magma_int_t mm=m*m; // size of a, r, c
  double *a; // a- mxm matrix on the host
  double *d_a; // d_a - mxm matrix a on the device
 
  magma_int_t err;
  ldwork = m * magma_get_dgetri_nb( m ); // optimal block size
  // allocate matrices
  err = magma_dmalloc_cpu( &a , mm ); // host memory for a

  // Convert matrix to *a double pointer
  for (i = 0; i<m; i++){
    for (j = 0; j<m; j++){
      idx = i*m + j;
      a[idx] = F_matrix[i][j];
    }
  }
  err = magma_dmalloc( &d_a , mm ); // device memory for a
  err = magma_dmalloc( &dwork , ldwork );// dev. mem. for ldwork
  piv=( magma_int_t *) malloc(m*sizeof(magma_int_t ));// host mem.

  magma_dsetmatrix( m, m, a, m, d_a, m, queue); // copy a -> d_a

  // Inversion
  magma_dpotrf_gpu( MagmaLower, m, d_a, m, &info);
  magma_dpotri_gpu( MagmaLower, m, d_a, m, &info);

  magma_dgetmatrix( m, m, d_a , m, a, m, queue); // copy d_a ->a

  // Save Final matrix
  for (i = 0; i<m; i++){
    for (j = 0; j<m; j++){
      idx = i*m + j;
      F_output[i][j] = a[idx];
    }
  }

  free(a); // free host memory
  free(piv); // free host memory
  magma_free(d_a); // free device memory
  magma_queue_destroy(queue); // destroy queue
  magma_finalize (); 
  // End magma part
  }
}

Unfortunately, after checking the output data, I have a wrong inversion with my implementation.

I have doubts about the using at this line :

  ldwork = m * magma_get_dgetri_nb( m ); // optimal block size

Could anyone see at first sight where the error comes from in my using of dpotrf and dpotri functions (actually magma_dpotrf_gpu and magma_dpotri_gpu) ?


Reply all
Reply to author
Forward
0 new messages