Hi All,
I have reduced my bug to the following minimal example: I create a MatrixFree object for two different ConstraintMatrices (provided as vector of ConstraintMatrices to the reinit(..)).
matrix_free_data.reinit(dofHandlerVector, d_constraintsVector, quadratureVector, additional_data);
The dofHandlers are the same for both the ConstraintMatrices. Only difference- the first ConstraintMatrix has periodic constraints while the second one has no constraints. Next I run two cases
Case1:
FEEvaluation<3,FEOrder,FEOrder+1,1> phiTotEval(matrix_free_data,0, 0);
const unsigned int numSubCells=matrix_free_data.n_components_filled(cell);
const int numQuadPoints=phiTotEval.n_q_points;
double phiTotRhoOutQuadSum=0.0;
for (unsigned int cell=0; cell<matrix_free_data.n_macro_cells(); ++cell){
phiTotEval.reinit(cell);
phiTotEval.read_dof_values_plain(dftPtr->poissonPtr->phiTotRhoOut); //phiTotRhoOut is a parallel::distributed::Vector where I have previously called ConstraintMatrix(the one with periodic constraints)::distribute() and update_ghost_values()
// I have also called matrix_free_data.initialize_dof_vector(phiTotRhoOut) prior
phiTotEval.evaluate(true,true);
for (unsigned int q=0; q<numQuadPoints; ++q){
VectorizedArray<double> phiTot_q= phiTotEval.get_value(q);
for (unsigned int iSubCell=0; iSubCell<numSubCells; ++iSubCell){
phiTotRhoOutQuadSum+=phiTot_q[iSubCell];
}
}
}
double phiTotRhoOutQuadSumTotal=Utilities::MPI::sum(phiTotRhoOutQuadSum, mpi_communicator);
if (this_mpi_process == 0){
std::cout << "phiTotRhoOutQuadSumTotal_vectorized " << phiTotRhoOutQuadSumTotal<<std::endl;
}
Case2:
Same as Case 1- only first line is changed to (The only difference in both cases is the fe_id argument to FEEvaluation constructor)
FEEvaluation<3,FEOrder,FEOrder+1,1> phiTotEval(matrix_free_data,1, 0);
I get different values of phiTotRhoOutQuadSumTotal for the two cases when run on multiple processors but same value when run on single processor. Comparing with a non-vectorized loop, case 1 which uses the periodic constraints gives the correct answer. However going by the manual, I expected the same value as I am using read_dof_values_plain(..) which doesn't take any constraints into account. I am wondering if I am doing anything wrong here.
Best,
Sambit