First, let me answer your general question. The main reason behind this unfolding procedure is computational efficiency:
- computing the initial ground state and constructing MLWFs is more efficient using k-point sampling than doing it directly in a supercell
- meanwhile, the ΔSCF calculations that determine the screening parameters (where we explicitly remove/add an electron to a localized state) need to be performed in a supercell to avoid the localized charged defect interacting with its periodic images
The computational expense of the unfolding procedure is negligible compared to that of the Wannierization. I can't think of any situation where it would be advantageous not to use the smallest possible cell when constructing the MLWFs.
Now, to answer your specific question: yes, this is possible, but it is not recommended. If you start from a cell that is a 2x2x2 supercell of the primitive cell, then setting the k-point grid to [1, 1, 1] will mean that the screening parameters are calculated on that same 2x2x2 supercell of the primitive cell (which I believe is the process that you are describing). However, if you do this, I think you will still see the "unfolding to supercell" step during the workflow. This is because that step does two things: it converts the k-point-indexed WFs to Gamma-only WFs, and it converts the Wannier90 files to a format that the subsequent kcp.x calculations can read. So it is still necessary even when the k-point grid is [1, 1, 1]!
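To make the comparison concrete, here is a rough bookkeeping sketch in Python. It is not part of the code or its API: the function names and the dictionary layout are hypothetical, and it only does the arithmetic of multiplying the starting cell by the k-point grid. It shows that both setups end up with the same supercell for the ΔSCF step, but the alternative one has to Wannierize a cell that is 8 times larger.

```python
from math import prod

# Hypothetical helper names, for illustration only (not part of any real API).

def dscf_supercell(cell_multiplicity, kgrid):
    """Supercell used for the ΔSCF step, in units of the primitive cell.

    An N1 x N2 x N3 k-point grid on a given cell unfolds to an
    N1 x N2 x N3 supercell of that cell.
    """
    return [m * k for m, k in zip(cell_multiplicity, kgrid)]


def wannierization_cell_volume(cell_multiplicity):
    """Volume of the cell used for the Wannierization, in primitive cells."""
    return prod(cell_multiplicity)


# Recommended: primitive cell, with the supercell generated via k-points.
recommended = {"cell": [1, 1, 1], "kgrid": [2, 2, 2]}
# Alternative from the question: start from a 2x2x2 supercell, Gamma only.
alternative = {"cell": [2, 2, 2], "kgrid": [1, 1, 1]}

for name, setup in [("recommended", recommended), ("alternative", alternative)]:
    supercell = dscf_supercell(setup["cell"], setup["kgrid"])
    volume = wannierization_cell_volume(setup["cell"])
    print(f"{name}: ΔSCF supercell = {supercell}, "
          f"Wannierization cell = {volume} primitive cell(s)")
```

Both setups report a [2, 2, 2] supercell for the ΔSCF step; the difference is only in the size of the cell used for the ground state and the Wannierization (1 primitive cell versus 8).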