Hi Maoting,
No worries!
1) You are correct that there will be no difference in the calculated values, only in the file/directory count. However, as you note, voila tsv has no --select-prefixes option at this time. The lab and I have not yet decided whether this will be a legacy format, be superseded by the majiq tsv commands in some way, or be kept longer term. I'm going to post some internal messages soon to get an answer on this. If we do keep it, I will consider adding more flags such as --select-prefixes. Until then, as you suggest, voila shows groups by default, and to show individual experiments there should be one file per experiment.
2) The only relevant data in sgc files is read counts. From a cursory look over the voila tsv output columns, I believe these files are only needed when using the --show-read-counts switch. I will verify this and provide an updated version soon that requires sgc files only if this switch is specified. To answer your question directly, though: for the reads output columns to be processed successfully, sgc files should be provided for all groups/experiments for which you are providing psicov/dpsicov files.
3) Internally, we have had a group discussion scheduled to go over a sustainable way to keep the output header documentation in line with internal development. This meeting has been delayed multiple times; I will bring it up again as a renewed priority, citing your confusion as a cause. In the meantime, please feel free to ask here about any other columns you may not understand. The columns "junction_coords", "exon_coords", and "IR_coords" usually each held a semicolon-separated list of values. In the new junc-per-row format of v3, that information is broken up across rows:
junction_coords: one set of start-end for each junction; in the new format this is the "start" and "end" columns
exon_coords: one set of start-end for each exon; in the new format this is ref_exon_start-ref_exon_end, plus other_exon_start-other_exon_end for each row
IR_coords: one set of start-end for an intron, if the LSV contained an intron; in the new format this is the "start" and "end" columns in rows where the "is_intron" column is True
For all of these, the criterion that delineates one LSV from the next is a change in the set of (gene_id, event_type, ref_exon_start, ref_exon_end).
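If it helps, here is a rough Python sketch of how the per-row columns could be collapsed back into the old semicolon-separated style. This is only an illustration, not how voila does it: the sample rows and the helper name collapse_to_lsvs are made up, the event_type values are placeholders, and I am assuming that is_intron is written as the text True/False and that intron rows should be left out of junction_coords. Only the column names come from the description above.

```python
import csv
import io
from itertools import groupby

# Hypothetical sample data; only the column names come from the post above.
SAMPLE_TSV = (
    "gene_id\tevent_type\tref_exon_start\tref_exon_end\t"
    "start\tend\tis_intron\tother_exon_start\tother_exon_end\n"
    "geneA\tsource\t100\t200\t200\t300\tFalse\t300\t400\n"
    "geneA\tsource\t100\t200\t200\t500\tFalse\t500\t600\n"
    "geneA\tsource\t100\t200\t201\t299\tTrue\t300\t400\n"
    "geneA\ttarget\t500\t600\t400\t500\tFalse\t300\t400\n"
)

# A new LSV starts whenever this set of column values changes.
LSV_KEY = ("gene_id", "event_type", "ref_exon_start", "ref_exon_end")


def lsv_key(row):
    return tuple(row[col] for col in LSV_KEY)


def collapse_to_lsvs(rows):
    """Group consecutive rows sharing the LSV key into one record per LSV."""
    lsvs = []
    for key, group in groupby(rows, key=lsv_key):
        group = list(group)
        # Assumption: is_intron is the literal string "True" for intron rows.
        junctions = [f"{r['start']}-{r['end']}"
                     for r in group if r["is_intron"] != "True"]
        introns = [f"{r['start']}-{r['end']}"
                   for r in group if r["is_intron"] == "True"]
        # Reference exon first, then each row's other exon, without repeats.
        exons = [f"{group[0]['ref_exon_start']}-{group[0]['ref_exon_end']}"]
        exons += [f"{r['other_exon_start']}-{r['other_exon_end']}" for r in group]
        exons = list(dict.fromkeys(exons))  # de-duplicate, keep order
        lsvs.append({
            "gene_id": key[0],
            "event_type": key[1],
            "junction_coords": ";".join(junctions),
            "exon_coords": ";".join(exons),
            "IR_coords": ";".join(introns),
        })
    return lsvs


rows = list(csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t"))
lsvs = collapse_to_lsvs(rows)
for lsv in lsvs:
    print(lsv)
```

Note that groupby only merges consecutive rows, which matches the "when the set changes" rule above; if your file were reordered, you would need to sort by the same four columns first.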
Let me know if it is understandable.
-San