Hi Philip,
I'm working on extracting DDI from tabular files, which is why I'm asking. As I understand it, Dataverse calculates summary statistics for all numeric variables during file ingestion. Is that correct? When I load a CSV file, it runs this calculation on every variable it identifies as numeric. Is that also correct? When I prepare a structured SPSS file (SAV), I can set each variable's measurement level to "Scale", "Nominal", or "Ordinal". This matters to SPSS but apparently not to Dataverse, which treats every variable as numeric unless it is a character type. Is that also correct?
My question was about whether the variable settings in SPSS make any difference to Dataverse. For example, I expected that setting a variable to "Nominal" would stop Dataverse from calculating summary statistics. I tried these settings this week without success: Dataverse skipped summary statistics only for character-type variables.
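To repeat this check across a whole export rather than one variable at a time, something like the following Python sketch could list which variables carry `sumStat` elements. It is only an illustration under a few assumptions: the function name `sumstats_by_variable` and the sample XML (including the `CO_CEP_TXT` variable) are made up for the example, and the helper ignores any XML namespace because a full DDI export may declare one while the fragment here does not.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Drop any '{namespace}' prefix so matching works with or without one."""
    return tag.rsplit("}", 1)[-1]


def sumstats_by_variable(ddi_xml):
    """Map each <var> name to (varFormat type, list of sumStat types found)."""
    root = ET.fromstring(ddi_xml)
    result = {}
    for elem in root.iter():
        if local(elem.tag) != "var":
            continue
        fmt = None
        stats = []
        for child in elem:
            if local(child.tag) == "varFormat":
                fmt = child.get("type")
            elif local(child.tag) == "sumStat":
                stats.append(child.get("type"))
        result[elem.get("name")] = (fmt, stats)
    return result


# Hypothetical sample modelled on the fragment in this message;
# CO_CEP_TXT is an invented character-type counterpart.
SAMPLE = """<dataDscr>
  <var ID="v1" name="CO_CEP" intrvl="discrete">
    <sumStat type="min">7.685E7</sumStat>
    <sumStat type="max">7.6997E7</sumStat>
    <varFormat type="numeric"/>
  </var>
  <var ID="v2" name="CO_CEP_TXT" intrvl="discrete">
    <varFormat type="character"/>
  </var>
</dataDscr>"""

for name, (fmt, stats) in sumstats_by_variable(SAMPLE).items():
    print(name, fmt, stats or "no summary statistics")
```

Running it on the DDI exported before and after changing a variable's measurement level in SPSS would show whether the list of statistics changes at all.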
Example with a numeric variable:
<var ID="v6852" name="CO_CEP" intrvl="discrete">
  <location fileid="f276"/>
  <labl level="variable">CEP</labl>
  <sumStat type="mode">.</sumStat>
  <sumStat type="min">7.685E7</sumStat>
  <sumStat type="max">7.6997E7</sumStat>
  <sumStat type="vald">501.0</sumStat>
  <sumStat type="invd">0.0</sumStat>
  <sumStat type="mean">7.691397609580839E7</sumStat>
  <sumStat type="medn">7.6907648E7</sumStat>
  <sumStat type="stdev">47741.80343756191</sumStat>
  <varFormat type="numeric"/>
</var>
Example with the same variable after changing it to character type: