Statistical Elements for a Variable


dgillm...@gmail.com

Jun 25, 2025, 1:12:11 PM
to ddi...@googlegroups.com

CDI,

 

Below are elements for describing the statistics for a variable as found in a given data set. I developed these as part of my work for the US Department of Labor three years ago.

 

  • Counts of the values for a variable in a data set:

 

    • Number of allowed values
      • The number of records that have one of the permissible values as defined in the substantive value domain

 

    • Number of missing values
      • The number of records that have one of the permissible values as defined in the missing (sentinel) value domain

 

    • Number of valueless entries
      • The number of records that have an entry that is impermissible with respect to both value domains
      • Note – This becomes a real problem when data are entered by hand

 

  • Valueless list
    • The set of valueless entries

 

  • The following apply to numeric variables (measurements, estimates, or counts):

 

    • Minimum
      • The minimum value in the data (not necessarily the theoretical minimum)

 

    • First quartile
      • The 25th percentile: the value at or below which 25% of the data values fall

 

    • Mean
      • The arithmetic mean of the values in the data

 

    • Median
      • The 50th percentile: the value at or below which 50% of the data values fall

 

    • Third quartile
      • The 75th percentile: the value at or below which 75% of the data values fall

 

    • Maximum
      • The maximum value in the data (not necessarily the theoretical maximum)

 

    • Standard deviation (optional)
      • The standard deviation of the substantive values
      • Note – There are times when it doesn’t make sense to calculate this

 

  • Allowed category counts
    • For a categorical variable, the counts for the 10 most common categories (the list is capped because the number of categories is sometimes very large)

 

  • Missing category counts
    • For any variable, the count for each of the missing (sentinel) categories
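
As an illustration only, the elements above could be computed along the following lines. This is a minimal sketch in Python, not part of the original list: the function name summarize_variable, its parameters, and the dictionary keys are hypothetical, and the "inclusive" quartile method and the sample standard deviation are assumed conventions, since the list does not prescribe either.

```python
from collections import Counter
import statistics


def summarize_variable(values, is_substantive, sentinels, numeric=False, top_n=10):
    """Summarize the recorded entries of one variable (hypothetical helper).

    values         -- raw entries, one per record
    is_substantive -- predicate: True if an entry is in the substantive value domain
    sentinels      -- set of codes in the missing (sentinel) value domain
    """
    allowed, missing, valueless = [], [], []
    for v in values:
        if v in sentinels:
            missing.append(v)
        elif is_substantive(v):
            allowed.append(v)
        else:
            # Impermissible with respect to both value domains.
            valueless.append(v)

    summary = {
        "n_allowed": len(allowed),
        "n_missing": len(missing),
        "n_valueless": len(valueless),
        # The set of valueless entries, ordered by their text form.
        "valueless_list": sorted(set(valueless), key=str),
        "missing_category_counts": dict(Counter(missing)),
    }

    if numeric:
        # statistics.quantiles needs at least two points on older Pythons.
        if len(allowed) >= 2:
            # Assumed convention: "inclusive" quartiles over the observed data.
            q1, median, q3 = statistics.quantiles(allowed, n=4, method="inclusive")
            summary.update(
                minimum=min(allowed),
                first_quartile=q1,
                mean=statistics.fmean(allowed),
                median=median,
                third_quartile=q3,
                maximum=max(allowed),
                # Optional element; sample standard deviation is an assumption.
                standard_deviation=statistics.stdev(allowed),
            )
    else:
        # Counts for the top_n most common substantive categories.
        summary["allowed_category_counts"] = Counter(allowed).most_common(top_n)
    return summary


# Made-up example data: ages 0-120, with -9 and -8 as sentinel codes.
ages = [34, 51, -9, 27, "n/a", 200, 42, -8, 63]
age_summary = summarize_variable(
    ages,
    is_substantive=lambda v: isinstance(v, int) and 0 <= v <= 120,
    sentinels={-9, -8},
    numeric=True,
)
```

Checking the sentinel domain first means a sentinel code is never mistaken for a substantive value, and anything that falls through both checks ends up in the valueless list.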

 

 

Of course, let me know if you have questions. I hope this is helpful.

 

Yours,

Dan

 

 

Dan Gillman

Data Unchained, LLC

+1.410.624.9582

dgillm...@gmail.com

Wendy Thomas

Jun 25, 2025, 1:26:54 PM
to ddi...@googlegroups.com
Dan, 

Not only is this helpful here, but there is also a CV (controlled vocabulary) for summary statistics. I'll check it, and if it is missing any of these, I'll request that they be included in a new version.

Wendy

Wendy L. Thomas                            
ISRDI [retired]

--
DDI-CDI (Cross Domain Integration), https://ddialliance.org/Specification/DDI-CDI/
Email list archive at: https://groups.google.com/forum/#!forum/ddi-cdi
---
You received this message because you are subscribed to the Google Groups "DDI-CDI" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ddi-cdi+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/ddi-cdi/012801dbe5f4%244799e310%24d6cda930%24%40gmail.com.

dgillm...@gmail.com

Jun 25, 2025, 1:30:38 PM
to ddi...@googlegroups.com

Wendy,

 

Thanks. I think the “valueless” idea won’t be included in the CV. The rest probably will one way or the other.

 

Did you tell me whether you want to join my retirement celebration tomorrow?

 

Dan

Wendy Thomas

Jun 25, 2025, 1:59:57 PM
to ddi...@googlegroups.com
Just sent. I saw this just before opening the CDI meeting and hadn't gotten back to it.

Wendy L. Thomas                            
ISRDI [retired]