Question about MSAK schema and data it collected

Ray Tseng

Jul 14, 2025, 6:47:05 AM
to discuss
Hi,

I have a couple of questions about MSAK's schema and the data it collects.

  • What is Metadata - NameValue? Does it only include data about the client, such as Browser/Android, or what else is included in this object?
https://github.com/m-lab/msak/blob/3b679ee70d69616807444d0e35bbaa6d538f4803/pkg/throughput1/model/namevalue.go

  • Does M-Lab collect other information about the IP address, such as geolocation, that could be used to backtrack a user's activity? Does M-Lab/MSAK collect any other user data or PII besides the IP address?
  • What is the data retention period? Will the data be cleaned up after a specific period?


Thanks
Ray

Roberto D'Auria

Jul 14, 2025, 9:38:29 AM
to Ray Tseng, discuss
Hi Ray,

> What is Metadata - NameValue? Does it only include data about the client, such as Browser/Android, or what else is included in this object?

That's a generic structure for storing key/value pairs in BigQuery. It's used in the raw.ClientOptions and raw.ClientMetadata records, which contain test options and free-form metadata provided by the client, respectively. You can find a description of the schema in the MSAK design document.

The content of ClientMetadata is entirely up to the client. For example, the official Go client sends metadata about the client OS (very generic, since it is just the GOOS value, e.g. "linux"), the CPU architecture, and the client name/version plus the client library name/version. The JS client library does something similar by parsing the User-Agent string.

> Does M-Lab collect other information about the IP address, such as geolocation, that could be used to backtrack a user's activity? Does M-Lab/MSAK collect any other user data or PII besides the IP address?

We don't collect the location at test time, but we do add geographical and network annotations later on so that they are available in BigQuery (see client.Geo and client.Network in the msak.throughput1 view). This geographical annotation is based on the IP address, using the MaxMind database to *estimate* the client's location.

We do not collect any PII other than the client's IP address, and we have no way to "track the user activity". All we know is that a client with a given IP address ran an MSAK (or NDT) test at a given time, plus the result of that test.

> What is the data retention period? Will the data be cleaned up after a specific period?

We store all test data indefinitely.

Kind regards,
Roberto D'Auria

--
Roberto D'Auria
Platform Engineer