JSON-Stat API and Model Context Protocol (MCP) for AI.


Sergio Nieto

Apr 15, 2026, 1:07:23 PM

to json-stat

What modifications do I need to make to my JSON-Stat–based API to add metadata and build a Model Context Protocol (MCP) on top of my API?

Xavier Badosa

Apr 20, 2026, 4:46:41 PM

to json-stat

Sergio,

Please check Jan's reply:

I do not think this is a direct answer to your question, but PxWebApi v2 uses JSON-stat v2. I have been working on a generic Claude skill for PxWebApi. I just uploaded a Beta. Maybe you will find some inspiration here: https://github.com/janbrus/ssb-api-v2-examples/tree/main/pxwebapi-v2-generic-skill . It is based on the skill for Statistics Norway's PxWebApi v2 https://github.com/janbrus/ssb-api-v2-examples/tree/main/ssb-pxwebapi-v2-skill

Thanks, Jan. That's very interesting. I will read it carefully. I think Statistics Finland has done similar tests (building an MCP based on their JSON-stat API).

What modifications do I need to make to my JSON-Stat–based API to add metadata and build a Model Context Protocol (MCP) on top of my API?

Sergio, the richer your API, the more powerful your MCP will be. Consequently, it is preferable for the API to support optional JSON-stat features (such as dimension roles), rather than omitting them. Furthermore, leveraging the built-in extension mechanism provided by the JSON-stat specification may be advantageous.
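As a rough illustration of that advice, here is a minimal Python sketch of a metadata-only dataset that declares dimension roles and uses the `extension` property. Everything in it is an assumption made for the example: the ids, the labels, the `assistant_hints` key and the two helper functions are hypothetical and do not come from any real API.

```python
# Hypothetical metadata-only JSON-stat 2.0 dataset (no "value" array).
# All ids, labels and the contents of "extension" are invented for illustration.
DATASET = {
    "version": "2.0",
    "class": "dataset",
    "label": "Population by sex and year (illustrative)",
    "updated": "2026-01-15",
    "id": ["concept", "sex", "year"],
    "size": [1, 2, 3],
    # Optional roles tell a client (or a model) what each dimension means.
    "role": {"metric": ["concept"], "time": ["year"]},
    "dimension": {
        "concept": {
            "label": "Concept",
            "category": {
                "label": {"pop": "Population"},
                "unit": {"pop": {"label": "persons", "decimals": 0}},
            },
        },
        "sex": {
            "label": "Sex",
            "category": {
                "index": ["M", "F"],
                "label": {"M": "Male", "F": "Female"},
            },
        },
        "year": {
            "label": "Year",
            "category": {"index": ["2022", "2023", "2024"]},
        },
    },
    # "extension" is free-form in the specification; this content is hypothetical.
    "extension": {
        "assistant_hints": {
            "description": "Resident population on 1 January.",
            "source": "Illustrative statistical office",
        }
    },
}


def roles_by_dimension(dataset: dict) -> dict:
    """Map each dimension id to its declared role, or None if it has none."""
    roles = {dim_id: None for dim_id in dataset["id"]}
    for role, dim_ids in dataset.get("role", {}).items():
        for dim_id in dim_ids:
            roles[dim_id] = role
    return roles


def tool_description(dataset: dict) -> str:
    """Build a short description that a tool layer could pass to a model."""
    hints = dataset.get("extension", {}).get("assistant_hints", {})
    roles = roles_by_dimension(dataset)
    dims = ", ".join(
        f"{dim_id} ({roles[dim_id] or 'classification'})" for dim_id in dataset["id"]
    )
    return f"{dataset['label']}. {hints.get('description', '')} Dimensions: {dims}."
```

With roles present, a client does not have to guess which dimension is time or which one carries the unit of measure; without them, that information has to be inferred from labels.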

While I have not yet implemented the Model Context Protocol (MCP) specifically, I have developed a data retrieval system enhanced by tool-calling capabilities—an architecture that requires similar service-oriented infrastructures. Specifically, I have built a conversational search assistant powered by a Small Language Model (SLM), such as Gemma 4. Despite its "reduced" parameter count, Gemma 4 provides multilingual support and good native function-calling capabilities, which are needed to interact with the JSON-stat API.
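To make the tool-calling idea concrete, the following is a minimal sketch of how function calls emitted by a model could be routed to a JSON-stat API. It is not the assistant described above and it is not the MCP wire format: the tool names, the URL layout and the `fetch_json` callable are all hypothetical, and the tool catalogue only follows the general JSON Schema style that function-calling models commonly accept.

```python
import json
from typing import Callable

# Hypothetical tool catalogue; names and parameters are invented.
TOOLS = [
    {
        "name": "list_datasets",
        "description": "List available datasets (JSON-stat 'collection').",
        "parameters": {"type": "object", "properties": {}},
    },
    {
        "name": "get_metadata",
        "description": "Get dataset metadata (JSON-stat 'dataset' without values).",
        "parameters": {
            "type": "object",
            "properties": {"dataset_id": {"type": "string"}},
            "required": ["dataset_id"],
        },
    },
]


def make_dispatcher(fetch_json: Callable[[str], dict], base_url: str):
    """Return a function that executes one tool call emitted by the model.

    fetch_json is injected so the sketch stays independent of any HTTP
    library; the URL layout below is a placeholder, not a real API.
    """
    handlers = {
        "list_datasets": lambda args: fetch_json(f"{base_url}/datasets"),
        "get_metadata": lambda args: fetch_json(
            f"{base_url}/datasets/{args['dataset_id']}/metadata"
        ),
    }

    def dispatch(call: dict) -> str:
        handler = handlers.get(call.get("name"))
        if handler is None:
            return json.dumps({"error": f"unknown tool: {call.get('name')}"})
        try:
            result = handler(call.get("arguments", {}))
        except KeyError as missing:
            return json.dumps({"error": f"missing argument: {missing.args[0]}"})
        # The JSON-stat response is handed back to the model unchanged.
        return json.dumps(result)

    return dispatch
```

The design choice here is that the API's JSON-stat responses are returned as they are, so the quality of what the model sees depends directly on how rich those responses are.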

The model accesses lists of statistics and datasets through JSON-stat 'collection' responses from the API, which it has no problem processing. Additionally, the API uses JSON-stat 'dataset' responses to convey metadata ("empty datasets" or datasets without observation values). Again, the model effectively parses these metadata structures without any help, enabling it to independently retrieve information such as update timestamps, the latest available reference periods, or the age groups (or any other classification) used.
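For readers who want to do the same extraction in code rather than leave it to the model, here is a minimal sketch, assuming JSON-stat 2.0 responses. The function names are hypothetical, and taking the last category of the time dimension as the latest period is an assumption that only holds when the API lists periods chronologically.

```python
def category_ids(dimension: dict) -> list:
    """Return category ids in order; "index" may be a list, an object or absent."""
    category = dimension["category"]
    index = category.get("index")
    if index is None:
        # "index" may be omitted when the dimension has a single category.
        return list(category["label"])
    if isinstance(index, dict):
        return sorted(index, key=index.get)
    return list(index)


def list_items(collection: dict) -> list:
    """Return (label, href) pairs for the items of a 'collection' response."""
    items = collection.get("link", {}).get("item", [])
    return [(item.get("label"), item.get("href")) for item in items]


def summarize_metadata(dataset: dict) -> dict:
    """Pull the update timestamp, latest period and classifications."""
    time_ids = dataset.get("role", {}).get("time", [])
    latest = None
    if time_ids:
        # Assumption: time categories are listed in chronological order.
        latest = category_ids(dataset["dimension"][time_ids[0]])[-1]
    classifications = {}
    for dim_id in dataset["id"]:
        dimension = dataset["dimension"][dim_id]
        labels = dimension["category"].get("label", {})
        classifications[dim_id] = [
            labels.get(cat_id, cat_id) for cat_id in category_ids(dimension)
        ]
    return {
        "updated": dataset.get("updated"),
        "latest_period": latest,
        "classifications": classifications,
    }
```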

What about data? In the past, getting a model to interpret JSON-stat data correctly required explicitly telling it about the 'row-major order' storage convention. This context was essential for the model to correctly reconstruct the multidimensional data cube from the JSON-stat flattened array. Such prompting is becoming less and less necessary. Take Claude, for example. Using Sonnet 4.6 (no need for the more powerful Opus), Claude is able to answer this question:

Help me analyze the data from this JSON-stat end-point: https://api.idescat.cat/taules/v2/censph/539/5976/cat/data?LAST=5&lang=en

Claude fetches the JSON-stat dataset on its own and, without the need for explicit architectural hints, natively interprets the JSON-stat structure and builds several data visualizations to help the user understand the data. In addition to visual outputs, the model also generates an automated textual analysis of the data. In case you are interested, here is the shared chat session with Claude (please note that authentication may be required to access the interactive visuals):

https://claude.ai/share/bbc3c56d-8b01-4a7c-9333-58fcbde0db7b
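For clients that still have to reconstruct the cube themselves, the row-major convention mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not library code: the function names and the sample dataset are hypothetical. The offset of a cell in the flattened `value` array is computed from the position of each category, with the last dimension in `id` varying fastest.

```python
def category_position(dimension: dict, category_id: str) -> int:
    """Return the position of a category within its dimension."""
    index = dimension["category"].get("index")
    if index is None:
        # "index" may be omitted for a single-category dimension.
        return 0
    if isinstance(index, dict):
        return index[category_id]
    return index.index(category_id)


def cell_value(dataset: dict, coordinates: dict):
    """Look up one observation given a category id for every dimension."""
    offset = 0
    for dim_id, size in zip(dataset["id"], dataset["size"]):
        position = category_position(dataset["dimension"][dim_id], coordinates[dim_id])
        # Row-major order: each step scales by the size of the current dimension.
        offset = offset * size + position
    values = dataset["value"]
    if isinstance(values, dict):
        # Sparse form: keys are offsets written as strings.
        return values.get(str(offset))
    return values[offset]


# Hypothetical 2 x 3 dataset used only to illustrate the lookup.
SAMPLE = {
    "id": ["sex", "year"],
    "size": [2, 3],
    "dimension": {
        "sex": {"category": {"index": ["M", "F"]}},
        "year": {"category": {"index": {"2022": 0, "2023": 1, "2024": 2}}},
    },
    "value": [10, 11, 12, 20, 21, 22],
}
```

For example, `cell_value(SAMPLE, {"sex": "F", "year": "2023"})` resolves to offset 1 * 3 + 1 = 4, which holds 21.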

I have also worked successfully with JSON-stat in Claude Artifacts, for example to convert a JSON-stat source file into CSV.
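The same conversion can be done without a model. Below is a minimal sketch using only the Python standard library, assuming a JSON-stat 2.0 dataset already parsed into a dictionary. The function names and the sample data are hypothetical, and this is not the code that Claude generated.

```python
import csv
import io
from itertools import product


def category_ids(dimension: dict) -> list:
    """Return category ids in order; "index" may be a list, an object or absent."""
    category = dimension["category"]
    index = category.get("index")
    if index is None:
        return list(category["label"])
    if isinstance(index, dict):
        return sorted(index, key=index.get)
    return list(index)


def jsonstat_to_csv(dataset: dict) -> str:
    """Flatten a JSON-stat dataset into CSV text, one row per observation."""
    dim_ids = dataset["id"]
    dimensions = [dataset["dimension"][dim_id] for dim_id in dim_ids]
    values = dataset["value"]
    out = io.StringIO()
    writer = csv.writer(out)
    header = [dim.get("label", dim_id) for dim, dim_id in zip(dimensions, dim_ids)]
    writer.writerow(header + ["value"])
    # product() varies the last dimension fastest, which matches row-major order.
    combos = product(*(category_ids(dim) for dim in dimensions))
    for offset, combo in enumerate(combos):
        if isinstance(values, dict):
            value = values.get(str(offset))  # sparse form; missing cells stay empty
        else:
            value = values[offset]
        labels = [
            dim["category"].get("label", {}).get(cat_id, cat_id)
            for dim, cat_id in zip(dimensions, combo)
        ]
        writer.writerow(labels + [value])
    return out.getvalue()


# Hypothetical dataset used only to illustrate the conversion.
SAMPLE = {
    "id": ["sex", "year"],
    "size": [2, 3],
    "dimension": {
        "sex": {
            "label": "Sex",
            "category": {"index": ["M", "F"], "label": {"M": "Male", "F": "Female"}},
        },
        "year": {"label": "Year", "category": {"index": ["2022", "2023", "2024"]}},
    },
    "value": [10, 11, 12, 20, 21, 22],
}
```

Calling `jsonstat_to_csv(SAMPLE)` returns the CSV as a string, which can then be written to a file or handed to another tool.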