Converting NDT PCAP Files to Parquet Format


Pablo Rojo

May 20, 2025, 5:35:45 AM
to discuss

Hi Community,

In some cases, we may need to process NDT PCAP files to calculate test statistics not included in BigQuery. To facilitate this, we've published a Python utility that converts PCAP files to Parquet format (JSON and CSV are also supported).

This tool is designed to be generic and decodes all fields present in NDT PCAP files (see schema.md for details). It accepts multiple gzip files as input, generates a single output file, and can run multi-threaded.
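For anyone who wants to sanity-check their input files before converting them, here is a minimal sketch using only the Python standard library. It is not part of pcaptoparquet and does not use its API. It assumes the gzip files contain classic libpcap captures (not pcapng), and the function name `read_pcap_header` is just a name chosen for illustration.

```python
import gzip
import struct

# Magic numbers of the classic libpcap file format and the timestamp
# resolution each one implies. pcapng files use a different layout and
# are rejected by this sketch.
PCAP_MAGICS = {
    0xA1B2C3D4: "microsecond",
    0xA1B23C4D: "nanosecond",
}


def read_pcap_header(path):
    """Read the 24-byte global header of a gzip-compressed classic PCAP file.

    Hypothetical helper for illustration only.
    """
    with gzip.open(path, "rb") as fh:
        raw = fh.read(24)
    if len(raw) < 24:
        raise ValueError(f"{path}: too short to hold a PCAP global header")

    # The writer's byte order is unknown, so try little-endian first,
    # then big-endian, and accept whichever yields a known magic number.
    for endian in ("<", ">"):
        magic, major, minor, _thiszone, _sigfigs, snaplen, linktype = struct.unpack(
            endian + "IHHiIII", raw
        )
        if magic in PCAP_MAGICS:
            return {
                "byte_order": "little" if endian == "<" else "big",
                "timestamps": PCAP_MAGICS[magic],
                "version": (major, minor),
                "snaplen": snaplen,
                "linktype": linktype,
            }
    raise ValueError(f"{path}: not a classic PCAP file")
```

Calling `read_pcap_header` on each input file gives a quick way to spot truncated or non-PCAP files before starting a long conversion run.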

Here's the link to the repository in case it is useful to anyone: https://github.com/nokia/pcaptoparquet

The CLI has been primarily tested in Linux environments, but it also runs on Windows. The Python library should be usable on any platform.

We've tested it with NDT gzip files. In our test, the Parquet output was only about 9% larger than the original gzip input.

Size Evaluation:

devbox$ more test_mlab_bulk.log
Input file size: 700009.711 KiB
Output file size: 760944.678 KiB
Compression ratio: 109%
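The ratio in the log is simply the output size divided by the input size. As a quick check of the arithmetic, here is a short sketch that recomputes it from the two sizes reported above (the function name `size_ratio` is only illustrative):

```python
def size_ratio(input_kib: float, output_kib: float) -> float:
    """Return the output size as a percentage of the input size."""
    return 100.0 * output_kib / input_kib


# Sizes taken from the log above, in KiB.
ratio = size_ratio(700009.711, 760944.678)
print(f"Output is {ratio:.1f}% of input ({ratio - 100:.1f}% larger)")
# prints "Output is 108.7% of input (8.7% larger)"
```

So the 109% in the log is a rounded 108.7%, meaning the Parquet file is a little under 9% larger than the gzip input.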


We welcome your comments and contributions. Please let me know if you have any questions.

Best regards, Pablo

