Source for large test files?

John Liefeld

Jan 7, 2026, 1:37:05 PM
to Discuss
Are there any public collections with large (10 GB+) files that I can use for testing? I am debugging an issue in my application that affects only large files. I have been using a local Globus Connect Personal (GCP) endpoint as the source, but limited upload speeds (at home) and flaky Wi-Fi (at the university) mean that I keep hitting last-mile problems with my GCP sources.

I need a 10+ GB file that I can repeatedly and reliably request to transfer to my system as I work out where the application failure is.

Thanks

Lev Gorenstein

Jan 7, 2026, 1:43:21 PM
to John Liefeld, Discuss
John,

The good folks at ESnet maintain several “ESnet $LOCATION DTN (Anonymous read only testing)” collections (where $LOCATION = {Denver, Houston, LBL-DEV-, Starlight, CERN, Sunnyvale}, etc.) with assemblies of a wide variety of file types and sizes that you could use as data sources for performance and throughput testing. See https://fasterdata.es.net/performance-testing/dtns/ for more details. E.g. here's one at Starlight.

Note that while they are public and available, these DTNs are configured to block access to non-R&E connected sites (to prevent commodity peering sites from being accidentally overwhelmed). So as an example, you would not be able to use them for testing performance of your laptop endpoint while at home - but testing a university cluster endpoint (or even that same laptop while in the office) would work just fine.


Lev

Vas Vasiliadis

Jan 7, 2026, 1:47:38 PM
to John Liefeld, Discuss
Hi John,

There are a few such collections that I’m aware of:

- https://app.globus.org/file-manager?origin_id=651903ec-e892-460e-bfcf-f824d66509fb&origin_path=%2FDME%2Fperftest%2F
- https://app.globus.org/file-manager?destination_id=e6cc344e-91e5-4cc3-afda-3603d02b8a00&destination_path=%2Fperftest%2F

The above two collections have the following datasets:
- ds01: 100MB 10,000 x 10KB files in single directory
- ds04: 10GB 10,000 x 1MB files in 100 non-nested directories, 100 files/directory
- ds06: 100GB 100,000 x 1MB files in single directory
- ds08: 1TB 50 x 10GB; 350 x 1GB; 1,000 x 100MB; 5,500 x 10MB; 23,176 x 1MB files in single directory
- ds10: 1TB 100 x 10GB files in single directory
- ds12: 100GB 1 x 100GB file in single directory
- ds14: 5TB 50 x 100GB files in single directory
- ds16: 1TB 4 x 250GB files in single directory
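
For the original question (a single file of 10GB or more), the list above can be filtered by per-file size. The following is only a sketch: the table is transcribed by hand from the list, the helper name `datasets_with_large_files` is hypothetical, and the decimal unit multipliers are an assumption (the thread does not say whether sizes are decimal or binary).

```python
# Sizes are expressed in KB. Decimal multipliers are an assumption;
# the thread does not state which convention the datasets use.
KB, MB, GB = 1, 1_000, 1_000_000

# Dataset name -> list of (file_count, size_per_file), copied from the list above.
DATASETS = {
    "ds01": [(10_000, 10 * KB)],
    "ds04": [(10_000, 1 * MB)],
    "ds06": [(100_000, 1 * MB)],
    "ds08": [(50, 10 * GB), (350, 1 * GB), (1_000, 100 * MB),
             (5_500, 10 * MB), (23_176, 1 * MB)],
    "ds10": [(100, 10 * GB)],
    "ds12": [(1, 100 * GB)],
    "ds14": [(50, 100 * GB)],
    "ds16": [(4, 250 * GB)],
}


def datasets_with_large_files(min_size):
    """Return names of datasets that contain at least one file >= min_size."""
    return [
        name
        for name, groups in DATASETS.items()
        if any(size >= min_size for _count, size in groups)
    ]


print(datasets_with_large_files(10 * GB))
# -> ['ds08', 'ds10', 'ds12', 'ds14', 'ds16']
```

Of these, ds12 is probably the simplest choice for a repeatable single-file test, since it holds exactly one 100GB file.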

ESnet also maintains a number of collections around the globe for performance testing. You can search for these using “ESnet anonymous” in the Globus web app. Some results:

- https://app.globus.org/file-manager?origin_id=8409a10b-de09-4670-a886-2c0b33f0fe25&origin_path=%2F (Sunnyvale)
- https://app.globus.org/file-manager?origin_id=78f14af7-a8a3-488f-b42d-8c6fa4dfc2ac&origin_path=%2F (Houston)
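
If you want to script transfers rather than click through the web app, the collection ID and starting path can be pulled out of any of the file-manager links in this message. A minimal sketch using only the Python standard library; the helper name `collection_from_url` is hypothetical, and it assumes the links keep the `origin_*` / `destination_*` query parameters seen here.

```python
from urllib.parse import parse_qs, urlparse


def collection_from_url(url):
    """Return (collection_id, path) from a Globus file-manager link."""
    query = parse_qs(urlparse(url).query)
    # The links in this thread use either origin_* or destination_* parameters.
    for side in ("origin", "destination"):
        ids = query.get(f"{side}_id")
        if ids:
            # parse_qs already percent-decodes, so %2F comes back as "/".
            path = query.get(f"{side}_path", ["/"])[0]
            return ids[0], path
    raise ValueError("no collection ID found in URL")


url = ("https://app.globus.org/file-manager"
       "?origin_id=8409a10b-de09-4670-a886-2c0b33f0fe25&origin_path=%2F")
print(collection_from_url(url))
# -> ('8409a10b-de09-4670-a886-2c0b33f0fe25', '/')
```

The returned ID is the value to supply as the source collection when submitting a transfer from a script.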

- Vas

Vas Vasiliadis

Jan 7, 2026, 1:55:31 PM
to John Liefeld, Discuss
P.S. For access to the first two collections, you will need to join this Globus group: https://app.globus.org/groups/3ca64c67-9daf-11e9-855f-0e45b29ab6fa/join

John Liefeld

Jan 7, 2026, 1:57:36 PM
to Discuss, v...@globus.org, Discuss, John Liefeld
Thanks!