The first (partial) paragraph explains the structure. Your interpretation is basically correct.
This file has 2 datasets:
The first is named 'filetype' and holds a single string value.
From the description, that value is always 'source' for external sources. Reading this dataset returns the string as NumPy data (in h5py 3.x, as bytes that you can decode).
The second dataset is named 'source_bank' and is a compound dataset. That means it has a mix of different variable types (in this case 4 real*8 vars plus 1 int*4 var). h5py returns compound datasets as NumPy structured arrays (often loosely called record arrays), so each field can be accessed by name.
If you have an example file, here is some simple code to get shape and dtype for each dataset, then read the contents into a NumPy array.
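import h5py

# Open read-only; the with-block closes the file automatically.
with h5py.File('filename.h5', 'r') as h5f:
    print(h5f['filetype'].dtype)
    print(h5f['filetype'].shape)
    # [()] reads the whole dataset and also works if the dataset is a scalar
    filetype_arr = h5f['filetype'][()]
    print(h5f['source_bank'].dtype)
    print(h5f['source_bank'].shape)
    source_bank_arr = h5f['source_bank'][()]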
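Once the compound dataset is in memory, you work with it by field name. Here is a minimal sketch of that. The field names ('wgt', 'x', 'y', 'E', 'group') are hypothetical placeholders I made up to match the "4 real*8 plus 1 int*4" layout; check `dtype.names` on your real file to get the actual ones. The example builds a small in-memory file so it runs without your data.

```python
import numpy as np
import h5py

# Hypothetical field names: 4 float64 fields plus 1 int32 field.
# The real names come from h5f['source_bank'].dtype.names in your file.
bank_dtype = np.dtype([('wgt', 'f8'), ('x', 'f8'), ('y', 'f8'),
                       ('E', 'f8'), ('group', 'i4')])

# Made-up demo records, only to have something to read back.
demo = np.zeros(3, dtype=bank_dtype)
demo['wgt'] = [1.0, 0.5, 0.25]
demo['group'] = [1, 2, 3]

# driver='core' with backing_store=False keeps the file in memory only.
with h5py.File('demo_in_memory.h5', 'w', driver='core',
               backing_store=False) as h5f:
    h5f.create_dataset('source_bank', data=demo)
    bank = h5f['source_bank'][()]   # NumPy structured array

field_names = bank.dtype.names      # tuple of field names
weights = bank['wgt']               # one field across all records
first_row = bank[0]                 # one record with all five fields
print(field_names)
print(weights)
print(first_row)
```

Indexing with a field name gives you a regular array for that column, and indexing with an integer gives you one full record.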
Do you need to convert .txt files to .h5 files?
If so, the approach depends on the format of the .txt file.
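To give a rough idea, here is a sketch of one possible conversion. It assumes (and this is purely my assumption) that the .txt file has one record per line with 5 whitespace-separated columns: 4 floats followed by 1 integer. The field names are hypothetical, and storing 'filetype' as a fixed-length byte string is also a guess; your target format may require something different. The text is read from a string and the HDF5 file is kept in memory so the example runs as-is.

```python
import io
import numpy as np
import h5py

# Hypothetical layout: 4 float64 columns plus 1 int32 column.
bank_dtype = np.dtype([('wgt', 'f8'), ('x', 'f8'), ('y', 'f8'),
                       ('E', 'f8'), ('group', 'i4')])

# Stand-in for the contents of the .txt file (made-up values).
text = "1.0 0.0 0.0 2.5 1\n0.5 1.0 -1.0 0.1 2\n"

# loadtxt with a structured dtype maps each column to one field.
# ndmin=1 keeps the result 1-D even if the file has a single row.
bank = np.loadtxt(io.StringIO(text), dtype=bank_dtype, ndmin=1)

# In-memory HDF5 file; drop driver/backing_store to write to disk instead.
with h5py.File('converted_in_memory.h5', 'w', driver='core',
               backing_store=False) as h5f:
    # Assumption: 'filetype' stored as a fixed-length byte string scalar.
    h5f.create_dataset('filetype', data=np.bytes_('source'))
    h5f.create_dataset('source_bank', data=bank)
    # Read back to confirm what was written.
    filetype_back = h5f['filetype'][()]
    bank_back = h5f['source_bank'][()]

print(filetype_back)
print(bank_back.dtype)
print(bank_back)
```

For a real file you would pass the file path to `np.loadtxt` instead of the `io.StringIO` object. If the .txt file has headers, comments, or a different delimiter, the `skiprows`, `comments`, and `delimiter` arguments of `np.loadtxt` would need adjusting.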
The text ends with "The process consists of two steps". What does it say after that?