Missing audio_embedding Feature in Audioset TFRecord Files

Ivan Birkmaier

Sep 12, 2024, 5:04:44 PM
to audioset-users

Hi everyone,

I'm trying to run the AudioSet recipes from AST, but I'm having trouble with the TFRecord files from the EU features dataset. When I inspect the TFRecord files, I can't find the audio_embedding feature, only other features such as labels and end_time_seconds. The labels and end_time_seconds entries also appear to be switched in the file.

Has anyone else encountered this issue? Is there a known solution or workaround? Unfortunately, I don't have another source for the data.

Thanks in advance!

Ivan Birkmaier

Sep 12, 2024, 5:06:23 PM
to audioset-users
I also tried the US feature dataset - same issue.

Manoj Plakal

Sep 18, 2024, 3:25:43 PM
to Ivan Birkmaier, audioset-users

Hi Ivan,

Could you be a little more specific about the errors you are seeing, which files you tried, and what code you are using to read the features?

I just downloaded the features, and I can see them when I read the files directly in Python. Labels and times look OK. Sample session below.

Manoj

code:

import tensorflow as tf

raw_ds = tf.data.TFRecordDataset('audioset_v1_embeddings/unbal_train/oL.tfrecord')
for raw_rec in raw_ds.take(10):
  ex = tf.train.SequenceExample()
  ex.ParseFromString(raw_rec.numpy())
  print(ex.context)  # should print video_id, labels, start/end times
  print(list(ex.feature_lists.feature_list))   # should print 'audio_embedding'
  print(list(map(hex, ex.feature_lists.feature_list['audio_embedding'].feature[0].bytes_list.value[0])))   # print first 128-d embedding

output:

1507
feature {
  key: "video_id"
  value {
    bytes_list {
      value: "oLF_1sIOAf8"
    }
  }
}
feature {
  key: "start_time_seconds"
  value {
    float_list {
      value: 400
    }
  }
}
feature {
  key: "labels"
  value {
    int64_list {
      value: 137
    }
  }
}
feature {
  key: "end_time_seconds"
  value {
    float_list {
      value: 410
    }
  }
}

['audio_embedding']
['0xa6', '0x48', '0x57', '0x68', '0xb9', '0x3e', '0x6c', '0x6b', '0x91', '0xbe', '0x7e', '0xb7', '0x66', '0x78', '0x8c', '0x37', '0xb7', '0x55', '0x5a', '0x86', '0x76', '0x27', '0xaf', '0x65', '0x7f', '0x6e', '0xdb', '0xf5', '0x57', '0xbb', '0x15', '0x22', '0x56', '0x66', '0xcc', '0xb9', '0x32', '0x72', '0x45', '0xa2', '0x76', '0x46', '0x7b', '0x85', '0xb0', '0xb7', '0x87', '0x8c', '0x9b', '0x9d', '0x80', '0xcf', '0x59', '0xac', '0xef', '0x76', '0xf1', '0x2c', '0x7e', '0x9e', '0x53', '0xab', '0x0', '0x2c', '0x70', '0x89', '0x64', '0xb9', '0xb4', '0x52', '0xcd', '0x53', '0xb3', '0xa4', '0x2e', '0x71', '0x7e', '0x6c', '0x67', '0x46', '0x8e', '0x87', '0x7f', '0x8e', '0xad', '0xa7', '0x53', '0x63', '0x99', '0x37', '0x51', '0x95', '0x8b', '0xbf', '0x48', '0x62', '0xd6', '0x50', '0xa5', '0x1f', '0xb8', '0xa2', '0x48', '0x50', '0x4d', '0xe2', '0x4f', '0x32', '0xb6', '0x2d', '0x50', '0x64', '0x50', '0x26', '0x91', '0x69', '0xcb', '0x4d', '0x65', '0x98', '0x93', '0xaa', '0x44', '0xee', '0x7e', '0x52', '0xb5', '0x8c']
1507
feature {
  key: "video_id"
  value {
    bytes_list {
      value: "oLIRBB_Y0Ao"
    }
  }
}
feature {
  key: "start_time_seconds"
  value {
    float_list {
      value: 30
    }
  }
}
feature {
  key: "labels"
  value {
    int64_list {
      value: 137
    }
  }
}
feature {
  key: "end_time_seconds"
  value {
    float_list {
      value: 40
    }
  }
}

['audio_embedding']
['0xa6', '0x48', '0x57', '0x68', '0xb9', '0x3e', '0x6c', '0x6b', '0x91', '0xbe', '0x7e', '0xb7', '0x66', '0x78', '0x8c', '0x37', '0xb7', '0x55', '0x5a', '0x86', '0x76', '0x27', '0xaf', '0x65', '0x7f', '0x6e', '0xdb', '0xf5', '0x57', '0xbb', '0x15', '0x22', '0x56', '0x66', '0xcc', '0xb9', '0x32', '0x72', '0x45', '0xa2', '0x76', '0x46', '0x7b', '0x85', '0xb0', '0xb7', '0x87', '0x8c', '0x9b', '0x9d', '0x80', '0xcf', '0x59', '0xac', '0xef', '0x76', '0xf1', '0x2c', '0x7e', '0x9e', '0x53', '0xab', '0x0', '0x2c', '0x70', '0x89', '0x64', '0xb9', '0xb4', '0x52', '0xcd', '0x53', '0xb3', '0xa4', '0x2e', '0x71', '0x7e', '0x6c', '0x67', '0x46', '0x8e', '0x87', '0x7f', '0x8e', '0xad', '0xa7', '0x53', '0x63', '0x99', '0x37', '0x51', '0x95', '0x8b', '0xbf', '0x48', '0x62', '0xd6', '0x50', '0xa5', '0x1f', '0xb8', '0xa2', '0x48', '0x50', '0x4d', '0xe2', '0x4f', '0x32', '0xb6', '0x2d', '0x50', '0x64', '0x50', '0x26', '0x91', '0x69', '0xcb', '0x4d', '0x65', '0x98', '0x93', '0xaa', '0x44', '0xee', '0x7e', '0x52', '0xb5', '0x8c']

...
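
For anyone who wants the embedding as numbers rather than hex strings, here is a minimal sketch. It assumes each frame is stored as a single 128-byte string (one unsigned byte per dimension), as in the session above. The helper name decode_embedding and the constant EMBEDDING_DIM are made up for illustration and are not part of any AudioSet or TensorFlow API.

```python
EMBEDDING_DIM = 128  # assumption: one byte per dimension, 128 dimensions per frame


def decode_embedding(raw: bytes) -> list[int]:
    """Convert one raw embedding byte string into a list of ints in 0..255."""
    if len(raw) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} bytes, got {len(raw)}")
    # Iterating over a bytes object yields unsigned integers directly.
    return list(raw)


# Hypothetical usage with the `ex` object from the session above:
#   frames = ex.feature_lists.feature_list['audio_embedding'].feature
#   embeddings = [decode_embedding(f.bytes_list.value[0]) for f in frames]

# Stand-in data: the first four bytes of the output above, zero-padded to 128.
sample = bytes([0xa6, 0x48, 0x57, 0x68]) + bytes(EMBEDDING_DIM - 4)
print(decode_embedding(sample)[:4])  # prints [166, 72, 87, 104]
```

The length check is only there to fail loudly if a file stores frames in some other layout than the one assumed here.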