I think you are searching for duplicates in DataKey, right?
Datastore holds the location where the data are stored.
DataKey is the actual data used for a record.
What I use to select DataKey objects is something similar to:
records = Record.objects.filter(label__in=records_to_process, project__id=project)
for record in records:
    QuerySet = DataKey.objects.filter(input_to_records=record, path__contains=PATH, digest__contains=DIGEST)
It might be faster to select them directly, without looping over the records:
QuerySet = DataKey.objects.filter(path__contains=PATH, digest__contains=DIGEST)
I did not test it.
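If the goal really is to find duplicates, one possible approach is to group the paths by digest once the (path, digest) pairs have been fetched. The sketch below is plain Python and only an illustration: find_duplicates and the sample data are hypothetical names I made up, and I am assuming the pairs could be obtained with something like DataKey.objects.values_list("path", "digest").

```python
from collections import defaultdict


def find_duplicates(pairs):
    """Group paths by digest; keep digests that occur under more than one path."""
    by_digest = defaultdict(set)
    for path, digest in pairs:
        by_digest[digest].add(path)
    return {d: sorted(paths) for d, paths in by_digest.items() if len(paths) > 1}


# Hypothetical sample data. In practice the pairs might come from
# DataKey.objects.values_list("path", "digest") (an assumption, not tested).
sample = [
    ("run1/out.dat", "abc123"),
    ("run2/out.dat", "abc123"),
    ("run3/log.txt", "def456"),
]
duplicates = find_duplicates(sample)
print(duplicates)  # {'abc123': ['run1/out.dat', 'run2/out.dat']}
```

Here only the digest shared by two different paths is reported; a digest that appears under a single path is not counted as a duplicate.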