That said, a few fields in the document may not always be fully up to date. In practice, the most reliable way to understand the current structure is to query the Search API directly (it returns lite results by default) and inspect the response fields. I have also attached an example of the XML content from the PMCLiteMetadata archive.
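As a rough illustration of inspecting the response fields, here is a minimal Python sketch. It rests on assumptions not stated above: that the Search API in question is the Europe PMC REST search endpoint, that it accepts the `query`, `resultType`, `format` and `pageSize` parameters, and that JSON records sit under `resultList.result`. The helper names are hypothetical, so please adjust as needed.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint (not confirmed in this thread); change it if yours differs.
SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_search_url(query, result_type="lite", page_size=25):
    """Build a search URL; the parameter names are assumptions."""
    params = {
        "query": query,
        "resultType": result_type,
        "format": "json",
        "pageSize": page_size,
    }
    return SEARCH_URL + "?" + urlencode(params)


def field_counts(response):
    """Count how many records carry each top-level field."""
    # Assumed response layout: {"resultList": {"result": [ {...}, ... ]}}
    records = response.get("resultList", {}).get("result", [])
    counts = Counter()
    for record in records:
        counts.update(record.keys())
    return counts


def fetch(query, **kwargs):
    """Run one search request and return the parsed JSON body."""
    with urlopen(build_search_url(query, **kwargs), timeout=30) as resp:
        return json.load(resp)
```

Calling `field_counts(fetch("malaria"))` would then list the fields present in one page of lite results, along with how many records include each. Fields that appear in only some records are the ones most worth checking against the document.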
Regarding your pipeline, it would be helpful if you could share examples of your current search queries. That would make it easier to understand your use case and advise on the right API and optimisations.
Thanks,
Madhu