GA4GH AI & LLM Kick-Off Discussion follow up

Beatrice Amos

Sep 22, 2025, 7:55:26 AM

to Clinical and Phenotypic Data Capture, Work Stream - Genomic Knowledge Standards (GKS), Cloud Work Stream, Regulatory and Ethics Work Stream, Product - Conversational/Generative AI for Genomic Data Sharing, Kyle Ellrott, Davidcs, Vsmalladi, Man Zawati, Marc Fiume, Sasha Siegel, Discovery Technical Work Stream

Dear Colleagues,

Thank you all for your active participation in last week's GA4GH AI & LLM Kick-Off Discussion. For those who couldn't make it, below is a quick recap of what we discussed.

‼️ Please take a moment to review the meeting notes and share any edits or additions to help keep the summary accurate and complete. If you notice any mistakes, have resources to add, or want to share more details, please feel free to jump in and update the notes - we welcome your input! A link to the recording is available in the linked meeting notes.

👋 A very warm welcome to any new faces who joined us!

_____________

Meeting summary

The meeting served as an initial forum to consolidate ongoing discussions across various GA4GH groups on integrating AI and large language models (LLMs) into genomics standards and workflows.

- Benjamin Berk presented an update on the Family Health History Initiative, focusing on the implementation project to capture family health history, particularly cancer information.

- He then recapped the Connect session in Boston, US, which explored the use of LLMs for converting narrative notes into Phenopackets and for developing differential diagnoses for rare diseases, and discussed the regulatory and ethical implications.

- Venkat Malladi and Kyle Ellrott described the Cloud Work Stream's efforts to identify gaps in cloud-based APIs and security controls for AI implementation. They are exploring use cases, focusing in particular on federated computing and data access, and plan to continue discussions at Connect.

- Vasiliki Rahimzadeh outlined the Regulatory and Ethics Work Stream's (REWS) generative AI guidelines for genomic data usage, emphasising identifiability and institutional policies.

Use cases included Venkat's proposal for an AI agent to score publications on FAIR principles and reproducibility, and David Steinberg's description of AI agents aiding biologists in analytical workflows without bioinformatics expertise.

Broader discussions touched on AI readiness of datasets, benchmarks, incentives for data sharing, and cross-pollination with external standards such as MLCommons.

Sasha Siegel proposed a two-part structure: a foundational pillar for general AI principles and standards, and Work Stream-specific applications.

_____________

Main Action Items

- All participants: Register for the upcoming 13th Connect Meeting in Uppsala - the conversation will continue there!
- Benjamin, Orion, Ryan, and Matt: Set up a meeting to discuss the Pedigree standard and the Family Health History Initiative.
- If sufficient interest is expressed (via pings to me), I will schedule a follow-up meeting to hash out structural details for AI efforts, such as the proposed foundational Work Stream.
- Join the LLM interest and Federated ML Slack channels (#ga4gh-llm-interest and #federated-ml) for ongoing conversations on AI-related topics.
- Contact Reggan (reggan...@ga4gh.org) and Jessica (jessica....@ga4gh.org) for more information on getting involved in the Federated ML and REWS Generative AI discussions, respectively.
- Keep an eye out for the follow-up meeting!

_____________

Please let me know if you would like to contribute to or review use cases in a shared document, including ideas like reproducibility agents and FAIR scoring for publications, or if you have views on anything discussed during the meeting.

Thank you for your valuable contributions and ongoing engagement.

Best,
Beatrice

Beatrice Amos

Work Stream Manager

beatri...@ga4gh.org

ba...@sanger.ac.uk

https://www.ga4gh.org/what-we-do/calendar/
