Summary:
Aim: AI for Scientific Discovery with a focus on deployability
Observation
Multi-modal ML models are becoming increasingly common
Language and vision are much more common than other modalities (time series, sensor, audio, tabular)
Motivates work to incorporate the other modalities
E.g. discriminative and generative models
Generative modeling is especially capable because it can reconstruct one modality from another
Bridge between modalities
Align correspondence between them
Cross-modality representations, which are complementary
Novel multi-modal privacy risk because one can’t anonymize modalities in isolation
Models are increasingly maturing; data is the major differentiator that enables discovery
Deployment-centric AI
Consider deployment constraints and user needs
E.g. ethical considerations
Flow:
Planning: Problem definition, deployment constraints, task formulation
Multimodal AI system development
Real-world deployment
DrugBAN: Drug-target interaction prediction
Graph convolutional network to encode drug’s chemical structure
1D convolutional network encodes the amino acid sequence of the target protein
Multi-head bilinear attention module + pooling integrates the two modalities into a joint representation
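A minimal NumPy sketch of this kind of bilinear attention fusion — shapes, the weight `W`, and the pooling choice are illustrative assumptions, not DrugBAN's actual implementation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def bilinear_attention_fusion(drug_feats, prot_feats, W):
    """Fuse atom embeddings (from a graph encoder) and residue embeddings
    (from a 1D CNN) via a bilinear pairwise interaction map, then pool
    into a single joint representation vector.

    drug_feats: (n_atoms, d), prot_feats: (n_res, d), W: (d, d) learnable.
    """
    # Score every (atom, residue) pair with a bilinear form.
    scores = drug_feats @ W @ prot_feats.T               # (n_atoms, n_res)
    attn = softmax(scores.ravel()).reshape(scores.shape)  # attention map
    # Pool: attention-weighted sum of elementwise atom-residue products.
    joint = np.einsum("ij,id,jd->d", attn, drug_feats, prot_feats)
    return joint

# Toy example with random features and weights.
rng = np.random.default_rng(0)
d = 8
drug = rng.normal(size=(5, d))    # 5 atoms
prot = rng.normal(size=(30, d))   # 30 residues
W = rng.normal(size=(d, d))
joint = bilinear_attention_fusion(drug, prot, W)
print(joint.shape)  # (8,)
```

In practice the attention map and `W` would be learned end-to-end, and multiple heads would each produce such a joint vector before pooling.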
Focus on deployability: drug companies specifically care about discovery of new drugs
Cluster data according to drug type
Train on some categories of drugs, predict on different ones
Same approach applied to cell line-drug response, mutation-drug association
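The cluster-based split above can be sketched in plain Python — cluster labels and toy pairs here are hypothetical; real pipelines would cluster drugs by chemical similarity first:

```python
def cluster_split(samples, held_out_clusters):
    """Split drug-target pairs so that entire drug clusters are held out
    for testing, forcing the model to generalise to unseen drug types."""
    train, test = [], []
    for sample in samples:
        bucket = test if sample["cluster"] in held_out_clusters else train
        bucket.append(sample)
    return train, test

# Hypothetical toy data: each pair carries its drug's cluster label.
pairs = [
    {"drug": "d1", "target": "t1", "cluster": "kinase_inhibitor"},
    {"drug": "d2", "target": "t2", "cluster": "kinase_inhibitor"},
    {"drug": "d3", "target": "t1", "cluster": "gpcr_ligand"},
    {"drug": "d4", "target": "t3", "cluster": "antibiotic"},
]
train, test = cluster_split(pairs, held_out_clusters={"antibiotic"})
train_clusters = {p["cluster"] for p in train}
test_clusters = {p["cluster"] for p in test}
print(train_clusters & test_clusters)  # set() — no cluster overlap
```

The key property is that train and test share no drug clusters, unlike a random split where near-duplicate drugs leak across the boundary.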
MapDiff: Inverse protein folding
Design protein with a particular function/structure
Mask prior-guided denoising diffusion
Mask-prior pretraining:
Protein 3D structure modality
Mask part of the sequence, then predict it
Model learns a good structure-aware embedding for amino acid sequences
Mask-guided denoising network
Equivariant graph neural network: generates the sequence conditioned on the 3D structure
Entropy-based mask (more uncertain sequence components are masked and predicted)
Diffusion model to predict sequence
Using AlphaFold to validate that the generated sequence folds back into the target structure
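The entropy-based masking step above can be sketched with NumPy — the distributions and mask fraction are toy assumptions, not MapDiff's actual configuration:

```python
import numpy as np

def entropy_mask(probs, mask_fraction=0.25):
    """Select the positions whose predicted amino-acid distribution has
    the highest entropy, i.e. where the model is least certain, so they
    can be re-masked and re-predicted in the next denoising step."""
    # probs: (seq_len, n_amino_acids) per-position predicted distributions.
    eps = 1e-12
    entropy = -(probs * np.log(probs + eps)).sum(axis=1)  # (seq_len,)
    n_mask = max(1, int(mask_fraction * len(entropy)))
    # Indices of the n_mask most uncertain positions.
    return np.argsort(entropy)[-n_mask:]

# Toy example: position 2 is uniform, hence maximally uncertain.
probs = np.array([
    [0.97, 0.01, 0.01, 0.01],
    [0.90, 0.05, 0.03, 0.02],
    [0.25, 0.25, 0.25, 0.25],
    [0.85, 0.10, 0.03, 0.02],
])
print(entropy_mask(probs))  # [2]
```

Confident positions are kept fixed while uncertain ones are iteratively refined by the diffusion model.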
Recommendations for deployment-centric development
Safety, reliability, interpretability
Scalability and resource efficiency
Ethical compliance and user preparedness
Recommendations for data- and model-centric development
Data scarcity and access
Balanced data representation
Multimodal fusion and modality selection
Foundation models
Recommendations for stakeholder engagement and collaboration
Stakeholder inclusion and alignment
Cross-disciplinary standards and communication
Intellectual property and workflow adaptation
PyKale: pykale.github.io
Knowledge-aware machine learning from multiple sources
UKOMAIN: UK Open Multimodal AI Network: https://multimodalai.github.io/
Builds diverse interdisciplinary network
Create knowledge exchange platform
Identify & fund key focus areas of high impact
Engage industry & policymakers
Promote sustainability & responsibility
Enhance research capability & training
OMAIB: Open Multimodal AI Benchmark (funding call)
CD3: Cancer Data Driven Detection:
Advance ability to prevent, diagnose and detect cancer
Enhance equity
Research community that federates resources
Health records, cancer multi-omics
Advanced analytics
Partnerships
Conclusion: multimodal data + GenAI = Impact