Hi all, we are excited to announce the release of MLflow 3.5.0, which includes several major features and improvements!
Major Features
Job Execution Backend: Introduced a new job execution backend infrastructure for running asynchronous tasks with individual execution pools, job search capabilities, and transient error handling. (#17676, #18012, #18070, #18071, #18112, #18049, @WeichenXu123)
Flexible Prompt Optimization API: Introduced a new flexible API for prompt optimization with support for model switching and the GEPA algorithm, enabling more efficient prompt tuning with fewer rollouts. See the [documentation](https://mlflow.org/docs/latest/genai/prompt-registry/optimize-prompts/) to get started. (#18183, #18031, @TomeHirata)
Enhanced UI Onboarding: Improved the in-product onboarding experience with a trace quickstart drawer and updated homepage guidance to help users discover MLflow's latest features. (#18098, #18187, @B-Step62)
Security Middleware for Tracking Server: Added a security middleware layer to protect against DNS rebinding, CORS attacks, and other security threats. Read the [documentation](https://mlflow.org/docs/latest/ml/tracking/server/security/) for configuration details. (#17910, @BenWilson2)
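To give a feel for the kind of check a security middleware performs, here is a minimal, standard-library-only sketch of Host and Origin header validation, the usual defences against DNS rebinding and cross-origin requests. This is an illustration under stated assumptions, not MLflow's implementation: the names `ALLOWED_HOSTS`, `ALLOWED_ORIGINS`, `host_is_allowed`, and `origin_is_allowed` are hypothetical, and the actual configuration options are described in the linked documentation.

```python
from urllib.parse import urlsplit

# Hypothetical allow-lists for illustration only; these are NOT MLflow option names.
ALLOWED_HOSTS = {"localhost", "127.0.0.1", "mlflow.example.com"}
ALLOWED_ORIGINS = {"https://mlflow.example.com"}


def host_is_allowed(host_header, allowed_hosts=ALLOWED_HOSTS):
    """Return True only if the Host header names an expected host.

    A DNS rebinding attack points an attacker-controlled domain at the
    server's IP address, so the request arrives with an unexpected Host.
    """
    try:
        # Parse "host:port" as a network location to strip the optional port.
        hostname = urlsplit(f"//{host_header}").hostname
    except ValueError:
        # Malformed header (for example an unterminated IPv6 literal).
        return False
    return hostname is not None and hostname in allowed_hosts


def origin_is_allowed(origin_header, allowed_origins=ALLOWED_ORIGINS):
    """Accept requests with no Origin header (typical of non-browser clients)
    or with an Origin that appears on the allow-list."""
    return origin_header is None or origin_header in allowed_origins
```

With these helpers, a request carrying `Host: localhost:5000` passes the host check, while one carrying `Host: evil.example.net` is rejected even though it reached the same IP address.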
Features
- [Tracing / Tracking] Add `unlink_traces_from_run` batch operation (#18316, @harupy)
- [Tracing] Add batch trace link/unlink operations to DatabricksTracingRestStore (#18295, @harupy)
- [Tracking] Claude Code SDK autologging support (#18022, @smoorjani)
- [Tracing] Add support for reading trace configuration from environment variables (#17792, @joelrobin18)
- [Tracking] Mistral tracing improvements (#16370, @joelrobin18)
- [Tracking] Gemini token count tracking (#16248, @joelrobin18)
- [Tracking] Gemini streaming support (#16249, @joelrobin18)
- [Tracking] CrewAI token count tracking with documentation updates (#16373, @joelrobin18)
- [Evaluation] Allow passing empty scorer list for manual result comparison (#18265, @B-Step62)
- [Evaluation] Log assessments to DSPy evaluation traces (#18136, @B-Step62)
- [Evaluation] Add support for trace inputs to built-in scorers (#17943, @BenWilson2)
- [Evaluation] Add synonym handling for built-in scorers (#17980, @BenWilson2)
- [Evaluation] Add span timing tool for Agent Judges (#17948, @BenWilson2)
- [Evaluation] Allow disabling evaluation sample check (#18032, @B-Step62)
- [Evaluation] Reduce verbosity of SIMBA optimizer logs when aligning judges (#17795, @BenWilson2)
- [Evaluation] Add `__repr__` method for Judges (#17794, @BenWilson2)
- [Prompts] Add prompt registry support to MLflow webhooks (#17640, @harupy)
- [Prompts] Prompt Registry Chat UI (#17334, @joelrobin18)
- [UI] Delete parent and child runs together (#18052, @joelrobin18)
- [UI] Added move to top, move to bottom for charts (#17742, @joelrobin18)
- [Tracking] Use sampling data for run comparison to improve performance (#17645, @lkuo)
- [Tracking] Add optional 'outputs' column for evaluation dataset records (#17735, @WeichenXu123)