Meeting Purpose
Sync on recent Lance updates, releases, and upcoming proposals.
Key Takeaways
- Lance 3.0 releases today, with a 3.0.1 bug-fix release likely to follow soon.
- LanceDB is now a core DuckDB extension, a major milestone that will boost discoverability and user adoption.
- A proposal to shorten the major release voting window to 3 days will be put to a vote to accelerate development cycles.
- A new distributed vector index proposal aims to eliminate a single-node merge bottleneck, with benchmarks showing a potential 20x speedup.
Topics
Lance 3.0 Release & Process
- Lance 3.0 releases today after its community vote closed.
- A 3.0.1 bug-fix release is likely to follow soon, as several important fixes have merged since the RC was cut.
- Proposal: Shorten Major Release Voting Window
  - Problem: The current 1-week voting period for major releases (e.g., 2.0, 3.0) slows development, forcing a choice between delaying the release and deferring bug fixes to a later version.
  - Solution: Shorten the voting window to 3 days, matching the process already used for smaller releases (e.g., 3.0.1).
  - Rationale: The project's frequent breaking changes make major releases a regular event, not a special one requiring a longer process.
  - Status: The proposal will be put to a community vote.
File Format Strategy
- File format 2.2 was recently released, adding compression and other benefits.
- Key Challenge: Making a new format the default requires a clear migration path to ensure compatibility with older SDK versions.
- Action: A table mapping SDK versions to supported file formats is needed to guide the default-setting decision.
- Action: A mechanism to upgrade file formats for existing indexes will be investigated.
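One way such a compatibility table could feed the default-setting decision is sketched below. This is only an illustration: the SDK names (`sdk-old`, `sdk-mid`, `sdk-new`), the mapping itself, and the helper `safe_default` are hypothetical placeholders, not real Lance compatibility data or an existing API.

```python
# Hypothetical table: placeholder SDK names mapped to the file formats they can read.
# These entries are illustrative only and are NOT real compatibility facts.
SUPPORTED_FORMATS = {
    "sdk-old": {"2.0"},
    "sdk-mid": {"2.0", "2.1"},
    "sdk-new": {"2.0", "2.1", "2.2"},
}


def version_key(version):
    """Turn '2.10' into (2, 10) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in version.split("."))


def safe_default(table, sdks_in_use):
    """Return the newest format readable by every SDK in use, or None if there is none."""
    if not sdks_in_use:
        return None
    # A format is a safe default only if all SDKs still in use can read it.
    common = set.intersection(*(table[sdk] for sdk in sdks_in_use))
    return max(common, key=version_key) if common else None


if __name__ == "__main__":
    print(safe_default(SUPPORTED_FORMATS, ["sdk-mid", "sdk-new"]))
```

Under these assumed entries, the newest format can only become the default once every SDK that is still in use can read it, which is the migration-path concern raised above.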
LanceDB DuckDB Extension
- Status: Now a core DuckDB extension, having moved off the community extensions page.
- Benefits: This move significantly increases discoverability and user adoption.
- Blocker: Official docs and install commands are pending DuckDB's CI finalization.
- Features: Two major features (Merkle Gain 2 support, maintenance cycles) are merged and will be in the next core extension release.
Distributed Vector Index Proposal
- Problem: The current index building process has a single-node merge bottleneck, where one node must combine all shards, limiting scalability.
- Proposal: Eliminate the final merge step by making shards directly searchable.
- Process: Workers build shards → shards become searchable segments (e.g., 8–16) → queries run against all segments.
- Trade-off: Query performance can degrade with too many segments, so a threshold (e.g., 50 GB or 1M rows) is needed to trigger a merge.
- Performance: Local benchmarks show a potential 20x speedup for the merge step.
- Status: A PR with benchmarks is expected tomorrow. The proposal will be shared on Discord for community feedback.
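The "queries run against all segments" step can be sketched as a fan-out search followed by a top-k merge. This is a minimal illustration under stated assumptions, not the proposal's implementation: the `Segment` class and `search_all` function are hypothetical names, and a brute-force squared-L2 scan stands in for a real vector index.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical searchable shard holding (row_id, vector) pairs."""
    rows: list

    def search(self, query, k):
        # Brute-force squared L2 distance; a stand-in for a real ANN index.
        scored = (
            (sum((a - b) ** 2 for a, b in zip(vector, query)), row_id)
            for row_id, vector in self.rows
        )
        return heapq.nsmallest(k, scored)


def search_all(segments, query, k):
    """Query every segment, then keep the k closest hits overall."""
    # Each segment returns its local top-k, so the global top-k is
    # guaranteed to be contained in the union of those partial results.
    partial_hits = (hit for segment in segments for hit in segment.search(query, k))
    return heapq.nsmallest(k, partial_hits)


if __name__ == "__main__":
    segments = [
        Segment([(0, [0.0, 0.0]), (1, [5.0, 5.0])]),
        Segment([(2, [1.0, 0.0]), (3, [9.0, 9.0])]),
    ]
    print(search_all(segments, [0.0, 0.0], k=2))
```

Because every query touches every segment, the per-query cost in this sketch grows with the segment count, which is why the notes call for a threshold that triggers a merge.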
Other Updates
- FTS Indexing: A new PR (under review) speeds up FTS index creation and reduces memory usage.
- Lance Trino: A critical bug fix has been merged, unblocking users of the Trino extension.
Next Steps
- Will:
  - Initiate the community vote on shortening the release voting window.
  - Investigate a mechanism for upgrading file formats for existing indexes.
- Xuanwo:
  - Submit the PR with distributed vector index benchmarks.
- Prashanth & ChanChan:
  - Announce the LanceDB core extension status on socials once DuckDB's CI is finalized.
  - Post the distributed vector index proposal on Discord for community feedback.
- Team:
  - Create a table mapping SDK versions to supported file formats.