Feast Community Newsletter #6


Danny Chiao

Dec 30, 2021, 8:08:27 PM
to Danny Chiao

Dec 30, 2021

Announcements

🚀 Annual wrap-up

What a year it's been for Feast! Some quick stats:
  • GitHub stars grew over 2x
  • The Feast Slack community grew by almost 10x to over 3.1k members
  • We crossed the 100-contributor mark: over 120 people have contributed code to Feast over time (including 500+ merged PRs and ~200 closed issues)
  • 400k+ downloads of the Feast Python package

👉 We welcomed several new maintainers from Twitter (@mavysavydav) and Shopify (@MattDelac). They've been instrumental in developing Feast to be intuitive, useful, and scalable.

 

🔥 We launched a lot of new functionality:

  • Lightweight Python-centric Feast (blog post)
  • New integrations
    • New offline and online stores: Redis, AWS (Redshift, DynamoDB, S3), Azure (plugin), Snowflake (plugin), Hive (plugin), Postgres (plugin), Trino (plugin), SingleStore (blog post)
    • Other MLOps tooling, including Arize AI (docs), Flyte (docs), and Pinecone (docs)
  • On-demand feature views, enabling on-the-fly feature transformations
  • Stream ingestion into Redis online stores for very fresh features
  • Python feature server (including serverless deployment to AWS Lambda (guide) and integration with KServe (repo))
  • Updated Java feature server for low-latency feature serving
  • And more! (e.g. feature services, feature views without entities, and entity aliasing)
Demo: easy deployment of Feast with AWS Lambda
 

💡 We published a ton of content and documentation (with more to come!), covering topics like how to extend Feast, best practices for deploying Feast, and using Feast to simplify training and serving machine learning models (see documentation).

 

A very big thanks to all the contributors, users, and content creators for being part of this journey to improve the state of ML tooling with Feast! 

 

Happy holidays and stay safe! 🥂

Blog posts and videos

Check out some new content featuring Feast:

  • [Blog] The Rising Importance of Feature Stores (post)
  • [Podcast] Keeping Feast Simple (podcast)

 

If you want to produce content, please reach out! We're happy to help.

Upcoming events

Times below are in Pacific time unless otherwise noted: 
  • Jan 4 @ 10AM: Feast community call (agenda, zoom link)
    • Please add agenda items to discuss!
  • Jan 25 @ 8AM: Cape Town Machine Learning Meetup talk on recommender systems with Feast (event link)
  • Jan 27-28: Data Council Conference, featuring a Feast talk on recommender systems (event link)
  • Feb 10: apply() meetup, with talks on solving practical data engineering challenges (event link)

Community contribution highlights

  • Add feast-python-server helm chart (@michelle-rascati-sp)
  • Improving Python feature serialization by 20-30% and deserialization of list features by ~7x (@judahrand)
  • Improving the Feast Python feature server RPS by ~4x by making the endpoint synchronous (@nossrannug)

Work in progress

feast plan

feast plan aims to make it much easier to preview the changes that running feast apply would make, and to integrate Feast into CI/CD pipelines. Implementation is complete for an initial milestone that allows users to preview changes to Feast FCOs. This will be released soon.

A follow-up release will output validation errors and infrastructure diffs when running feast apply.
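
To illustrate the general idea of previewing changes before applying them, here is a minimal Python sketch. It is not Feast's implementation: `ObjectChange`, `preview_changes`, and the dictionary-based object definitions are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectChange:
    """One planned change to a registered object (hypothetical type)."""
    name: str
    action: str  # "create", "update", or "delete"


def preview_changes(registered: dict, desired: dict) -> list:
    """Compare registered definitions with desired ones without applying anything."""
    changes = []
    for name in sorted(desired.keys() | registered.keys()):
        if name not in registered:
            changes.append(ObjectChange(name, "create"))
        elif name not in desired:
            changes.append(ObjectChange(name, "delete"))
        elif registered[name] != desired[name]:
            changes.append(ObjectChange(name, "update"))
    return changes


# Made-up definitions, used only to exercise the sketch.
registered = {"driver_stats": {"ttl_days": 1}, "old_view": {"ttl_days": 7}}
desired = {"driver_stats": {"ttl_days": 2}, "customer_stats": {"ttl_days": 1}}

for change in preview_changes(registered, desired):
    print(f"{change.action}: {change.name}")
```

A CI job could run a preview like this on every pull request and fail the build when unexpected deletions appear.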

 

RFC-029: Feast Plan

RFC-030: Feast Plan Implementation

Benchmarks and performance tuning

Users commonly ask how performant Feast is for online retrieval. The answer depends on the number of features being processed, choice of offline/online stores, request batch sizes, deployment strategies, etc. Early results have been gathered and will make their way into a blog post soon.
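
As a rough illustration of how latency can be compared across request batch sizes, here is a small Python sketch. It does not call Feast: `fake_get_online_features` is a hypothetical stand-in for a real online retrieval call, and the batch sizes and repeat count are arbitrary assumptions.

```python
import statistics
import time


def fake_get_online_features(entity_rows):
    """Hypothetical stand-in for an online retrieval call; does trivial work per row."""
    return [{"entity": row, "feature": len(row)} for row in entity_rows]


def benchmark(retrieve, batch_sizes, repeats=50):
    """Return {batch_size: (p50_ms, p99_ms)} for the given retrieval callable."""
    results = {}
    for size in batch_sizes:
        rows = [f"driver_{i}" for i in range(size)]
        latencies_ms = []
        for _ in range(repeats):
            start = time.perf_counter()
            retrieve(rows)
            latencies_ms.append((time.perf_counter() - start) * 1000)
        # quantiles(n=100) returns 99 cut points; index 98 is the 99th percentile.
        p99 = statistics.quantiles(latencies_ms, n=100)[98]
        results[size] = (statistics.median(latencies_ms), p99)
    return results


for size, (p50, p99) in benchmark(fake_get_online_features, [1, 10, 100]).items():
    print(f"batch={size}: p50={p50:.4f} ms, p99={p99:.4f} ms")
```

Swapping the stand-in for a real retrieval call against a deployed online store would give numbers that reflect the store, network, and deployment choices mentioned above.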

 

Multiple members of the community have also been optimizing online serving across the Python and Java feature servers (e.g. see PR #2119, PR #2159, PR #2164, PR #2165, PR #2172, PR #2166), resulting in significantly faster online serving, especially for feature retrieval with larger batch sizes.

 

RFC-031: Benchmarking Feature Servers

Data quality monitoring

Early implementation of logging online features has begun. The next step here towards the first milestone is to also save datasets, and then use Great Expectations to validate training datasets. Help here is welcome!

 

RFC-027: Data Quality Monitoring

RFC presentation and discussion (video recording, feast-dev meeting invite)

Batch transformations

One of the top requests from the community was to build batch feature engineering. This RFC discusses how to implement batch transformations (row-level + eventually window transformations) to easily build features. Thinking here is in early stages and feedback is welcome.

 

RFC-028: Batch Transformations

Entity TTL

Online store data can quickly become stale or unneeded. Often entities or features have natural expiration windows (e.g. users to score expiring after being ranked). This RFC explores different ways to automatically clean up unnecessary feature data in online stores.
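
One possible approach, sketched below in Python, is to expire rows a fixed time after they are written. This is only a toy in-memory model and not the design proposed in the RFC: `TTLOnlineStore` and its methods are hypothetical names, and the injectable `clock` is an assumption made to keep the example testable.

```python
import time


class TTLOnlineStore:
    """Toy in-memory store: each entity row expires ttl_seconds after it is written."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._rows = {}  # entity_key -> (written_at, features)

    def write(self, entity_key, features):
        self._rows[entity_key] = (self.clock(), features)

    def read(self, entity_key):
        entry = self._rows.get(entity_key)
        if entry is None:
            return None
        written_at, features = entry
        if self.clock() - written_at >= self.ttl_seconds:
            del self._rows[entity_key]  # lazily evict the expired row on read
            return None
        return features

    def cleanup(self):
        """Eagerly drop all expired rows; returns how many were removed."""
        now = self.clock()
        expired = [
            key
            for key, (written_at, _) in self._rows.items()
            if now - written_at >= self.ttl_seconds
        ]
        for key in expired:
            del self._rows[key]
        return len(expired)
```

Lazy eviction on read keeps stale values from being served, while a periodic `cleanup()` call reclaims space for entities that are never read again.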

 

RFC-032: TTL for online feature sets

Join the conversation on Slack, give feedback on our roadmap, or check out our GitHub!
