Machine Learning Systems Design


Jeremy Jordan

Feb 23, 2020, 1:36:42 PM
to Penny University
On Friday, we held a discussion on ML systems design, with data scientists and engineers from companies such as Lowe's, The Home Depot, GitHub, ViacomCBS, and Eventbrite. (We also discovered that Hangouts has a maximum capacity of 10 participants, which prevented us from accommodating everyone who was interested in joining the chat.)

Our conversation focused on designing systems that can scale to large inference demands and on typical design patterns.

Handling large scale requirements
  • Heavy use of caching can dramatically reduce the computational burden. Rob mentioned that his team relies heavily on Redis to avoid unnecessary computations. 
  • Taking a tiered approach to inference can be quite effective. This design pattern uses very simple models with cheap features where possible, then progressively moves to more expensive models and features only where the earlier models abstain. Think of the design thinking principle of "just enough and no more" applied to model and feature complexity for inference.
  • Distributed computation frameworks can help manage large scale datasets. Dask was a crowd favorite. 
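
To make the caching and tiered-inference points concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the two tier functions, their labels, and the `predict` wrapper are hypothetical, and the in-process `functools.lru_cache` stands in for an external cache such as Redis, which is what Rob's team actually uses.

```python
from functools import lru_cache
from typing import Callable, Optional, Sequence

# A tier takes the request text and returns a label, or None to abstain.
Tier = Callable[[str], Optional[str]]

# Counter used only to show how often the costly tier is actually invoked.
calls = {"expensive": 0}


def cheap_rule_model(text: str) -> Optional[str]:
    """Hypothetical first tier: a keyword rule that answers only obvious cases."""
    lowered = text.lower()
    if "refund" in lowered:
        return "billing"
    if "password" in lowered:
        return "account"
    return None  # abstain and let a later tier decide


def expensive_model(text: str) -> Optional[str]:
    """Hypothetical final tier: stands in for a costly model and never abstains."""
    calls["expensive"] += 1
    return "general" if len(text.split()) < 8 else "escalate"


# Tiers are ordered from cheapest to most expensive.
TIERS: Sequence[Tier] = (cheap_rule_model, expensive_model)


@lru_cache(maxsize=10_000)  # in-process stand-in for an external cache
def predict(text: str) -> str:
    """Return the first non-abstaining tier's answer; repeated inputs hit the cache."""
    for tier in TIERS:
        result = tier(text)
        if result is not None:
            return result
    raise RuntimeError("no tier produced a prediction")
```

Calling `predict("I need a refund")` is answered by the cheap rule without touching the expensive tier, and calling `predict` twice with the same text runs the tiers only once. In a multi-process deployment the cache would need to live outside the process, which is where something like Redis comes in.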
Systems design
  • There's a need to track the full provenance of a model, from its initial definition and training all the way through deployment and monitoring.
  • General feel: there are so many fragmented tools in this space that it can be difficult to choose and assemble a complete stack.
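
As a rough illustration of what tracking provenance can mean in practice, the sketch below keeps one record per model that ties together the code version, a fingerprint of the training data, the hyperparameters, and a running list of lifecycle stages. The `ModelProvenance` class, its field names, and the `fingerprint` helper are all hypothetical; this is not a tool anyone in the discussion mentioned, just a stdlib-only way to show the idea.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


def fingerprint(data: bytes) -> str:
    """Content hash of the training data, so the exact dataset can be identified later."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class ModelProvenance:
    """Hypothetical record linking a model to what produced it."""
    model_name: str
    code_version: str            # e.g. a git commit hash
    training_data_sha256: str    # output of fingerprint()
    hyperparameters: Dict[str, float]
    events: List[str] = field(default_factory=list)

    def log(self, stage: str) -> None:
        """Append a lifecycle stage such as 'trained' or 'deployed'."""
        self.events.append(stage)

    def to_json(self) -> str:
        """Serialize deterministically so the record can be stored or diffed."""
        return json.dumps(asdict(self), sort_keys=True)
```

A record would be created at training time, updated with `log("deployed")` or `log("monitoring")` as the model moves along, and written next to the model artifact with `to_json()`.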

