Data scientists excel at creating models that represent and predict real-world data, but effectively deploying machine learning models is more of an art than science. Deployment requires skills more commonly found in software engineering and DevOps. Venturebeat reports that 87% of data science projects never make it to production, while redapt claims it is 90%. Both highlight that a critical factor which makes the difference between success and failure is the ability to collaborate and iterate as a team.
Most data scientists feel that model deployment is a software engineering task and should be handled by software engineers because the required skills are more closely aligned with their day-to-day work. While this is somewhat true, data scientists who learn these skills will have an advantage, especially in lean organizations. Tools like TFX, Mlflow, Kubeflow can simplify the whole process of model deployment, and data scientists can (and should) quickly learn and use them.
This question is critical, because machine learning promises lots of potential for businesses, and any company that can quickly and effectively get their models to production can outshine their competitors.
These questions are important as they will guide you on what frameworks or tools to use, how to approach your problem, and how to design your ML model. Before you do anything else in a machine learning project, think about these data questions.
Data can be stored in on-premise, in cloud storage, or in a hybrid of the two. It makes sense to store your data where the model training will occur and the results will be served: on-premise model training and serving will be best suited for on-premise data especially if the data is large, while data stored in cloud storage systems like GCS, AWS S3, or Azure storage should be matched with cloud ML training and serving.
Even if you have your training data stored together with the model to be trained, you still need to consider how that data will be retrieved and processed. Here the question of batch vs. real-time data retrieval comes to mind, and this has to be considered before designing the ML system. Batch data retrieval means that data is retrieved in chunks from a storage system while real-time data retrieval means that data is retrieved as soon as it is available.
Along with training data retrieval, you will also need to think about prediction data retrieval. Your prediction data is rarely as neatly packaged as the training data, so you need to consider a few more issues related to how your model will receive data at inference time:
Another issue here has to do with data quality. Data used for inference will often be very different from training data, especially when it is coming directly from end-users not APIs. Therefore you must provide the necessary infrastructure to fully automate the detection of changes as well as the processing of this new data.
As with retrieval, you need to consider whether inference is done in batches or in real-time. These two scenarios require different approaches, as the technology/skill involved may be different. For batch inference, you might want to save a prediction request to a central store and then make inferences after a designated period, while in real-time, prediction is performed as soon as the inference request is made.Knowing this will enable you to effectively plan when and how to schedule compute resources, as well as what tools to use.
Efficiency: How efficient is the framework or tool in production? A framework or tool is efficient if it optimally uses resources like memory, CPU, or time. It is important to consider the efficiency of Frameworks or tools you intend to use because they have a direct effect on project performance, reliability, and stability.
Support: How is support for the framework or tool? Does it have a vibrant community behind it if it is open-sourced, or does it have good support for closed-source tools?How fast can you find tips, tricks, tutorials, and other use cases in actual projects?
Next, you also need to know whether the tools or framework you have selected is open-source or not. There are pros and cons to this, and the answer will depend on things like budget, support, continuity, community, and so on. Sometimes, you can get a proprietary build of open-source software, which means you get the benefits of open source plus premium support.
One more question you need to answer is how many platforms/targets does your choice of framework support? That is, does your choice of framework support popular platforms like the web or mobile environments? Does it run on Windows, Linux, or Mac OS? Is it easy to customize or implement in this target environment? These questions are important as there can be many tools available to research and experiment on a project, but few tools that adequately support your model while in production.
Getting feedback from a model in production is very important. Actively tracking and monitoring model state can warn you in cases of model performance depreciation/decay, bias creep, or even data skew and drift. This will ensure that such problems are quickly addressed before the end-user notices.
Consider how to experiment on, retrain, and deploy new models in production without bringing that model down or otherwise interrupting its operation. A new model should be properly tested before it is used to replace the old one. This idea of continuous testing and deploying new models without interrupting the existing model processes is called continuous integration.
Consider Adstocrat, an advertising agency that provides online companies with efficient ad tracking and monitoring. They have worked with big companies and have recently gotten a contract to build a machine learning system to predict if customers will click on an ad shown on a webpage or not. The contractors have a large volume dataset in a Google Cloud Storage (GCS) bucket and want Adstocrat to develop an end-to-end ML system for them.
As the engineer in charge, you have to come up with a design solution before the project kicks off. To approach this problem, ask each of the questions asked earlier and develop a design for this end-to-end system.
The contractor serves millions of ads every month, and the data is aggregated and stored in the cloud bucket at the end of every month. So now you know your data is large (hundreds of gigabytes of images), so your hunch of building your system in the cloud is stronger.
In terms of inference data, the contractors informed you that inference will be requested by their internal API, as such data for prediction will be called by a REST API. This gives you an idea of the target platform for the project.
There are many combinations of tools you can use at this stage, and the choice of one tool may affect the others. In terms of programming languages for prototyping, model building, and deployment, you can decide to choose the same language for these three stages or use different ones according to your research findings. For instance, Java is a very efficient language for backend programming, but cannot be compared to a versatile language like Python when it comes to machine learning.
After consideration, you decide to use Python as your programming language, Tensorflow for model building because you will be working with a large dataset that includes images, and Tensorflow Extended (TFX), an open-source tool released and used internally at Google, for building your pipelines. What about the other aspects of the model building like model analysis, monitoring, serving, and so on? What tools do you use here? Well, TFX pretty much covers it all!
TFX provides a bunch of frameworks, libraries, and components for defining, launching, and monitoring machine learning models in production. The components available in TFX let you build efficient ML pipelines specifically designed to scale from the start. These components has built-in support for ML modeling, training, serving, and even managing deployments to different targets.
TFX is also compatible with our choice of programming language (Python), as well as your choice of deep learning model builder (Tensorflow), and this will encourage consistency across your team. Also, since TFX and Tensorflow were built by Google, it has first-class support in the Google Cloud Platform. And remember, your data is stored in GCS.
Python and TFX and Tensorflow are all open-source, and they are the major tools for building your system. In terms of computing power and storage, you are using all GCP which is a paid and managed cloud service. This has its pros and cons and may depend on your use case as well. Some of the pros to consider when considering using managed cloud services are:
TFX supports a feedback mechanism that can be easily used to manage model versioning as well as rolling out new models. Custom feedback can be built around this tool to effectively track models in production. A TFX Component calledTensorFlow Model Analysis (TFMA)allows you to easily evaluate new models against current ones before deployment.
Django, a powerful web framework for Python,has gained immense popularity for its ability to help develop robust webapplications quickly. Once your Django application is ready for deployment, it'sessential to have a seamless and efficient deployment process that guaranteesconsistency across different environments. This is whereDocker, a versatilecontainerization platform, comes into play.
Docker offers a convenient way to package your Django application with itsdependencies and configurations into a self-contained unit called acontainer. These containersare lightweight, portable, and isolated from the underlying system, making themideal for deploying applications consistently across different environments.
This tutorial will walk you through containerizing your Django application withDocker. You'll start by cloning an existing Django application from a GitHubrepository that performs CRUD operations with a PostgreSQL database. Then you'llcreate a Dockerfile and build a Docker image locally to ensure it functionscorrectly. You'll also learn to share the Django container image using a publiccontainer registry like DockerHub.
7fc3f7cf58