Chap 7. Deployment and model serving

Checklist before deployment:

  • What's the input/output? Which parameters should be exposed in the config file? (See the config sketch after this list.)

  • What are the minimum RAM and the runtime environment required to run inference?

  • What are the hard constraints on the deployment? For example, fixed GPU types, strict inference-time budgets, low latency, etc.

  • Is it a cloud, on-premises, or edge/mobile deployment?

  • Security: does the code need to be encrypted?

  • Set up the license.
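
To make the first item concrete, here is a minimal sketch of how those checklist answers could be captured in one config object. Everything here is illustrative: the field names, defaults, and the image-classification framing are assumptions, not part of any particular service.

```python
from dataclasses import dataclass

@dataclass
class InferenceConfig:
    # I/O contract: what the service accepts and what it returns
    input_format: str = "jpeg"       # accepted payload type (assumption)
    output_format: str = "json"      # response type (assumption)
    # Parameters worth exposing in the config file
    model_path: str = "models/classifier.pt"   # hypothetical path
    batch_size: int = 8
    device: str = "cuda:0"           # fixed GPU type, if the deployment requires one
    max_latency_ms: int = 200        # strict inference-time budget
    min_ram_gb: int = 16             # minimum RAM needed to run inference
```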

Then comes the deployment lifecycle:

  • develop, for testing

  • staging, for pre-release

  • production, for release
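
One common way to wire these three stages together is to key the runtime settings on an environment variable. The variable name (`APP_ENV`) and the settings themselves are placeholders; this is a sketch, not a prescription:

```python
import os

# Hypothetical stage-to-settings mapping; adjust names and values to your setup.
STAGES = {
    "develop":    {"debug": True,  "model_path": "models/candidate.pt"},
    "staging":    {"debug": True,  "model_path": "models/release_rc.pt"},
    "production": {"debug": False, "model_path": "models/release.pt"},
}

stage = os.environ.get("APP_ENV", "develop")  # default to the testing stage
settings = STAGES[stage]
```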

Now you have set up a SaaS or RESTful API that serves your deep learning model, and everything looks good. What's next? Which tests should be included?

This could be a good starting point:

  • Profile and record resource utilization in your local container, including RAM, GPU memory, GPU utilization, etc. (see the profiling sketch after this list).

  • Run a toy dataset to make sure the pipeline works and the outputs are correct (see the smoke-test sketch after this list).

  • Test with very large data (be mindful of not only the inference step but also uploading, compressing, uncompressing, sending results back, etc.).

  • Set up an email reminder or other notifications for when a run fails (a combined logging-and-notification sketch follows this list).

  • Add a logging system to monitor each step (easier to debug, and the per-step timings double as profiling data).

  • If it involves a multi-GPU setup, also check the multi-GPU scaling behavior (see the scaling sketch after this list).
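
For the profiling item, here is a minimal sketch of a utilization snapshot, assuming `psutil` is installed and an NVIDIA GPU with `nvidia-smi` on the PATH; it records RAM, GPU memory, and GPU utilization so you can compare runs:

```python
import subprocess
import psutil  # assumed to be installed in the container

def snapshot_utilization():
    """Record RAM, GPU memory, and GPU utilization for later comparison."""
    ram_gb = psutil.Process().memory_info().rss / 1e9
    # nvidia-smi ships with the NVIDIA driver; this assumes a single GPU
    # (multi-GPU machines return one CSV line per device).
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=utilization.gpu,memory.used",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    gpu_util, gpu_mem_mb = (float(x) for x in out.split(", "))
    return {"ram_gb": ram_gb, "gpu_util_pct": gpu_util, "gpu_mem_mb": gpu_mem_mb}
```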
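
For the toy-dataset item, a smoke test against the running API. The endpoint, payload schema, and expected label are all hypothetical; the point is to send one tiny, known input and assert on the answer:

```python
import requests  # any HTTP client works; requests is assumed here

API_URL = "http://localhost:8000/predict"  # hypothetical endpoint

def smoke_test():
    """Send one known input and check the pipeline returns the expected output."""
    toy_payload = {"image_path": "tests/data/cat.jpg"}  # illustrative schema
    resp = requests.post(API_URL, json=toy_payload, timeout=30)
    assert resp.status_code == 200, f"pipeline broken: HTTP {resp.status_code}"
    assert resp.json()["label"] == "cat", "pipeline ran but the output is wrong"
```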
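
For the notification and logging items, one sketch that covers both: each pipeline step is timed and logged, and a failure triggers an email. The SMTP host and addresses are placeholders:

```python
import logging
import smtplib
import time
from email.message import EmailMessage

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("serving")

def notify_failure(step, err):
    """Email the on-call address when a step fails (placeholder host/addresses)."""
    msg = EmailMessage()
    msg["Subject"] = f"[inference] step '{step}' failed"
    msg["From"], msg["To"] = "bot@example.com", "oncall@example.com"
    msg.set_content(repr(err))
    with smtplib.SMTP("smtp.example.com") as s:
        s.send_message(msg)

def timed_step(step, fn, *args):
    """Run one pipeline step; log its duration (doubles as profiling data)."""
    start = time.perf_counter()
    try:
        result = fn(*args)
        log.info("%s took %.3fs", step, time.perf_counter() - start)
        return result
    except Exception as err:
        log.exception("%s failed", step)
        notify_failure(step, err)
        raise
```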
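
For the multi-GPU item, a rough harness for checking scaling: run the same fixed workload on 1, 2, and 4 GPUs and compare wall-clock time. `run_batch` is a hypothetical hook into your own inference code:

```python
import time

def scaling_check(run_batch, gpu_counts=(1, 2, 4)):
    """Compare wall-clock time for a fixed workload as GPUs are added.

    `run_batch(n_gpus)` is a hypothetical hook that runs one fixed batch
    on n_gpus devices; near-linear speedup is the goal, not a guarantee.
    """
    base = None
    for n in gpu_counts:
        start = time.perf_counter()
        run_batch(n)
        elapsed = time.perf_counter() - start
        speedup = base / elapsed if base else 1.0
        base = base or elapsed
        print(f"{n} GPU(s): {elapsed:.2f}s, speedup x{speedup:.2f}")
```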

After release, most of your bandwidth will go into maintenance, on two fronts:

  • the pipeline side (serving infrastructure, data transfer, scaling)

  • the model side (model quality and updates)

If a new issue pops up, you can apply a patch and release a new minor version, or fold the fix into the next major release.
