# Chap 7. Deployment and model serving

Checklist before deployment.

* What are the inputs and outputs? Which parameters should be exposed in the config file?
* What are the minimum RAM and runtime environment needed to run inference?
* What are the fixed requirements for the deployment, for example specific GPU types, a strict inference-time budget, or low latency?
* Will it be deployed in the cloud, on-site, or on edge/mobile devices?
* Address security issues, such as encrypting the code.
* Set up the license.
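
One way to make the checklist concrete is to write the answers down in a config object that is validated at startup. The sketch below is only an illustration: the class `ServingConfig`, the function `load_config`, and every field name and default value are hypothetical, not part of any specific framework.

```python
import json
from dataclasses import dataclass


@dataclass
class ServingConfig:
    """Hypothetical deployment config; all field names are illustrative."""
    model_path: str = "model.pt"            # where the weights live
    input_shape: tuple = (1, 3, 224, 224)   # expected input tensor shape
    output_classes: int = 10                # size of the output vector
    min_ram_gb: float = 4.0                 # minimum RAM needed for inference
    gpu_type: str = "any"                   # fixed GPU requirement, if any
    max_latency_ms: float = 200.0           # strict inference-time budget
    target: str = "cloud"                   # "cloud", "on-site", or "edge"

    def validate(self) -> None:
        """Reject values that cannot describe a real deployment."""
        if self.target not in {"cloud", "on-site", "edge"}:
            raise ValueError(f"unknown deployment target: {self.target}")
        if self.min_ram_gb <= 0 or self.max_latency_ms <= 0:
            raise ValueError("resource limits must be positive")


def load_config(text: str) -> ServingConfig:
    """Parse a JSON string into a validated ServingConfig."""
    raw = json.loads(text)
    if "input_shape" in raw:
        # JSON has no tuples, so convert the list back.
        raw["input_shape"] = tuple(raw["input_shape"])
    cfg = ServingConfig(**raw)
    cfg.validate()
    return cfg
```

Failing fast on an invalid config is a design choice: it surfaces a wrong deployment target or a missing limit before any model is loaded.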

Next comes the deployment lifecycle:

* develop for testing
* staging for pre-release
* production for release
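
The three stages are often distinguished by nothing more than a settings lookup. The following is a minimal sketch under stated assumptions: the environment variable `DEPLOY_STAGE`, the function `get_settings`, and the specific keys and values are all hypothetical.

```python
import os

# Hypothetical per-stage settings; keys and values are illustrative only.
STAGE_SETTINGS = {
    "develop": {"debug": True, "log_level": "DEBUG", "replicas": 1},
    "staging": {"debug": False, "log_level": "INFO", "replicas": 1},
    "production": {"debug": False, "log_level": "WARNING", "replicas": 3},
}


def get_settings(stage=None):
    """Return settings for a stage, defaulting to the DEPLOY_STAGE env var."""
    stage = stage or os.environ.get("DEPLOY_STAGE", "develop")
    if stage not in STAGE_SETTINGS:
        raise ValueError(f"unknown stage: {stage!r}")
    # Copy the dict so callers cannot mutate the shared defaults.
    return dict(STAGE_SETTINGS[stage], stage=stage)
```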

Suppose you have set up a SaaS or RESTful API that serves your deep learning model, and everything looks good. What's next? Which tests should be included?

The following list is a good starting point:

* Profile and record resource usage in your local container, including RAM, GPU memory, and GPU utilization.
* Keep a toy dataset to verify that the pipeline runs end to end and produces correct results.
* Test with very large data, being mindful not only of inference but also of uploading, compressing, decompressing, and sending results back.
* Set up email or other notifications for failures.
* Add logging to monitor each step; this makes debugging easier and records time spent for profiling.
* If the deployment uses multiple GPUs, also check how it scales across them.
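
Several of these items (per-step logging, timing, a toy-dataset smoke test, and failure notification) can be sketched with the standard library alone. Everything below is an assumption made for illustration: `predict` is a stand-in for the real model, `notify_failure` only logs instead of sending an email, and `tracemalloc` measures Python heap allocations only, not GPU memory.

```python
import logging
import time
import tracemalloc
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("serving")

timings = {}  # step name -> elapsed seconds, kept for later profiling


@contextmanager
def timed_step(name):
    """Log how long a pipeline step takes and record it in `timings`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = time.perf_counter() - start
        logger.info("step=%s seconds=%.4f", name, timings[name])


def notify_failure(message):
    """Placeholder for an email or chat alert; here it only logs."""
    logger.error("ALERT: %s", message)


def predict(batch):
    """Stand-in for the real model: doubles every input value."""
    return [2 * x for x in batch]


def run_pipeline(batch):
    """Run preprocess, inference, and postprocess with timing and alerts."""
    try:
        with timed_step("preprocess"):
            cleaned = [float(x) for x in batch]
        with timed_step("inference"):
            outputs = predict(cleaned)
        with timed_step("postprocess"):
            return [round(y, 3) for y in outputs]
    except Exception as exc:
        notify_failure(f"pipeline failed: {exc}")
        raise


# Smoke test on a toy dataset, recording peak Python heap usage.
tracemalloc.start()
result = run_pipeline([1, 2, 3])
_, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()
logger.info("peak_python_memory_bytes=%d", peak_bytes)
```

The same `run_pipeline` call can then be repeated with a very large input to see which step dominates the recorded timings.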

After release, most of the effort shifts to maintenance:

* pipeline side
* model side

If a new issue pops up, you can either apply a patch and release a new minor version, or fix it in the next major version's release.
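
As a small illustration of the release choice, the sketch below bumps a version string. It assumes a `MAJOR.MINOR.PATCH` numbering scheme, which the text above does not specify, and `bump_version` is a hypothetical helper.

```python
def bump_version(version, part):
    """Return `version` ("MAJOR.MINOR.PATCH") with one part incremented."""
    major, minor, patch = (int(p) for p in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"      # lower parts reset to zero
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")
```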

References:

* <https://www.youtube.com/watch?v=ii89L7LVAs4>
* Flask / Streamlit / **Starlette**: <https://towardsdatascience.com/10-minutes-to-deploying-a-deep-learning-model-on-google-cloud-platform-13fa56a266ee>
* <https://aws.amazon.com/blogs/machine-learning/deploying-machine-learning-models-as-serverless-apis/>
* AWS Lambda is still a RESTful API with preset protocols.
* <https://towardsdatascience.com/deploy-a-machine-learning-model-as-an-api-on-aws-43e92d08d05b>
