An example SaaS deployment TODO list
This doc outlines the items needed to set up the SaaS services for testing inference.
Determine the requirements
Identify the input/output for the end-to-end pipeline
Confirm how data I/O is handled: for example, upload/download through the website, upload/download through S3, etc.
Which GPU cards to run on
What inference latency is acceptable?
Does the received data need to be saved (for debugging or other purposes)?
The cost for testing
Freeze the model, inference pipeline, and runtime environment
Formalize the pipeline (set the random seed, skip functions that are still under development, etc.)
Reproduce the training pipeline in a brand-new environment/container
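The seeding and environment-capture steps above can be sketched as follows. This is a minimal sketch, not the pipeline's actual code: `set_seed` and `environment_fingerprint` are hypothetical helper names, NumPy is seeded only if it happens to be installed, and any deep learning framework in use would need its own seed call (for example `torch.manual_seed` in PyTorch).

```python
import platform
import random
import sys


def set_seed(seed: int) -> None:
    """Seed the random number generators used by this process."""
    random.seed(seed)
    try:
        import numpy as np  # optional; seeded only when available
        np.random.seed(seed)
    except ImportError:
        pass


def environment_fingerprint() -> dict:
    """Collect basic runtime facts to store next to the frozen model."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }


if __name__ == "__main__":
    set_seed(42)
    print(environment_fingerprint())
```

Saving the fingerprint together with the model artifact makes it easier to tell whether a new environment or container really matches the one that was frozen.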
Set up the end-to-end pipeline
Set up the services involved (web browser → website hosted on S3 → Amazon API Gateway → Lambda function call → SageMaker model endpoint → attached GPU instance)
Add detailed logging at each step to aid debugging
Make the whole pipeline runnable with a toy model and environment
Adapt it to the frozen model and pipeline
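The Lambda step, with logging around the endpoint call, might look like the sketch below. It assumes an API Gateway proxy-style event whose `body` is a JSON string, and a client exposing the same `invoke_endpoint` call as `boto3.client("sagemaker-runtime")`. The endpoint name, `make_handler`, and `FakeRuntimeClient` are hypothetical; the fake client plays the role of the toy model so the pipeline can be exercised without AWS.

```python
import io
import json
import logging

logger = logging.getLogger("inference")
logger.setLevel(logging.INFO)

ENDPOINT_NAME = "toy-model-endpoint"  # hypothetical endpoint name


def make_handler(runtime_client, endpoint_name=ENDPOINT_NAME):
    """Build a Lambda-style handler around a SageMaker runtime client."""

    def handler(event, context=None):
        payload = event.get("body") or "{}"
        logger.info("request received: %d bytes", len(payload))
        response = runtime_client.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType="application/json",
            Body=payload,
        )
        result = response["Body"].read().decode("utf-8")
        logger.info("endpoint responded: %d bytes", len(result))
        return {"statusCode": 200, "body": result}

    return handler


class FakeRuntimeClient:
    """Toy stand-in for the real client; it echoes the input back."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        data = json.loads(Body)
        out = json.dumps({"echo": data}).encode("utf-8")
        return {"Body": io.BytesIO(out)}


if __name__ == "__main__":
    handler = make_handler(FakeRuntimeClient())
    print(handler({"body": '{"x": 1}'}))
```

In a real deployment the same `make_handler` would presumably be given `boto3.client("sagemaker-runtime")` and the frozen model's endpoint name in place of the fake client.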
Test (end-to-end & module tests)
Unit test for each module
Have test data for end-to-end available
Run through the pipeline and make sure the results are consistent
Run random tests from different ends of the pipeline
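One way to make "consistent" checkable is to compare the pipeline's output against stored reference output within a tolerance. The sketch below assumes the outputs are flat lists of floats; `outputs_match` and the tolerance values are illustrative choices, not part of the original pipeline.

```python
import math


def outputs_match(actual, expected, rel_tol=1e-5, abs_tol=1e-8):
    """Return True when both sequences agree element-wise within tolerance."""
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, e in zip(actual, expected)
    )


if __name__ == "__main__":
    reference = [0.12, 0.88]          # saved from the frozen pipeline
    deployed = [0.1200000001, 0.88]   # returned by the deployed endpoint
    print(outputs_match(deployed, reference))
```

A tolerance is used rather than exact equality because results on GPU can differ slightly between runs or hardware even with a fixed seed.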
Write a manual for the SaaS service
Usage: how to use the pipeline
Notes about the risks, for example, how many requests can run at the same time
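The concurrency limit noted in the manual can be measured rather than guessed. The sketch below fires a batch of requests in parallel and reports failures and latency. `probe_concurrency` is a hypothetical helper, and `call` stands for whatever function sends one request to the service.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def probe_concurrency(call, n_requests, max_workers):
    """Send n_requests through `call` in parallel and summarize the outcome."""

    def timed(i):
        start = time.perf_counter()
        try:
            call(i)
            ok = True
        except Exception:
            ok = False
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(timed, range(n_requests)))

    latencies = [elapsed for _, elapsed in results]
    return {
        "requests": n_requests,
        "failures": sum(1 for ok, _ in results if not ok),
        "median_latency_s": statistics.median(latencies),
        "max_latency_s": max(latencies),
    }


if __name__ == "__main__":
    # Stand-in for a real request: sleep briefly instead of calling the service.
    print(probe_concurrency(lambda i: time.sleep(0.01), n_requests=8, max_workers=4))
```

Repeating the probe with increasing `max_workers` until failures or latency climb gives a rough number to record in the manual.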
Maintain the pipeline
The procedure for switching the model or updating the pipeline
The procedure to follow when the input/output, data I/O, or other setup changes
Enable model monitoring