An example SaaS deployment TODO list

This document outlines the steps for setting up SaaS services to test model inference.

  1. Determine the requirements

    1. Identify the input/output for the end-to-end pipeline

    2. Decide how data I/O will work, for example upload/download through the website or upload/download through S3

    3. Decide which GPUs to run on

    4. Define the target latency for inference

    5. Decide whether the received data needs to be saved (for debugging or other purposes)

    6. Estimate the cost of testing
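
For the latency requirement, it can help to measure before committing to a number. The following is a minimal sketch, assuming the inference call can be wrapped in a zero-argument Python callable. `measure_latency` and its parameters are hypothetical names, not part of any existing pipeline.

```python
import statistics
import time


def measure_latency(call, n_requests=50):
    """Time repeated calls and summarize the latency in milliseconds.

    `call` is any zero-argument callable that performs one inference
    request (hypothetical; wrap the real client call here).
    """
    latencies = []
    for _ in range(n_requests):
        start = time.perf_counter()
        call()
        latencies.append((time.perf_counter() - start) * 1000.0)

    # 99 cut points; index 94 is the 95th percentile.
    # The "inclusive" method keeps the result within the observed range.
    cuts = statistics.quantiles(latencies, n=100, method="inclusive")
    return {
        "p50_ms": statistics.median(latencies),
        "p95_ms": cuts[94],
        "max_ms": max(latencies),
    }
```

Reporting a percentile alongside the median is a design choice here: a single slow cold start can otherwise hide behind a good average.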

  2. Freeze the model, inference pipeline, and runtime environment

    1. Formalize the pipeline (set the random seed, exclude functions that are still under development, etc.)

    2. Reproduce the training pipeline in a brand-new environment/container
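
Setting the random seed might look like the sketch below. The list does not say which framework the model uses, so the NumPy and PyTorch branches are assumptions and are skipped when those libraries are not installed. `set_global_seed` is a hypothetical helper name.

```python
import os
import random


def set_global_seed(seed: int = 0) -> None:
    """Seed the random number generators the pipeline may touch."""
    # PYTHONHASHSEED only affects Python processes started after this
    # point (for example worker subprocesses), not the current one.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)

    try:
        import numpy as np  # assumption: the pipeline may use NumPy

        np.random.seed(seed)
    except ImportError:
        pass

    try:
        import torch  # assumption: the model may be a PyTorch model

        torch.manual_seed(seed)
        # Safe to call without a GPU; it is ignored if CUDA is unavailable.
        torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass
```

Calling `set_global_seed(0)` once at the start of the inference entry point, before the model is loaded, keeps the frozen pipeline and the later tests on the same seed.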

  3. Set up the end-to-end pipeline

    1. Set up the services involved (web browser → website hosted on S3 → Amazon API Gateway → Lambda function that calls the SageMaker runtime → SageMaker model endpoint → attached GPU instance)

    2. Add detailed logging at each step to aid debugging

    3. Get the whole pipeline running with a toy model and environment

    4. Swap in the frozen model and pipeline
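
The Lambda step of this chain could be sketched as below, assuming API Gateway uses Lambda proxy integration and the endpoint accepts and returns JSON. The `ENDPOINT_NAME` environment variable, its default value, and the `handle_request` function are hypothetical. Only `invoke_endpoint` on the boto3 `sagemaker-runtime` client is an actual AWS API.

```python
import json
import logging
import os

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# Hypothetical environment variable; set it in the Lambda configuration.
ENDPOINT_NAME = os.environ.get("ENDPOINT_NAME", "toy-model-endpoint")


def handle_request(event, runtime_client, endpoint_name=ENDPOINT_NAME):
    """Forward an API Gateway proxy event to a SageMaker endpoint."""
    # Assumption: the request body is plain (not base64-encoded) JSON text.
    body = event.get("body") or "{}"
    logger.info("received request body of %d characters", len(body))

    try:
        json.loads(body)  # reject malformed input before calling the GPU endpoint
    except json.JSONDecodeError:
        logger.warning("rejected request: body is not valid JSON")
        return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}

    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=body,
    )
    result = response["Body"].read().decode("utf-8")
    logger.info("endpoint %s returned %d characters", endpoint_name, len(result))

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": result,
    }


def lambda_handler(event, context):
    """Lambda entry point."""
    import boto3  # imported lazily so handle_request can be tested without AWS

    return handle_request(event, boto3.client("sagemaker-runtime"))
```

Passing the client in as an argument is a deliberate choice: the same function can be exercised with a stub client while the toy model is in place, and it needs no change when the frozen model's endpoint is swapped in.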

  4. Test (end-to-end & module tests)

    1. Write unit tests for each module

    2. Prepare test data for end-to-end runs

    3. Run the full pipeline and verify that the results are consistent

    4. Run random tests from different ends of the pipeline
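
One possible way to check that results are consistent is to compare each response against a stored reference output, allowing a small numeric tolerance for GPU or library differences. This is a sketch under the assumption that responses are JSON-like (dicts, lists, numbers, strings). `outputs_match` is a hypothetical helper and the default tolerances are arbitrary.

```python
import math


def outputs_match(actual, expected, rel_tol=1e-5, abs_tol=1e-8):
    """Recursively compare two JSON-like values, tolerating small float drift."""
    # Booleans are ints in Python; compare them strictly so True != 1 here.
    if isinstance(actual, bool) or isinstance(expected, bool):
        return type(actual) is type(expected) and actual == expected
    if isinstance(actual, (int, float)) and isinstance(expected, (int, float)):
        return math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=abs_tol)
    if isinstance(actual, dict) and isinstance(expected, dict):
        return actual.keys() == expected.keys() and all(
            outputs_match(actual[k], expected[k], rel_tol, abs_tol) for k in expected
        )
    if isinstance(actual, list) and isinstance(expected, list):
        return len(actual) == len(expected) and all(
            outputs_match(a, e, rel_tol, abs_tol) for a, e in zip(actual, expected)
        )
    # Strings, None, and anything else must be exactly equal.
    return actual == expected
```

In an end-to-end test, `expected` would be loaded from the prepared test data and `actual` from the parsed response body.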

  5. Write a manual for the SaaS service

    1. Usage: how to use the pipeline

    2. Notes on risks and limits, for example how many requests can run at the same time

  6. Maintain the pipeline

    1. The procedure for switching models or updating the pipeline

    2. The procedure to follow when the input/output, data I/O, or other setup changes

    3. Enable model monitoring
