5.2 What's next: simple is better

  • More factors need to be taken into account, combined with prior knowledge from engineering tests; in particular, the technical debt a design incurs. For example, the Swish activation proposed in [1] might give a slight dB improvement, but it is suboptimal in the deployment environment and might not be worth using over a regular activation function such as ReLU.

  • Don't over-engineer.
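
To make the ReLU versus Swish trade-off concrete, here is a minimal sketch of the two activations in plain Python. It assumes the basic Swish form from [1], x * sigmoid(x); the function names are illustrative and not taken from any particular framework. It only shows why Swish costs more per element (an exponential and a division, versus a single comparison for ReLU); the actual deployment cost depends on the target hardware and runtime.

```python
import math


def relu(x: float) -> float:
    """ReLU: one comparison, no transcendental functions."""
    return x if x > 0.0 else 0.0


def swish(x: float) -> float:
    """Swish in its basic form, x * sigmoid(x) (assumed from [1])."""
    # Numerically stable sigmoid: never exponentiate a large positive number.
    if x >= 0.0:
        s = 1.0 / (1.0 + math.exp(-x))
    else:
        e = math.exp(x)
        s = e / (1.0 + e)
    return x * s


if __name__ == "__main__":
    # Compare the two activations on a few sample inputs.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  swish={swish(x):.4f}")
```

Unlike ReLU, Swish is slightly negative for negative inputs and needs an exponential for every element, which is often harder to quantize or fuse on constrained inference targets.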

References:

[1] Swish: A Self-Gated Activation Function
