Development and deployment cycle

 

This section describes the eHealth development and deployment cycle, including the use of the eHealth environments and the overall flow.

Overview of build and deployment 

Applications must be deployed using one of the predefined eHealth Helm charts and must run in the eHealth Service Mesh.

Telemedicine Solution Providers must use the eHealth Infrastructure Deployment Pipeline to deploy to the eHealth environments.

  1. Telemedicine Solution Providers shall use the official eHealth Docker images and Helm charts for application definition and development.

  2. After local development, the Telemedicine Solution Providers shall upload the Docker images to the Docker and Helm chart repository (Harbor).

  3. Telemedicine Solution Providers use GitLab to define the deployment file, or "desired state file", based on Helmsman (see the sketch after this list).

  4. A configured Jenkins deployment pipeline monitors the Helmsman file in GitLab and pushes updates to Kubernetes.

    1. The Jenkins deployment pipeline is configured by the eHealth Infrastructure provider.

  5. The Kubernetes control loop watches the state of the cluster and makes or requests changes where needed.
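
As a rough illustration of steps 1-4, the snippet below is a minimal, hypothetical Helmsman desired state file. The repository URL, chart name, namespace and version are placeholders, not actual eHealth values.

    # Hypothetical Helmsman desired state file (YAML); names and URLs are placeholders
    helmRepos:
      ehealth-charts: https://harbor.example.org/chartrepo/ehealth   # Helm chart repository hosted in Harbor

    namespaces:
      my-solution:
        protected: false

    apps:
      my-telemedicine-app:
        namespace: my-solution
        enabled: true
        chart: ehealth-charts/ehealth-service        # one of the predefined eHealth Helm charts (assumed name)
        version: 1.2.3                               # Helm chart version
        valuesFile: values/my-telemedicine-app.yaml  # per-application configuration, see Phase #3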

The following figure illustrates the deployment pipeline.

Development and deployment happen in four phases.

Phase #1 Local development

All Telemedicine Solution Providers shall start the development of their components in their local environment.

This could be a complete clone of the official test environments running on the eHealth platform, but this is not a strict requirement.
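
One common (but not required) way to approximate a test environment locally is a small Docker Compose setup based on the official eHealth images. The sketch below is purely illustrative; the image names and registry address are placeholders, not actual eHealth image names.

    # Hypothetical docker-compose.yml for local development; image names are placeholders
    services:
      my-telemedicine-app:
        build: .                    # the component under development
        ports:
          - "8080:8080"
      ehealth-dependency:
        # an official eHealth image pulled from Harbor, standing in for a platform service
        image: harbor.example.org/ehealth/official-base-image:latest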

Phase #2 Publish build to the eHealth platform

When the Telemedicine Solution Provider believes that the application/service/microservice is ready for testing, it is published to the eHealth platform.

This means that the Docker image is signed and pushed to the central Docker image registry (Harbor) used and hosted by the eHealth platform.
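
A minimal sketch of this step is shown below, assuming a GitLab CI job and Docker Content Trust for signing; the registry address, project path and credential variables are placeholders, and the actual eHealth signing mechanism may differ.

    # Hypothetical .gitlab-ci.yml job; registry URL and credential variables are placeholders
    stages:
      - publish

    publish-image:
      stage: publish
      script:
        - echo "$HARBOR_PASSWORD" | docker login harbor.example.org -u "$HARBOR_USER" --password-stdin
        - docker build -t harbor.example.org/my-project/my-telemedicine-app:1.0.0 .
        # signing is assumed to use Docker Content Trust (Notary); the platform may use a different mechanism
        - export DOCKER_CONTENT_TRUST=1
        - docker push harbor.example.org/my-project/my-telemedicine-app:1.0.0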

Phase #3 Deploy to test environment

To deploy the Docker image as a container on the first test environment, the image must be added to the Helmsman specification file for the given environment.

All applications running on an environment are specified as code in the desired state specification for that environment.

For each application, this includes:

  • The docker image to run

    • Docker repository

    • Image tag

  • The helm chart

    • Helm chart repository

    • Helm chart version

  • Configuration

    • Ports

    • Replicas

    • Memory usage

    • Environment variables

    • Database and queue secrets

    • DNS bindings

    • etc.

When the desired state specification is updated, the applications and configuration are automatically rolled out onto the environment.
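
As an illustration, a hypothetical values file for an eHealth Helm chart could carry the configuration items listed above. The key names depend on the actual chart and are assumptions here.

    # Hypothetical values/my-telemedicine-app.yaml; key names depend on the actual eHealth Helm chart
    image:
      repository: harbor.example.org/my-project/my-telemedicine-app   # the Docker image to run
      tag: "1.0.0"                                                    # image tag
    replicaCount: 2                                                   # replicas
    resources:
      limits:
        memory: 512Mi                                                 # memory usage
    service:
      port: 8080                                                      # ports
    env:
      LOG_LEVEL: info                                                 # environment variables
    existingSecret: my-telemedicine-app-db                            # database/queue secrets referenced by name, not stored in Git
    ingress:
      host: my-app.exttest.example.org                                # DNS binding (placeholder domain)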

Phase #4 Test and Promotion

Note that every release must follow a strict release plan in which the eHealth environments are visited in the required order: the release is deployed and tested on EXTTEST, then promoted to PREPRODUCTION, and finally deployed to PRODUCTION.

External Test

Tests should be carried out on the External Test environment (EXTTEST).

When the application has passed QA, it can be promoted to the next environment. This happens by promoting a specific desired state specification in Jenkins-test, performed by users with the right privileges.

PreProd and Prod

Moving on to preprod happens by making a pull request from the "master" branch to the "prod" branch.

When the pull request is approved, all updates in the desired state specification for preprod are rolled out onto the preprod environment.

Deployment to production happens by promoting a specific desired state specification in Jenkins-prod, performed by users with the right privileges.

Hot-fixing

If serious bugs or security errors are found in production code or containers, a hotfix can be handled in the following way.

Rollback

Rollback is handled semi-automatically if the problem is visible through the pod's liveness and readiness probes. In other rollback situations, manual intervention is necessary.

In the first situation, where either the liveness or the readiness probe responds negatively, the new pod instance never becomes part of the serving cluster. The cluster automatically keeps sending traffic to the old instances of the service and skips the new pod. What should happen next has to be decided by the person who started the deployment; it could be a configuration fix or a rollback to the old software version.

In the second situation, where both the liveness and the readiness probes respond positively, the new pod will receive traffic. The person who started the deployment must then look at the cluster readout and decide whether to roll back. To roll back, that person has to commit a new cluster configuration, for example a revert commit to the desired state file.
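
The semi-automatic behaviour relies on standard Kubernetes liveness and readiness probes. The sketch below shows what such probes typically look like in a pod or Helm chart template; the paths and port are placeholders.

    # Hypothetical probe configuration for a container in the pod template; paths and port are placeholders
    livenessProbe:
      httpGet:
        path: /health/live      # the container is restarted if this endpoint starts failing
        port: 8080
      initialDelaySeconds: 30
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /health/ready     # the pod receives no traffic until this endpoint succeeds
        port: 8080
      periodSeconds: 5
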
Note: if a canary deployment is used, the negative impact of a software problem can be kept to a minimum, provided that the negative impact is visible in Prometheus.
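
Assuming an Istio-based service mesh (the implementation of the eHealth Service Mesh is not stated here), a canary can be expressed as weighted routing between the stable and canary versions, with the small canary share observed in Prometheus before the weight is increased.

    # Hypothetical Istio VirtualService splitting traffic between stable and canary versions;
    # the subsets are assumed to be defined in a matching DestinationRule
    apiVersion: networking.istio.io/v1beta1
    kind: VirtualService
    metadata:
      name: my-telemedicine-app
    spec:
      hosts:
        - my-telemedicine-app
      http:
        - route:
            - destination:
                host: my-telemedicine-app
                subset: stable      # current version keeps most of the traffic
              weight: 90
            - destination:
                host: my-telemedicine-app
                subset: canary      # new version, monitored via Prometheus metrics
              weight: 10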

Diagram of the flow

The flow can be viewed as follows.

  1. Application Vendor performs Local Development 

  2. Application Vendor pushes the application and Docker images to the FUT platform

  3. The eHealth infrastructure deploys to different environments, based on the Helmsman files provided by the application vendor.

  4. Application Vendor (or Customer) can test the Telemedicine solution on EXTTEST 

  5. After successful tests, the build can then be promoted to the next eHealth environment (PREPRODUCTION). 

  6. After successful tests, the build can then be promoted to the eHealth PRODUCTION environment. 

The figure shows that the SRE Team (Site Reliability Engineering) must approve deployments to external test, pre-prod and production. This is not the case for external applications. Furthermore, external applications do not have access to the Internal Test Environment (INTTEST), but have a special development environment, called DEVENVCGI, for their internal tests.