How to deploy a CI/CD Pipeline on GCP?

Deploying a pipeline on GCP is efficient. Let's get started.

What is CI/CD?

CI stands for continuous integration and CD stands for continuous delivery. Together they are a best practice that DevOps teams implement to deliver code changes more frequently and reliably. The implementation is known as a CI/CD pipeline. It automates the deployment steps, which gives the development team more time to develop features and meet business requirements.

By adopting a CI/CD pipeline, we can ensure high quality and maintainability of data processes and workflows. The methods that you can apply are as follows:

  • Version control of source code.
  • The automatic building, testing, and deployment of apps.
  • Environment isolation and separation from production.
  • Replicable procedures for environment setup.

Deployment Architecture:

Deploying the CI/CD pipeline requires the following GCP products:

  1. Cloud Build: A service that runs your builds on Google Cloud. A build is a series of build steps, and each step runs in a Docker container. Here it is used to create the CI/CD pipeline that builds, deploys, and tests the data-processing workflow and the data processing itself.
  2. Cloud Composer: A managed Apache Airflow service that offers an environment where you can create, schedule, and monitor complex workflows. Here it runs the steps of the workflow, such as data processing, testing, and verifying the results.
  3. Dataflow: A managed service for running Apache Beam pipelines. Here it runs the data-processing job.

The CI/CD Pipeline:

The CI/CD pipeline consists of the following steps:

  1. Cloud Build packages the WordCount sample into a self-running JAR (Java Archive) file using the Maven builder, which is a container with Maven installed in it. When a build step is configured to use the Maven builder, Maven runs the tasks.
  2. Cloud Build uploads the JAR file to Cloud Storage.
  3. Cloud Build runs the unit tests on the data-processing workflow and then deploys the workflow code to Cloud Composer.
  4. Cloud Composer picks up the JAR file and runs the data-processing job on Dataflow.

Detailed View of CI/CD pipeline steps. Source: Google Cloud Doc
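
As a rough illustration of the first two steps, the sketch below generates a Cloud Build configuration as a Python dictionary and prints it as JSON (Cloud Build accepts JSON as well as YAML build configs). The JAR path and bucket name are hypothetical placeholders, and the Maven arguments are an assumption; a real project would adjust them to its own layout.

```python
import json


def build_config(jar_path: str, jar_bucket: str) -> dict:
    """Return a two-step build config: package the JAR, then upload it.

    jar_path and jar_bucket are illustrative values supplied by the caller.
    """
    return {
        "steps": [
            {
                # The Maven builder is a container image with Maven installed.
                "id": "package-jar",
                "name": "gcr.io/cloud-builders/mvn",
                "args": ["package"],
            },
            {
                # Copy the packaged JAR to a Cloud Storage bucket.
                "id": "upload-jar",
                "name": "gcr.io/cloud-builders/gsutil",
                "args": ["cp", jar_path, f"gs://{jar_bucket}/"],
            },
        ]
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    config = build_config("target/wordcount.jar", "my-test-jar-bucket")
    print(json.dumps(config, indent=2))
```

Each entry in "steps" runs in its own container, in order, which mirrors the description of Cloud Build as a series of build steps.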

Usually, the deployment is split into two separate Cloud Build pipelines: a test pipeline and a production pipeline.

The test pipeline consists of the following steps:

  • A developer commits code changes to Cloud Source Repositories.
  • The code change triggers a test build in Cloud Build.
  • Cloud Build builds the self-executing JAR file and deploys it to the test JAR bucket on Cloud Storage.
  • Cloud Build sets a variable in Cloud Composer to reference the newly deployed JAR file.
  • Cloud Build tests the data-processing workflow DAG (directed acyclic graph) and deploys it to the Cloud Composer bucket on Cloud Storage.
  • The workflow DAG file is deployed to Cloud Composer.
  • Finally, Cloud Build triggers the newly deployed data-processing workflow to run.
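
One property a DAG test can check is that the workflow really is acyclic, so that its tasks can be put in a valid run order. The sketch below does this with Python's standard graphlib module (Python 3.9+). It is not how Cloud Composer or Airflow validates DAGs internally, and the task names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(dependencies):
    """Return the tasks in a valid run order.

    dependencies maps each task to the set of tasks that must finish first.
    Raises ValueError if the graph contains a cycle.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # The second element of args holds the detected cycle.
        raise ValueError(f"workflow is not acyclic: {exc.args[1]}") from exc


# Hypothetical task names for a data-processing workflow.
WORKFLOW = {
    "run_dataflow_job": set(),
    "download_result": {"run_dataflow_job"},
    "compare_result": {"download_result"},
}

if __name__ == "__main__":
    print(execution_order(WORKFLOW))
```

A check like this can run as an ordinary unit test in the build, before the DAG file is copied to the Cloud Composer bucket.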

The production pipeline consists of the following steps:

  • A developer manually runs the production deployment pipeline in Cloud Build.
  • Cloud Build copies the self-executing JAR file from the test JAR bucket to the production JAR bucket on Cloud Storage.
  • Cloud Build tests the production data-processing workflow DAG and deploys it to the Cloud Composer bucket on Cloud Storage.
  • The production workflow DAG file is deployed to Cloud Composer.
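
The promotion steps above could be sketched as two Cloud Build steps, generated here as Python dictionaries. The bucket names, JAR name, and DAG file name are hypothetical. The sketch assumes the DAG file is deployed by copying it into the dags/ folder of the Cloud Composer environment's bucket, and it leaves out the DAG test step.

```python
def promote_jar_step(jar_name, test_bucket, prod_bucket):
    """Build step that copies the JAR from the test bucket to the production bucket."""
    return {
        "id": "promote-jar",
        "name": "gcr.io/cloud-builders/gsutil",
        "args": [
            "cp",
            f"gs://{test_bucket}/{jar_name}",
            f"gs://{prod_bucket}/{jar_name}",
        ],
    }


def deploy_dag_step(dag_file, composer_bucket):
    """Build step that copies the workflow DAG file into the environment's dags/ folder."""
    return {
        "id": "deploy-dag",
        "name": "gcr.io/cloud-builders/gsutil",
        "args": ["cp", dag_file, f"gs://{composer_bucket}/dags/"],
    }


def production_build(jar_name, test_bucket, prod_bucket, dag_file, composer_bucket):
    """Assemble the production build config from the two steps, in order."""
    return {
        "steps": [
            promote_jar_step(jar_name, test_bucket, prod_bucket),
            deploy_dag_step(dag_file, composer_bucket),
        ]
    }
```

Copying the already tested JAR, rather than rebuilding it, means production runs exactly the artifact that passed the test pipeline.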

Finally, let's discuss the benefits of a CI/CD pipeline:

  • Smaller Code Changes: Small code changes can be integrated one at a time. They are simpler and easier to handle than huge chunks of code.
  • Fault Isolation: This is the practice of designing a system so that when an error occurs, its negative outcomes are limited in scope. It makes the system more maintainable and reduces the scope of problems.
  • Faster Mean Time to Resolution (MTTR): MTTR measures the maintainability of repairable features as the average time taken to repair a broken feature. Tracking it helps you manage the time spent recovering from a failure.
  • Faster Release Rate: Failures are detected and repaired faster, which increases the release rate.
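
To make the MTTR definition concrete, here is a minimal sketch that averages the repair time over a list of incidents. The incident timestamps are made up for illustration, and teams differ on exactly when the clock starts and stops.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average of (resolved - detected) over a list of (detected, resolved) pairs."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),
    )
    return total / len(incidents)


if __name__ == "__main__":
    # Made-up incidents: one took 30 minutes to fix, the other 90 minutes.
    incidents = [
        (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 30)),
        (datetime(2024, 1, 2, 14, 0), datetime(2024, 1, 2, 15, 30)),
    ]
    print(mean_time_to_resolution(incidents))  # 1:00:00
```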