
Cloud Run Module TNG

Following discussion in #45 (comment 543140), this issue is to plan the next major release (9.0.0) of this module. This is likely to be close to a complete rewrite of the module, so we should take the opportunity to ensure that all requirements and desires are collated and considered.

Things we want to add

  • Switch from google_cloud_run_service (Cloud Run API v1) resource to google_cloud_run_v2_service (Cloud Run API v2) resource.
    • While both APIs are currently supported, it is noted in the docs that the v2 resources are recommended as they will have broader support for Cloud Run features going forwards.
  • The option of a complete passthrough of the template block to support edge cases that are not immediately available via the module's variables (volumes, sidecar containers, etc).
  • Some way of configuring the traffic block to allow for blue-green/canary deployments.
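
One possible shape for the traffic configuration is sketched below. It is not a settled design: the variable name traffic, the resource name webapp, the region and the sample image are all placeholders chosen for illustration. It assumes Terraform >= 1.3 (for optional() defaults) and a Google provider version that ships the v2 Cloud Run resources.

```hcl
# Sketch only: variable and resource names are hypothetical.
variable "traffic" {
  description = "Traffic targets. An empty list omits the traffic block entirely."
  type = list(object({
    type     = optional(string, "TRAFFIC_TARGET_ALLOCATION_TYPE_REVISION")
    revision = optional(string)
    percent  = number
    tag      = optional(string)
  }))
  default = []
}

resource "google_cloud_run_v2_service" "webapp" {
  name     = "webapp"
  location = "europe-west2"

  template {
    containers {
      # Google's public sample image, used here as a stand-in.
      image = "us-docker.pkg.dev/cloudrun/container/hello"
    }
  }

  # One traffic block per entry in var.traffic.
  dynamic "traffic" {
    for_each = var.traffic
    content {
      type     = traffic.value.type
      revision = traffic.value.revision
      percent  = traffic.value.percent
      tag      = traffic.value.tag
    }
  }
}
```

With this shape, a canary rollout would be expressed by passing two entries, for example one with percent = 90 for the current revision and one with percent = 10 for the new revision. When no traffic block is set, Cloud Run sends all traffic to the latest ready revision.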

Things we want to keep

  • Configuration of a dedicated service account identity.
  • Convenience variables for passing in SQL instances.
  • Optional auto-grant of SQL client roles to the service account identity.
  • Pre-deployment Cloud Run Jobs.
  • The ability to ignore certain annotations which Google likes to change under our feet.
  • Turnkey alerting and monitoring via our gcp-site-monitoring module.
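
As a rough illustration of how the service account and SQL conveniences could carry over to the v2 resource, the sketch below wires a dedicated identity, an optional roles/cloudsql.client grant and a Cloud SQL volume together. The variable names (project, grant_sql_client_role, sql_instance_connection_names) and resource names are assumptions for this example, not the module's current interface, and the sketch assumes at least one SQL instance is supplied.

```hcl
# Sketch only: all variable and resource names are hypothetical.
variable "project" {
  type = string
}

variable "grant_sql_client_role" {
  type    = bool
  default = true
}

variable "sql_instance_connection_names" {
  description = "Connection names in the form project:region:instance."
  type        = list(string)
}

# Dedicated identity for the Cloud Run service.
resource "google_service_account" "webapp" {
  project      = var.project
  account_id   = "webapp-run"
  display_name = "Cloud Run identity for webapp"
}

# Optional auto-grant of the SQL client role to that identity.
resource "google_project_iam_member" "sql_client" {
  count   = var.grant_sql_client_role ? 1 : 0
  project = var.project
  role    = "roles/cloudsql.client"
  member  = "serviceAccount:${google_service_account.webapp.email}"
}

resource "google_cloud_run_v2_service" "webapp_sql" {
  project  = var.project
  name     = "webapp"
  location = "europe-west2"

  template {
    service_account = google_service_account.webapp.email

    containers {
      image = "us-docker.pkg.dev/cloudrun/container/hello"

      volume_mounts {
        name       = "cloudsql"
        mount_path = "/cloudsql"
      }
    }

    # The v2 API models Cloud SQL connections as a volume rather than an annotation.
    volumes {
      name = "cloudsql"
      cloud_sql_instance {
        instances = var.sql_instance_connection_names
      }
    }
  }
}
```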

Things we want to remove

  • Domain Mapping and related TLS certificate stuff in favour of always using an external load balancer.
  • VPC stuff in favour of a "paved-path" recommended approach for doing it outside of the module.
    • It often doesn't make sense to have a one-to-one relationship between the VPC connector and the Cloud Run service. It would be preferable to split the networking bits into a separate companion module to allow an easier one-to-many relationship.
    • Update: see this comment. The static IP/egress VPC config will now remain in the module. However, we will attempt to refactor some of the logic around it, which has become quite confusing.
  • The required stackdriver alias for the Google provider. This was originally used to pass in a provider instance authenticated as a service account with permissions to create alert policies in the meta project. However, we have since simply granted the workspace-specific terraform-deploy service accounts the permission to do this. The provider alias is therefore unnecessary and confusing, and can be removed and replaced with an input variable that specifies the meta project ID.
  • The dashboard.tf|json bits. A scan of all repositories under the DevOps group shows that none of our projects appear to be using this. We should remove the config and use the built-in Cloud Run dashboards as they contain the same information if not more.
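
To make the provider alias removal concrete, the sketch below shows an alert policy that targets the meta project through a plain input variable instead of a google.stackdriver provider instance. The variable name monitoring_scoping_project and the 5xx alert itself are invented for illustration, and our gcp-site-monitoring module may take this value differently.

```hcl
# Sketch only: the variable name and the alert definition are hypothetical.
variable "monitoring_scoping_project" {
  description = "ID of the meta project that hosts alert policies."
  type        = string
}

resource "google_monitoring_alert_policy" "server_errors" {
  # The default provider is used; only the project differs.
  project      = var.monitoring_scoping_project
  display_name = "webapp 5xx responses"
  combiner     = "OR"

  conditions {
    display_name = "5xx request rate above zero"

    condition_threshold {
      filter          = "resource.type = \"cloud_run_revision\" AND metric.type = \"run.googleapis.com/request_count\" AND metric.labels.response_code_class = \"5xx\""
      comparison      = "COMPARISON_GT"
      threshold_value = 0
      duration        = "300s"

      aggregations {
        alignment_period   = "300s"
        per_series_aligner = "ALIGN_RATE"
      }
    }
  }
}
```

Callers would then no longer need a providers block when instantiating the module, only the extra variable.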

Miscellaneous

  • Implement integration testing using terraform test, introduced in Terraform 1.6.
    • We advertise this module as supporting multiple provider and Terraform core versions in versions.tf; however, we very rarely check that new features etc. do not introduce regressions. For example, versions.tf is currently configured to allow versions >= 3.7 of the Google provider, but we are using the google_cloud_run_v2_job resource which is not available until >= 4.0.
    • Ideally we would configure an integration suite to test against a matrix of Terraform core and provider minor versions that we wish to support.
  • Add a pre-commit config file with relevant checks.
    • terraform fmt, terraform validate, tfsec, and tflint against each minor version of Terraform that the module supports.
    • Auto-generate README.md sections using terraform-docs.
  • Add CI linting jobs from our terraform-pipeline.yml template.
    • This will require GKE runners to be deployed for the Terraform Modules sub-group.
  • Implement automated releases using our new release template.
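
For the terraform test item above, a minimal test file might look like the following. It is a sketch only: the file path, the input variable names and the resource address google_cloud_run_v2_service.webapp are assumptions about how the rewritten module could be laid out. It uses command = plan, so nothing is created, but the Google provider would still need to be configured in the CI job.

```hcl
# tests/defaults.tftest.hcl (hypothetical path)
# Sketch only: variable names and the resource address are assumptions.

variables {
  project = "example-project"
  name    = "webapp"
  region  = "europe-west2"
}

run "plan_with_defaults" {
  # Plan only, so no real infrastructure is created.
  command = plan

  assert {
    condition     = google_cloud_run_v2_service.webapp.name == "webapp"
    error_message = "Service name should be taken from var.name."
  }

  assert {
    condition     = google_cloud_run_v2_service.webapp.location == "europe-west2"
    error_message = "Service location should be taken from var.region."
  }
}
```

Running terraform init followed by terraform test in a CI matrix of Terraform core and provider versions would then surface mismatches such as a resource type that the minimum allowed provider version does not support.
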
Edited by Ryan Kowalewski