# CI/CD

The platform provides standardized CI/CD pipeline templates for all services. Templates handle the common stages — lint, test, build, push, deploy — so teams don't write and maintain pipeline code themselves.

---

## How it works

When you push to your repository, GitHub Actions triggers the platform pipeline template via a [reusable workflow](https://docs.github.com/en/actions/sharing-automations/reusing-workflows). Your `.platform.yml` file controls what stages run and how.

```{mermaid}
flowchart TD
    A([Push / PR]) --> B[Lint]
    B --> C[Test]
    C --> D[Build Docker image]
    D --> E[Push to registry]
    E --> F{Branch?}
    F -->|main| G[Deploy to staging]
    F -->|PR| H([PR check passes])
    G --> I([Manual approval])
    I --> J([Deploy to production])
    style H fill:#2da44e,color:#fff
    style J fill:#2da44e,color:#fff
```

The pipeline template lives in `.github/workflows/platform-pipeline.yml` in this repository. It is the single source of truth for all pipeline logic — when the platform team updates it, all services pick up the change automatically.
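A service repository opts in with a small caller workflow. The sketch below is illustrative only: the `mycorp/platform` org/repo path and the absence of required inputs are assumptions, so check the template itself before copying.

```yaml
# .github/workflows/ci.yml in a service repository (illustrative sketch;
# the org/repo path and the lack of required inputs are assumptions)
name: CI
on:
  push:
    branches: [main]
    tags: ["v*"]
  pull_request:

jobs:
  pipeline:
    uses: mycorp/platform/.github/workflows/platform-pipeline.yml@main
    secrets: inherit
```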

---

## .platform.yml reference

```yaml
# .platform.yml — place at the root of your service repository

platform:
  version: "1"           # schema version, always "1" for now

pipeline:
  language: go           # go | python | java | node | rust
  test: true             # run tests (default: true)
  lint: true             # run linter (default: true)
  test_command: ""       # override default test command (optional)

image:
  registry: registry.mycorp.internal
  name: your-service     # image name, usually matches service name

deploy:
  staging:
    auto: true           # deploy on every merge to main
    namespace: yourteam  # override namespace from service YAML (optional)
  production:
    auto: false          # production always requires manual approval
    namespace: yourteam
```

---

## Pipeline stages

### 1. Lint

Runs the language-specific linter. Failures block the merge.

| Language | Linter |
|----------|--------|
| Go | `golangci-lint` |
| Python | `ruff` |
| Node | `eslint` |
| Java | `checkstyle` |
| Rust | `clippy` |

To skip the lint stage (not recommended), set `pipeline.lint: false` in `.platform.yml`.

### 2. Test

Runs the test suite. The platform injects test database and cache services automatically. Your tests can connect to PostgreSQL at `localhost:5432` (database `test`) and Redis at `localhost:6379`.
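In test code, it can be convenient to switch between the injected CI services and a local setup. A minimal Python sketch, assuming the default `postgres` user with no password (only host, port, and database name are documented above):

```python
import os

# Endpoints of the services the platform injects into the CI runner.
# The credentials ("postgres", no password) are assumptions; only the
# host, port, and database name are documented.
CI_POSTGRES_DSN = "postgresql://postgres@localhost:5432/test"
CI_REDIS_URL = "redis://localhost:6379/0"

def database_dsn(env=os.environ):
    """Use the injected CI database inside the pipeline, otherwise fall
    back to a locally configured one."""
    if env.get("CI") == "true":  # GitHub Actions sets CI=true on every run
        return CI_POSTGRES_DSN
    return env.get("DATABASE_URL", "postgresql://localhost:5432/dev")
```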

Default test commands by language:

::::{tab-set}
:::{tab-item} Go
```bash
go test ./... -race -coverprofile=coverage.out
```
:::
:::{tab-item} Python
```bash
pytest --tb=short -q
```
:::
:::{tab-item} Node
```bash
npm test
```
:::
:::{tab-item} Java
```bash
./gradlew test
```
:::
::::

Override the default with `pipeline.test_command` in `.platform.yml`. To disable an injected service you don't need, set it to `false`:

```yaml
pipeline:
  test_services:
    postgres: false
    redis: true
```

### 3. Build

Builds a Docker image from your `Dockerfile`. The platform passes the following build args:

| Arg | Value |
|-----|-------|
| `BUILD_DATE` | RFC 3339 timestamp |
| `GIT_COMMIT` | Short commit SHA |
| `VERSION` | Tag or `main-{sha}` |

Use them to embed version info:

```dockerfile
ARG GIT_COMMIT
ARG VERSION
LABEL org.opencontainers.image.revision=$GIT_COMMIT \
      org.opencontainers.image.version=$VERSION
```

### 4. Push

On a successful build, the image is pushed to `registry.mycorp.internal/{service-name}:{tag}`.

Tags follow this convention:

| Trigger | Image tag |
|---------|-----------|
| Push to `main` | `main-{short-sha}`, `latest` |
| Git tag `v1.2.3` | `1.2.3`, `latest` |
| Pull request | `pr-{number}` (not pushed to registry, only local cache) |
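The convention in the table can be pictured as a small function. This is an illustration, not the template's actual implementation; the 7-character short SHA and the behavior for other refs are assumptions:

```python
def image_tags(event, ref, sha):
    """Map a CI trigger to the image tags from the table above (sketch)."""
    short = sha[:7]  # length of the short SHA is an assumption
    if event == "pull_request":
        return [f"pr-{ref}"]  # ref = PR number; built but not pushed
    if ref.startswith("refs/tags/v"):
        return [ref.removeprefix("refs/tags/v"), "latest"]
    if ref == "refs/heads/main":
        return [f"main-{short}", "latest"]
    return []  # other refs (e.g. feature branches) are not documented
```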

### 5. Deploy to staging

Runs automatically after the image is pushed on `main`, provided `deploy.staging.auto: true`. The platform runs:

```bash
helm dependency update deploy/helm
helm upgrade --install {service-name} ./deploy/helm \
  -n {namespace} \
  -f deploy/helm/values.yaml \
  -f deploy/helm/values.staging.yaml \
  --set deployment.image.tag={tag}
```

`helm dependency update` pulls the latest OCI chart from `registry.mycorp.internal/charts` before each deploy.

Deployment status is reported back to the GitHub commit via a deployment status check.

### 6. Deploy to production

Production deployments require a manual approval step in the GitHub Actions UI. After approval:

```bash
helm dependency update deploy/helm
helm upgrade --install {service-name} ./deploy/helm \
  -n {namespace} \
  -f deploy/helm/values.yaml \
  --set deployment.image.tag={tag}
```

Production deployments trigger a notification to the service's Slack channel.

---

## Secrets

Secrets are stored in [Vault](https://vault.mycorp.internal) and injected into the pipeline at runtime. Do not put secrets in `.platform.yml` or in repository environment variables.

To add a secret:

1. Store it in Vault at `secret/cicd/{service-name}/{key}`
2. Reference it in your pipeline config:

```yaml
pipeline:
  secrets:
    - name: DATABASE_URL
      vault_path: secret/cicd/your-service/database-url
```

The secret is available as an environment variable during the `test` and `deploy` stages.
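Since a missing secret usually means a typo in `vault_path`, it is worth failing loudly when the variable is absent rather than falling through to a confusing error later. A hypothetical helper:

```python
import os

def require_secret(name):
    """Read a pipeline-injected secret, failing loudly if it is absent
    (e.g. the code ran outside the test/deploy stages, or the secret's
    vault_path in .platform.yml is wrong)."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"secret {name} not injected; check its vault_path in .platform.yml"
        )
    return value
```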

---

## Caching

Dependencies are cached between pipeline runs. Cache keys are based on the lock file for each language:

| Language | Lock file |
|----------|-----------|
| Go | `go.sum` |
| Python | `requirements.txt` or `poetry.lock` |
| Node | `package-lock.json` or `yarn.lock` |

If you need to force a cache reset, add `[no-cache]` to your commit message.
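Conceptually, the key is a hash of the lock file, so any dependency change invalidates the cache. A sketch (the real key format is internal to the template and is a guess here):

```python
import hashlib
from pathlib import Path

def cache_key(language, lock_file):
    """Derive a cache key from the lock file contents, so the cache is
    reused while dependencies are unchanged and invalidated when they
    change. The key format itself is an assumption."""
    digest = hashlib.sha256(Path(lock_file).read_bytes()).hexdigest()[:16]
    return f"{language}-deps-{digest}"
```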

---

## Troubleshooting

**Pipeline does not trigger after pushing `.platform.yml`**

Check that your repository has the platform GitHub App installed. See [#platform-access](https://slack.mycorp.internal/platform-access).

**Test stage fails with "connection refused" for the database**

Verify that your test code connects to `localhost` and not a container name. The platform services are exposed on `localhost` inside the CI runner.

**Image push fails with "unauthorized"**

The pipeline uses a service account token rotated weekly. If your last pipeline run was more than a week ago, re-trigger the pipeline — the new token will be picked up automatically.

**Deployment is stuck in "Pending"**

Check `kubectl get pods -n {namespace}` for `ImagePullBackOff` or `CrashLoopBackOff`. The most common cause is a missing secret or a misconfigured health check. See {ref}`Kubernetes health checks <health-checks>` for health check requirements.
