[Bug]: Self-hosting e2b on GCP #1767

@KushGabani

Description

Sandbox ID or Build ID

No response

Environment

macOS 26.2
terraform 1.5.7

Timestamp of the issue

2026-01-24 5:23 PM (MT)

Frequency

Happens every time

Expected behavior

I am trying to self-host e2b on GCP by following the guide in self-host.md, and I expect make apply to deploy the infrastructure successfully.

Actual behavior

I have tried this three times, completely tearing down the infrastructure and starting fresh each time. At step 11, when I run make apply after running make plan, I get the following errors and am unable to deploy the infrastructure:

╵
╷
│ Error: Provider produced inconsistent result after apply
│ 
│ When applying changes to module.nomad.nomad_job.clickhouse_backup[0], provider
│ "module.nomad.provider[\"registry.terraform.io/hashicorp/nomad\"]" produced an unexpected new value: Root resource was present, but
│ now absent.
│ 
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵
╷
│ Error: Provider produced inconsistent result after apply
│ 
│ When applying changes to module.nomad.nomad_job.api, provider "module.nomad.provider[\"registry.terraform.io/hashicorp/nomad\"]"
│ produced an unexpected new value: Root resource was present, but now absent.
│ 
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵
╷
│ Error: Provider produced inconsistent result after apply
│ 
│ When applying changes to module.nomad.nomad_job.clickhouse_backup_restore[0], provider
│ "module.nomad.provider[\"registry.terraform.io/hashicorp/nomad\"]" produced an unexpected new value: Root resource was present, but
│ now absent.
│ 
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵
╷
│ Error: error applying jobspec: Unexpected response code: 500 (1 error occurred:
│ 	* job "clickhouse-migrator" is in nonexistent node pool "clickhouse")
│ 
│   with module.nomad.nomad_job.clickhouse_migrator[0],
│   on nomad/main.tf line 695, in resource "nomad_job" "clickhouse_migrator":
│  695: resource "nomad_job" "clickhouse_migrator" {
│ 
╵
make[1]: *** [apply] Error 1
make: *** [apply] Error 2
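
The last error points at a missing Nomad node pool. For anyone triaging this, the pool can be checked and, if needed, created directly against the cluster. This is only a rough sketch, assuming Nomad 1.6+ (which introduced node pools) and a CLI that can reach the server; the spec file name is illustrative:

# List the node pools the cluster currently knows about
nomad node pool list

# If "clickhouse" is missing, it can be created from a small spec file
cat > clickhouse-pool.hcl <<'EOF'
node_pool "clickhouse" {
  description = "Node pool expected by the clickhouse-* jobs"
}
EOF
nomad node pool apply clickhouse-pool.hcl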

Issue reproduction

Additionally, here are a few workarounds I went through before hitting this error:

  1. Step 10 of the self-host guide says to fill out the secret for the Postgres connection string, but that secret was not created automatically, so I created it manually in GCP (a rough gcloud sketch is included after this list).
  2. I ran step 11 (make apply after make plan). It threw the errors above and also complained that it could not create the secret because it already existed, so I deleted the one I had created manually.
  3. I ran it again; it failed because Terraform had created an empty secret with no value in it.
  4. I filled in the value for the Postgres connection string secret and ran make plan and make apply again.
  5. I received the error logs pasted above.
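
For reference, creating the secret manually in GCP (step 1 above) looked roughly like this; the secret name and connection string are illustrative and have to match whatever the Terraform config expects:

# Create the secret and add the connection string as its first version
gcloud secrets create postgres-connection-string --replication-policy="automatic"
echo -n "postgresql://USER:PASSWORD@HOST:5432/DBNAME" | \
  gcloud secrets versions add postgres-connection-string --data-file=-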

Additional context

Terraform plan produced once all the secrets were correctly in place:

Terraform will perform the following actions:

module.nomad.nomad_job.api will be created

  • resource "nomad_job" "api" {
    • allocation_ids = (known after apply)
    • datacenters = [
      • "us-central1-f",
        ]
    • deployment_id = (known after apply)
    • deployment_status = (known after apply)
    • deregister_on_destroy = true
    • deregister_on_id_change = true
    • detach = true
    • hcl1 = false
    • id = (known after apply)
    • jobspec = (sensitive value)
    • modify_index = (known after apply)
    • name = "api"
    • namespace = "default"
    • read_allocation_ids = false
    • region = (known after apply)
    • rerun_if_dead = false
    • status = (known after apply)
    • task_groups = [
      • {
        • count = 1
        • meta = (known after apply)
        • name = "api-service"
        • task = [
          • {
            • driver = "docker"
            • meta = (known after apply)
            • name = "start"
              },
          • {
            • driver = "docker"
            • meta = (known after apply)
            • name = "db-migrator"
              },
              ]
              },
              ]
    • type = (known after apply)
      }

module.nomad.nomad_job.clean_nfs_cache[0] will be created

  • resource "nomad_job" "clean_nfs_cache" {
    • allocation_ids = (known after apply)
    • datacenters = [
      • "*",
        ]
    • deployment_id = (known after apply)
    • deployment_status = (known after apply)
    • deregister_on_destroy = true
    • deregister_on_id_change = true
    • detach = true
    • hcl1 = false
    • id = (known after apply)
    • jobspec = (sensitive value)
    • modify_index = (known after apply)
    • name = "filestore-cleanup"
    • namespace = "default"
    • read_allocation_ids = false
    • region = (known after apply)
    • rerun_if_dead = false
    • status = (known after apply)
    • task_groups = [
      • {
        • count = 1
        • meta = (known after apply)
        • name = "filestore-cleanup"
        • task = [
          • {
            • driver = "raw_exec"
            • meta = (known after apply)
            • name = "filestore-cleanup"
              },
              ]
              },
              ]
    • type = "batch"
      }

module.nomad.nomad_job.clickhouse[0] will be created

  • resource "nomad_job" "clickhouse" {
    • allocation_ids = (known after apply)
    • datacenters = (known after apply)
    • deployment_id = (known after apply)
    • deployment_status = (known after apply)
    • deregister_on_destroy = true
    • deregister_on_id_change = true
    • detach = true
    • hcl1 = false
    • id = (known after apply)
    • jobspec = (sensitive value)
    • modify_index = (known after apply)
    • name = "clickhouse"
    • namespace = "default"
    • read_allocation_ids = false
    • region = (known after apply)
    • rerun_if_dead = false
    • status = (known after apply)
    • task_groups = [
      • {
        • count = 1
        • meta = (known after apply)
        • name = "server-1"
        • task = [
          • {
            • driver = "docker"
            • meta = (known after apply)
            • name = "clickhouse-server"
              },
          • {
            • driver = "docker"
            • meta = (known after apply)
            • name = "otel-collector"
              },
              ]
              },
              ]
    • type = "service"
      }

module.nomad.nomad_job.clickhouse_backup[0] will be created

  • resource "nomad_job" "clickhouse_backup" {
    • allocation_ids = (known after apply)
    • datacenters = (known after apply)
    • deployment_id = (known after apply)
    • deployment_status = (known after apply)
    • deregister_on_destroy = true
    • deregister_on_id_change = true
    • detach = true
    • hcl1 = false
    • id = (known after apply)
    • jobspec = (sensitive value)
    • modify_index = (known after apply)
    • name = "clickhouse-backup"
    • namespace = "default"
    • read_allocation_ids = false
    • region = (known after apply)
    • rerun_if_dead = false
    • status = (known after apply)
    • task_groups = [
      • {
        • count = 1
        • meta = (known after apply)
        • name = "backup-server-1"
        • task = [
          • {
            • driver = "docker"
            • meta = (known after apply)
            • name = "clickhouse-backup"
              },
              ]
              },
              ]
    • type = "batch"
      }

module.nomad.nomad_job.clickhouse_backup_restore[0] will be created

  • resource "nomad_job" "clickhouse_backup_restore" {
    • allocation_ids = (known after apply)
    • datacenters = (known after apply)
    • deployment_id = (known after apply)
    • deployment_status = (known after apply)
    • deregister_on_destroy = true
    • deregister_on_id_change = true
    • detach = true
    • hcl1 = false
    • id = (known after apply)
    • jobspec = (sensitive value)
    • modify_index = (known after apply)
    • name = "clickhouse-backup-restore"
    • namespace = "default"
    • read_allocation_ids = false
    • region = (known after apply)
    • rerun_if_dead = false
    • status = (known after apply)
    • task_groups = [
      • {
        • count = 1
        • meta = (known after apply)
        • name = "backup-restore-server-1"
        • task = [
          • {
            • driver = "docker"
            • meta = (known after apply)
            • name = "clickhouse-backup-restore"
              },
              ]
              },
              ]
    • type = "batch"
      }

module.nomad.nomad_job.clickhouse_migrator[0] will be created

  • resource "nomad_job" "clickhouse_migrator" {
    • allocation_ids = (known after apply)
    • datacenters = (known after apply)
    • deployment_id = (known after apply)
    • deployment_status = (known after apply)
    • deregister_on_destroy = true
    • deregister_on_id_change = true
    • detach = true
    • hcl1 = false
    • id = (known after apply)
    • jobspec = (sensitive value)
    • modify_index = (known after apply)
    • name = "clickhouse-migrator"
    • namespace = "default"
    • read_allocation_ids = false
    • region = (known after apply)
    • rerun_if_dead = false
    • status = (known after apply)
    • task_groups = [
      • {
        • count = 1
        • meta = (known after apply)
        • name = "migrator-1"
        • task = [
          • {
            • driver = "docker"
            • meta = (known after apply)
            • name = "migrator"
              },
              ]
              },
              ]
    • type = "batch"
      }

module.nomad.nomad_job.ingress will be created

  • resource "nomad_job" "ingress" {
    • allocation_ids = (known after apply)
    • datacenters = [
      • "us-central1-f",
        ]
    • deployment_id = (known after apply)
    • deployment_status = (known after apply)
    • deregister_on_destroy = true
    • deregister_on_id_change = true
    • detach = true
    • hcl1 = false
    • id = (known after apply)
    • jobspec = (sensitive value)
    • modify_index = (known after apply)
    • name = "ingress"
    • namespace = "default"
    • read_allocation_ids = false
    • region = (known after apply)
    • rerun_if_dead = false
    • status = (known after apply)
    • task_groups = [
      • {
        • count = 1
        • meta = (known after apply)
        • name = "ingress"
        • task = [
          • {
            • driver = "docker"
            • meta = (known after apply)
            • name = "ingress"
              },
              ]
              },
              ]
    • type = (known after apply)
      }

module.nomad.nomad_job.logs_collector will be created

  • resource "nomad_job" "logs_collector" {
    • allocation_ids = (known after apply)
    • datacenters = (known after apply)
    • deployment_id = (known after apply)
    • deployment_status = (known after apply)
    • deregister_on_destroy = true
    • deregister_on_id_change = true
    • detach = true
    • hcl1 = false
    • id = (known after apply)
    • jobspec = (sensitive value)
    • modify_index = (known after apply)
    • name = "logs-collector"
    • namespace = "default"
    • read_allocation_ids = false
    • region = (known after apply)
    • rerun_if_dead = false
    • status = (known after apply)
    • task_groups = [
      • {
        • count = 1
        • meta = (known after apply)
        • name = "logs-collector"
        • task = [
          • {
            • driver = "docker"
            • meta = (known after apply)
            • name = "start-collector"
              },
              ]
              },
              ]
    • type = "system"
      }

module.nomad.nomad_job.otel_collector will be created

  • resource "nomad_job" "otel_collector" {
    • allocation_ids = (known after apply)
    • datacenters = (known after apply)
    • deployment_id = (known after apply)
    • deployment_status = (known after apply)
    • deregister_on_destroy = true
    • deregister_on_id_change = true
    • detach = true
    • hcl1 = false
    • id = (known after apply)
    • jobspec = (sensitive value)
    • modify_index = (known after apply)
    • name = "otel-collector"
    • namespace = "default"
    • read_allocation_ids = false
    • region = (known after apply)
    • rerun_if_dead = false
    • status = (known after apply)
    • task_groups = [
      • {
        • count = 1
        • meta = (known after apply)
        • name = "otel-collector"
        • task = [
          • {
            • driver = "docker"
            • meta = (known after apply)
            • name = "start-collector"
              },
              ]
              },
              ]
    • type = "system"
      }

module.nomad.nomad_job.redis[0] will be created

  • resource "nomad_job" "redis" {
    • allocation_ids = (known after apply)
    • datacenters = [
      • "us-central1-f",
        ]
    • deployment_id = (known after apply)
    • deployment_status = (known after apply)
    • deregister_on_destroy = true
    • deregister_on_id_change = true
    • detach = true
    • hcl1 = false
    • id = (known after apply)
    • jobspec = <<-EOT
      job "redis" {
      datacenters = ["us-central1-f"]
      node_pool = "api"
      type = "service"
      priority = 95

        group "redis" {
          // Try to restart the task indefinitely
          // Tries to restart every 5 seconds
          restart {
            interval         = "5s"
            attempts         = 1
            delay            = "5s"
            mode             = "delay"
          }
      
          network {
            port "redis" {
              static = "6379"
            }
          }
      
          service {
            name = "redis"
            port = "redis"
      
            check {
              type     = "tcp"
              name     = "health"
              interval = "10s"
              timeout  = "2s"
              port     = "6379"
            }
          }
      
          task "start" {
            driver = "docker"
      
            resources {
              memory_max = 4096
              memory     = 2048
              cpu        = 1000
            }
      
            config {
              network_mode = "host"
              image        = "redis:7.4.2-alpine"
              ports        = ["redis"]
              args = [
              ]
            }
          }
        }
      }
      

      EOT

    • modify_index = (known after apply)
    • name = "redis"
    • namespace = "default"
    • read_allocation_ids = false
    • region = (known after apply)
    • rerun_if_dead = false
    • status = (known after apply)
    • task_groups = [
      • {
        • count = 1
        • meta = (known after apply)
        • name = "redis"
        • task = [
          • {
            • driver = "docker"
            • meta = (known after apply)
            • name = "start"
              },
              ]
              },
              ]
    • type = "service"
      }

Plan: 10 to add, 0 to change, 0 to destroy.

Logs after make apply:

make apply
./scripts/confirm.sh prod
Please type production to manually deploy to prod
production
Proceeding...
/Applications/Xcode.app/Contents/Developer/usr/bin/make -C iac/provider-gcp apply
Applying Terraform for env: prod

module.nomad.nomad_job.clickhouse_migrator[0]: Creating...
module.nomad.nomad_job.redis[0]: Creating...
module.nomad.nomad_job.clean_nfs_cache[0]: Creating...
module.nomad.nomad_job.clickhouse_backup_restore[0]: Creating...
module.nomad.nomad_job.logs_collector: Creating...
module.nomad.nomad_job.api: Creating...
module.nomad.nomad_job.ingress: Creating...
module.nomad.nomad_job.clickhouse_backup[0]: Creating...
module.nomad.nomad_job.otel_collector: Creating...
module.nomad.nomad_job.clean_nfs_cache[0]: Creation complete after 0s [id=filestore-cleanup]
module.nomad.nomad_job.logs_collector: Creation complete after 0s [id=logs-collector]
module.nomad.nomad_job.ingress: Creation complete after 0s [id=ingress]
module.nomad.nomad_job.otel_collector: Creation complete after 0s [id=otel-collector]
module.nomad.nomad_job.clickhouse[0]: Creating...

│ Error: Provider produced inconsistent result after apply

│ When applying changes to module.nomad.nomad_job.redis[0], provider
│ "module.nomad.provider["registry.terraform.io/hashicorp/nomad"]" produced an unexpected new value: Root resource was present, but
│ now absent.

│ This is a bug in the provider, which should be reported in the provider's own issue tracker.


│ Error: Provider produced inconsistent result after apply

│ When applying changes to module.nomad.nomad_job.clickhouse[0], provider
│ "module.nomad.provider["registry.terraform.io/hashicorp/nomad"]" produced an unexpected new value: Root resource was present, but
│ now absent.

│ This is a bug in the provider, which should be reported in the provider's own issue tracker.


│ Error: Provider produced inconsistent result after apply

│ When applying changes to module.nomad.nomad_job.clickhouse_backup[0], provider
│ "module.nomad.provider["registry.terraform.io/hashicorp/nomad"]" produced an unexpected new value: Root resource was present, but
│ now absent.

│ This is a bug in the provider, which should be reported in the provider's own issue tracker.


│ Error: Provider produced inconsistent result after apply

│ When applying changes to module.nomad.nomad_job.api, provider "module.nomad.provider["registry.terraform.io/hashicorp/nomad"]"
│ produced an unexpected new value: Root resource was present, but now absent.

│ This is a bug in the provider, which should be reported in the provider's own issue tracker.


│ Error: Provider produced inconsistent result after apply

│ When applying changes to module.nomad.nomad_job.clickhouse_backup_restore[0], provider
│ "module.nomad.provider["registry.terraform.io/hashicorp/nomad"]" produced an unexpected new value: Root resource was present, but
│ now absent.

│ This is a bug in the provider, which should be reported in the provider's own issue tracker.


│ Error: error applying jobspec: Unexpected response code: 500 (1 error occurred:
│ * job "clickhouse-migrator" is in nonexistent node pool "clickhouse")

│ with module.nomad.nomad_job.clickhouse_migrator[0],
│ on nomad/main.tf line 695, in resource "nomad_job" "clickhouse_migrator":
│ 695: resource "nomad_job" "clickhouse_migrator" {


make[1]: *** [apply] Error 1
make: *** [apply] Error 2
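
After the failed apply, the jobs reported as "present, but now absent" can be inspected directly on the Nomad side to see whether they were registered and then dropped. A minimal sketch, assuming the Nomad CLI can reach the cluster (job names taken from the plan above):

# Check whether the jobs that Terraform lost track of still exist
nomad job status api
nomad job status redis

# List client nodes; recent Nomad versions show the node pool for each node
nomad node status -verbose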
