diff --git a/docs/_snippets/_S3_authentication_and_bucket.md b/docs/_snippets/_S3_authentication_and_bucket.md index 1cf34667582..640657c855f 100644 --- a/docs/_snippets/_S3_authentication_and_bucket.md +++ b/docs/_snippets/_S3_authentication_and_bucket.md @@ -20,37 +20,41 @@ import s3_h from '@site/static/images/_snippets/s3/s3-h.png';
Create S3 buckets and an IAM user -This article demonstrates the basics of how to configure an AWS IAM user, create an S3 bucket and configure ClickHouse to use the bucket as an S3 disk. You should work with your security team to determine the permissions to be used, and consider these as a starting point. +This article demonstrates the basics of how to configure an AWS IAM user, create an S3 bucket and configure ClickHouse to use the bucket as an S3 disk. +You should work with your security team to determine the permissions to be used, and consider these as a starting point. ### Create an AWS IAM user {#create-an-aws-iam-user} -In this procedure, we'll be creating a service account user, not a login user. -1. Log into the AWS IAM Management Console. -2. In "users", select **Add users** +In the following steps you'll be creating a service account user (not a login user). + +1. Log into the AWS IAM Management Console. + +2. In the `Users` menu, select `Create user` AWS IAM Management Console - Adding a new user -3. Enter the user name and set the credential type to **Access key - Programmatic access** and select **Next: Permissions** +3. Enter the username and set the credential type to `Access key - Programmatic access` and select `Next: Permissions` Setting user name and access type for IAM user -4. Do not add the user to any group; select **Next: Tags** +4. Do not add the user to any group; select `Next: Tags` Skipping group assignment for IAM user -5. Unless you need to add any tags, select **Next: Review** +5. Unless you need to add any tags, select `Next: Review` Skipping tag assignment for IAM user -6. Select **Create User** +6. 
Select `Create User` - :::note - The warning message stating that the user has no permissions can be ignored; permissions will be granted on the bucket for the user in the next section - ::: +:::note +The warning message stating that the user has no permissions can be ignored; permissions will be granted on the bucket for the user in the next section +::: Creating the IAM user with no permissions warning -7. The user is now created; click on **show** and copy the access and secret keys. +7. The user is now created; click on `show` and copy the access and secret keys. + :::note Save the keys somewhere else; this is the only time that the secret access key will be available. ::: @@ -66,33 +70,36 @@ Save the keys somewhere else; this is the only time that the secret access key w Copying the ARN of the IAM user ### Create an S3 bucket {#create-an-s3-bucket} -1. In the S3 bucket section, select **Create bucket** + +1. In the S3 bucket section, select `Create bucket` Starting the S3 bucket creation process 2. Enter a bucket name, leave other options default + :::note The bucket name must be unique across AWS, not just the organization, or it will emit an error. ::: + 3. Leave `Block all Public Access` enabled; public access is not needed. Configuring the S3 bucket settings with public access blocked -4. Select **Create Bucket** at the bottom of the page +4. Select `Create Bucket` at the bottom of the page Finalizing S3 bucket creation 5. Select the link, copy the ARN, and save it for use when configuring the access policy for the bucket. -6. Once the bucket has been created, find the new S3 bucket in the S3 buckets list and select the link +6. Once the bucket has been created, find the new S3 bucket in the S3 bucket list and select the link Finding the newly created S3 bucket in the buckets list -7. Select **Create folder** +7. Select `Create folder` Creating a new folder in the S3 bucket -8. 
Enter a folder name that will be the target for the ClickHouse S3 disk and select **Create folder** +8. Enter a folder name that will be the target for the ClickHouse S3 disk and select `Create folder` Setting the folder name for ClickHouse S3 disk usage @@ -100,15 +107,16 @@ The bucket name must be unique across AWS, not just the organization, or it will Viewing the newly created folder in the S3 bucket -10. Select the checkbox for the new folder and click on **Copy URL** Save the URL copied to be used in the ClickHouse storage configuration in the next section. +10. Select the checkbox for the new folder and click on `Copy URL` Save the URL copied to be used in the ClickHouse storage configuration in the next section. Copying the S3 folder URL for ClickHouse configuration -11. Select the **Permissions** tab and click on the **Edit** button in the **Bucket Policy** section +11. Select the `Permissions` tab and click on the `Edit` button in the `Bucket Policy` section Accessing the S3 bucket policy configuration 12. Add a bucket policy, example below: + ```json { "Version" : "2012-10-17", diff --git a/docs/cloud/guides/data_sources/02_accessing-s3-data-securely.md b/docs/cloud/guides/data_sources/02_accessing-s3-data-securely.md index e90da00c552..addbc11dece 100644 --- a/docs/cloud/guides/data_sources/02_accessing-s3-data-securely.md +++ b/docs/cloud/guides/data_sources/02_accessing-s3-data-securely.md @@ -10,45 +10,55 @@ doc_type: 'guide' import Image from '@theme/IdealImage'; import secure_s3 from '@site/static/images/cloud/security/secures3.png'; import s3_info from '@site/static/images/cloud/security/secures3_arn.png'; -import s3_output from '@site/static/images/cloud/security/secures3_output.jpg'; +import s3_output from '@site/static/images/cloud/security/secures3_output.png'; This article demonstrates how ClickHouse Cloud customers can leverage role-based access to authenticate with Amazon Simple Storage Service (S3) and access their data securely. 
- -## Introduction {#introduction} - Before diving into the setup for secure S3 access, it is important to understand how this works. Below is an overview of how ClickHouse services can access private S3 buckets by assuming into a role within customers' AWS account. Overview of Secure S3 Access with ClickHouse +
+Overview of Secure S3 Access with ClickHouse +
This approach allows customers to manage all access to their S3 buckets in a single place (the IAM policy of the assumed-role) without having to go through all of their bucket policies to add or remove access. +In the section below, you will learn how to set this up. -## Setup {#setup} - -### Obtaining the ClickHouse service IAM role ARN {#obtaining-the-clickhouse-service-iam-role-arn} +## Obtain the IAM role ARN of your ClickHouse service {#obtaining-the-clickhouse-service-iam-role-arn} -1 - Login to your ClickHouse cloud account. +1. Log in to your ClickHouse Cloud account. -2 - Select the ClickHouse service you want to create the integration +2. Select the ClickHouse service for which you want to create the integration -3 - Select the **Settings** tab +3. Select the **Settings** tab -4 - Scroll down to the **Network security information** section at the bottom of the page +4. Scroll down to the **Network security information** section at the bottom of the page -5 - Copy the **Service role ID (IAM)** value belong to the service as shown below. +5. Copy the **Service role ID (IAM)** value belonging to the service as shown below. Obtaining ClickHouse service IAM Role ARN -### Setting up IAM assume role {#setting-up-iam-assume-role} +## Set up IAM assume role {#setting-up-iam-assume-role} -#### Option 1: Deploying with CloudFormation stack {#option-1-deploying-with-cloudformation-stack} +The IAM assume role can be set up in one of two ways: +- [Using CloudFormation stack](#option-1-deploying-with-cloudformation-stack) +- [Manually creating an IAM role](#option-2-manually-create-iam-role) -1 - Login to your AWS Account in the web browser with an IAM user that has permission to create & manage IAM role. 
+### Deploying with CloudFormation stack {#option-1-deploying-with-cloudformation-stack} -2 - Visit [this url](https://us-west-2.console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks/quickcreate?templateURL=https://s3.us-east-2.amazonaws.com/clickhouse-public-resources.clickhouse.cloud/cf-templates/secure-s3.yaml&stackName=ClickHouseSecureS3) to populate the CloudFormation stack. +1. Log in to your AWS account in the web browser with an IAM user that has permission to create and manage IAM roles. -3 - Enter (or paste) the **IAM Role** belong to the ClickHouse service +2. Visit the following [CloudFormation URL](https://us-west-2.console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks/quickcreate?templateURL=https://s3.us-east-2.amazonaws.com/clickhouse-public-resources.clickhouse.cloud/cf-templates/secure-s3.yaml&stackName=ClickHouseSecureS3) to populate the CloudFormation stack. -4 - Configure the CloudFormation stack. Below is additional information about these parameters. +3. Enter (or paste) the **service role ID (IAM)** for your service that you obtained earlier into the input titled "ClickHouse Instance Roles". + You can paste the service role ID exactly as it appears in the Cloud console. + +4. Enter your bucket name in the input titled "Bucket Names". If your bucket URL is `https://ch-docs-s3-bucket.s3.eu-central-1.amazonaws.com/clickhouseS3/`, then the bucket name is `ch-docs-s3-bucket`. + +:::note +Enter only the bucket name, not the full bucket ARN. +::: + +5. Configure the CloudFormation stack. Below is additional information about these parameters. | Parameter | Default Value | Description | | :--- | :----: | :---- | @@ -58,29 +68,27 @@ This approach allows customers to manage all access to their S3 buckets in a sin | Bucket Access | Read | Sets the level of access for the provided buckets. | | Bucket Names | | Comma separated list of **bucket names** that this role will have access to. 
| -*Note*: Do not put the full bucket Arn but instead just the bucket name only. - -5 - Select the **I acknowledge that AWS CloudFormation might create IAM resources with custom names.** checkbox +6. Select the **I acknowledge that AWS CloudFormation might create IAM resources with custom names.** checkbox -6 - Click **Create stack** button at bottom right +7. Click the **Create stack** button at the bottom right -7 - Make sure the CloudFormation stack completes with no error. +8. Make sure the CloudFormation stack completes without errors. -8 - Select the **Outputs** of the CloudFormation stack +9. Select the newly created stack, then select the **Outputs** tab of the CloudFormation stack -9 - Copy the **RoleArn** value for this integration. This is what needed to access your S3 bucket. +10. Copy the **RoleArn** value for this integration, which is what you need to access your S3 bucket. CloudFormation stack output showing IAM Role ARN -#### Option 2: Manually create IAM role {#option-2-manually-create-iam-role} +### Manually create IAM role {#option-2-manually-create-iam-role} -1 - Login to your AWS Account in the web browser with an IAM user that has permission to create & manage IAM role. +1. Log in to your AWS account in the web browser with an IAM user that has permission to create and manage IAM roles. -2 - Browse to IAM Service Console +2. Browse to the IAM Service Console -3 - Create a new IAM role with the following IAM & Trust policy. +3. Create a new IAM role with the following IAM and trust policies -Trust policy (Please replace `{ClickHouse_IAM_ARN}` with the IAM Role arn belong to your ClickHouse instance): +Trust policy (Please replace `{ClickHouse_IAM_ARN}` with the IAM role ARN belonging to your ClickHouse instance): ```json { @@ -127,22 +135,25 @@ IAM policy (Please replace `{BUCKET_NAME}` with your bucket name): } ``` -4 - Copy the new **IAM Role Arn** after creation. This is what needed to access your S3 bucket. +4. 
Copy the new **IAM Role ARN** after creation, which is what you need to access your S3 bucket. ## Access your S3 bucket with the ClickHouseAccess role {#access-your-s3-bucket-with-the-clickhouseaccess-role} -ClickHouse Cloud has a new feature that allows you to specify `extra_credentials` as part of the S3 table function. Below is an example of how to run a query using the newly created role copied from above. +ClickHouse Cloud allows you to specify `extra_credentials` as part of the S3 table function. +Below is an example of how to run a query using the newly created role copied from above. ```sql DESCRIBE TABLE s3('https://s3.amazonaws.com/BUCKETNAME/BUCKETOBJECT.csv','CSVWithNames',extra_credentials(role_arn = 'arn:aws:iam::111111111111:role/ClickHouseAccessRole-001')) ``` -Below is an example query that uses the `role_session_name` as a shared secret to query data from a bucket. If the `role_session_name` is not correct, this operation will fail. +Below is an example query that uses the `role_session_name` as a shared secret to query data from a bucket. +If the `role_session_name` is not correct, this operation will fail. ```sql DESCRIBE TABLE s3('https://s3.amazonaws.com/BUCKETNAME/BUCKETOBJECT.csv','CSVWithNames',extra_credentials(role_arn = 'arn:aws:iam::111111111111:role/ClickHouseAccessRole-001', role_session_name = 'secret-role-name')) ``` :::note -We recommend that your source S3 is in the same region as your ClickHouse Cloud Service to reduce on data transfer costs. For more information, refer to [S3 pricing]( https://aws.amazon.com/s3/pricing/) +We recommend that your source S3 bucket is in the same region as your ClickHouse Cloud service to reduce data transfer costs. 
+For more information, refer to [S3 pricing]( https://aws.amazon.com/s3/pricing/) ::: diff --git a/docs/cloud/onboard/02_migrate/01_migration_guides/07_OSS_to_Cloud/01_clickhouse-to-cloud.md b/docs/cloud/onboard/02_migrate/01_migration_guides/07_OSS_to_Cloud/01_clickhouse-to-cloud_with_remotesecure.md similarity index 98% rename from docs/cloud/onboard/02_migrate/01_migration_guides/07_OSS_to_Cloud/01_clickhouse-to-cloud.md rename to docs/cloud/onboard/02_migrate/01_migration_guides/07_OSS_to_Cloud/01_clickhouse-to-cloud_with_remotesecure.md index 9bd863522b0..9b02d7ce334 100644 --- a/docs/cloud/onboard/02_migrate/01_migration_guides/07_OSS_to_Cloud/01_clickhouse-to-cloud.md +++ b/docs/cloud/onboard/02_migrate/01_migration_guides/07_OSS_to_Cloud/01_clickhouse-to-cloud_with_remotesecure.md @@ -1,5 +1,5 @@ --- -sidebar_label: 'ClickHouse OSS' +sidebar_label: 'Using remoteSecure' slug: /cloud/migration/clickhouse-to-cloud title: 'Migrating between self-managed ClickHouse and ClickHouse Cloud' description: 'Page describing how to migrate between self-managed ClickHouse and ClickHouse Cloud' @@ -16,13 +16,13 @@ import self_managed_04 from '@site/static/images/integrations/migration/self-man import self_managed_05 from '@site/static/images/integrations/migration/self-managed-05.png'; import self_managed_06 from '@site/static/images/integrations/migration/self-managed-06.png'; -# Migrating between self-managed ClickHouse and ClickHouse Cloud +# Migrating between self-managed ClickHouse and ClickHouse Cloud using remoteSecure Migrating Self-managed ClickHouse This guide will show how to migrate from a self-managed ClickHouse server to ClickHouse Cloud, and also how to migrate between ClickHouse Cloud services. The [`remoteSecure`](/sql-reference/table-functions/remote) function is used in `SELECT` and `INSERT` queries to allow access to remote ClickHouse servers, which makes migrating tables as simple as writing an `INSERT INTO` query with an embedded `SELECT`. 
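A migration of this kind can be sketched as follows; the hostname, port, credentials, and table names below are placeholders for your own values:

```sql
-- Run on the destination (ClickHouse Cloud) service after creating the target table there.
-- 'source-hostname', port 9440 (the secure native port), and the credentials are placeholders.
INSERT INTO db.table
SELECT * FROM remoteSecure('source-hostname:9440', db.table, 'default', 'PASSWORD');
```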
-## Migrating from Self-managed ClickHouse to ClickHouse Cloud {#migrating-from-self-managed-clickhouse-to-clickhouse-cloud} +## Migrating from self-managed ClickHouse to ClickHouse Cloud {#migrating-from-self-managed-clickhouse-to-clickhouse-cloud} Migrating Self-managed ClickHouse diff --git a/docs/cloud/onboard/02_migrate/01_migration_guides/07_OSS_to_Cloud/02_oss_to_cloud_backups.md b/docs/cloud/onboard/02_migrate/01_migration_guides/07_OSS_to_Cloud/02_oss_to_cloud_backups.md new file mode 100644 index 00000000000..e35e3fae969 --- /dev/null +++ b/docs/cloud/onboard/02_migrate/01_migration_guides/07_OSS_to_Cloud/02_oss_to_cloud_backups.md @@ -0,0 +1,366 @@ +--- +sidebar_label: 'Using BACKUP and RESTORE' +slug: /cloud/migration/oss-to-cloud-backup-restore +title: 'Migrating between self-managed ClickHouse and ClickHouse Cloud with BACKUP/RESTORE' +description: 'Page describing how to migrate between self-managed ClickHouse and ClickHouse Cloud using BACKUP and RESTORE commands' +doc_type: 'guide' +keywords: ['migration', 'ClickHouse Cloud', 'OSS', 'Migrate self-managed to Cloud', 'BACKUP', 'RESTORE'] +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +import Image from '@theme/IdealImage'; +import create_service from '@site/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/create_service.png'; +import service_details from '@site/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/service_details.png'; +import open_console from '@site/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/open_console.png'; +import service_role_id from '@site/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/service_role_id.png'; +import create_new_role from '@site/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/create_new_role.png'; +import backup_s3_bucket from '@site/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/backup_in_s3_bucket.png'; + +# Migrating from Self-Managed ClickHouse to 
ClickHouse Cloud Using Backup Commands + +## Overview {#overview-migration-approaches} + +There are two primary methods to migrate from self-managed ClickHouse (OSS) to ClickHouse Cloud: + +- Using the [`remoteSecure()`](/cloud/migration/clickhouse-to-cloud) function in which data is directly pulled/pushed. +- Using `BACKUP`/`RESTORE` commands via cloud object storage + +> This migration guide focuses on the `BACKUP`/`RESTORE` approach and offers a practical example of migrating a database or full service in open source ClickHouse to Cloud via an S3 bucket. + +**Prerequisites** +- You have Docker installed +- You have an [S3 bucket and IAM user](/integrations/s3/creating-iam-user-and-s3-bucket) +- You're able to create a new ClickHouse Cloud service + +To make the steps in this guide easy to follow and reproducible, we'll use one of the Docker Compose recipes +for a ClickHouse cluster with two shards and two replicas. + +:::note[Cluster required] +This backup method requires a ClickHouse cluster because tables must be converted from the `MergeTree` engine to `ReplicatedMergeTree`. +If you're running a single instance, follow the steps in ["Migrating between self-managed ClickHouse and ClickHouse Cloud using remoteSecure"](/cloud/migration/clickhouse-to-cloud) instead. +::: + +## OSS preparation {#oss-setup} + +1. Clone the [examples repository](https://github.com/ClickHouse/examples) to your local machine +2. From your terminal, `cd` into `examples/docker-compose-recipes/recipes/cluster_2S_2R` +3. 
Make sure Docker is running, then start the ClickHouse cluster: + +```bash +docker compose up +``` + +You should see: + +```bash +[+] Running 7/7 + ✔ Container clickhouse-keeper-01 Created 0.1s + ✔ Container clickhouse-keeper-02 Created 0.1s + ✔ Container clickhouse-keeper-03 Created 0.1s + ✔ Container clickhouse-01 Created 0.1s + ✔ Container clickhouse-02 Created 0.1s + ✔ Container clickhouse-04 Created 0.1s + ✔ Container clickhouse-03 Created 0.1s +``` + +From a new terminal window at the root of the folder run the following command to connect to the first node of the cluster: + +```bash +docker exec -it clickhouse-01 clickhouse-client +``` + +### Create sample data {#create-sample-data} + +For this guide, we'll use the New York taxi dataset as sample data. +Follow the first two steps of the [New York taxi data guide](/getting-started/example-datasets/nyc-taxi) to create the table and load data. + +Run the following commands to create a new database and insert data from an S3 bucket into a new table: + +```sql +CREATE DATABASE nyc_taxi; + +CREATE TABLE nyc_taxi.trips_small ( + trip_id UInt32, + pickup_datetime DateTime, + dropoff_datetime DateTime, + pickup_longitude Nullable(Float64), + pickup_latitude Nullable(Float64), + dropoff_longitude Nullable(Float64), + dropoff_latitude Nullable(Float64), + passenger_count UInt8, + trip_distance Float32, + fare_amount Float32, + extra Float32, + tip_amount Float32, + tolls_amount Float32, + total_amount Float32, + payment_type Enum('CSH' = 1, 'CRE' = 2, 'NOC' = 3, 'DIS' = 4, 'UNK' = 5), + pickup_ntaname LowCardinality(String), + dropoff_ntaname LowCardinality(String) +) +ENGINE = MergeTree +PRIMARY KEY (pickup_datetime, dropoff_datetime); +``` + +```sql +INSERT INTO nyc_taxi.trips_small +SELECT + trip_id, + pickup_datetime, + dropoff_datetime, + pickup_longitude, + pickup_latitude, + dropoff_longitude, + dropoff_latitude, + passenger_count, + trip_distance, + fare_amount, + extra, + tip_amount, + tolls_amount, + 
total_amount, + payment_type, + pickup_ntaname, + dropoff_ntaname +FROM s3( + 'https://datasets-documentation.s3.eu-west-3.amazonaws.com/nyc-taxi/trips_{0..2}.gz', + 'TabSeparatedWithNames' +); +``` + +In the `CREATE TABLE` DDL statement we specified the table engine type as `MergeTree`; however, +ClickHouse Cloud works with [`SharedMergeTree`](/cloud/reference/shared-merge-tree). When restoring a backup, ClickHouse automatically converts `ReplicatedMergeTree` to `SharedMergeTree`. +However, you'll need to convert any `MergeTree` tables to `ReplicatedMergeTree` before backing them up. + +Run the following command to `DETACH` the table. + +```sql +DETACH TABLE nyc_taxi.trips_small; +``` + +Then attach it as replicated: + +```sql +ATTACH TABLE nyc_taxi.trips_small AS REPLICATED; +``` + +Finally, restore the replica metadata: + +```sql +SYSTEM RESTORE REPLICA nyc_taxi.trips_small; +``` + +Check that it was converted to `ReplicatedMergeTree`: + +```sql +SELECT engine +FROM system.tables +WHERE name = 'trips_small' AND database = 'nyc_taxi'; + +┌─engine──────────────┐ +│ ReplicatedMergeTree │ +└─────────────────────┘ +``` + +You're now ready to proceed with setting up your Cloud service in preparation for later +restoring a backup from your S3 bucket. + +## Cloud preparation {#cloud-setup} + +You will be restoring your data into a new Cloud service. +Follow the steps below to create a new Cloud service. 
+ + + +#### Open Cloud Console {#open-cloud-console} + +Go to [https://console.clickhouse.cloud/](https://console.clickhouse.cloud/) + +#### Create a new service {#create-new-service} + +create a new service + +#### Configure and create a service {#configure-and-create} + +Choose your desired region and configuration, then click `Create service` + +setup service preferences + +#### Create an access role {#create-an-access-role} + +Open SQL console + +setup service preferences + +### Set up S3 access {#set-up-s3-access} + +To restore your backup from S3, you'll need to configure secure access between ClickHouse Cloud and your S3 bucket. + +1. Follow the steps in ["Accessing S3 data securely"](/cloud/data-sources/secure-s3) to create an access role and obtain the role ARN. + +2. Update the S3 bucket policy you created in ["How to create an S3 bucket and IAM role"](/integrations/s3/creating-iam-user-and-s3-bucket) by adding the role ARN from the previous step. + +Your updated policy for the S3 bucket will look something like this: + +```json +{ + "Version": "2012-10-17", + "Id": "Policy123456", + "Statement": [ + { + "Sid": "abc123", + "Effect": "Allow", + "Principal": { + "AWS": [ +#highlight-start + "arn:aws:iam::123456789123:role/ClickHouseAccess-001", + "arn:aws:iam::123456789123:user/docs-s3-user" +#highlight-end + ] + }, + "Action": "s3:*", + "Resource": [ + "arn:aws:s3:::ch-docs-s3-bucket", + "arn:aws:s3:::ch-docs-s3-bucket/*" + ] + } + ] +} +``` + +The policy includes both ARNs: +- **IAM user** (`docs-s3-user`): Allows your self-managed ClickHouse cluster to back up to S3 +- **ClickHouse Cloud role** (`ClickHouseAccess-001`): Allows your Cloud service to restore from S3 + + + +## Taking the backup (on self-managed deployment) {#taking-a-backup-on-oss} + +To make a backup of a single database, run the following command from clickhouse-client +connected to your OSS deployment: + +```sql +BACKUP DATABASE nyc_taxi +TO S3( + 'BUCKET_URL', + 'KEY_ID', + 'SECRET_KEY' 
+) +``` + +Replace `BUCKET_URL`, `KEY_ID` and `SECRET_KEY` with your own AWS credentials. +The guide ["How to create an S3 bucket and IAM role"](/integrations/s3/creating-iam-user-and-s3-bucket) +shows you how to obtain these if you do not yet have them. + +If everything is correctly configured, you will see a response similar to the one below +containing a unique ID assigned to the backup and the status of the backup. + +```response +Query id: efcaf053-75ed-4924-aeb1-525547ea8d45 + +┌─id───────────────────────────────────┬─status─────────┐ +│ e73b99ab-f2a9-443a-80b4-533efe2d40b3 │ BACKUP_CREATED │ +└──────────────────────────────────────┴────────────────┘ +``` + +If you check your previously empty S3 bucket, you will now see some folders have appeared: + +backup, data and metadata + +If you're performing a full migration then you can run the following command to back up the entire server: + +```sql +BACKUP +TABLE system.users, +TABLE system.roles, +TABLE system.settings_profiles, +TABLE system.row_policies, +TABLE system.quotas, +TABLE system.functions, +ALL EXCEPT DATABASES INFORMATION_SCHEMA, information_schema, system +TO S3( + 'BUCKET_URL', + 'KEY_ID', + 'SECRET_KEY' +) +SETTINGS + compression_method='lzma', + compression_level=3; +``` + +The command above backs up: +- All user databases and tables +- User accounts and passwords +- Roles and permissions +- Settings profiles +- Row policies +- Quotas +- User-defined functions + +If you're using a different cloud service provider, you can use the `TO S3()` syntax (for both AWS and GCP) or the `TO AzureBlobStorage()` syntax. 
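For reference, a backup to Azure Blob Storage might look like the following sketch; the connection string, container name, and path below are placeholders for your own values:

```sql
BACKUP DATABASE nyc_taxi
TO AzureBlobStorage(
    '<connection_string>',
    '<container_name>',
    '<path_to_backup>'
);
```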
+ +For very large databases, consider using `ASYNC` to run the backup in the background: + +```sql +BACKUP DATABASE my_database +TO S3('https://your-bucket.s3.amazonaws.com/backup.zip', 'key', 'secret') +ASYNC; + +-- Returns immediately with backup ID +-- Example result: +-- ┌─id──────────────────────────────────┬─status────────────┐ +-- │ abc123-def456-789 │ CREATING_BACKUP │ +-- └─────────────────────────────────────┴───────────────────┘ +``` + +The backup ID can then be used to monitor the progress of the backup: + +```sql +SELECT * +FROM system.backups +WHERE id = 'abc123-def456-789' +``` + +It is also possible to take incremental backups. +For more detail on backups in general, see the documentation for [backup and restore](/operations/backup). + +## Restore to ClickHouse Cloud {#restore-to-clickhouse-cloud} + +To restore a single database, run the following query from your Cloud service, substituting your AWS credentials below, +setting `ROLE_ARN` equal to the value which you obtained as output of the steps detailed +in ["Accessing S3 data securely"](/cloud/data-sources/secure-s3) + +```sql +RESTORE DATABASE nyc_taxi +FROM S3( + 'BUCKET_URL', + extra_credentials(role_arn = 'ROLE_ARN') +) +``` + +You can do a full service restore in a similar manner: + +```sql +RESTORE + TABLE system.users, + TABLE system.roles, + TABLE system.settings_profiles, + TABLE system.row_policies, + TABLE system.quotas, + ALL EXCEPT DATABASES INFORMATION_SCHEMA, information_schema, system +FROM S3( + 'BUCKET_URL', + extra_credentials(role_arn = 'ROLE_ARN') +) +``` + +If you now run the following query in Cloud, you can see that the database and table have been +successfully restored on Cloud: + +```sql +SELECT count(*) FROM nyc_taxi.trips_small; +3000317 +``` diff --git a/docs/integrations/data-ingestion/s3/creating-an-s3-iam-role-and-bucket.md b/docs/integrations/data-ingestion/s3/creating-an-s3-iam-role-and-bucket.md new file mode 100644 index 
00000000000..45a420ab47c --- /dev/null +++ b/docs/integrations/data-ingestion/s3/creating-an-s3-iam-role-and-bucket.md @@ -0,0 +1,167 @@ +--- +title: 'How to create an AWS IAM user and S3 bucket' +description: 'How to create an AWS IAM user and S3 bucket.' +keywords: ['AWS', 'IAM', 'S3 bucket'] +slug: /integrations/s3/creating-iam-user-and-s3-bucket +sidebar_label: 'How to create an AWS IAM user and S3 bucket' +doc_type: 'guide' +--- + +import Image from '@theme/IdealImage'; +import s3_1 from '@site/static/images/_snippets/s3/2025/s3-1.png'; +import s3_2 from '@site/static/images/_snippets/s3/2025/s3-2.png'; +import s3_3 from '@site/static/images/_snippets/s3/2025/s3-3.png'; +import s3_4 from '@site/static/images/_snippets/s3/2025/s3-4.png'; +import s3_5 from '@site/static/images/_snippets/s3/2025/s3-5.png'; +import s3_6 from '@site/static/images/_snippets/s3/2025/s3-6.png'; +import s3_7 from '@site/static/images/_snippets/s3/2025/s3-7.png'; +import s3_8 from '@site/static/images/_snippets/s3/2025/s3-8.png'; +import s3_9 from '@site/static/images/_snippets/s3/2025/s3-9.png'; +import s3_10 from '@site/static/images/_snippets/s3/2025/s3-10.png'; +import s3_11 from '@site/static/images/_snippets/s3/2025/s3-11.png'; +import s3_12 from '@site/static/images/_snippets/s3/2025/s3-12.png'; +import s3_13 from '@site/static/images/_snippets/s3/2025/s3-13.png'; +import s3_14 from '@site/static/images/_snippets/s3/2025/s3-14.png'; +import s3_15 from '@site/static/images/_snippets/s3/2025/s3-15.png'; +import s3_16 from '@site/static/images/_snippets/s3/2025/s3-16.png'; +import s3_17 from '@site/static/images/_snippets/s3/2025/s3-17.png'; +import s3_18 from '@site/static/images/_snippets/s3/2025/s3-18.png'; +import s3_19 from '@site/static/images/_snippets/s3/2025/s3-19.png'; +import s3_20 from '@site/static/images/_snippets/s3/2025/s3-20.png'; + +> This guide shows you how you can set up an IAM user and S3 bucket in AWS, +> a prerequisite step for taking backups to S3 or 
configuring ClickHouse to +> store data on S3 + +## Create an AWS IAM user {#create-an-aws-iam-user} + +In this procedure, we'll be creating a service account user, not a login user. + +1. Log in to the AWS IAM Management Console. + +2. In the `Users` tab, select `Create user` + +AWS IAM Management Console - Adding a new user + +3. Enter a username + +AWS IAM Management Console - Adding a new user + +4. Select `Next` + +AWS IAM Management Console - Adding a new user + +5. Select `Next` + +AWS IAM Management Console - Adding a new user + +6. Select `Create user` + +The user is now created. +Click on the newly created user + +AWS IAM Management Console - Adding a new user + +7. Select `Create access key` + +AWS IAM Management Console - Adding a new user + +8. Select `Application running outside AWS` + +AWS IAM Management Console - Adding a new user + +9. Select `Create access key` + +AWS IAM Management Console - Adding a new user + +10. Download your access key and secret as a .csv for use later + +AWS IAM Management Console - Adding a new user + +## Create an S3 bucket {#create-an-s3-bucket} + +1. In the S3 bucket section, select **Create bucket** + +AWS IAM Management Console - Adding a new user + +2. Enter a bucket name, leave other options default + +AWS IAM Management Console - Adding a new user + +:::note +The bucket name must be unique across AWS, not just the organization, or AWS will return an error. +::: + +3. Leave `Block all Public Access` enabled; public access is not needed. + +AWS IAM Management Console - Adding a new user + +4. Select **Create Bucket** at the bottom of the page + +AWS IAM Management Console - Adding a new user + +5. Select the link, copy the ARN, and save it for use when configuring the access policy for the bucket + +AWS IAM Management Console - Adding a new user + +6. 
Once the bucket has been created, find the new S3 bucket in the S3 buckets list and select the bucket name, which will take you to the page shown below: + +AWS IAM Management Console - Adding a new user + +7. Select `Create folder` + +8. Enter a folder name that will be the target for the ClickHouse S3 disk or backup and select `Create folder` at the bottom of the page + +AWS IAM Management Console - Adding a new user + +9. The folder should now be visible on the bucket list + +AWS IAM Management Console - Adding a new user + +10. Select the checkbox for the new folder and click on `Copy URL`. Save the URL for use in the ClickHouse storage configuration in the next section. + +AWS IAM Management Console - Adding a new user + +11. Select the **Permissions** tab and click on the **Edit** button in the **Bucket Policy** section + +AWS IAM Management Console - Adding a new user + +12. Add a bucket policy; an example is shown below + +```json +{ + "Version": "2012-10-17", + "Id": "Policy123456", + "Statement": [ + { + "Sid": "abc123", + "Effect": "Allow", + "Principal": { + "AWS": "arn:aws:iam::782985192762:user/docs-s3-user" + }, + "Action": "s3:*", + "Resource": [ + "arn:aws:s3:::ch-docs-s3-bucket", + "arn:aws:s3:::ch-docs-s3-bucket/*" + ] + } + ] +} +``` + +|Parameter | Description | Example Value | +|----------|-------------|----------------| +|Version | Version of the policy interpreter, leave as-is | 2012-10-17 | +|Sid | User-defined policy ID | abc123 | +|Effect | Whether user requests will be allowed or denied | Allow | +|Principal | The accounts or user that will be allowed | arn:aws:iam::782985192762:user/docs-s3-user | +|Action | The operations allowed on the bucket | s3:* | +|Resource | The resources in the bucket on which operations are allowed | "arn:aws:s3:::ch-docs-s3-bucket", "arn:aws:s3:::ch-docs-s3-bucket/*" | + +:::note +You should work with your security team to determine the permissions to be used; consider these as a starting point. 
+For more information on Policies and settings, refer to AWS documentation: +https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-policy-language-overview.html +::: + +13. Save the policy configuration diff --git a/docs/integrations/data-ingestion/s3/index.md b/docs/integrations/data-ingestion/s3/index.md index 60b1034df4e..a2744346cf4 100644 --- a/docs/integrations/data-ingestion/s3/index.md +++ b/docs/integrations/data-ingestion/s3/index.md @@ -686,9 +686,7 @@ The following notes cover the implementation of S3 interactions with ClickHouse. ## Use S3 object storage as a ClickHouse disk {#configuring-s3-for-clickhouse-use} -If you need step-by-step instructions to create buckets and an IAM role, then expand **Create S3 buckets and an IAM role** and follow along: - - +If you need step-by-step instructions to create buckets and an IAM role, please refer to ["How to create an AWS IAM user and S3 bucket"](/integrations/s3/creating-iam-user-and-s3-bucket) ### Configure ClickHouse to use the S3 bucket as a disk {#configure-clickhouse-to-use-the-s3-bucket-as-a-disk} The following example is based on a Linux Deb package installed as a service with default ClickHouse directories. 
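The hunk above refers to a storage configuration example that is not shown in this diff. As a rough sketch only, a minimal S3 disk definition placed under `/etc/clickhouse-server/config.d/` might look like the following (the endpoint URL, region, credentials, and disk/policy names here are placeholders, not values taken from this PR):

```xml
<clickhouse>
  <storage_configuration>
    <disks>
      <s3_disk>
        <type>s3</type>
        <!-- Placeholder: the bucket/folder URL copied in step 10 above -->
        <endpoint>https://ch-docs-s3-bucket.s3.us-east-2.amazonaws.com/data/</endpoint>
        <!-- Placeholders: the access key and secret downloaded as a .csv earlier -->
        <access_key_id>your_access_key_id</access_key_id>
        <secret_access_key>your_secret_access_key</secret_access_key>
        <metadata_path>/var/lib/clickhouse/disks/s3_disk/</metadata_path>
      </s3_disk>
    </disks>
    <policies>
      <s3_main>
        <volumes>
          <main>
            <disk>s3_disk</disk>
          </main>
        </volumes>
      </s3_main>
    </policies>
  </storage_configuration>
</clickhouse>
```

A table could then opt into the disk with `SETTINGS storage_policy = 's3_main'`; check the ClickHouse documentation for the authoritative configuration shape.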
diff --git a/scripts/sed_links.sh b/scripts/sed_links.sh
index 53edcbdbb1f..b93c213022e 100755
--- a/scripts/sed_links.sh
+++ b/scripts/sed_links.sh
@@ -24,6 +24,7 @@ if [[ "$OSTYPE" == "darwin"* ]]; then
   sed -i '' 's|(/cloud/security/secure-s3#access-your-s3-bucket-with-the-clickhouseaccess-role)|(/cloud/data-sources/secure-s3#access-your-s3-bucket-with-the-clickhouseaccess-role)|g' docs/sql-reference/table-functions/s3.md
   sed -i '' 's|(/cloud/security/secure-s3#access-your-s3-bucket-with-the-clickhouseaccess-role)|(/cloud/data-sources/secure-s3#access-your-s3-bucket-with-the-clickhouseaccess-role)|g' docs/sql-reference/table-functions/s3Cluster.md
   sed -i '' 's|(#cuttofirstsignificantsubdomaincustom)|(#cutToFirstSignificantSubdomainCustom)|g' docs/sql-reference/functions/url-functions.md
+  sed -i '' 's|(/cloud/data-sources/secure-s3#setup)|(/cloud/data-sources/secure-s3)|g' docs/sql-reference/table-functions/s3.md
 else
   # Linux
   sed -i 's|(../../quick-start\.mdx)|(/get-started/quick-start)|g' docs/operations/utilities/clickhouse-local.md
@@ -37,4 +38,5 @@ else
   sed -i 's|(/cloud/security/secure-s3#access-your-s3-bucket-with-the-clickhouseaccess-role)|(/cloud/data-sources/secure-s3#access-your-s3-bucket-with-the-clickhouseaccess-role)|g' docs/sql-reference/table-functions/s3.md
   sed -i 's|(/cloud/security/secure-s3#access-your-s3-bucket-with-the-clickhouseaccess-role)|(/cloud/data-sources/secure-s3#access-your-s3-bucket-with-the-clickhouseaccess-role)|g' docs/sql-reference/table-functions/s3Cluster.md
   sed -i 's|(#cuttofirstsignificantsubdomaincustom)|(#cutToFirstSignificantSubdomainCustom)|g' docs/sql-reference/functions/url-functions.md
+  sed -i 's|(/cloud/data-sources/secure-s3#setup)|(/cloud/data-sources/secure-s3)|g' docs/sql-reference/table-functions/s3.md
 fi
diff --git a/sidebars.js b/sidebars.js
index d9ce715b91f..19730381211 100644
--- a/sidebars.js
+++ b/sidebars.js
@@ -754,7 +754,8 @@ const sidebars = {
         collapsible: true,
         items: [
           "integrations/data-ingestion/s3/index",
-          "integrations/data-ingestion/s3/performance"
+          "integrations/data-ingestion/s3/performance",
+          "integrations/data-ingestion/s3/creating-an-s3-iam-role-and-bucket"
         ],
       },
       "integrations/data-sources/postgres",
diff --git a/static/images/_snippets/s3/2025/s3-1.png b/static/images/_snippets/s3/2025/s3-1.png
new file mode 100644
index 00000000000..2c6fc201325
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-1.png differ
diff --git a/static/images/_snippets/s3/2025/s3-10.png b/static/images/_snippets/s3/2025/s3-10.png
new file mode 100644
index 00000000000..f812d3f3b65
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-10.png differ
diff --git a/static/images/_snippets/s3/2025/s3-11.png b/static/images/_snippets/s3/2025/s3-11.png
new file mode 100644
index 00000000000..2bb9591479f
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-11.png differ
diff --git a/static/images/_snippets/s3/2025/s3-12.png b/static/images/_snippets/s3/2025/s3-12.png
new file mode 100644
index 00000000000..d3f88f8f0d9
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-12.png differ
diff --git a/static/images/_snippets/s3/2025/s3-13.png b/static/images/_snippets/s3/2025/s3-13.png
new file mode 100644
index 00000000000..661217f9272
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-13.png differ
diff --git a/static/images/_snippets/s3/2025/s3-14.png b/static/images/_snippets/s3/2025/s3-14.png
new file mode 100644
index 00000000000..63ffe1561f6
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-14.png differ
diff --git a/static/images/_snippets/s3/2025/s3-15.png b/static/images/_snippets/s3/2025/s3-15.png
new file mode 100644
index 00000000000..6114e34a5cd
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-15.png differ
diff --git a/static/images/_snippets/s3/2025/s3-16.png b/static/images/_snippets/s3/2025/s3-16.png
new file mode 100644
index 00000000000..928d386fe66
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-16.png differ
diff --git a/static/images/_snippets/s3/2025/s3-17.png b/static/images/_snippets/s3/2025/s3-17.png
new file mode 100644
index 00000000000..3ae64c7e0f7
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-17.png differ
diff --git a/static/images/_snippets/s3/2025/s3-18.png b/static/images/_snippets/s3/2025/s3-18.png
new file mode 100644
index 00000000000..fd6c01bfcad
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-18.png differ
diff --git a/static/images/_snippets/s3/2025/s3-19.png b/static/images/_snippets/s3/2025/s3-19.png
new file mode 100644
index 00000000000..b9dafc192a5
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-19.png differ
diff --git a/static/images/_snippets/s3/2025/s3-2.png b/static/images/_snippets/s3/2025/s3-2.png
new file mode 100644
index 00000000000..d2897c935e7
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-2.png differ
diff --git a/static/images/_snippets/s3/2025/s3-20.png b/static/images/_snippets/s3/2025/s3-20.png
new file mode 100644
index 00000000000..1cbf3912c71
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-20.png differ
diff --git a/static/images/_snippets/s3/2025/s3-21.png b/static/images/_snippets/s3/2025/s3-21.png
new file mode 100644
index 00000000000..a30a78d53d6
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-21.png differ
diff --git a/static/images/_snippets/s3/2025/s3-3.png b/static/images/_snippets/s3/2025/s3-3.png
new file mode 100644
index 00000000000..615b50f5e4d
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-3.png differ
diff --git a/static/images/_snippets/s3/2025/s3-4.png b/static/images/_snippets/s3/2025/s3-4.png
new file mode 100644
index 00000000000..f6ccb954a67
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-4.png differ
diff --git a/static/images/_snippets/s3/2025/s3-5.png b/static/images/_snippets/s3/2025/s3-5.png
new file mode 100644
index 00000000000..722da67b614
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-5.png differ
diff --git a/static/images/_snippets/s3/2025/s3-6.png b/static/images/_snippets/s3/2025/s3-6.png
new file mode 100644
index 00000000000..a5b88ee8681
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-6.png differ
diff --git a/static/images/_snippets/s3/2025/s3-7.png b/static/images/_snippets/s3/2025/s3-7.png
new file mode 100644
index 00000000000..b0e1264c194
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-7.png differ
diff --git a/static/images/_snippets/s3/2025/s3-8.png b/static/images/_snippets/s3/2025/s3-8.png
new file mode 100644
index 00000000000..c45c21865c6
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-8.png differ
diff --git a/static/images/_snippets/s3/2025/s3-9.png b/static/images/_snippets/s3/2025/s3-9.png
new file mode 100644
index 00000000000..1e18103d1c9
Binary files /dev/null and b/static/images/_snippets/s3/2025/s3-9.png differ
diff --git a/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/backup_in_s3_bucket.png b/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/backup_in_s3_bucket.png
new file mode 100644
index 00000000000..13c8742f17e
Binary files /dev/null and b/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/backup_in_s3_bucket.png differ
diff --git a/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/create_new_role.png b/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/create_new_role.png
new file mode 100644
index 00000000000..7880b116f78
Binary files /dev/null and b/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/create_new_role.png differ
diff --git a/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/create_service.png b/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/create_service.png
new file mode 100644
index 00000000000..46a5b7ede79
Binary files /dev/null and b/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/create_service.png differ
diff --git a/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/custom_trust_policy.png b/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/custom_trust_policy.png
new file mode 100644
index 00000000000..2fc553aa101
Binary files /dev/null and b/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/custom_trust_policy.png differ
diff --git a/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/open_console.png b/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/open_console.png
new file mode 100644
index 00000000000..db5f7a05275
Binary files /dev/null and b/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/open_console.png differ
diff --git a/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/service_details.png b/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/service_details.png
new file mode 100644
index 00000000000..ecee6c317b0
Binary files /dev/null and b/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/service_details.png differ
diff --git a/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/service_role_id.png b/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/service_role_id.png
new file mode 100644
index 00000000000..f754418cbe9
Binary files /dev/null and b/static/images/cloud/onboard/migrate/oss_to_cloud_via_backup/service_role_id.png differ
diff --git a/static/images/cloud/security/secures3_output.jpg b/static/images/cloud/security/secures3_output.jpg
deleted file mode 100644
index 329a041b963..00000000000
Binary files a/static/images/cloud/security/secures3_output.jpg and /dev/null differ
diff --git a/static/images/cloud/security/secures3_output.png b/static/images/cloud/security/secures3_output.png
new file mode 100644
index 00000000000..e8326debbc3
Binary files /dev/null and b/static/images/cloud/security/secures3_output.png differ
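For illustration, the link-rewrite pattern added to `scripts/sed_links.sh` can be exercised on a sample line. The markdown text below is made up; the `s|old|new|g` expression is the one the script adds for `s3.md`:

```shell
# '|' is used as the sed delimiter so the '/' characters in the
# documentation paths do not need escaping.
echo 'See [Secure S3](/cloud/data-sources/secure-s3#setup) for details.' \
  | sed 's|(/cloud/data-sources/secure-s3#setup)|(/cloud/data-sources/secure-s3)|g'
# prints: See [Secure S3](/cloud/data-sources/secure-s3) for details.
```

The same expression is added twice in the script only because macOS (BSD) sed requires `-i ''` for in-place edits while GNU sed on Linux takes `-i` alone.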