1 change: 1 addition & 0 deletions docs/integrations/data-ingestion/data-ingestion-index.md
@@ -23,6 +23,7 @@ For more information check out the pages below:
| [BladePipe](/integrations/bladepipe) | A real-time end-to-end data integration tool with sub-second latency, boosting seamless data flow across platforms. |
| [dbt](/integrations/dbt) | Enables analytics engineers to transform data in their warehouses by simply writing select statements. |
| [dlt](/integrations/data-ingestion/etl-tools/dlt-and-clickhouse) | An open-source library that you can add to your Python scripts to load data from various and often messy data sources into well-structured, live datasets. |
| [Estuary](/integrations/estuary) | A right-time data platform that enables millisecond-latency ETL pipelines with flexible deployment options. |
| [Fivetran](/integrations/fivetran) | An automated data movement platform moving data out of, into and across your cloud data platforms. |
| [NiFi](/integrations/nifi) | An open-source workflow management software designed to automate data flow between software systems. |
| [Vector](/integrations/vector) | A high-performance observability data pipeline that puts organizations in control of their observability data. |
110 changes: 110 additions & 0 deletions docs/integrations/data-ingestion/etl-tools/estuary.md
@@ -0,0 +1,110 @@
---
sidebar_label: 'Estuary'
slug: /integrations/estuary
description: 'Stream a variety of sources into ClickHouse with an Estuary integration'
title: 'Connect Estuary with ClickHouse'
doc_type: 'guide'
integration:
- support_level: 'partner'
- category: 'data_ingestion'
- website: 'https://estuary.dev'
keywords: ['estuary', 'data ingestion', 'etl', 'pipeline', 'data integration', 'clickpipes']
---

import PartnerBadge from '@theme/badges/PartnerBadge';

# Connect Estuary with ClickHouse

<PartnerBadge/>

[Estuary](https://estuary.dev/) is a right-time data platform that flexibly combines real-time and batch data in easy-to-configure ETL pipelines. With enterprise-grade security and deployment options, Estuary unlocks durable data flows from SaaS, database, and streaming sources to a variety of destinations, including ClickHouse.

Estuary connects to ClickHouse via the Kafka ClickPipe: Estuary's **Dekaf** service exposes your collections through a Kafka-compatible API, so you do not need to maintain your own Kafka infrastructure for this integration.

## Setup guide {#setup-guide}

**Prerequisites**

* An [Estuary account](https://dashboard.estuary.dev/register)
* One or more [**captures**](https://docs.estuary.dev/concepts/captures/) in Estuary that pull data from your desired source(s)
* A ClickHouse Cloud account with ClickPipe permissions

<VerticalStepper headerLevel="h3">

### Create an Estuary materialization {#1-create-an-estuary-materialization}

To move data from your source collections in Estuary to ClickHouse, you will first need to create a **materialization**.

1. In Estuary's dashboard, go to the [Destinations](https://dashboard.estuary.dev/materializations) page.

2. Click **+ New Materialization**.

3. Select the **ClickHouse** connector.

4. Fill out details in the Materialization, Endpoint, and Source Collections sections:

* **Materialization Details:** Provide a unique name for your materialization and choose a data plane (cloud provider and region)

* **Endpoint Config:** Provide a secure **Auth Token**

* **Source Collections:** Link an existing **capture** or select data collections to expose to ClickHouse

5. Click **Next** and **Save and Publish**.

6. On the materialization details page, note the full name for your ClickHouse materialization. This will look something like `your-tenant/your-unique-name/dekaf-clickhouse`.

Estuary will start streaming the selected collections as Kafka messages. ClickHouse can access this data via a Kafka ClickPipe using Estuary's broker details and the auth token you provided.
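
Before wiring up ClickHouse, you can optionally confirm that the materialization is serving data by connecting to Estuary's Dekaf broker with any Kafka client. Below is a minimal sketch using the `kafka-python` package; the materialization name and auth token are placeholders for your own values, and it assumes Dekaf accepts TLS connections (`SASL_SSL`) on port 9092.

```python
# Sanity-check an Estuary materialization with a plain Kafka client.
from kafka import KafkaConsumer

MATERIALIZATION = "your-tenant/your-unique-name/dekaf-clickhouse"  # placeholder
AUTH_TOKEN = "your-auth-token"                                     # placeholder

consumer = KafkaConsumer(
    bootstrap_servers="dekaf.estuary-data.com:9092",
    security_protocol="SASL_SSL",          # assumes TLS; see Estuary's Dekaf docs
    sasl_mechanism="PLAIN",
    sasl_plain_username=MATERIALIZATION,   # full materialization name
    sasl_plain_password=AUTH_TOKEN,        # token from your Endpoint Config
)

# Each collection bound to the materialization appears as a Kafka topic.
print(consumer.topics())
consumer.close()
```

If the call returns your collection names, the broker credentials are correct and ready to reuse in the ClickPipe.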

### Enter Kafka connection details {#2-enter-kafka-connection-details}

Set up a new Kafka ClickPipe with ClickHouse and enter connection details:

1. In your ClickHouse Cloud dashboard, select **Data sources**.

2. Create a new **ClickPipe**.

3. Choose **Apache Kafka** as your data source.

4. Enter Kafka connection details using Estuary's broker and registry information:

* Provide a name for your ClickPipe
* For the broker, use: `dekaf.estuary-data.com:9092`
* Leave authentication as the default `SASL/PLAIN` option
* For the user, enter your full materialization name from Estuary (such as `your-tenant/your-unique-name/dekaf-clickhouse`)
* For the password, enter the auth token you provided for your materialization

5. Toggle the schema registry option and fill in Estuary's registry details (see the verification sketch after this list):

* For your schema URL, use: `https://dekaf.estuary-data.com`
* The schema key will be the same as the broker user (your materialization name)
* The secret will be the same as the broker password (your auth token)
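
To double-check the registry credentials, you can query Dekaf's schema registry directly. This is a sketch assuming Dekaf implements the standard Confluent-compatible `/subjects` endpoint and reuses the broker username/password pair for HTTP basic auth:

```python
# List the schema registry subjects exposed by Dekaf.
import requests

MATERIALIZATION = "your-tenant/your-unique-name/dekaf-clickhouse"  # registry "key"
AUTH_TOKEN = "your-auth-token"                                     # registry "secret"

resp = requests.get(
    "https://dekaf.estuary-data.com/subjects",
    auth=(MATERIALIZATION, AUTH_TOKEN),  # HTTP basic auth
    timeout=10,
)
resp.raise_for_status()
print(resp.json())  # expect one subject per exposed topic
```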

### Configure incoming data {#3-configure-incoming-data}

1. Select one of your Kafka **topics** (one of your data collections from Estuary).

2. Choose an **offset**.

3. ClickHouse will detect messages on the topic. Continue to the **Parse information** section to configure your destination table.

4. Choose to create a new table or load data into a matching existing table.

5. Map source fields to table columns, confirming column name, type, and whether it is nullable.

6. In the final **Details and settings** section, you can select permissions for your dedicated database user.

Once you're happy with your configuration, create your ClickPipe.

ClickHouse will provision your new data source and start consuming messages from Estuary. Create as many ClickPipes as you need to stream from all your desired data collections.
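
Once a pipe is running, a quick way to confirm rows are landing is to query the destination table. Here is a minimal sketch using the `clickhouse-connect` Python client; the host, credentials, and table name below are placeholders:

```python
# Verify that the ClickPipe is delivering rows from Estuary.
import clickhouse_connect

client = clickhouse_connect.get_client(
    host="your-instance.clickhouse.cloud",  # placeholder Cloud hostname
    username="default",
    password="your-password",
    secure=True,
)

result = client.query("SELECT count() FROM your_database.your_table")
print(result.result_rows)  # the count should grow as Estuary streams new data
client.close()
```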

</VerticalStepper>

## Additional resources {#additional-resources}

For more on setting up an integration with Estuary, see Estuary's documentation:

* Reference Estuary's [ClickHouse materialization docs](https://docs.estuary.dev/reference/Connectors/materialization-connectors/Dekaf/clickhouse/).

* Estuary exposes data as Kafka messages using **Dekaf**. Learn more in Estuary's [Dekaf guide](https://docs.estuary.dev/guides/dekaf_reading_collections_from_kafka/).

* To see a list of sources that you can stream into ClickHouse with Estuary, check out [Estuary's capture connectors](https://docs.estuary.dev/reference/Connectors/capture-connectors/).
1 change: 1 addition & 0 deletions sidebars.js
@@ -1022,6 +1022,7 @@ const sidebars = {
],
},
'integrations/data-ingestion/etl-tools/dlt-and-clickhouse',
'integrations/data-ingestion/etl-tools/estuary',
'integrations/data-ingestion/etl-tools/fivetran/index',
'integrations/data-ingestion/etl-tools/nifi-and-clickhouse',
'integrations/data-ingestion/etl-tools/vector-to-clickhouse',
Binary file added static/images/integrations/logos/estuary.png