55 changes: 42 additions & 13 deletions src/current/_includes/molt/fetch-data-load-output.md
@@ -1,16 +1,15 @@
1. Check the output to observe `fetch` progress.

{% if page.name == "migrate-load-replicate.md" %}
<section class="filter-content" markdown="1" data-scope="oracle">
The following message shows the appropriate values for the `--backfillFromSCN` and `--scn` replication flags to use when [starting Replicator](#start-replicator):
<section class="filter-content" markdown="1" data-scope="postgres">
If you included the `--pglogical-replication-slot-name` and `--pglogical-publication-and-slot-drop-and-recreate` flags, a publication named `molt_fetch` is automatically created:

{% include_cached copy-clipboard.html %}
~~~
replication-only mode should include the following replicator flags: --backfillFromSCN 26685444 --scn 26685786
~~~ json
{"level":"info","time":"2025-02-10T14:28:11-05:00","message":"dropping and recreating publication molt_fetch"}
~~~
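
To confirm on the source PostgreSQL database that the publication and replication slot were created, you can run a quick check (a minimal sketch; `molt_fetch` is the publication name created above, and the slot name is whatever you passed to `--pglogical-replication-slot-name`):

{% include_cached copy-clipboard.html %}
~~~ sql
-- Verify that the publication exists on the source
SELECT pubname FROM pg_publication WHERE pubname = 'molt_fetch';

-- Verify that the replication slot exists
SELECT slot_name, active FROM pg_replication_slots;
~~~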
</section>
{% endif %}

A `starting fetch` message indicates that the task has started:

<section class="filter-content" markdown="1" data-scope="postgres">
@@ -21,47 +20,67 @@

<section class="filter-content" markdown="1" data-scope="mysql">
~~~ json
{"level":"info","type":"summary","num_tables":3,"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-28","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"}
{"level":"info","type":"summary","num_tables":3,"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-29","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"}
~~~
</section>

<section class="filter-content" markdown="1" data-scope="oracle">
~~~ json
{"level":"info","type":"summary","num_tables":3,"cdc_cursor":"26685786","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"}
{"level":"info","type":"summary","num_tables":3,"cdc_cursor":"backfillFromSCN=26685444,scn=26685786","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"}
~~~
</section>

`data extraction` messages are written for each table exported to the location specified by `--bucket-path`:

<section class="filter-content" markdown="1" data-scope="postgres oracle">
~~~ json
{"level":"info","table":"migration_schema.employees","time":"2025-02-10T14:28:11-05:00","message":"data extraction phase starting"}
~~~

~~~ json
{"level":"info","table":"migration_schema.employees","type":"summary","num_rows":200000,"export_duration_ms":1000,"export_duration":"000h 00m 01s","time":"2025-02-10T14:28:12-05:00","message":"data extraction from source complete"}
~~~
</section>

<section class="filter-content" markdown="1" data-scope="mysql">
~~~ json
{"level":"info","table":"public.employees","time":"2025-02-10T14:28:11-05:00","message":"data extraction phase starting"}
~~~

~~~ json
{"level":"info","table":"public.employees","type":"summary","num_rows":200000,"export_duration_ms":1000,"export_duration":"000h 00m 01s","time":"2025-02-10T14:28:12-05:00","message":"data extraction from source complete"}
~~~
</section>

`data import` messages are written for each table that is loaded into CockroachDB:

<section class="filter-content" markdown="1" data-scope="postgres">
~~~ json
{"level":"info","table":"migration_schema.employees","time":"2025-02-10T14:28:12-05:00","message":"starting data import on target"}
~~~

<section class="filter-content" markdown="1" data-scope="postgres">
~~~ json
{"level":"info","table":"migration_schema.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"0/43A1960","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"}
~~~
</section>

<section class="filter-content" markdown="1" data-scope="mysql">
~~~ json
{"level":"info","table":"migration_schema.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-29","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"}
{"level":"info","table":"public.employees","time":"2025-02-10T14:28:12-05:00","message":"starting data import on target"}
~~~

~~~ json
{"level":"info","table":"public.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-29","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"}
~~~
</section>

<section class="filter-content" markdown="1" data-scope="oracle">
~~~ json
{"level":"info","table":"migration_schema.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"2358840","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"}
{"level":"info","table":"migration_schema.employees","time":"2025-02-10T14:28:12-05:00","message":"starting data import on target"}
~~~

~~~ json
{"level":"info","table":"migration_schema.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"backfillFromSCN=26685444,scn=26685786","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"}
~~~
</section>

@@ -75,7 +94,7 @@

<section class="filter-content" markdown="1" data-scope="mysql">
~~~ json
{"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":3,"tables":["migration_schema.employees","migration_schema.payments","migration_schema.payments"],"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-29","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
{"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":3,"tables":["public.employees","public.payments","public.payments"],"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-29","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
~~~

{% if page.name != "migrate-bulk-load.md" %}
@@ -90,6 +109,16 @@

<section class="filter-content" markdown="1" data-scope="oracle">
~~~ json
{"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":3,"tables":["migration_schema.employees","migration_schema.payments","migration_schema.payments"],"cdc_cursor":"2358840","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
{"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":3,"tables":["migration_schema.employees","migration_schema.payments","migration_schema.payments"],"cdc_cursor":"backfillFromSCN=26685444,scn=26685786","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
~~~

{% if page.name != "migrate-bulk-load.md" %}
This message shows the appropriate values for the `--backfillFromSCN` and `--scn` flags to use when [starting Replicator](#start-replicator):

{% include_cached copy-clipboard.html %}
~~~
--backfillFromSCN 26685444
--scn 26685786
~~~
{% endif %}
</section>
4 changes: 2 additions & 2 deletions src/current/_includes/molt/fetch-metrics.md
@@ -17,7 +17,7 @@ Cockroach Labs recommends monitoring the following metrics during data load:
You can also use the [sample Grafana dashboard](https://molt.cockroachdb.com/molt/cli/grafana_dashboard.json) to view the preceding metrics.

{% if page.name != "migrate-bulk-load.md" %}
{{site.data.alerts.callout_info}}
Metrics from the `replicator` process are enabled by setting the `--metricsAddr` [replication flag](#replication-flags), and are served at `http://{host}:{port}/_/varz`. <section class="filter-content" markdown="1" data-scope="oracle">To view Oracle-specific metrics from `replicator`, import [this Grafana dashboard](https://replicator.cockroachdb.com/replicator_oracle_grafana_dashboard.json).</section>
{{site.data.alerts.callout_success}}
For details on Replicator metrics, refer to [Replicator Metrics]({% link molt/replicator-metrics.md %}).
{{site.data.alerts.end}}
{% endif %}
107 changes: 98 additions & 9 deletions src/current/_includes/molt/fetch-schema-table-filtering.md
@@ -1,14 +1,23 @@
MOLT Fetch can restrict which schemas (or users) and tables are migrated by using the following filter flags:
Use the following flags to filter the data to be migrated:

<section class="filter-content" markdown="1" data-scope="mysql">
| Filter type | Flag | Description |
|------------------------|----------------------------|--------------------------------------------------------------------------|
| Table filter | `--table-filter` | POSIX regex matching table names to include across all selected schemas. |
| Table exclusion filter | `--table-exclusion-filter` | POSIX regex matching table names to exclude across all selected schemas. |

{{site.data.alerts.callout_info}}
`--schema-filter` does not apply to MySQL sources because MySQL tables belong directly to the database specified in the connection string, not to a separate schema.

{{site.data.alerts.end}}
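
For example, the following combination loads only the `employees` and `payments` tables and skips any archival tables (illustrative regexes; adjust them to your own table names):

~~~
--table-filter 'employees|payments'
--table-exclusion-filter 'archive_.*'
~~~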
</section>

<section class="filter-content" markdown="1" data-scope="postgres oracle">
| Filter type | Flag | Description |
|------------------------|----------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------|
| Schema filter | `--schema-filter` | [POSIX regex](https://wikipedia.org/wiki/Regular_expression) matching schema names to include; all matching schemas and their tables are moved. |
| Table filter | `--table-filter` | POSIX regex matching table names to include across all selected schemas. |
| Table exclusion filter | `--table-exclusion-filter` | POSIX regex matching table names to exclude across all selected schemas. |

{{site.data.alerts.callout_success}}
Use `--schema-filter` to migrate only the specified schemas, and refine which tables are moved using `--table-filter` or `--table-exclusion-filter`.
{{site.data.alerts.end}}
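
For example, the following combination migrates only the `migration_schema` schema and, within it, only the `employees` and `payments` tables (illustrative regexes; adjust them to your own schema and table names):

~~~
--schema-filter 'migration_schema'
--table-filter 'employees|payments'
~~~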
</section>

<section class="filter-content" markdown="1" data-scope="oracle">
When migrating from Oracle, you **must** include `--schema-filter` to name an Oracle schema to migrate. This prevents Fetch from attempting to load tables owned by other users. For example:
@@ -19,11 +28,91 @@ When migrating from Oracle, you **must** include `--schema-filter` to name an Or
</section>

{% if page.name != "migrate-bulk-load.md" %}
<section class="filter-content" markdown="1" data-scope="mysql">
{% include molt/fetch-table-filter-userscript.md %}
<section class="filter-content" markdown="1" data-scope="oracle">
#### Table filter userscript

When loading a subset of tables using `--table-filter`, you **must** provide a TypeScript userscript to specify which tables to replicate.

For example, the following `table_filter.ts` userscript filters change events to the specified source tables:

~~~ ts
import * as api from "replicator@v1";

// List the source tables (matching source names and casing) to include in replication
const allowedTables = ["EMPLOYEES", "PAYMENTS", "ORDERS"];

// Update this to your target CockroachDB database and schema name
api.configureSource("molt.migration_schema", {
  dispatch: (doc: Document, meta: Document): Record<Table, Document[]> | null => {
    // Replicate only if the table matches one of the allowed tables
    if (allowedTables.includes(meta.table)) {
      let ret: Record<Table, Document[]> = {};
      ret[meta.table] = [doc];
      return ret;
    }
    // Ignore all other tables
    return null;
  },
  deletesTo: (doc: Document, meta: Document): Record<Table, Document[]> | null => {
    // Optionally filter deletes the same way
    if (allowedTables.includes(meta.table)) {
      let ret: Record<Table, Document[]> = {};
      ret[meta.table] = [doc];
      return ret;
    }
    return null;
  },
});
~~~

Pass the userscript to MOLT Replicator with the `--userscript` [flag](#replicator-flags):

~~~
--userscript table_filter.ts
~~~
</section>

<section class="filter-content" markdown="1" data-scope="oracle">
{% include molt/fetch-table-filter-userscript.md %}
<section class="filter-content" markdown="1" data-scope="mysql">
#### Table filter userscript

When loading a subset of tables using `--table-filter`, you **must** provide a TypeScript userscript to specify which tables to replicate.

For example, the following `table_filter.ts` userscript filters change events to the specified source tables:

~~~ ts
import * as api from "replicator@v1";

// List the source tables (matching source names and casing) to include in replication
const allowedTables = ["EMPLOYEES", "PAYMENTS", "ORDERS"];

// Update this to your target CockroachDB database and schema name
api.configureSource("molt.public", {
  dispatch: (doc: Document, meta: Document): Record<Table, Document[]> | null => {
    // Replicate only if the table matches one of the allowed tables
    if (allowedTables.includes(meta.table)) {
      let ret: Record<Table, Document[]> = {};
      ret[meta.table] = [doc];
      return ret;
    }
    // Ignore all other tables
    return null;
  },
  deletesTo: (doc: Document, meta: Document): Record<Table, Document[]> | null => {
    // Optionally filter deletes the same way
    if (allowedTables.includes(meta.table)) {
      let ret: Record<Table, Document[]> = {};
      ret[meta.table] = [doc];
      return ret;
    }
    return null;
  },
});
~~~

Pass the userscript to MOLT Replicator with the `--userscript` [flag](#replicator-flags):

~~~
--userscript table_filter.ts
~~~
</section>
{% endif %}
41 changes: 0 additions & 41 deletions src/current/_includes/molt/fetch-table-filter-userscript.md

This file was deleted.

21 changes: 16 additions & 5 deletions src/current/_includes/molt/migration-create-sql-user.md
@@ -14,7 +14,7 @@ Grant database-level privileges for schema creation within the target database:
GRANT ALL ON DATABASE defaultdb TO crdb_user;
~~~

Grant user privileges to create internal MOLT tables like `_molt_fetch_exceptions` in the public schema:
Grant user privileges to create internal MOLT tables like `_molt_fetch_exceptions` in the `public` CockroachDB schema:

{{site.data.alerts.callout_info}}
Ensure that you are connected to the target database.
@@ -25,16 +25,18 @@
GRANT CREATE ON SCHEMA public TO crdb_user;
~~~
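
To confirm that the grant took effect, you can inspect the user's privileges on the schema (a quick check; exact output columns vary by CockroachDB version):

{% include_cached copy-clipboard.html %}
~~~ sql
SHOW GRANTS ON SCHEMA public FOR crdb_user;
~~~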

If you manually created the target schema (i.e., [`drop-on-target-and-recreate`](#table-handling-mode) will not be used), grant the following privileges on the schema:
If you manually defined the target tables (which means that [`drop-on-target-and-recreate`](#table-handling-mode) will not be used), grant the following privileges on the schema:

<section class="filter-content" markdown="1" data-scope="postgres oracle">
{% include_cached copy-clipboard.html %}
~~~ sql
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA migration_schema TO crdb_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA migration_schema
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO crdb_user;
~~~

Grant the same privileges for internal MOLT tables:
Grant the same privileges for internal MOLT tables in the `public` CockroachDB schema:
</section>

{% include_cached copy-clipboard.html %}
~~~ sql
@@ -47,18 +49,27 @@ Depending on the MOLT Fetch [data load mode](#data-load-mode) you will use, gran

#### `IMPORT INTO` privileges

Grant `SELECT`, `INSERT`, and `DROP` (required because the table is taken offline during the `IMPORT INTO`) privileges on all tables in the [target schema](#create-the-target-schema):
Grant `SELECT`, `INSERT`, and `DROP` (required because the table is taken offline during the `IMPORT INTO`) privileges on all tables being migrated:

<section class="filter-content" markdown="1" data-scope="postgres oracle">
{% include_cached copy-clipboard.html %}
~~~ sql
GRANT SELECT, INSERT, DROP ON ALL TABLES IN SCHEMA migration_schema TO crdb_user;
~~~
</section>

<section class="filter-content" markdown="1" data-scope="mysql">
{% include_cached copy-clipboard.html %}
~~~ sql
GRANT SELECT, INSERT, DROP ON ALL TABLES IN SCHEMA public TO crdb_user;
~~~
</section>

If you plan to use [cloud storage with implicit authentication](#cloud-storage-authentication) for data load, grant the `EXTERNALIOIMPLICITACCESS` [system-level privilege]({% link {{site.current_cloud_version}}/security-reference/authorization.md %}#supported-privileges):

{% include_cached copy-clipboard.html %}
~~~ sql
GRANT EXTERNALIOIMPLICITACCESS TO crdb_user;
GRANT SYSTEM EXTERNALIOIMPLICITACCESS TO crdb_user;
~~~
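
To verify the system-level privilege, you can list the user's system grants (a quick check; assumes a CockroachDB version that supports `SHOW SYSTEM GRANTS`):

{% include_cached copy-clipboard.html %}
~~~ sql
SHOW SYSTEM GRANTS FOR crdb_user;
~~~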

#### `COPY FROM` privileges