Run Bulk Operations from Your Cluster

The CockroachCloud tiers offer different levels of support for bulk operations such as BACKUP, RESTORE, IMPORT, EXPORT, and changefeeds. This page describes the availability of these operations in each CockroachCloud cluster tier and provides examples, including the storage options available with each tier.

Examples

For guidance on connecting to your CockroachCloud Free (beta) cluster, visit Connect to a CockroachCloud Free (beta) Cluster.

In CockroachCloud Free (beta) clusters, userfile, a per-user bulk file storage, is the only available storage option for BACKUP, RESTORE, and IMPORT operations.

Note:

userfile is only available as storage for BACKUP, RESTORE, and IMPORT operations on CockroachCloud Free (beta) after upgrading to v21.1.

For information on userfile commands (cockroach userfile upload, cockroach userfile list, cockroach userfile get, and cockroach userfile delete), visit their reference pages.
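For example, to make a local CSV file available to IMPORT on your cluster, upload it to your userfile space (the local file path here is an illustration):

cockroach userfile upload /path/to/local/test-data.csv /test-data.csv --url {CONNECTION STRING}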

Backup and restore with userfile

We recommend starting backups from a time at least 10 seconds in the past using AS OF SYSTEM TIME. Read our guidance in the Performance section on the BACKUP page.

Note:

Only database and table-level backups are possible when using userfile as storage. Restoring cluster-level backups will not work because userfile data is stored in the defaultdb database, and you cannot restore a cluster with existing table data.

Database and table

When working on the same cluster, userfile storage supports both database- and table-level backups.

First, run the following statement to back up a database to a directory in the default userfile space:

BACKUP DATABASE bank TO 'userfile://defaultdb.public.userfiles_$user/bank-backup' AS OF SYSTEM TIME '-10s';

This directory will hold the files that make up the backup, including the manifest file and data files.
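Table-level backups to userfile follow the same pattern. For example, to back up a single table (assuming the bank database contains a customers table):

BACKUP TABLE bank.customers TO 'userfile://defaultdb.public.userfiles_$user/customers-backup' AS OF SYSTEM TIME '-10s';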

Note:

When you back up a database or table to your userfile space and want to restore it to a different cluster, run cockroach userfile get to download the backup files to a local machine, then cockroach userfile upload --url {CONNECTION STRING} to upload them to the userfile space of the other cluster.
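A minimal sketch of this transfer, using the bank-backup directory from the example above (the glob pattern and the per-file loop are illustrations; cockroach userfile upload takes one file at a time):

cockroach userfile get 'bank-backup/*' --url {SOURCE CONNECTION STRING}
# then, for each downloaded file:
cockroach userfile upload bank-backup/{BACKUP FILE} bank-backup/{BACKUP FILE} --url {DESTINATION CONNECTION STRING}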

To restore the database, run the following:

RESTORE DATABASE bank FROM 'userfile://defaultdb.public.userfiles_$user/bank-backup';

You can also use userfile:///bank-backup, since userfile:/// refers to the default path userfile://defaultdb.public.userfiles_$user/.

Once the backup data is no longer needed, delete it from userfile storage:

cockroach userfile delete bank-backup --url {CONNECTION STRING}

If you use cockroach userfile delete {file}, the data will not be removed from disk until garbage collection runs.
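To verify which files remain in your userfile storage, list them:

cockroach userfile list --url {CONNECTION STRING}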

To resolve database or table naming conflicts during a restore, see Troubleshooting naming conflicts.

Import data into your CockroachCloud Free (beta) cluster

To import a table from userfile, use the following command:

IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('userfile:///test-data.csv');

userfile:/// references the default path (userfile://defaultdb.public.userfiles_$user/). When the import completes, output similar to the following is returned:

        job_id       |  status   | fraction_completed |  rows  | index_entries |  bytes
---------------------+-----------+--------------------+--------+---------------+-----------
  599865027685613569 | succeeded |                  1 | 300024 |             0 | 13389972
(1 row)

For more import options, see IMPORT.

Stream data out of your CockroachCloud Free (beta) cluster

Core changefeeds stream row-level changes to a client until the underlying SQL connection is closed.

Note:

Only core changefeeds are available on CockroachCloud Free (beta). To create a changefeed into a configurable sink, like cloud storage or Kafka, use CockroachCloud, which has this feature enabled by default.

The following example sets up a core changefeed on your CockroachCloud Free (beta) cluster.

  1. As the root user, open the built-in SQL client:

    cockroach sql --url {CONNECTION STRING} --format=csv
    
    Note:

Because core changefeeds return results differently than other SQL statements, they require a dedicated database connection with specific settings around result buffering. In normal operation, CockroachDB improves performance by buffering results server-side before returning them to a client; however, result buffering is automatically turned off for core changefeeds. Core changefeeds also have different cancellation behavior than other queries: they can only be canceled by closing the underlying connection or by issuing a CANCEL QUERY statement on a separate connection (see the sketch after these steps). Combined, these attributes mean that applications should explicitly create dedicated connections to consume changefeed data, instead of using a connection pool as most client drivers do by default.

    Note:

    To determine how wide the columns need to be, the default table display format in cockroach sql buffers the results it receives from the server before printing them to the console. When consuming core changefeed data using cockroach sql, it's important to use a display format like csv that does not buffer its results. To set the display format, use the --format=csv flag when starting the built-in SQL client, or set the \set display_format=csv option once the SQL client is open.

  2. Enable the kv.rangefeed.enabled cluster setting:

    > SET CLUSTER SETTING kv.rangefeed.enabled = true;
    
  3. Create table foo:

    > CREATE TABLE foo (a INT PRIMARY KEY);
    
  4. Insert a row into the table:

    > INSERT INTO foo VALUES (0);
    
  5. Start the core changefeed:

    > EXPERIMENTAL CHANGEFEED FOR foo;
    
    table,key,value
    foo,[0],"{""after"": {""a"": 0}}"
    
  6. In a new terminal, add another row:

    cockroach sql --url {CONNECTION STRING} -e "INSERT INTO foo VALUES (1)"
    
Back in the terminal where the core changefeed is streaming, the following output will appear:

    foo,[1],"{""after"": {""a"": 1}}"
    

    Note that records may take a couple of seconds to display in the core changefeed.

  8. To stop streaming the changefeed, enter CTRL+C into the terminal where the changefeed is running.
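Alternatively, as noted in step 1, you can cancel a core changefeed from a separate connection with CANCEL QUERY. A minimal sketch (the LIKE filter on the query text is an illustration; adjust it to match your changefeed):

> SELECT query_id, query FROM [SHOW CLUSTER QUERIES] WHERE query LIKE '%CHANGEFEED%';
> CANCEL QUERY '{query_id}';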

For further information on changefeeds, read Stream Data Out of CockroachDB and CHANGEFEED FOR.

The remaining examples apply to CockroachCloud clusters. For guidance on connecting to your CockroachCloud cluster, visit Connect to Your CockroachCloud Cluster.

The examples below use Amazon S3 for demonstration purposes. For guidance on connecting to other storage options or using other authentication parameters, read Use Cloud Storage for Bulk Operations.

Backup and restore your CockroachCloud data

Cockroach Labs runs full backups daily and incremental backups hourly for every CockroachCloud cluster. The full backups are retained for 30 days, while incremental backups are retained for 7 days. For more information, read Restore Data From a Backup.

The following examples show how to run manual backups and restores:

Backup a cluster

To take a full backup of a cluster:

> BACKUP INTO \
's3://{BUCKET NAME}/{PATH}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';

Backup a database

To take a full backup of a single database:

> BACKUP DATABASE bank \
INTO 's3://{BUCKET NAME}/{PATH}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';

To take a full backup of multiple databases:

> BACKUP DATABASE bank, employees \
INTO 's3://{BUCKET NAME}/{PATH}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';

Backup a table or view

To take a full backup of a single table or view:

> BACKUP bank.customers \
INTO 's3://{BUCKET NAME}/{PATH}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';

To resolve database or table naming conflicts during a restore, see Troubleshooting naming conflicts.

View the backup subdirectories

New in v21.1: BACKUP ... INTO adds a backup to a collection within the backup destination. The path to the backup is created using a date-based naming scheme. To view the backup paths in a given destination, use SHOW BACKUPS:

> SHOW BACKUPS IN 's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}';

        path
------------------------
2021/03/23-213101.37
2021/03/24-172553.85
2021/03/24-210532.53
(3 rows)

When you restore a backup, add the backup's subdirectory path (e.g., 2021/03/23-213101.37) to the storage URL.
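For example, with the first subdirectory from the sample output above, the storage URL for a restore would look like the following (placeholders as in the examples below):

's3://{BUCKET NAME}/2021/03/23-213101.37?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}'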

Restore a cluster

To restore a full cluster:

> RESTORE FROM 's3://{BUCKET NAME}/{path/to/backup/subdirectory}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}';

To view the available subdirectories, use SHOW BACKUPS.

Restore a database

To restore a database:

> RESTORE DATABASE bank FROM 's3://{BUCKET NAME}/{path/to/backup/subdirectory}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}';

To view the available subdirectories, use SHOW BACKUPS.

Note:

RESTORE DATABASE can only be used if the entire database was backed up.

Restore a table

To restore a single table:

> RESTORE TABLE bank.customers FROM 's3://{BUCKET NAME}/{path/to/backup/subdirectory}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}';

To restore multiple tables:

> RESTORE TABLE bank.customers, bank.accounts FROM 's3://{BUCKET NAME}/{path/to/backup/subdirectory}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}';

To view the available subdirectories, use SHOW BACKUPS.

For more information on taking backups and restoring to your cluster, read the BACKUP and RESTORE pages.

Import data into your CockroachCloud cluster

To import a table into your cluster:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('s3://{BUCKET NAME}/{customer-data}?AWS_ACCESS_KEY_ID={ACCESS KEY}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}');

Read the IMPORT page for more examples and guidance.

Export data out of CockroachCloud

The following example exports the customers table from the bank database into a cloud storage bucket in CSV format:

EXPORT INTO CSV
  's3://{BUCKET NAME}/{customer-export-data}?AWS_ACCESS_KEY_ID={ACCESS KEY}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}'
  WITH delimiter = '|' FROM TABLE bank.customers;
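To load exported files back into a table later, you can run IMPORT INTO with the same delimiter. A sketch, assuming the target table exists and {EXPORT FILE} stands for one of the files EXPORT created:

> IMPORT INTO bank.customers
  CSV DATA ('s3://{BUCKET NAME}/{customer-export-data}/{EXPORT FILE}.csv?AWS_ACCESS_KEY_ID={ACCESS KEY}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}')
  WITH delimiter = '|';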

Read the EXPORT page for more examples and guidance.

Stream data out of CockroachCloud

Change data capture (CDC) provides efficient, distributed, row-level changefeeds into a configurable sink for downstream processing such as reporting, caching, or full-text indexing.

A changefeed targets an allowlist of tables, called "watched rows". Each change to a watched row is emitted as a record to a configurable sink, such as Kafka or cloud storage. In this version of CockroachCloud, you can create, pause, resume, and cancel changefeeds.
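Once a changefeed is running, you manage it through its job ID. For example (using the job ID from the sample output below):

> SHOW JOBS;
> PAUSE JOB 360645287206223873;
> RESUME JOB 360645287206223873;
> CANCEL JOB 360645287206223873;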

Create a changefeed connected to Kafka

> CREATE CHANGEFEED FOR TABLE name, name2, name3
  INTO 'kafka://host:port'
  WITH updated, resolved;
+--------------------+
|       job_id       |
+--------------------+
| 360645287206223873 |
+--------------------+
(1 row)

Note:

Currently, changefeeds connected to Kafka versions < v1.0 are not supported in CockroachDB v21.1.

For more information on how to create a changefeed connected to Kafka, see Stream Data Out of CockroachDB Using Changefeeds and CREATE CHANGEFEED.

Create a changefeed connected to a cloud storage sink

Warning:

This is an experimental feature. The interface and output are subject to change.

> CREATE CHANGEFEED FOR TABLE name, name2, name3
  INTO 'experimental-s3://host?parameters'
  WITH updated, resolved;
+--------------------+
|       job_id       |
+--------------------+
| 360645287206223873 |
+--------------------+
(1 row)

For more information on how to create a changefeed connected to a cloud storage sink, see Stream Data Out of CockroachDB Using Changefeeds and CREATE CHANGEFEED.
