Copy Data

The Copy Data function copies data into a database. Currently, HBase is the only supported target database. The supported source data engines are UDB and GDB. UDB data can come from local files or from files that have been registered to iServer; for registered UDB datasets, creating field indexes and building pyramids are also supported. If a dataset with the same name is copied to the database more than once, the default action is to add a new dataset each time.

To prevent copy failures on large volumes of data, the copy process supports "batch copy": a registered UDB dataset is copied to the database in batches. The batching rules are as follows:

Point dataset:

  1. When the dataset contains fewer than 4 million points, batch copying is not performed.

  2. When the dataset contains more than 4 million points, 2 million are copied per batch. For example, 5 million points are copied in 3 batches.

Line dataset:

  1. When the dataset contains fewer than 2 million lines, batch copying is not performed.

  2. When the dataset contains more than 2 million lines, 1 million are copied per batch. For example, 3 million lines are copied in 3 batches.

Region dataset:

  1. When the dataset contains fewer than 1 million regions, batch copying is not performed.

  2. When the dataset contains more than 1 million regions, 500,000 are copied per batch. For example, 1.5 million regions are copied in 3 batches.
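
The batching rules above can be sketched as a small calculation. This is only an illustration of the arithmetic, not iServer's actual implementation: the names `BATCH_RULES` and `batch_count` are hypothetical, and the behavior at exactly the threshold (which the rules above do not specify) is assumed here to be a single, unbatched copy.

```python
# Thresholds and batch sizes (in number of objects) taken from the rules above.
# BATCH_RULES and batch_count are hypothetical names used only for illustration.
BATCH_RULES = {
    "point": (4_000_000, 2_000_000),
    "line": (2_000_000, 1_000_000),
    "region": (1_000_000, 500_000),
}


def batch_count(dataset_type: str, object_count: int) -> int:
    """Return how many batches a dataset of the given size would be copied in."""
    threshold, batch_size = BATCH_RULES[dataset_type]
    # Assumption: a dataset at or below the threshold is copied in one pass.
    if object_count <= threshold:
        return 1
    # Integer ceiling division: a partial final batch still counts as a batch.
    return (object_count + batch_size - 1) // batch_size


if __name__ == "__main__":
    for kind, count in [("point", 5_000_000), ("line", 3_000_000), ("region", 1_500_000)]:
        print(kind, count, batch_count(kind, count))
```

Running the script on the three examples from the rules (5 million points, 3 million lines, 1.5 million regions) gives 3 batches in each case, matching the figures stated above.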

When creating a Copy Data task, you need to set the following parameters: