Configuration

Create a conf.yaml configuration file that uses your production database as the source.

encryption_key: $MY_PRIVATE_ENC_KEY # optional - encrypt data on datastore
source:
  connection_uri: postgres://user:password@host:port/db # you can use $DATABASE_URL
datastore:
  aws:
    bucket: $BUCKET_NAME
    region: $S3_REGION
    credentials:
      access_key_id: $ACCESS_KEY_ID
      secret_access_key: $AWS_SECRET_ACCESS_KEY
destination:
  connection_uri: postgres://user:password@host:port/db # you can use $DATABASE_URL

Environment variables are replaced with their values at runtime. An error is thrown if a referenced environment variable does not exist.
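The substitution behaviour described here can be illustrated with a small Python sketch. This is not Replibyte's code: the function name `substitute_env` and the `$NAME` pattern it matches are assumptions made for illustration, and this naive version would also replace variables that appear inside comments.

```python
import os
import re

# Assumption: variables look like $NAME (letters, digits, underscores).
_VAR = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def substitute_env(text, env=None):
    """Replace every $NAME in text with its value, failing on unknown names."""
    env = os.environ if env is None else env

    def replace(match):
        name = match.group(1)
        if name not in env:
            # Mirrors the documented behaviour: a missing variable is an error.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _VAR.sub(replace, text)


if __name__ == "__main__":
    # An explicit mapping keeps the example independent of the real environment.
    print(substitute_env("bucket: $BUCKET_NAME", {"BUCKET_NAME": "dumps"}))
```

Failing early on a missing variable avoids running a dump with an empty bucket name or connection string.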

Run the app for the source:

replibyte -c conf.yaml

Source and Destination

Replibyte supports multiple databases.

Transformer

A transformer changes or hides the value of a specified column. Replibyte provides pre-made transformers, and you can also build your own transformer in WebAssembly.

Here is a list of all the transformers available.

| id | description |
|---|---|
| transient | Does not modify the value |
| random | Randomize value but keep the same length (string only). [AAA]->[BBB] |
| first-name | Replace the string value by a first name |
| email | Replace the string value by an email address |
| keep-first-char | Keep only the first char for strings and digit for numbers |
| phone-number | Replace the string value by a phone number |
| credit-card | Replace the string value by a credit card number |
| redacted | Obfuscate your sensitive data (>3 characters strings only). [4242 4242 4242 4242]->[424**] |
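To make the descriptions above concrete, here is a minimal Python sketch of three of the listed behaviours (random, keep-first-char, redacted). It is an illustration only and not Replibyte's implementation: the function names, the choice of ASCII letters for random values, the number of visible characters, and the `*` mask are all assumptions.

```python
import random
import string


def random_same_length(value, rng=random):
    """Return a random string of ASCII letters with the same length as value."""
    return "".join(rng.choice(string.ascii_letters) for _ in value)


def keep_first_char(value):
    """Keep the first character of a string, or the first digit of an integer."""
    if isinstance(value, int):
        sign = "-" if value < 0 else ""
        return int(sign + str(abs(value))[0])
    return value[:1]


def redacted(value, visible=3, mask="*"):
    """Mask everything after the first `visible` characters.

    Assumption: strings of `visible` characters or fewer are returned unchanged,
    matching the ">3 characters strings only" note in the table.
    """
    if len(value) <= visible:
        return value
    return value[:visible] + mask * (len(value) - visible)


if __name__ == "__main__":
    print(random_same_length("AAA"))
    print(keep_first_char("Alice"), keep_first_char(4242))
    print(redacted("4242 4242 4242 4242"))
```

Each function takes a single column value and returns its replacement, which is the shape a per-column transformer needs.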

Datastore

A datastore is where Replibyte stores the dumps it creates, making them accessible to the destination databases.

| Cloud Service Provider | S3 service name | S3 compatible |
|---|---|---|
| Amazon Web Services | S3 | Yes (Original) |
| Google Cloud Platform | Cloud Storage | Yes |
| Microsoft Azure | Blob Storage | Yes |
| Digital Ocean | Spaces | Yes |
| Scaleway | Object Storage | Yes |
| Minio | Object Storage | Yes |

Any datastore compatible with the S3 protocol is a valid datastore.

Example

Here is a configuration file including some transformations and different options like the database subset.

encryption_key: $MY_PRIVATE_ENC_KEY # optional - encrypt data on datastore
source:
  connection_uri: postgres://user:password@host:port/db # you can use $DATABASE_URL
  database_subset: # optional - downscale database while keeping it consistent
    database: public
    table: orders
    strategy_name: random
    strategy_options:
      percent: 50
    passthrough_tables:
      - us_states
  transformers: # optional - hide sensitive data
    - database: public
      table: employees
      columns:
        - name: last_name
          transformer_name: random
        - name: birth_date
          transformer_name: random-date
        - name: first_name
          transformer_name: first-name
        - name: email
          transformer_name: email
        - name: username
          transformer_name: keep-first-char
    - database: public
      table: customers
      columns:
        - name: phone
          transformer_name: phone-number
  only_tables: # optional - dumps only specified tables.
    - database: public
      table: orders
    - database: public
      table: customers
datastore:
  aws:
    bucket: $BUCKET_NAME
    region: $S3_REGION
    credentials:
      access_key_id: $ACCESS_KEY_ID
      secret_access_key: $AWS_SECRET_ACCESS_KEY
destination:
  connection_uri: postgres://user:password@host:port/db # you can use $DATABASE_URL
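
The idea behind a random database subset that stays consistent can be sketched in Python as follows. This is a simplified illustration and not how Replibyte implements it: the in-memory table layout, the `foreign_keys` mapping, the `customer_id` column, and the function name `subset_database` are hypothetical, and only one level of foreign keys is followed.

```python
import random


def subset_database(tables, ref_table, percent, foreign_keys, passthrough_tables, seed=0):
    """Keep about `percent`% of ref_table rows plus the parent rows they reference.

    tables: {table_name: [row_dict, ...]}
    foreign_keys: {(table, column): (parent_table, parent_column)}  (hypothetical schema description)
    """
    rng = random.Random(seed)
    ref_rows = tables[ref_table]
    keep_count = round(len(ref_rows) * percent / 100)
    subset = {ref_table: rng.sample(ref_rows, keep_count)}

    # Follow foreign keys out of the reference table so that no sampled row
    # points at a parent row missing from the subset.
    for (table, column), (parent_table, parent_column) in foreign_keys.items():
        if table != ref_table:
            continue
        wanted = {row[column] for row in subset[ref_table]}
        subset[parent_table] = [
            row for row in tables[parent_table] if row[parent_column] in wanted
        ]

    # Passthrough tables are copied in full, without sampling.
    for name in passthrough_tables:
        subset[name] = list(tables[name])
    return subset


if __name__ == "__main__":
    tables = {
        "orders": [{"id": i, "customer_id": i % 4} for i in range(10)],
        "customers": [{"id": i} for i in range(6)],
        "us_states": [{"code": "CA"}, {"code": "NY"}],
    }
    result = subset_database(
        tables,
        ref_table="orders",
        percent=50,
        foreign_keys={("orders", "customer_id"): ("customers", "id")},
        passthrough_tables=["us_states"],
    )
    print({name: len(rows) for name, rows in result.items()})
```

Sampling from one reference table and then pulling in only the rows it references is what keeps the smaller database consistent, while small lookup tables such as `us_states` are kept whole.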