How it works
This page describes how Jitsu sources work internally. It is still a work in progress, and only the basic concepts are explained here.
Data is synchronized either in time chunks (if the data source supports loading data by time interval) or all at once. Which mode is used depends on the type of data source and is defined in the driver implementation (the entity that loads the data). Jitsu stores information about synchronized chunks in meta.storage (meta storage configuration is described in General Configuration). A time chunk is synchronized if any of the following is true:
- it has not been synchronized yet
- the time chunk covers the current moment
- the time chunk covers the period immediately before the current one (in case some data is loaded after the period ends)
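The three conditions above can be sketched as a small predicate. This is only an illustration, not Jitsu's implementation: it assumes fixed-length, half-open day chunks, and the names day_chunk, covers, should_sync, and synced_chunks are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: chunks are one day long; real drivers may use other granularities.
DAY = timedelta(days=1)


def day_chunk(moment):
    """Return the half-open [start, end) day chunk that contains `moment`."""
    start = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    return start, start + DAY


def covers(chunk, moment):
    """True if `moment` falls inside the half-open chunk."""
    start, end = chunk
    return start <= moment < end


def should_sync(chunk, now, synced_chunks, chunk_length=DAY):
    """Decide whether a chunk needs to be (re)synchronized."""
    if chunk not in synced_chunks:
        return True  # never synchronized before
    if covers(chunk, now):
        return True  # current period: new data may still arrive
    if covers(chunk, now - chunk_length):
        return True  # previous period: late data may have been loaded
    return False


# Usage: `synced_chunks` stands in for the state kept in meta storage.
now = datetime(2024, 3, 10, 15, 30, tzinfo=timezone.utc)
synced_chunks = {day_chunk(now - 3 * DAY), day_chunk(now - DAY), day_chunk(now)}
to_sync = [
    day_chunk(now - n * DAY)
    for n in range(4)
    if should_sync(day_chunk(now - n * DAY), now, synced_chunks)
]
```

In this sketch an old chunk that is already recorded as synchronized is skipped, while the current and previous chunks are always reloaded.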
The result of synchronization is a replica of the data from the data source, enriched with some additional fields.
- collection_id contains the type of collection (see documentation on collections below)
- $server.unique_id_field contains a hash of the synchronized object. The column name depends on the server.unique_id_field configuration (eventn_ctx_event_id by default)
- time_interval field stores information about the synchronization interval
- interval_start field stores information about the start of the synchronization interval
- interval_end field stores information about the end of the synchronization interval
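As a rough illustration of the enriched fields, the sketch below adds them to a single source record. It is not Jitsu's code: the enrich function, the MD5-over-JSON hash, and the string formats used for the interval values are all assumptions made for the example.

```python
import hashlib
import json


def enrich(record, collection_type, interval_name, interval_start, interval_end,
           unique_id_field="eventn_ctx_event_id"):
    """Return a copy of `record` with the enriched fields added."""
    # Assumption: the object hash is an MD5 digest of the JSON-serialized
    # record; the real hashing scheme may differ.
    payload = json.dumps(record, sort_keys=True, default=str).encode("utf-8")
    enriched = dict(record)  # leave the source record untouched
    enriched["collection_id"] = collection_type
    enriched[unique_id_field] = hashlib.md5(payload).hexdigest()
    enriched["time_interval"] = interval_name
    enriched["interval_start"] = interval_start
    enriched["interval_end"] = interval_end
    return enriched


# Usage with hypothetical values for the collection type and interval.
row = enrich(
    {"id": 1, "amount": 9.5},
    collection_type="payments",
    interval_name="DAY",
    interval_start="2024-03-10T00:00:00Z",
    interval_end="2024-03-11T00:00:00Z",
)
```

Because the hash is computed from the record's content, the same source object yields the same unique ID on every sync in this sketch.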
Re-sync
Once a source is synced, Jitsu writes the sync state to Redis, so subsequent syncs won't reload data that has already been synced. To re-sync a source:
- Call the Clear Cache endpoint
- Once Jitsu's cache is cleared, schedule a new sync task