Singer-Based Sources
Singer is an open-source project that provides 100+ API connectors (so-called 'taps') to different platforms. Jitsu supports Singer as one of its connector backends (the others being Airbyte and native connectors).
Singer configuration is a set of JSON objects (see specification):
Name | Description |
---|---|
Config (required) | JSON object containing authorization keys, account IDs, and a start date (the date from which data is downloaded). The JSON structure depends on the tap. |
Catalog | JSON object containing all streams (object types) and fields to download. If not provided, Jitsu runs discovery and saves a catalog with all available streams. The JSON structure is standardized, but stream and field names depend on the tap. |
State | JSON payload containing bookmarks that specify an initial state. Use it when you need to download only part of the data. |
Properties | Deprecated. JSON object containing the stream and field schema, like Catalog. Used by some legacy taps (e.g. the Facebook tap). |
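As a rough illustration of the objects in the table above, the sketch below builds a minimal config, catalog, and state in Python and prints them as JSON. The config keys (api_key, start_date), the orders stream, and the bookmark layout are hypothetical placeholders, since the real names depend on the tap. The catalog entry follows the general shape of a Singer catalog (stream, tap_stream_id, schema, metadata).

```python
import json

# Hypothetical config: real key names depend on the tap.
config = {
    "api_key": "YOUR_API_KEY",              # placeholder credential
    "start_date": "2021-01-01T00:00:00Z",   # download data since this date
}


def make_stream(name, properties):
    """Build one catalog entry for a stream and mark it as selected."""
    return {
        "stream": name,
        "tap_stream_id": name,
        "schema": {"type": "object", "properties": properties},
        # An empty breadcrumb addresses the stream itself.
        "metadata": [{"breadcrumb": [], "metadata": {"selected": True}}],
    }


# Hypothetical catalog with a single stream and two fields.
catalog = {
    "streams": [
        make_stream(
            "orders",
            {
                "id": {"type": "integer"},
                "updated_at": {"type": "string", "format": "date-time"},
            },
        )
    ]
}

# Hypothetical state: the bookmark layout also depends on the tap.
state = {"bookmarks": {"orders": {"updated_at": "2021-06-01T00:00:00Z"}}}

for name, obj in (("config", config), ("catalog", catalog), ("state", state)):
    print(name, json.dumps(obj))
```

Each printed object could be saved to a file or pasted as a JSON string into the source configuration.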
General configuration
Jitsu runs Singer taps with Python inside a virtual environment, so python3 and venv must be installed on the machine running Jitsu. If you deploy Jitsu with Docker, you can skip this step.
Configurations:
singer-bridge:
  python: /path/to/python #Optional. Default value is 'python3'
  venv_dir: /path/to/venv_dir #Optional. Default value is './venv'
  log:
    path: /home/eventnative/logs #or "global" constant for writing logs to stdout
    rotation_min: 60 #Optional. Default value is 1440 (24 hours)
    max_backups: 5 #Optional. Default value is 0 (no limit)
To add a source, use the following snippet:
sources:
  ...
  jitsu_singer_facebook:
    type: singer
    destinations: [ "postgres_destination_id" ]
    config:
      tap: tap-facebook
      config: /home/eventnative/data/config/facebook_config.json
      properties: /home/eventnative/data/config/facebook_props.json
      initial_state: /home/eventnative/data/config/facebook_initial_state.json
  jitsu_singer_shopify:
    type: singer
    destinations: [ "clickhouse_destination_id" ]
    config:
      tap: tap-shopify
      config: '{"config_key1":"value"}'
      catalog: '{"field_1":"value"}'
JSON configuration parameters such as config, catalog, state and properties can each be given as raw JSON, a JSON string, or a path to a local JSON file.
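To make the three accepted forms concrete, here is a minimal Python sketch of how such a parameter could be normalized into a dictionary. It is an illustration only, not Jitsu's implementation: the helper name load_json_param is hypothetical, and so is the rule that a string starting with "{" or "[" is treated as JSON while any other string is treated as a file path.

```python
import json


def load_json_param(value):
    """Return parsed JSON from raw JSON, a JSON string, or a file path.

    Assumption for this sketch: a string that starts with '{' or '['
    is inline JSON; any other string is a path to a local JSON file.
    """
    # Raw JSON: the YAML parser has already produced a dict or list.
    if isinstance(value, (dict, list)):
        return value
    if not isinstance(value, str):
        raise TypeError(f"unsupported parameter type: {type(value).__name__}")

    text = value.strip()
    if text.startswith(("{", "[")):
        # JSON string, e.g. '{"config_key1":"value"}'
        return json.loads(text)

    # Otherwise treat the value as a path to a local JSON file.
    with open(value, encoding="utf-8") as f:
        return json.load(f)


print(load_json_param({"config_key1": "value"}))
print(load_json_param('{"config_key1":"value"}'))
```

Both calls print the same dictionary. Passing a path such as the facebook_config.json path from the snippet above would read and parse that file instead.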
Table Names
By default, Jitsu creates tables named $sourceID_$SingerStreamName. For instance, a table named jitsu_singer_shopify_orders will be created for the following configuration:
sources:
  ...
  jitsu_singer_shopify:
    type: singer
    destinations: [ "clickhouse_destination_id" ]
    config:
      tap: tap-shopify
      config: ...
      catalog: '{"streams":[{"stream": "orders", ...}]}'
Table names can be overridden by adding the stream_table_names configuration parameter:
sources:
  ...
  jitsu_singer_shopify:
    type: singer
    destinations: [ "clickhouse_destination_id" ]
    config:
      tap: tap-shopify
      config: ...
      catalog: '{"streams":[{"stream": "orders", ...}, {"stream": "products", ...}]}'
      stream_table_names:
        orders: my_orders
        products: my_products
Table names can also be overridden in the Singer catalog.json itself. Just add a destination_table_name string to each stream:
sources:
  ...
  jitsu_singer_shopify:
    type: singer
    destinations: [ "clickhouse_destination_id" ]
    config:
      tap: tap-shopify
      config: ...
      catalog: '{"streams":[{"stream": "orders", "destination_table_name":"my_orders", ...}, {"stream": "products", "destination_table_name":"my_products", ...}]}'
In both examples, tables named my_orders and my_products will be created.
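The naming rules in this section can be summarized with a short Python sketch. It is a hedged illustration rather than Jitsu's actual code: the function name resolve_table_names is hypothetical, and the precedence used here (stream_table_names first, then destination_table_name, then the default) is an assumption, since this page does not say which override wins when both are set.

```python
import json


def resolve_table_names(source_id, catalog, stream_table_names=None):
    """Map each stream in the catalog to its destination table name."""
    overrides = stream_table_names or {}
    names = {}
    for entry in catalog.get("streams", []):
        stream = entry["stream"]
        names[stream] = (
            overrides.get(stream)                    # stream_table_names override
            or entry.get("destination_table_name")   # override inside the catalog
            or f"{source_id}_{stream}"               # default: $sourceID_$SingerStreamName
        )
    return names


catalog = json.loads(
    '{"streams": [{"stream": "orders"},'
    ' {"stream": "products", "destination_table_name": "my_products"}]}'
)

print(resolve_table_names("jitsu_singer_shopify", catalog, {"orders": "my_orders"}))
# {'orders': 'my_orders', 'products': 'my_products'}

print(resolve_table_names("jitsu_singer_shopify", catalog))
# {'orders': 'jitsu_singer_shopify_orders', 'products': 'my_products'}
```

The second call shows the default naming for orders, because no override is given for that stream.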