This is the Jitsu.Classic documentation. For the latest version, please visit docs.jitsu.com. Read about differences here.
Apify Dataset
Overview
Apify is a web scraping and web automation platform that provides ready-made and custom solutions, an open-source SDK for web scraping, proxies, and many other tools to help you build and run web automation jobs at scale. The results of a scraping job are usually stored in an Apify Dataset. This connector automatically syncs the contents of a dataset to your chosen destination. To sync data from a dataset, all you need to know is its ID, which you will find in the Apify console under storages.
The source uses the Airbyte Docker image (@airbyte/source-apify-dataset). Learn more about how Airbyte-based sources work.
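
Before configuring the source, it can help to confirm that a dataset ID points at the data you expect. The sketch below only builds the URL of the public Apify API endpoint that lists dataset items, so you can open it in a browser or pass it to an HTTP client. The helper name `dataset_items_url` and the `limit` default are illustrative assumptions, not part of Jitsu or the Airbyte connector, and depending on the dataset's access settings the request may also need an Apify API token.

```python
from urllib.parse import quote, urlencode

# Base URL of the public Apify API (v2).
APIFY_API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, limit: int = 5) -> str:
    """Build the URL that lists the first `limit` items of a dataset.

    `dataset_items_url` is a hypothetical helper for illustration only.
    """
    dataset_id = dataset_id.strip()
    if not dataset_id:
        raise ValueError("dataset ID must not be empty")
    # Ask for JSON output and only a handful of items for a quick preview.
    query = urlencode({"format": "json", "limit": limit})
    return f"{APIFY_API_BASE}/datasets/{quote(dataset_id, safe='')}/items?{query}"


if __name__ == "__main__":
    # Replace the placeholder with the ID copied from the Apify console.
    print(dataset_items_url("YOUR_DATASET_ID"))
```

If the URL returns the items you expect, the same ID can be used as the `datasetId` connection parameter.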
How to connect
Obtain Apify Dataset ID.
Connection Parameters
Parameter | Documentation |
---|---|
datasetId string (required) | ID of the dataset you would like to load to Airbyte. |
clean boolean (not required) | If set to true, only clean items are downloaded from the dataset. See the Apify API docs for a description of what clean means. If you are not sure, set clean to false. |
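
The two parameters above can be checked before they are saved in a source configuration. The following is a minimal sketch, assuming the parameters arrive as a plain dictionary. The function name `normalize_params` is hypothetical and is not part of Jitsu or the Airbyte connector. It defaults `clean` to false, following the advice in the table.

```python
from typing import Any, Dict


def normalize_params(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Validate the connection parameters and fill in defaults.

    Hypothetical helper for illustration; it mirrors the parameter table only.
    """
    dataset_id = raw.get("datasetId")
    # datasetId is the only required parameter.
    if not isinstance(dataset_id, str) or not dataset_id.strip():
        raise ValueError("datasetId is required and must be a non-empty string")

    # clean is optional; fall back to False when it is not provided.
    clean = raw.get("clean", False)
    if not isinstance(clean, bool):
        raise ValueError("clean must be a boolean")

    return {"datasetId": dataset_id.strip(), "clean": clean}


if __name__ == "__main__":
    # Placeholder ID; use the value from the Apify console.
    print(normalize_params({"datasetId": "YOUR_DATASET_ID"}))
```

Rejecting a non-boolean `clean` value early avoids ambiguity between strings such as "false" and the boolean false.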