The incremental key is a field that indicates the last update point (or value) for the rows in a data source. The benefits of using it in Panoply is to help speed up the data collection process. Panoply uses the incremental key's value to skip unchanged rows and fetch only the updated and new rows from the data source since the last successful data collection.
The incremental key is a feature you can configure on select Panoply data sources, specifically on most database data sources (Postgres, MySQL, MongoDB, etc.) and the Salesforce data source.
The Incremental Key article of the Panoply documentation provides more information about this feature.
Configuring an incremental key
To configure the incremental key for your data source, locate the Incremental Key field/option in its configuration page. It's usually under the Advanced tab (see screenshot on the previous section) or in the Advanced section of the General tab.
Next, depending on what data source you're using, you can either manually provide the field name or select a predefined one to use as the incremental key:
- For database data sources, you can specify any column from your origin table as the key. The suggested column type is either a timestamp or an incremental integer column, as these types are easily updated or incremented.
- For the Salesforce data source, you can only select the Updated Since field as the incremental key.
Lastly, you can also provide a starting value for your incremental key in the Optional starting value field. Panoply uses this field's value to determine the records to collect during data collection. The next section explains in detail the role of this field in the data collection process.
The data collection process
The steps below explain how Panoply collects data using an incremental key. To help you understand the process clearly, a sample table consisting of eight records is provided as an example. The table's ranked_at
column is used as the incremental key.
Once you've defined an incremental key and began collecting data, the data collection process is as follows:
-
Panoply first examines the Optional starting value field to determine what records get fetched from the origin table.
If no value is found in the Optional starting value field, all records from the origin table are collected.
If a value is present, Panoply uses it as the criteria or starting point for selecting what rows to fetch. The rows that will be fetched are those that have a higher or greater value in their incremental key column (
ranked_at
in our example) than the value in the Optional starting value field.Applying this in the sample records, having no optional starting value retrieves all eight records. However, if the starting value is, let's say,
15
, only five records are fetched because their incremental key column has a value higher than15
. - Once the relevant records are fetched from the origin table, Panoply saves them to your data source's destination table. These fetched records are either new ones or existing records that were updated since the last successful data collection, or a mixture of both. New records are inserted to the destination table while the updated ones are updated using the upsert mechanism.
- The data collection completes when all the fetched records are saved into the destination table without encountering any errors. The data source then keeps track of the last updated or maximum value of the incremental key field by storing it in the Optional starting value field. You will need to do a browser reload/refresh for the value to appear.
Based on our example, the value35
appears in the Optional starting value field because it is the maximum value among the fetched records.
Unless changed or edited, whatever the value in your data source's Optional starting value field during this point will be used for your next data collection.
NOTES:
- When using an incremental key with database data sources (Postgres, MySQL, etc.), you should only select/collect records from a single table.
- If you intend to provide an optional starting value, ensure that the format is correct, specifically for timestamp or date fields.
- The example we've provided uses a database data source for explaining the data collection process using an incremental key. Even so, the concept still applies to the Salesforce data source that uses a date/timestamp field as its incremental key.
Comments
0 comments
Please sign in to leave a comment.