To join or combine data from the main table with its subtables, you need to use the SQL JOIN
clause in your SELECT
queries. For example, if you have two tables named users
and users_address
, and would like to combine their data, you can run a query like this:
SELECT * FROM users JOIN users_phonenumbers ON users.id = users_address.users_id
The SQL join works because the primary key of the users
table (main/parent table), i.e., id
, matches the foreign key of the users_address
table (subtable/child table), i.e., users_id
. Therefore, to properly execute a join, you must know the main table's primary key and its matching foreign key in its subtables.
Why join tables?
The most common use case for joining the main table with its subtables in Panoply is to combine their data altogether. You might have tables that have been created from a data source that supports nested data. Some examples of these data sources are document-oriented NoSQL databases like MongoDB and Amazon DynamoDB, JSON files, or API-based data sources like Salesforce, Shopify, etc.
A primer on how Panoply deals with nested data
For example, let's say this is the data (JSON) that Panoply will ingest from your data source:
{ "firstName":"John", "lastName":"Smith", "isAlive":true, "age":27, "address":{ "streetAddress":"21 2nd Street", "city":"New York", "state":"NY", "postalCode":"10021-3100" }, "phoneNumbers":[ { "type":"home", "number":"212 555-1234" }, { "type":"office", "number":"646 555-4567" } ], "children": [], "spouse":null }
After ingesting, Panoply will create a total of three tables. One main table for the single object and two subtables for its address
and phoneNumbers
attributes.
Say the destination table for the nested data is specified as users
, then the names of the generated tables will be:
users
(main table);users_address
; andusers_phonenumbers
.
All you need to do next is to review the structure of these tables to determine what columns to use in your SQL join. By convention, the primary key column of the main table is named id
, and will also be the foreign key in its subtables, using this naming convention:
main/parent table name + "_id" suffix
In our example, the foreign key in the subtables is named users_id
.
Lastly, for a detailed guide on how Panoply deals with nested data, you can read these articles from the Panoply documentation:
Comments
0 comments
Please sign in to leave a comment.