What is the use of referential integrity in tableau?
Tableau can handle a large amount of data and I’m pretty sure that at one point or another we all have become frustrated at the amount of time it takes the Tableau reports to load. Referential Integrity is one way in which the Tableau processing time can be reduced.
What is referential integrity?
Referential integrity is a concept which ensures that relationships between databases/tables remain consistent, i.e. it ensures that the references to data are valid. In a database, a relationship between two tables is created by using a foreign key. The table containing the primary key is the referencing table and the table containing the foreign key is the referenced table. There are certain techniques that can be applied to ensure that changes made to the linked table are reflected in the primary table.
Referential integrity in Tableau:
In Tableau workbooks, containing data sources with more than one table, the performance of queries can be improved by selecting the option to Assume Referential Integrity from the Datamenu. On selecting this option, Tableau will include the joined tables in the query only if the fields in the table are referenced in the view. The Assume Referential Integrity option in Tableau can only affect the performance from Tableau’s end.
Let’s see how referential integrity actually works in Tableau. Consider two tables Sales and Product Catalog, as given below:
Since, every product that is sold must have a listing in the Product Catalog, every row in the Sales table should have a matching row in the Product Catalog table.
If we join the above two tables on Product ID, we will get the following joined table:
Assume that the data in the above table is for the sales in a particular region, say ‘India’. Hence, if we are trying to build a view to show the total sales amount by region, then for the region ‘India’ the total sales will be the sum of ‘Sale Amount’ in the above table.
When we drag and drop, ‘Sale Amount’ field into the view, the query created will be like:
SELECT SUM ([Sales Amount]) FROM [Sales] S INNER JOIN [Product Catalog] P ON S.ProductID = P.ProductID
Now, if we select the option Assume Referential Integrity, we are telling Tableau that the joined tables have referential integrity. Because of this, Tableau understands that it does not need any rows from the Product Catalog table and hence modifies the query to obtain the total sales amount from the Sales table. The modified query may look like this:
SELECT SUM ([Sales Amount]) FROM [Sales]
The modified query does not have any joins, thereby reducing the query processing time. Referential Integrity does not work with data sources that contain only one table.
Steps to implement referential integrity in Tableau:
Step 1: Select a data source for which you would like Tableau to assume referential integrity
Step 2: From the Data menu, select Assume Referential Integrity
To find clusters in a view in Tableau, follow these steps.
Create a view.
Drag Cluster from the Analytics pane into the view, and drop it on in the target area in the view:
You can also double-click Cluster to find clusters in the view.
Two things happen when you drop or double-click Cluster:
Tableau adds Clusters on Color, and colors the marks in your view by cluster.
Tableau displays a Clusters dialog box, where you can customize the cluster.
Customize the cluster results by doing either of the following in the Clusters dialog box:
Drag new fields from the Data pane into the Variables area of the Clusters dialog box.
When you add variables, measures are aggregated using the default aggregation for the field; dimensions are aggregated using ATTR, which is the standard way that Tableau aggregates dimensions.
Specify the number of clusters. If you do not specify a value, Tableau will go as high as 25 clusters in trying to determine the number of clusters. If you specify a value, you can choose any number between 2 and 50.
When you finish customizing the cluster results, click the X in the upper-right corner of the Clusters dialog box to close it:
Note: You can move the cluster field from Color to another shelf in the view. However, you cannot move the cluster field from the Filters shelf to the Data pane.
To edit Clusters you have previously created, right-click (Control-click on a Mac) the Clusters field on Color and choose Edit clusters.
For an example showing the process of creating clusters with sample data (world economic indicators),