Understand the trade-offs between using a live connection and an extract

Understand the trade-offs between using a live connection and an extract

Understand the trade-offs between using a live connection and an extract

EXPLANATION

The purpose of this chapter is to guide you through a connection and data source management strategy. A strategy like this tries to answer a fairly simple question: for a given scenario, should your users access live data or should they use extracts?

First of all, some data sources will not allow extracts—they will only allow live connections. Obviously, if that’s the case, you don’t have to make a decision: use the live connection.

That easy choice aside, there’s a long answer to this simple question. As you experiment with different approaches and learn more about the many variables that have an impact data performance, access, freshness, and the ins and outs of specific databases, you’ll formulate an answer that works for your users in your organization.

However, we understand that you need to get your users connected to data today. So in this section we provide some guiding principles that you can use to make sound data access decisions as you roll out your shiny new Tableau Server.

Guiding principle: If performance is more important than data freshness, use an extract
Extracts are great for enabling flow for your data analysts. When an extract is embedded in a workbook, all of the data is already available to Tableau Server, which stores the extract in a high-performance database. This generally results in good performance. When users drag dimensions and measures, apply filters, and add visualizations, they see the results immediately. Because users are interacting with a snapshot of data and are not working directly with live data, the underlying source of data is not taxed as users analyze and visualize the data in Tableau.

A note about using Tableau Server for warehousing: if the workbooks that people in your organization are using are really hitting a database hard with repeated queries for fresh data, you might be tempted to use Tableau Server to host extracts in an attempt to offload queries from the relational databases that people are using. Generally, we don’t recommend using extracts just to offload queries. This isn’t an economical use of Tableau Server, which is designed for data analysis, not data warehousing. If you find that users are creating a lot of extracts because performance suffers when they use live connections to data, you should consider performance optimizations at the database rather than warehousing extracts on Tableau Server.

Guiding principle: If real-time data is required for business decisions, use a live connection
Many data analysis scenarios require real-time data. For example, finance operations that model transactions during trading hours usually require real-time data. Similarly, polling scenarios often require near real-time data freshness to provide quick analysis. Generally, if the data analyses that your users are working on require data freshness that is measured in minutes or seconds, workbooks should be built using a live connection.

Extracts can be refreshed frequently, but as we explained earlier, these updates can be processor-intensive and can slow the performance of the server. At the same time, heavy use of live connections, especially with complex workbooks, can stress traditional databases. Therefore, you’ll need to make sure that the Tableau Server processes are appropriately scaled for heavy use of live connections and your databases are up to the task of the query load from Tableau Server. (For information about tuning your server, see the Alerts, Monitoring, and Tuning chapter.)

Remember that the results from queries using live connections might be cached on the server. Therefore, if you require true real-time data, make sure that you configure the cache to refresh more often, as we discussed earlier under Configure data connection caching.

Guiding principle: If a workbook contains sensitive data, use a live connection
As we were saying earlier, you must decide whether you’ll allow users to embed credentials in workbooks and data sources when they publish. Your organization’s security and privacy policy should dictate whether you allow users to embed credentials.

If your organization enforces user-level permissions to databases, use a live connection for workbooks that connect to those databases. That way, users who interact with workbooks and data sources that require authentication will be prompted for credentials. For data sources that allow Run As User access, such as SQL Server, Microsoft Analysis Services, and Oracle, make sure that you’ve configured the Run As User account with appropriate access to the database resources.

Compare the performance of extracts and live connections
People often ask which is faster: an extract or a live connection? If you’ve read all the way through this, you understand that the answer is “it depends.”

In the end, the best way to answer this question is to build a workbook with a live connection to your database. In most cases, the performance differences are obvious as you build your workbook and view the results.

For more in-depth analysis, Tableau includes tools (more information below) that you can use to measure workbook performance on both Tableau Server and Tableau Desktop. Use those tools to profile the performance of the workbook that uses the live connection. When you’ve got that data, change the workbook to use an extract and then measure performance again.

 

Understand the trade-offs between using a live connection and an extract

Leave a Reply

Your email address will not be published. Required fields are marked *