Approximate COUNT DISTINCT

We have all written queries that use COUNT(DISTINCT ...) to get the number of unique non-NULL values in a column. This can carry a noticeable performance hit, especially on larger tables with millions of rows, and many times there is no way around it. To help mitigate this overhead, SQL Server 2019 introduces the new APPROX_COUNT_DISTINCT function, which approximates the distinct count. According to Microsoft's documentation, the implementation guarantees up to a 2% error rate within a 97% probability, and it typically returns in a fraction of the time.

Let’s see this in action.
In this example, I am using the AdventureworksDW2016CTP3 sample database, which is available for download.
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;  -- required for the "SQL Server Execution Times" output shown below
    SELECT COUNT(DISTINCT [SalesOrderNumber]) AS DISTINCTCOUNT
    FROM [dbo].[FactResellerSalesXL_PageCompressed];


SQL Server Execution Times
CPU time = 3828 ms, elapsed time = 14281 ms.
    SELECT APPROX_COUNT_DISTINCT([SalesOrderNumber]) AS APPROX_DISTINCTCOUNT
    FROM [dbo].[FactResellerSalesXL_PageCompressed];


SQL Server Execution Times
CPU time = 7390 ms, elapsed time = 4071 ms.

[Image: APPROX_COUNT_DISTINCT Function In SQL]

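You can see the elapsed time is significantly lower! It dropped from 14281 ms to 4071 ms, roughly 3.5 times faster, although CPU time rose from 3828 ms to 7390 ms in this run. Still, a great improvement using this new function.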
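To check how close the approximation lands on your own data, one option is to compute both counts in a single statement and look at the relative difference. This is only a minimal sketch against the same table used above; the aliases ExactCount, ApproxCount, and PctDifference are names I picked for illustration.

```sql
-- Sketch: compare the exact and approximate distinct counts side by side.
-- Assumes SQL Server 2019 or later and the sample table used in this post.
SELECT
    x.ExactCount,
    x.ApproxCount,
    -- Relative difference in percent; NULLIF avoids a divide-by-zero on an empty table.
    100.0 * ABS(x.ApproxCount - x.ExactCount) / NULLIF(x.ExactCount, 0) AS PctDifference
FROM (
    SELECT
        COUNT(DISTINCT [SalesOrderNumber])        AS ExactCount,
        APPROX_COUNT_DISTINCT([SalesOrderNumber]) AS ApproxCount
    FROM [dbo].[FactResellerSalesXL_PageCompressed]
) AS x;
```

Running both aggregates in one statement is handy as a sanity check on accuracy, but it tells you nothing about timing. For that, run the two queries separately as shown earlier.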

The first time I did this, I did it wrong. A silly typo with a major result difference. So take a moment and learn from my mistake.

Note that I use COUNT(DISTINCT SalesOrderNumber), not DISTINCT COUNT(SalesOrderNumber). This makes all the difference. If you write it the wrong way, the numbers will be way off, as you can see from the result set below. You'll also find that the APPROX_COUNT_DISTINCT query returns much more slowly than the DISTINCT COUNT query, which is not what you would expect.

[Image: APPROX_COUNT_DISTINCT Function In SQL]

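Remember, COUNT(DISTINCT expression) evaluates the expression for each row in a group and returns the number of unique, non-NULL values, which is what APPROX_COUNT_DISTINCT approximates. SELECT DISTINCT COUNT(expression), on the other hand, counts every non-NULL value of the expression, duplicates included. The DISTINCT there applies only to the rows of the result set, which is a single row here, so there is nothing distinct about the count.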
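The difference is easy to see on a tiny data set. Here is a small sketch using a table variable; @Orders and its six rows are made up for illustration and are not part of the sample database.

```sql
-- Hypothetical data: three unique order numbers, two duplicates, and one NULL.
DECLARE @Orders TABLE (SalesOrderNumber varchar(20) NULL);

INSERT INTO @Orders (SalesOrderNumber)
VALUES ('SO1'), ('SO1'), ('SO2'), ('SO3'), ('SO3'), (NULL);

-- Counts unique non-NULL values (SO1, SO2, SO3): returns 3.
SELECT COUNT(DISTINCT SalesOrderNumber) AS DistinctCount
FROM @Orders;

-- Counts all non-NULL values, duplicates included, then applies DISTINCT
-- to the single result row: returns 5.
SELECT DISTINCT COUNT(SalesOrderNumber) AS NotReallyDistinctCount
FROM @Orders;
```

The second query is legal T-SQL, which is why the typo is so easy to miss: it runs without complaint and simply answers a different question.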
