ActiveRecord uniq, count and distinct

ActiveRecord has two methods to remove duplicates. Method uniq and option distinct: true in method count. I thought uniq.count and count(distinct: true) were identical. Indeed, uniq.count still counts duplicates, and count(distinct: true) must be used here.

In simple words, use uniq to get unique result set, use count(distinct: true) to count unique result.

For example, user has many activities, and I want to get all users having a specific type of activities:

users = User.joins(:activities).where(
  activities: { activity_type: 'purchase'}
)

Because a user may have multiple activities with the same type, the result above may contain duplicate users. Method uniq can be used here to remove the duplicates:

users = users.uniq

But users.uniq.count generates SQL like below:

SELECT DISTINCT COUNT(*) ...

This SQL counts all records with duplicates, and apply DISTINCT on the count, which has only one row. So DISTINCT has no effect here.

On the other hand, users.count(distinct: true) generates SQL below, which removes duplicates first, then count the result.

SELECT COUNT(DISTINCT users.id) ...
comments powered by Disqus