<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>~iany/ Database</title><link>https://blog.iany.me/tags/database/</link><description>Recent content in Database «~iany/»</description><language>en-US</language><managingEditor>me@iany.me (Ian Yang)</managingEditor><webMaster>me@iany.me (Ian Yang)</webMaster><copyright>CC-BY-SA 4.0</copyright><lastBuildDate>Tue, 16 Jul 2013 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.iany.me/tags/database/index.xml" rel="self" type="application/rss+xml"/><item><title>ActiveRecord uniq, count and distinct</title><link>https://blog.iany.me/2013/07/active-record-uniq-count-and-distinct/</link><pubDate>Tue, 16 Jul 2013 00:00:00 +0000</pubDate><author>me@iany.me (Ian Yang)</author><guid>https://blog.iany.me/2013/07/active-record-uniq-count-and-distinct/</guid><description>&lt;p&gt;&lt;code&gt;ActiveRecord&lt;/code&gt; has two ways to remove duplicates: the method &lt;code&gt;uniq&lt;/code&gt; and the option &lt;code&gt;distinct: true&lt;/code&gt; of the method &lt;code&gt;count&lt;/code&gt;. I thought &lt;code&gt;uniq.count&lt;/code&gt; and &lt;code&gt;count(distinct: true)&lt;/code&gt; were identical. In fact, &lt;code&gt;uniq.count&lt;/code&gt; still counts duplicates, so &lt;code&gt;count(distinct: true)&lt;/code&gt; must be used to count unique records.&lt;/p&gt;
&lt;p&gt;Put simply, use &lt;code&gt;uniq&lt;/code&gt; to get a unique result set, and use &lt;code&gt;count(distinct: true)&lt;/code&gt; to count unique results.&lt;/p&gt;
&lt;p&gt;For example, a user has many activities, and I want to get all users who have activities of a specific type:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-ruby"&gt;users = User.joins(:activities).where(
  activities: { activity_type: 'purchase' }
)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because a user may have multiple activities of the same type, the result above may contain duplicate users. The method &lt;code&gt;uniq&lt;/code&gt; can be used here to remove the duplicates:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-ruby"&gt;users = users.uniq
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But &lt;code&gt;users.uniq.count&lt;/code&gt; generates SQL like the following:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;SELECT DISTINCT COUNT(*) ...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This SQL counts all records, duplicates included, and then applies &lt;code&gt;DISTINCT&lt;/code&gt; to the count, which is a single row. So &lt;code&gt;DISTINCT&lt;/code&gt; has no effect here.&lt;/p&gt;
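&lt;p&gt;As a rough illustration in plain Ruby rather than ActiveRecord, the order of the two steps is what matters. The array &lt;code&gt;joined_user_ids&lt;/code&gt; is a made-up stand-in for the user ids returned by the join, one per matching activity:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-ruby"&gt;# Hypothetical joined rows: user 1 has two matching activities and
# user 3 has three, so their ids repeat.
joined_user_ids = [1, 1, 2, 3, 3, 3]

# Like SELECT DISTINCT COUNT(*): count every row first, then apply
# DISTINCT to the one-row result, which changes nothing.
count_then_distinct = [joined_user_ids.size].uniq.first

# The alternative order: drop duplicate ids first, then count what is left.
distinct_then_count = joined_user_ids.uniq.size

puts count_then_distinct  # prints 6
puts distinct_then_count  # prints 3
&lt;/code&gt;&lt;/pre&gt;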
&lt;p&gt;On the other hand, &lt;code&gt;users.count(distinct: true)&lt;/code&gt; generates the SQL below, which removes duplicates first and then counts the result:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;SELECT COUNT(DISTINCT users.id) ...
&lt;/code&gt;&lt;/pre&gt;</description><category domain="https://blog.iany.me/post/">Posts</category><category domain="https://blog.iany.me/tags/active-record/">Active Record</category><category domain="https://blog.iany.me/tags/database/">Database</category><category domain="https://blog.iany.me/tags/rails/">Rails</category></item></channel></rss>