Grouping a timestamp field by date in Ruby On Rails / PostgreSQL

Recently I’ve been working a lot with dates and tonight I had a hard time finding information about how to group a model by date when the model only have a timestamp column.

Lets pretend we have a user model with a created_at attribute which is stored as a timestamp.

If we want to plot a simple graph showing signups per date this is the way to extract them:

1
User.count(:group => "DATE(created_at)")

Thats all there is to it. From this you will get an OrderedHash containing the date and count of users for each date.

One small thing to remember if you are using PostgreSQL. You will need the DATE() function in any ORDER/SELECT statements because PostgreSQL will only select/order by fields thats in the GROUP BY statement and this is a pretty healthy behavior that many databases (no names here) don’t care about. Think about it. If you are grouping all users by date and use * as selector how would the database know which username or email address to display?

This is what happens if you try to group by created_at without DATE() ->

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
>> User.count(:order => 'created_at DESC', :group => ["DATE(created_at)"])                                                
ActiveRecord::StatementInvalid: PGError: ERROR:  column "users.created_at" must appear in the GROUP BY clause or be used in an aggregate function
: SELECT count(*) AS count_all, DATE(created_at) AS date_created_at FROM "users"  GROUP BY DATE(created_at)  ORDER BY created_at DESC
        from /tmp/app/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract_adapter.rb:212:in `log'
        from /tmp/app/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:507:in `
execute'
        from /tmp/app/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:985:in `select_raw'

        from /tmp/app/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:972:in `select'
        from /tmp/app/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb:7:in `
select_all_without_query_cache'
        from /tmp/app/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/query_cache.rb:62:in `select_all'

        from /tmp/app/vendor/rails/activerecord/lib/active_record/calculations.rb:255:in `execute_grouped_calculation'
        from /tmp/app/vendor/rails/activerecord/lib/active_record/calculations.rb:132:in `
calculate'
        from /tmp/app/vendor/rails/activerecord/lib/active_record/calculations.rb:130:in `catch'

        from /tmp/app/vendor/rails/activerecord/lib/active_record/calculations.rb:130:in `calculate'
        from /tmp/app/vendor/rails/activerecord/lib/active_record/calculations.rb:48:in `
count'
        from (irb):18

And this is the right way to do it:

1
2
>> User.count(:order => 'DATE(created_at) DESC', :group => ["DATE(created_at)"])
=> #<OrderedHash {"2009-10-14"=>22, "2009-10-25"=>4, "2009-11-04"=>8, "2009-10-15"=>9, "2009-10-26"=>16, "2009-11-05"=>9, "2009-10-16"=>193, "2009-10-27"=>14, "2009-11-06"=>9, "2009-10-17"=>49, "2009-10-28"=>15, "2009-11-07"=>6, "2009-10-18"=>36, "2009-10-29"=>8, "2009-10-19"=>116, "2009-10-30"=>15>

As you may notice the Hash does not look ordered. But if you loop over it you will get them in order:

1
2
3
4
5
6
7
8
9
10
11
12
>> User.count(:order => 'DATE(created_at) DESC', :group => ["DATE(created_at)"]).each {|u| puts "#{u[0]} -> #{u[1]}" }
2009-11-07 -> 6
2009-11-06 -> 9
2009-11-05 -> 9
2009-11-04 -> 8
2009-11-03 -> 14
2009-11-02 -> 20
2009-11-01 -> 10
2009-10-31 -> 6
2009-10-30 -> 15
2009-10-29 -> 8
..

And thats how you group a timestamp field with just its date part.

Cheers!

Posted in Databases, PostgreSQL, Rails, Ruby at November 8th, 2009. Trackback URI: trackback Tags: , , , , Written by: 
  • sarvesh

    Hi,

    I am new to postgresql.
    It saved my time. Thanks for the post.

    Regards
    Sarvesh.N

  • Koen

    Hi,

    This looks great but what happens if you don’t have a signup on a given day?; the query should return ’0′ for that date but I’m afraid that with your example that’s not going to work. Or am I missing something?

  • Mathias Stjernström

    Hi Koen.

    You are not missing something and you are correct. I usually solve this in the presentation layer, letting the database just return the data it has. I’am not even sure you could solve this without some kind of loop or subquery for each date, but I think its faster to solve it while presenting the data.

    Thanks.

It's past my bedtime is using WP-Gravatar