Issues with SQL query using SUM
Issues with SQL query using SUM
The relevant part of my database schema looks like this (Ruby on Rails migration code, but should be easy to read):
create_table "team_memberships" do |t| t.integer "team_id" t.integer "user_id" end create_table "users" do |t| t.integer "id" t.string "slug" end create_table "performance_points" do |t| t.integer "id" t.integer "user_id", t.date "date", t.integer "points", t.integer "team_id" end
I want a query that returns a list of users sorted by the total amount of performance points they have received since a certain date. Note that one "performance_points" row does not equal one point, we need to sum the "points"
The query I have so far looks like this:
SELECT u.id, u.slug, SUM(pp.points) AS total FROM users u JOIN performance_points pp ON pp.user_id = u.id JOIN team_memberships tm ON tm.team_id = pp.team_id AND tm.user_id = pp.user_id WHERE (pp.date > '2015-08-02 13:57:14.042221') GROUP BY pp.id, u.id ORDER BY total DESC LIMIT 50
The first three results are:
"id","slug","total" 32369,"andreas-jensen-9de10dec-0f88-427f-b135-62cebea611c8",245 23752,"kenneth-kjaerstad",95 34179,"marius-mork-rydal",93
To check that results are correct I count the points for each user. However the second one seems to be wrong. I run this query with Kenneth's id:
SELECT SUM(performance_points.points) FROM performance_points WHERE performance_points.user_id = 23752 AND (date > '2015-08-02 13:57:14.042221')
I get: 84
. Looking at all Kenneth's performance points with:
SELECT performance_points.points FROM performance_points WHERE performance_points.user_id = 23752 AND (date > '2015-08-02 13:57:14.042221')
We get:
"points" -10 1 -2 95
-10 + 1 - 2 + 95 is indeed 84 so I dunno whats going on with the first query. Why is the total 95?
I'm running PostgreSQL version 9.3.5
Answer by minatverma for Issues with SQL query using SUM
Try below query and let us know the answer, if it works :
SELECT u.id, u.slug, SUM(pp.points) AS total FROM users u INNER JOIN (select user_id,date,team_id, SUM(points) as points from performance_points group by user_id,date,team_id) pp ON pp.user_id = u.id INNER JOIN (select team_id, user_id from team_memberships group by team_id, user_id) tm ON tm.team_id = pp.team_id AND tm.user_id = pp.user_id WHERE (pp.date > '2015-08-02 13:57:14.042221') GROUP BY u.id, u.slug ORDER BY total DESC LIMIT 50 ;
Answer by Hambone for Issues with SQL query using SUM
Without seeing all of your data it's a little hard to guess, but maybe a CTE to pre-process the performance points would do it:
with pp_totals as ( select user_id, sum (points) as points from performance_points where date > '2015-08-02 13:57:14.042221' group by user_id ) SELECT u.id, u.slug, pp.points AS total FROM users u JOIN pp_totals pp ON pp.user_id = u.id JOIN team_memberships tm ON tm.user_id = u.user_id ORDER BY pp.points DESC limit 50
If this doesn't do it, can you create a SQL Fiddle and post it to your question?
Answer by philipxy for Issues with SQL query using SUM
If slug
is unique per user:
SELECT u.id, u.slug, SUM(pp.points) AS total FROM users u JOIN performance_points pp ON u.id = pp.user_id WHERE pp.date > '2015-08-02 13:57:14.042221' GROUP BY u.id, u.slug ORDER BY total DESC LIMIT 50
Otherwise you can't SELECT
slug
because it's not a grouping column, so there are multiple values of it in each group. You want to GROUP BY user_id
in performance_points
to get total
per user_id
then JOIN
with users
to get slug
s.
SELECT id, slug, total FROM users JOIN ( SELECT user_id, SUM(points) AS total FROM performance_points WHERE date > '2015-08-02 13:57:14.042221' GROUP BY user_id) t ON id = user_id ORDER BY total DESC LIMIT 50
(It's not clear why you are JOIN
ing with team_membership
. Presumably performance_points (user_id,team_id)
is a foreign key into it, ie all such pairs are already in it.)
Answer by shawnt00 for Issues with SQL query using SUM
I took your query and added a filter to limited to a single user. You should now see four rows for user kenneth-kjaerstad:
SELECT u.id, u.slug, SUM(pp.points) AS total FROM users u JOIN performance_points pp ON pp.user_id = u.id JOIN team_memberships tm ON tm.team_id = pp.team_id AND tm.user_id = pp.user_id WHERE pp.date > '2015-08-02 13:57:14.042221' and u.id = 23752 GROUP BY pp.id, u.id
The problem was that the sort pushed all the other rows way down this list and you never saw that there were three others for him besides the one at the top of the ranking.
The reason is that your grouping is wrong as you just want a total per user. pp.id
should in fact be unique for every row in your results and it's pointless to have a group by on that column at all.
Also I'll note that there doesn't seem to be a purpose in your join to the team_memberships
table unless you need to guarantee that a team membership exists for each pairs of user and team ids from the points table. Here's the fix:
SELECT u.id, min(u.slug) as slug, SUM(pp.points) AS total FROM users u JOIN performance_points pp ON pp.user_id = u.id JOIN team_memberships tm ON tm.team_id = pp.team_id AND tm.user_id = pp.user_id WHERE pp.date > '2015-08-02 13:57:14.042221' GROUP BY u.id ORDER by total desc
This answer is essentially equivalent to @philipxy and @Hambone's. As you can see it's not strictly necessary to use some of the constructs they chose. Hopefully my explanation of what went wrong is helpful whichever approach you prefer.
Answer by David Pedersen for Issues with SQL query using SUM
I have discovered that there actually wasn't a problem with the query, but with the data. There were some users who were on multiple teams more than once and that have issues.
Fatal error: Call to a member function getElementsByTagName() on a non-object in D:\XAMPP INSTALLASTION\xampp\htdocs\endunpratama9i\www-stackoverflow-info-proses.php on line 72
0 comments:
Post a Comment