Selecting values from column defined by aggregate function
I'm currently struggling with a query and need some help with it.
I've got two tables:
messages {
ts_send,
message,
conversations_id
}
conversations {
id
}
I want to select the message with the latest ts_send from each conversation. So if I have 3 conversations, I would end up with 3 messages.
I started writing the following query, but I got confused about how I should compare the max(ts_send) for each conversation.
SELECT c.id, message, max(ts_send) FROM messages m
JOIN conversations c ON m.conversations_id = c.id
WHERE c.id IN ('.implode(',', $conversations_ids).')
GROUP by c.id
HAVING max(ts_send) = ?';
Maybe the query is wrong in general, just wanted to share my attempt.
Solution 1:[1]
SELECT c.id, m.message, m.ts_send
FROM conversations c LEFT JOIN messages m
ON c.id = m.conversations_id
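AND m.ts_send =
(SELECT MAX(m2.ts_send)
FROM messages m2
WHERE m2.conversations_id = m.conversations_id)
The LEFT JOIN ensures that you have a row for each conversation, whether it has messages or not. For that to hold, the latest-message condition has to sit in the ON clause: in a WHERE clause it would discard the NULL rows of empty conversations again. If every conversation in your model has at least one message, the join is unnecessary. In that case: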
SELECT m.conversations_id, m.message, m.ts_send
FROM messages m
WHERE m.ts_send =
(SELECT MAX(m2.ts_send)
FROM messages m2
WHERE m2.conversations_id = m.conversations_id)
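To see what the placement of the latest-message condition does to conversations without messages, here is a minimal sketch that runs the LEFT JOIN query twice on made-up data, once with the condition in WHERE and once in ON. It uses Python's sqlite3 module with an in-memory database as a stand-in for MySQL, which is an assumption made only to keep the example runnable; the sample rows and the name latest are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE conversations (id INTEGER PRIMARY KEY);
    CREATE TABLE messages (ts_send INTEGER, message TEXT, conversations_id INTEGER);
    INSERT INTO conversations VALUES (1), (2), (3);
    INSERT INTO messages VALUES
        (10, 'hello', 1), (20, 'latest in 1', 1),
        (15, 'only one in 2', 2);  -- conversation 3 has no messages
""")

# Correlated subquery: latest ts_send within the current row's conversation.
latest = """
    (SELECT MAX(m2.ts_send) FROM messages m2
     WHERE m2.conversations_id = m.conversations_id)
"""

# Condition in WHERE: the NULL row produced for conversation 3 is filtered out.
in_where = conn.execute(f"""
    SELECT c.id, m.message, m.ts_send
    FROM conversations c LEFT JOIN messages m ON c.id = m.conversations_id
    WHERE m.ts_send = {latest}
    ORDER BY c.id
""").fetchall()

# Condition in ON: conversation 3 is kept, with NULL message columns.
in_on = conn.execute(f"""
    SELECT c.id, m.message, m.ts_send
    FROM conversations c LEFT JOIN messages m
      ON c.id = m.conversations_id AND m.ts_send = {latest}
    ORDER BY c.id
""").fetchall()

print(in_where)
print(in_on)
```

With this data the WHERE variant returns two rows, while the ON variant also returns (3, None, None) for the empty conversation.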
Solution 2:[2]
MySQL has historically optimised JOINs much better than correlated subqueries, so I'll walk through the join approach.
The first step is to get the maximum ts_send per conversation:
SELECT conversations_id, MAX(ts_send) AS ts_send
FROM messages
GROUP BY conversations_id;
You then need to JOIN this back to the messages table to get the actual message. Joining on both conversations_id and the MAX(ts_send) value ensures that only the latest message is returned for each conversation:
SELECT messages.conversations_id,
messages.message,
messages.ts_send
FROM messages
INNER JOIN
( SELECT conversations_id, MAX(ts_send) AS ts_send
FROM messages
GROUP BY conversations_id
) MaxMessage
ON MaxMessage.conversations_id = messages.conversations_id
AND MaxMessage.ts_send = messages.ts_send;
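One thing worth knowing about the join-back approach is how it behaves when two messages in a conversation share the same latest ts_send. The sketch below runs the query above on invented data to show this; Python's sqlite3 module is used as a stand-in for MySQL, which is an assumption for the sake of a runnable example.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (ts_send INTEGER, message TEXT, conversations_id INTEGER);
    INSERT INTO messages VALUES
        (10, 'a', 1), (20, 'b', 1),
        (30, 'c', 2), (30, 'd', 2);  -- two messages share the latest ts_send
""")

# Join each message to the per-conversation maximum; only matching rows survive.
rows = conn.execute("""
    SELECT messages.conversations_id, messages.message, messages.ts_send
    FROM messages
    INNER JOIN
    (   SELECT conversations_id, MAX(ts_send) AS ts_send
        FROM messages
        GROUP BY conversations_id
    ) MaxMessage
        ON MaxMessage.conversations_id = messages.conversations_id
        AND MaxMessage.ts_send = messages.ts_send
    ORDER BY messages.conversations_id, messages.message
""").fetchall()

print(rows)
```

Conversation 2 comes back twice here, once for 'c' and once for 'd'. If ts_send is not unique within a conversation, you would need an extra tie-breaker column to get exactly one row per conversation.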
The above should get you what you are after, unless you also need conversations returned where there have been no messages, in which case you will need to select from conversations and LEFT JOIN to the above query:
SELECT conversations.id,
COALESCE(messages.message, 'No Messages') AS Message,
messages.ts_send
FROM conversations
LEFT JOIN
( SELECT messages.conversations_id,
messages.message,
messages.ts_send
FROM messages
INNER JOIN
( SELECT conversations_id, MAX(ts_send) AS ts_send
FROM messages
GROUP BY conversations_id
) MaxMessage
ON MaxMessage.conversations_id = messages.conversations_id
AND MaxMessage.ts_send = messages.ts_send
) messages
ON messages.conversations_id = conversations.id;
EDIT
The latter option of selecting all conversations regardless of whether they have a message would be better achieved as follows:
SELECT conversations.id,
COALESCE(messages.message, 'No Messages') AS Message,
messages.ts_send
FROM conversations
LEFT JOIN messages
ON messages.conversations_id = conversations.id
LEFT JOIN
( SELECT conversations_id, MAX(ts_send) AS ts_send
FROM messages
GROUP BY conversations_id
) MaxMessage
ON MaxMessage.conversations_id = messages.conversations_id
AND MaxMessage.ts_send = messages.ts_send
WHERE messages.ts_send IS NULL
OR MaxMessage.ts_send IS NOT NULL;
Thanks go to spencer7593, who suggested the above solution.
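The WHERE clause in that query is easier to follow when you look at the joined rows before they are filtered. The sketch below prints them first and then runs the full query, using Python's sqlite3 module as a stand-in for MySQL (an assumption made for runnability) and invented sample rows.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE conversations (id INTEGER PRIMARY KEY);
    CREATE TABLE messages (ts_send INTEGER, message TEXT, conversations_id INTEGER);
    INSERT INTO conversations VALUES (1), (2), (3);
    INSERT INTO messages VALUES (10, 'a', 1), (20, 'b', 1), (5, 'c', 2);
""")

joins = """
    FROM conversations
    LEFT JOIN messages ON messages.conversations_id = conversations.id
    LEFT JOIN
    (   SELECT conversations_id, MAX(ts_send) AS ts_send
        FROM messages GROUP BY conversations_id
    ) MaxMessage
        ON MaxMessage.conversations_id = messages.conversations_id
        AND MaxMessage.ts_send = messages.ts_send
"""

# Before filtering: (conversation id, message ts_send, matched maximum).
unfiltered = conn.execute(f"""
    SELECT conversations.id, messages.ts_send, MaxMessage.ts_send
    {joins}
    ORDER BY conversations.id, messages.ts_send
""").fetchall()

# Keep empty conversations (no message at all) and messages that matched the maximum.
rows = conn.execute(f"""
    SELECT conversations.id,
           COALESCE(messages.message, 'No Messages') AS Message,
           messages.ts_send
    {joins}
    WHERE messages.ts_send IS NULL OR MaxMessage.ts_send IS NOT NULL
    ORDER BY conversations.id
""").fetchall()

print(unfiltered)
print(rows)
```

The row (1, 10, None) is an older message that did not match the maximum, so it fails both conditions and is dropped. The row (3, None, None) is the empty conversation and is kept by the first condition.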
Solution 3:[3]
If you would just like the largest ts_send from each unique conversations_id, you can use this code:
SELECT *
FROM messages
WHERE CONCAT(conversations_id, '_', ts_send) IN ( SELECT CONCAT(conversations_id, '_', MAX(ts_send))
FROM messages
GROUP BY conversations_id );
This builds a string pair from each conversations_id and its largest ts_send, then keeps every row of the table whose own pair appears in that set.
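The same pairing can also be written as a row-value comparison instead of string concatenation, which is the form Solution 5 uses with ROW(...). The sketch below compares the two on invented data. It runs on Python's sqlite3 module as a stand-in for MySQL, so two things are assumptions: CONCAT is swapped for SQLite's || operator, and the row-value form needs SQLite 3.15 or newer.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (ts_send INTEGER, message TEXT, conversations_id INTEGER);
    INSERT INTO messages VALUES (10, 'a', 1), (20, 'b', 1), (5, 'c', 2);
""")

# String pairs such as '1_20', built with || in place of MySQL's CONCAT.
by_pair = conn.execute("""
    SELECT conversations_id, message, ts_send
    FROM messages
    WHERE conversations_id || '_' || ts_send IN (
        SELECT conversations_id || '_' || MAX(ts_send)
        FROM messages GROUP BY conversations_id)
    ORDER BY conversations_id
""").fetchall()

# Row values: compare (conversations_id, ts_send) directly, no string building.
by_row = conn.execute("""
    SELECT conversations_id, message, ts_send
    FROM messages
    WHERE (conversations_id, ts_send) IN (
        SELECT conversations_id, MAX(ts_send)
        FROM messages GROUP BY conversations_id)
    ORDER BY conversations_id
""").fetchall()

print(by_pair)
print(by_row)
```

Both queries return the same rows here. The row-value form avoids converting every row's columns to a string, which on larger tables may also give the optimiser a better chance of using an index on those columns.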
Solution 4:[4]
SELECT c.id, m.message, m.ts_send
FROM
messages m, conversations c,
(SELECT conversations_id, MAX(ts_send) as ts_send
from messages
group by conversations_id) s
where s.conversations_id=m.conversations_id and s.ts_send=m.ts_send and
c.id=m.conversations_id
Solution 5:[5]
I think this is what you are asking for. In this test schema, messages.id plays the role of conversations_id.
mysql> create table messages (ts_send int, message char(30), id int);
Query OK, 0 rows affected (0.01 sec)
mysql> create table conversations (id int);
Query OK, 0 rows affected (0.01 sec)
Now some data.
mysql> insert into conversations values ( 1), (2), (3), (4);
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> insert into messages values ( 4, 'abc', 1 ),
(5, 'pqr', 1),
(4, 'abc', 2),
(5, 'abc', 3),
(6, 'abc', 4);
Query OK, 5 rows affected (0.01 sec)
Records: 5 Duplicates: 0 Warnings: 0
And then the query.
mysql> select messages.id, message, ts_send
from messages
where ROW(id, ts_send) in
(select messages.id, max(ts_send)
from messages, conversations
where messages.id = conversations.id group by id);
+------+---------+---------+
| id | message | ts_send |
+------+---------+---------+
| 1 | pqr | 5 |
| 2 | abc | 4 |
| 3 | abc | 5 |
| 4 | abc | 6 |
+------+---------+---------+
4 rows in set (0.00 sec)
mysql>
Right?
* THIS WAS EDITED to reflect the comment from Marty McVry.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Community |
Solution 3 | Marty McVry |
Solution 4 | Hedinn |
Solution 5 | |