What is the optimized/best way to retrieve data from two tables?
I have two tables:
post table:
|post_id | post_title |
+--------+------------+
| 1 | Post 1 |
| 2 | Post 2 |
| 3 | Post 3 |
post_creator table:
|post_id | creator |
+--------+---------+
| 1 | John |
| 1 | Smith |
| 1 | Mike |
| 2 | Bob |
| 3 | Peter |
| 3 | Brad |
When I join these tables it looks like this.
SELECT *
FROM post p
JOIN post_creator c ON p.post_id = c.post_id
|post_id | post_title | post_id | creator|
+----------------------------------------+
| 1 | Post 1 | 1 | John |
| 1 | Post 1 | 1 | Smith |
| 1 | Post 1 | 1 | Mike |
| 2 | Post 2 | 2 | Bob |
| 3 | Post 3 | 3 | Peter |
| 3 | Post 3 | 3 | Brad |
I want to grab each post with its creators, but in the joined result the same post is repeated once per creator.
What I did instead was first fetch all rows from the post table, then loop over that result and fetch the creators of each post inside the loop. But this runs an additional query for every post.
$sql = "SELECT * FROM post";
$stmt = $conn->prepare($sql);
$stmt->execute();
$res = $stmt->fetchAll(PDO::FETCH_OBJ);
$dataArr = [];
foreach ($res as $post) {
    // one extra query per post
    $sql = "SELECT creator FROM post_creator WHERE post_id=$post->post_id";
    $stmt = $conn->prepare($sql);
    $stmt->execute();
    $creators = $stmt->fetchAll(PDO::FETCH_OBJ);
    // create a fresh object per post; reusing a single object would make every array entry identical
    $dataObj = new stdClass;
    $dataObj->post_id = $post->post_id;
    $dataObj->post_title = $post->post_title;
    $dataObj->creators = $creators;
    array_push($dataArr, $dataObj);
}
So finally my dataArr has this kind of structure.
[
{
post_id: 1,
post_title: Post 1,
creators:[John, Smith, Mike]
},
{
post_id: 2,
post_title: Post 2,
creators:[Bob]
},
{
post_id: 3,
post_title: Post 3,
creators:[Peter, Brad]
},
]
This is what I wanted. Now I can loop this and render to a view.
Is there a more optimized way to get this result without looping and querying again and again?
Solution 1:[1]
I think you need to use group_concat to group your creators.
SELECT p.post_id, post_title, group_concat(creator)
FROM post p
JOIN post_creator using(post_id)
group by p.post_id
Additionally, this:
$sql = "SELECT creator FROM post_creator WHERE post_id=$post->post_id";
$stmt = $conn->prepare($sql);
$stmt->execute();
is improper usage of a prepared statement. It should be written as:
$sql = "SELECT creator FROM post_creator WHERE post_id=?";
$stmt = $conn->prepare($sql);
$stmt->execute(array($post->post_id));
if it were needed, but it is not. Always bind values; never interpolate them directly into the SQL string.
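As a rough sketch of how the group_concat result could be consumed in PHP: the helper below is hypothetical (splitCreators is not part of any library), and it assumes the aggregate was aliased as creators, i.e. group_concat(creator) AS creators, and that rows were fetched with PDO::FETCH_ASSOC. The sample rows stand in for a real query result.

```php
<?php
// Hypothetical helper: turn the comma-separated GROUP_CONCAT string
// into a PHP array for each post row.
function splitCreators(array $rows, string $separator = ','): array
{
    $posts = [];
    foreach ($rows as $row) {
        $raw = $row['creators'] ?? null;
        $posts[] = [
            'post_id'    => (int) $row['post_id'],
            'post_title' => $row['post_title'],
            // GROUP_CONCAT returns NULL when there are no non-NULL values
            'creators'   => ($raw === null || $raw === '') ? [] : explode($separator, $raw),
        ];
    }
    return $posts;
}

// Sample rows shaped like PDO::FETCH_ASSOC output for the question's data
$rows = [
    ['post_id' => 1, 'post_title' => 'Post 1', 'creators' => 'John,Smith,Mike'],
    ['post_id' => 2, 'post_title' => 'Post 2', 'creators' => 'Bob'],
    ['post_id' => 3, 'post_title' => 'Post 3', 'creators' => 'Peter,Brad'],
];

$posts = splitCreators($rows);
```

Two caveats worth keeping in mind: a creator name that itself contains the separator would be split incorrectly (GROUP_CONCAT accepts a SEPARATOR clause if a different delimiter is needed), and MySQL truncates the concatenated string at group_concat_max_len, which defaults to 1024.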
Solution 2:[2]
I'd say there are 3 different roads you could follow, each of which has some benefit or another.
Option 1. Simple SELECT query with JOIN (and overlapping rows)
This is more or less what you've already tried with the first query you listed, which resulted in duplicate rows.
It's fairly trivial to modify your application code to deal with the dupes, and simply fold the creators into the same array/object. The overhead is almost nil as well. From a relational database design point-of-view, this method is still the best practice.
SELECT p.post_id
, p.post_title
, c.creator
FROM post p
LEFT JOIN post_creator c
ON p.post_id = c.post_id
ORDER BY p.post_id ASC
Then, in the application code:
/* $rows = ...query...; */
$posts = [];
foreach ($rows as $row) {
    $id = $row['post_id'];
    if (!isset($posts[$id])) {
        // first row for this post_id: create the post entry
        $posts[$id] = [
            'id'       => $id,
            'title'    => $row['post_title'],
            'creators' => [],
        ];
    }
    // the LEFT JOIN yields a NULL creator for posts without any; skip those
    if ($row['creator'] !== null) {
        $posts[$id]['creators'][] = $row['creator'];
    }
}
Option 2. Multivalue columns (arrays or json)
A slightly more pragmatic solution for non-purists can be to have your query produce output columns which contain more than one value. This generally means either a JSON or an ARRAY column. The exact details depend on your choice of database system.
In either case, you'd combine it with the SQL GROUP BY feature.
Let's assume you use MySQL and prefer the JSON type; you'd then go with a query such as:
SELECT p.post_id
, p.post_title
, JSON_ARRAYAGG(c.creator) AS creators
FROM post p
LEFT JOIN post_creator c
ON p.post_id = c.post_id
GROUP BY p.post_id
ORDER BY p.post_id ASC
This way, you'll only receive one record per post, and you'll get a value such as ["Mike", "Paul", "Susan"] which json_decode() can turn into a proper PHP array.
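A minimal sketch of that decoding step, assuming rows were fetched with PDO::FETCH_ASSOC and the column is aliased creators as in the query above. The function name decodeCreators and the fourth sample post (one without creators) are made up for illustration. With a LEFT JOIN, a post that has no creators should come back as [null], so the sketch filters nulls out.

```php
<?php
// Hypothetical helper: decode the JSON_ARRAYAGG column of each row
// into a plain PHP array of creator names.
function decodeCreators(array $rows): array
{
    $posts = [];
    foreach ($rows as $row) {
        // json_decode returns null on invalid input; fall back to an empty list
        $creators = json_decode($row['creators'] ?? '[]', true) ?? [];
        // drop the null produced by the LEFT JOIN for posts without creators
        $creators = array_values(array_filter($creators, static function ($c) {
            return $c !== null;
        }));
        $posts[] = [
            'post_id'    => (int) $row['post_id'],
            'post_title' => $row['post_title'],
            'creators'   => $creators,
        ];
    }
    return $posts;
}

// Sample rows standing in for the query result
$rows = [
    ['post_id' => 1, 'post_title' => 'Post 1', 'creators' => '["John", "Smith", "Mike"]'],
    ['post_id' => 2, 'post_title' => 'Post 2', 'creators' => '["Bob"]'],
    ['post_id' => 3, 'post_title' => 'Post 3', 'creators' => '["Peter", "Brad"]'],
    ['post_id' => 4, 'post_title' => 'Post 4', 'creators' => '[null]'], // hypothetical post with no creators
];

$posts = decodeCreators($rows);
```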
Option 3. Full-blown documents
Another alternative that kind of builds upon option #2 is to go entirely with JSON, and abandon the relational recordset altogether.
Most modern DBMS have plenty of JSON functionality and the format you yourself listed as dataArr could be fully produced by the database in response to a single SELECT query.
This way, the query would always result in just 1 row with 1 single column, which holds the entire dataArr combining all those posts (which again, can be turned into a native PHP array or object tree with json_decode, just like before).
While the result of this method can be very neat (depending on the way your application is written), some may wonder why you're using an RDBMS and not something like MongoDB.
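For illustration, Option 3 might look something like the sketch below. It assumes MySQL 5.7.22 or later (where JSON_ARRAYAGG and JSON_OBJECT are available); the alias data_arr and the derived-table name t are made up, and the query has not been run against your schema. A JSON literal stands in for the database result so the decoding part runs on its own.

```php
<?php
// Assumed query: the inner SELECT builds one creators array per post,
// the outer SELECT wraps every post into a single JSON document.
$sql = <<<'SQL'
SELECT JSON_ARRAYAGG(
         JSON_OBJECT('post_id', t.post_id,
                     'post_title', t.post_title,
                     'creators', t.creators)
       ) AS data_arr
FROM (
    SELECT p.post_id, p.post_title, JSON_ARRAYAGG(c.creator) AS creators
    FROM post p
    LEFT JOIN post_creator c ON p.post_id = c.post_id
    GROUP BY p.post_id, p.post_title
) t
SQL;

// With a PDO connection this would be roughly:
//   $json = $conn->query($sql)->fetchColumn();
// The literal below stands in for that single row / single column.
$json = '[{"post_id": 1, "post_title": "Post 1", "creators": ["John", "Smith", "Mike"]},'
      . ' {"post_id": 2, "post_title": "Post 2", "creators": ["Bob"]},'
      . ' {"post_id": 3, "post_title": "Post 3", "creators": ["Peter", "Brad"]}]';

// Decodes to an array of stdClass objects, matching the dataArr shape
$dataArr = json_decode($json);

foreach ($dataArr as $post) {
    echo $post->post_title . ': ' . implode(', ', $post->creators) . PHP_EOL;
}
```

Note that the order of elements inside an aggregated JSON array is not guaranteed, so sort in PHP if the order of posts or creators matters.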
Overall, I'd recommend Option 1.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | user3783243 |
Solution 2 | Raxi |