'sqlite: select display few rows around matching rows

I have a table of the form

select rowid,* from t;
rowid f
----- -----
1     aaa
2     bbb
3     ccc
4     ddd
5     eee
6     fff
7     ggg
8     aaa
9     bbb
10    ccc
11    ddd
12    eee
13    fff
14    ggg

Id like to select n row before and m row after a given row match, i.e for instance for rows that match f='ccc' with n=m=1 I'd like to get

2     bbb
3     ccc
4     ddd
9     bbb
10    ccc
11    ddd

The rowid is sequential in my setup so I guess we can play with it. I tried thing along the line of

select rowid,f from t where rowid between 
  (select rowid-1 from t where f='ccc') and 
  (select rowid+1 from t where f='ccc');

rowid f
----- -----
2     bbb
3     ccc
4     ddd

But the result is obviously wrong I got only the 1st occurence of the 'ccc' match. I guess I got to join or may be recursive cte, but I am affraid it is beyound my knowlegde so far :) Thanx in advance.



Solution 1:[1]

A scalar subquery can return only a single value.

You could do two self joins, but it would be simpler to use set operations:

SELECT * FROM t
WHERE rowid IN (SELECT rowid - 1 FROM t WHERE f = 'ccc'
                UNION ALL
                SELECT rowid     FROM t WHERE f = 'ccc'
                UNION ALL
                SELECT rowid + 1 FROM t WHERE f = 'ccc');

Larger values of n and m require more subqueries. If there are too many, you can use a join:

SELECT *
FROM t
WHERE rowid IN (SELECT t.rowid
                FROM t
                JOIN (SELECT rowid - ? AS n,
                             rowid + ? AS m
                      FROM t
                      WHERE f = 'ccc'
                ) AS ranges
                ON t.rowid BETWEEN ranges.n AND ranges.m);

Solution 2:[2]

I come with a solution that is not optimal I think but I am not able to simplify (remove) the temp (intermediate) table.

select rowid,f from t;
rowid f
----- -----
1     aaa
2     bbb
3     ccc
4     ddd
5     eee
6     fff
7     ggg
8     aaa
9     bbb
10    ccc
11    ddd
12    eee
13    fff
14    ggg

create table u as
    select t2.rowid x,t1.rowid+2 y from t t1 // +2 ==> 2 rows after 'ccc'
    join t t2 on t1.rowid=t2.rowid+1         // +1 ==> 1 row before 'ccc'
    where t1.f='ccc';

select * from u;
x     y
----- -----
2     5
9     12

select t.rowid,t.f from t inner
join u on t.rowid>=u.x and t.rowid<=u.y'
rowid f
----- -----
2     bbb     1 before
3     ccc <== match
4     ddd     2 after
5     eee
9     bbb     1 before
10    ccc <== match
11    ddd     2 after
12    eee

I think I am set with what I need, but optimisations are welome :)

Solution 3:[3]

I might be overlooking something, but the provided solutions that suggest adding/subtracting from values of the rowid column could be improved upon. They will face issues should rowid ever be missing a value (Which I'm aware was stated to never be the case in the top post, but in general is an assumption that's often not true).

By using sqlite's row_number() you can have a solution that circumvents that problem and also can be used to fetch the entries "around" your row matches based on any arbitrary order you want, not just based on rowid.

Together with Common Table Expressions this can even be made somewhat readable, though should you have a larger amount of row-matches this will still be a slow query.

What you'll conceptually be doing is:

  1. Do a pre-select on your table (here cte_t) to get all possible values that could be a valid hit and attach a row-number to each entry
  2. Do a select on that pre-select to fetch the specific row that you actually want and get only its row-number (here targetRows)
  3. "join" the two by pretty much just multiplying the two tables generated in 1) and 2).

Now you can easily select for all entries whose row-number is in a specific range around the target's row-number using ABS

WITH 
    cte_t AS (
        SELECT *, row_number() OVER (ORDER BY t.rowid) AS rownum
        FROM t
        -- If you can make this cte smaller by removing all entries that can't possibly be the solution with the appropriate WHERE clause, you can make the entire query substantially faster
    ), 
    targetRows AS(
        SELECT rownum AS targetRowNum
        FROM cte_t
        WHERE f = 'ccc' -- This should be the WHERE condition that defines the entries that match your query exactly and for which you want to get the entries around them
    )
SELECT cte_t.rowid, cte_t.f
FROM cte_t, targetRows -- This is basically multiplying both tables with one another, this part will be horribly slow if targetRows gets larger
WHERE ABS(cte_t.rownum - targetRows.targetRowNum) <= 1; --Get all entries in targetRows as well as those whose rownum is 1 larger or 1 lower than the rownum of a targetRow

This will return

rowid   f
2   bbb
3   ccc
4   ddd
9   bbb
10  ccc
11  ddd

Here a good resource about this.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Phi
Solution 3 Philipp Doerner