'Getting the median value from the database

A median is defined as a number separating the higher half of a data set from the lower half. Query the median of the Northern Latitudes (LAT_N) from STATION and round your answer to decimal places.

Input Format

The STATION table is described as follows:

Field : Type
ID    : NUMBER
CITY  : VARCHAR2(21)
STATE : VARCHAR2(2)
LAT_N : NUMBER
LONG_W: NUMBER

where LAT_N is the northern latitude and LONG_W is the western longitude.

I could only manage to get the row index for the median value with

select floor((count(lat_n)+1)/2) from station;

which is row index 250. The next step is to use this value to extract out the lat_n value at row index 250. How do I transform to SQL?



Solution 1:[1]

You can run the following code using Oracle.

SELECT ROUND(MEDIAN(LAT_N), 4)
FROM STATION;

Solution 2:[2]

The simplest method is to use the median() function:

select median(lat_n)
from stations;

You can round the value using functions such as round() or to_char().

Solution 3:[3]

For mySQL This will work,

   SELECT ROUND(S1.LAT_N, 4) 
    FROM STATION AS S1 
    WHERE (SELECT ROUND(COUNT(S1.ID)/2) - 1 
           FROM STATION) = 
          (SELECT COUNT(S2.ID) 
           FROM STATION AS S2 
           WHERE S2.LAT_N > S1.LAT_N);

refrence : https://nikkesan.gitbook.io/hackerrank/practice-1/sql/aggregation/untitled-2

Solution 4:[4]

SQL SERVER:

DECLARE @slno int;
SET @slno= (select COUNT(*) from STATION);
with cte (ind,LAT_N)
as
( select ROW_NUMBER() OVER(ORDER BY LAT_N ASC) AS ind, LAT_N from STATION
)
select 
CAST(LAT_N AS DECIMAL(15,4))
from cte
where ind= (@slno+1)/2
order by LAT_N
  1. In cte - add row number to each row and name the column as ind.
  2. In variable slno - add the total number of rows
  3. print Lat_N where the index calculated in step 1 falls in the middle such that the middle position will be slno+1/2 calculated in step 2.

Solution 5:[5]

Easy explanation of the sample median: In individual series (if number of observation is very low) first one must arrange all the observations in order. Then count(n) is the total number of observation in given data.

If n is odd then Median (M) = value of ((n + 1)/2)th item term.

If n is even then Median (M) = value of [(n/2)th item term + (n/2 + 1)th item term]/2

CTE in Sql Server

Query:

declare @cont int,@reccount int,@first int, @second int;
set @reccount=(select COUNT(id) from STATION);
IF @reccount%2=0 --even rows
set @cont=(((@reccount)/2)+(((@reccount)/2)+1))/2; 
ELSE 
set @cont=((select count(id) from station)+1)/2;
--creating CTE
WITH Station_CTE (rowNum, Lat_n)
AS (
select row_number() over(order by lat_n desc) as 'rowNum',lat_n from station
)
SELECT top(1) SUBSTRING(convert(varchar(30),ROUND((scte.lat_n),4)),1,(CHARINDEX('.',ROUND((scte.lat_n),4),1)+4)) from Station s inner join Station_CTE as scte on scte.rowNum =@cont

Expected Output:

83.8913

Solution 6:[6]

select round(lat_n,4) from
(select lat_n, rank() over(order by lat_n) as rank1 
from station) dummyTable
where rank1 = 
(select 
case 
when mod(count(*),2) = 0 then round(count(*)/2)
else round((count(*)+1)/2)
end as rank1
from station s2)
  1. Creating a dummy table with "lat_n" ordered ascendingly
  2. Using the windows function "rank() over()" to give row number for each record
  3. Then filtering for the median row using the rank and retrieving the 'lat_n' value for that row

Hope this helped.

Thanks.

Solution 7:[7]

Here is one option. We can assign a row number, while at the same time computing the total table count. Then, for tables with an even number of records, we can take the average of the middle two records as the median. For odd numbered tables, we can just take the middle record.

WITH cte AS (
    SELECT s.*,
        ROW_NUMBER() OVER (ORDER BY lat_n) rn,
        COUNT(*) OVER () cnt
    FROM station s
)

SELECT AVG(lat_n)
FROM cte
WHERE
    (MOD(cnt, 2) = 0 AND rn IN (FLOOR(cnt/2), FLOOR(cnt/2) + 1)) OR
    (MOD(cnt, 2) = 1 AND rn = FLOOR(cnt/2) + 1);

Demo

Note: For tables with an even number of records, it should be obvious that there is no exact median/middle record. But many statisticians just report the average of the middle pair of records in this case.

Solution 8:[8]

declare @Position int
set @position = round((select count(LAT_N)/2 from STATION),0)

select convert(decimal(20,4),LAT_N) from STATION
order by LAT_N
offset @position rows
fetch next 1 row only

Solution 9:[9]

This would work with mySQL:

  • Sort the values from column lat_n
  • Index to each row.
  • If the count of the values in the column is odd, find the value of the item in the middle of the sorted list. Or if is is even, find the values of the two items in the middle of the sorted list, calculate the average and return it.

SET @rowindex := -1;

SELECT
   round(AVG(lat_n),4)
FROM
   (SELECT @rowindex:=@rowindex + 1 AS rowindex, lat_n
    FROM station
    ORDER BY lat_n) 
    AS g
WHERE
g.rowindex IN (FLOOR(@rowindex / 2) , CEIL(@rowindex / 2));

Solution 10:[10]

MySQL :

select round(s.lat_n,4) from station s where (select round(count(id)/2)-1 from station) = (select count(s1.id) from station s1 where s1.lat_n > s.lat_n);

Two sub-queries will be equated when subquery-1 i.e, index count of half of the rows in the table,and sub query 2 which returns the number of s1.lat_n that are greater than the corresponding s.lat_n .

Solution 11:[11]

with cte as ( SELECT ROW_NUMBER()OVER(ORDER BY LAT_N)rw,LAT_N FROM STATION group by LAT_N )

select cast(lat_n as decimal(10,4)) from cte
where rw=
(select (count(rw)/2 )+1 from cte)

Solution 12:[12]

This solution is more generic.

We get the total length and add 1 if the number is odd.

Then we sort and only get the median item.

SET @length := ( SELECT COUNT(*) FROM STATION);
SET @offset :=  (IF(@length  % 2 <> 0 , @length+1 , @length)/2)-1 ;

PREPARE STMT FROM 'SELECT ROUND(LAT_N,4) FROM STATION ORDER BY LAT_N LIMIT ? , 1';
EXECUTE STMT USING @offset;

Solution 13:[13]

This would work Oracle:

SELECT round(median(lat_n),4) from station;

Solution 14:[14]

MYSQL code :

SELECT ROUND(lat_n,4)
---------------------
FROM (
     SELECT lat_n,
     ROW_NUMBER() OVER(ORDER BY lat_n ) AS row_num
     FROM station
     )  AS new_table
--------------------
WHERE row_num = (
                SELECT CEIL(AVG(row_num)) 
                FROM (
                      SELECT 
                      ROW_NUMBER() OVER(ORDER BY lat_n ) AS row_num
                      FROM station
                      )  AS new_table
                 )
-------------------

Solution 15:[15]

We all know what the median is when the number of elements is odd and even it is the (n+1)/2th term when odd and ((n/2)th term + (n/2)+1th term)/2 so use this code.

set @r1=0,@r2=0,@r3=0,@r4=0;
select
case
when mod(count(cnt),2)=0 then 
                        ((select round(lat_n,4) from (select @r3:=@r3+1 as cnt2, lat_n from station order by lat_n) temp2 where cnt2=(@r3)/2)+(select round(lat_n,4) from (select @r4:=@r4+1 as cnt3, lat_n from station order by lat_n) temp3 where cnt3=(@r4+1)/2))/2
else (select round(lat_n,4) from (select @r2:=@r2+1 as cnt1, lat_n from station order by lat_n) temp1 where cnt1=(@r2+1)/2)
end
from (select @r1:=@r1+1 as cnt, lat_n from station ) temp;

Solution 16:[16]

MySQL Code :

SELECT ROUND (LAT_N,4)

FROM ( SELECT
ROW_NUMBER() OVER (ORDER BY LAT_N DESC) AS ROWNUMBER, LAT_N FROM STATION ) 
AS LAT_N

WHERE ROWNUMBER = ( SELECT ((COUNT(LAT_N) + 1) /2) 
       FROM STATION)

Solution 17:[17]

This worked for me as MS SQL Server solution:

select format(round(b.lat_n, 4), ".####")
from station a
join (select id, lat_n, row_number()over(order by lat_n) as rn
from station) b
on a.id = b.id
where b.rn = (select floor((count(lat_n)+1)/2) from station)

Solution 18:[18]

I found another solution that is more robust than hardcoding the '+1' and more efficient than including an 'IF' statement.

WITH Tbl AS( select row_number() over (order by lat_n) rc, lat_n from station)

Select cast(avg(lat_n) as Decimal(15,4)) from tbl where rc = (select ceiling(max(rc)/2.0) from tbl)