Human Traffic of Stadium (SQL #601)

This task requires to find Human Traffic of Stadium. We need to write a SQL query to extract records with three or more consecutive rows that have an ID and people count greater than or equal to 100. The resulting table should be sorted in ascending order by visit_date.

Problem Statement

Link: LeetCode

Complexity: Hard

TableStadium

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| id            | int     |
| visit_date    | date    |
| people        | int     |
+---------------+---------+
visit_date is the primary key for this table.
Each row of this table contains the visit date and visit id to the stadium with the number of people during the visit.
No two rows will have the same visit_date, and as the id increases, the dates increase as well.

Write an SQL query to display the records with three or more rows with consecutive id‘s, and the number of people is greater than or equal to 100 for each.

Return the result table ordered by visit_date in ascending order.

The query result format is in the following example.

Example 1:

Input: 
Stadium table:
+------+------------+-----------+
| id   | visit_date | people    |
+------+------------+-----------+
| 1    | 2017-01-01 | 10        |
| 2    | 2017-01-02 | 109       |
| 3    | 2017-01-03 | 150       |
| 4    | 2017-01-04 | 99        |
| 5    | 2017-01-05 | 145       |
| 6    | 2017-01-06 | 1455      |
| 7    | 2017-01-07 | 199       |
| 8    | 2017-01-09 | 188       |
+------+------------+-----------+
Output: 
+------+------------+-----------+
| id   | visit_date | people    |
+------+------------+-----------+
| 5    | 2017-01-05 | 145       |
| 6    | 2017-01-06 | 1455      |
| 7    | 2017-01-07 | 199       |
| 8    | 2017-01-09 | 188       |
+------+------------+-----------+
Explanation: 
The four rows with ids 5, 6, 7, and 8 have consecutive ids and each of them has >= 100 people attended. Note that row 8 was included even though the visit_date was not the next day after row 7.
The rows with ids 2 and 3 are not included because we need at least three consecutive ids.

Solution Explanation:
References:

Human Traffic of Stadium, solution to this problem involves using the ‘LAG()‘ and ‘LEAD()‘ functions to get the previous and next values of ‘people‘, respectively. Result is stored in temporary table ‘tmp‘ and then the ‘Stadium‘ table is joined with ‘tmp‘ that contains the ‘id‘, ‘visit_date‘, ‘people‘, ‘prev‘, and ‘next‘ columns.

The same result can be obtained using other methods like having a sub query and use of ‘Exists‘ but that would be more complex. Also, those queries will not be as cost effective as this one.

LAG and LEAD

LAG and LEAD are SQL analytic functions that allow you to access data from a previous or subsequent row in a query result set. LAG provides access to a previous row, while LEAD provides access to the following row.

Here is an example of LAG and LEAD function (‘Stadium‘ table):

SELECT
  id,
  visit_date,
  people,
  LAG(people, 1) OVER (ORDER BY id) AS prev,
  LEAD(people, 1) OVER (ORDER BY id) AS next
FROM
Stadium
Output:

| id | visit_date | people | prev | next |
| -- | ---------- | ------ | ---- | ---- |
| 1  | 2017-01-01 | 10     | null | 109  |
| 2  | 2017-01-02 | 109    | 10   | 150  |
| 3  | 2017-01-03 | 150    | 109  | 99   |
| 4  | 2017-01-04 | 99     | 150  | 145  |
| 5  | 2017-01-05 | 145    | 99   | 1455 |
| 6  | 2017-01-06 | 1455   | 145  | 199  |
| 7  | 2017-01-07 | 199    | 1455 | 188  |
| 8  | 2017-01-09 | 188    | 199  | null |

Create a Temporary Table

with tmp as
(
    select  id,
            visit_date,
            people,
            Lag(people,1) OVER (ORDER BY id) as prev,
            Lead(people,1) OVER (ORDER BY id) as next
    from Stadium
)

Join the ‘Stadium‘ table with the ‘tmptable on the condition that the absolute difference between the ‘id‘ values is less than ‘2‘ and the ‘prev‘, ‘people‘, and ‘next‘ values are greater than or equal to ‘100

select distinct
        a.id,
        a.visit_date,
        a.people
from Stadium a, tmp t 
where Abs(a.id-t.id)<2 and t.prev>=100 and t.next>=100 and t.people>=100
order by a.visit_date

Select the distinct values of ‘id‘, ‘visit_date‘, and ‘people‘ from the ‘Stadium‘ table. Order the result by ‘visit_date‘ in ascending order.

Note:

Distinct will be required as the result set may contain duplicates based on input result set.
Output without distinct:

| id | visit_date | people |
| -- | ---------- | ------ |
| 5  | 2017-01-05 | 145    |
| 6  | 2017-01-06 | 1455   |
| 6  | 2017-01-06 | 1455   |
| 7  | 2017-01-07 | 199    |
| 7  | 2017-01-07 | 199    |
| 8  | 2017-01-09 | 188    |

Solution Code:
with tmp as
(
    select  id,
            visit_date,
            people,
            Lag(people,1) OVER (ORDER BY id) as prev,
            Lead(people,1) OVER (ORDER BY id) as next
    from Stadium
)

select distinct
        a.id,
        a.visit_date,
        a.people
from Stadium a, tmp t 
where Abs(a.id-t.id)<2 and t.prev>=100 and t.next>=100 and t.people>=100
order by a.visit_date
Input
| id | visit_date | people |
| -- | ---------- | ------ |
| 1  | 2017-01-01 | 10     |
| 2  | 2017-01-02 | 109    |
| 3  | 2017-01-03 | 150    |
| 4  | 2017-01-04 | 99     |
| 5  | 2017-01-05 | 145    |
| 6  | 2017-01-06 | 1455   |
| 7  | 2017-01-07 | 199    |
| 8  | 2017-01-09 | 188    |

Output
| id | visit_date | people |
| -- | ---------- | ------ |
| 5  | 2017-01-05 | 145    |
| 6  | 2017-01-06 | 1455   |
| 7  | 2017-01-07 | 199    |
| 8  | 2017-01-09 | 188    |

Expected
| id | visit_date | people |
| -- | ---------- | ------ |
| 5  | 2017-01-05 | 145    |
| 6  | 2017-01-06 | 1455   |
| 7  | 2017-01-07 | 199    |
| 8  | 2017-01-09 | 188    |

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top