Introduction
Window Functions were introduced in ANSI SQL:2003, making them a relatively new feature. They allow functions to be applied to a set of rows using the OVER
clause.
One of the most common use cases for window functions is analytical queries, specifically identifying islands and gaps in a dataset.
- Islands are groups of continuous data sequences.
- Gaps are missing sequences between data points.
Dataset
Here’s the initial dataset:
col1 |
---|
2 |
3 |
11 |
12 |
13 |
27 |
33 |
34 |
35 |
42 |
Creating the Dataset
|
|
Expected Output
After identifying islands in the dataset, we should get the following result:
start_range | end_range |
---|---|
2 | 3 |
11 | 13 |
27 | 27 |
33 | 35 |
42 | 42 |
Algorithmic Approach
To solve this, we need to group numbers that belong to the same continuous sequence (island).
col1 | grp |
---|---|
2 | a |
3 | a |
11 | b |
12 | b |
13 | b |
27 | c |
33 | d |
34 | d |
35 | d |
42 | e |
Each group (grp) represents an island, where numbers belong to a continuous sequence. For example, group D represents the sequence 33, 34, 35.
SQL Implementation
We can use window functions to achieve this:
|
|
Output
col1 | grp |
---|---|
2 | 1 |
3 | 1 |
11 | 8 |
12 | 8 |
13 | 8 |
27 | 21 |
33 | 26 |
34 | 26 |
35 | 26 |
42 | 32 |
Explanation
- By subtracting
ROW_NUMBER() OVER(ORDER BY col1)
fromcol1
, we create a unique identifier (grp
) for each island. - Numbers in the same sequence get the same
grp
value.
Final Query
Now, we just need to group by the grp
column to get the start and end of each island:
|
|
Final Output
start_range | end_range |
---|---|
2 | 3 |
11 | 13 |
27 | 27 |
33 | 35 |
42 | 42 |
Conclusion
By leveraging window functions, we efficiently identify islands in a dataset using row numbering and grouping techniques. This method can be extended to various real-world use cases, such as:
- Tracking continuous patient visits in healthcare analytics.
- Identifying consecutive login days in user behavior analysis.
- Detecting gaps in financial transactions for fraud detection.
Credits
Pictures
References
- SQL Example were taken from “T-SQL Window Functions, Second Edition For data analysis and beyond” by Itzik Ben-Gan