Featured image of post Using Window Functions to find Islands

Using Window Functions to find Islands

Introduction

Window Functions were introduced in ANSI SQL:2003, making them a relatively new feature. They allow functions to be applied to a set of rows using the OVER clause.

One of the most common use cases for window functions is analytical queries, specifically identifying islands and gaps in a dataset.

  • Islands are groups of continuous data sequences.
  • Gaps are missing sequences between data points.

Dataset

Here’s the initial dataset:

col1
2
3
11
12
13
27
33
34
35
42

Creating the Dataset

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
DROP TABLE IF EXISTS table1;

CREATE TEMPORARY TABLE table1(
  col1 INT NOT NULL GENERATED BY DEFAULT AS IDENTITY
);

INSERT INTO table1(col1)
VALUES (2), (3), (11), (12), (13), (27), (33), (34), (35), (42);

SELECT col1
FROM table1
ORDER BY col1;

Expected Output

After identifying islands in the dataset, we should get the following result:

start_range end_range
2 3
11 13
27 27
33 35
42 42

Algorithmic Approach

To solve this, we need to group numbers that belong to the same continuous sequence (island).

col1 grp
2 a
3 a
11 b
12 b
13 b
27 c
33 d
34 d
35 d
42 e

Each group (grp) represents an island, where numbers belong to a continuous sequence. For example, group D represents the sequence 33, 34, 35.

SQL Implementation

We can use window functions to achieve this:

1
2
3
4
SELECT 
  col1,
  col1 - ROW_NUMBER() OVER(ORDER BY col1) AS grp
FROM table1;

Output

col1 grp
2 1
3 1
11 8
12 8
13 8
27 21
33 26
34 26
35 26
42 32

Explanation

  • By subtracting ROW_NUMBER() OVER(ORDER BY col1) from col1, we create a unique identifier (grp) for each island.
  • Numbers in the same sequence get the same grp value.

Final Query

Now, we just need to group by the grp column to get the start and end of each island:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
WITH grp AS (
  SELECT 
    col1,
    col1 - ROW_NUMBER() OVER(ORDER BY col1) AS grp
  FROM table1
)
SELECT 
  MIN(col1) AS start_range, 
  MAX(col1) AS end_range
FROM grp
GROUP BY grp
ORDER BY start_range;

Final Output

start_range end_range
2 3
11 13
27 27
33 35
42 42

Conclusion

By leveraging window functions, we efficiently identify islands in a dataset using row numbering and grouping techniques. This method can be extended to various real-world use cases, such as:

  • Tracking continuous patient visits in healthcare analytics.
  • Identifying consecutive login days in user behavior analysis.
  • Detecting gaps in financial transactions for fraud detection.

Credits

Pictures

  1. Easter island

References

  1. SQL Example were taken from “T-SQL Window Functions, Second Edition For data analysis and beyond” by Itzik Ben-Gan

Desiderata by Max Ehrmann

Go placidly amid the noise and the haste, and remember what peace there may be in silence. As far as possible without surrender be on good terms with all persons. Speak your truth quietly and clearly; and listen to others, even to the dull and the ignorant, they too have their story. Avoid loud and aggressive persons, they are vexations to the spirit.

If you compare yourself with others, you may become vain or bitter; for always there will be greater and lesser persons than yourself. Enjoy your achievements as well as your plans. Keep interested in your own career, however humble; it is a real possession in the changing fortunes of time.

Exercise caution in your business affairs, for the world is full of trickery. But let not this blind you to what virtue there is; many persons strive for high ideals, and everywhere life is full of heroism. Be yourself. Especially do not feign affection. Neither be cynical about love; for in the face of all aridity and disenchantment it is as perennial as the grass. Take kindly the counsel of the years, gracefully surrendering the things of youth.

Nurture strength of spirit to shield you in sudden misfortune. But do not distress yourself with dark imaginings. Many fears are born of fatigue and loneliness. Beyond a wholesome discipline, be gentle with yourself. You are a child of the universe, no less than the trees and the stars; you have a right to be here. And whether or not it is clear to you, no doubt the universe is unfolding as it should.

Therefore, be at peace with God, whatever you conceive Him to be. And whatever your labors and aspirations in the noisy confusion of life, keep peace in your soul. With all its sham, drudgery and broken dreams; it is still a beautiful world. Be cheerful.

Strive to be happy.

Built with Hugo
Theme Stack designed by Jimmy