|
| 1 | +--- |
| 2 | +title: Window Functions |
| 3 | +hide_title: false |
| 4 | +hide_table_of_contents: false |
| 5 | +keywords: |
| 6 | + - stackql |
| 7 | + - infrastructure-as-code |
| 8 | + - configuration-as-data |
| 9 | + - cloud inventory |
| 10 | +description: Query and Deploy Cloud Infrastructure and Resources using SQL |
| 11 | +image: "/img/stackql-featured-image.png" |
| 12 | +--- |
| 13 | + |
| 14 | +Window functions perform calculations across a set of rows that are related to the current row. Unlike aggregate functions that return a single result for a group of rows, window functions return a value for each row while considering a "window" of related rows. |
| 15 | + |
| 16 | +See also: |
| 17 | +[[` SELECT `]](/docs/language-spec/select) [[` WITH (CTEs) `]](/docs/language-spec/with) [[ Window Function Reference ]](/docs/language-spec/functions/window/row_number) |
| 18 | + |
| 19 | +* * * |
| 20 | + |
| 21 | +## Syntax |
| 22 | + |
| 23 | +Window functions use the `OVER` clause to define the window specification: |
| 24 | + |
| 25 | +```sql |
| 26 | +SELECT <windowFunction> OVER ( <windowSpec> ) FROM <multipartIdentifier>; |
| 27 | +``` |
| 28 | + |
| 29 | +*windowSpec::=* |
| 30 | + |
| 31 | +```sql |
| 32 | +[ PARTITION BY <fieldList> ] [ ORDER BY <fieldList> [ ASC | DESC ] ] |
| 33 | +[ ROWS BETWEEN <frameStart> AND <frameEnd> ] |
| 34 | +``` |
| 35 | + |
| 36 | +## Available Window Functions |
| 37 | + |
| 38 | +| Category | Functions | |
| 39 | +|----------|-----------| |
| 40 | +| **Ranking** | `ROW_NUMBER()`, `RANK()`, `DENSE_RANK()`, `NTILE()` | |
| 41 | +| **Offset** | `LAG()`, `LEAD()`, `FIRST_VALUE()`, `LAST_VALUE()`, `NTH_VALUE()` | |
| 42 | +| **Distribution** | `PERCENT_RANK()`, `CUME_DIST()` | |
| 43 | +| **Aggregate** | `SUM()`, `COUNT()`, `AVG()`, `MIN()`, `MAX()` with `OVER` clause | |
| 44 | + |
| 45 | +* * * |
| 46 | + |
| 47 | +## Examples |
| 48 | + |
| 49 | +### Ranking with `ROW_NUMBER`, `RANK`, and `DENSE_RANK` |
| 50 | + |
| 51 | +Use ranking functions to assign rankings to rows based on column values. |
| 52 | + |
| 53 | +```sql |
| 54 | +-- Rank contributors by contribution count |
| 55 | +SELECT |
| 56 | + login, |
| 57 | + contributions, |
| 58 | + ROW_NUMBER() OVER (ORDER BY contributions DESC) as row_num, |
| 59 | + RANK() OVER (ORDER BY contributions DESC) as rank, |
| 60 | + DENSE_RANK() OVER (ORDER BY contributions DESC) as dense_rank |
| 61 | +FROM github.repos.contributors |
| 62 | +WHERE owner = 'stackql' |
| 63 | + AND repo = 'stackql'; |
| 64 | +``` |
| 65 | + |
| 66 | +### Running totals and percentages |
| 67 | + |
| 68 | +Aggregate functions like `SUM` and `COUNT` can be used as window functions to calculate running totals and percentages. |
| 69 | + |
| 70 | +```sql |
| 71 | +-- Running total and percentage of contributions |
| 72 | +SELECT |
| 73 | + login, |
| 74 | + contributions, |
| 75 | + SUM(contributions) OVER (ORDER BY contributions DESC) as running_total, |
| 76 | + SUM(contributions) OVER () as total_contributions, |
| 77 | + ROUND(100.0 * contributions / SUM(contributions) OVER (), 2) as pct_of_total |
| 78 | +FROM github.repos.contributors |
| 79 | +WHERE owner = 'stackql' |
| 80 | + AND repo = 'stackql'; |
| 81 | +``` |
| 82 | + |
| 83 | +### Comparing rows with `LAG` and `LEAD` |
| 84 | + |
| 85 | +Use `LAG` and `LEAD` to access values from previous or subsequent rows. |
| 86 | + |
| 87 | +```sql |
| 88 | +-- Compare each release to previous release |
| 89 | +SELECT |
| 90 | + tag_name, |
| 91 | + name, |
| 92 | + published_at, |
| 93 | + LAG(tag_name, 1) OVER (ORDER BY published_at) as previous_release, |
| 94 | + LEAD(tag_name, 1) OVER (ORDER BY published_at) as next_release, |
| 95 | + julianday(published_at) - julianday(LAG(published_at, 1) OVER (ORDER BY published_at)) as days_since_last_release |
| 96 | +FROM github.repos.releases |
| 97 | +WHERE owner = 'stackql' |
| 98 | + AND repo = 'stackql' |
| 99 | +ORDER BY published_at; |
| 100 | +``` |
| 101 | + |
| 102 | +### Creating groups with `NTILE` |
| 103 | + |
| 104 | +Use `NTILE` to divide rows into a specified number of groups. |
| 105 | + |
| 106 | +```sql |
| 107 | +-- Group contributors into quartiles |
| 108 | +SELECT |
| 109 | + login, |
| 110 | + contributions, |
| 111 | + NTILE(4) OVER (ORDER BY contributions DESC) as quartile, |
| 112 | + CASE NTILE(4) OVER (ORDER BY contributions DESC) |
| 113 | + WHEN 1 THEN 'Top Contributors' |
| 114 | + WHEN 2 THEN 'Active Contributors' |
| 115 | + WHEN 3 THEN 'Moderate Contributors' |
| 116 | + WHEN 4 THEN 'Occasional Contributors' |
| 117 | + END as tier |
| 118 | +FROM github.repos.contributors |
| 119 | +WHERE owner = 'stackql' |
| 120 | + AND repo = 'stackql'; |
| 121 | +``` |
| 122 | + |
| 123 | +### Using `PARTITION BY` for grouped analysis |
| 124 | + |
| 125 | +Use `PARTITION BY` to perform window calculations within groups. |
| 126 | + |
| 127 | +```sql |
| 128 | +-- Track cumulative issues by state |
| 129 | +SELECT |
| 130 | + number, |
| 131 | + title, |
| 132 | + state, |
| 133 | + created_at, |
| 134 | + ROW_NUMBER() OVER (ORDER BY created_at) as issue_sequence, |
| 135 | + COUNT(*) OVER (ORDER BY created_at) as cumulative_issues, |
| 136 | + COUNT(*) OVER (PARTITION BY state ORDER BY created_at) as cumulative_by_state |
| 137 | +FROM github.issues.issues |
| 138 | +WHERE owner = 'stackql' |
| 139 | + AND repo = 'stackql' |
| 140 | +ORDER BY created_at; |
| 141 | +``` |
| 142 | + |
| 143 | +### Using window frames |
| 144 | + |
| 145 | +Specify a frame clause to control which rows are included in the window calculation. |
| 146 | + |
| 147 | +```sql |
| 148 | +-- Calculate moving average of commit activity |
| 149 | +WITH weekly_totals AS ( |
| 150 | + SELECT |
| 151 | + week, |
| 152 | + SUM(json_each.value) as commits_this_week |
| 153 | + FROM github.repos.stats_commit_activity, JSON_EACH(days) |
| 154 | + WHERE owner = 'stackql' |
| 155 | + AND repo = 'stackql' |
| 156 | + GROUP BY week |
| 157 | +) |
| 158 | +SELECT |
| 159 | + week, |
| 160 | + commits_this_week, |
| 161 | + ROUND(AVG(commits_this_week) OVER (ORDER BY week ROWS BETWEEN 3 PRECEDING AND CURRENT ROW), 1) as four_week_moving_avg, |
| 162 | + SUM(commits_this_week) OVER (ORDER BY week) as cumulative_commits |
| 163 | +FROM weekly_totals |
| 164 | +ORDER BY week; |
| 165 | +``` |
| 166 | + |
| 167 | +For detailed documentation on each window function, see the [Window Function Reference](/docs/language-spec/functions/window/row_number). |
| 168 | + |
| 169 | +For more information, see [https://sqlite.org/windowfunctions.html](https://sqlite.org/windowfunctions.html). |
0 commit comments