Introduction
A window function is a ‘fancy’ name given to a function that performs calculations across a set of table rows ’window’, somehow related to the current row. Unlike aggregate functions, window functions don’t return a single result for a group of rows, the rows maintain their individuality. In the background, however, the function can access more than one row, allowing complex analysis without altering the original dataset. They can compute moving averages, rank items, perform cumulative sums and other analyses.
The SQL dialect of choice is Bigquery for the examples below.
Anatomy of a window function
A window function typically consists of :
-
function_name:
This is usually an operation you'd like to perform. It could be a COUNT, SUM, AVG or RANK to name a few. It is the function that performs the window operation.
-
argument_list (optional):
Arguments specific to that function, some functions have them, and others don’t.
-
OVER:
Defines the window frame, a set of records where the function will be calculated. It is considered the bedrock of window functions.
Basic window function structure
-
window_specification:
Defines the specifications of the window.
-
window_frame_clause:
Defines the window frame and determines what to include in it. If this clause is used, ORDER BY is required. Only aggregate analytics functions use a window frame clause.
(Bonus)
-
named_window:
The name of an existing window that was defined with a WINDOW clause.
-
named_window_expression:
Defines a named window, with the format:
...
WINDOW window_name AS (
PARTITION BY tag
ORDER BY post_date)
Rules when working with named window functions
-
The named window specifications can be further defined in the window specifications clause.
-
Redundant definitions are not allowed; an ORDER BY clause in the named window and the window specification clause causes an error.
Named window function structure
-
The order of clauses matters; PARTITION BY clause must come first, followed by ORDER BY and window_frame_clause. If a named window is present, its window specifications are processed first.
-
A named window and PARTITION BY can’t appear together in the window specification. If you need a PARTITION BY, add it to the named window.
-
You can’t refer to a named window in an ORDER BY clause, an outer query or any subquery.
Window function Examples
Using the dataset provided, let’s answer a few questions using window functions:
-
Rank employees by salary within each department.
USE sql_test_db;
SELECT
d.department_name,
CONCAT(e.first_name, ' ', e.last_name) AS name,
e.salary,
RANK() OVER(PARTITION BY e.department_id ORDER BY e.salary DESC) as salary_rank
FROM employees AS e
LEFT OUTER JOIN departments AS d ON
e.department_id = d.department_id
-
Calculate the running total of order amounts for each employee.
USE sql_test_db;
SELECT
CONCAT(e.first_name, ' ', e.last_name) AS name,
order_date,
order_amount,
SUM(o.order_amount) OVER(PARTITION BY e.employee_id ORDER BY order_date) AS running_total
FROM orders AS o
LEFT OUTER JOIN employees AS e ON
o.employee_id = e.employee_id
-
Find the average salary for each department
USE sql_test_db;
SELECT
d.department_name,
CONCAT(e.first_name, ' ', e.last_name) AS name,
e.salary,
AVG(e.salary) OVER(PARTITION BY e.department_id) AS avg_salary_in_dept
FROM departments AS d
LEFT OUTER JOIN employees AS e ON
d.department_id = e.department_id
Named window function examples
-
Find the difference between an employee’s salary and the highest salary in their department
USE sql_test_db;
SELECT
d.department_name,
CONCAT(e.first_name, ' ', e.last_name) AS name,
e.salary,
MAX(e.salary) OVER(max_salary) - salary AS a
FROM departments AS d
LEFT OUTER JOIN employees AS e ON
d.department_id = e.department_id
WINDOW max_salary AS(
PARTITION BY e.department_id
)
-
Identify the top 2 orders (by amount) for each employee
SELECT
employee_id,
order_id,
order_amount,
RANK() OVER (ordered_amount) AS order_rank
FROM orders
WINDOW ordered_amount AS(
PARTITION BY employee_id
ORDER BY order_amount DESC
)
Try this:
Calculate the percentage of each employee’s total sales compared to their department's total sales.
Filtering window functions
The QUALIFY clause can filter the results of a window function. For a detailed review read this article about the QUALIFY clause.
Conclusion
Window functions are powerful tools that are unavoidable when performing analytic operations. This article will guide you through the basics to enable you to move to the next stage. Hopefully, you will walk away with something new.