We can't go past CASE WHEN clause especially when we are speaking of the data aggregation. Despite the fact that the SQL itself is not a programming language, it gives us a great opportunity to use control flow in queries without recourse to the stored procedure. Disadvantages of stored procedure are quite high threshold of entry and inability of writing the cross-platform code.
The principle of clause is a very simple design:
CASE WHEN condition1 THEN value1 WHEN condition2 THEN value2 … WHEN conditionN THEN valueN ELSE valueElse END
If the condition in WHEN is true, then in place of the whole clause substitutes the corresponding value. If all conditions are false, then the value from the ELSE part will be taken.
Think about where it might be useful. For example we have a table of products with a price, discount and field that determines whether or not to apply the discount.
SELECT CASE WHEN use_discount THEN discount * price ELSE price END as price FROM products
A real case can differ by the presence of a separate table with discounts (then we add JOIN to the query) or a discount on the order date (then we complicate the condition with 'BETWEEN since AND till' statement).
One more example from real life. In order not to clog the head of the reader dozen JOINs we take from example only part associated with CASE WHEN.
Orders and their positions are stored in the database. Positions have statuses: open, edited by the client, closed. status of the order changes depending on the status and position number. It can be: empty, open, edited by the client, closed, transferring. The order is empty if it hasn't any positions. "Transferring" if the status of the positions doesn't match. If all positions have equal status, the same status is assigned to the order. Our task is to display a list of orders with stamped statuses for them. This list will be very convenient for sales managers working with CRM.
Let's try to collect our request from pieces. First, we combine two tables and group them by order's id:
SELECT o.id FROM orders o JOIN positions p ON o.id = p.order_id GROUP BY o.id
Now try to pull for each order his state:
SELECT o.id, CASE WHEN count(DISTINCT p.status_id) = 1 THEN max(p.status_id) WHEN count(p.status_id) = 0 THEN 'Empty' ELSE 'Transferring' END FROM orders o JOIN positions p ON o.id = p.order_id GROUP BY o.id
In the first condition we are using a little hack: if all positions in the order have the same status, then we can perform on it max, min or avg functions. Get exactly the value that we need as the result. Unfortunately, we can not use the status_id field itself, since it is not specified in the GROUP BY.
CASE WHEN clause awesome, but more awesome is the ability to put inside it other CASE WHEN clauses.
Probably, sales managers would not be very happy to look at status_id field, so we replace it with the text, using nested CASE clauses:
SELECT o.id, CASE WHEN count(DISTINCT p.status_id) = 1 THEN CASE max(p.status_id) WHEN 0 THEN 'Opened' WHEN 1 THEN 'Edited by the client' WHEN 2 THEN 'Closed' END WHEN count(p.status_id) = 0 THEN 'Empty' ELSE 'Transferring' END AS status FROM orders o LEFT JOIN positions p ON p.order_id = o.id GROUP BY o.id
The attentive reader will notice that the outer and inner CASEs are slightly different. In the outer clause conditions are within the WHEN, in the inner, we compare the max(p.status_id) with the values listed in the WHEN. Both alternatives are acceptable and are used depending on the situation. Is there any field that you want to compare with multiple values? Put it to the CASE. Do you have many different conditions? Leave blank CASE and describe conditions in the WHEN part.
SQL is still not a programming language, but its ability to simply and clearly manipulate data amaze and greatly simplifies the life and work of a programmer, same as DBA and analysts.