---
title: "Understanding GROUP BY in PostgreSQL (With Examples)"
description: "The GROUP BY clause allows you to group rows from SELECT statements in one or more columns. Learn how to use it."
section: "Postgres basics"
---

> **TimescaleDB is now Tiger Data.**

*Written by Sarah Conway
*

Regardless of whether you’re using PostgreSQL or TimescaleDB, you can expect exactly the same behavior from using the `GROUP BY` clause: it’ll allow you to return rows from [`SELECT` statements](https://www.timescale.com/learn/understanding-postgresql-select) in groups of one or more columns. It groups all rows that have the same value in all the columns listed into a single group row. This will remove any redundancy and is particularly useful when used with aggregate functions such as `SUM()`, `AVG()`, `COUNT()`, `MIN()`, or `MAX()`. 


The behavior of `GROUP BY` is very similar to window functions, yet there is a significant difference: when using `GROUP BY`, rows are collapsed into groups, so each field can no longer be individually accessed. [Window functions ](https://www.timescale.com/learn/postgresql-window-functions)allow you to reference information for each specific record in addition to viewing the result of the window function.


In short, you'll want to use `GROUP BY` when you're looking to eliminate duplicate rows, aggregate the result set, and squash rows prior to performing calculations. 

The (simplified) syntax for this clause as part of the overall `SELECT` statement is as follows:


`SELECT <expressions> 
FROM <tables> 
GROUP BY <condition>;
`

The full syntax of the `GROUP BY` clause itself is:


`GROUP BY [ ALL | DISTINCT ] grouping_element [, ...]
`

The order in which columns are specified does not matter.

And, of course, the use of this clause can be part of a complex `SELECT` statement as well. The following full syntax can be used as reference:


`SELECT [ ALL | DISTINCT | DISTINCT ON (<distinct_expressions>) ]
<expressions>
FROM <tables>
[WHERE <conditions>]
[GROUP BY <expressions>]
[HAVING <condition>]
[ORDER BY <expression> [ ASC | DESC | USING <operator> ] [ NULLS FIRST FIRST | NULLS LASTLAST ]]
[LIMIT [ <number_rows> | ALL]
[OFFSET OFFSET <offset_value> [ ROW | ROWS ]]
[FETCH FETCH { FIRST FIRST | NEXT } [ <fetch_rows> ] { ROW | ROWS } ONLY]
[FOR { UPDATE | SHARE } OF <table> [ NOWAIT ]];
`

You’re able to use `GROUP BY` with any combination of possible clauses within a `SELECT` statement. It’s evaluated after the `FROM` and `WHERE` clauses, prior to [`HAVING`](#), `DISTINCT`, `ORDER BY`, and [`LIMIT`](#).

## Examples

The following examples make use of a basic table called `customers` that contains the following fictional data:

| customer_id | firstname | lastname | state |
| --- | --- | --- | --- |
| 1 | Cheyenne | Holmes | Arizona |
| 2 | Alejandro | Ruiz | California |
| 3 | Martha | Parker | California |
| 4 | Robert | James | Florida |
| 5 | Danny | Angelo | North Carolina |





### Using PostgreSQL GROUP BY Without an Aggregate Function

The simplest form of the `GROUP BY` clause is to use it without an aggregate function on a single table, as follows:

`SELECT customer_id FROM orders GROUP BY customer_id;
`

When used in this manner, it works similarly to `SELECT` `DISTINCT` in that it will remove duplicate rows from a returned result set. In our case, we’re choosing to return all orders and are grouping by the `customer_id`; this means that we are essentially choosing to only return the `customer_id` of customers that have placed an order.

Result:

| customer_id |
| --- |
| 3 |
| 2 |
| 1 |




***Note****: When using *`*GROUP BY*`* without pairing it with an aggregate function in the above manner, for readability, it is encouraged to instead use *`*SELECT DISTINCT*`* as it serves functionally the same purpose with more concise wording. For example, the above query would be greatly simplified and re-written as *`*SELECT DISTINCT*`* *`*customer_id FROM orders;*`



### Using PostgreSQL GROUP BY With an Aggregate Function

Aggregate functions can be paired with `GROUP BY` for quick and condensed result sets for analytical or informational purposes. Retrieving these summarized and calculated results is the primary use case for `GROUP BY`. 

A simple example that groups the results by a single column is to count the number of customers for each state:

`SELECT COUNT(customer_id), state FROM customers GROUP BY state;
`

Results:

| count | state |
| --- | --- |
| 2 | California |
| 1 | Arizona |
| 1 | North Carolina |
| 1 | Florida |




### Using PostgreSQL GROUP BY With a JOIN

To list the number of orders made by each customer, you can join both tables and group the results by the customer name. For this example, a table was created named `orders` that contains the following fictional data:


| order_id | customer_id | order_total |
| --- | --- | --- |
| 1 | 1 | 85 |
| 2 | 2 | 21 |
| 3 | 1 | 101 |
| 4 | 3 | 44 |
| 5 | 2 | 3 |


We’ll select the columns for each customer’s first and last name, along with the count of orders for each customer, and sort these results across multiple columns using `GROUP BY`.


`SELECT customers.firstname, customers.lastname, COUNT(orders.order_id)
FROM orders
LEFT JOIN customers ON orders.customer_id = customers.customer_id
GROUP BY firstname, lastname;
`

Results:

| firstname | lastname | count |
| --- | --- | --- |
| Martha | Parker | 1 |
| Cheyenne | Holmes | 2 |
| Alejandro | Ruiz | 2 |




### Using PostgreSQL GROUP BY With a JOIN and ORDER BY

Continuing to build on the previous example, we can also sort the results in ascending or descending order by adding on `ORDER BY`. Ascending (`ASC`) order is the default. Else, you can specify `DESC` to get customers who have placed the most orders first in the result set, as shown here:

`SELECT customers.firstname, customers.lastname, COUNT(orders.order_id)
FROM orders
LEFT JOIN customers ON orders.customer_id = customers.customer_id
GROUP BY firstname, lastname
ORDER BY count DESC;
`

***Note****: If attempting to run a query similar to this for analytical purposes, using the *[*<u>LIMIT</u>*](https://www.postgresql.org/docs/current/queries-limit.html)* clause would also likely be helpful for more performant and efficient results.*


Result:

| firstname | lastname | count |
| --- | --- | --- |
| Cheyenne | Holmes | 2 |
| Alejandro | Ruiz | 2 |
| Martha | Parker | 1 |





### Using PostgreSQL GROUP BY With HAVING

To filter over grouped rows, you can specify the `HAVING` clause to remove groups from the result set that do not satisfy a specified condition. This clause occurs after all grouping and aggregation of data, meaning after all rows are returned, records that do not meet the specified condition are removed from the resultset. 

As an example referencing the two tables we’ve already created and queries we’ve already tried, we can combine prior queries to return the count of customers in each state that have placed more than one order:

`SELECT customers.firstname, customers.lastname, customers.state, COUNT(orders.order_id) AS total_number_of_orders 
FROM customers 
LEFT JOIN orders 
ON customers.customer_id = orders.customer_id 
GROUP BY firstname, lastname, state 
HAVING COUNT(orders.order_id) > 1;
`

Result:

| firstname | lastname | state | total_number_of_orders |
| --- | --- | --- | --- |
| Alejandro | Ruiz | California | 2 |
| Cheyenne | Holmes | Arizona | 2 |



### Using PostgreSQL GROUP BY With WHERE

What about the `WHERE` clause? That’ll need to be used before `GROUP BY` in a `SELECT` statement, as records are filtered with `WHERE` before the execution of `HAVING`. As a result, using `WHERE` leads to a faster result over `HAVING`. Both statements can be used within the same query; however, if the `WHERE` clause can provide the desired result on its own, the use of this clause should be prioritized to ensure an efficiently designed query.

If we want to look for the count of orders for a customer whose name closely matches “Cheyenn,” we’d use the following SQL statement:

`SELECT customers.firstname, customers.lastname, customers.state, COUNT(orders.order_id) AS total_number_of_orders 
FROM customers 
LEFT JOIN orders 
ON customers.customer_id = orders.customer_id 
WHERE firstname LIKE 'Cheyen%' 
GROUP BY firstname, lastname, state;
`

Result:

| firstname | lastname | state | total_number_of_orders |
| --- | --- | --- | --- |
| Cheyenne | Holmes | Arizona | 2 |



## 
Next Steps

To learn more about the `GROUP BY` clause and how to use it in PostgreSQL, you can check out PostgreSQL’s documentation on [<u>SELECT</u>](https://www.postgresql.org/docs/current/sql-select.html) (and the `GROUP BY` clause) and [<u>aggregate expressions</u>](https://www.postgresql.org/docs/current/functions-aggregate.html). The [<u>Table Expressions</u>](https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-GROUP) section of the documentation also contains information relevant to the `GROUP BY` clause.

It’s possible to achieve even more complex operations by combining `GROUP BY` with `GROUPING SETS`, `CUBE`, and `ROLLUP`. More information about these three concepts can be found in the [<u>official PostgreSQL documentation</u>](https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-GROUPING-SETS).

If you want to experiment with `GROUP BY` in a natively optimized and performant PostgreSQL database, all while enjoying the benefits of automatic data partitioning and [columnar](https://www.tigerdata.com/blog/building-columnar-compression-in-a-row-oriented-database) compression for further performance improvements and a reduced storage footprint, [<u>create a free Timescale account today</u>](https://console.cloud.timescale.com/signup).