Choose shard count in Azure Cosmos DB for PostgreSQL

Applies to: ✅ PostgreSQL

Important

Azure Cosmos DB for PostgreSQL is on a retirement path and is no longer recommended for new projects. Instead, use one of these two services:

For PostgreSQL workloads, use the Elastic Clusters feature of Azure Database for PostgreSQL, which provides the horizontal scale-out and distributed PostgreSQL features of the open-source Citus extension.
For NoSQL workloads, use Azure Cosmos DB for NoSQL for a distributed database solution that includes a 99.999% availability service level agreement (SLA), instant autoscale, and automatic failover across multiple regions.

Choosing the shard count for each distributed table is a balance between the flexibility of having more shards and the overhead of query planning and execution across them. If you decide to change the shard count of a table after distributing it, you can use the alter_distributed_table function.

Multi-tenant SaaS use case

The optimal choice depends on your access patterns for the data. For instance, in the multi-tenant SaaS database use case, we recommend choosing between 32 and 128 shards. For smaller workloads, say under 100 GB, you could start with 32 shards; for larger workloads, you could choose 64 or 128. This choice gives you the leeway to scale from 32 to 128 worker machines.
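
The sizing rule above can be sketched as a small helper. This is only an illustration, not an official sizing tool: the 100 GB threshold and the values 32, 64, and 128 come from the guidance above, while the function name and the expect_large_growth flag are hypothetical, introduced here because the guidance doesn't say how to choose between 64 and 128.

```python
def suggest_multi_tenant_shard_count(data_size_gb: float,
                                     expect_large_growth: bool = False) -> int:
    """Suggest a starting shard count for a multi-tenant SaaS workload.

    The 100 GB threshold and the 32/64/128 values follow the guidance in
    the text. expect_large_growth is a hypothetical flag used only to
    pick between 64 and 128 for larger workloads.
    """
    if data_size_gb < 0:
        raise ValueError("data_size_gb must be non-negative")
    if data_size_gb < 100:
        # Smaller workloads: start with 32 shards.
        return 32
    # Larger workloads: 64 or 128, depending on the assumed growth flag.
    return 128 if expect_large_growth else 64


if __name__ == "__main__":
    print(suggest_multi_tenant_shard_count(40))
    print(suggest_multi_tenant_shard_count(500))
    print(suggest_multi_tenant_shard_count(500, expect_large_growth=True))
```

Treat the result as a starting point and adjust it to your own access patterns.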

Real-time analytics use case

In the real-time analytics use case, shard count should be related to the total number of cores on the workers. To ensure maximum parallelism, create enough shards on each node that there is at least one shard per CPU core. We typically recommend creating a high number of initial shards, for example, 2x or 4x the number of current CPU cores. Having more shards allows for future scaling if you add more workers and CPU cores.
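
As a rough illustration of the arithmetic, the following sketch derives an initial shard count from the worker cores. The function and parameter names are hypothetical, and the sketch assumes that every worker has the same number of cores and that shards are spread evenly across workers.

```python
def suggest_analytics_shard_count(worker_count: int,
                                  cores_per_worker: int,
                                  multiplier: int = 2) -> int:
    """Suggest an initial shard count for a real-time analytics workload.

    multiplier is the "2x or 4x" factor from the text. A multiplier of
    at least 1 keeps at least one shard per CPU core, assuming shards
    are spread evenly across identical workers.
    """
    if worker_count < 1 or cores_per_worker < 1 or multiplier < 1:
        raise ValueError("all arguments must be at least 1")
    total_cores = worker_count * cores_per_worker
    return total_cores * multiplier


if __name__ == "__main__":
    # Hypothetical cluster: 4 workers with 8 cores each.
    print(suggest_analytics_shard_count(4, 8))                # 2x -> 64
    print(suggest_analytics_shard_count(4, 8, multiplier=4))  # 4x -> 128
```

A higher multiplier leaves more room to add workers later, at the cost of more per-query overhead and more connections per query.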

Keep in mind that, for each query, Azure Cosmos DB for PostgreSQL opens one database connection per shard, and that these connections are limited. Keep the shard count small enough that distributed queries won't often have to wait for a connection. Put another way, the connections needed (max concurrent queries * shard count) shouldn't exceed the total connections possible in the system (number of workers * max_connections per worker).
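
The connection budget described above can be checked with a few lines of arithmetic. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the max_connections value of 300 is a made-up example rather than a service default, and the model takes the worst case from the text, where each query opens one connection per shard.

```python
def connections_needed(max_concurrent_queries: int, shard_count: int) -> int:
    """Worst case: each concurrent query opens one connection per shard."""
    return max_concurrent_queries * shard_count


def connections_available(worker_count: int,
                          max_connections_per_worker: int) -> int:
    """Total connections possible across all workers."""
    return worker_count * max_connections_per_worker


def fits_connection_budget(max_concurrent_queries: int, shard_count: int,
                           worker_count: int,
                           max_connections_per_worker: int) -> bool:
    """True if the needed connections don't exceed what the workers allow."""
    needed = connections_needed(max_concurrent_queries, shard_count)
    available = connections_available(worker_count, max_connections_per_worker)
    return needed <= available


def max_queries_without_waiting(shard_count: int, worker_count: int,
                                max_connections_per_worker: int) -> int:
    """Largest number of concurrent queries that stays within the budget."""
    available = connections_available(worker_count, max_connections_per_worker)
    return available // shard_count


if __name__ == "__main__":
    # Hypothetical cluster: 4 workers, max_connections = 300 on each.
    print(fits_connection_budget(10, 64, 4, 300))   # 640 <= 1200 -> True
    print(fits_connection_budget(50, 64, 4, 300))   # 3200 > 1200 -> False
    print(max_queries_without_waiting(64, 4, 300))  # 1200 // 64 -> 18
```

If the check fails, the options implied by the formula are a lower shard count, fewer concurrent queries, or more workers.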

Next steps

Learn more about cluster performance options
Scale a cluster up or out
Rebalance shards

Last updated on 2025-10-30