Amazon Aurora PostgreSQL Limitless Database Now Generally Available
Amazon has announced the general availability of Amazon Aurora PostgreSQL Limitless Database, a new Amazon Aurora capability that provides serverless horizontal scaling through sharding. It lets users go beyond the existing limits on write throughput and storage by distributing a database workload across multiple Aurora writer instances, while still operating the workload as a single database.
Understanding the Architecture
Aurora PostgreSQL Limitless Database uses a two-layer architecture. A DB shard group contains multiple database nodes, and each node is either a router or a shard. Separating the two roles lets the system scale dynamically as workload requirements change.
- Routers: These nodes handle SQL connections, direct SQL commands to the appropriate shards, ensure overall system consistency, and relay query results back to clients.
- Shards: These nodes store a subset of each sharded table, full copies of reference tables, and standard tables. They process the queries sent by routers.
Users can organize their data into three types of tables:
- Sharded Tables: These tables are distributed across multiple shards. Rows are divided according to the values of designated columns called shard keys, which makes this type suited to scaling large, data-intensive tables.
- Reference Tables: These tables store a full copy of data on every shard. By doing so, they optimize join queries by reducing unnecessary data movement. They are typically used for stable reference data like product catalogs or ZIP codes.
- Standard Tables: These tables behave like regular Aurora PostgreSQL tables. They are placed together on a single shard, so joins between standard tables avoid unnecessary data movement. Users can convert standard tables into either sharded or reference tables.
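The placement rules for the three table types can be illustrated with a toy model. The sketch below is only a mental model, not how Aurora is implemented: the hash function, the function names shard_for and place_rows, and the choice of shard 0 for standard tables are all assumptions made for illustration.

```python
import hashlib


def shard_for(key_values, num_shards):
    """Map a tuple of shard-key values to a shard index with a stable hash.

    The hash used here (SHA-256) is an assumption for the toy model only.
    """
    raw = "|".join(str(v) for v in key_values).encode("utf-8")
    digest = hashlib.sha256(raw).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def place_rows(table_type, rows, key_columns, num_shards, standard_shard=0):
    """Return {shard_index: [rows]} for one table under the toy model."""
    placement = {i: [] for i in range(num_shards)}
    if table_type == "sharded":
        # Each row lands on exactly one shard, chosen by its shard-key values.
        for row in rows:
            key = tuple(row[c] for c in key_columns)
            placement[shard_for(key, num_shards)].append(row)
    elif table_type == "reference":
        # Every shard keeps a full copy of the table.
        for i in placement:
            placement[i] = list(rows)
    elif table_type == "standard":
        # The whole table lives on a single shard (shard 0 is hypothetical).
        placement[standard_shard] = list(rows)
    else:
        raise ValueError(f"unknown table type: {table_type}")
    return placement


if __name__ == "__main__":
    rows = [{"item_id": i, "item_cat": f"cat{i % 3}"} for i in range(10)]
    for kind in ("sharded", "reference", "standard"):
        counts = {s: len(r) for s, r in place_rows(
            kind, rows, ["item_id", "item_cat"], 4).items()}
        print(kind, counts)
```

In this model a join between a sharded table and a reference table never needs rows from another shard, which is the data-movement saving the table types are designed around.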
Getting Started with Aurora PostgreSQL Limitless Database
To begin using Aurora PostgreSQL Limitless Database, users can work through the AWS Management Console or the AWS Command Line Interface (CLI). The process involves creating a new database cluster that uses this feature, adding a DB shard group, and then querying the data.
Steps to Create an Aurora PostgreSQL Limitless Database Cluster
- Create a New Database Cluster: Open the Amazon RDS console and select "Create database." Choose "Aurora (PostgreSQL Compatible)" as the engine, and then select the "Aurora PostgreSQL with Limitless Database" version, which is compatible with PostgreSQL 16.4.
- Configure the DB Shard Group: Assign a name to your DB shard group and specify its minimum and maximum capacity, measured in Aurora Capacity Units (ACUs). The maximum capacity determines the initial number of routers and shards in the shard group. The system then adjusts each node's capacity dynamically based on current utilization.
- Deploy the DB Shard Group: Decide on the level of redundancy for the DB shard group by choosing options like no compute redundancy, one standby in a different Availability Zone, or two standbys across different zones.
Once your DB shard group is created, it will be visible on the Databases page. You can manage the shard group by changing capacity, splitting shards, or adding routers as needed.
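For users who script this setup instead of using the console, the shard group settings above can be collected and checked before any API call is made. The following is a minimal sketch, not an official procedure: the helper name shard_group_params and the identifier values are hypothetical, the 16 to 6144 ACU bound on maximum capacity and the three redundancy levels come from this article, and the dictionary keys are assumed to match the argument names of the boto3 RDS client's create_db_shard_group call, which should be verified against the current SDK documentation.

```python
MAX_ACU_LOWER = 16     # lowest allowed maximum capacity, per this article
MAX_ACU_UPPER = 6144   # highest allowed maximum capacity, per this article


def shard_group_params(cluster_id, group_id, min_acu, max_acu, redundancy=0):
    """Validate DB shard group settings and return them as a dict.

    redundancy: 0 = no compute redundancy, 1 = one standby in a different
    Availability Zone, 2 = two standbys in different Availability Zones.
    """
    if not MAX_ACU_LOWER <= max_acu <= MAX_ACU_UPPER:
        raise ValueError(
            f"max_acu must be between {MAX_ACU_LOWER} and {MAX_ACU_UPPER}")
    if not 0 < min_acu <= max_acu:
        raise ValueError("min_acu must be positive and no larger than max_acu")
    if redundancy not in (0, 1, 2):
        raise ValueError("redundancy must be 0, 1, or 2")
    # Key names are assumed to mirror boto3's create_db_shard_group arguments.
    return {
        "DBClusterIdentifier": cluster_id,
        "DBShardGroupIdentifier": group_id,
        "MinACU": float(min_acu),
        "MaxACU": float(max_acu),
        "ComputeRedundancy": redundancy,
    }


if __name__ == "__main__":
    # Identifiers below are placeholders, not real resources.
    params = shard_group_params(
        "my-limitless-cluster", "my-shard-group", 16, 768, redundancy=1)
    print(params)
```

Under that assumption, the resulting dictionary could be passed as keyword arguments to the SDK call once a cluster with Limitless Database enabled exists.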
Creating Tables in Aurora PostgreSQL Limitless Database
With Aurora PostgreSQL Limitless Database, users can create sharded, reference, and standard tables. Sharded and reference tables can be created directly or converted from existing standard tables to take advantage of distribution or replication.
For instance, you might create a sharded table named "items" using specific shard keys, such as item_id and item_cat. Similarly, reference tables can be created to store data like colors, which remain consistent across all shards.
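The items and colors examples above can be written out as SQL. The sketch below generates the statements in Python so they can be inspected before being run in a session connected to the database. It assumes the table mode and shard key are selected through the session settings rds_aurora.limitless_create_table_mode and rds_aurora.limitless_create_table_shard_key, so check those names against the Aurora User Guide. The helper names and every column other than item_id and item_cat are hypothetical.

```python
def sharded_table_statements(table, columns_ddl, shard_keys):
    """Return the SQL statements that create a sharded table.

    Inputs are interpolated as-is, so pass only trusted identifiers.
    """
    key_list = ", ".join(f'"{k}"' for k in shard_keys)
    return [
        "SET rds_aurora.limitless_create_table_mode = 'sharded';",
        f"SET rds_aurora.limitless_create_table_shard_key = '{{{key_list}}}';",
        f"CREATE TABLE {table} ({columns_ddl});",
    ]


def reference_table_statements(table, columns_ddl):
    """Return the SQL statements that create a reference table."""
    return [
        "SET rds_aurora.limitless_create_table_mode = 'reference';",
        f"CREATE TABLE {table} ({columns_ddl});",
    ]


if __name__ == "__main__":
    # Column definitions are illustrative only.
    items = sharded_table_statements(
        "items",
        "item_id int, item_cat varchar, val int, item text",
        ["item_id", "item_cat"],
    )
    colors = reference_table_statements(
        "colors", "color_id int primary key, color varchar")
    print("\n".join(items + colors))
```

Generating the statements as plain strings keeps the sketch independent of any particular PostgreSQL driver; the output can be pasted into psql or executed through whichever client library is already in use.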
Querying Aurora PostgreSQL Limitless Database Tables
The Aurora PostgreSQL Limitless Database supports standard PostgreSQL syntax for queries, making it accessible for users familiar with PostgreSQL. You can load data into these tables using the COPY command or a dedicated data loading utility. Queries are executed on routers and shards, with the router coordinating the process.
The system employs two main querying methods:
- Single-Shard Query: This approach is used when all data needed for the query resides on a single shard, allowing the entire operation to be processed there.
- Distributed Query: This method is used when a query involves multiple shards. The router manages a distributed transaction, coordinating operations across the participating shards.
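The difference between the two methods comes down to how many shards hold the rows a query touches. The following toy sketch shows that decision from the router's point of view. It is a conceptual illustration only: the hashing scheme and the names shard_for and classify_query are assumptions, and the real planner is considerably more involved.

```python
import hashlib


def shard_for(key_values, num_shards):
    """Toy stable hash from shard-key values to a shard index (assumed scheme)."""
    raw = "|".join(str(v) for v in key_values).encode("utf-8")
    return int.from_bytes(hashlib.sha256(raw).digest()[:8], "big") % num_shards


def classify_query(shard_key_values, num_shards):
    """Classify a query as 'single-shard' or 'distributed'.

    shard_key_values: the shard-key tuples the query filters on, or
    None/empty when the query does not constrain the shard key at all.
    Returns (kind, sorted list of shard indexes involved).
    """
    if not shard_key_values:
        # Without a shard-key filter, every shard may hold matching rows.
        return "distributed", list(range(num_shards))
    targets = {shard_for(k, num_shards) for k in shard_key_values}
    kind = "single-shard" if len(targets) == 1 else "distributed"
    return kind, sorted(targets)


if __name__ == "__main__":
    # A lookup on one (item_id, item_cat) pair always targets one shard.
    print(classify_query([(42, "toys")], num_shards=4))
    # A scan with no shard-key filter fans out to all shards.
    print(classify_query(None, num_shards=4))
```

The practical takeaway is that queries filtering on the full shard key are the ones most likely to stay on a single shard.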
Important Considerations
- Compute: Each DB cluster can have only one DB shard group, and the maximum capacity of that group can be set from 16 to 6144 ACUs. The initial number of routers and shards is determined by the maximum capacity set at creation.
- Storage: The database supports only the Amazon Aurora I/O-Optimized storage configuration, with a maximum shard capacity of 128 TiB and a reference table size limit of 32 TiB. Users can reclaim space using the PostgreSQL vacuuming utility.
- Monitoring: Users can monitor their database using Amazon CloudWatch, CloudWatch Logs, or Performance Insights. New statistics functions and views are available for enhanced monitoring and diagnostics.
Availability
Amazon Aurora PostgreSQL Limitless Database is now available with PostgreSQL 16.4 compatibility in multiple AWS Regions, including US East (N. Virginia), US East (Ohio), US West (Oregon), and several Regions in Asia Pacific and Europe.
For more detailed information, refer to the Amazon Aurora User Guide and other AWS resources. The feature is aimed at enterprises that run large-scale, data-intensive applications and need to scale write throughput and storage beyond a single writer instance.