Amazon Redshift integrates with Aurora PostgreSQL and DynamoDB seamlessly.


Today marks a significant advancement in data management with the general availability of Amazon Aurora PostgreSQL-Compatible Edition and Amazon DynamoDB zero-ETL integrations with Amazon Redshift. This development is a game-changer for businesses seeking to streamline their data processing workflows. The zero-ETL integration, a concept that eliminates the need for the traditional extract, transform, and load (ETL) processes, allows businesses to make their transactional or operational data readily available in Amazon Redshift for analytics and machine learning. This seamless integration automates data replication, ensuring that data is consistently updated and available for analytical processing.

With zero-ETL integrations, businesses can run comprehensive analytics on data from various applications without the hassle of constructing and managing multiple data pipelines. This integration simplifies the process of consolidating data from diverse relational and non-relational sources into a singular data warehouse. In this article, we’ll delve into how you can get started with Amazon Aurora PostgreSQL and Amazon DynamoDB zero-ETL integrations with Amazon Redshift.

Understanding Zero-ETL Integration

To implement a zero-ETL integration, you need to define a source and designate Amazon Redshift as the target. The integration automatically replicates data from the source to the target data warehouse, ensuring a seamless data flow into Amazon Redshift. It also monitors the health of the pipeline to ensure data integrity and availability.

Let’s break down the process of setting up these integrations, focusing on how to create zero-ETL integrations that replicate data from Aurora PostgreSQL and DynamoDB to Amazon Redshift. We’ll also explore how to choose multiple tables or databases from Aurora PostgreSQL to replicate data to a single Amazon Redshift cluster. This integration offers flexibility without the burden of managing numerous ETL pipelines.

Getting Started with Aurora PostgreSQL Zero-ETL Integration

Before setting up your database, it’s crucial to create a custom cluster parameter group since Aurora PostgreSQL zero-ETL integration with Amazon Redshift requires specific parameters. In the Amazon RDS console, navigate to the Parameter groups section and create a new parameter group. Name it custom-pg-aurora-postgres-zero-etl, select Aurora PostgreSQL for the Engine type, and choose aurora-postgresql16 for the Parameter group family. This integration is compatible with PostgreSQL versions 16.4 and above.

Once the parameter group is created, modify it by selecting it from the Parameter groups page. Click on Actions and choose Edit to set the following parameters:

  • rds.logical_replication=1
  • aurora.enhanced_logical_replication=1
  • aurora.logical_replication_backup=0
  • aurora.logical_replication_globaldb=0
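If you prefer scripting these console steps, the same parameter group can be created and configured with the AWS CLI. The group name and family come from the steps above; the use of pending-reboot as the apply method is an assumption, since static parameters such as rds.logical_replication only take effect after a cluster reboot:

```shell
# Create the custom cluster parameter group used in this walkthrough
aws rds create-db-cluster-parameter-group \
  --db-cluster-parameter-group-name custom-pg-aurora-postgres-zero-etl \
  --db-parameter-group-family aurora-postgresql16 \
  --description "Zero-ETL parameter group for Aurora PostgreSQL"

# Set the four parameters required by the zero-ETL integration
aws rds modify-db-cluster-parameter-group \
  --db-cluster-parameter-group-name custom-pg-aurora-postgres-zero-etl \
  --parameters \
    "ParameterName=rds.logical_replication,ParameterValue=1,ApplyMethod=pending-reboot" \
    "ParameterName=aurora.enhanced_logical_replication,ParameterValue=1,ApplyMethod=pending-reboot" \
    "ParameterName=aurora.logical_replication_backup,ParameterValue=0,ApplyMethod=pending-reboot" \
    "ParameterName=aurora.logical_replication_globaldb,ParameterValue=0,ApplyMethod=pending-reboot"
```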

After saving these changes, proceed to create an Aurora PostgreSQL database. During the setup, choose Aurora PostgreSQL (compatible with PostgreSQL 16.4 or above) from the available versions and select the custom cluster parameter group (custom-pg-aurora-postgres-zero-etl) in the Additional configuration section.

Once your database is ready, connect to the Aurora PostgreSQL cluster, create a database named books, and a table called book_catalog. Insert some sample data to use with the zero-ETL integration.
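A minimal sketch of that setup with psql is shown below. The cluster endpoint, user, and table schema are placeholders and assumptions for illustration; any columns will do, as long as the table has a primary key:

```shell
# Connect to the cluster writer endpoint (endpoint and user are placeholders)
psql -h <cluster-writer-endpoint> -U postgres -c "CREATE DATABASE books;"

psql -h <cluster-writer-endpoint> -U postgres -d books <<'SQL'
-- Illustrative schema; zero-ETL requires a primary key on replicated tables
CREATE TABLE book_catalog (
    book_id   INT PRIMARY KEY,
    title     VARCHAR(200),
    author    VARCHAR(100),
    published DATE
);
INSERT INTO book_catalog VALUES
    (1, 'The Light in the Forest', 'C. Richter',  '1953-01-01'),
    (2, 'A Wrinkle in Time',       'M. L''Engle', '1962-01-01');
SQL
```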

To initiate a zero-ETL integration, use an existing Amazon Redshift data warehouse. For assistance with creating and managing Amazon Redshift resources, refer to the Amazon Redshift Getting Started Guide.

In the Amazon RDS console, navigate to the Zero-ETL integrations tab and select Create zero-ETL integration. Enter postgres-redshift-zero-etl as the Integration identifier and provide a description. Follow the prompts to select your source database and apply data filtering options using the database.schema.table pattern, including the book_catalog table from the Aurora PostgreSQL books database. Using * as the schema (for example, books.*.book_catalog) replicates the book_catalog table from every schema in the books database.
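The same integration can be created from the CLI. The ARNs below are placeholders, and the exact filter string is an assumption based on the database.schema.table pattern described above:

```shell
# Source and target ARNs are placeholders for your cluster and namespace
aws rds create-integration \
  --integration-name postgres-redshift-zero-etl \
  --source-arn arn:aws:rds:us-east-1:111122223333:cluster:my-aurora-pg-cluster \
  --target-arn arn:aws:redshift:us-east-1:111122223333:namespace:my-namespace-id \
  --data-filter "include: books.*.book_catalog" \
  --description "Aurora PostgreSQL to Amazon Redshift zero-ETL"
```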

Next, browse Redshift data warehouses and select your target Amazon Redshift data warehouse. Specify authorized principals and the integration source on the target to allow Amazon Aurora to replicate data into the warehouse. You can choose to let Amazon RDS handle these configurations automatically or configure them manually in Amazon Redshift.

After configuring the case sensitivity parameter and resource policy for the data warehouse, proceed to add tags and encryption. Review your setup and create the zero-ETL integration.

Once the integration succeeds, access the integration details and create a database from the integration. Enter zeroetl_aurorapg as the Destination database name and finalize the setup.

When the database is created, return to the Aurora PostgreSQL integration page to query data. Running a select query in the zeroetl_aurorapg database should show that the data in the book_catalog table is successfully replicated to Amazon Redshift.
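Besides the query editor, the check can be run through the Redshift Data API. The workgroup name is a placeholder, and the schema name public is an assumption based on the source table living in the default PostgreSQL schema:

```shell
# Run a select against the replicated table (workgroup name is a placeholder)
aws redshift-data execute-statement \
  --workgroup-name my-redshift-workgroup \
  --database zeroetl_aurorapg \
  --sql "SELECT * FROM public.book_catalog;"

# Fetch the rows using the statement Id returned by the previous command
aws redshift-data get-statement-result --id <statement-id>
```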

By selecting multiple tables or databases from the Aurora PostgreSQL source database, you can replicate additional data to the same Amazon Redshift cluster. To add more tables, update the Data filtering options to include new tables using the database.schema.table format. For example, you can create another table named publisher in the Aurora PostgreSQL cluster and insert sample data. Modify the Data filtering options to include the publisher table for replication.
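Updating the filter can also be scripted. The integration identifier is a placeholder, and the comma-separated filter syntax is an assumption following the same pattern used when the integration was created:

```shell
# Extend the existing data filter to also replicate the publisher table
aws rds modify-integration \
  --integration-identifier <integration-id> \
  --data-filter "include: books.*.book_catalog, include: books.*.publisher"
```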

Switch to the Amazon Redshift Query editor to verify that the new publisher table and its records are replicated to the data warehouse.

Getting Started with DynamoDB Zero-ETL Integration

To create an Amazon DynamoDB zero-ETL integration, use an existing DynamoDB table named Book_Catalog. In the Amazon Redshift console, navigate to Zero-ETL integrations, select Create zero-ETL integration, and choose Create DynamoDB integration. Provide a name and description for the integration and proceed to the next steps.

Choose your DynamoDB table and specify a resource policy with authorized principals and integration sources. Enable point-in-time recovery (PITR) on the source table before creating the integration. Amazon DynamoDB can apply the necessary configurations for you, or you can do it manually.
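If you want to switch PITR on yourself rather than let the console do it, the CLI commands are:

```shell
# Point-in-time recovery must be enabled before the integration is created
aws dynamodb update-continuous-backups \
  --table-name Book_Catalog \
  --point-in-time-recovery-specification PointInTimeRecoveryEnabled=true

# Confirm that PITR now reports ENABLED
aws dynamodb describe-continuous-backups --table-name Book_Catalog
```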

Select your existing Amazon Redshift Serverless data warehouse as the target and proceed to add tags and encryption. Review your settings and create the DynamoDB integration.

Create a database from the integration by entering zeroetl_dynamodb as the Destination database name. Once the database is created, query the data to confirm that the data from the DynamoDB Book_Catalog table is replicated to Amazon Redshift.
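The destination database can also be created with SQL against the warehouse, for example through the Data API. The workgroup name and integration identifier are placeholders:

```shell
# CREATE DATABASE ... FROM INTEGRATION links the new database to the integration
aws redshift-data execute-statement \
  --workgroup-name my-redshift-workgroup \
  --database dev \
  --sql "CREATE DATABASE zeroetl_dynamodb FROM INTEGRATION '<integration-id>';"
```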

Insert another entry into the DynamoDB Book_Catalog table and verify that the new record is replicated to the data warehouse by refreshing the select query in Amazon Redshift.

Benefits and Availability

Zero-ETL integrations between Aurora PostgreSQL and DynamoDB with Amazon Redshift unify data from multiple database clusters, unlocking valuable insights. Amazon Redshift supports cross-database queries and materialized views based on multiple tables, simplifying analytics, improving operational efficiency, and optimizing costs. With zero-ETL, there’s no need to manage complex ETL pipelines.
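As a small illustration of cross-database queries, a statement run in one database can reference tables replicated into another by the integrations above. The workgroup name is a placeholder and the three-part naming assumes the defaults used in this walkthrough:

```shell
# While connected to dev, read the Aurora-replicated table with three-part naming
aws redshift-data execute-statement \
  --workgroup-name my-redshift-workgroup \
  --database dev \
  --sql "SELECT COUNT(*) FROM zeroetl_aurorapg.public.book_catalog;"
```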

Aurora PostgreSQL zero-ETL integration with Amazon Redshift is now available in several AWS Regions, including US East (N. Virginia), US East (Ohio), US West (Oregon), and others in Asia Pacific and Europe. Amazon DynamoDB zero-ETL integration is available in all commercial, China, and GovCloud AWS Regions.

For pricing details, refer to the Amazon Aurora and Amazon DynamoDB pricing pages. To learn more about these integrations, consult the Working with Aurora zero-ETL integrations with Amazon Redshift and Amazon Redshift Zero-ETL integrations documentation.


Neil S
Neil is a highly qualified Technical Writer with an M.Sc(IT) degree and an impressive range of IT and Support certifications including MCSE, CCNA, ACA(Adobe Certified Associates), and PG Dip (IT). With over 10 years of hands-on experience as an IT support engineer across Windows, Mac, iOS, and Linux Server platforms, Neil possesses the expertise to create comprehensive and user-friendly documentation that simplifies complex technical concepts for a wide audience.