Amazon RDS-Redshift zero-ETL integration enables real-time analytics, now available

Amazon RDS for MySQL Zero-ETL Integration with Amazon Redshift: A Comprehensive Guide

In big data and analytics, integrating data from many sources quickly and efficiently is critical. Zero-ETL integration, which removes the need to build extract, transform, and load (ETL) pipelines yourself, aims to unify data across applications and data sources. It is designed to give a holistic view by breaking down data silos, a common problem in which data sits in isolated systems and is hard to access and analyze as a whole.

Today, we’re excited to announce a significant milestone in this technology: the general availability of Zero-ETL integration between Amazon RDS for MySQL and Amazon Redshift. This integration brings several new features, including data filtering, support for multiple integrations, and the ability to configure Zero-ETL setups using AWS CloudFormation templates. Let’s delve into these features and understand how you can leverage them for your data strategy.

What is Zero-ETL Integration?

Zero-ETL integration makes petabytes of transactional data available in Amazon Redshift in near real time after it is written to Amazon RDS for MySQL. This fully managed, no-code solution eliminates the need to create your own ETL jobs, which simplifies data ingestion, reduces operational overhead, and can lower data processing costs.

Last year, we announced the general availability of Zero-ETL integration for Amazon Aurora MySQL-Compatible Edition and a preview for Aurora PostgreSQL-Compatible Edition, Amazon DynamoDB, and RDS for MySQL. Now, with the GA release of Zero-ETL for Amazon RDS for MySQL, businesses can take advantage of even more features to streamline their data workflows.

Key Features of Zero-ETL Integration

Data Filtering

Data filtering is an essential feature for companies looking to optimize their data processing and storage costs. By selectively replicating only the necessary subset of data from production databases, businesses can significantly reduce overhead. Additionally, data filtering helps in excluding personally identifiable information (PII) from datasets, which is crucial for compliance with data protection regulations.

For instance, a healthcare provider might wish to exclude sensitive patient information when generating aggregate reports for analysis. Similarly, an e-commerce business might want to share customer spending patterns with its marketing team while excluding any identifying information. However, certain scenarios, such as fraud detection, require all data to be available in near real-time for accurate inferences, making filtering less applicable.

You can enable filtering either during the initial setup of the Zero-ETL integration or by modifying an existing integration. The filtering options are in the "Source" step of the Zero-ETL creation wizard. There you can enter filter expressions that include or exclude specific databases or tables, in the format database*.table*. You can add multiple expressions; they are evaluated from left to right.
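The left-to-right evaluation can be illustrated with a small Python sketch. It is a local model for reasoning about filters, not the service's implementation. It assumes the comma-separated "include: db.table, exclude: db.table" string form used by the API, treats * as a simple wildcard, lets the last matching expression decide, and replicates a table that no expression matches. Check each of these assumptions against the AWS documentation before relying on them.

```python
from fnmatch import fnmatchcase


def parse_filter(data_filter: str) -> list[tuple[str, str]]:
    """Split 'include: a.b, exclude: c.*' into (action, pattern) pairs."""
    rules = []
    for part in data_filter.split(","):
        action, _, pattern = part.strip().partition(":")
        action, pattern = action.strip().lower(), pattern.strip()
        if action not in ("include", "exclude") or "." not in pattern:
            raise ValueError(f"malformed filter expression: {part!r}")
        rules.append((action, pattern))
    return rules


def is_replicated(data_filter: str, database: str, table: str,
                  default: bool = True) -> bool:
    """Apply the rules left to right; the last matching rule decides.

    `default` is the outcome when no rule matches (an assumption here).
    """
    decision = default
    for action, pattern in parse_filter(data_filter):
        db_pattern, _, table_pattern = pattern.partition(".")
        if fnmatchcase(database, db_pattern) and fnmatchcase(table, table_pattern):
            decision = action == "include"
    return decision


# Hypothetical database and table names, for illustration only.
example_filter = "include: *.*, exclude: sales.customers*"
for db, tbl in [("sales", "orders"), ("sales", "customers_pii")]:
    print(db, tbl, is_replicated(example_filter, db, tbl))
```

In this model, putting the broad expression first and the narrow exception after it is what makes the exception take effect.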

When modifying an existing integration, new filtering rules will apply from the moment you confirm the changes, and Amazon Redshift will drop tables that no longer match the filter criteria.
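Outside the console, the same change can be made programmatically. The sketch below assumes the boto3 RDS client's modify_integration operation with its IntegrationIdentifier and DataFilter parameters; the function name and the client-as-argument design are illustrative choices, not part of the original article. Passing the client in keeps the function easy to dry-run with a stand-in object before pointing it at a real account.

```python
def update_data_filter(rds_client, integration_identifier: str,
                       data_filter: str) -> dict:
    """Replace the data filter on an existing zero-ETL integration.

    `rds_client` is expected to behave like boto3.client("rds").
    Note that tables which stop matching the new filter are dropped
    from Amazon Redshift, as described above.
    """
    return rds_client.modify_integration(
        IntegrationIdentifier=integration_identifier,
        DataFilter=data_filter,
    )
```

With real credentials, a call might look like update_data_filter(boto3.client("rds"), "my-integration", "include: *.*, exclude: sales.customers*"), where the integration identifier and filter are placeholders.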

For a detailed guide on setting up data filters for Amazon Aurora Zero-ETL integrations, you can refer to the AWS blog post, as the steps are very similar.

Multiple Integrations from a Single Database

With the new release, you can now create up to five Zero-ETL integrations from a single Amazon RDS for MySQL database to different Amazon Redshift data warehouses. This feature allows different teams within an organization to access transactional data while maintaining ownership of their specific data warehouses.

For example, you can use data filtering to fan out different sets of data to development, staging, and production Amazon Redshift clusters from the same production database. This flexibility enables better data management and enhances collaboration among teams.
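A fan-out like this can be planned in code before anything is created. The following sketch only builds request parameters and enforces the five-integration limit mentioned above. The integration names, ARNs, and filter strings are placeholders, and the parameter names are assumed to match the boto3 RDS create_integration operation.

```python
MAX_INTEGRATIONS_PER_SOURCE = 5  # limit stated in the article


def plan_fan_out(source_arn: str,
                 targets: dict[str, tuple[str, str]]) -> list[dict]:
    """Build one create-integration request per target warehouse.

    `targets` maps an integration name to (target ARN, data filter).
    """
    if len(targets) > MAX_INTEGRATIONS_PER_SOURCE:
        raise ValueError(
            f"{len(targets)} integrations requested; "
            f"at most {MAX_INTEGRATIONS_PER_SOURCE} per source database"
        )
    return [
        {
            "IntegrationName": name,
            "SourceArn": source_arn,
            "TargetArn": target_arn,
            "DataFilter": data_filter,
        }
        for name, (target_arn, data_filter) in targets.items()
    ]


# Placeholder values, for illustration only.
plan = plan_fan_out(
    "arn:aws:rds:EXAMPLE:db:production-db",
    {
        "to-dev": ("arn:EXAMPLE:dev-warehouse", "include: app.orders_sample*"),
        "to-staging": ("arn:EXAMPLE:staging-warehouse", "include: app.*"),
        "to-prod": ("arn:EXAMPLE:prod-warehouse", "include: *.*"),
    },
)
for request in plan:
    print(request["IntegrationName"], "->", request["DataFilter"])
```

Keeping the plan as plain data makes it straightforward to review which subset of tables each warehouse will receive before any integration exists.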

Another potential use case is the consolidation of Amazon Redshift clusters by replicating data to different warehouses. You can leverage Amazon Redshift materialized views to explore data, power Amazon QuickSight dashboards, share data, train machine learning models using Amazon SageMaker, and more.

Setting Up Zero-ETL Integration

To get started with Zero-ETL integration, you can follow the step-by-step walkthrough in the AWS blog post that describes the setup process for Aurora MySQL-Compatible Edition. The experience is very similar for Amazon RDS for MySQL.

Technical Requirements and Availability

Zero-ETL integration is available for Amazon RDS for MySQL versions 8.0.32 and later, Amazon Redshift Serverless, and Amazon Redshift RA3 instance types. It is supported in various AWS Regions; the AWS documentation has the detailed list.
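A pre-flight check for the version requirement could look like the sketch below. It assumes the engine version is a dotted numeric string such as "8.0.32" and reads "and later" as any numerically higher version. Whether a given newer release is actually supported should be confirmed in the AWS documentation.

```python
MIN_VERSION = (8, 0, 32)  # minimum RDS for MySQL version stated in the article


def meets_minimum_version(engine_version: str) -> bool:
    """Return True if a dotted version string is at least 8.0.32.

    Raises ValueError if a component is not numeric.
    """
    parts = tuple(int(p) for p in engine_version.split(".")[:3])
    return parts >= MIN_VERSION
```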

Besides using the AWS Management Console, you can also set up Zero-ETL integration via the AWS Command Line Interface (AWS CLI) and by using an AWS SDK such as boto3, the official AWS SDK for Python. Detailed instructions can be found in the AWS documentation.
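As a rough illustration of the SDK route, the sketch below wraps the boto3 RDS client's create_integration operation. The parameter names (IntegrationName, SourceArn, TargetArn, DataFilter) are assumed to match that operation; the function name and the optional-filter handling are illustrative. The client is passed in rather than created inside the function, so the code has no AWS dependency until you supply one.

```python
from typing import Optional


def create_zero_etl_integration(rds_client, name: str, source_arn: str,
                                target_arn: str,
                                data_filter: Optional[str] = None) -> dict:
    """Create a zero-ETL integration from an RDS for MySQL source.

    `rds_client` is expected to behave like boto3.client("rds").
    The data filter is sent only when one is given; otherwise the
    service default applies.
    """
    params = {
        "IntegrationName": name,
        "SourceArn": source_arn,
        "TargetArn": target_arn,
    }
    if data_filter is not None:
        params["DataFilter"] = data_filter
    return rds_client.create_integration(**params)
```

With boto3 installed and credentials configured, the client would come from boto3.client("rds"), and the ARNs would be those of your source database and target Redshift namespace.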

Conclusion

The general availability of Zero-ETL integration between Amazon RDS for MySQL and Amazon Redshift marks a significant advancement in data management and analytics. This solution allows businesses to replicate data for near real-time analytics without the need to build and manage complex data pipelines. The inclusion of features like data filtering and multiple integrations further enhances the flexibility and usability of this technology.

Whether you aim to optimize data processing costs, comply with data protection regulations, or empower different teams with specific data sets, Zero-ETL integration provides a robust solution. We encourage you to explore this technology and see how it can benefit your organization.

For more information and a deeper dive into setting up Zero-ETL integrations, refer to the AWS documentation.

— Matheus Guimaraes

For more information, refer to this article.

Neil S
Neil is a highly qualified Technical Writer with an M.Sc(IT) degree and an impressive range of IT and Support certifications including MCSE, CCNA, ACA(Adobe Certified Associates), and PG Dip (IT). With over 10 years of hands-on experience as an IT support engineer across Windows, Mac, iOS, and Linux Server platforms, Neil possesses the expertise to create comprehensive and user-friendly documentation that simplifies complex technical concepts for a wide audience.