From data lake to AI-ready analytics: Introducing new data source with S3 Tables in Amazon Quick
Query Apache Iceberg tables in S3 without moving data, reducing latency and cost.
Amazon QuickSight, the unified AI-powered analytics service, now supports Amazon S3 Tables (Apache Iceberg format) as a native data source. Organizations can query and visualize data stored in S3 table buckets directly, without an intermediate data warehouse or OLAP system. The feature works in both Direct Query and SPICE (Super-fast, Parallel, In-memory Calculation Engine) modes, so teams can choose between them based on their latency and concurrency needs. Key benefits include removing data movement overhead, enabling near real-time analytics, and scaling to very large datasets. This is particularly valuable for enterprises that use open table formats like Apache Iceberg as a single source of truth.
In a typical architecture, transaction events from sources like point-of-sale systems, mobile apps, and IoT devices are streamed via Amazon Kinesis Data Streams into Amazon Data Firehose, which writes directly to an S3 table bucket. QuickSight's native connector then queries this data in near real time, so business users can analyze fraud trends, approval rates, and regional patterns using natural language, with no specialized ML expertise required. This reduces dependency on batch processing and complex data pipelines. The solution targets organizations in finance, retail, and other sectors where rapid, data-driven decisions are critical. By combining QuickSight's agentic AI capabilities with S3 Tables, Amazon reinforces its data lake-centric strategy.
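As a rough illustration of the first hop in that pipeline, the sketch below shows how a producer might package a transaction event for Kinesis Data Streams. It is not taken from the announcement: the stream name `transactions-stream`, the event fields (`transaction_id`, `region`, `amount`, `approved`), and the helper names `build_record` and `send` are all assumptions made for the example. The only AWS call used is boto3's `put_record` on the Kinesis client, and the Firehose delivery to the S3 table bucket is assumed to be configured separately.

```python
import json
from datetime import datetime, timezone

# Hypothetical stream name; replace with the stream Firehose reads from.
STREAM_NAME = "transactions-stream"


def build_record(event: dict, stream_name: str = STREAM_NAME) -> dict:
    """Build keyword arguments for a Kinesis PutRecord call.

    The event schema is an assumption for illustration only.
    """
    payload = dict(event)
    # Stamp the event with a UTC timestamp if the source did not supply one.
    payload.setdefault("event_time", datetime.now(timezone.utc).isoformat())
    return {
        "StreamName": stream_name,
        # Newline-delimited JSON, encoded as bytes for the Data field.
        "Data": (json.dumps(payload) + "\n").encode("utf-8"),
        # Partition by transaction ID to spread records across shards.
        "PartitionKey": str(payload["transaction_id"]),
    }


def send(event: dict) -> None:
    """Send one event to Kinesis (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so build_record works without AWS access

    boto3.client("kinesis").put_record(**build_record(event))
```

Calling `send({"transaction_id": "t-1", "region": "EMEA", "amount": 12.5, "approved": True})` would publish a single event. Keeping the record construction in `build_record` makes the payload format easy to inspect without touching AWS.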
- Amazon QuickSight now supports direct querying of Apache Iceberg tables in S3 table buckets, eliminating intermediate data layers.
- Works with both Direct Query and SPICE modes, enabling near real-time analytics on large-scale datasets without data movement.
- The example use case streams transaction data through Kinesis Data Streams and Data Firehose into an S3 table bucket for near real-time analysis of fraud trends, approval rates, and regional patterns.
Why It Matters
Streamlines modern data lake architectures, reducing latency and operational complexity for AI-driven analytics at scale.