Skip to main content
  1. CloudCertPro - Learn the Architecture Behind the Certification
  2. >
  3. Azure Cloud Knowledge Hub - CloudCertPro
  4. >
  5. Azure Domains Learning Hub: Master Azure by Capability Domains
  6. >
  7. Azure Database Domain

Azure Database Domain

Azure Databases provide managed, scalable, and secure data persistence for transactional workloads, analytical systems, and AI-powered applications. This domain page explains databases as a core architectural layer—not as a catalog of products—focusing on the models, design decisions, consistency trade-offs, and patterns that shape cloud-native data systems. It connects databases to enterprise architecture and modern AI workloads, and is structured for reuse across multiple Azure certifications.


1. Overview
#

What Are Databases in Cloud Architecture
#

In cloud architecture, databases are managed services that store, organize, and retrieve structured and semi-structured data. They provide transactional guarantees, query languages, indexing, and high availability mechanisms. Unlike raw storage (which manages bytes and blobs), databases add a logical data model—tables, documents, graphs, key-value pairs—and enforce constraints, relationships, and access patterns through a query engine.

Databases as Structured Data Management Systems
#

Databases transform raw storage into a reliable system of record. They deliver:

  • Data integrity through ACID transactions (relational) or tunable consistency (NoSQL).
  • Declarative access via SQL or specialized APIs, enabling complex queries, joins, and aggregations.
  • Concurrency control for simultaneous reads and writes across distributed clients.
  • Built-in high availability through replication, failover, and automated backups.

Every cloud application that manages orders, inventory, user profiles, session state, or analytical reports depends on a database.

Why Database Selection Is a Critical Architecture Decision
#

The choice of database directly impacts:

  • Scalability: how easily the system handles growth in data volume and request rate.
  • Consistency: whether every read returns the latest write, or if the system trades off strict correctness for performance.
  • Query capability: what questions can be asked efficiently without application-level computation.
  • Cost: pricing models vary widely—provisioned throughput, serverless, vCore-based, or storage-based.
  • Operational complexity: managed services differ in patching, replication configuration, and scaling options.

A mismatch between data model and access pattern leads to performance bottlenecks, high costs, and brittle application code. Database architecture is therefore a first-order design decision.


2. Core Database Types in Azure
#

Azure provides a spectrum of managed database services, each optimized for a specific class of workload.

Relational Databases
#

Relational databases store data in tables with predefined schemas, enforce referential integrity, and support powerful SQL queries with joins and transactions.

  • Azure SQL Database: a fully managed, intelligent relational database service based on the latest stable SQL Server engine. It offers single databases, elastic pools for multi-tenant cost management, and a serverless tier that auto-scales and pauses during inactivity. Key characteristics:

    • ACID transactions, strong consistency.
    • Built-in intelligence (automatic tuning, indexing).
    • Scalable up to 100 TB with Hyperscale tier.
    • Suitable for modern cloud-born applications requiring SQL with minimal management.
  • Azure SQL Managed Instance: a fully managed instance with near-100% SQL Server compatibility, including SQL Agent, cross-database queries, and CLR. It is the target for migrating legacy SQL Server workloads that need instance-level features not available in Azure SQL Database.

  • SQL Server on Azure Virtual Machines: for scenarios requiring full OS and SQL Server control, such as custom configurations, third-party extensions, or regulatory constraints. The highest operational overhead but maximum flexibility.

NoSQL Databases
#

Azure Cosmos DB is a globally distributed, multi-model NoSQL database. It supports multiple data models with the same backend:

  • Document model (core SQL API): store and query JSON documents, with automatic indexing of all properties.
  • Key-value (Table API): schema-less key-value store compatible with Azure Table Storage.
  • Graph (Gremlin API): for highly connected data.
  • Column-family (Cassandra API): for wide-column workloads.
  • MongoDB API: compatibility with MongoDB wire protocol.

Key architectural features:

  • Global distribution: turnkey replication to multiple Azure regions with multi-master or single-master configuration.
  • Tunable consistency: five well-defined consistency levels from strong to eventual, enabling trade-offs between performance, latency, and correctness.
  • Elastic scale: automatically partitions data using a chosen partition key; scale throughput or storage independently.
  • SLA-backed latency: <10 ms read and <15 ms write at the 99th percentile for same-region operations.

Cosmos DB is ideal for applications requiring global presence, flexible schemas, and massive scale, such as IoT, gaming, personalization, and real-time AI data.

Open-Source Databases
#

Azure provides fully managed versions of popular open-source databases:

  • Azure Database for PostgreSQL: supports the Postgres engine with options for single server, flexible server (recommended), and Hyperscale (Citus) for sharded multi-tenant analytics. Ideal for applications that depend on PostGIS, JSONB, or open-source extensions.
  • Azure Database for MySQL: flexible server architecture with built-in high availability and maintenance windows. Common for web and LAMP-stack applications.
  • Azure Database for MariaDB (retired/transitioning, but flexible server for MySQL covers most use cases).

These services offer automated patching, backups, and scaling while retaining full community compatibility.

In-Memory Caching
#

Azure Cache for Redis provides a high-throughput, low-latency data store for caching, session state, message brokering, and real-time analytics. It offers:

  • Tiers: Basic (single node), Standard (replicated), Premium (persistence, clustering, VNet), Enterprise/Enterprise Flash (Redis Enterprise with active-active geo-replication, module support, and enhanced performance).
  • Common patterns: cache-aside, session store, pub/sub, leaderboard, rate limiter.
  • Architectural role: reduces latency for hot data, offloads transactional databases, and enables real-time features.

Redis complements both relational and NoSQL databases by absorbing read traffic and accelerating response times.

Specialized Databases
#

  • Azure Elastic SAN (Storage Area Network) is technically a storage service, but can provide shared block storage for databases running on VMs (e.g., SQL Server FCI). Not a database service itself.
  • Azure Synapse Analytics: a parallel, massively parallel processing (MPP) data warehouse for OLAP workloads (covered in analytics contexts).
  • Azure Data Explorer: a fast, fully managed data analytics service for real-time analysis of large volumes of telemetry and time-series data, with Kusto Query Language.

OLTP vs OLAP vs Cache Classification
#

Understanding workload type is fundamental to database selection:

  • OLTP (Online Transaction Processing): high volumes of small, fast, concurrent transactions; require ACID guarantees. Services: Azure SQL Database, SQL Managed Instance, PostgreSQL, MySQL, Cosmos DB (for NoSQL OLTP).
  • OLAP (Online Analytical Processing): complex queries over large datasets, often with aggregations and joins. Services: Azure Synapse Analytics, Databricks SQL, Cosmos DB (with analytical store), PostgreSQL Hyperscale (Citus) for multi-tenant analytics.
  • Caching layer: sub-millisecond reads of frequently accessed data. Service: Azure Cache for Redis.

Relational vs Document vs Key-Value Models
#

  • Relational: tables with fixed schemas, joins, referential integrity. Best for structured data with well-defined relationships (e.g., financial records, ERP systems).
  • Document: JSON documents with flexible schema, nested objects and arrays. Best for content management, product catalogs, user profiles where schema evolves rapidly.
  • Key-value: simple lookup by key. Best for session state, shopping carts, configuration, and caching.

Azure Cosmos DB spans document and key-value (plus others), while relational services cover the relational model.


3. Database Architecture Concepts
#

Consistency Models
#

  • Strong consistency: after a write completes, all subsequent reads return the latest value. In Azure SQL Database and PostgreSQL, strong consistency is the default (synchronous replication to a secondary replica via Always On or similar). Cosmos DB offers strong consistency with a performance penalty and regional availability constraints.
  • Eventual consistency: reads may return stale data, but the system converges over time. Cosmos DB offers eventual, consistent prefix, session, and bounded staleness levels. This allows higher throughput and lower latency in globally distributed scenarios.
  • Session consistency: reads within a session are monotonic—you always see your own writes. A popular middle ground.

Architects must map application requirements to consistency level: financial ledgers demand strong consistency; a social media feed can tolerate eventual.

Partitioning and Sharding Strategies
#

  • Vertical partitioning: splitting columns into different tables. Rarely used as primary scaling.
  • Horizontal partitioning (sharding): splitting rows across multiple database instances based on a partition key. In Cosmos DB, a logical partition is defined by a partition key, and data is automatically distributed across physical partitions. In Azure SQL, sharding can be implemented at the application layer using the Elastic Database client library or with Hyperscale’s built-in sharding.
  • Partition key selection: critical for performance. It should distribute request volume and storage uniformly. In Cosmos DB, a poorly chosen partition key leads to hot partitions and throttling.

Replication and High Availability
#

  • Synchronous replication: a transaction commit waits for the data to be written to a replica before acknowledging success. Provides zero data loss but adds latency. Used in Azure SQL Database (Always On) and Cosmos DB (strong consistency with multi-master).
  • Asynchronous replication: the primary commits immediately, and replicas catch up asynchronously. Lower latency, potential for data loss on failover. Cosmos DB uses asynchronous replication across regions (except for strong consistency across regions, which has performance trade-offs).
  • Auto-failover: in case of primary failure, the platform promotes a replica. All managed databases provide automatic failover groups or region failover options.

Read Replicas and Scaling Patterns
#

  • Read scale-out: offload read-only queries to secondary replicas. Azure SQL Database and PostgreSQL support readable replicas (active geo-replication for Azure SQL, read replicas for PostgreSQL). Cosmos DB distributes reads across replicas in a region automatically.
  • Read/write split: application directs writes to the primary and reads to replicas. This requires the application to handle replication lag.

Backup and Disaster Recovery Strategies
#

  • Automated backups: all managed databases provide automatic backups with point-in-time restore (PITR). Retention periods vary (7–35 days for Azure SQL, up to 35 days for Cosmos DB continuous backup).
  • Geo-restore: restore a database from geo-replicated backups to a different region.
  • Long-term retention (LTR): for compliance, Azure SQL supports up to 10 years of backup retention.
  • Failover groups: Azure SQL Database and Managed Instance can be configured in failover groups for automated regional failover with read/write split. Cosmos DB offers multi-region failover policies.
  • Backup protection: Azure Backup can protect database backups at the resource level; soft delete and purge protection prevent accidental or malicious deletion.

4. Database Design Decisions
#

SQL vs NoSQL Selection Criteria
#

Factor Relational (Azure SQL, PostgreSQL) NoSQL (Cosmos DB)
Schema Fixed, enforced by the database Flexible, schema-on-read
Relationships Joins, foreign keys Denormalized, embedded documents
Transactions Full ACID across multiple rows/tables ACID within a logical partition (Cosmos); multi-document transactions are scoped
Scalability Vertical + limited horizontal (Hyperscale) Horizontal, elastic scale-out by design
Query language SQL, rich aggregations, window functions SQL-like (Cosmos SQL API), MongoDB, etc.
Global distribution Active geo-replication, failover groups Turnkey multi-master, tunable consistency

Guidance: Use relational databases for well-defined schemas, complex joins, and strict transactional integrity. Use NoSQL for evolving schemas, global distribution, and extreme scale where data can be partitioned easily by a key.

When to Use Cosmos DB vs Azure SQL
#

  • Cosmos DB: you need global low-latency (multiple regions), flexible schema, massive throughput (millions of requests per second), or you are building event sourcing, IoT telemetry, real-time personalization, or a globally distributed web/mobile app. The data model fits documents or key-value.
  • Azure SQL Database: you need relational integrity, ad-hoc SQL querying with complex joins, stored procedures, or you are migrating an existing SQL Server application. The workload is primarily within a single region (with geo-replication as DR). Hyperscale tier provides large capacity and scalability without sharding.

In practice, a single solution may use both: Cosmos DB for user profile and activity data, Azure SQL for billing and transactions.

Scaling Up vs Scaling Out
#

  • Scale up (vertical): increase instance size (vCores, memory). Simpler, but has a physical ceiling and often requires downtime for changes (though some platforms offer minimal interruption). Azure SQL Database and PostgreSQL support vertical scaling.
  • Scale out (horizontal): add more nodes (read replicas, shards). Better for massive workloads but requires application awareness (partitioning, routing). Cosmos DB scales horizontally by adding more throughput (RU/s) and storage; Azure SQL Hyperscale scales out storage and read replicas; PostgreSQL Hyperscale (Citus) distributes across nodes.

Choose scale out when data size or request volume exceeds the maximum capacity of a single node, or when you require global distribution.

Latency vs Consistency Trade-offs
#

  • Strong consistency ensures correctness but adds latency due to synchronous replication. In Cosmos DB, strong consistency across multiple regions is only possible if regions are within a limited distance (or uses single-master with strong).
  • Eventual or session consistency reduces latency by serving reads from local replicas that may be slightly behind. Acceptable for many user-facing features.

In financial systems, choose strong consistency within a region. For a global social app, session consistency is often sufficient.

Transactional vs Analytical Workload Separation
#

Mixing OLTP and OLAP on the same database can cause resource contention. Best practice:

  • Use a dedicated transactional database (Azure SQL, PostgreSQL) for low-latency writes and reads.
  • Use a separate analytical store: replicate data to Azure Synapse, Cosmos DB analytical store (for near-real-time analytics), or use PostgreSQL read replicas for reporting.
  • For Cosmos DB, the analytical store enables large-scale analytical queries without affecting transactional performance.

Caching Strategies Using Redis
#

  • Cache-aside: application checks Redis first; if miss, fetches from database and populates cache. Common for product details, user profiles.
  • Read-through/write-through: cache sits in front, transparently loading and persisting data. Simpler application code but requires careful invalidation.
  • Session store: store temporary user session data in Redis instead of relational databases, reducing database load.
  • Pub/sub: Redis supports publish/subscribe for real-time notifications, used in agent systems and event-driven communication.

Cache invalidation and TTL design are critical; stale cache data can cause inconsistency.


5. Databases in Enterprise Architecture
#

Multi-Tier Applications
#

In classic three-tier architecture:

  • Web tier stores session state in Redis or Cosmos DB (fast key-value).
  • Business tier uses Azure SQL for transactions and Cosmos DB for user activity data.
  • Data tier (reporting) uses read replicas or a separate analytical database.

Databases are deployed within the private VNet, accessed via private endpoints, and only the API tier has credentials (managed identity).

Microservices Data Ownership Patterns
#

Each microservice owns its data and has its own database (Database per Service). This prevents coupling through a shared schema. Implementation:

  • Service A uses Azure SQL for transactional orders.
  • Service B uses Cosmos DB for product catalog.
  • Service C uses Redis for real-time inventory cache.
  • Communication between services is via APIs or events, not direct database access.

This pattern increases architectural freedom but requires careful data synchronization and eventual consistency across services.

Event-Driven Systems with Data Persistence
#

In event-driven architectures, databases store both the current state and the event log. For example:

  • Cosmos DB’s change feed emits events when documents change, which can trigger Functions or stream to other services.
  • The event log (Service Bus, Event Hubs) is the authoritative record; the database is a projection optimized for querying.

This CQRS (Command Query Responsibility Segregation) pattern allows independent scaling of read and write sides.

Hybrid Enterprise Data Integration
#

Many enterprises run databases on-premises. Azure databases can connect to on-premises via VPN or ExpressRoute, enabling:

  • Replication: move data from on-premises to cloud databases for reporting or migration.
  • Federation: centralize identity (Entra ID) for authentication, using Azure SQL with Windows Integrated Authentication or Cosmos DB with Azure AD.
  • Data synchronization: Azure Data Factory, Change Data Capture (CDC), or custom pipelines sync data between on-premises and cloud.

6. Databases for AI & LLM Systems
#

Modern AI systems rely on databases for state, memory, metadata, and feedback storage.

Structured Memory for LLM Systems
#

LLMs are stateless; they need external memory to maintain context beyond the context window. Databases serve as:

  • Short-term memory: conversation history stored in Redis for fast retrieval during the session.
  • Long-term memory: Cosmos DB stores user preferences, facts, and past interactions. When an AI agent needs to recall a user’s preferences, it queries a long-term memory database.

Memory can be plain text or embedded (vectorized) for semantic search. The database holds both the raw text and the embedding reference.

RAG Structured Retrieval Layer
#

While vector databases (Azure AI Search) store embeddings for similarity search, structured metadata about the documents is often stored in a relational or NoSQL database. For example:

  • A Cosmos DB container holds document metadata (title, author, date, permissions).
  • When a user query is processed, the RAG pipeline first retrieves document IDs from the vector store, then fetches metadata and access control from Cosmos DB, ensuring the user is authorized.

Vector-Adjacent Metadata Storage
#

Not all data fits a vector store. Azure SQL or Cosmos DB can store:

  • Product inventories with category hierarchies (relational).
  • User access policies for retrieved chunks.
  • Structured attributes that can be used for filtering search results (e.g., “region = ‘EU’”).

Databases thus act as the “filter engine” that narrows down vector search results before the LLM sees them.

AI Agent State Persistence
#

Agents maintain state across tool calls and reasoning loops. This state includes:

  • The current plan or task tree.
  • Intermediate results from tool calls.
  • The agent’s “working memory” (list of completed steps).

A database (Cosmos DB or Azure SQL) stores this state durably, so agent execution can resume after a crash or scale-out. Durable Functions also use Azure Storage, but custom agents may prefer a database for queryability.

Feedback Loops and Conversation History Storage
#

To improve AI models and user experience, conversation history and user feedback are stored in databases for:

  • Analytics: Azure Synapse or Databricks queries the conversation store to identify common failure modes.
  • Fine-tuning: conversations are exported to Data Lake, preprocessed, and used to fine-tune a model.
  • Compliance: immutable logs (Azure SQL with temporal tables or Cosmos DB with TTL policies) ensure audit readiness.

Databases provide the durability and indexing necessary for querying historical AI interactions.


7. Security & Governance
#

Encryption at Rest and in Transit
#

  • At rest: all Azure databases automatically encrypt data using storage encryption with Microsoft-managed keys. Many support customer-managed keys (CMK) via Azure Key Vault for additional control.
  • In transit: TLS 1.2 or higher is enforced. Some services (Azure SQL) support double encryption with Always Encrypted, where data is encrypted by the client and the database never sees plaintext.

Identity-Based Access Control
#

All managed databases support Microsoft Entra ID authentication:

  • Azure SQL Database: supports Entra ID users and groups; can be used for multi-factor authentication and conditional access.
  • Cosmos DB: RBAC roles for data plane (Cosmos DB Built-in Data Contributor, Reader) assigned to managed identities or users.
  • PostgreSQL/MySQL: support Entra ID authentication natively in Flexible Server.

Principle of least privilege: applications use managed identity; human admins use PIM with just-in-time access.

Network Isolation (Private Link) #

All database services can be accessed via Azure Private Link, which assigns a private IP within your VNet. This removes public exposure entirely and allows on-premises access via VPN/ExpressRoute. Service Endpoints are an alternative but still use the service’s public FQDN.

Data Classification and Compliance
#

  • Azure SQL Database includes data discovery and classification for GDPR, HIPAA, and PCI. It can recommend sensitivity labels and audit access to classified data.
  • Cosmos DB provides logging of data plane operations for compliance.
  • Microsoft Purview integration enables cataloging, lineage, and classification across database assets.

Backup Protection and Data Loss Prevention
#

  • Soft delete and purge protection prevent accidental database or account deletion.
  • Immutable backups with customer-managed keys ensure that even administrators cannot tamper with backups.
  • Long-term retention for regulatory compliance.

8. Performance, Reliability & Scalability
#

Horizontal vs Vertical Scaling
#

  • Azure SQL Database: vertical scaling (DTU or vCore) can be done manually or with auto-scale (serverless). Hyperscale tier provides horizontal storage scaling (up to 100 TB) and named read replicas for scale-out reads.
  • Cosmos DB: horizontal scaling is fundamental; throughput (RU/s) and storage scale independently. Partitioning is automatic.
  • PostgreSQL Flexible Server: vertical scaling with read replicas for read scaling.
  • Redis: cluster mode for horizontal scaling.

Geo-Replication Strategies
#

  • Azure SQL Database: active geo-replication (readable secondaries) and failover groups for automated regional failover.
  • Cosmos DB: multi-master or single-master with automatic failover; choose regions and failover priority.
  • PostgreSQL/MySQL: read replicas in other regions, but failover may be manual or managed by Azure.

Failover Architectures
#

Define Recovery Time Objective (RTO) and Recovery Point Objective (RPO):

  • Automatic failover groups (Azure SQL) provide low RTO (<1 min) with RPO = 0 (synchronous replication in premium/business critical tier) or minimal RPO (asynchronous geo-replication).
  • Cosmos DB multi-region failover is automatic (if configured) and has RPO ~0 for strong consistency (single region loss) or bounded staleness for inter-region.

Query Optimization Strategies
#

  • Indexing: Azure SQL Database automatically recommends and tunes indexes; for custom workloads, manually define covering indexes and include columns.
  • Cosmos DB: automatic indexing of all properties by default; you can customize indexing policy for cost optimization.
  • Partition key selection (Cosmos DB): ensure even distribution of RU consumption and storage.
  • Avoid cross-partition queries: use the partition key in WHERE clauses to minimize RU cost.

Indexing Strategies and Performance Tuning
#

  • Azure SQL: use columnstore indexes for analytical queries; avoid index fragmentation.
  • Cosmos DB: use composite indexes for multi-property queries, and tune index precision based on query patterns.
  • Redis: choose appropriate eviction policies (volatile-lru, allkeys-lru) to manage memory.

9. Certification Mapping
#

Certification Database Domain Relevance
AZ-104 Deploy and manage Azure SQL Database, Cosmos DB basics, configure connectivity and security, perform backups.
AZ-305 Design database solutions: choose database type, design for scale and high availability, implement disaster recovery, integrate hybrid data.
AI-900 Understand data concepts for AI: structured vs unstructured, databases for storing AI model inputs and outputs.
AI-103 Implement data persistence for AI apps: store conversation history, manage agent state, integrate structured data with RAG.
AI-300 Architect data pipelines for ML: use databases for metadata storage, feature stores, model registry, and feedback loops.
GH-600 Design agent memory and persistence: structured long-term memory, session state, conversation logs, and tool result storage.

10. Real-World Architecture Example
#

Scenario: A global e-commerce platform with microservices, AI-powered recommendations, and a customer service agent.

Database Components:

  • Transactional database: Azure SQL Database (Business Critical tier, zone-redundant) stores orders, payments, and customer accounts. Failover groups replicate synchronously to a secondary region. The API microservice uses a managed identity to connect via private endpoint.
  • Product catalog: Cosmos DB (SQL API) stores product details, reviews, and inventory per region. Multi-master configuration with US and Europe regions provides low-latency reads/writes. Partition key is productId; consistency level set to session for performance. The catalog service uses Cosmos DB change feed to notify the inventory system of updates.
  • Caching layer: Azure Cache for Redis (Enterprise tier) is used for:
    • Shopping cart sessions (key-value store with TTL).
    • Cached product data to reduce Cosmos DB RU consumption.
    • Real-time inventory snapshots to speed up product availability checks.
  • Analytics: every evening, change data capture (CDC) from Azure SQL and Cosmos DB is ingested into Azure Synapse Analytics via Azure Data Factory. Analysts run queries over sales data, and a data pipeline prepares training features for the recommendation model.
  • AI recommendation engine: The trained model (stored in Azure ML) serves through an inference endpoint. The model metadata and A/B experiment configurations are stored in Azure SQL Database (a separate “ML metadata” database).
  • Customer service agent (AI agent):
    • The agent’s long-term memory is stored in Cosmos DB (a dedicated container for user profiles and interaction history).
    • The agent’s conversation state (current plan, tool call history) is persisted in Azure SQL Database (a “state” database) using a simple key-value table. This allows the agent to resume a session if the Container App restarts.
    • When the agent retrieves knowledge (RAG), it queries the vector index in Azure AI Search, but fetches document metadata and access permissions from Cosmos DB to enforce user authorization.
    • User feedback (“thumbs up/down”) is written to a Cosmos DB collection, which triggers a Function to log to the data lake for future model improvement.
  • Security: All databases are accessed via private endpoints; public access is disabled. Applications authenticate with managed identities. Azure SQL Database uses customer-managed keys for encryption. Cosmos DB IP firewall is set to allow only the VNet.

This design uses relational databases for strong transactional consistency, Cosmos DB for global scale and flexible schemas, Redis for performance, and databases as the backbone of AI state and metadata. It illustrates how database selection is driven by workload characteristics, not by a single “best” choice.