> Azure Blob Storage object storage for engine managed table data, with Microsoft Entra Workload ID and intermediary service principals for external access.

# Azure Blob Storage

This page describes how to configure Azure Blob Storage as object storage for engines.

Every engine needs object storage for managed table data. The chart does not support local-filesystem storage for engines, so an engine pod does not become Ready until `customEngineConfig.storage` points at object storage.

With Azure Blob Storage as the backing store, durability does not depend on the per-pod data volumes mounted to each engine. Even a complete loss of those volumes does not cause data loss, because the authoritative copy of managed table data lives in object storage.

You configure object storage on the engine through `customEngineConfig.storage`, which the chart passes through unchanged into the engine's `config.yaml`. The `type`, `api_scheme`, and `bucket_name` keys match the Firebolt Core configuration schema, and the chart does not validate them. The engine reads Azure credentials from the pod's Azure identity, which you provide with [Microsoft Entra Workload ID](https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview).

<Note>
  The `abs` backend requires an engine image that supports it. Because the chart does not validate `type`, an unsupported value is written verbatim into the engine's `config.yaml`, and the engine fails at startup rather than at install time.
</Note>

## Prerequisites

Before you begin, make sure the following are in place:

* A Kubernetes cluster running on Azure Kubernetes Service with workload identity and the OIDC issuer enabled.
* `kubectl` configured to access your cluster.
* `helm` v3 installed on your local machine.
* `az` configured for your subscription.
* An Azure subscription with permissions to create storage accounts, containers, and managed identities.
* An engine image that supports the `abs` storage backend.

## Use Azure Blob Storage

The following examples use a storage account named `fireboltenginedemo` with a container named `firebolt-managed`, but you can choose other names. Storage account names must be globally unique, 3 to 24 characters long, and contain only lowercase letters and numbers.
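
A name that breaks these rules is only rejected when `az storage account create` runs, so a quick local check can save a round trip. The following is a minimal sketch, assuming Azure's documented limit of 3 to 24 characters alongside the lowercase-alphanumeric rule. The function name `valid_storage_account_name` is hypothetical, and the check cannot tell whether a name is already taken.

```bash
# Hypothetical helper: returns 0 when the name satisfies the local naming rules.
# Assumes Azure's 3-24 character limit; global uniqueness is not checked here.
valid_storage_account_name() {
  [[ "$1" =~ ^[a-z0-9]{3,24}$ ]]
}

if valid_storage_account_name "fireboltenginedemo"; then
  echo "name looks valid"
else
  echo "name breaks the naming rules" >&2
fi
```

To check availability as well, `az storage account check-name --name <name>` reports whether the name is still free.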

### Create a storage account and container

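Create a storage account and container for engine object storage. The commands assume that the resource group already exists; later steps look up the AKS cluster in the same group. The `--auth-mode login` flag authenticates with your signed-in Azure identity, which needs a data-plane role such as Storage Blob Data Contributor at a scope that covers the new account: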

```bash
# Resource group, storage account, container, and location names used below.
export RESOURCE_GROUP=firebolt-demo
export LOCATION=eastus
export STORAGE_ACCOUNT=fireboltenginedemo
export CONTAINER_NAME=firebolt-managed

# Create the storage account.
az storage account create \
  --name "${STORAGE_ACCOUNT}" \
  --resource-group "${RESOURCE_GROUP}" \
  --location "${LOCATION}" \
  --sku Standard_LRS \
  --kind StorageV2 \
  --allow-blob-public-access false

# Create the container.
az storage container create \
  --name "${CONTAINER_NAME}" \
  --account-name "${STORAGE_ACCOUNT}" \
  --auth-mode login
```

### Grant the engine an Azure identity

Create a user-assigned managed identity, grant it blob access on the storage account, and federate it with the engine's Kubernetes ServiceAccount:

```bash
# Identity and cluster names used below.
export IDENTITY_NAME=firebolt-engine
export AKS_CLUSTER=my-aks-cluster
export K8S_NAMESPACE=firebolt
export K8S_SA=firebolt-engine

# Create the user-assigned managed identity for the engine.
az identity create \
  --name "${IDENTITY_NAME}" \
  --resource-group "${RESOURCE_GROUP}"

# Read the identity's client ID and the storage account's resource ID.
export IDENTITY_CLIENT_ID=$(az identity show \
  --name "${IDENTITY_NAME}" --resource-group "${RESOURCE_GROUP}" \
  --query clientId -o tsv)
export STORAGE_ACCOUNT_ID=$(az storage account show \
  --name "${STORAGE_ACCOUNT}" --resource-group "${RESOURCE_GROUP}" \
  --query id -o tsv)

# Grant the identity blob read and write access on the storage account.
az role assignment create \
  --assignee "${IDENTITY_CLIENT_ID}" \
  --role "Storage Blob Data Contributor" \
  --scope "${STORAGE_ACCOUNT_ID}"

# Read the AKS OIDC issuer.
export OIDC_ISSUER=$(az aks show \
  --name "${AKS_CLUSTER}" --resource-group "${RESOURCE_GROUP}" \
  --query oidcIssuerProfile.issuerUrl -o tsv)

# Federate the managed identity with the Kubernetes ServiceAccount.
az identity federated-credential create \
  --name firebolt-engine \
  --identity-name "${IDENTITY_NAME}" \
  --resource-group "${RESOURCE_GROUP}" \
  --issuer "${OIDC_ISSUER}" \
  --subject "system:serviceaccount:${K8S_NAMESPACE}:${K8S_SA}" \
  --audience api://AzureADTokenExchange
```

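Annotate the Kubernetes ServiceAccount with the managed identity's client ID (the value of `IDENTITY_CLIENT_ID` above) so that AKS injects credentials into engine pods that carry the Workload ID label. Save the manifest as `engine-serviceaccount.yaml`; you apply it when you install the chart: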

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: firebolt-engine
  namespace: firebolt
  annotations:
    azure.workload.identity/client-id: <managed-identity-client-id>
```
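
Workload ID only exchanges a token when the federated credential's subject matches the pod's ServiceAccount exactly, in the form `system:serviceaccount:<namespace>:<name>`. As a sanity check, the sketch below rebuilds that string from a namespace and ServiceAccount name and compares it with a subject value. The function names are hypothetical, and the sample values are the ones used on this page.

```bash
# Hypothetical helper: build the subject string that the federated
# credential must carry for a given namespace and ServiceAccount.
federated_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Hypothetical helper: succeed only when the registered subject matches.
subject_matches() {
  local registered="$1" namespace="$2" service_account="$3"
  [[ "${registered}" == "$(federated_subject "${namespace}" "${service_account}")" ]]
}

federated_subject firebolt firebolt-engine
```

To compare against what Azure holds, pass in the output of `az identity federated-credential show --name firebolt-engine --identity-name "${IDENTITY_NAME}" --resource-group "${RESOURCE_GROUP}" --query subject -o tsv` as the first argument to `subject_matches`.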

### Point the chart at the container

Run the engine pods under the annotated ServiceAccount, label them so Workload ID injects credentials, and set the storage block to the container. The default scheme for `abs` is `azure://`, `bucket_name` is the container name, and `azure.storage_account_name` is the storage account that holds it.

```yaml
# my-values.yaml
engineSpec:
  serviceAccount: firebolt-engine

engines:
  - name: default
    # Required for Microsoft Entra Workload ID to inject credentials.
    podLabels:
      azure.workload.identity/use: "true"

customEngineConfig:
  storage:
    type: abs
    api_scheme: "azure://"
    bucket_name: firebolt-managed
    azure:
      storage_account_name: fireboltenginedemo
```

Create the namespace and the ServiceAccount, then install the chart with the matching values:

```bash
# Create the release namespace first; the ServiceAccount manifest targets it.
kubectl create namespace firebolt --dry-run=client -o yaml | kubectl apply -f -

# Create the Workload-ID-annotated ServiceAccount in the release namespace.
kubectl apply -f engine-serviceaccount.yaml

# Install the chart against the container and the ServiceAccount.
helm install firebolt ./helm \
  --namespace firebolt --create-namespace \
  -f my-values.yaml
```
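
Once the pods are running, it can help to confirm that Workload ID injected credentials before you test storage. The Workload ID webhook adds the environment variables `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, and `AZURE_FEDERATED_TOKEN_FILE` to labeled pods. The sketch below reads variable names from standard input and prints any of those three that are missing. The function name is hypothetical.

```bash
# Hypothetical helper: read environment variable names (one per line) on
# stdin and print each expected Workload ID variable that is absent.
missing_workload_identity_vars() {
  local names var
  names="$(cat)"
  for var in AZURE_CLIENT_ID AZURE_TENANT_ID AZURE_FEDERATED_TOKEN_FILE; do
    grep -qx "${var}" <<<"${names}" || echo "${var}"
  done
}

# Example input from a pod that lacks the token file variable.
printf '%s\n' AZURE_CLIENT_ID AZURE_TENANT_ID | missing_workload_identity_vars
```

Against a live pod, feed it the container's variable names, for example `kubectl -n firebolt get pod <engine-pod> -o jsonpath='{range .spec.containers[*].env[*]}{.name}{"\n"}{end}' | missing_workload_identity_vars`, where `<engine-pod>` is a placeholder for one of your engine pod names. Empty output means all three variables are present.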

### Confirm that object storage works

Create a table, insert a row, and list the container to confirm the engine wrote data through to Azure Blob Storage:

```bash
# Forward the gateway Service to localhost:8080 in the background.
kubectl -n firebolt port-forward svc/firebolt-gateway 8080:80 &
PF_PID=$!

# Give the port-forward a moment to start listening.
sleep 2

# Create a table on the engine.
curl -s http://localhost:8080/ -H "X-Firebolt-Engine: default" \
  -H "Content-Type: text/plain" --data "create table t (val int)"

# Insert one row, which forces the engine to write a tablet.
curl -s http://localhost:8080/ -H "X-Firebolt-Engine: default" \
  -H "Content-Type: text/plain" --data "insert into t values (1)"

# Stop the port-forward.
kill "${PF_PID}"

# List the blobs in the container.
az storage blob list \
  --container-name firebolt-managed \
  --account-name fireboltenginedemo \
  --auth-mode login \
  --output table
```

New blobs in the listing confirm that the engine authenticated with its Workload ID identity and wrote managed table data to Azure Blob Storage.

## Restrict external access with an intermediary service principal

The container you set under `customEngineConfig.storage` holds the engine's managed tablet data, and the engine reaches it with the engine pod's own Azure identity. Queries that read from or write to external locations, such as external tables that point at a different container, follow a separate credential path.

By default, external access also uses the engine pod's own Azure identity. That identity belongs to this chart release, so it is not a convenient identity for the owner of an external container to reference when they grant access.

An intermediary service principal gives external access a stable identity instead. When you set one, the engine uses the intermediary service principal for external access rather than its own pod identity. Because the service principal is stable and known ahead of time, you can share it with third parties and reference it in container role assignments, including on Azure subscriptions outside your own organization. Access to the object storage container always uses the engine pod's own identity, so the intermediary service principal applies only to external locations.

Create the intermediary service principal and grant it the permissions it needs to reach the external data.
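
One way to do this with the Azure CLI is to register an application, create its service principal, and assign a role on the external storage account. The following is a sketch under assumptions: the `Storage Blob Data Reader` role and the function names are illustrative choices. It does not cover any credential setup that the engine needs in order to act as this principal.

```bash
# Hypothetical helper: register an application plus its service principal
# and print the application (client) ID.
create_intermediary_sp() {
  local display_name="$1" app_id
  app_id="$(az ad app create --display-name "${display_name}" --query appId -o tsv)"
  az ad sp create --id "${app_id}" >/dev/null
  echo "${app_id}"
}

# Hypothetical helper: grant read access on an external storage account.
# Pick a broader role if the engine also writes to the external location.
grant_external_read() {
  local app_id="$1" scope="$2"
  az role assignment create \
    --assignee "${app_id}" \
    --role "Storage Blob Data Reader" \
    --scope "${scope}"
}
```

For example, `APP_ID=$(create_intermediary_sp firebolt-external-access)` followed by `grant_external_read "${APP_ID}" "<external-storage-account-resource-id>"`, where both the display name and the resource ID are placeholders. The role assignment has to be made by someone with rights on the external storage account.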

Set its application client ID under `customEngineConfig.storage.azure.intermediary_service_principal_client_id`:

```yaml
customEngineConfig:
  storage:
    type: abs
    api_scheme: "azure://"
    bucket_name: firebolt-managed
    azure:
      storage_account_name: fireboltenginedemo
      intermediary_service_principal_client_id: 35f11db5-082b-46e8-9f2f-5466d8630003
```

The chart passes the `storage.azure` block through unchanged. The block is valid when `type` is `abs` or `azurite`.
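
Because the chart does not enforce this rule, a small pre-install check can catch a mismatched `type` before the values reach the engine. The sketch below is one way to express the rule; the function name `azure_block_allowed` is hypothetical, and you supply the `type` value from your own values file.

```bash
# Hypothetical helper: succeed only for storage types that accept the
# storage.azure block, per the rule stated above.
azure_block_allowed() {
  case "$1" in
    abs|azurite) return 0 ;;
    *) return 1 ;;
  esac
}

storage_type="abs"  # Value of customEngineConfig.storage.type in your values file.
if azure_block_allowed "${storage_type}"; then
  echo "storage.azure is valid for type '${storage_type}'"
else
  echo "storage.azure is not valid for type '${storage_type}'" >&2
fi
```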

## Storage scope

`customEngineConfig` is global to the release. All engines in the same `engines:` list share one `customEngineConfig.storage` block, and therefore one container. To run engines against different containers, install the chart as separate releases, each with its own `customEngineConfig.storage`.
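
A minimal sketch of that layout follows, assuming one values file per release. The release names, the values file names, and the choice to give each release its own namespace are illustrative, and the function name `install_release` is hypothetical.

```bash
# Hypothetical helper: install one release per container. The release name
# doubles as the namespace, which is an illustrative choice.
install_release() {
  local release="$1" values_file="$2"
  helm install "${release}" ./helm \
    --namespace "${release}" --create-namespace \
    -f "${values_file}"
}
```

For example, `install_release firebolt-a values-a.yaml` and `install_release firebolt-b values-b.yaml`, where each values file sets a different `bucket_name`. If the releases live in different namespaces, each namespace needs its own annotated ServiceAccount and a federated credential whose subject names that namespace.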
