offdata ai — agentic AI for data modelers and data engineers

Azure Synapse

AI Azure Synapse schema generator

Describe your domain in plain English. Ship DDL for a Synapse dedicated SQL pool, with HASH distribution, clustered columnstore indexes, and partitioning, plus a complete dbt project.

ddl/synapse.sql (Synapse SQL)
-- Small dimension: replicated so joins to it need no data movement.
CREATE TABLE analytics.dim_customer (
  customer_sk   NVARCHAR(64)  NOT NULL,
  customer_id   NVARCHAR(64)  NOT NULL,
  name          NVARCHAR(256),
  plan_id       NVARCHAR(64),
  valid_from    DATE          NOT NULL,
  valid_to      DATE
)
WITH (
  DISTRIBUTION = REPLICATE,
  CLUSTERED COLUMNSTORE INDEX
);

-- Fact table: hash-distributed on the join key, partitioned monthly on event_ts.
CREATE TABLE analytics.fct_usage (
  usage_id      NVARCHAR(64)  NOT NULL,
  customer_sk   NVARCHAR(64)  NOT NULL,
  plan_id       NVARCHAR(64)  NOT NULL,
  event_ts      DATETIME2     NOT NULL,
  api_calls     BIGINT        NOT NULL
)
WITH (
  DISTRIBUTION = HASH(customer_sk),
  CLUSTERED COLUMNSTORE INDEX,
  PARTITION (event_ts RANGE RIGHT FOR VALUES (
    '2026-01-01', '2026-02-01', '2026-03-01', '2026-04-01'
  ))
);

OffDataAI is purpose-built for Azure Synapse Analytics. Our generator emits Synapse-native DDL for dedicated SQL pools with a distribution strategy matched to each table (HASH on the dominant join column for facts, REPLICATE for small dimensions), clustered columnstore indexes on analytical tables, and range partitioning on time-series facts. From the same conversation, you also get a complete dbt project configured for dbt-synapse.
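One way to sanity-check a HASH distribution choice after loading data is to look at how rows spread across the 60 distributions of a dedicated SQL pool. The following is a minimal sketch, not OffDataAI output, and it assumes the analytics.fct_usage table from the example above has been created and loaded:

```sql
-- Report row counts and space used per distribution for the fact table.
-- Roughly even row counts suggest customer_sk hashes evenly;
-- a few distributions holding most of the rows indicate skew.
DBCC PDW_SHOWSPACEUSED('analytics.fct_usage');
```

If the counts are badly skewed, a different join column or ROUND_ROBIN may be worth considering for that table.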

What OffDataAI generates

  • Synapse-native DDL

    DISTRIBUTION = HASH / REPLICATE / ROUND_ROBIN, CLUSTERED COLUMNSTORE INDEX, range partitioning, and DATETIME2 — never generic ANSI SQL.

  • Kimball, Data Vault 2.0, or 3NF

    All three paradigms are supported on Synapse. The generator emits distribution and index strategies suited to each.

  • Full dbt project

    dbt_project.yml configured for dbt-synapse, sources, staging, marts (with dist + index configs), schema tests, and seeds.

  • Range partitioning included

    Time-series facts get range partitions with monthly boundaries by default — configurable in the IR.
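Monthly boundaries eventually need extending as new months arrive. As an illustration only (this is a manual maintenance step, not something the source says OffDataAI generates), a new boundary could be added to the fct_usage example like this:

```sql
-- Add a May 2026 boundary to the monthly RANGE RIGHT scheme on event_ts.
-- Caveat: on a clustered columnstore table, Synapse requires the partition
-- being split to be empty, so boundaries are usually created ahead of time,
-- while the last partition still holds no rows.
ALTER TABLE analytics.fct_usage SPLIT RANGE ('2026-05-01');
```

With RANGE RIGHT, each boundary value belongs to the partition on its right, so rows stamped 2026-05-01 00:00:00 land in the new May partition.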

Frequently asked questions

Can OffDataAI generate Azure Synapse DDL from plain English?
Yes. Describe your business domain in plain English. OffDataAI's interview agent gathers grain, cardinality, and distribution hints, synthesizes a typed Intermediate Representation, and emits Synapse-native DDL for dedicated SQL pools with HASH distribution and clustered columnstore indexes.
Does OffDataAI choose distribution keys for Synapse?
Yes. The Synapse generator picks DISTRIBUTION = HASH(...) based on the dominant join column on fact tables, and DISTRIBUTION = REPLICATE for small dimensions. Large dimensions use HASH on the natural key.
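A sketch of what the large-dimension case could look like. The table analytics.dim_product and its columns are hypothetical names chosen for illustration, not actual generator output:

```sql
-- Hypothetical large dimension: too big to replicate cheaply, so it is
-- hash-distributed on its natural key (product_id is an assumed name).
CREATE TABLE analytics.dim_product (
  product_sk    NVARCHAR(64)  NOT NULL,
  product_id    NVARCHAR(64)  NOT NULL,
  name          NVARCHAR(256),
  valid_from    DATE          NOT NULL,
  valid_to      DATE
)
WITH (
  DISTRIBUTION = HASH(product_id),
  CLUSTERED COLUMNSTORE INDEX
);
```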
Does it support clustered columnstore indexes?
Yes. Fact tables default to clustered columnstore indexes (CCI), which is the right default for analytical workloads on Synapse dedicated SQL pools. Small lookup tables can be configured as HEAP or with a clustered rowstore index in the IR.
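For comparison, a small lookup table stored as a heap might look like the sketch below. The table analytics.dim_plan and the plan_name column are assumed for illustration; only plan_id appears in the example schema:

```sql
-- Hypothetical small lookup table: a few rows, so columnstore compression
-- adds little. HEAP keeps it as an unordered rowstore; a clustered rowstore
-- index would instead be declared as CLUSTERED INDEX (plan_id).
CREATE TABLE analytics.dim_plan (
  plan_id     NVARCHAR(64)   NOT NULL,
  plan_name   NVARCHAR(128)  NOT NULL
)
WITH (
  DISTRIBUTION = REPLICATE,
  HEAP
);
```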
Does it support Kimball, Data Vault, and 3NF on Synapse?
Yes. All three paradigms are first-class on Synapse. The generator emits the right DISTRIBUTION and indexing strategies for each.
Does it generate a dbt project for Synapse?
Yes. OffDataAI emits a complete dbt project for Synapse — dbt_project.yml configured for dbt-synapse, sources, staging, marts (with dist + index configs), schema tests, and seed CSVs.
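To give a sense of what a mart model with dist and index configs can look like, here is a minimal sketch. It assumes the `dist` and `index` model configs documented by the dbt-synapse adapter; the file path and the stg_usage staging model are hypothetical, and the actual generated project may differ:

```sql
{# models/marts/fct_usage.sql (hypothetical path and model names) #}
{{
  config(
    materialized='table',
    dist='HASH(customer_sk)',
    index='CLUSTERED COLUMNSTORE INDEX'
  )
}}

select
  usage_id,
  customer_sk,
  plan_id,
  event_ts,
  api_calls
from {{ ref('stg_usage') }}
```

Keeping the distribution and index settings in the model config means the physical design travels with the dbt project rather than living only in hand-run DDL.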

Your data warehouse is one conversation away.

Describe your domain, or open one of 150+ production-grade templates. ERDs, DDL, and a complete dbt project — generated in under a minute.