Skip to content

Schemas

SBT allows you to define schema specs in your model yaml files. Those schemas can then be assigned to sources or models. When a schema is assigned to a source or model, sbt can validate the model output or source table/view against the schema definition. The use of schemas is what powers Insight Cloud's ability to evaluate if a user's data is compatible with an app. This is key to Seek's ability to run your app logic against any user's data.

Schema specs include the following fields:

Field Description
name The name of the schema
description An optional description of the schema
columns A list of column definitions

Column definitions include the following fields:

Field Description
name The name of the column
type The type of the column
description An optional description of the column
constraints A optional list of constraints

Column Constraints

Schema column constraints are used to enforce rules on the data in the column. This allows your app to catch data quality issues upstream and avoid providing users with confusing analysis.

There are 3 types of constraints currently:

  • not_null - Ensures that the column has no null values
  • unique - Ensures that the column has all unique values
  • expr - Allows for a custom expression to be evaluated on the column. The expression is a string that is used in a count query to evaluate the column. If the count is zero, the constraint is satisfied. The expression can be any valid SQL expression that can be used in a where clause. For example, > 5, < CURRENT_DATE(), = FALSE, not in ('US', 'CA').

SBT Commands

sbt has a few commands for checking schema adherence:

  • sbt validate-model - compares a model instantiated in Snowflake against a desired schema
  • sbt validate-source - compares a Snowflake source against a desired schema
  • sbt validate-app - runs validations for all models and sources in the app with a schema assigned
  • Use the --validate flag on the run-model or run-app command to trigger a schema validation after running the model or app

Example YAML

Here is an example schema object definition:

models/schemas.yml
schemas:
  - name: sample_target_schema
    description: an optional description
    columns:
        - name: CITY
          type: string
          constraints:
            - type: not_null
            - type: unique
        - name: DC NBR
          type: integer
          description: an optional description
          constraints:
            - type: expr
              expression: '>7500'
        - name: LATITUDE
          type: float
          description: an optional description
        - name: ISMANUALGEOCODE
          type: boolean
        - name: ADDRESS
          type: str
          constraints:
            - type: not_null
          description: an optional description
        - name: STATE/PROV
          type: text
          description: an optional description, this column is optional
          optional: true

To assign a schema to a model or schema, add a target_schema to the yaml config. You only need to assign the name of the schema, sbt will handle looking up the details of the named schema. We do this so that multiple models or sources can share the same schema definition.

models/models.yml
models:
  - name: sql_test
    config:
      language: sql
      materialized: table
      type: output
      target_schema: sample_target_schema
sources/sources.yml
sources:
  - name: atlas_luminate
    description: "Atlas-prepped Luminate Tables "
    database: SEEK_TEST_DATA
    tables:
      - name: luminate_stores
      - name: luminate_distributioncenters
        target_schema: sample_target_schema

Notes