DSL

YoY

Intention

We provide a series of frequently used financial comparison functions such as Year-over-year (YoY), for looking at two or more measurable events on an annualized basis.

  • YoY: Year-over-year
  • MoM: Month-over-month
  • Lookback: looking back certain value over a certain period of time.

 

It allows user to calculate relative changes in metrics over time periods by comparing current values with historical values (previous year or month).
Reference: https://help.sigmacomputing.com/docs/datelookback

Solution

This is implemented through a DSL converter that transforms high-level metric definitions into SQL queries.
This implementation allows users to easily define time-based comparative analytics without writing complex SQL joins and calculations manually.

Key Features

  • Support both aggregated and non-aggregated metrics
  • Handles multiple YoY/MoM metrics in the same query
  • Supports data truncation (e.g., by month)
  • Handles different date formats and types
  • Performs null checks and division by zero protection

Main Components

  • YoYDslConverter : The main class that handles the conversion
  • YoYExtra: Contains configuration for time-based comparisons
  • DSLConfigNode: Represents the structure of quires

Workflow

Input Processing

The system accepts DSL configurations where metrics can be marked with special operators:

  • YOY: Year-over-Year percentage change
  • MOM: Month-over-Month percentage change
  • YOY_DIFF: Difference between years
  • MOM_DIFF: Difference between months
  • LOOKBACK: just fetch the historical value
    The example can be found in the appendix.

Conversion Process

The converter:

  1. Identified YoY/MoM metrics in the input
  2. Creates two queries:
    1. One for current period data
    2. One for historical period data (shifted by year/month)
  3. Join these queries together
  4. Applies the appropriate calculation formula
    The core conversion logic can be found in the appendix.

 

Formula Implementation

The system implements different formulas based on the operator:

  • For YOY/MOM: (current - previous) / previous
  • For YOY_DIFF/MOM_DIFF: current - previous
  • For LOOKBACK: just returns the previous value
    The formula implementation can be found in the appendix

Output

The converter first converts it into low-level DSL that can be interpreted by SQL converter.
Then it finally converts into SQL queries.
Both examples can be found in the appendix.

Challenges

I handle these challenges through:

  • Modular code structure
  • Clear separation of concerns between DSL parsing and SQL generation
  • Flexible DSL configuration
  • Unit tests (30+)
  • Strong validation checks
  • Comprehensive error handling

Multiple and Nested Conversion in a Single Query

  • Complex JOIN conditions
  • Supporting multiple YoY/MoM metrics simultaneously
  • Ensuring consistent date column usage across metrics

 

Non-aggregation Support

We provide aggregated (with GROUP BY) YoY and MoM operations by default. However, for pre-calculated tables that are already aggregated. A non-aggregation operation is needed for such a scenario.
Challenges

  • Aggregated metrics
  • Non-aggregated metrics (raw data)
    Handle
  • Perform checks preventing the mixing of aggregated and non-aggregated metrics

Query Optimization

Query performance can be degraded if multiple YoY/MoM metrics in a single query.

 

For multiple metrics:

  • Combine joints that reference the same table
  • Extract fields from combined joints
  • Rewrite join conditions

 

For nested queries:

  • Convert the underlying view into the result table
  • Rewrite the reference into the result table

Edge cases

  • Missing data handling
  • Divided by zero handling
  • Null value propagation
  • String date format parsing

Appendix

DSL and SQL

Example of High Level DSL

{
  "datasource": {
    "connId": "demo_datasource",
    "name": "demo_datasource",
    "tableType": "MYSQL"
  },
  "dimensions": [
    {
      "name": "order_date",
      "type": "COLUMN",
      "ref": "order_date"
    }
  ],
  "metrics": [
    {
      "name": "order_money",
      "type": "COLUMN",
      "ref": "order_money",
      "operator": "SUM",
      "derive": "YOY",
      "extra": {
        "dateColumn": "order_date",
        "timeAdd": -1,
        "timeAddUnit": "year",
        "truncUnit": "month",
        "srcType": "datetime",
        "srcFormat": null,
        "destType": "datetime",
        "destFormat": null
      }
    }
  ],
  "table": {
    "name": "t1",
    "ref": "demo.t_order",
    "type": "table",
    "refExtra": {
      "metadata": {
        "columns": [
          "order_date",
          "order_money"
        ],
        "columnTypes": [
          "datetime",
          "decimal"
        ]
      }
    }
  }
}

Example of low level DSL

{
  "dslType": "NO_AGG",
  "name": "yoyQuery",
  "projections": [
    {
      "name": "order_date",
      "type": "UNIFY_CAST",
      "ref": "order_date",
      "refExtra": {
        "srcType": "datetime",
        "srcFormat": null,
        "destType": "datetime",
        "destFormat": null,
        "timeAddUnit": "year",
        "timeAdd": 0,
        "truncUnit": "month"
      },
      "evict": false,
      "operator": "IDENTIFIER",
      "nodeName": ""
    },
    {
      "name": "order_money",
      "type": "NEW_EXPRESSION",
      "ref": "IF(t_yoy_previous_order_money_1_year.order_money IS NULL OR t_yoy_previous_order_money_1_year.order_money = 0, NULL, (t_yoy_current.order_money - t_yoy_previous_order_money_1_year.order_money) / t_yoy_previous_order_money_1_year.order_money)",
      "evict": false,
      "operator": "IDENTIFIER",
      "nodeName": ""
    }
  ],
  "join": {
    "joinTypes": [
      "left"
    ],
    "tables": [
      {
        "name": "t_yoy_current",
        "type": "QUERY",
        "ref": "t_yoy_current",
        "evict": false,
        "nodeName": ""
      },
      {
        "name": "t_yoy_previous_order_money_1_year",
        "type": "QUERY",
        "ref": "t_yoy_previous_order_money_1_year",
        "evict": false,
        "nodeName": ""
      }
    ],
    "on": null,
    "onFilters": [
      [
        {
          "type": "COLUMN",
          "ref": "t_yoy_current.order_date",
          "evict": false,
          "operator": "EQ",
          "value": "t_yoy_previous_order_money_1_year.order_date",
          "valueType": "COLUMN",
          "nodeName": ""
        },
        {
          "type": "COLUMN",
          "ref": "t_yoy_current.order_date",
          "evict": false,
          "operator": "EQ",
          "value": "t_yoy_previous_order_money_1_year.order_date",
          "valueType": "COLUMN",
          "nodeName": ""
        }
      ]
    ],
    "nodeName": "",
    "pos": null
  },
  "children": [
    {
      "dslType": "AGG",
      "name": "t_yoy_current",
      "metrics": [
        {
          "name": "order_money",
          "type": "COLUMN",
          "ref": "order_money",
          "evict": false,
          "operator": "SUM",
          "nodeName": ""
        }
      ],
      "metricsCombine": [],
      "datasource": {
        "name": "demo_datasource",
        "connId": "demo_datasource",
        "tableType": "MYSQL",
        "defaultDb": null,
        "nodeName": "",
        "pos": null
      },
      "table": {
        "name": "t1",
        "type": "TABLE",
        "ref": "demo.t_order",
        "refExtra": {
          "metadata": {
            "columns": [
              "order_date",
              "order_money"
            ],
            "columnTypes": [
              "datetime",
              "decimal"
            ]
          }
        },
        "evict": false,
        "nodeName": ""
      },
      "dimensions": [
        {
          "name": "order_date",
          "type": "UNIFY_CAST",
          "ref": "order_date",
          "refExtra": {
            "srcType": "datetime",
            "srcFormat": null,
            "destType": "datetime",
            "destFormat": null,
            "timeAddUnit": "year",
            "timeAdd": 0,
            "truncUnit": "month"
          },
          "evict": false,
          "operator": "IDENTIFIER",
          "nodeName": ""
        }
      ],
      "limit": 0,
      "offset": 0,
      "nodeName": ""
    },
    {
      "dslType": "AGG",
      "name": "t_yoy_previous_order_money_1_year",
      "metrics": [
        {
          "name": "order_money",
          "type": "COLUMN",
          "ref": "order_money",
          "evict": false,
          "operator": "SUM",
          "nodeName": ""
        }
      ],
      "metricsCombine": [],
      "datasource": {
        "name": "demo_datasource",
        "connId": "demo_datasource",
        "tableType": "MYSQL",
        "defaultDb": null,
        "nodeName": "",
        "pos": null
      },
      "table": {
        "name": "t1",
        "type": "TABLE",
        "ref": "demo.t_order",
        "refExtra": {
          "metadata": {
            "columns": [
              "order_date",
              "order_money"
            ],
            "columnTypes": [
              "datetime",
              "decimal"
            ]
          }
        },
        "evict": false,
        "nodeName": ""
      },
      "dimensions": [
        {
          "name": "order_date",
          "type": "UNIFY_CAST",
          "ref": "order_date",
          "refExtra": {
            "srcType": "datetime",
            "srcFormat": null,
            "destType": "datetime",
            "destFormat": null,
            "timeAddUnit": "year",
            "timeAdd": -1,
            "truncUnit": "month"
          },
          "evict": false,
          "operator": "IDENTIFIER",
          "nodeName": ""
        }
      ],
      "limit": 0,
      "offset": 0,
      "nodeName": ""
    }
  ],
  "limit": 0,
  "offset": 0,
  "nodeName": ""
}

Example of output SQL

SELECT DATE_TRUNC('month', "t1"."order_date") AS "order_date", IF("t4"."order_money" IS NULL OR "t4"."order_money" = 0, NULL, ("t1"."order_money" - "t4"."order_money") / "t4"."order_money") AS "order_money"
FROM (SELECT DATE_TRUNC('month', "order_date") AS "order_date", SUM("order_money") AS "order_money"
FROM "demo"."t_order"
GROUP BY DATE_TRUNC('month', "order_date")) AS "t1"
LEFT JOIN (SELECT DATE_TRUNC('month', DATE_ADD('year', -1, "order_date")) AS "order_date", SUM("order_money") AS "order_money"
FROM "demo"."t_order"
GROUP BY DATE_TRUNC('month', DATE_ADD('year', -1, "order_date"))) AS "t4" ON "t1"."order_date" = "t4"."order_date"

Core conversion logic

  private static DSLConfigNode yoyDslConvert(
      DSLConfigNode dslConfigNode, List<OperatorExpression> yoyMetrics, boolean isAgg) {
    // 使用第一个指标获取基础 YoyExtra 配置
    OperatorExpression firstYoyMetric = yoyMetrics.get(0);
    YoyExtra yoyExtra = YoyExtra.getYoyExtra(firstYoyMetric);
    // 创建当前周期查询
    DSLConfigNode currentPeriodDsl = createYearDsl(
        dslConfigNode, CURRENT_PERIOD_TABLE_ALIAS, 0, yoyExtra, yoyMetrics, isAgg);
    // 为每个 YoY 指标创建对应的历史查询
    List<DSLConfigNode> previousDsls = new ArrayList<>();
    for (OperatorExpression yoyMetric : yoyMetrics) {
      YoyExtra metricYoyExtra = YoyExtra.getYoyExtra(yoyMetric);
      DSLConfigNode previousPeriodDsl = createYearDsl(
          dslConfigNode, getTableAlias(yoyMetric),
          metricYoyExtra.getTimeAdd(), metricYoyExtra, yoyMetrics, isAgg);
      previousDsls.add(previousPeriodDsl);
    }
    return createYoyDsl(yoyMetrics, currentPeriodDsl, previousDsls, isAgg);
  }

Formula implementation

  private static String getRef(
      OperatorExpression yoyMetric, String metricName, String previousPeriodTableAlias) {
    Operator operator = getOperator(yoyMetric);
    if (Objects.equals(operator, Operator.YOY)
        || Objects.equals(operator, Operator.MOM)) {
      return String.format(
          "IF(%s.%s IS NULL OR %s.%s = 0, NULL, (%s.%s - %s.%s) / %s.%s)",
          previousPeriodTableAlias, metricName,
          previousPeriodTableAlias, metricName,
          CURRENT_PERIOD_TABLE_ALIAS, metricName,
          previousPeriodTableAlias, metricName,
          previousPeriodTableAlias, metricName
      );
    }
    if (Objects.equals(operator, Operator.YOY_DIFF)
        || Objects.equals(operator, Operator.MOM_DIFF)) {
      return String.format("%s.%s - %s.%s",
          CURRENT_PERIOD_TABLE_ALIAS, metricName,
          previousPeriodTableAlias, metricName
      );
    }
    if (Objects.equals(operator, Operator.LOOKBACK)) {
      return String.format("%s.%s",
          previousPeriodTableAlias, metricName
      );
    }
    return null;
  }