SQL Server Integration Services has been the backbone of enterprise ETL on Microsoft platforms for over two decades. Organizations running hundreds or thousands of SSIS packages — encoded as .dtsx XML files — face a pivotal decision as they move to Google Cloud: how do you faithfully translate the control flow orchestration, data flow transformations, and connection management patterns of SSIS into the serverless, cloud-native architecture of BigQuery, Cloud Composer, and Dataform?
This is not a simple lift-and-shift. SSIS packages are deeply coupled to SQL Server — they rely on OLE DB connections, Windows authentication, MSDB job scheduling, and a tightly integrated control flow / data flow execution model that has no direct equivalent in the Google Cloud ecosystem. The migration requires decomposing each .dtsx package into its constituent parts and mapping them to the right Google Cloud service: BigQuery for data storage and SQL transformations, Cloud Composer (managed Apache Airflow) for orchestration, Dataform for SQL-based ELT workflows, Cloud Functions for lightweight procedural logic, and Google Cloud Storage for staging and file operations.
Why SSIS to BigQuery? The Strategic Case
The decision to migrate SSIS to BigQuery is driven by several converging forces that make the Microsoft-centric ETL stack increasingly difficult to justify for modern data platforms.
SQL Server Dependency Removal
SSIS is inextricably tied to SQL Server. Every SSIS package requires a SQL Server instance for the SSISDB catalog, package storage, logging, and execution. The SSIS runtime itself runs as a Windows service that must be installed on dedicated integration servers. This creates a hard dependency on SQL Server licensing, Windows Server infrastructure, and Microsoft-specific tooling that conflicts with multi-cloud and cloud-first strategies.
BigQuery eliminates this dependency entirely. There is no server to manage, no runtime to install, no catalog database to maintain. SQL transformations run in BigQuery’s serverless engine, orchestration runs in Cloud Composer’s managed Airflow environment, and all infrastructure is abstracted away behind Google Cloud APIs.
Serverless Economics
SSIS requires dedicated Integration Runtime servers — typically Windows VMs with SQL Server installed — that must be provisioned, patched, and scaled manually. During off-peak hours, these servers sit idle but continue to incur costs. During peak loads, they become bottlenecks that require manual intervention to scale.
BigQuery’s on-demand pricing model charges only for bytes scanned during query execution. There is no idle compute cost. Cloud Composer charges for the Airflow environment, but DAG execution triggers BigQuery jobs that scale automatically with no infrastructure management. For most organizations, this shift from provisioned to serverless compute reduces total cost of ownership by 40–60% while eliminating operational overhead.
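To illustrate the pricing model, a BigQuery dry run reports how many bytes a query would scan without executing it or incurring cost. This is a minimal sketch; the project and table names are placeholders:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project

# Dry run: BigQuery plans the query and reports bytes scanned, at no cost.
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
job = client.query(
    "SELECT region, SUM(amount) FROM `my-project.reporting.daily_summary` "
    "GROUP BY region",
    job_config=job_config,
)

# On-demand cost is proportional to bytes processed.
tb_scanned = job.total_bytes_processed / 1e12
print(f"Would scan {job.total_bytes_processed:,} bytes (~{tb_scanned:.4f} TB)")
```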
Cloud-First Data Architecture
Modern data platforms are built around cloud-native services: object storage (GCS) for data lakes, BigQuery for analytical warehousing, Pub/Sub for streaming, Dataflow for Apache Beam pipelines, and Composer for orchestration. SSIS packages that load data from flat files on network shares, transform data through in-memory data flow pipelines, and write results to SQL Server tables represent an architectural pattern that is fundamentally incompatible with this cloud-native stack.
Migrating SSIS to BigQuery is not just a tool swap — it is an architectural modernization that moves the organization from a server-centric ETL model to a service-centric ELT model where data lands in cloud storage, transformations happen in the warehouse, and orchestration is managed as code.
SSIS to BigQuery migration — automated end-to-end by MigryX
Architecture Comparison: SSIS vs. Cloud Composer + BigQuery
Understanding the architectural differences between SSIS and the Google Cloud target stack is essential for planning a successful migration. The two systems have fundamentally different execution models, and every SSIS concept must be reinterpreted in the context of cloud-native services.
SSIS Architecture
An SSIS package (.dtsx) is an XML document that encodes two parallel execution graphs: a control flow that defines the sequence of tasks (Execute SQL, Script Task, File System Task, For Each Loop, etc.) and a data flow that defines the in-memory transformation pipeline (source adapters, transformations, destination adapters). The control flow is the orchestration layer — it determines what runs when and how errors are handled. The data flow is the transformation layer — it moves and transforms data row-by-row or buffer-by-buffer through an in-memory pipeline engine.
SSIS packages are stored in the SSISDB catalog on a SQL Server instance, scheduled through SQL Server Agent jobs, and executed by the SSIS runtime on dedicated Integration Runtime servers. Variables, parameters, and connection managers provide configuration and environment management. Package execution is logged to the SSISDB catalog with detailed event and diagnostic data.
Google Cloud Target Architecture
In the Google Cloud stack, orchestration moves to Cloud Composer (Airflow DAGs written in Python), SQL transformations move to BigQuery (via Dataform SQLX models or direct BigQuery SQL), procedural logic moves to Cloud Functions or Dataproc jobs, and file operations move to GCS with Cloud Functions or Composer operators. There is no in-memory data flow pipeline — instead, data is pushed down to BigQuery’s distributed query engine for transformation at scale.
This architectural shift means that SSIS data flow components (Derived Column, Lookup, Conditional Split, Multicast, Union All, Sort, Aggregate) are not converted to equivalent in-memory operations. Instead, they are expressed as SQL statements that BigQuery executes in its serverless engine. The SSIS control flow is converted to a Composer DAG that orchestrates BigQuery jobs, GCS operations, and Cloud Function invocations.
MigryX: Purpose-Built Parsers for Every Legacy Technology
MigryX does not rely on generic text matching or regex-based parsing. For every supported legacy technology, MigryX has built a dedicated Abstract Syntax Tree (AST) parser that understands the full grammar and semantics of that platform. This means MigryX captures not just what the code does, but why — understanding implicit behaviors, default settings, and platform-specific quirks that generic tools miss entirely.
Component Mapping: SSIS to Google Cloud
Every SSIS component has a target equivalent in the Google Cloud ecosystem. The following mapping table provides the definitive reference for converting .dtsx packages to Cloud Composer DAGs, BigQuery SQL, and Dataform models.
| SSIS Component | Google Cloud Equivalent | Notes |
|---|---|---|
| Data Flow Task | BigQuery SQL / Dataproc job | In-memory pipeline replaced by SQL pushdown or Spark for complex transforms |
| Execute SQL Task | BigQuery SQL (via Composer BigQueryInsertJobOperator) | SQL transpiled from T-SQL to BigQuery Standard SQL |
| Script Task (C#/VB.NET) | Cloud Function / Dataproc job | Procedural logic converted to Python Cloud Functions |
| For Each Loop Container | Composer DAG dynamic task loop | File iteration becomes GCS list + dynamic task mapping |
| Sequence Container | Composer TaskGroup | Logical grouping of related tasks with shared error handling |
| SSIS Variables | Composer Airflow variables / DAG params | Package variables become Airflow Variables or DAG run conf |
| Connection Managers | BigQuery connections / GCS / Secret Manager | OLE DB connections become BigQuery dataset refs; credentials in Secret Manager |
| .dtsx Package File | Composer DAG .py file | XML package converted to Python DAG definition |
| Derived Column Transform | SQL CASE / expression in SELECT | Column derivations become SQL expressions in BigQuery |
| Lookup Transform | BigQuery JOIN | Reference table lookups become JOIN operations in SQL |
| Conditional Split | CASE WHEN / WHERE clauses | Row routing becomes SQL conditional logic |
| OLE DB Source | BigQuery table / external table | SQL Server sources migrated to BigQuery tables |
| Flat File Source | GCS + BigQuery external table / LOAD | File sources staged in GCS, loaded via BigQuery load jobs |
| Multicast Transform | Multiple INSERT...SELECT statements | One source writing to multiple targets becomes parallel SQL |
| Union All Transform | UNION ALL in BigQuery SQL | Direct SQL equivalent |
| Sort Transform | ORDER BY in BigQuery SQL | Sorting pushed down to BigQuery engine |
| Aggregate Transform | GROUP BY with aggregate functions | SUM, COUNT, AVG, MIN, MAX in BigQuery SQL |
| Merge Join Transform | BigQuery JOIN (INNER/LEFT/FULL) | Sorted merge join becomes hash join in BigQuery |
| Data Conversion Transform | CAST / SAFE_CAST in BigQuery SQL | Type conversions handled by BigQuery casting functions |
| Row Count Transform | COUNT(*) subquery or Airflow XCom | Row counts captured as query results or task metadata |
| Execute Process Task | BashOperator / Cloud Function | External process execution via Composer operators |
| File System Task | GCSToGCSOperator / GCSDeleteObjectsOperator | File copy/move/delete operations on GCS |
| Send Mail Task | EmailOperator / SendGrid API | Email notifications via Composer email operator |
| Package Parameters | Airflow DAG params / Variables | Externalized configuration for environment-specific values |
| Event Handlers | Airflow callbacks (on_failure_callback) | OnError, OnPostExecute become Airflow task callbacks |
| Precedence Constraints | Airflow task dependencies (>>) | Success/failure/completion constraints become DAG edges |
| SSISDB Catalog | Composer metadata DB / Cloud Logging | Execution history and logs in Cloud Logging and Airflow UI |
| SQL Server Agent Job | Composer DAG schedule | Cron-based scheduling in Airflow |
Parsing .dtsx XML: Inside the Package Structure
SSIS packages are stored as XML files with the .dtsx extension. The XML schema is complex, deeply nested, and includes multiple namespaces (DTS, SSIS, SQLTask). Automated migration requires parsing this XML to extract the control flow graph, data flow pipeline, connection strings, variables, parameters, and expressions.
Key XML Elements in a .dtsx File
The root element DTS:Executable contains the package definition. Inside it, DTS:ConnectionManagers defines data source and destination connections. DTS:Variables contains package-scoped variables with types and default values. DTS:Executables contains the control flow tasks and containers. Each Data Flow Task contains a pipeline element with components that define sources, transformations, and destinations. DTS:PrecedenceConstraints defines the execution order and conditional logic between tasks.
```xml
<!-- Simplified .dtsx structure showing key elements -->
<DTS:Executable DTS:ObjectName="DailyOrderETL">
  <DTS:ConnectionManagers>
    <DTS:ConnectionManager DTS:ObjectName="OLE_DB_Source"
        DTS:ConnectionString="Data Source=SQLPROD;
                              Initial Catalog=OrdersDB;
                              Provider=SQLNCLI11.1;
                              Integrated Security=SSPI;"/>
  </DTS:ConnectionManagers>
  <DTS:Variables>
    <DTS:Variable DTS:ObjectName="RunDate"
        DTS:DataType="7">2026-04-08</DTS:Variable>
    <DTS:Variable DTS:ObjectName="RowCount"
        DTS:DataType="3">0</DTS:Variable>
  </DTS:Variables>
  <DTS:Executables>
    <DTS:Executable DTS:ExecutableType="SSIS.Pipeline.3"
        DTS:ObjectName="Load Orders">
      <!-- Data Flow pipeline components here -->
    </DTS:Executable>
    <DTS:Executable DTS:ExecutableType="Microsoft.ExecuteSQLTask"
        DTS:ObjectName="Update Aggregates">
      <!-- Execute SQL task with T-SQL -->
    </DTS:Executable>
  </DTS:Executables>
  <DTS:PrecedenceConstraints>
    <DTS:PrecedenceConstraint DTS:From="Load Orders"
        DTS:To="Update Aggregates" DTS:Value="0"/>
  </DTS:PrecedenceConstraints>
</DTS:Executable>
```
MigryX’s SSIS parser reads every element in the .dtsx XML tree, resolves cross-references between connection managers and tasks, extracts embedded SQL statements from Execute SQL Tasks, parses data flow pipeline graphs to identify transformation chains, and resolves SSIS expression language constructs (which use a proprietary syntax distinct from T-SQL) into equivalent BigQuery SQL expressions.
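To make the extraction step concrete, here is a minimal sketch — not MigryX’s parser — that pulls connection managers, variables, and tasks out of a package shaped like the simplified one above, using Python’s standard library. Real .dtsx files declare the DTS namespace shown in the dictionary below:

```python
# Illustrative only: a production parser must handle many more elements,
# nested containers, data flow pipelines, and SSIS version differences.
import xml.etree.ElementTree as ET

DTS = {"DTS": "www.microsoft.com/SqlServer/Dts"}  # standard DTS namespace URI
DTS_NAME = f"{{{DTS['DTS']}}}ObjectName"


def extract_package_metadata(dtsx_path: str) -> dict:
    root = ET.parse(dtsx_path).getroot()
    connections = [
        cm.get(DTS_NAME)
        for cm in root.findall(".//DTS:ConnectionManager", DTS)
    ]
    variables = {
        v.get(DTS_NAME): v.text
        for v in root.findall(".//DTS:Variable", DTS)
    }
    tasks = [
        ex.get(DTS_NAME)
        for ex in root.findall(".//DTS:Executables/DTS:Executable", DTS)
    ]
    return {"connections": connections, "variables": variables, "tasks": tasks}


# For the simplified DailyOrderETL.dtsx above, this would yield:
# {'connections': ['OLE_DB_Source'],
#  'variables': {'RunDate': '2026-04-08', 'RowCount': '0'},
#  'tasks': ['Load Orders', 'Update Aggregates']}
```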
Code Example: SSIS Package to Cloud Composer DAG
Consider a typical SSIS package that loads daily order data from a SQL Server source, applies derived columns and lookups in a data flow, runs an aggregation SQL statement, and sends a notification email on completion or failure. Here is how the complete package converts to a Cloud Composer DAG.
Original SSIS Package (Conceptual)
The package named DailyOrderETL.dtsx contains a Sequence Container with three tasks: (1) a Data Flow Task that reads from Orders table via OLE DB Source, applies a Derived Column to calculate deal_tier, performs a Lookup against Customers to enrich with region, applies a Conditional Split to route high-value orders, and writes to staging.orders_enriched and staging.high_value_orders; (2) an Execute SQL Task that runs an aggregation query to populate reporting.daily_summary; (3) a Send Mail Task that notifies the team on success. The package has an OnError event handler that sends a failure notification.
Converted Cloud Composer DAG
```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.email import EmailOperator
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryInsertJobOperator,
)
from airflow.utils.email import send_email
from airflow.utils.task_group import TaskGroup

# Converted from: DailyOrderETL.dtsx
# MigryX conversion ID: ssis-069-daily-order-etl
# Original connections: OLE_DB_Source -> SQLPROD.OrdersDB


def notify_failure(context):
    """SSIS OnError event handler -> Airflow on_failure_callback."""
    send_email(
        to="data-team@company.com",
        subject=f"FAILED: daily_order_etl - {context['task_instance'].task_id}",
        html_content="DAG daily_order_etl failed. Check Airflow logs.",
    )


dag = DAG(
    "daily_order_etl",
    description="Converted from SSIS package DailyOrderETL.dtsx",
    schedule_interval="0 6 * * *",
    start_date=datetime(2026, 1, 1),
    catchup=False,
    default_args={
        "retries": 2,
        "retry_delay": timedelta(minutes=5),
        "on_failure_callback": notify_failure,
    },
)

# --- Sequence Container: "Process Orders" -> TaskGroup ---
with TaskGroup("process_orders", dag=dag) as process_orders:
    # Data Flow Task: "Load Orders"
    # Original: OLE DB Source -> Derived Column -> Lookup -> Conditional Split
    # Converted: single BigQuery SQL statement combining the data flow transforms
    load_and_enrich_orders = BigQueryInsertJobOperator(
        task_id="load_and_enrich_orders",
        configuration={
            "query": {
                "query": """
                    -- Data Flow: Derived Column + Lookup + write to staging
                    CREATE OR REPLACE TABLE `project.staging.orders_enriched` AS
                    SELECT
                      o.order_id,
                      o.order_date,
                      o.customer_id,
                      o.product_id,
                      o.amount,
                      -- Derived Column: deal_tier calculation
                      CASE
                        WHEN o.amount >= 10000 THEN 'enterprise'
                        WHEN o.amount >= 1000 THEN 'mid_market'
                        ELSE 'smb'
                      END AS deal_tier,
                      -- Lookup: customer enrichment (was Lookup transform)
                      c.customer_name,
                      c.region,
                      c.segment
                    FROM `project.staging.raw_orders` o
                    LEFT JOIN `project.ref.customers` c
                      ON o.customer_id = c.customer_id
                    WHERE o.order_date = CURRENT_DATE() - 1
                """,
                "useLegacySql": False,
            }
        },
        dag=dag,
    )

    # Conditional Split: high-value orders to a separate table
    split_high_value = BigQueryInsertJobOperator(
        task_id="split_high_value_orders",
        configuration={
            "query": {
                "query": """
                    -- Conditional Split: route high-value orders
                    CREATE OR REPLACE TABLE `project.staging.high_value_orders` AS
                    SELECT *
                    FROM `project.staging.orders_enriched`
                    WHERE deal_tier = 'enterprise'
                """,
                "useLegacySql": False,
            }
        },
        dag=dag,
    )

    load_and_enrich_orders >> split_high_value

# Execute SQL Task: "Update Aggregates"
update_aggregates = BigQueryInsertJobOperator(
    task_id="update_daily_summary",
    configuration={
        "query": {
            "query": """
                -- Execute SQL Task: aggregate to reporting layer
                CREATE OR REPLACE TABLE `project.reporting.daily_summary` AS
                SELECT
                  order_date,
                  region,
                  deal_tier,
                  COUNT(*) AS order_count,
                  SUM(amount) AS total_amount,
                  AVG(amount) AS avg_amount,
                  MIN(amount) AS min_amount,
                  MAX(amount) AS max_amount
                FROM `project.staging.orders_enriched`
                GROUP BY order_date, region, deal_tier
            """,
            "useLegacySql": False,
        }
    },
    dag=dag,
)

# Send Mail Task: success notification
send_success_email = EmailOperator(
    task_id="send_success_notification",
    to="data-team@company.com",
    subject="SUCCESS: Daily Order ETL completed",
    html_content="Daily order ETL pipeline completed successfully.",
    dag=dag,
)

# Precedence Constraints: sequential execution
process_orders >> update_aggregates >> send_success_email
```
Notice how the entire SSIS Data Flow Task — which contained four separate in-memory transformations (OLE DB Source, Derived Column, Lookup, Conditional Split) connected by data flow paths — collapses into two BigQuery SQL statements. The in-memory pipeline is replaced by SQL pushdown to BigQuery’s serverless engine. The SSIS Sequence Container becomes an Airflow TaskGroup. Precedence Constraints become Python >> operators. The OnError event handler becomes an Airflow on_failure_callback. SSIS Variables become Airflow Variables or Jinja-templated parameters.
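As an illustration of the variable mapping, the RunDate package variable from the .dtsx example could become a DAG param that Jinja resolves inside the templated BigQuery job configuration. This is a sketch with placeholder names:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryInsertJobOperator,
)

# SSIS variable "RunDate" -> DAG param, overridable per run via DAG run conf.
dag = DAG(
    "daily_order_etl",
    start_date=datetime(2026, 1, 1),
    schedule_interval="0 6 * * *",
    catchup=False,
    params={"run_date": "2026-04-08"},
)

load_orders = BigQueryInsertJobOperator(
    task_id="load_orders_for_run_date",
    configuration={
        "query": {
            # "configuration" is a templated field, so Jinja resolves
            # params.run_date at task execution time.
            "query": """
                SELECT *
                FROM `project.staging.raw_orders`
                WHERE order_date = DATE('{{ params.run_date }}')
            """,
            "useLegacySql": False,
        }
    },
    dag=dag,
)
```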
SSIS Expression Language to BigQuery SQL
SSIS uses a proprietary expression language for Derived Column transformations, variable expressions, and precedence constraint conditions. This language has its own syntax for string functions, date functions, type casting, and conditional logic that differs from both T-SQL and BigQuery Standard SQL.
Expression Mapping Examples
| SSIS Expression | BigQuery SQL Equivalent |
|---|---|
| (DT_STR,10,1252)OrderID | CAST(OrderID AS STRING) |
| SUBSTRING(Name,1,50) | SUBSTR(Name, 1, 50) |
| GETDATE() | CURRENT_TIMESTAMP() |
| DATEADD("dd",-1,GETDATE()) | TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY) |
| DATEDIFF("mm",StartDate,EndDate) | DATE_DIFF(EndDate, StartDate, MONTH) |
| REPLACE(City," ","") | REPLACE(City, ' ', '') |
| UPPER(TRIM(Email)) | UPPER(TRIM(Email)) |
| LEN(Description) | LENGTH(Description) |
| ISNULL(Phone) ? "N/A" : Phone | IFNULL(Phone, 'N/A') |
| Amount > 10000 ? "High" : "Low" | IF(Amount > 10000, 'High', 'Low') |
| (DT_DECIMAL,2)Amount | CAST(Amount AS NUMERIC) |
| RIGHT("0000" + OrderCode,6) | LPAD(OrderCode, 6, '0') |
| YEAR(OrderDate) | EXTRACT(YEAR FROM OrderDate) |
| MONTH(OrderDate) | EXTRACT(MONTH FROM OrderDate) |
MigryX maintains a comprehensive expression transpilation engine that handles all SSIS expression language functions, including the less common ones like TOKENCOUNT, TOKEN, FINDSTRING, HEXTOBIN, and the various DT_ type casting operators. The transpiler also handles nested expressions and complex conditional chains that combine multiple functions in a single Derived Column expression.
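The flavor of this transpilation can be conveyed with a toy rule table. MigryX’s engine parses the full expression grammar rather than pattern-matching, but a simplified sketch of the mapping step looks like this:

```python
import re

# Toy rules for a handful of SSIS expression constructs. A real transpiler
# builds an AST of the expression instead of rewriting text.
SSIS_TO_BQ = [
    (r"GETDATE\(\)", "CURRENT_TIMESTAMP()"),
    (r"LEN\(", "LENGTH("),
    (r"SUBSTRING\(", "SUBSTR("),
    (r"\(DT_STR,\d+,\d+\)\s*(\w+)", r"CAST(\1 AS STRING)"),
    (r'ISNULL\((\w+)\)\s*\?\s*"([^"]*)"\s*:\s*\1', r"IFNULL(\1, '\2')"),
]


def transpile_expression(ssis_expr: str) -> str:
    """Apply each rewrite rule once, in order."""
    out = ssis_expr
    for pattern, replacement in SSIS_TO_BQ:
        out = re.sub(pattern, replacement, out)
    return out


print(transpile_expression("(DT_STR,10,1252)OrderID"))
# -> CAST(OrderID AS STRING)
print(transpile_expression('ISNULL(Phone) ? "N/A" : Phone'))
# -> IFNULL(Phone, 'N/A')
```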
For Each Loop Containers: Dynamic File Processing
One of the most common SSIS patterns is the For Each Loop Container that iterates over files in a directory, processes each file through a data flow, archives the file, and logs the result. This pattern requires special handling in Cloud Composer because Airflow DAGs are generated at parse time, not at execution time.
```python
# SSIS For Each Loop -> Composer dynamic task mapping
from airflow.decorators import task
from airflow.providers.google.cloud.hooks.gcs import GCSHook


@task
def list_incoming_files():
    """Replaces the SSIS ForEachFile enumerator scanning a directory.
    The *.csv wildcard of the enumerator becomes a suffix filter."""
    hook = GCSHook()
    files = hook.list(bucket_name="raw-incoming", prefix="orders/")
    return [f for f in files if f.endswith(".csv")]


@task
def load_file_to_bigquery(file_path: str):
    """Replaces the SSIS Data Flow inside the For Each Loop."""
    from google.cloud import bigquery

    client = bigquery.Client(project="my-project")
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,
        autodetect=True,
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    uri = f"gs://raw-incoming/{file_path}"
    load_job = client.load_table_from_uri(
        uri,
        "my-project.staging.raw_orders",
        job_config=job_config,
    )
    load_job.result()  # Wait for completion
    return {"file": file_path, "rows": load_job.output_rows}


@task
def archive_file(file_path: str):
    """Replaces the SSIS File System Task (move to archive)."""
    hook = GCSHook()
    hook.copy(
        source_bucket="raw-incoming",
        source_object=file_path,
        destination_bucket="raw-archive",
        destination_object=file_path,
    )
    hook.delete(bucket_name="raw-incoming", object_name=file_path)


# DAG flow: list files, then load and archive via dynamic task mapping.
# expand() creates one mapped task instance per file at runtime; archiving
# starts only after every load has succeeded.
files = list_incoming_files()
loaded = load_file_to_bigquery.expand(file_path=files)
archived = archive_file.expand(file_path=files)
loaded >> archived
```
The SSIS For Each Loop Container is one of the most challenging patterns to convert because Airflow’s execution model differs fundamentally from SSIS. In SSIS, the loop executes sequentially at runtime. In Airflow, dynamic task mapping expands the mapped tasks at runtime into parallel task instances, one per file returned by the upstream listing task. MigryX handles both sequential and parallel semantics, preserving the original execution order when the SSIS package requires it, as sketched below.
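When ordering matters, one option is to collapse the per-file work into a single sequential task rather than mapping it. This sketch reuses the placeholder bucket and table names from the example above:

```python
from airflow.decorators import task
from airflow.providers.google.cloud.hooks.gcs import GCSHook
from google.cloud import bigquery


@task
def process_files_sequentially(file_paths: list[str]):
    """Preserves SSIS For Each Loop semantics: files are processed in a
    deterministic order, and the first failure stops the loop."""
    hook = GCSHook()
    client = bigquery.Client(project="my-project")
    for file_path in sorted(file_paths):
        # Load one file (same logic as load_file_to_bigquery above).
        job = client.load_table_from_uri(
            f"gs://raw-incoming/{file_path}",
            "my-project.staging.raw_orders",
            job_config=bigquery.LoadJobConfig(
                source_format=bigquery.SourceFormat.CSV,
                skip_leading_rows=1,
            ),
        )
        job.result()
        # Archive only after a successful load, as the package did.
        hook.copy(
            source_bucket="raw-incoming",
            source_object=file_path,
            destination_bucket="raw-archive",
            destination_object=file_path,
        )
        hook.delete(bucket_name="raw-incoming", object_name=file_path)


# Sequential alternative to the mapped version above:
# process_files_sequentially(list_incoming_files())
```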
From parsed legacy code to production-ready modern equivalents — MigryX automates the entire conversion pipeline
From Legacy Complexity to Modern Clarity with MigryX
Legacy ETL platforms encode business logic in visual workflows, proprietary XML formats, and platform-specific constructs that are opaque to standard analysis tools. MigryX’s deep parsers crack open these proprietary formats and extract the underlying data transformations, business rules, and data flows. The result is complete transparency into what your legacy code actually does — often revealing undocumented logic that even the original developers had forgotten.
Connection Managers: From OLE DB to BigQuery
SSIS Connection Managers are XML elements that define data source connections using provider-specific connection strings. The most common types are OLE DB (for SQL Server), Flat File (for CSV/text files), ADO.NET (for generic .NET data providers), and SMTP (for email). Each connection manager type maps to a different Google Cloud service.
OLE DB connections to SQL Server become BigQuery dataset references. The SQL Server database maps to a BigQuery dataset, and tables map 1:1 after schema migration. Connection credentials move from Windows Integrated Security or SQL authentication to Google Cloud service account authentication, with secrets stored in Secret Manager rather than SSIS package configuration files or environment variables.
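Retrieving a migrated credential from Secret Manager takes only a few lines of client code; in this sketch the project and secret IDs are placeholders:

```python
from google.cloud import secretmanager


def get_secret(project_id: str, secret_id: str) -> str:
    """Fetch the latest secret version, replacing credentials that SSIS
    kept in package configuration files or Windows credential stores."""
    client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/latest"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("UTF-8")


# e.g. a database password formerly embedded in a connection string:
# db_password = get_secret("my-project", "orders-db-password")
```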
Flat File connections become GCS bucket and object path references. The local or network file paths in SSIS (\\fileserver\data\orders.csv) are replaced by GCS URIs (gs://raw-data/orders/orders.csv). File format specifications (delimiter, text qualifier, header row) are preserved in BigQuery load job configurations or external table definitions.
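As an illustration, the format settings of a Flat File connection manager might carry over to a BigQuery load job configuration along these lines (bucket, table, and delimiter values are placeholders):

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project

# Flat File connection manager settings map onto load job options, e.g.:
#   ColumnDelimiter -> field_delimiter, TextQualifier -> quote_character,
#   header row handling -> skip_leading_rows, code page -> encoding.
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    field_delimiter="|",
    quote_character='"',
    skip_leading_rows=1,
    encoding="UTF-8",
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

load_job = client.load_table_from_uri(
    "gs://raw-data/orders/orders.csv",
    "my-project.staging.raw_orders",
    job_config=job_config,
)
load_job.result()  # Raises on load errors, mirroring a failed SSIS task
```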
Data Type Mapping: SQL Server to BigQuery
SSIS data flow components use SQL Server data types internally for buffer management. When converting to BigQuery, every column type must be mapped correctly to avoid data loss or precision issues.
| SQL Server / SSIS Type | BigQuery Type | Notes |
|---|---|---|
| INT / DT_I4 | INT64 | BigQuery uses 64-bit integers |
| BIGINT / DT_I8 | INT64 | Direct mapping |
| SMALLINT / DT_I2 | INT64 | Widened to INT64 |
| DECIMAL(p,s) / DT_DECIMAL | NUMERIC or BIGNUMERIC | NUMERIC when precision ≤ 38 and scale ≤ 9; BIGNUMERIC for larger |
| FLOAT / DT_R8 | FLOAT64 | IEEE 754 double precision |
| VARCHAR / DT_STR | STRING | No length limit in BigQuery |
| NVARCHAR / DT_WSTR | STRING | BigQuery strings are always UTF-8 |
| DATETIME / DT_DBTIMESTAMP | DATETIME | Microsecond precision in BigQuery |
| DATE / DT_DBDATE | DATE | Direct mapping |
| BIT / DT_BOOL | BOOL | 0/1 converted to FALSE/TRUE |
| UNIQUEIDENTIFIER / DT_GUID | STRING | Stored as UUID string format |
| VARBINARY / DT_BYTES | BYTES | Binary data preserved |
| XML / DT_NTEXT | STRING or JSON | XML may be converted to JSON for BigQuery |
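As a concrete illustration of the table above, a one-time backfill can apply these mappings with SAFE_CAST so that malformed legacy values become NULL rather than failing the job. The table names in this sketch are placeholders:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project

# SAFE_CAST returns NULL on conversion failure instead of aborting the job.
convert_types_sql = """
CREATE OR REPLACE TABLE `my-project.staging.orders_typed` AS
SELECT
  SAFE_CAST(order_id AS INT64)    AS order_id,     -- INT / DT_I4
  SAFE_CAST(amount AS NUMERIC)    AS amount,       -- DECIMAL / DT_DECIMAL
  SAFE_CAST(is_priority AS BOOL)  AS is_priority,  -- BIT 0/1 (as INT64) / DT_BOOL
  SAFE_CAST(order_ts AS DATETIME) AS order_ts,     -- DATETIME / DT_DBTIMESTAMP
  CAST(order_guid AS STRING)      AS order_guid    -- UNIQUEIDENTIFIER / DT_GUID
FROM `my-project.staging.orders_raw`
"""
client.query(convert_types_sql).result()
```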
Handling SSIS Script Tasks
SSIS Script Tasks contain C# or VB.NET code that executes arbitrary logic: calling web APIs, parsing complex file formats, performing custom validations, sending notifications, or implementing business rules that cannot be expressed in SQL. These script tasks are the most challenging components to migrate because they require code translation from .NET to Python.
MigryX’s Script Task analyzer categorizes each script into one of several patterns and generates the appropriate Google Cloud equivalent. HTTP API calls become Cloud Function invocations with the requests library. File parsing logic becomes Python functions deployed as Cloud Functions or embedded in Composer task definitions. Custom validation logic becomes BigQuery SQL assertions or Dataform tests. Complex business rules become stored procedures in BigQuery scripting or Python Cloud Functions triggered by Composer.
```python
# SSIS Script Task (C#) -> Cloud Function (Python)
# Original: Script Task calling a REST API to validate customer records

# --- Original SSIS C# Script Task ---
# public void Main()
# {
#     string apiUrl = Dts.Variables["ApiEndpoint"].Value.ToString();
#     HttpWebRequest request = (HttpWebRequest)WebRequest.Create(apiUrl);
#     request.Method = "POST";
#     request.ContentType = "application/json";
#     // ... send customer batch, get validation results
#     Dts.Variables["ValidationResult"].Value = result;
#     Dts.TaskResult = (int)ScriptResults.Success;
# }

# --- Converted Cloud Function ---
import functions_framework
import requests
from google.cloud import bigquery


@functions_framework.http
def validate_customers(request):
    """Converted from SSIS Script Task: ValidateCustomerRecords."""
    client = bigquery.Client()

    # Read unvalidated customers from BigQuery
    query = """
        SELECT customer_id, name, email, phone
        FROM `project.staging.new_customers`
        WHERE validated = FALSE
    """
    rows = list(client.query(query).result())

    # Call the validation API (same endpoint as the SSIS package)
    api_url = "https://api.validation-service.com/v2/validate"
    batch = [dict(row.items()) for row in rows]
    response = requests.post(api_url, json={"customers": batch})
    results = response.json()

    # Update validation status in BigQuery (parameterized to avoid injection)
    update_query = """
        UPDATE `project.staging.new_customers`
        SET validated = TRUE, validation_score = @score
        WHERE customer_id = @customer_id
    """
    for r in results["validated"]:
        job_config = bigquery.QueryJobConfig(
            query_parameters=[
                bigquery.ScalarQueryParameter("score", "FLOAT64", r["score"]),
                bigquery.ScalarQueryParameter(
                    "customer_id", "STRING", r["customer_id"]
                ),
            ]
        )
        client.query(update_query, job_config=job_config).result()

    return {"validated_count": len(results["validated"])}
```
Error Handling and Logging
SSIS provides a rich error handling model with event handlers (OnError, OnWarning, OnInformation, OnPostExecute), precedence constraint expressions (success, failure, completion with expressions), and detailed execution logging to the SSISDB catalog. Preserving this error handling behavior in Cloud Composer requires mapping each pattern to its Airflow equivalent.
SSIS OnError event handlers become Airflow on_failure_callback functions. Precedence constraints with failure paths become Airflow trigger_rule=TriggerRule.ONE_FAILED on downstream tasks. SSISDB execution logs are replaced by Cloud Logging integration, where every BigQuery job, Cloud Function invocation, and Composer task execution generates structured log entries that can be queried, alerted on, and retained according to organizational policies.
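For example, a failure precedence constraint could translate to a cleanup task gated by trigger_rule. In this sketch EmptyOperator stands in for the real work and all task and DAG names are placeholders:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator
from airflow.utils.trigger_rule import TriggerRule

with DAG(
    "failure_path_demo",
    start_date=datetime(2026, 1, 1),
    schedule_interval=None,
    catchup=False,
):
    # Success path: equivalent of SSIS "Success" precedence constraints.
    load_orders = EmptyOperator(task_id="load_orders")
    update_summary = EmptyOperator(task_id="update_summary")

    # Failure path: equivalent of an SSIS "Failure" precedence constraint.
    # Runs only if at least one upstream task failed.
    rollback_staging = EmptyOperator(
        task_id="rollback_staging",
        trigger_rule=TriggerRule.ONE_FAILED,
    )

    load_orders >> update_summary
    [load_orders, update_summary] >> rollback_staging
```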
MigryX generates a logging configuration for each converted DAG that replicates the SSIS logging granularity, including row counts, execution times, error messages, and variable values at each stage of execution. This ensures that operations teams accustomed to SSISDB reporting have equivalent visibility in the Google Cloud environment.
Dataform Integration for SQL Transformations
While Cloud Composer handles orchestration, Dataform provides a purpose-built environment for SQL-based transformations with dependency management, incremental processing, testing, and documentation. Many SSIS Execute SQL Tasks that implement ELT patterns — staging-to-transformation-to-reporting SQL chains — are best expressed as Dataform SQLX models rather than individual BigQuery SQL statements in Composer.
```sqlx
-- Dataform SQLX model: converted from SSIS Execute SQL Task chain
-- Source: DailyOrderETL.dtsx -> "Update Aggregates" SQL task
config {
  type: "table",
  schema: "reporting",
  description: "Daily order summary by region and deal tier. Converted from SSIS package DailyOrderETL.dtsx.",
  assertions: {
    nonNull: ["order_date", "region", "deal_tier"],
    rowConditions: ["total_amount >= 0", "order_count > 0"]
  }
}

SELECT
  order_date,
  region,
  deal_tier,
  COUNT(*) AS order_count,
  SUM(amount) AS total_amount,
  AVG(amount) AS avg_amount,
  MIN(amount) AS min_amount,
  MAX(amount) AS max_amount
FROM ${ref("orders_enriched")}
GROUP BY order_date, region, deal_tier
```
Dataform adds capabilities that SSIS lacks: automatic dependency resolution between SQL models (no manual precedence constraints needed), built-in assertions for data quality testing, incremental processing with type: "incremental" that only processes new rows, and documentation that lives alongside the SQL code. For organizations migrating hundreds of SSIS Execute SQL Tasks, Dataform provides a more maintainable and testable target than raw BigQuery SQL in Composer DAGs.
MigryX Automation: End-to-End SSIS Conversion
Manual conversion of SSIS packages is error-prone, slow, and does not scale. A single complex .dtsx package can contain dozens of data flow components, hundreds of column mappings, and thousands of lines of embedded SQL and expressions. Enterprise SSIS environments typically contain hundreds to thousands of packages with interdependencies, shared connection managers, and parameterized configurations.
MigryX automates the end-to-end conversion process. Upload your .dtsx files and MigryX’s SSIS parser extracts every control flow task, data flow component, connection manager, variable, parameter, expression, and precedence constraint. The parser resolves the complete execution graph, identifies shared dependencies, and generates a conversion plan that maps each SSIS component to its Google Cloud equivalent.
The output is production-ready: Cloud Composer DAG Python files with correct task dependencies and error handling, BigQuery SQL statements with T-SQL transpiled to Standard SQL, Dataform SQLX models for SQL transformation chains, Cloud Function code for Script Task conversions, and a comprehensive mapping report that traces every SSIS component to its converted equivalent with line-level provenance.
Key Takeaways
- SSIS .dtsx packages decompose into three Google Cloud services: Cloud Composer for orchestration, BigQuery for SQL transformations and data storage, and Cloud Functions for procedural logic.
- SSIS Data Flow Tasks — the in-memory transformation pipeline with Derived Column, Lookup, Conditional Split, and other components — collapse into BigQuery SQL statements that execute in the serverless engine with zero data movement overhead.
- SSIS Control Flow patterns (Sequence Containers, For Each Loops, precedence constraints, event handlers) map directly to Airflow concepts (TaskGroups, dynamic task mapping, task dependencies, callbacks).
- The SSIS expression language requires a dedicated transpiler — it is neither T-SQL nor Standard SQL, and every function, type cast operator, and conditional construct must be mapped to its BigQuery equivalent.
- Connection Managers move from OLE DB connection strings and Windows authentication to BigQuery dataset references and Google Cloud service account authentication with Secret Manager.
- Script Tasks (C#/VB.NET) convert to Python Cloud Functions, preserving API call patterns, custom validation logic, and business rules in a cloud-native runtime.
- Dataform SQLX models provide a superior target for SSIS Execute SQL Task chains, adding dependency management, incremental processing, assertions, and documentation that SSIS lacks.
- MigryX automates the complete conversion pipeline: .dtsx XML parsing, expression transpilation, SQL dialect conversion, DAG generation, and mapping report creation — enabling enterprise-scale SSIS migrations without manual rewriting.
Why MigryX Is the Only Platform That Handles This Migration
The challenges described throughout this article are exactly what MigryX was built to solve. Here is how MigryX transforms this process:
- Deep AST parsing: MigryX’s custom-built parsers achieve 95% accuracy on every supported legacy technology — not through approximation, but through true semantic understanding.
- Merlin AI augmentation: Where deterministic parsing reaches its limit, Merlin AI resolves ambiguities and implicit behaviors, pushing accuracy to 99%.
- Complete coverage: MigryX supports 25+ source technologies including SAS, Informatica, DataStage, SSIS, Alteryx, Talend, ODI, Teradata, and Oracle PL/SQL.
- End-to-end automation: From parsing to conversion to validation — MigryX automates the entire pipeline, not just one step.
MigryX combines precision AST parsing with Merlin AI to deliver 99% accurate, production-ready migration — turning what used to be a multi-year manual effort into a streamlined, validated process. See it in action.
Ready to migrate your SSIS packages to BigQuery?
See how MigryX converts your .dtsx packages to production-ready Cloud Composer DAGs, BigQuery SQL, and Dataform models — with full lineage and expression transpilation.
Explore BigQuery Migration Schedule a Demo