Unlock Fast, Reliable Data Ingestion
You will play a key role in our team, which specialises in bespoke data platforms and AI solutions. As a Fabric Data Engineer, you will deliver fast, reliable Bronze-layer ingestion.
You'll own Fabric Mirroring from SQL Server and other sources into OneLake, manage CDC and schema drift at scale, and design resilient, high-volume ingestion pipelines that teams can trust.
- Mirroring Configuration: Configure Fabric Mirroring from SQL Server (and other relational sources) into OneLake; tune schedules, snapshots, retention, and throughput.
- Bronze Ingestion: Define Lakehouse folder structures, naming/tagging conventions, and partitioning for fast, organised Bronze ingestion.
- CDC Management: Implement change data capture (including soft/hard deletes, late-arriving data, and backfills) using reliable watermarking and reconciliation checks.
- Pipeline Design: Build ingestion with Fabric Data Factory and/or notebooks; add retries, dead-lettering, and circuit-breaker patterns for fault tolerance.
- Schema Drift Management: Automate drift detection and schema evolution; publish change notes and guardrails so downstream consumers aren't surprised.
- Performance Optimisation: Optimise batch sizes, file counts, partitions, parallelism, and capacity usage to balance speed, reliability, and spend.
Requirements:
- SQL Server, T-SQL; CDC/replication fundamentals
- Microsoft Fabric Mirroring; OneLake/Lakehouse; OneLake shortcuts
- Schema drift detection/management and data contracts
- Familiarity with large, complex relational databases
- Python/Scala/Spark for ingestion and validation