A US/Canada-based IT MNC is hiring a Data Modeler for one of its banking clients.
Experience with the banking domain and BIAN architecture is a must.
Position Name: Data Modeler
Location: Remote
Time Overlap: Till 11 PM IST
Required skills:
Experience with the Banking domain is a big plus.
1. Lakehouse Data Modeling on Amazon S3
o Design Medallion architecture (Bronze/Silver/Gold)
o Model data for scalability, partitioning, and domain-based access
o Handle schema evolution and time-travel use cases
2. AWS Glue + PySpark (ETL Modeling)
o Translate logical/physical models into PySpark transformations
o Optimize joins, partition pruning, pushdown predicates
o Manage schema via Glue Data Catalog
3. Schema Design & Metadata Management
o Define canonical schemas and data contracts
o Maintain centralized metadata using Glue Catalog
o Versioning and backward compatibility of schemas
4. Modern Table Formats (Apache Iceberg / Delta)
o Implement ACID-compliant tables on S3
o Design for incremental loads, CDC, and snapshot-based querying
o Optimize compaction and partition strategies
5. Streaming & CDC Data Modeling (Kafka / MSK)
o Design event schemas aligned with domain models
o Model change data capture flows into the lakehouse
o Ensure consistency between streaming and batch layers
6. Advanced Data Modeling Techniques
o Data Vault 2.0 (Hubs, Links, Satellites)
o Dimensional modeling (Star/Snowflake)
o SCD (Type 1/2/3), surrogate keys, historization
7. Data Governance & Quality Engineering
o Data lineage, cataloging, metadata-driven pipelines
o Data quality frameworks (Great Expectations, Deequ)
o RBAC, audit, compliance
8. Lakehouse & Medallion Architecture
o Bronze (raw CDC), Silver (conformed), Gold (business-ready)
o Schema evolution, late-arriving data, deduplication
9. Orchestration & Pipeline Engineering
o Apache Airflow (DAG design, dependency management, SLA handling)
o Hybrid orchestration (event + schedule driven)
o CI/CD for data pipelines
10. Canonical & Contract-First Data Design
o Canonical schemas, data contracts, schema versioning
o API/event schema alignment (Avro/JSON/Protobuf)
11. Domain-Centric Data Modeling
o Nice to have: experience with BIAN-aligned service domains (www.Bian.Org)
o Domain-driven design with clear data ownership and boundaries