Job Description: Data Engineering Specialist (PySpark/Databricks) – Noida, India
Company: SII Group
Location: Noida, India
Employment Type: Full-time
Role Type: Data Engineer / Senior Data Engineer (depending on experience)
About SII Group USA
SII Group USA is part of the global SII Group, a leading provider of engineering, consulting, and digital transformation services. Our teams partner with global enterprises to design, build, and scale high-performance technology solutions. We are expanding our data engineering capabilities in India and seeking passionate technologists committed to excellence, innovation, and delivery velocity.
Role Overview
We are looking for an experienced Data Engineering Specialist with hands-on expertise in PySpark, Databricks, Delta Lake, cloud integration (GCP/AWS), CI/CD, IaC, and data modeling. The ideal candidate is comfortable building scalable data pipelines, optimizing performance, implementing governance standards, and driving client delivery acceleration in a multi-cloud environment.
Key Responsibilities
1. Data Engineering & Lakehouse Development
- Design, build, and maintain scalable data pipelines using PySpark and Databricks (a minimal sketch follows this list).
- Implement and optimize Delta Lake/Lakehouse architectures for high-volume and real-time workloads.
- Ensure robust data ingestion, transformation, and export workflows with scalable orchestration.
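For illustration, a minimal pipeline of the kind this role builds might look like the sketch below; the catalog, table names, schema, and storage path are placeholders, not client specifics:

```python
# Minimal sketch, not production code: table names, schema, and the
# storage path are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_ingest_example").getOrCreate()

# Ingest raw JSON files landed in cloud object storage (placeholder path).
raw = spark.read.json("gs://example-landing-zone/orders/")

# Basic cleanup: cast timestamps, derive a partition column, deduplicate.
clean = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .dropDuplicates(["order_id"])
       .withColumn("ingested_at", F.current_timestamp())
)

# Persist as a partitioned Delta table for downstream Lakehouse consumers.
(clean.write
      .format("delta")
      .mode("append")
      .partitionBy("order_date")
      .saveAsTable("bronze.orders"))
```

On Databricks, Delta is already the default table format, so the explicit `format("delta")` call mainly documents intent.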
2. Cloud Integration (GCP & AWS)
- Develop integrations between GCP services, Databricks, and enterprise systems.
- Utilize AWS services for complementary workloads (S3, Lambda, EC2, IAM, API Gateway, etc.).
- Manage hybrid cloud data flows and cross-platform integrations securely and efficiently.
3. API/REST-Based Data Export
- Build and manage API/REST-based data export triggers and automated delivery pipelines (see the illustrative sketch below).
- Architect and optimize data exposure layers for downstream consumption and client interfaces.
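As a rough sketch of what an automated export trigger can look like, the snippet below posts a notification to a downstream REST endpoint once a curated dataset is ready; the endpoint URL, bearer-token auth, payload fields, and response shape are assumptions for illustration, not a specific client or Databricks API:

```python
# Minimal sketch, assuming a hypothetical downstream REST endpoint; the URL,
# auth scheme, payload fields, and response shape are illustrative only.
import os
import requests

EXPORT_ENDPOINT = "https://api.example-client.com/v1/exports"  # placeholder


def trigger_export(dataset: str, run_date: str) -> str:
    """Notify a downstream consumer that a curated dataset is ready to pull."""
    response = requests.post(
        EXPORT_ENDPOINT,
        json={"dataset": dataset, "run_date": run_date},
        headers={"Authorization": f"Bearer {os.environ['EXPORT_API_TOKEN']}"},
        timeout=30,
    )
    response.raise_for_status()           # surface failures so the orchestrator can retry
    return response.json()["export_id"]   # assumed response field


if __name__ == "__main__":
    print(trigger_export("orders_curated", "2024-01-01"))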
4. Infrastructure as Code & DevOps
- Implement IaC using Terraform for cloud resource provisioning and environment management.
- Develop and maintain CI/CD pipelines for code deployments, Databricks jobs, and infrastructure automation.
- Ensure repeatable, scalable, and compliant deployment workflows.
5. Data Modeling & SQL Optimization
- Design logical and physical data models, export structures, and schema standards.
- Write and tune complex SQL queries with a focus on performance at scale (see the sketch below).
- Implement best practices for partitioning, caching, indexing, and cost optimization.
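For example, a query tuned for partition pruning on a Delta table might look like the following sketch (table and column names are hypothetical):

```python
# Minimal sketch: table and column names are hypothetical. Filtering on the
# partition column lets Spark prune partitions instead of scanning the table.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql_tuning_example").getOrCreate()

daily_revenue = spark.sql("""
    SELECT order_date,
           customer_id,
           SUM(amount) AS revenue
    FROM bronze.orders
    WHERE order_date >= '2024-01-01'      -- partition filter enables pruning
    GROUP BY order_date, customer_id
""")

# Cache only when the result is reused by several downstream queries.
daily_revenue.cache()

# Inspect the physical plan to confirm partition filters and shuffle behavior.
daily_revenue.explain()
```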
6. Security, Governance & IAM
- Apply data governance best practices across metadata, lineage, quality, and access control.
- Configure and manage IAM, cluster security, encryption, credential handling, and audit logging.
- Ensure compliance with enterprise security policies and client requirements.
7. Performance, Scalability & Reliability
- Optimize ETL/ELT workflows for cost, latency, and throughput.
- Implement monitoring, alerting, and auto-scaling strategies across cloud platforms.
- Collaborate with architecture teams to enable long-term scalability and resilience.
8. Client Delivery & Ramp-Up Support
- Support rapid client onboarding and delivery velocity enhancement.
- Collaborate closely with product owners, project managers, and client stakeholders.
- Provide guidance to junior engineers, perform code reviews, and help build best-practice frameworks.
Required Skills & Experience
- 4–10 years of experience in data engineering (range can be adjusted).
- Strong hands-on experience with PySpark and Databricks.
- Proven expertise in Delta Lake/Lakehouse implementations.
- Experience with both GCP (BigQuery, GCS, Pub/Sub, IAM, etc.) and AWS (S3, Lambda, Glue, Redshift, etc.).
- Proficiency in Terraform and CI/CD tools (Azure DevOps, GitHub Actions, GitLab CI, Jenkins, etc.).
- Strong SQL skills, including performance tuning and query optimization.
- Experience with API integrations, REST services, and automated triggers.
- Understanding of data security, governance frameworks, and IAM policies.
- Experience working in agile delivery teams; client-facing exposure preferred.
Preferred Qualifications
- Databricks certifications (Data Engineer Associate/Professional).
- GCP or AWS cloud certifications.
- Experience supporting enterprise-scale data programs.
- Knowledge of data quality frameworks (e.g., Deequ, Great Expectations).
What We Offer
- Opportunity to work with global clients and cutting-edge data technologies.
- Collaborative culture with a strong focus on innovation.
- Competitive compensation, continuous learning opportunities, and global career mobility.