# 25WD9141
## Position Overview

We are seeking a highly experienced Principal Engineer to lead the design, development, and evolution of our Batch Processing platform, which powers Autodesk’s Analytics Data Platform (ADP). This role requires deep technical expertise in distributed data systems, large-scale pipeline orchestration, and hands-on leadership in shaping next-generation data platform capabilities. You will partner closely with Engineering Managers, Architects, Product teams, and Partner Engineering to modernize our data lakehouse architecture, deliver highly reliable data pipelines, and establish technical excellence across ingestion, processing, and governance.
## Responsibilities

### Technical Leadership

- Define and drive the technical strategy for Batch Processing within ADP
- Design and scale PySpark-based distributed processing frameworks integrated with Airflow for orchestration
- Champion Apache Iceberg-based data lakehouse adoption (versioning, branching, write-audit-publish (WAP), indexing)
- Introduce performance optimization patterns for data pipelines (e.g., partition pruning, caching, resource tuning)
- Establish patterns for metadata-driven, intent-based data processing aligned with AI-assisted pipelines
### Architecture & Delivery

- Partner with architects to design multi-tenant, secure, and compliant (SOC 2 / GDPR / CCPA) batch processing services
- Define reference architectures and reusable frameworks for cross-domain processing workloads
- Lead technical reviews, solution designs, and architectural forums across ADP
- Guide the evolution of Airflow APIs and SDKs to enable scalable pipeline self-service
### Mentorship & Collaboration

- Mentor senior engineers, staff engineers, and tech leads within the Batch Processing team and adjacent teams
- Partner with the Batch Ingestion and Data Security teams to deliver unified ingestion and processing flows
### Innovation & Impact

- Drive modernization initiatives to migrate away from legacy tools
- Pioneer AI-augmented data engineering practices (e.g., pipeline optimization agents, anomaly detection)
- Ensure scalability, cost-efficiency, and reliability for thousands of production pipelines across Autodesk
- Influence company-wide data engineering strategy by contributing thought leadership and whitepapers
## Minimum Qualifications

- 8+ years of experience in software / data engineering, with at least 3 years in big data platform leadership roles
- Expertise in distributed data processing (Spark, PySpark, Ray, Flink)
- Deep experience with workflow orchestration (Airflow, Dagster, Prefect)
- Strong hands-on expertise in lakehouse technologies (Iceberg, Delta, Hudi) and cloud platforms (AWS / Azure / GCP)
- Proven track record of architecting secure, multi-tenant, and compliant data platforms
- Skilled in SQL / NoSQL databases, Snowflake, AWS Glue, and modern metadata / catalog tools
- Strong problem-solving, communication, and cross-geo collaboration skills
- Experience mentoring engineers, building strong technical culture, and influencing at scale
## Preferred Qualifications

- Exposure to AI / ML-driven data engineering (optimizers, anomaly detection, auto-scaling)
- Experience with data governance, lineage, and observability tools (Atlan, Databand, Collibra, etc.)
- Familiarity with streaming + batch hybrid architectures (Kappa / Lambda)
Principal Data Engineer • Delhi, India