# 25WD9141
## Position Overview
We are seeking a highly experienced Principal Engineer to lead the design, development, and evolution of our Batch Processing platform, which powers Autodesk's Analytics Data Platform (ADP). This role requires deep technical expertise in distributed data systems, large-scale pipeline orchestration, and hands-on leadership in shaping next-generation data platform capabilities. You will partner closely with Engineering Managers, Architects, Product teams, and Partner Engineering to modernize our data lakehouse architecture, deliver highly reliable data pipelines, and establish technical excellence across ingestion, processing, and governance.
## Responsibilities
### Technical Leadership
- Define and drive the technical strategy for Batch Processing within ADP
- Design and scale PySpark-based distributed processing frameworks integrated with Airflow for orchestration
- Champion Apache Iceberg-based data lakehouse adoption (versioning, branching, WAP, indexing)
- Introduce performance optimization patterns for data pipelines (e.g., partition pruning, caching, resource tuning)
- Establish patterns for metadata-driven, intent-based data processing aligned with AI-assisted pipelines
### Architecture & Delivery
- Partner with architects to design multi-tenant, secure, and compliant (SOC2 / GDPR / CCPA) batch processing services
- Define reference architectures and reusable frameworks for cross-domain processing workloads
- Lead technical reviews, solution designs, and architectural forums across ADP
- Guide the evolution of Airflow APIs and SDKs to enable scalable pipeline self-service

### Mentorship & Collaboration
- Mentor senior engineers, staff engineers, and tech leads within the Batch Processing and adjacent teams
- Partner with the Batch Ingestion and Data Security teams to deliver unified ingestion + processing flows

### Innovation & Impact
- Drive modernization initiatives to migrate away from legacy tools
- Pioneer AI-augmented data engineering practices (e.g., pipeline optimization agents, anomaly detection)
- Ensure scalability, cost-efficiency, and reliability for thousands of production pipelines across Autodesk
- Influence company-wide data engineering strategy by contributing thought leadership and whitepapers

## Minimum Qualifications
- 8+ years of experience in software / data engineering, with at least 3 years in big data platform leadership roles
- Expert in distributed data processing (Spark, PySpark, Ray, Flink)
- Deep experience with workflow orchestration (Airflow, Dagster, Prefect)
- Strong hands-on expertise in Lakehouse technologies (Iceberg, Delta, Hudi) and cloud platforms (AWS / Azure / GCP)
- Proven track record in architecting secure, multi-tenant, and compliant data platforms
- Skilled in SQL / NoSQL databases, Snowflake, Glue, and modern metadata / catalog tools
- Strong problem-solving, communication, and cross-geo collaboration skills
- Experience mentoring engineers, building strong technical culture, and influencing at scale

## Preferred Qualifications
- Exposure to AI / ML-driven data engineering (optimizers, anomaly detection, auto-scaling)
- Experience with data governance, lineage, and observability tools (Atlan, Databand, Collibra, etc.)
- Familiarity with streaming + batch hybrid architectures (Kappa / Lambda)

#LI-KS2