Ready to build the future with AI?
At Genpact, we don’t just keep up with technology—we set the pace. AI and digital innovation are redefining industries, and we’re leading the charge. Genpact’s AI Gigafactory, our industry-first accelerator, is an example of how we’re scaling advanced technology solutions to help global enterprises work smarter, grow faster, and transform at scale. From large-scale models to agentic AI, our breakthrough solutions tackle companies’ most complex challenges.
If you thrive in a fast-moving, innovation-driven environment, love building and deploying cutting-edge AI solutions, and want to push the boundaries of what’s possible, this is your moment.
Genpact (NYSE: G) is an advanced technology services and solutions company that delivers lasting value for leading enterprises globally. Through our deep business knowledge, operational excellence, and cutting-edge solutions, we help companies across industries get ahead and stay ahead. Powered by curiosity, courage, and innovation, our teams implement data, technology, and AI to create tomorrow, today. Get to know us at genpact.com and on LinkedIn, X, YouTube, and Facebook.
Inviting applications for the role of Assistant Vice President – Lead Data Engineer
This role requires expertise in Databricks, Azure Data Factory (ADF), Python, PySpark, and Unity Catalog to process and manage large datasets efficiently, along with a deep understanding of cloud architecture to build scalable, secure, and reliable data solutions on the Microsoft Azure platform. The Lead Data Engineer's primary responsibility is to apply advanced data engineering skills to optimize data integration, enhance data accessibility, and drive strategic decision-making through effective data governance, simplification, standardization, and innovative solutions across all supported units. This role will also implement DevOps best practices and drive innovation using modern data platform capabilities such as Unity Catalog, MLflow, and Large Language Models (LLMs).
Responsibilities
Design and development
Collaborate with business stakeholders and analysts to understand data requirements.
Design, develop, and test data pipelines and workflows using Unity Catalog to optimize end-to-end processes.
Create reusable components, robust exception handling, and standardized frameworks for data solutions.
Solution Design
Develop and maintain robust data architectures using Lakehouse principles to ensure efficient data processing and storage.
Design comprehensive data architecture solutions using Databricks and Lakehouse principles to support advanced analytics and machine learning initiatives.
Explore and integrate Large Language Models (LLMs) and Copilot tools to drive automation and agility.
Leverage Databricks MLflow for model lifecycle management and operationalization.
Data Quality and Governance:
Ensure data quality frameworks, lineage, and monitoring are in place.
Implement data quality checks, validation rules, and governance policies to ensure the
accuracy, reliability, and security of data assets.
Implement data security and privacy measures to protect sensitive information.
Data Integration and Analytics:
Extract data from multiple sources, then transform and consolidate it for advanced analytics activities.
Design, implement, and deploy data loaders to load data into the engineering sandbox.
Collaborate with data scientists and analysts to support their data requirements and
prepare machine learning feature stores.
Leadership and Mentorship:
Own complex, cross-functional data projects from ideation to production, including
defining requirements, designing solutions, leading development, and ensuring successful
deployment and long-term maintenance.
Provide guidance and technical leadership to a team of data engineers through in-depth code reviews and peer reviews, mentoring junior and mid-level engineers, and fostering a culture of technical excellence.
Process improvement and efficiency:
Drive continuous improvement initiatives in data processes and systems.
Promote standardization and automation to enhance efficiency and accuracy.
Support regional and global data projects.
Operate within a fast-paced, innovative environment focused on scalable data solutions using Azure services.
Conduct detailed planning to align data engineering goals with organizational objectives.
Set clear priorities, timelines, and deliverables for data projects.
Implement periodic and quarterly reviews with the business team and relevant parties to track
progress, analyze variances, and adjust plans as needed.
Use tools like Jira to track data project progress against schedule adherence, cost management, quality standards, and other key metrics.
Maintain open channels with stakeholders through scheduled meetings, updates, and reports.
Collaborate closely with cross-functional leads to gather insights, validate assumptions, and support decision-making.
Qualifications We Seek in You!
Minimum Qualifications / Skills
Bachelor’s degree in Computer Science, Information Systems, or a related field.
15+ years of experience with Databricks, Azure Data Factory (ADF), Python, PySpark, Unity Catalog, Dataflow, and Lakehouse architecture.
Certifications in Azure data engineering, Databricks, or related fields.
Preferred Qualifications / Skills
Deep hands-on expertise in Azure Data Services (e.g., Azure Data Lake, Azure Data Factory,
Synapse, etc.) and Databricks.
Strong experience in data pipeline design, ETL/ELT development, and data orchestration frameworks.
Proficiency in DevOps tools and practices (CI/CD pipelines, IaC, monitoring).
Knowledge of data lineage, cataloging, and enterprise data marketplace concepts.
Familiarity with integrating third-party data sources and managing data quality frameworks.
Ability to leverage LLMs and Copilot solutions to enhance data platform productivity.
Experience in building self-healing architecture for data pipelines.
Hands-on experience with data pipeline development and optimization
Deep knowledge of data governance frameworks and tools, including Databricks Unity Catalog, to ensure data quality, security, and compliance.
Why join Genpact?
Lead AI-first transformation – Build and scale AI solutions that redefine industries
Make an impact – Drive change for global enterprises and solve business challenges that matter
Accelerate your career – Gain hands-on experience, world-class training, mentorship, and AI certifications to advance your skills
Grow with the best – Learn from top engineers, data scientists, and AI experts in a dynamic, fast-moving workplace
Committed to ethical AI – Work in an environment where governance, transparency, and security are at the core of everything we build
Thrive in a values-driven culture – Our courage, curiosity, and incisiveness - built on a foundation of integrity and inclusion - allow your ideas to fuel progress
Come join the 140,000+ coders, tech shapers, and growth makers at Genpact and take your career in the only direction that matters: Up.
Let’s build tomorrow together.
Vice President • Hyderabad, Telangana, India