Location : Bengaluru
Experience : 2+ years
Key Responsibilities :
- Design and Development : Design, build, and maintain robust, scalable, and optimized ETL / ELT data pipelines using Python and standard data warehousing principles.
- Data Platform Management : Implement and manage data processing jobs and workflows within the Databricks platform, leveraging Spark (PySpark) for large-scale data transformation (a minimal sketch of this pattern follows this list).
- Data Modeling and Querying : Develop complex, efficient SQL queries and stored procedures, and optimize database performance; contribute to the design and implementation of dimensional and relational data models.
- Code Quality and Automation : Ensure high standards of code quality, testing, and documentation; implement CI / CD practices and automation for data workflows.
- Collaboration : Work closely with Data Scientists, Analysts, and other software engineers to understand data requirements and deliver solutions that meet business needs.
- Monitoring and Optimization : Monitor data pipeline performance, troubleshoot issues, and implement necessary optimizations to ensure data accuracy and timely delivery.
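
By way of illustration, here is a minimal sketch of the kind of PySpark job the Databricks responsibility above describes. All paths, table names, and columns are hypothetical, not part of this role's actual stack.

    # Hypothetical sketch of a daily ETL job on Databricks: read raw orders,
    # apply simple transformations, and append the result to a Delta table.
    # All paths, table names, and columns are illustrative only.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("orders_etl").getOrCreate()

    # Extract: load raw landing data (hypothetical path).
    raw = spark.read.json("/mnt/landing/orders/")

    # Transform: normalize types, derive columns, drop rows missing a key.
    clean = (
        raw.withColumn("order_ts", F.to_timestamp("order_ts"))
           .withColumn("order_date", F.to_date("order_ts"))
           .withColumn("revenue", F.col("quantity") * F.col("unit_price"))
           .filter(F.col("order_id").isNotNull())
    )

    # Load: append to a partitioned Delta table for downstream querying.
    (clean.write
          .format("delta")
          .mode("append")
          .partitionBy("order_date")
          .saveAsTable("analytics.fact_orders"))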
Required Qualifications :
- Experience : Minimum of 2 years of professional experience in a Data Engineering, BI Engineering, or similar role.
- Technical Expertise (Must-Haves) :
  - Expert proficiency in writing complex, optimized SQL queries and experience with various relational databases (e.g., PostgreSQL, SQL Server, MySQL).
  - Expert proficiency in Python for data manipulation, scripting, and pipeline development (experience with libraries like Pandas and NumPy is a plus).
  - Demonstrated experience and strong working knowledge of the Databricks platform, including Delta Lake, Spark (PySpark), and notebook-based development.
- Data Concepts : Solid understanding of data warehousing concepts, data modeling (e.g., Star Schema, Snowflake Schema), and ETL / ELT processes (a star-schema query is sketched after this list).
- Communication : Excellent communication and collaboration skills, with the ability to explain complex technical concepts to non-technical stakeholders.
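
To illustrate the star-schema modeling and querying named above, here is a hedged sketch of a dimensional query run from a Databricks notebook. The fact and dimension table names are assumptions for illustration, not a description of this team's warehouse.

    # Hypothetical star-schema query: one fact table joined to two dimension
    # tables and aggregated for reporting. All names are illustrative only.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("star_schema_demo").getOrCreate()

    daily_revenue = spark.sql("""
        SELECT d.calendar_date,
               c.customer_segment,
               SUM(f.revenue) AS total_revenue
        FROM   analytics.fact_orders  f
        JOIN   analytics.dim_date     d ON f.date_key     = d.date_key
        JOIN   analytics.dim_customer c ON f.customer_key = c.customer_key
        GROUP  BY d.calendar_date, c.customer_segment
        ORDER  BY d.calendar_date
    """)
    daily_revenue.show()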