Key Responsibilities
- Develop and maintain scalable ETL/ELT data pipelines using Python and PySpark (a brief illustrative sketch follows this list).
- Write efficient and optimized SQL queries for data extraction, transformation, and analysis.
- Design and support data integration workflows across Vertica and Snowflake environments.
- Collaborate with data analysts, data scientists, and business stakeholders to gather data requirements and translate them into technical solutions.
- Perform light DBA activities such as table partitioning, indexing, user and access control, and performance tuning.
- Ensure data quality, consistency, and integrity across systems.
- Create and maintain technical documentation.
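
For illustration only, the sketch below shows the general shape of the kind of PySpark pipeline described above: extract raw files, apply cleaning and aggregation transforms, and load partitioned output for downstream warehouse ingestion. The paths, column names, and aggregation logic are hypothetical placeholders, not part of any actual system referenced in this posting.

```python
# Minimal illustrative PySpark ETL sketch (hypothetical paths and columns).
from pyspark.sql import SparkSession, functions as F

def run_pipeline(input_path: str, output_path: str) -> None:
    spark = SparkSession.builder.appName("orders_daily_etl").getOrCreate()

    # Extract: read raw order events (schema inferred here for brevity;
    # a production job would typically declare an explicit schema).
    raw = spark.read.option("header", True).csv(input_path)

    # Transform: drop malformed rows, cast types, aggregate revenue per day.
    orders = (
        raw.dropna(subset=["order_id", "order_ts", "amount"])
           .withColumn("order_date", F.to_date("order_ts"))
           .withColumn("amount", F.col("amount").cast("double"))
    )
    daily_revenue = orders.groupBy("order_date").agg(
        F.sum("amount").alias("total_revenue"),
        F.countDistinct("order_id").alias("order_count"),
    )

    # Load: write partitioned Parquet for downstream warehouse ingestion
    # (e.g., staged loads into Snowflake or Vertica).
    daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(output_path)

    spark.stop()

if __name__ == "__main__":
    run_pipeline("s3://example-bucket/raw/orders/", "s3://example-bucket/curated/daily_revenue/")
```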
Required Skills & Qualifications
- 4–6 years of experience in data engineering or a similar role.
- Proficiency in Python for data manipulation and scripting.
- Hands-on experience with PySpark and AWS Glue for distributed data processing.
- Strong SQL skills with experience working on large datasets.
- Practical experience with Vertica and Snowflake (data modeling, query optimization, etc.).
- Understanding of basic DBA tasks, including backup/recovery, schema management, and monitoring.
- Familiarity with version control (e.g., Git).
- Strong problem-solving skills and the ability to work in a fast-paced, collaborative environment.
Skills Required
Python, SQL, Data Engineering, Data Modeling, PySpark, ETL