Job Title: Data Engineer
Experience: 5 to 7 years
Budget: 17 to 25 LPA
Location: Bangalore
Type: Full-Time
- Design and Development:
- Design, develop, and maintain scalable and robust data pipelines (ETL/ELT) using PySpark on the Databricks platform.
- Implement data ingestion strategies to load data from various sources (e.g., streaming, APIs, databases, files) into the data lake or data warehouse.
- Write optimized and complex SQL queries for data extraction, transformation, and loading.
- Platform & Optimization:
- Apply expertise in a cloud platform (AWS, Azure, or GCP) across cloud-native data services, storage solutions (e.g., S3, ADLS, GCS), and compute resources.
- Optimize and fine-tune Spark jobs and Databricks clusters for performance, cost efficiency, and reliability.
- Ensure data quality, integrity, and security across all data lifecycle stages.
- Collaboration & Operations:
- Collaborate with Data Scientists, Analysts, and other engineering teams to understand data requirements and deliver solutions that meet business needs.
- Implement and maintain CI/CD pipelines and version control (e.g., Git) for data pipeline deployments.
- Monitor and troubleshoot production data pipelines and proactively address issues.