Design, implement, and manage scalable ETL/ELT pipelines using AWS services and Databricks.
Ingest and process structured, semi-structured, and unstructured data from multiple sources into an AWS data lake or Databricks.
Develop advanced data processing workflows using PySpark, Databricks SQL, or Scala to enable analytics and reporting.
Configure and optimize Databricks clusters, notebooks, and jobs for performance and cost efficiency.
Design and implement solutions leveraging AWS-native services such as S3, Glue, Redshift, EMR, Lambda, Kinesis, and Athena.
Work closely with Data Analysts, Data Scientists, and other Engineers to understand business requirements and deliver data-driven solutions.
Optimize data pipelines, storage, and queries for performance, scalability, and reliability.
Ensure data pipelines are secure, robust, and monitored using CloudWatch, Datadog, or equivalent tools.
Maintain clear and concise documentation for data pipelines, workflows, and architecture.
Data Engineer • Bangalore, India