Talent.com
Data Engineer

Confidential · Mumbai, Delhi
15 days ago
Job description

Roles and Responsibilities:

  • Design, build, and optimize scalable ETL pipelines using Apache Airflow or similar frameworks to process and transform large datasets efficiently.
  • Utilize Spark (PySpark), Kafka, Flink, or similar tools to enable distributed data processing and real-time streaming solutions.
  • Deploy, manage, and optimize data infrastructure on cloud platforms such as AWS, GCP, or Azure, ensuring security, scalability, and cost-effectiveness.
  • Design and implement robust data models, ensuring data consistency, integrity, and performance across warehouses and lakes.
  • Enhance query performance through indexing, partitioning, and tuning techniques for large-scale datasets.
  • Manage cloud-based storage solutions (Amazon S3, Google Cloud Storage, Azure Blob Storage) and ensure data governance, security, and compliance.
  • Work closely with data scientists, analysts, and software engineers to support data-driven decision-making, while maintaining thorough documentation of data processes.
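The first responsibility above, an Airflow-style ETL pipeline, can be sketched as plain Python extract/transform/load steps. This is a minimal illustration only: the function names and sample data are hypothetical, and in a real deployment each step would be an Airflow task (e.g. a `PythonOperator`) wired into a DAG.

```python
# Minimal ETL sketch in plain Python. In Apache Airflow, each function
# below would typically become a task in a DAG; all names and the
# sample data are hypothetical.

def extract():
    # Stand-in for reading from a source system (API, database, object store).
    return [
        {"user_id": 1, "amount": "10.50"},
        {"user_id": 2, "amount": "3.25"},
        {"user_id": 1, "amount": "7.00"},
    ]

def transform(rows):
    # Cast string amounts to floats and aggregate total spend per user.
    totals = {}
    for row in rows:
        totals[row["user_id"]] = totals.get(row["user_id"], 0.0) + float(row["amount"])
    return totals

def load(totals, sink):
    # Stand-in for writing results to a warehouse or lake table.
    sink.update(totals)

warehouse = {}
load(transform(extract()), warehouse)
print(warehouse)  # {1: 17.5, 2: 3.25}
```

Keeping each stage a pure function with explicit inputs and outputs is what makes the pipeline easy to schedule, retry, and test once it is lifted into an orchestrator.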

Ideal candidate

  • Strong proficiency in Python and SQL, with additional experience in languages such as Java or Scala.
  • Hands-on experience with frameworks like Spark (PySpark), Kafka, Apache Hudi, Iceberg, Apache Flink, or similar tools for distributed data processing and real-time streaming.
  • Familiarity with cloud platforms like AWS, Google Cloud Platform (GCP), or Microsoft Azure for building and managing data infrastructure.
  • Strong understanding of data warehousing concepts and data modeling principles.
  • Experience with ETL tools such as Apache Airflow or comparable data transformation frameworks.
  • Proficiency in working with data lakes and cloud-based storage solutions like Amazon S3, Google Cloud Storage, or Azure Blob Storage.
  • Expertise in Git for version control and collaborative coding.
  • Expertise in performance tuning for large-scale data processing, including partitioning, indexing, and query optimization.
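The partitioning and query-optimization expertise listed above can be illustrated with a toy example: records stored under Hive-style `dt=YYYY-MM-DD` partition keys (as a lake layout might look on S3 or GCS), where a date filter prunes whole partitions instead of scanning every record. The layout and data here are hypothetical.

```python
from collections import defaultdict

# Toy partitioned store: Hive-style "dt=YYYY-MM-DD" keys mapping to
# record lists, mimicking a date-partitioned lake layout on S3/GCS.
store = defaultdict(list)

def write(record):
    # Route each record into its date partition at write time.
    store[f"dt={record['date']}"].append(record)

for rec in [
    {"date": "2024-01-01", "value": 5},
    {"date": "2024-01-02", "value": 7},
    {"date": "2024-01-02", "value": 1},
]:
    write(rec)

def query_total(date):
    # Partition pruning: a filter on the partition column touches only
    # the one matching partition, not the whole store.
    return sum(r["value"] for r in store.get(f"dt={date}", []))

print(query_total("2024-01-02"))  # 8
```

Engines like Spark, Hive, and warehouse query planners apply the same idea at scale: a predicate on the partition column lets them skip entire files or directories.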
Skills Required

Python, SQL, PySpark, Kafka, AWS, Google Cloud Platform, Apache Airflow
