Data Engineer
Job Description :
Job Location : Hyderabad
We are seeking a hands-on Data Engineer with a strong focus on data ingestion to support the delivery of high-quality, reliable, and scalable data pipelines across our Data & AI ecosystem. This role is essential in enabling downstream analytics, machine learning, and business intelligence solutions by ensuring robust and automated data acquisition from various internal and external sources.
Key Responsibilities :
- Design, build, and maintain scalable and reusable data ingestion pipelines to onboard structured and semi-structured data from APIs, flat files, databases, and external systems.
- Work with Azure-native services (e.g., Data Factory, Azure Data Lake, Event Hubs) and tools like Databricks or Apache Spark for data ingestion and transformation.
- Develop and manage metadata-driven ingestion frameworks to support dynamic and automated onboarding of new sources.
- Collaborate closely with source system owners, analysts, and data stewards to define data ingestion specifications and implement monitoring/alerting on ingestion jobs.
- Ensure data quality, lineage, and governance principles are embedded into ingestion processes.
- Optimize ingestion processes for performance, reliability, and cloud cost efficiency.
- Support batch and real-time ingestion needs, including streaming data pipelines where applicable.
Technical Experience :
- 3+ years of hands-on experience in data engineering, with a specific focus on data ingestion or integration.
- Hands-on experience with Azure Data Services (e.g., ADF, Databricks, Synapse, ADLS) or equivalent cloud-native tools.
- Experience in Python (PySpark) for data processing tasks (bonus : SQL knowledge).
- Experience with ETL frameworks, orchestration tools, and API-based data ingestion.
- Familiarity with data quality and validation strategies, including schema enforcement and error handling.
- Good understanding of CI/CD practices, version control, and infrastructure-as-code (e.g., Terraform, Git).
- Bonus : Experience with streaming ingestion (e.g., Kafka, Event Hubs, Spark Structured Streaming).
(ref : hirist.tech)