Job Description:
Key Responsibilities:
- Design, develop, and maintain robust ETL pipelines for large and complex datasets.
- Write and optimize SQL and PL/pgSQL queries in PostgreSQL, ensuring efficiency in high-concurrency environments.
- Manage and scale environments with 5000+ processor instances in PostgreSQL or equivalent setups.
- Ingest, process, and validate data from multiple sources while ensuring data integrity, consistency, and availability.
- Monitor ETL workflows, identify bottlenecks, and apply performance tuning techniques.
- Collaborate with data architects, analysts, and business stakeholders to define and deliver data requirements.
- Implement and maintain data quality, validation, and reconciliation checks across systems (a minimal sketch follows this list).
- Document ETL processes, pipelines, and data architectures.
- Ensure compliance with data security, governance, and privacy standards.
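To make the ingestion, validation, and reconciliation responsibilities concrete, here is a minimal Python sketch of one batch ETL step. The file name, table name, connection settings, and the psycopg2 driver are illustrative assumptions, not requirements stated in this posting.

```python
# Minimal sketch of one batch ETL step: ingest a CSV, validate rows, load them
# into a PostgreSQL staging table, and reconcile row counts before committing.
# File name, table name, and connection settings below are hypothetical.
import csv

import psycopg2  # assumed client library; any PostgreSQL driver would do


def run_batch(csv_path: str = "orders.csv") -> None:
    valid, rejected = [], 0

    # Extract + validate: keep only rows with a numeric, non-negative amount.
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            try:
                amount = float(row["amount"])
            except (KeyError, ValueError):
                rejected += 1
                continue
            if amount < 0:
                rejected += 1
                continue
            valid.append((row["order_id"], amount))

    # Load with parameterized SQL; the connection context manager commits on success.
    conn = psycopg2.connect(host="localhost", dbname="warehouse", user="etl")  # hypothetical DSN
    with conn, conn.cursor() as cur:
        cur.execute("TRUNCATE staging_orders")
        cur.executemany(
            "INSERT INTO staging_orders (order_id, amount) VALUES (%s, %s)", valid
        )
        # Reconcile: the staging table must contain exactly the validated rows.
        cur.execute("SELECT count(*) FROM staging_orders")
        if cur.fetchone()[0] != len(valid):
            raise RuntimeError("reconciliation failed: row counts do not match")

    print(f"loaded={len(valid)} rejected={rejected}")


if __name__ == "__main__":
    run_batch()
```

In a production pipeline a step like this would typically be orchestrated by a tool such as NiFi or Airflow, with the rejected-row count feeding the workflow monitoring described above.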
Required Skills & Experience:
- 5+ years of ETL development experience with complex data workflows.
- Strong hands-on expertise in Apache NiFi (must-have).
- Advanced experience with PostgreSQL (query optimization, scaling in massively parallel systems).
- Proven ability to work in large-scale environments (e.g., 5000+ processor instances).
- Proficiency in SQL, PL/pgSQL, and performance tuning techniques (a query-tuning sketch follows this section).
- Hands-on experience with ETL tools such as Talend, Informatica, Airflow, etc.
- Familiarity with big data ecosystems (Hadoop, Spark, Kafka) is a plus.
- Strong knowledge of data modeling, warehousing, and governance principles.
- Excellent problem-solving, debugging, and analytical skills.
Preferred Qualifications:
- Experience with cloud platforms (AWS, GCP, Azure).
- Exposure to DevOps and CI/CD practices for data pipelines.
- Hands-on experience with real-time streaming data processing.
- Knowledge of scripting languages (Python, Bash, etc.).
Education:
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
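As a companion to the SQL and performance-tuning items above, the sketch below inspects a query plan with EXPLAIN (ANALYZE, BUFFERS) before and after adding an index. The table, column, and connection details are hypothetical, not drawn from this posting.

```python
# Minimal query-tuning sketch: compare the plan for a filter query before and
# after adding an index. Table, column, and connection details are hypothetical.
import psycopg2  # assumed client library

QUERY = "SELECT order_id, amount FROM staging_orders WHERE amount > %s"


def show_plan(cur, params):
    # EXPLAIN ANALYZE runs the query and returns the executed plan, line by line.
    cur.execute("EXPLAIN (ANALYZE, BUFFERS) " + QUERY, params)
    for (line,) in cur.fetchall():
        print(line)


with psycopg2.connect(host="localhost", dbname="warehouse", user="etl") as conn:  # hypothetical DSN
    with conn.cursor() as cur:
        show_plan(cur, (100.0,))  # baseline: likely a sequential scan on a large table
        cur.execute(
            "CREATE INDEX IF NOT EXISTS idx_staging_orders_amount "
            "ON staging_orders (amount)"
        )
        show_plan(cur, (100.0,))  # re-check whether the planner now uses the index
```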
(ref: hirist.tech)