JOB OBJECTIVE
We are seeking a dynamic professional to fill our Data Engineering role, responsible for designing, developing, and maintaining the data architecture, infrastructure, and pipelines that enable Northern Arc to collect, store, process, and analyze large volumes of data. This role plays a crucial part in ensuring data availability, quality, and accessibility for data-driven decision-making.
KEY ACCOUNTABILITIES
Data Pipeline Development :
- Design, develop, and maintain data pipelines to ingest, process, and transform data from various sources into usable formats (a minimal orchestration sketch follows this list).
- Implement data integration solutions that connect disparate data systems, including databases, APIs, and third-party data sources.
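For illustration only, a pipeline of this kind might be orchestrated as a small Apache Airflow DAG (Airflow is named in the qualifications below); the source data, schedule, and transform logic here are hypothetical placeholders, not Northern Arc's actual stack:

```python
# A minimal sketch of a daily ingest/transform/load pipeline using Apache
# Airflow's TaskFlow API. Source, schema, and load target are hypothetical.
import pendulum
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=pendulum.datetime(2024, 1, 1), catchup=False)
def example_ingest_pipeline():
    @task
    def extract() -> list[dict]:
        # In practice this would call an API or read from a source database.
        return [{"id": 1, "amount": "42.50"}, {"id": 2, "amount": "17.00"}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        # Normalize types so downstream storage receives a consistent schema.
        return [{"id": r["id"], "amount": float(r["amount"])} for r in rows]

    @task
    def load(rows: list[dict]) -> None:
        # Placeholder for a write to a warehouse or lake table.
        print(f"loaded {len(rows)} rows")

    load(transform(extract()))

example_ingest_pipeline()
```

Splitting the work into small, independently retryable tasks is what makes scheduled pipelines like this observable and recoverable when a single step fails.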
Data Storage and Warehousing :
- Create and manage data storage solutions, such as data lakes, data warehouses, and NoSQL databases.
- Optimize data storage for performance, scalability, and cost-efficiency.
Data Quality and Governance :
- Establish data quality standards and implement data validation and cleansing processes (illustrated in the sketch below).
- Collaborate with data analysts and data scientists to ensure data consistency and accuracy.
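As a sketch of what a validation-and-cleansing step can look like in practice, the pandas rules below use invented column names and constraints; the real standards would be defined by this role:

```python
# A minimal validation-and-cleansing sketch. Column names ("loan_id",
# "disbursed_amount") and the rules themselves are hypothetical examples.
import pandas as pd

def validate_and_clean(df: pd.DataFrame) -> pd.DataFrame:
    # Remove duplicates and rows missing required fields rather than
    # loading them silently.
    df = df.drop_duplicates(subset=["loan_id"])
    df = df.dropna(subset=["loan_id", "disbursed_amount"])
    # Enforce a domain constraint: amounts must be positive.
    bad = df["disbursed_amount"] <= 0
    if bad.any():
        print(f"quarantining {bad.sum()} rows with non-positive amounts")
    return df[~bad]
```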
ETL (Extract, Transform, Load) :
- Develop ETL processes to transform raw data into a structured, usable format.
- Monitor and troubleshoot ETL jobs to ensure data flows smoothly (a retry-and-logging sketch follows).
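Monitoring and troubleshooting ETL jobs often comes down to making failures visible and retryable; this sketch assumes a hypothetical load_batch callable and uses only standard-library logging:

```python
# A minimal sketch of defensive ETL execution: retry a flaky load step with
# backoff, and log each failure so the job can be troubleshot afterwards.
# The load_batch callable is a hypothetical stand-in for a real load step.
import logging
import time

logger = logging.getLogger("etl")

def run_with_retries(load_batch, batch, attempts: int = 3, backoff_s: float = 5.0):
    for attempt in range(1, attempts + 1):
        try:
            return load_batch(batch)
        except Exception:
            logger.exception("load failed (attempt %d/%d)", attempt, attempts)
            if attempt == attempts:
                raise  # surface the failure to the scheduler / alerting
            time.sleep(backoff_s * attempt)
```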
Data Security and Compliance :
- Implement data security measures and access controls to protect sensitive data.
- Ensure compliance with data privacy regulations and industry standards (e.g., GDPR, HIPAA).
Performance Tuning :
- Optimize data pipelines and queries for improved performance and efficiency (see the sketch below).
- Identify and resolve bottlenecks in data processing.
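One common tuning pattern is pushing filtering and aggregation into the database engine rather than pulling raw rows into application code; the example below uses the standard-library sqlite3 module and invented table and column names purely for illustration:

```python
# A minimal sketch: aggregate in the engine instead of looping client-side.
# Table and column names ("payments", "region", "amount") are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (region TEXT, amount REAL)")
conn.executemany("INSERT INTO payments VALUES (?, ?)",
                 [("south", 42.5), ("south", 17.0), ("west", 8.25)])

# An index on the grouped/filtered column lets the engine avoid full scans.
conn.execute("CREATE INDEX idx_payments_region ON payments (region)")

# One aggregate query replaces fetching every row and summing in Python.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM payments GROUP BY region"
).fetchall()
print(totals)  # [('south', 59.5), ('west', 8.25)]
```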
Data Documentation :
- Maintain comprehensive documentation for data pipelines, schemas, and data dictionaries.
- Create and update data lineage and metadata documentation.
Scalability and Reliability :
- Design data infrastructure to scale with growing data volumes and business requirements.
- Implement data recovery and backup strategies to ensure data availability and resilience.
Collaboration :
- Collaborate with cross-functional teams, including data scientists, analysts, and business stakeholders, to understand data requirements and deliver data solutions.
QUALIFICATIONS, EXPERIENCE, & COMPETENCIES :
- 9+ years of experience in a similar role.
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
- Proven experience in data engineering, ETL development, and data integration.
- Proficiency in data pipeline orchestration tools (e.g., Apache NiFi, Apache Airflow).
- Strong knowledge of databases (SQL and NoSQL), data warehousing, and data modeling concepts.
- Familiarity with data processing frameworks (e.g., Hadoop, Spark) and cloud-based data services (e.g., AWS, Azure, GCP).
- Experience with version control systems (e.g., Git) and data versioning.
- Excellent programming skills in languages such as Python, SQL, Java, Scala, R, and/or Go.
- Knowledge of data security and compliance standards.
- Strong problem-solving and troubleshooting skills.
- Effective communication and teamwork abilities.