About the Role:
We are seeking a highly skilled and experienced Data Engineer with deep expertise in Google Cloud Platform (GCP) to join our fast-growing team in Bangalore.
As a Data Engineer, you will be responsible for designing, building, and optimizing scalable data pipelines and architectures to support advanced analytics, reporting, and data-driven decision-making across the organization.
This role demands a strong understanding of distributed computing, cloud-native data tools, and best practices in data governance, performance optimization, and secure data handling.

Responsibilities:
- Design, implement, and maintain robust, scalable, and secure data pipelines on GCP.
- Build data architectures that efficiently process structured and unstructured data from multiple sources including APIs, event streams, and databases.
- Leverage GCP services such as BigQuery, Dataflow, Pub/Sub, and Cloud Storage to architect and deploy data solutions.
- Develop ETL/ELT processes to transform raw data into clean, structured, and consumable datasets for data science and analytics teams.
- Optimize existing workflows for cost, performance, and scalability across large datasets using distributed computing tools like Apache Spark and Hadoop.
- Collaborate closely with data analysts, data scientists, and business stakeholders to understand data requirements and deliver high-quality solutions.
- Implement and maintain data quality checks, lineage tracking, and metadata management.
- Ensure compliance with data security and privacy standards throughout all data engineering operations.
- Contribute to the continuous improvement of data engineering practices using Agile methodologies.

Skills & Qualifications:
- Minimum 7 years of hands-on experience in Data Engineering, with proven success in cloud-based environments.
- Bachelor's degree in Computer Science, Information Technology, Engineering, or a related technical discipline.
- Strong expertise in Google Cloud Platform (GCP), particularly in working with BigQuery, Dataflow, Cloud Composer, and Cloud Storage.
- Proficient in SQL for data manipulation and transformation.
- Solid experience with Python or Scala for data processing and scripting.
- Experience with big data technologies like Apache Spark, Hadoop, and related tools.
- Deep understanding of data modeling, data warehousing concepts, and best practices.
- Familiarity with CI/CD pipelines, version control (Git), and orchestration tools like Airflow or Cloud Composer.
- Strong analytical and problem-solving skills, with the ability to work with complex datasets and extract meaningful insights.
- Excellent communication skills to collaborate effectively with cross-functional teams.
- Working knowledge of data security, compliance, and privacy best practices.

Good to Have:
- Experience with real-time data streaming using Apache Kafka or GCP Pub/Sub.
- Exposure to DevOps practices and Infrastructure as Code (IaC) using tools like Terraform or Deployment Manager.
- Familiarity with Agile/Scrum methodologies and collaborative project tracking tools (e.g., Jira, Confluence).