Company Description
MeghGen Technologies is a trusted technology solutions partner specializing in data engineering, cloud-native applications, and cloud migrations. We design and implement end-to-end cloud solutions to help businesses modernize infrastructure, scale applications, and harness the power of their data. Our expertise extends to AI-powered innovations, including machine learning and generative AI applications, as well as products like the MeghGen AI Voicebot. Serving industries such as healthcare, retail, fintech, and banking, MeghGen empowers organizations to overcome technological challenges and drive measurable growth.
Role Description
This is a full-time hybrid role for a Senior Data Engineer, based in Bengaluru, with partial work-from-home flexibility. The Senior Data Engineer will be responsible for designing, building, and maintaining data pipelines, performing data modeling, and optimizing ETL processes. Additional responsibilities include collaborating with teams to enable robust data warehousing solutions and leveraging data analytics to generate insights that drive decision-making.
Key Responsibilities
As a Senior Data Engineer, you will be expected to take ownership of complex data initiatives and systems. Your responsibilities will include:
- Architecture & Design: Design, implement, and maintain scalable and resilient modern data architectures, such as Data Lakehouse or Data Mesh paradigms, on a cloud platform (AWS, GCP, or Azure).
- Data Pipeline Development: Develop, test, and optimize high-volume, high-performance ETL/ELT data pipelines for batch and real-time processing using technologies like Apache Spark, Kafka, or Flink.
- Data Warehousing: Lead the implementation and management of data warehousing solutions (e.g., Snowflake, Databricks, BigQuery, or Amazon Redshift) to support business intelligence and advanced analytics.
- Mentorship & Standards: Serve as a technical leader, mentoring junior engineers, conducting code reviews, and establishing engineering best practices, coding standards, and continuous integration/continuous deployment (CI/CD) pipelines.
- Data Quality & Governance: Implement and monitor data quality checks, security protocols, and metadata management strategies to ensure data reliability and compliance.
- Collaboration: Work closely with cross-functional teams, including Data Scientists, Analysts, and Product Managers, to understand data requirements and translate them into scalable technical solutions.
- Performance Optimization: Troubleshoot, debug, and optimize existing data systems and pipelines to improve performance, reliability, and cost-efficiency.
Technical Skills
- Experience: 5+ years of professional experience in a Data Engineering, BI Engineering, or similar role.
- Programming & SQL: Expert-level proficiency in at least one object-oriented programming language, preferably Python or Scala, and advanced expertise in SQL.
- Big Data Ecosystem: Extensive hands-on experience with Apache Spark (PySpark/Scala) and other components of the Big Data ecosystem.
- Cloud Platforms: Proven, deep experience with the data services of a major cloud provider (e.g., AWS S3, EMR, Glue, Lambda; GCP BigQuery, Dataflow, Cloud Storage; Azure Data Factory, Synapse).
- Workflow Orchestration: Significant experience with data workflow management tools like Apache Airflow, Prefect, or Dagster.
- Data Modeling: Solid understanding of data modeling principles, including dimensional modeling (Star/Snowflake schema), 3NF, and data vault modeling.
- Infrastructure: Experience with version control (Git) and familiarity with Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
- Education: Bachelor’s degree in Computer Science, Engineering, or a related quantitative field.