Job Title: Data Engineer
Experience Required: Minimum 1–2 Years
Location: Vapi, Gujarat
Employment Type: Full-time
About the Role
We are seeking a skilled and motivated Data Engineer to join our growing technology team. The
role involves building and maintaining scalable, reliable, and secure data infrastructure to support
analytics, data-driven decision-making, and AI/ML pipelines.
You’ll work with diverse data types and modern data platforms to design efficient data pipelines and ensure smooth data flow across systems.
Key Responsibilities
- Design, develop, and maintain robust ETL/ELT pipelines for structured and unstructured data using tools like Apache NiFi, Airflow, or Dagster.
- Build streaming and event-driven data pipelines using Kafka, RabbitMQ, or similar systems.
- Design and manage scalable data lakes (e.g., Apache Hudi, Iceberg, Delta Lake) over Amazon S3 or MinIO.
- Implement and optimize distributed databases such as Cassandra, MongoDB, ClickHouse, and Elasticsearch.
- Ensure data quality, monitoring, and observability across all data pipeline components.
- Work with query engines like Trino for federated data access.
- Manage data versioning and reproducibility using tools like DVC.
- Perform data migrations, query optimization, and system performance tuning.
- Collaborate with analytics, product, and AI teams to provide clean and well-structured datasets.
Must-Have Skills & Experience
- Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.
- 1–2 years of experience as a Data Engineer or in a similar role.
- Strong proficiency in Python and SQL.
- Hands-on experience with ETL orchestration tools (Airflow, NiFi, Dagster).
- Familiarity with data lakes, streaming platforms, and distributed databases.
- Experience working with cloud/object storage (Amazon S3, MinIO).
- Knowledge of data governance, security, and pipeline observability.
Good-to-Have Skills
- Experience with time-series databases (InfluxDB, TimescaleDB, QuestDB).
- Familiarity with graph databases (Neo4j, OrientDB, or RavenDB).
- Understanding of MLOps, feature stores, or data lifecycle automation.
- Exposure to Elasticsearch for indexing and search use cases.
- Experience in query performance tuning and data migration strategies.