Senior Software Engineer, Remote
The Senior Software Engineer, Data Ingestion will be a critical individual contributor responsible for designing collection strategies and for developing and maintaining robust, scalable data pipelines. This role sits at the heart of our data ecosystem, delivering new analytical software solutions that provide timely, accurate, and complete data for insights, products, and operational efficiency.
Key Responsibilities:
- Design, develop, and maintain high-performance, fault-tolerant data ingestion pipelines using Python.
- Integrate with diverse data sources (databases, APIs, streaming platforms, cloud storage, etc.).
- Implement data transformation and cleansing logic during ingestion to ensure data quality.
- Monitor and troubleshoot data ingestion pipelines, identifying and resolving issues promptly.
- Collaborate with database engineers to optimize data models for fast consumption.
- Evaluate and propose new technologies or frameworks to improve ingestion efficiency and reliability.
- Develop and implement self-healing mechanisms for data pipelines to ensure continuity.
- Define and uphold SLAs and SLOs for data freshness, completeness, and availability.
- Participate in on-call rotation as needed for critical data pipeline issues.
Key Skills:
- 5+ years of experience, ideally with a background in Computer Science, working in software product companies.
- Extensive Python Expertise: Deep experience developing robust, production-grade applications with Python.
- Data Collection & Integration: Proven experience collecting data from various sources (REST APIs, OAuth, GraphQL, Kafka, S3, SFTP, etc.).
- Distributed Systems & Scalability: Strong understanding of distributed systems concepts, designing for scale, performance optimization, and fault tolerance.
- Cloud Platforms: Experience with major cloud providers (AWS or GCP) and their data-related services (e.g., S3, EC2, Lambda, SQS, Kafka, Cloud Storage, GKE).
- Database Fundamentals: Solid understanding of relational databases (SQL, schema design, indexing, query optimization). OLAP database experience is a plus (e.g., Hadoop).
- Monitoring & Alerting: Experience with monitoring tools (e.g., Prometheus, Grafana) and setting up effective alerts.
- Version Control: Proficiency with Git.
- Containerization (Plus): Experience with Docker and Kubernetes.
- Streaming Technologies (Plus): Experience with real-time data processing using Kafka, Flink, or Spark Streaming.