Experience required : 2–4 years of professional software engineering experience.
Role Overview :
You will design, build, and operate software for data collection and processing at scale. The role is hands‑on, with emphasis on clean design, reliability, and performance.
Key Responsibilities :
- Develop and maintain Python applications for crawling, parsing, enrichment, and processing of large datasets.
- Build and operate data workflows (ETL / ELT), including validation, monitoring, and error‑handling.
- Work with SQL and NoSQL (plus vector databases / data lakes) for modeling, storage, and retrieval.
- Contribute to system design using cloud‑native components on AWS (e.g., S3, Lambda, ECS / EKS, SQS / SNS, RDS / DynamoDB, CloudWatch).
- Implement and consume APIs / microservices; write clear contracts and documentation.
- Write unit / integration tests, perform debugging and profiling; contribute to code reviews and maintain high code quality.
- Implement observability (logging / metrics / tracing) and basic security practices (secrets, IAM, least privilege).
- Collaborate with Dev / QA / Ops; ship incrementally using PRs and design docs.
Required Qualifications :
- 2–4 years of professional software engineering experience.
- Strong proficiency in Python; good knowledge of data structures / algorithms and software design principles.
- Hands‑on with SQL and at least one NoSQL store; familiarity with vector databases is a plus.
- Experience with web scraping frameworks (e.g., Scrapy, Selenium / Playwright, BeautifulSoup) and resilient crawling patterns (respecting robots.txt, rotations, retries).
- Practical understanding of system design and distributed systems basics.
- Exposure to AWS services and cloud‑native design; comfortable on Linux and with Git.
Preferred / Good to Have (Prioritized) :
- GenAI & LLMs : experience with LangChain, CrewAI, LlamaIndex, prompt design, RAG patterns, and vector stores. (Candidates with this experience will be prioritized.)
- CI / CD & Containers : exposure to pipelines (GitHub Actions / Jenkins), Docker, and Kubernetes.
- Data Pipelines / Big Data : ETL / ELT, Airflow, Spark, Kafka, or similar.
- Infra as Code : Terraform / CloudFormation; basic cost and performance optimization on cloud.
- Frontend / JS : not required; basic JS or frontend skills are a nice‑to‑have only.
- Exposure to GCP / Azure.
How We Work :
- Ownership of modules end‑to‑end (design → build → deploy → operate).
- Clear communication, collaborative problem‑solving, and documentation.
- Pragmatic engineering : small PRs, incremental delivery, and measurable reliability.
Work‑from‑Home Requirements :
- High‑speed internet for calls and collaboration.
- A capable, reliable computer (modern CPU, 8GB+ RAM).
- Headphones with clear audio quality.
- Stable power and backup arrangements.

ForageAI is an equal‑opportunity employer. We value curiosity, craftsmanship, and collaboration.