About the Role
Phonologies is seeking a hands-on ML Engineer who bridges data engineering and machine learning, designing and implementing production-ready ML pipelines that are reliable, scalable, and automation-driven.
You'll own the end-to-end workflow: from data transformation and pipeline design to model packaging, fine-tuning preparation, and deployment automation.
This is not a Data Scientist or MLOps role; it's a data-focused engineering position for someone who understands ML systems deeply, builds robust data workflows, and develops the platforms that power AI in production.
Role & responsibilities
Machine Learning Pipelines & Automation:
- Design, deploy, and maintain end-to-end ML pipelines.
- Build data transformation layers (Bronze, Silver, Gold) to enable structured, reusable workflows.
- Automate retraining, validation, and deployment using Airflow, Kubeflow, or equivalent tools.
- Implement Role-Based Access Control (RBAC) and enterprise-grade authorization.
LLM & GenAI Readiness:
- Prepare datasets for LLM fine-tuning: tokenization, formatting, and quality filtering.
- Support LangChain / RAG integration and automate embeddings preparation for GenAI applications.
API & Platform Architecture:
- Develop robust APIs for model serving, metadata access, and pipeline orchestration.
- Participate in platform design and architecture reviews, contributing to scalable ML system blueprints.
- Create monitoring and observability frameworks for performance and reliability.
Cloud & Deployment:
- Deploy pipelines across cloud (AWS, Azure, GCP) and on-prem environments, ensuring scalability and reproducibility.
- Collaborate with DevOps and platform teams to optimize compute, storage, and workflow orchestration.
Collaboration & Integration:
- Work with Data Scientists to productionize models, and with Data Engineers to optimize feature pipelines.
- Integrate Firebase workflows or data event triggers where required.
Preferred candidate profile
Experience: 5+ years in Data Engineering or ML Engineering, with proven experience in:
- Building data workflows and ML pipelines
- Packaging and deploying models using Docker & CI/CD
- Preparing data for LLM fine-tuning and generative AI pipelines
- Designing platform and API architecture for ML systems
Technical Skills:
- Programming & ML: Python, SQL, scikit-learn, XGBoost, LightGBM
- Data Engineering & Cloud Pipelines: large-scale preprocessing, containerized ETL (Docker, Airflow, Kubernetes), workflow automation
- Data Streaming & Integration: Apache Kafka, micro-batch and real-time ingestion
- ML Lifecycle & Orchestration: MLflow, DagsHub, Databricks, A/B testing, modular ML system design
- API & Platform Development: FastAPI, Flask, RESTful APIs, architecture planning
- Data Governance, Privacy, Security & Access Control: schema registry, lineage tracking, secure data handling, audit logging, RBAC
- LLM & GenAI: fine-tuning prep, RAG, LangChain, LLMOps (LangSmith, vector DBs)
- AutoML & Optimization: PyCaret, H2O.ai, Google AutoML
- Model Monitoring & Automation: drift detection, retraining workflows, Airflow / Kubeflow automation
- Firebase & Tooling: Cloud Functions, Firestore models, Jenkins, Prefect, CI/CD automation
Education: Bachelor's or Master's in Computer Science, Machine Learning, or Information Systems.
Communication & Collaboration: translating technical concepts, business storytelling, cross-functional delivery.