We're seeking a passionate and experienced Senior Python & Machine Learning Engineer to join our Data domain team.
You'll work on one-of-a-kind problems using advanced GenAI, large language models (LLMs), and modern data engineering frameworks.
You will help conceptualize and deliver impactful solutions that push the boundaries of data science and machine learning in finance.
Responsibilities:
- Design, develop, and deploy sophisticated machine learning and GenAI models to solve complex data problems at scale.
- Implement, optimize, and scale ML solutions using Databricks, Spark, and cloud-native data ecosystems (AWS / Azure / GCP).
- Collaborate with other engineers, product managers, and UX teams to build robust, high-performance Python-based analytics pipelines.
- Develop and fine-tune LLMs and generative AI applications for structured and unstructured financial data.
- Architect data processing workflows leveraging Delta Lake, Feature Stores, and MLOps best practices.
- Translate cutting-edge research (papers, new ML techniques) into production solutions.
- Mentor junior data scientists and engineers on ML, GenAI, and engineering best practices.
- Work on one-of-a-kind data challenges, including entity disambiguation, real-time risk analytics, NLP, graph data modeling, and anomaly detection.
- Keep up to date with the latest in ML tooling, GenAI, Databricks, and cloud data platforms.
Requirements:
- Bachelor's / Master's degree in Computer Science, Data Science, Mathematics, or a related field.
- 5-6 years of professional experience in ML, Python programming, and data engineering.
- Deep expertise in Python (NumPy, Pandas, PySpark, FastAPI, etc.) and ML frameworks (TensorFlow, PyTorch, Transformers).
- Practical experience with GenAI: training / fine-tuning LLMs (OpenAI, HuggingFace, Google Gemini, etc.), prompt engineering, and retrieval-augmented generation (RAG).
- Hands-on experience with Databricks (Workspace, MLflow, Delta Lake, Notebooks).
- Strong knowledge of cloud data platforms (AWS, Azure, GCP) and containerization (Docker, Kubernetes).
- Applied experience with ETL / ELT, data lakes, real-time streaming (Kafka, Spark Streaming).
- Proven track record of tackling cutting-edge data problems at scale; published research or open-source contributions a plus.
- Familiarity with modern MLOps toolchains (MLflow, Airflow, Feature Store, CI / CD).
- Effective communicator with excellent collaboration skills.
Tech Stack:
- Python, PySpark, FastAPI, Flask.
- TensorFlow, PyTorch, HuggingFace Transformers.
- Databricks, Delta Lake, MLflow.
- AWS / Azure / GCP (S3, Blob Storage, EC2, Lambda, Step Functions).
- SQL, NoSQL.
(ref: hirist.tech)