Job Description
AI / ML Engineer
Job Summary
We are looking for a skilled AI / ML Engineer with 3-4 years of experience in building and deploying scalable machine learning and Generative AI solutions. The ideal candidate has a strong foundation in core AI / ML concepts, deep learning, and NLP, along with hands-on experience with Generative AI frameworks.
Who We Are Looking For
An enthusiastic self-starter who works with minimal guidance, has strong Python programming skills, and has hands-on experience deploying end-to-end AI / ML solutions. A first-principles understanding of databases, deep learning, and LLMs is also required.
Required Skills and Qualifications
1. 3+ years of work experience in Python programming for AI / ML, deep learning, and Generative AI model development
2. Proficiency in TensorFlow / PyTorch, Hugging Face Transformers, and LangChain libraries
3. Hands-on experience with NLP, LLM prompt design and fine-tuning, embeddings, vector databases, and agentic frameworks
4. Strong understanding of ML algorithms, probability and optimization techniques
5. Experience deploying models with Docker, Kubernetes, and cloud services (AWS Bedrock, SageMaker, GCP Vertex AI) through APIs, and using MLOps and CI / CD pipelines
6. Familiarity with retrieval-augmented generation (RAG), cache-augmented generation (CAG), retrieval-integrated generation (RIG), and low-rank adaptation (LoRA) fine-tuning
7. Ability to write scalable, production-ready ML code and optimize model inference
8. Experience developing ML pipelines for text classification, summarization, and chat agents
9. Prior experience with SQL and NoSQL databases, and with Snowflake / Databricks
What You Will Be Doing
As an AI / ML Engineer in the AI Center of Excellence (COE) at DataZymes, you will be developing and maintaining end-to-end AI solutions to business problems. Key responsibilities are listed below:
1. Design, develop, and deploy AI / ML models for various applications, including NLP and structured data
2. Implement, prompt, fine-tune, and optimize Generative AI models, such as LLMs and diffusion models
3. Develop and maintain data pipelines, feature engineering workflows, and model training infrastructure
4. Work with large-scale datasets, ensuring data quality and pre-processing for training and inference
5. Optimize model performance, reduce latency, and improve efficiency for real-time AI applications
6. Deploy models using MLOps best practices on cloud platforms (AWS, GCP)
7. Collaborate with cross-functional teams, including data engineers, software developers, and product managers, to develop AI solutions to business problems
8. Implement responsible AI practices, ensuring fairness, interpretability, and ethical AI adoption
Requirements
Key Responsibilities
1. Pipeline Development: Design, build, and maintain efficient and scalable ETL / ELT pipelines on the Databricks platform using PySpark, SQL, and Delta Live Tables (DLT)
2. Lakehouse Management: Implement and manage data solutions within the Databricks Lakehouse Platform, ensuring best practices for data storage, governance, and management using Delta Lake and Unity Catalog
3. Code Optimization: Write high-quality, maintainable, and optimized PySpark code for large-scale data processing and transformation tasks
4. AI & ML Integration: Collaborate with data scientists to productionize machine learning models, utilizing Databricks AI features such as the Feature Store, MLflow for model lifecycle management, and AutoML for accelerating model development
5. Data Quality & Governance: Implement robust data quality checks and validation frameworks to ensure data accuracy, completeness, and reliability within the Delta tables
6. Performance Tuning: Monitor, troubleshoot, and optimize the performance of Databricks jobs, clusters, and SQL warehouses to ensure efficiency and cost-effectiveness
7. Collaboration: Work closely with data analysts, data scientists, and business stakeholders to understand their data requirements and deliver effective solutions
8. Documentation: Create and maintain comprehensive technical documentation for data pipelines, architectures, and processes
Required Qualifications & Skills
1. Experience: 3-5 years of hands-on experience in a data engineering role
2. Databricks Expertise: Proven, in-depth experience with the Databricks platform, including Databricks Workflows, Notebooks, Clusters, and Delta Live Tables
3. Programming Skills: Strong proficiency in Python and extensive hands-on experience with PySpark for data manipulation and processing
4. Data Architecture: Solid understanding of modern data architectures, including the Lakehouse paradigm, Data Lakes, and Data Warehousing
5. Delta Lake: Hands-on experience with Delta Lake, including schema evolution, ACID transactions, and time travel features
6. SQL Proficiency: Excellent SQL skills and the ability to write complex queries for data analysis and transformation
7. Databricks AI: Practical experience with Databricks AI / ML capabilities, particularly MLflow and the Feature Store
8. Cloud Experience: Experience working with at least one major cloud provider (AWS, Azure, or GCP)
9. Problem-Solving: Strong analytical and problem-solving skills with the ability to debug complex data issues
10. Communication: Excellent verbal and written communication skills
Preferred Qualifications
1. Databricks Certified Data Engineer Associate / Professional certification
2. Experience with CI / CD tools (e.g., Jenkins, Azure DevOps, GitHub Actions) for data pipelines
3. Familiarity with streaming technologies like Structured Streaming
4. Knowledge of data governance tools and practices within Unity Catalog
Engineer • Bangalore, India