Sr Data Scientist - Innovation Lab

Genzeon Global, Hyderabad, India
Job description

Sr Data Scientist

Job Responsibilities :

  • LLM Architecture : Good understanding of the architecture underlying large language models, such as Transformer-based models and their variants. Design and implement deep learning model architectures using PyTorch.
  • Language Model Training and Fine-Tuning : Experience in training large-scale language models from scratch, as well as fine-tuning pre-trained models on domain data.
  • Data Preprocessing for NLP : Skilled in preprocessing textual data, including tokenization, stemming, lemmatization, and handling of different text encodings.
  • Transfer Learning and Adaptation : Proficiency in applying transfer learning techniques to adapt existing LLMs to new languages, domains, or specific business needs.
  • Data Annotation and Evaluation : Skills in designing and implementing data annotation strategies for training LLMs and evaluating their performance using appropriate metrics.
  • Scalability and Deployment : Experience in scaling LLMs for production environments, ensuring efficiency and robustness in deployment.
  • Model Training, Optimization, and Evaluation : Cover the complete cycle of training, fine-tuning, and validating language models. Evaluate the performance of PyTorch models using appropriate metrics and techniques such as cross-validation, holdout sets, or online evaluation. You will design and adapt LLMs for use in virtual assistants, information retrieval, information extraction, and similar applications.
  • Experimentation with Emerging Technologies and Methods : Actively exploring new technologies and methodologies in language model development, including experimental frameworks and software tools.
  • LLM Alignment : Understanding of algorithms such as DPO, PPO, KPO, and RLHF, and of how to use them to build guardrails.
  • AI Data Retrieval : Retrieve data from unstructured sources and extract key-value pairs using techniques such as Donut, LayoutLM, and Table Transformer.
  • Analyze data and perform exploratory data analysis (EDA) to identify data patterns.
  • Hands-on experience with, and a strong understanding of, concepts in Deep Learning and NLP.
  • Proficient in TensorFlow and similar libraries.

Required Qualifications

  • 5+ years of hands-on experience in developing and deploying large language models and machine learning systems, and in working with PyTorch.
  • A thorough understanding of machine learning, particularly deep learning techniques, including knowledge of neural network architectures, training methods, and optimization algorithms.
  • Proficiency in Python and AI technologies, including experience with NLP libraries (e.g., Hugging Face Transformers, NLTK, spaCy) and text classification.
  • Experience with frameworks : PyTorch or TensorFlow.
  • Experience with cloud services (AWS, Azure) and with Docker for ML deployment.
  • Familiarity with model fine-tuning and optimization techniques for LLMs.
  • Proven track record of innovative solutions in the field of LLMs.
  • Strong communication skills, with the ability to explain complex AI concepts to non-expert audiences.

Additional Good-to-Have Qualifications

  • 4+ years' experience in data analytics, data science, or quantitative analysis, using statistical programming languages to draw insights from large data sets.
  • 3+ years' experience in Python development, preferably delivering production code for data applications.
  • Experience with unstructured data or computer vision models is a plus.
  • Experience with SQL is a big plus.
  • Extensive model implementation experience using scikit-learn.
  • Experience designing and developing security-critical applications; experience with the specifics of HIPAA / PHI / PII / GDPR is a big plus.
  • Basic experience with Linux, Git, and Jupyter Notebooks is a must.
  • Knowledge of Agile development practices.
  • Flexibility and adaptability to respond to a rapidly changing environment.
  • Experience with distributed computing techniques and with job orchestration tools and platforms (e.g., Airflow) is very valuable.