Discover a challenging opportunity as a Red Teaming Specialist where you will leverage your analytical expertise to rigorously test and evaluate AI-generated content.
Key Responsibilities
- Design and execute Red Teaming exercises to uncover vulnerabilities in large language models (LLMs).
- Evaluate and stress-test AI prompts across diverse domains, identifying potential failure modes.
- Develop and apply test cases to assess accuracy, bias, toxicity, hallucinations, and misuse potential in AI responses.
- Collaborate with data scientists, safety researchers, and prompt engineers to report risks and suggest mitigations.
- Perform manual Quality Assurance and content validation, ensuring factual consistency, coherence, and guideline adherence.
- Create evaluation frameworks and scoring rubrics for prompt performance and safety compliance.
- Document findings, edge cases, and vulnerability reports with high clarity and structure.
Requirements
- Proven experience in AI red teaming, LLM safety testing, or adversarial prompt design.
- Familiarity with prompt engineering, NLP tasks, and ethical considerations in generative AI.
- Strong background in Quality Assurance, content review, or test case development for AI/ML systems.
- Understanding of LLM behaviors, failure modes, and model evaluation metrics.
- Excellent critical thinking, pattern recognition, and analytical writing skills.
- Ability to work independently, follow detailed evaluation protocols, and meet tight deadlines.
Preferred Qualifications
- Experience working with teams such as OpenAI, Anthropic, Google DeepMind, or other LLM safety initiatives.
- Background in risk assessment, red team security testing, or AI policy & governance.
Note: This role requires the ability to work effectively in a fast-paced environment and adapt to changing priorities.