Location : Remote
Team : Applied AI / Data Science
Role Overview
As a Data Scientist at blkbox.ai, you will architect, implement, and scale data-driven and LLM-based systems that generate high-performing mobile gaming ads at production level. This role blends data science, multimodal AI, generative models, and LLM automation—focusing on retrieval pipelines, embedding-based similarity search, and production-grade model operations.
You will design and deploy end-to-end LLM workflows—from prompt design and model orchestration to RAG pipelines and vector database integration—enabling automated ad creation.
Key Responsibilities
LLM System Design, Data Intelligence & Workflow Engineering
- Architect LLM-powered workflows for narrative generation, scripting, voiceovers, storylines, and other ad concepts
- Build data-driven logic and model selection strategies informed by creative performance and mobile gaming outcomes
- Run structured data analysis using Python and SQL to identify creative attributes correlated with winning ad performance
- Integrate multimodal data sources into creative intelligence pipelines
LLM, RAG & Vector Retrieval
- Use LLM APIs such as OpenAI GPT and Google Gemini for scripted and generative workflows
- Build retrieval-augmented generation (RAG) pipelines for:
  - story / voiceover generation
  - gameplay retrieval
  - creative asset selection
- Work with modern embedding models (OpenAI, sentence-transformers, etc.)
- Implement vector search via Pinecone, Weaviate, Chroma or FAISS
- Experiment with prompt engineering, prompt templates, function calling, and structured prompting
Production AI, Monitoring & Optimization
- Collaborate with backend engineers to integrate LLM components into production pipelines
- Monitor and optimize LLM workflows for:
  - latency
  - token usage and cost
  - throughput
  - failure rates
  - hallucinations
- Evaluate multiple LLMs and embeddings for performance, relevance, and cost efficiency
- Implement reproducible workflows, model evaluation systems, and reliability checks at scale
Cross-Functional Collaboration
- Partner with production and creative teams to translate creative requirements into systematic generative pipelines
- Present technical recommendations clearly to product, engineering, and creative stakeholders
- Own experiments end-to-end including benchmarking, metrics, prototype automation, and deployment
Required Skills & Qualifications
Technical
- Strong proficiency in Python and SQL
- Hands-on experience with LLMs, RAG concepts, embeddings, and vector search
- Experience calling model APIs (OpenAI, Gemini, etc.)
- Familiarity with LangChain, LlamaIndex or similar frameworks is a plus
- Strong understanding of:
  - LLM evaluation frameworks
  - prompt engineering patterns
  - latency and cost optimization
  - model hallucination control
- Solid foundation in statistics, experimentation, and ML deployment workflows
Soft Skills & Behaviors
- Ability to communicate complex technical concepts to non-technical partners
- Curious, inventive, and motivated by experimentation and rapid iteration
- Adaptability in fast-moving production environments
- Strong ownership mindset and comfort working cross-functionally
Why This Role Is Unique
- You’ll be building real production LLM systems, not just prototypes
- You’ll work across creative, multimodal, and generative domains
- You’ll influence automated ad production for some of the largest mobile gaming companies in the world