Design, train, and deploy custom OCR models leveraging Azure Document Intelligence and AWS Textract.
Automate document classification and data extraction from structured and unstructured inputs (PDFs, scanned images, invoices, etc.).
Build multilingual OCR workflows by integrating Azure Translator and apply NLP insights using AWS Comprehend.
Develop GenAI-powered solutions with AWS Bedrock and Azure OpenAI for semantic understanding, summarization, and contextual search of documents.
Utilize AWS Rekognition for image-based document analysis and identity verification.
Implement and optimize scalable pipelines for OCR output post-processing (JSON/CSV) using Python, Java, or .NET.
Collaborate with cross-functional teams to gather requirements and deliver tailored, production-grade solutions.
Ensure compliance with data privacy, security, and governance standards across cloud platforms.
Required Skills
8+ years of experience in OCR technologies, intelligent document processing, and enterprise-scale AI/ML solutions.
Expertise in Azure AI Services (Form Recognizer/Document Intelligence, Translator, OpenAI) and the AWS AI/ML stack (Textract, Comprehend, Rekognition, Bedrock).
Strong coding skills in Python, Java, or .NET with hands-on experience in post-processing OCR outputs.
Familiarity with AWS services (EC2, ECS, S3, Lambda, Step Functions) and Azure cloud environments.
Deep knowledge of document layout analysis (tables, forms, key-value pairs).
Experience with NLP tools, translation APIs, and GenAI model training, fine-tuning, and deployment.
Exposure to Consumer, Retail, and Logistics domain workflows (preferred).
Understanding of Terraform, CI/CD pipelines, DevOps practices, and Git (advantage).
Comfortable working in Agile projects, with excellent problem-solving and communication skills (client-facing role).
Bachelor’s degree in Computer Science, IT, or equivalent (preferred).