Experience : 8 to 12 years
Location : Bangalore
Job Description : Key Responsibilities
- Design, deploy and maintain automated invoice extraction pipelines using GCP Document AI.
- Develop custom model training workflows for documents with non-standard formats.
- Preprocess and upload document datasets to Cloud Storage.
- Label documents using DocAI Workbench or JSONL for training.
- Train and evaluate custom processors using AutoML or custom schema definitions.
- Integrate extracted data into downstream tools (e.g., BigQuery, ERP systems).
- Write robust, production-grade Python code for end-to-end orchestration.
- Maintain CI/CD deployment pipelines.
- Ensure secure document handling and compliance with data policies.
________________________________________
Required Skills & Experience
Strong hands-on experience with Google Cloud Platform, especially:
o Document AI
o Cloud Storage
o IAM
o Vertex AI (preferred)
o Cloud Functions / Cloud Run
Proficient in Python, including Google Cloud SDK libraries
Familiarity with OCR and schema-based information extraction
Understanding of security best practices for handling financial documents
________________________________________
Preferred Qualifications
Previous projects involving invoice or document parsing
Familiarity with BigQuery for analytics / reporting