Company Description
Invito Software Solutions is a leading software development company providing world-class web and mobile solutions efficiently and cost-effectively. We specialize in next-generation design patterns, responsive coding techniques, and rigorous quality assurance, resulting in high-quality apps with a high return on investment. Our scalable services and customized engagement models cater to businesses of all sizes, from innovative startups to well-established companies. We develop powerful and effective solutions that meet our clients' specific needs.
Project: Spring-Powered Desktop App for Invoice OCR → Clean CSV
What you’ll build
- Desktop Java application (Java 17+) backed by Spring Boot.
- Runs locally (offline-first); ships as a single native installer or fat JAR.
- UI options (pick one):
  - JavaFX UI using Spring for DI (FXML controllers wired via Spring).
  - Swing UI with Spring-managed services.
  - Embedded local web UI: Spring Boot serves a UI (e.g., Vaadin or lightweight HTML/TS), shown in an embedded browser (WebView/JxBrowser).
- Invoice ingestion: drag-and-drop folders/files; PDF, PNG/JPG; multipage; batch mode.
- AI/OCR pipeline (pluggable via Spring beans):
  - Local OCR (Tesseract) + layout/zone analysis, or
  - Cloud OCR (AWS Textract, Google Vision) with retry/backoff, or
  - LLM-assisted parsing to a JSON schema with guardrails.
- Field extraction (headers + line items): vendor, invoice #, dates, currency, taxes, subtotals/totals, PO, line descriptions, qty, unit price, amounts.
- Validation & review UI: show the document preview, highlight extracted zones, flag low-confidence fields, quick edits, autocomplete.
- CSV export: stable schema; normalize number/date/locale; export per file or batch.
- Rules & heuristics: vendor templates, regex fallbacks, learned patterns, per-vendor overrides.
- Quality metrics: confidence per field, accuracy dashboards, reject reasons, simple analytics.
- Offline by default, with optional cloud connectors for OCR/LLM and template sync.
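To make the "pluggable OCR, offline by default" idea concrete, here is a minimal sketch in plain Java (no Spring, so it stays self-contained). OcrService is a name used in this posting's architecture section; the extractText method, the FallbackOcrService wrapper, and the two stubbed providers are assumptions made up for illustration, not a prescribed design. In an actual build the providers would presumably be Spring beans selected by profile.

```java
import java.nio.file.Path;
import java.util.List;

// Hypothetical strategy interface: the posting names OcrService,
// but the method signature here is an assumption.
interface OcrService {
    String extractText(Path document);
}

// Hypothetical wrapper: tries each provider in order and returns
// the first result that does not fail with a runtime exception.
final class FallbackOcrService implements OcrService {
    private final List<OcrService> providers;

    FallbackOcrService(List<OcrService> providers) {
        if (providers.isEmpty()) {
            throw new IllegalArgumentException("at least one provider is required");
        }
        this.providers = List.copyOf(providers);
    }

    @Override
    public String extractText(Path document) {
        RuntimeException last = null;
        for (OcrService provider : providers) {
            try {
                return provider.extractText(document);
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the next provider
            }
        }
        throw new IllegalStateException("all OCR providers failed", last);
    }
}

final class OcrFallbackDemo {
    public static void main(String[] args) {
        // Stand-ins for a cloud provider that is unreachable and a local engine.
        OcrService cloud = doc -> {
            throw new IllegalStateException("network unavailable");
        };
        OcrService local = doc -> "local OCR text for " + doc.getFileName();

        OcrService ocr = new FallbackOcrService(List.of(cloud, local));
        System.out.println(ocr.extractText(Path.of("invoice-001.pdf")));
    }
}
```

Ordering the provider list decides the policy: cloud first with a local fallback, or a list containing only the local engine for a strictly offline profile. Retry and backoff for cloud calls are deliberately left out of this sketch.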
Architecture (Spring-centric)
- App launcher: desktop entry point that bootstraps SpringApplication.
- Core modules (Spring beans):
  - IngestionService: drag-and-drop, PDF/image decoding, page splitting.
  - OcrService (strategy): TesseractOcrService, TextractOcrService, GcvOcrService.
  - ParsingService: layout analysis, key-value detection, tables; optional LlmParsingService.
  - TemplateService: vendor profiles, regex rules, learned mappings; local cache + optional remote sync.
  - ValidationService: confidence scoring, anomaly detection, suggestions.
  - ExportService: CSV writer (stable schema, locale normalization).
  - MetricsService: capture confidence, errors, durations; local storage (SQLite/H2).
  - DocumentPreviewService: render + zone overlays (PDFBox + image layers).
- Persistence: Spring Data (SQLite/H2 on disk) for runs, templates, audits.
- Config: Spring profiles (offline, cloud), YAML config for OCR provider keys, thresholds, CSV schema.
- UI layer:
  - If JavaFX: controllers get beans via Spring (custom SpringFXMLLoader), reactive updates via ApplicationEventPublisher.
  - If Swing: @Component panels wired to services, event bus for updates.
  - If Vaadin (embedded web): served by Spring Boot; package app with an embedded browser window.
Key user flows
- Drop files/folders → Ingestion queue (progress bar, cancel/retry).
- OCR + Parsing → Field map + line items + confidence per field.
- Review screen → Document preview with highlight boxes, editable fields, low-confidence badges, autocomplete from templates.
- Approve/Reject → Approved invoices go to the export queue; rejected ones capture reasons.
- Export → Single CSV or per-invoice CSV; schema versioning; logs.
- Analytics → Success rate, average confidence, common reject reasons, vendor leaderboard.
CSV schema (stable, versioned)
- schemaVersion, fileId, vendorName, invoiceNumber, invoiceDateISO, dueDateISO, currency, subtotal, tax, total, poNumber, …
- Line items (denormalized or separate CSV): lineIndex, description, qty, unitPrice, amount.
- Locale-safe formatting (ISO dates, dot decimal).
Offline/Cloud strategy
- Offline: Tesseract + local templates; everything runs without internet.
- Cloud (optional): switch OCR/LLM beans via Spring profile/env; graceful fallback to offline if unavailable.
Packaging & ops
- Distribution: jpackage/native installers (Windows MSI, macOS DMG, Linux DEB/RPM) or fat JAR.
- Logging: Spring Boot logging; per-invoice audit trail; export logs.
- Updates: optional auto-update check (profile-gated).
Security & privacy
- Local processing by default; redact PII in logs; encrypted at-rest store (configurable).
- For cloud calls: minimal payloads, signed requests, regional endpoints.
Nice-to-haves
- Hotkeys and batch review UX.
- Template “learn” button: convert manual fixes into a saved vendor rule.
- Import/export of templates.
- Headless CLI mode: input dir → output CSV.
Deliverables
- Source code (Spring Boot project + UI layer).
- Packaged desktop installer(s).
- Sample vendor templates & test invoices.
- README + setup, profiles, and OCR provider docs.
- Test suite (unit + a few end-to-end fixtures).
- Short user guide (drop → review → export).
Qualifications
- Strong proficiency in Java and Spring Boot (DI, Spring Data, configuration, profiles).
- Experience building desktop UIs in JavaFX, Swing, or Vaadin (served by Spring) with responsive, user-friendly design.
- Skilled in troubleshooting, debugging, and performance tuning in Spring/Java.
- Familiar with OCR/AI integrations (Tesseract, Textract, Vision, OpenAI/Vertex) and robust parsing.
- Version control with Git; excellent communication; ability to work independently and remotely.
- Bachelor’s in CS/Engineering (or equivalent experience).
- Freelance/contract experience is a plus.