UNIScan — On‑device OCR Scanner
High‑speed scanning with on‑device OCR, native modules, and end‑to‑end encrypted results for privacy‑sensitive workflows.




Overview
Uniscan is a privacy-first mobile scanner for Android and iOS, built with Expo and React Native. It captures documents, cleans them up, recognizes text fully offline, and lets you sign and encrypt files before you share them. It ships with two OCR engines that both run on the device: Google ML Kit (models packaged with the app, no Play Services required) for Latin, Chinese, Devanagari, Japanese, and Korean, and Tesseract for Arabic and Cyrillic. A built-in encryption vault (AES-256-GCM) with optional biometric unlock keeps stored files encrypted by default and decrypted copies short-lived.
Core flows:
- Scan or import images/PDFs → auto-crop/enhance → OCR on-device → export as text/PDF/DOCX.
- Encrypt before sharing (.uenc envelope) or keep inside the app’s vault.
- Bring files into Uniscan from other apps via Android’s Share sheet (custom native bridge).
- Automatic wipe of decrypted previews, session auto-lock, and screenshot/recording prevention while sensitive screens are open.
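The engine split described in the overview can be pictured as a small routing function. This is only a sketch: the `Script` and `OcrEngine` types and the `pickEngine` name are hypothetical and are not taken from the app's source; only the script-to-engine assignment comes from the text above.

```typescript
// Hypothetical script identifiers; the real app may name these differently.
type Script =
  | "latin" | "chinese" | "devanagari" | "japanese" | "korean"
  | "arabic" | "cyrillic";

type OcrEngine = "mlkit" | "tesseract";

// Scripts the overview assigns to ML Kit; all others fall through to Tesseract.
const ML_KIT_SCRIPTS: ReadonlySet<Script> = new Set<Script>([
  "latin", "chinese", "devanagari", "japanese", "korean",
]);

// Pick the on-device engine for a given script.
function pickEngine(script: Script): OcrEngine {
  return ML_KIT_SCRIPTS.has(script) ? "mlkit" : "tesseract";
}
```

Keeping the mapping in one place means adding a new .traineddata pack only requires extending the `Script` type, since anything outside the ML Kit set already routes to Tesseract.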
Target audience
- Field teams & SMBs that need fast capture + OCR with no cloud dependency.
- Legal/finance/healthcare staff who must keep data resident on the device.
- Privacy-conscious individuals who want local-only scanning and encryption.
- Developers/IT needing a secure scanning tool that still plays well with MDM and enterprise sharing.
The challenge
Most mobile scanners either upload to a cloud for OCR or bundle SDKs that phone home. That fails strict compliance and breaks in low-connectivity environments. Teams needed reliable OCR for multiple scripts, strong encryption, and hardened UX (no accidental screenshots or lingering decrypted files) — all without sending data off the device.
Our approach
100% on-device pipeline with native glue where it matters:
- Dual OCR engines, offline:
  - ML Kit ships with its models inside the app bundle, so recognition runs locally and works without Google Play Services.
  - Tesseract (Android) runs through a custom native module (TesseractDirect) for Arabic/Cyrillic and other .traineddata packs. The app includes a tessdata bridge to point Tesseract at bundled language files.
- Defense-in-depth for clear data:
  - Vault encryption uses AES-256-GCM via react-native-quick-crypto, with keys in SecureStore and optional passphrase + biometrics.
  - Temporary decrypted files are wiped automatically when the app backgrounds; the session auto-locks after inactivity; screen capture is blocked on sensitive screens.
- Share-in / Share-out with control:
  - A custom ShareBridge (Android) reads incoming share intents so documents can be received privately.
  - Exports can be encrypted by default (.uenc envelope) before leaving the app, or shared as images/PDF/DOCX when policy allows.
- Predictable file layout (no surprises):
  - Private app directories such as uni-scans/ (captured images), uni-docs/ (imports/exports), and dedicated vault paths keep clear vs. encrypted data separated and manageable.
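The inactivity lock and wipe-on-background behavior listed above could be modeled roughly as follows. This is a minimal sketch under stated assumptions: the `SessionLock` class, its method names, and the timeout value are illustrative and not the app's actual implementation. The clock is injectable so the logic can be exercised without real timers.

```typescript
// Hypothetical inactivity lock; names and timeout are illustrative only.
class SessionLock {
  private readonly timeoutMs: number;
  private readonly now: () => number;
  private readonly onLock: () => void;
  private lastActivity: number;
  private locked = false;

  constructor(
    timeoutMs: number,
    now: () => number = () => Date.now(),
    onLock: () => void = () => {},
  ) {
    this.timeoutMs = timeoutMs;
    this.now = now;
    this.onLock = onLock;
    this.lastActivity = now();
  }

  // Call on any user interaction to keep the session alive.
  touch(): void {
    if (!this.locked) this.lastActivity = this.now();
  }

  // Call periodically (or on foreground) to enforce the timeout.
  check(): boolean {
    if (!this.locked && this.now() - this.lastActivity >= this.timeoutMs) {
      this.lockNow();
    }
    return this.locked;
  }

  // Call when the app backgrounds: lock once and run the wipe callback.
  lockNow(): void {
    if (this.locked) return;
    this.locked = true;
    this.onLock();
  }

  // Call after a successful biometric or passphrase unlock.
  unlock(): void {
    this.locked = false;
    this.lastActivity = this.now();
  }
}
```

In an app, `onLock` would be the natural place to delete temporary decrypted previews, and `lockNow` would be wired to the platform's app-state change events.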
The stack
- Framework & runtime: Expo 53, React Native 0.79, React 19, Expo Router.
- Camera & media: expo-camera, expo-image-manipulator, expo-media-library, expo-sharing, expo-document-picker.
- OCR:
  - @react-native-ml-kit/text-recognition (model packs for Latin/Chinese/Devanagari/Japanese/Korean packaged with the app).
  - Tesseract Android native module (TesseractDirect) + tessdata bridge; .traineddata bundled in assets/tessdata/.
- Security & crypto:
  - Vault in lib/cryptoVault.ts using AES-256-GCM (react-native-quick-crypto), with PBKDF2 key derivation when encrypting via passphrase; keys in Expo SecureStore; optional biometrics via expo-local-authentication.
  - Session lock, decrypted file wiper, screen-capture prevention (expo-screen-capture).
- Storage & OS bridges: expo-file-system, AsyncStorage, custom ShareBridge for Android share intents.
- Export & docs: docx, expo-print (PDF), plus image/PDF exporters; optional filename utilities and HTML export helpers.
- UI: Expo vector icons, Safe Area Context, bottom tabs; Skia available for image effects.
- Android config: minSdk 26 / targetSdk 35, ProGuard rules included; ML Kit + Tesseract plugins configure native dependencies at build time.
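To make the passphrase path under "Security & crypto" concrete, here is a hedged sketch of PBKDF2 key derivation followed by AES-256-GCM. It uses Node's `crypto` module, whose API react-native-quick-crypto aims to mirror, so inside the app the import would presumably come from that package instead. The iteration count, salt size, function names, and the `salt | iv | tag | ciphertext` layout are assumptions for illustration; they are not the contents of lib/cryptoVault.ts and not the actual .uenc format.

```typescript
import { createCipheriv, createDecipheriv, pbkdf2Sync, randomBytes } from "node:crypto";

// Illustrative parameters; the app's real values are not stated in the text.
const ITERATIONS = 210_000;
const SALT_LEN = 16;
const IV_LEN = 12; // 96-bit nonce, the size GCM is designed around
const TAG_LEN = 16;

// Hypothetical envelope layout: salt | iv | tag | ciphertext.
function encryptWithPassphrase(plain: Buffer, passphrase: string): Buffer {
  const salt = randomBytes(SALT_LEN); // fresh salt per file
  const iv = randomBytes(IV_LEN);     // fresh nonce per encryption
  const key = pbkdf2Sync(passphrase, salt, ITERATIONS, 32, "sha256");
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plain), cipher.final()]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([salt, iv, tag, ciphertext]);
}

function decryptWithPassphrase(envelope: Buffer, passphrase: string): Buffer {
  const salt = envelope.subarray(0, SALT_LEN);
  const iv = envelope.subarray(SALT_LEN, SALT_LEN + IV_LEN);
  const tag = envelope.subarray(SALT_LEN + IV_LEN, SALT_LEN + IV_LEN + TAG_LEN);
  const ciphertext = envelope.subarray(SALT_LEN + IV_LEN + TAG_LEN);
  const key = pbkdf2Sync(passphrase, salt, ITERATIONS, 32, "sha256");
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the tag does not verify (wrong passphrase or tampered data).
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}
```

Because GCM authenticates as well as encrypts, a wrong passphrase or a modified file fails loudly at `final()` instead of yielding corrupted plaintext, which suits a vault that must never show partially decrypted content.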
Results
- Reliable OCR without network for Latin, CJK, Devanagari, Arabic, Cyrillic (and extensible via tessdata).
- Compliance-friendly posture: no analytics SDKs, no background uploads, no third-party cloud OCR.
- Safer sharing with one-tap encryption, short-lived clear previews, and screenshot blocking.
- Smooth UX: tabbed navigation for Scanner, Documents, Editor, Sign & Encrypt, Export, Settings, Vault; device haptics; polished camera flow.