Data Processing Guide
What data VaultPDF processes, where it flows, how long it is retained, and how it relates to GDPR and data-residency requirements.
Last updated: 2026-05-31. Version: 1.0.
For the visual architecture diagram see Architecture & Data Flow.
1. Data Categories Processed
| Data category | Examples | Processed by | Stored where |
|---|---|---|---|
| Document payload | JSON/YAML template data, field values, tables | The Dispatcher (render), The Vault Engine (seal) | Your SharePoint, your Azure Blob Storage |
| Rendered PDFs | Output documents, sealed .vpdf archives | The Vault Engine | Your SharePoint output library, your Azure Blob Storage |
| Audit events | Operation type, timestamp, correlation ID, status | The Vault Engine | Your Azure Blob Storage (append-only JSONL blob) |
| Approver identity | Approver email address (PII) | The Dispatcher (workflow) | Encrypted in Azure Table Storage (your subscription); hash in sealed PDF |
| Portal session | Short-lived HMAC token, IP address (optional), user agent | The Dispatcher | Ephemeral (in-memory and Azure Table Storage, 4-hour TTL) |
| License data | License key (string), tenant ID | The Dispatcher → VaultPDF Licensing API | Not stored by VaultPDF |
| Usage metrics | Monthly aggregates: esign count, delivery count, delivery email variant, delivery SMS variant | The Dispatcher (via nightly sync) | VaultPDF Licensing API (aggregate only; keyed by licenseKey) |
| Activity records | Workflow status, timestamps, initiating user UPN | The Dispatcher | Your SharePoint Activity list |
| Telemetry | Function execution logs, error traces (no document content) | Azure App Insights | Your Log Analytics workspace |
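To make the audit-event row concrete, the following is a minimal Python sketch of a single JSONL line carrying the four fields listed in the table. The field names (`operation`, `timestamp`, `correlationId`, `status`) are assumptions chosen for illustration and are not VaultPDF's actual schema.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Optional


def build_audit_event(operation: str, status: str, correlation_id: Optional[str] = None) -> dict:
    """Build one audit event with the four fields from the table (names are assumed)."""
    return {
        "operation": operation,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "correlationId": correlation_id or str(uuid.uuid4()),
        "status": status,
    }


def to_jsonl_line(event: dict) -> str:
    """Serialize an event as one JSONL record: a single JSON object plus a newline."""
    return json.dumps(event, separators=(",", ":")) + "\n"


# Hypothetical values; note that no document content appears in the record.
line = to_jsonl_line(build_audit_event("seal", "succeeded", "demo-correlation-id"))
```

Because each record is a self-contained line, an append-only store can accept new events without rewriting earlier ones.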
2. Data Flows
2.1 Render / Seal Flow
flowchart TD
A1(["👤 User - SharePoint SPFx command"])
A2(["🖥️ Source System - direct API call"])
A1 -->|"SPFx command"| B
A2 -->|"POST /api/render (Function Key · HTTPS)"| B
B["The Dispatcher<br/>func-dispatcher<br/>(your Azure subscription)"]
B --> C["Read template + payload<br/>SharePoint via Graph API · MSI"]
B --> D1["Validate entitlement<br/>Tier 1 - In-memory cache"]
D1 --> D2["Tier 2 - SharePoint license.vpdf"]
D2 --> D3["Tier 3 - Licensing API (fallback)"]
D3 --> D4["Tier 4 - Grace mode (API unavailable only)"]
B --> E["Enqueue render message<br/>Azure Service Bus queues · MSI"]
E --> F["The Vault Engine<br/>func-processor<br/>(your Azure subscription)"]
F --> G["Read template<br/>SharePoint via Graph API · MSI"]
F --> H["Render PDF<br/>in-memory - Ephemeral Processing Unit"]
H --> I["Upload PDF<br/>SharePoint output library<br/>Graph API · MSI"]
H --> J["Write audit event<br/>Azure Blob Storage JSONL · MSI"]
F --> K["Update Activity list<br/>SharePoint · MSI"]
The Dispatcher validates entitlement through local tiers first. It falls back to the VaultPDF Licensing API only when the cache and SharePoint entitlement file are missing or expired.
Document content flows only within your Microsoft 365 tenant and your Azure subscription.
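For the direct API path in the diagram, a source system might prepare the `POST /api/render` call roughly as follows. This is a sketch under assumptions: the host name and the payload fields are hypothetical, and only the `x-functions-key` header (the standard Azure Functions header for function keys) and the HTTPS requirement come from known behaviour or from the flow above.

```python
import json
import urllib.request


def build_render_request(base_url: str, function_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST /api/render call over HTTPS."""
    if not base_url.startswith("https://"):
        raise ValueError("The render endpoint should be called over HTTPS")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/render",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Azure Functions accepts the function key in this header.
            "x-functions-key": function_key,
        },
    )


# Hypothetical host and payload fields, for illustration only.
request = build_render_request(
    "https://func-dispatcher.example.invalid",
    "<function-key>",
    {"template": "invoice", "data": {"customer": "Contoso"}},
)
# Sending it would be: urllib.request.urlopen(request)
```

The request targets the Dispatcher in your own subscription, so the payload stays inside the boundary described above.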
2.2 License Validation (Tier 3 Fallback)
sequenceDiagram
participant D as The Dispatcher<br/>func-dispatcher<br/>(your Azure subscription)
participant T1 as Tier 1<br/>In-memory cache
participant T2 as Tier 2<br/>SharePoint license.vpdf
participant L as Tier 3<br/>VaultPDF Licensing API<br/>(fallback)
participant T4 as Tier 4<br/>Grace mode
D->>T1: Validate entitlement
T1-->>D: Cache hit or miss
D->>T2: Check signed license.vpdf
T2-->>D: Valid, expired, or missing
D->>L: Fallback only if T1 and T2 miss
Note over D,L: Body: { licenseKey: string, tenantId: string }
Note over D,L: No document content<br/>No user PII<br/>No file data<br/>The Microsoft 365 tenant ID identifies an organization and is not intended to identify a natural person.
L-->>D: { features: { ... } }
D->>T4: Use grace mode only if API is temporarily unavailable
Note over L: VaultPDF does not retain document content, user PII,<br/>or tenant business data from licensing requests.<br/>Operational logs are limited to service health,<br/>security monitoring, and abuse prevention.
No document data, no user PII, and no file content is sent to the VaultPDF Licensing API. The Microsoft 365 tenant ID identifies an organization and is not intended to identify a natural person. VaultPDF does not retain document content, user PII, or tenant business data from licensing requests. Operational logs are limited to service health, security monitoring, and abuse prevention.
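The two-field request body can be enforced with an explicit allow-list, as in the sketch below. The body shape `{ licenseKey, tenantId }` comes from the diagram; the helper names and the sample values are hypothetical.

```python
import json

# The only fields the diagram shows in a licensing request.
ALLOWED_FIELDS = {"licenseKey", "tenantId"}


def assert_only_allowed_fields(body: dict) -> None:
    """Reject any body that carries more than the license key and tenant ID."""
    extra = set(body) - ALLOWED_FIELDS
    if extra:
        raise ValueError(f"Unexpected fields in licensing request: {sorted(extra)}")


def build_licensing_body(license_key: str, tenant_id: str) -> str:
    """Serialize the licensing request body after checking it against the allow-list."""
    body = {"licenseKey": license_key, "tenantId": tenant_id}
    assert_only_allowed_fields(body)
    return json.dumps(body)


# Hypothetical values.
body_json = build_licensing_body("LIC-EXAMPLE", "00000000-0000-0000-0000-000000000000")
```

An allow-list fails closed: a field added by mistake raises an error instead of silently leaving your subscription.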
2.3 eSign Portal Flow (External Redirect)
sequenceDiagram
actor Approver as 👤 Approver (browser)
participant Portal as VaultPDF eSign Portal<br/>(VaultPDF-operated · stateless)
participant Dispatcher as The Dispatcher<br/>func-dispatcher<br/>(your Azure subscription)
Approver->>Portal: Click eSign link in email (HMAC-token in URL)
Portal->>Dispatcher: GET /api/esign/verify?token=...
Note over Portal,Dispatcher: Token only, no document bytes<br/>no approver identity transmitted
Dispatcher-->>Portal: Token valid · approver context
Portal-->>Approver: Redirect to Dispatcher signing page
Approver->>Dispatcher: Load signing page (your Azure Function)
Note over Approver,Dispatcher: Signing page served from YOUR Azure subscription<br/>not from VaultPDF infrastructure
Approver->>Dispatcher: Submit signature
Dispatcher->>Dispatcher: Record approval · enqueue seal message
The eSign Portal is a stateless relay. It stores no document data and no approver PII. The signing page is served from your Azure Function, not from VaultPDF infrastructure.
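A short-lived HMAC token of the kind carried in the eSign link could look like the sketch below. The token layout (`payload.signature`, both base64url), the claim names `sid` and `exp`, and the use of HMAC-SHA256 are assumptions for illustration; only the HMAC basis and the 4-hour TTL come from this guide.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

TOKEN_TTL_SECONDS = 4 * 60 * 60  # 4-hour TTL, as listed for portal sessions


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(secret: bytes, session_id: str, now: Optional[float] = None) -> str:
    """Sign an opaque session ID and expiry; no approver identity is embedded."""
    issued = time.time() if now is None else now
    payload = json.dumps({"sid": session_id, "exp": int(issued) + TOKEN_TTL_SECONDS}).encode()
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return f"{_b64url(payload)}.{_b64url(signature)}"


def verify_token(secret: bytes, token: str, now: Optional[float] = None) -> bool:
    """Return True only if the signature matches and the token has not expired."""
    try:
        payload_part, signature_part = token.split(".")
        payload = _b64url_decode(payload_part)
        signature = _b64url_decode(signature_part)
        claims = json.loads(payload)
    except ValueError:
        return False  # malformed token
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        return False  # wrong key or tampered payload
    current = time.time() if now is None else now
    return current < claims["exp"]
```

In this sketch the relay can forward the token without the secret, and only the holder of the secret (here, the Dispatcher) can verify it. That is one way to keep a portal stateless.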
2.4 Licensing Data Processing (Tiered Entitlement Cache)
VaultPDF uses a 4-tier license validation cascade to reduce runtime dependency on external licensing services.
Validation Tier Sequence
| Tier | Source | Behaviour |
|---|---|---|
| 1 | In-memory process cache | Valid for 24 hours. No network call. |
| 2 | SharePoint license.vpdf (JWS-signed) | Persisted signed entitlement from last successful sync. Non-trial plans only. |
| 3 | VaultPDF Licensing API (direct call) | Called when Tiers 1-2 miss or are expired. Carries licenseKey and tenantId only. No document content. |
| 4 | 48-hour grace (API unavailability only) | Activates when Tier 3 is unreachable and a valid response was received within the last 48 hours. |
Nightly Sync Flow
The Dispatcher runs a scheduled nightly sync at 2:00 AM UTC:
- Calls the Licensing API with licenseKey and tenantId (no document content).
- Receives a JWS ES256-signed entitlement payload containing feature flags, plan, expiry date, and tenant ID binding.
- Verifies that the tenantId in the signed payload matches the deployment tenant before writing.
- Writes the refreshed entitlement to a license.vpdf file in your SharePoint document library (customer-controlled).
- Pushes usage metrics to the Licensing API:
  - Monthly aggregates (not per-document granularity)
  - Metrics: esign (eSign session count), delivery (delivery job count), delivery_email (email delivery variant), delivery_sms (SMS delivery variant)
  - Data transmitted: integer counters only; no document content, no end-user identity information
  - Keyed by: licenseKey (which is bound to your tenant during activation)
- If the license is revoked or expired, the sync deletes license.vpdf from SharePoint so the next runtime check reaches Tier 3 (API) and returns the correct denied result.
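The write-or-delete decision in the sync steps above can be sketched as a small pure function. It assumes the entitlement's JWS signature has already been verified elsewhere. The claim names (`tenantId`, `expiry`, `revoked`), the order of the checks, and the choice to skip on a tenant mismatch are all assumptions, not documented behaviour.

```python
from datetime import datetime, timezone
from enum import Enum


class SyncAction(Enum):
    WRITE_ENTITLEMENT = "write"    # refresh license.vpdf in SharePoint
    DELETE_ENTITLEMENT = "delete"  # remove license.vpdf so Tier 3 decides
    SKIP = "skip"                  # tenant binding mismatch: write nothing


def decide_sync_action(claims: dict, deployment_tenant_id: str, now: datetime) -> SyncAction:
    """Decide what the nightly sync does with license.vpdf.

    `claims` is assumed to come from an entitlement whose signature was
    ALREADY verified; the field names are hypothetical.
    """
    # Never write an entitlement that is bound to a different tenant.
    if claims.get("tenantId", "").lower() != deployment_tenant_id.lower():
        return SyncAction.SKIP
    # Revoked or expired licenses must not leave a usable file behind.
    if claims.get("revoked") or datetime.fromisoformat(claims["expiry"]) <= now:
        return SyncAction.DELETE_ENTITLEMENT
    return SyncAction.WRITE_ENTITLEMENT


# Example run at the scheduled sync time (02:00 UTC), with hypothetical claims.
sync_time = datetime(2026, 6, 1, 2, 0, tzinfo=timezone.utc)
valid_claims = {"tenantId": "T-1", "expiry": "2027-01-01T00:00:00+00:00", "revoked": False}
action = decide_sync_action(valid_claims, "t-1", sync_time)
```

Keeping the decision free of I/O makes the tenant-binding rule easy to test separately from the SharePoint and Licensing API calls.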
Usage Metrics Storage
Usage metrics are stored as monthly aggregate counters keyed by licenseKey. Metrics do not contain document content, user identities, file names, document metadata, or workflow payloads.
- Aggregated: Rolling monthly totals (reset on month boundaries), not per-document or per-user.
- Tenant-identifiable: Keyed by licenseKey, which is cryptographically bound to your tenantId during license activation.
- Restricted to VaultPDF: Usage metrics do not leave your subscription for any purpose other than licensing enforcement. They are not included in telemetry or shared with third parties.
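As an illustration of what "monthly aggregate counters keyed by licenseKey" means in practice, here is a minimal in-memory sketch. The class and method names are hypothetical. The four metric names come from this guide, and the store holds integers only.

```python
from collections import defaultdict
from datetime import date

METRICS = ("esign", "delivery", "delivery_email", "delivery_sms")


class UsageCounters:
    """Integer counters keyed by (licenseKey, 'YYYY-MM'); nothing per-document or per-user."""

    def __init__(self) -> None:
        self._totals = defaultdict(lambda: dict.fromkeys(METRICS, 0))

    def record(self, license_key: str, metric: str, on: date) -> None:
        """Increment one metric for the month that contains `on`."""
        if metric not in METRICS:
            raise ValueError(f"Unknown metric: {metric}")
        self._totals[(license_key, on.strftime("%Y-%m"))][metric] += 1

    def monthly_totals(self, license_key: str, on: date) -> dict:
        """Return a copy of the counters for the month that contains `on`."""
        return dict(self._totals[(license_key, on.strftime("%Y-%m"))])


counters = UsageCounters()
counters.record("LIC-EXAMPLE", "esign", date(2026, 5, 30))
counters.record("LIC-EXAMPLE", "esign", date(2026, 5, 31))
counters.record("LIC-EXAMPLE", "delivery_sms", date(2026, 6, 1))
```

Because the month is part of the key, totals start from zero at each month boundary without an explicit reset step.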
Entitlement File Storage
The license.vpdf entitlement file is stored in your SharePoint document library. It is cryptographically signed using JWS ES256 (ECDSA P-256): VaultPDF holds the private signing key; your Dispatcher Function holds only the public verification key. Any modification to the file causes signature validation to fail; the Dispatcher then silently discards the file and falls through to Tier 3.
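The sketch below shows a cheap structural pre-check on a JWS in compact serialization, assuming license.vpdf holds one (the on-disk format is not specified here). It does not verify the signature. Real verification needs an ECDSA P-256 implementation from a cryptography library and the public key, and must happen before any claim is trusted.

```python
import base64
import json


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def read_jws_header(compact_jws: str) -> dict:
    """Return the protected header of a compact JWS (header.payload.signature)."""
    parts = compact_jws.split(".")
    if len(parts) != 3:
        raise ValueError("Expected three dot-separated JWS segments")
    return json.loads(_b64url_decode(parts[0]))


def is_candidate_entitlement(compact_jws: str) -> bool:
    """Structural pre-check only; this does NOT verify the signature.

    Accepts input whose header declares ES256 and whose signature segment
    is 64 bytes, the size of an ES256 signature in JWS (R and S, 32 bytes each).
    """
    try:
        header = read_jws_header(compact_jws)
        signature = _b64url_decode(compact_jws.split(".")[2])
    except ValueError:
        return False
    return isinstance(header, dict) and header.get("alg") == "ES256" and len(signature) == 64
```

Rejecting any algorithm other than the expected one before verification guards against algorithm-substitution mistakes, but it is no replacement for the signature check itself.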
Runtime Behavior
The Licensing API is not called on every request. The Dispatcher evaluates the tier cascade in order:
- If the in-memory cache (Tier 1) is valid (< 24 h old), the API is not called.
- If the SharePoint license.vpdf (Tier 2) is present and its JWS signature is valid, the API is not called (non-trial plans).
- If both Tiers 1 and 2 miss or are expired, the Dispatcher calls the Licensing API directly (Tier 3).
- If the API is unreachable and a valid response was received within the last 48 hours, the system continues on the last cached entitlement (Tier 4 grace).
Grace Period
The 48-hour grace period applies only when the Licensing API is temporarily unreachable (network outage, planned maintenance). It does not apply to licenses that are revoked or expired; those return a denied response immediately. After the 48-hour grace window, access is denied until the API is reachable again.
No document content is transmitted during licensing operations at any tier.
3. Data Residency
The customer-hosted VaultPDF components deploy entirely into your chosen Azure region, set via the -Location parameter in deploy.ps1. All document data, audit logs, and encryption keys reside in that region.
VaultPDF-operated services:
| Service | Region | What it receives |
|---|---|---|
| Licensing API | West Europe (Microsoft Azure) | License key and tenant ID only |
| eSign Portal | West Europe (Microsoft Azure) | Short-lived HMAC token only |
Region-Specific Endpoints
If your data-residency requirements prohibit any connection to EU-hosted services, contact [email protected] - we can provide a region-specific Licensing API endpoint.
4. Data Retention
| Data | Default retention | Configurable? |
|---|---|---|
| Rendered PDF outputs (SharePoint) | Governed by your SharePoint retention policies | Yes - apply SharePoint/Purview retention labels |
| .vpdf audit archives (Azure Blob Storage) | No automatic deletion (append-only) | Yes - set blob lifecycle policy (Bicep parameter) |
| Audit JSONL blobs | No automatic deletion | Yes - set blob lifecycle policy |
| Activity list items (SharePoint) | Governed by your SharePoint policies | Yes |
| Portal session tokens | 4 hours (automatic TTL in Azure Table Storage) | No |
| Usage metrics (VaultPDF Licensing API) | Retained indefinitely for billing and enforcement; monthly aggregates retained for 7 years minimum (compliance baseline) | Yes - contact VaultPDF to set custom retention |
| App Insights telemetry | 90 days (configurable in workspace settings) | Yes |
| Log Analytics data | 90 days (Bicep default, retentionInDays: 90) | Yes - change retentionInDays parameter |
Recommended storage lifecycle policy (set in Bicep or the Azure portal):
| Tier | Transition |
|---|---|
| Cool | After 90 days |
| Archive | After 365 days |
| Delete | After 2,555 days (7 years, common compliance baseline) |
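The recommended transitions map onto an Azure Storage lifecycle management policy roughly as sketched below, generated here as JSON from Python. The rule name and the prefix are hypothetical. To my knowledge the tiering actions apply to block blobs only, so check the current Azure documentation if your audit data is stored as append blobs, and note that a delete action cannot remove blobs still under an immutability (WORM) lock.

```python
import json

DAYS_TO_COOL = 90
DAYS_TO_ARCHIVE = 365
DAYS_TO_DELETE = 7 * 365  # 2,555 days


def build_lifecycle_policy(prefix: str) -> dict:
    """Build a lifecycle policy document with the three recommended transitions."""
    return {
        "rules": [
            {
                "enabled": True,
                "name": "vaultpdf-audit-retention",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {"daysAfterModificationGreaterThan": DAYS_TO_COOL},
                            "tierToArchive": {"daysAfterModificationGreaterThan": DAYS_TO_ARCHIVE},
                            "delete": {"daysAfterModificationGreaterThan": DAYS_TO_DELETE},
                        }
                    },
                },
            }
        ]
    }


# "audit/" is a hypothetical container/prefix.
policy_json = json.dumps(build_lifecycle_policy("audit/"), indent=2)
```

The resulting JSON can be pasted into the lifecycle management view of the Azure portal or translated into the equivalent Bicep resource.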
5. GDPR Posture
Roles
| Entity | Role |
|---|---|
| Your organisation | Data Controller |
| Your Azure subscription (Microsoft) | Data Processor (under your Microsoft Customer Agreement) |
| VaultPDF | Data Processor (limited: license validation and eSign token relay only) |
Lawful Basis
VaultPDF processes document payloads on the instructions of the Data Controller (your organisation). The lawful basis for processing is the performance of a contract (document generation and approval workflow services).
Personal Data in VaultPDF
- Approver email addresses are the primary PII processed by VaultPDF workflow functions. They are AES-256-GCM encrypted before storage in Azure Table Storage. The encryption key is in your Azure Key Vault.
- Document payload fields may contain personal data depending on your template design. VaultPDF treats all payload fields as potentially personal data and applies the same encryption and access controls.
- Portal session IP address is optionally logged in App Insights telemetry for security purposes.
Data Subject Rights
Because all data is stored in your Azure subscription and SharePoint tenant:
- Right of Access - query your Azure Blob Storage and SharePoint via your admin tooling.
- Right to Erasure - delete blobs from Azure Blob Storage; delete list items from SharePoint. Sealed .vpdf archives in WORM-locked containers cannot be deleted before retention expiry (by design - required for audit integrity).
- Data Portability - .vpdf archives are standard ZIP containers; PDF outputs are standard PDF files.
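Because .vpdf archives are standard ZIP containers, generic tooling can inspect them for an access or portability request. The sketch below uses Python's standard `zipfile` module. The entry names in the demo archive are hypothetical, since the internal layout of a .vpdf file is not described here.

```python
import io
import zipfile
from typing import List


def list_vpdf_entries(source) -> List[str]:
    """List entry names in a .vpdf archive given a path or a binary file-like object."""
    with zipfile.ZipFile(source) as archive:
        return archive.namelist()


# Build a stand-in archive in memory; the entry names are made up for the demo.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("document.pdf", b"%PDF-1.7 demo")
    archive.writestr("audit.jsonl", b"{}\n")
demo.seek(0)

entries = list_vpdf_entries(demo)
```

The same function works on a file downloaded from Azure Blob Storage by passing its local path.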
Cross-Border Transfers
The only cross-border transfer to VaultPDF-operated services transmits license key and tenant ID (non-personal data under GDPR Art. 4(1)) and HMAC tokens (pseudonymous, no identity information). A Data Processing Agreement (DPA) with VaultPDF is available on request at [email protected].
6. Breach Notification
In the event of a suspected breach within VaultPDF-operated infrastructure (Licensing API or eSign Portal), VaultPDF will notify affected customers within 72 hours of becoming aware of the incident. See Incident Response for the full procedure.
Because your document data is stored in your own Azure subscription, a breach of VaultPDF-operated infrastructure does not expose your document content. A breach of your Azure subscription is governed by your incident response procedures and Microsoft's contractual obligations under your MCA.
DPA Requests and Privacy Inquiries
Contact our privacy team for Data Processing Agreement requests and GDPR-related inquiries.
Architecture & Data Flow
Full data-flow diagrams, trust boundaries, and key management overview for VaultPDF. The canonical architecture reference for security architects, procurement teams, and AppSource reviewers.
Security Controls
Authentication, encryption, network isolation, key management, audit controls, and compliance posture for VaultPDF. For CISO, security teams, and pen-test scoping.