Blog · Published April 25, 2026 · 11-minute read
Privilege-preserving time tracking: a metadata-only architecture, explained
How to build a tool that captures billable moments for a lawyer without ever reading call audio, email bodies, or document contents. The full architecture, the exact data boundary, and why ABA Formal Opinion 512 (2024) made this the only architecture a privacy-paranoid solo should seriously consider.
TL;DR
ClaimHour watches four surfaces — calls, email, documents, and calendar — and records only metadata from each: durations, counterparty identifiers, subject-line keyword matches, edit-window times. It never reads audio, email bodies, or document contents. Everything lives in a local SQLite database inside macOS application support; nothing leaves the device until the user explicitly exports an approved time entry. The architecture is deliberately narrower than every incumbent that touches this space — Clio Duo, Smokeball AutoTime, Billables.ai — because ABA Formal Opinion 512 (2024) treats content-reading AI tools as requiring affirmative client consent, and the 120,000-strong no-PMS solo audience has already decided they would rather forgo capture than take on that workflow. This post walks the four capture paths, the refusal list, the egress boundary, and the three-question privilege test we apply to every design decision.
Why this has to be architectural, not a policy page
Every time-tracking vendor that reads a lawyer's phone, email, and documents has a privacy policy. Most of those policies say roughly the same thing: we will not misuse your data; we encrypt at rest; we comply with applicable law. That language is fine for a CRM. It is not fine for an attorney-client-privileged communication.
Privilege does not bend to a vendor's best intentions. The question a state bar asks — and the question the r/Lawyertalk threads keep asking — is not "does this vendor promise to behave?" It is "what does this vendor have the capability to see?" A tool that can see the body of a privileged email already has a potential privilege problem, independent of what it promises to do with what it sees. A subpoena, a breach, a future acquisition, a change in terms of service — any of those turns capability into exposure.
The only honest answer to that question is architectural: the tool cannot misuse what it does not have access to. ClaimHour's capture pipeline is organized around deliberate incapability. There is no code path that reads call audio. There is no code path that reads email body text. There is no code path that reads document contents. The absence of those paths is the privacy guarantee. Everything else — the policy language, the data-retention schedule, the SOC 2 roadmap — is secondary.
What "metadata only" means, concretely
Language matters here. "Metadata" gets used loosely in marketing copy. When we say metadata, we mean exactly the following set, per capture surface, and nothing else.
Surface 1 — Calls (macOS + iOS)
On macOS 14+ and iOS 17+, ClaimHour reads from Apple's CallKit framework. CallKit surfaces per-call: start timestamp, end timestamp, direction (inbound/outbound), and counterparty phone number. It does not expose audio, speaker identity, or content of any kind, and never has. The surfaced data is exactly what your iPhone already shows in its Recents screen — ClaimHour's claim is just that it can read that list programmatically with your permission and match entries against your matter roster.
Numbers that match a matter's counterparty field become candidate billable events. Numbers that do not match (your spouse, your dentist, Chipotle) are discarded on device within milliseconds of being read; they never enter the SQLite database. The match happens locally. The phone book never leaves the phone.
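A minimal sketch of that matching step, in Python for readability rather than as ClaimHour's shipping code. The `CallRecord` shape, the `normalize_number` helper, and the roster dictionary are all hypothetical names invented for this example, and the normalization is deliberately naive (US numbers only).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class CallRecord:
    """Hypothetical shape of one call-log entry: metadata only."""
    start: datetime
    end: datetime
    direction: str  # "inbound" or "outbound"
    number: str


def normalize_number(raw: str) -> str:
    """Keep digits only and drop a leading US country code (naive, US-only)."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def match_call(call: CallRecord, roster: dict[str, str]) -> Optional[dict]:
    """Return a candidate billable event, or None if the number is not on the roster.

    `roster` maps normalized counterparty numbers to matter IDs (an assumed
    structure). A None result means the caller drops the record without
    persisting anything.
    """
    matter = roster.get(normalize_number(call.number))
    if matter is None:
        return None
    return {
        "matter": matter,
        "start": call.start.isoformat(),
        "duration_s": int((call.end - call.start).total_seconds()),
        "direction": call.direction,
    }
```

The design choice worth noticing is that the non-match branch returns nothing at all: there is no "unmatched" record for a later step to store by accident.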
Surface 2 — Email (Apple Mail, Outlook for Mac)
Email is the hardest surface to handle without reading content, and it is where most competitors quietly cross the line. ClaimHour reads through scripting bridges — AppleScript for Apple Mail, the Outlook for Mac AppleScript dictionary for Outlook — and asks each message exactly two questions: who sent or received this, and what is the subject line? The message body is never requested from the mail client; attachments are never fetched; the raw MIME is never touched.
Subject lines are a compromise we made with intent. They are the lowest-entropy field in an email that is still useful for matter matching, and they are already treated as non-privileged by most courts because they are exposed to every mail relay in transit. ClaimHour extracts keyword tokens from the subject — matter names, case numbers, counterparty last names — and discards the rest. If a subject line reads "RE: Smith v. Jones — discovery deadlines," ClaimHour stores smith_v_jones as the matched matter token and drops the remaining words. The raw subject is not persisted.
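The token extraction can be sketched as follows. This is an illustration under stated assumptions, not the production matcher: `match_subject` and the `matter_keywords` mapping (matter token to a list of keyword aliases) are hypothetical, and matching is a simple case-insensitive whole-word comparison.

```python
import re
from typing import Optional


def _words(text: str) -> str:
    """Lowercase and reduce text to space-separated alphanumeric words."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def match_subject(subject: str, matter_keywords: dict[str, list[str]]) -> Optional[str]:
    """Return the matter token whose keyword appears in the subject, else None.

    Only the token is returned; the caller never stores the raw subject.
    """
    normalized = f" {_words(subject)} "
    for token, keywords in matter_keywords.items():
        for keyword in keywords:
            needle = _words(keyword)
            # Pad with spaces so "smith" does not match inside "smithson".
            if needle and f" {needle} " in normalized:
                return token
    return None
```

With a roster entry such as `{"smith_v_jones": ["Smith v. Jones"]}`, a subject containing "Smith v. Jones" yields `smith_v_jones`, and a subject with no roster keyword yields `None`.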
This is narrower than Billables.ai's posture, which ingests full email bodies to generate AI-drafted time entries. We covered the comparison in the Billables.ai alternative deep-dive; the short version is that AI-drafted entries are genuinely convenient, and that convenience buys them a privilege review under Formal Opinion 512 that ClaimHour users simply do not have to run.
Surface 3 — Documents (Word, Pages, PDF editors)
Documents are captured through macOS NSWorkspace window-activation events combined with Apple's FileProvider framework when available. ClaimHour watches when a document window gains and loses focus, computes the elapsed edit-window time, and matches the document's file path against the user's declared matter folders. If the file path contains ~/Matters/Smith-v-Jones/, the edit window is attributed to the Smith matter. The file is never opened by ClaimHour. The file contents are never read. The diff is never computed.
This design buys us something and costs us something. What we get: a reasonable estimate of how long a lawyer was actively working on a specific matter's documents, without any content exposure. What we give up: the ability to distinguish "actively drafting" from "window open while I made coffee." We mitigate that loss with standard focus-assist heuristics (keyboard-input events per minute, application foreground/background state, Do Not Disturb status), all of which come from public macOS APIs that never reach inside the document.
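To make the document path concrete, here is a small sketch of the two calculations involved: attributing a file path to a matter folder, and summing focus time from gain/lose events. The function names, the folder mapping, and the `(timestamp, kind)` event shape are assumptions for illustration; the real capture would be driven by macOS window-activation events rather than a Python list.

```python
from pathlib import PurePosixPath
from typing import Optional


def matter_for_path(file_path: str, matter_folders: dict[str, str]) -> Optional[str]:
    """Map a document path to a matter ID using declared matter folders.

    PurePosixPath only manipulates the path string; nothing touches the disk.
    """
    path = PurePosixPath(file_path)
    for folder, matter in matter_folders.items():
        root = PurePosixPath(folder)
        if path == root or root in path.parents:
            return matter
    return None


def edit_window_seconds(events: list[tuple[float, str]]) -> float:
    """Sum focused time from (timestamp, "gain" | "lose") events in time order.

    A trailing "gain" with no matching "lose" is ignored rather than guessed at.
    """
    total = 0.0
    focused_since: Optional[float] = None
    for timestamp, kind in events:
        if kind == "gain" and focused_since is None:
            focused_since = timestamp
        elif kind == "lose" and focused_since is not None:
            total += timestamp - focused_since
            focused_since = None
    return total
```

Neither function has a code path that opens a file, which is the property the section is describing.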
Surface 4 — Calendar (EventKit)
Calendar is the cleanest surface. Apple's EventKit framework exposes calendar events to any app the user permits. ClaimHour reads event start and end times, plus titles where titles match the matter roster's naming conventions. Event descriptions (the "notes" field) are explicitly not read — they are frequently the most sensitive field in a lawyer's calendar because they often contain client instructions. An event titled "Jones intake — 3pm" is recorded as a Jones-matter candidate billable slot; the notes below that title are ignored.
The negative architecture: what we deliberately refuse to capture
The easiest way to read an architecture like this is by what it excludes. Here is the explicit refusal list, per surface, in the order we had to make the decisions during design.
| Refused capability | Surface | Why we refuse |
|---|---|---|
| Call audio recording or transcription | Calls | Two-party consent states (11 including CA, FL, WA). Recording a client call without explicit consent is per-se misconduct. |
| Call transcript summarization (local or cloud) | Calls | Still privileged content; moving it to the device does not change the privilege analysis. |
| Speaker identification or diarization | Calls | Implies audio processing. Same content problem. |
| Email body reading | Email | Privileged content. Triggers Formal Opinion 512 vendor-review workflow on every client. |
| Attachment inspection or fetching | Email | Same as bodies; attachments are frequently the most sensitive part of a legal email. |
| Automatic entry narrative generation from email | Email | Requires reading the body. Same reason. |
| Document content reading or parsing | Documents | Privileged work product. No amount of local processing changes the capability question. |
| Keystroke logging | All | Captures content by definition. Also a red line for every state bar advisory opinion on lawyer monitoring. |
| Screen content OCR / screenshot analysis | All | Same as keystrokes. "Looking at the screen" is reading content. |
| Calendar event notes | Calendar | Often contain client-facing instructions or strategy; treat as privileged. |
| Contact directory upload | All | Even counterparty phone numbers are matched locally. The rolodex never leaves the device. |
| Location or geofencing | All | Not needed for the product. Every capability that is not needed is a liability we refused to build. |
Every item on this list is a feature a reasonable competitor could and sometimes does ship. Each one we refused because it would have crossed the privilege line, because it would have required a Formal Opinion 512 vendor review, or — most often — because it would have introduced a capability we did not need for the core product and therefore did not want as attack surface.
Where data lives, moment by moment
The data-residency story is short enough to state in one paragraph. ClaimHour writes every captured event to a local SQLite database at ~/Library/Application Support/ClaimHour/capture.db. The database uses SQLCipher for at-rest encryption keyed to the device's Keychain. The iOS companion maintains a parallel SQLite file inside the app sandbox, and the two sync through CloudKit's end-to-end-encrypted private database — Apple cannot read the contents, and ClaimHour's servers are not in the sync path. No capture event ever lands on a ClaimHour-operated server. The only time data leaves the device is during an explicit export, and exports carry approved time entries only — not the underlying capture events that generated them.
This is what "local-first" means for a lawyer. It is not a marketing posture; it is a straight answer to the state-bar due-diligence question about where client information resides. The answer is: on the lawyer's own Mac and iPhone, encrypted at rest, synced through Apple's privacy-preserving infrastructure, never transiting our servers.
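One way to picture the storage boundary is as a table whose columns are all metadata. The sketch below uses Python's standard `sqlite3` module, which does not provide SQLCipher encryption, so it illustrates only the shape of the data, not the at-rest protection. The table name, column names, and helper functions are assumptions for illustration and are not ClaimHour's actual schema.

```python
import sqlite3

# Hypothetical schema: every column is metadata. There is deliberately no
# column that could hold audio, body text, or document contents.
SCHEMA = """
CREATE TABLE IF NOT EXISTS capture_event (
    id          INTEGER PRIMARY KEY,
    surface     TEXT NOT NULL CHECK (surface IN ('call', 'email', 'document', 'calendar')),
    matter      TEXT NOT NULL,
    started_at  TEXT NOT NULL,     -- ISO 8601 timestamp
    duration_s  INTEGER NOT NULL CHECK (duration_s >= 0),
    approved    INTEGER NOT NULL DEFAULT 0
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the capture store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_event(conn: sqlite3.Connection, surface: str, matter: str,
                 started_at: str, duration_s: int) -> int:
    """Insert one metadata-only event and return its row ID."""
    cur = conn.execute(
        "INSERT INTO capture_event (surface, matter, started_at, duration_s) "
        "VALUES (?, ?, ?, ?)",
        (surface, matter, started_at, duration_s),
    )
    conn.commit()
    return cur.lastrowid
```

The `CHECK` constraint on `surface` means an attempt to record an unlisted surface fails at the database layer instead of relying on application code to refuse it.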
Why ABA Formal Opinion 512 (2024) changed the math
In July 2024 the ABA issued Formal Opinion 512, the first ABA-level guidance on lawyers' use of generative AI. The opinion does not ban AI tools. It does something subtler: it treats content-reading AI tools as requiring three specific lawyer-side workflow steps before use.
- Competence: the lawyer must understand the tool well enough to evaluate its outputs, including its training data practices and hallucination tendencies.
- Confidentiality: the lawyer must evaluate whether the tool's data-handling practices comply with Rule 1.6 duties, which in practice means reviewing vendor terms, data retention, and sub-processor lists.
- Client consent: for tools that ingest client information in any way the client has not already agreed to, the lawyer must obtain informed consent — in writing, ideally, and almost always at the matter-engagement stage.
Those three steps are tractable for a firm with a compliance officer. They are brutal for a solo who already has twelve open matters and a queue of intake calls. Every content-reading tool — Billables.ai, the email-scanning features of Clio Duo, AI-drafted narrative generators — requires a written addendum to every fee agreement before the tool can legally be pointed at a client's matter. Most solos who evaluate such tools for the first time abandon them within a week of reading the opinion.
Metadata-only tools do not trigger the same workflow. They do not ingest client content; they do not generate outputs derived from client content; they do not need the Rule 1.6 vendor review or the written consent step. The competence obligation still applies in a general sense — the lawyer should understand what the tool does — but the matter-by-matter consent burden that killed content-reading adoption simply is not there.
This is not a loophole; it is the point. The architecture preceded the opinion by choice. When the opinion landed, it validated the posture that had already been designed in.
The three-question privilege test
If you are evaluating any billable-moment capture tool — ours or a competitor's — three questions decide it. They are more useful than any feature list.
- What content can the tool access? Not what it promises not to use; what it has the technical capability to read. If the answer includes call audio, email bodies, or document contents, the privilege analysis is nontrivial and Formal Opinion 512 is in play.
- Where does captured data physically reside? Device only, device plus end-to-end-encrypted sync, or vendor-operated servers. Each tier has a different breach/subpoena/terms-change exposure. Cloud-resident content-capture tools are strictly harder to defend under Rule 1.6 than device-resident metadata-capture tools.
- What triggers egress? If the tool transmits data on any schedule other than explicit, per-export user action, every transmission is a potential disclosure surface. Tools that phone home for telemetry, model improvement, or AI inference are transmitting content on a schedule the lawyer does not control.
ClaimHour's answers to those three questions: metadata only; device plus E2EE sync; user-triggered export only. A competitor that cannot say each of those three things in the same order is shipping a different product — one that might be useful, but that requires a client-consent workflow ClaimHour users do not need to run.
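For readers who prefer a checklist to prose, the three questions can be written down as a small function. This is only a way to organize the evaluation, not legal advice or an official test; the `ToolProfile` fields and the label strings are hypothetical.

```python
from dataclasses import dataclass

# The three content types the post treats as crossing the privilege line.
CONTENT_FIELDS = {"call_audio", "email_body", "document_contents"}


@dataclass(frozen=True)
class ToolProfile:
    """Hypothetical description of a capture tool for evaluation purposes."""
    readable_fields: frozenset  # everything the tool is technically able to read
    residency: str              # "device", "device+e2ee_sync", or "vendor_servers"
    egress: str                 # "user_export_only" or "scheduled"


def privilege_flags(tool: ToolProfile) -> list[str]:
    """Return one flag per question the tool fails; an empty list means none."""
    flags = []
    if tool.readable_fields & CONTENT_FIELDS:
        flags.append("can read content")
    if tool.residency == "vendor_servers":
        flags.append("data resides on vendor servers")
    if tool.egress != "user_export_only":
        flags.append("egress not limited to user-triggered export")
    return flags
```

A profile with only metadata fields, device-plus-E2EE residency, and export-only egress returns an empty list; any non-empty result is a prompt to look harder, not a verdict.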
Edge cases we had to solve
Four categories of edge case eat most of the implementation time:
Mixed-use devices
Most solos run on a single phone that takes client calls, personal calls, and robocalls. The matcher has to be conservative: only numbers explicitly in the matter roster become candidate billable events. Unknown numbers are discarded. This means ClaimHour can miss a first-time intake call from a number not yet in the roster — we handle that with a "saw an unknown 14-minute call at 3:12pm; was this a client?" prompt at end-of-day digest time, asking the lawyer yes/no without ever storing or surfacing the number if the answer is no.
Shared calendars
Family calendars frequently share into the lawyer's calendar. The matcher ignores any calendar whose account ID is not one the lawyer has tagged as work. Spouse's dental appointment never becomes a candidate billable slot.
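A sketch of that filter, combined with the title-only matching described for the calendar surface. The `CalendarEvent` type, the set of work-tagged calendar IDs, and the keyword map are assumptions for illustration, and the substring match is simpler than a real title matcher would need to be.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class CalendarEvent:
    """Hypothetical event projection. It has no notes field, so notes cannot be read."""
    calendar_id: str
    title: str
    start: datetime
    end: datetime


def candidate_slot(
    event: CalendarEvent,
    work_calendar_ids: set[str],
    title_keywords: dict[str, str],
) -> Optional[dict]:
    """Return a candidate billable slot, or None for non-work or unmatched events."""
    if event.calendar_id not in work_calendar_ids:
        return None  # family and other shared calendars are skipped outright
    title = event.title.lower()
    for keyword, matter in title_keywords.items():
        if keyword.lower() in title:
            return {
                "matter": matter,
                "start": event.start.isoformat(),
                "end": event.end.isoformat(),
            }
    return None
```

The calendar check runs before any title inspection, so an event on an untagged calendar is rejected without its title being compared to anything.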
Opposing-counsel calls
Calls with opposing counsel are billable but do not correspond to a client in your matter roster. We handle this with per-matter opposing counsel phone fields — you enter the number once at matter intake, and it matches alongside the client's number.
Matter-level conflicts
A single phone call can relate to two matters (joint representation, related litigation). The digest surfaces the ambiguity rather than auto-assigning, and the lawyer picks at approval time. Never silently assign a billable moment to the wrong matter.
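The opposing-counsel fields and the ambiguity rule fit in one small function. This is a sketch under assumptions: the matter dictionaries, their `client_numbers` and `opposing_counsel_numbers` keys, and the status labels are invented for the example.

```python
def resolve_matters(number: str, matters: list[dict]) -> dict:
    """Classify a normalized number as unmatched, matched, or ambiguous.

    Both client and opposing-counsel numbers count as matches. When more than
    one matter matches, all of them are returned so the lawyer can choose.
    """
    hits = sorted(
        m["id"]
        for m in matters
        if number in m.get("client_numbers", [])
        or number in m.get("opposing_counsel_numbers", [])
    )
    if not hits:
        return {"status": "unmatched", "matters": []}
    if len(hits) == 1:
        return {"status": "matched", "matters": hits}
    return {"status": "ambiguous", "matters": hits}
```

Returning every matching matter, rather than the first one found, is what keeps the assignment decision with the lawyer at approval time.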
What the architecture means we can and cannot build
Being explicit about incapability constrains the roadmap in useful ways. The following features are not on the 2026 plan, because each would require crossing the metadata/content boundary:
- AI-drafted time-entry narratives from email bodies (Billables.ai-style)
- Call transcription and billable-moment extraction from audio (AI-scribe-style)
- Document-aware work descriptions (reading the brief to generate "drafted MSJ memorandum on standing")
- Keystroke-based activity scoring
- Screenshot-based capture (monitoring-software style)
The following are on the plan, because each respects the boundary:
- Tighter matter-roster onboarding via QuickBooks and LawPay import (done; see the QuickBooks-for-lawyers page for the IIF flow)
- iPhone CallKit capture parity with macOS (shipping in May)
- Richer matter-folder heuristics for documents stored in iCloud Drive shared folders
- Native Calendar-app rule-based aliasing (for lawyers who use non-standard event titles)
- A simple Windows port of the core capture using Win32 SetWinEventHook: same architectural constraints, different API surface. Low priority until Mac demand is fully met.
The point of publishing this list is that architecture is a commitment, not a phase. The features we refused at the start, we still refuse; the features we can build without breaking the boundary are the ones that ship.
Why this matters for the wedge
If you have read the launch essay or the first post in this blog — why US solo lawyers leak $30,000 a year in unbilled hours — you have seen the commercial framing: 120,000 US solos refuse to pay the PMS tax, and the billable-moment tools that would fix their leak are bundled inside the PMS subscription they refuse. The commercial wedge and the architectural wedge are the same wedge seen from two angles.
The lawyers who refuse the PMS tax are also the lawyers who refuse content-reading AI. Our early conversations (r/Lawyertalk DMs, LinkedIn cold outreach, in-person at a local immigration-bar CLE) converge on the same profile: price-sensitive, privacy-paranoid, Formal-Opinion-512-aware, Google-trusting, and absolutely not interested in onboarding a tool that needs a written client consent addendum to every fee agreement. Metadata-only is what sells to this audience. It would not have sold to a BigLaw IT-procurement lead, who wants a SOC 2 report and a trained AI that can do narrative drafting. ClaimHour is not built for that buyer. It is built for the 120,000.
The recap
ClaimHour's privacy posture is not a policy, a certification, or a checkbox. It is four capture paths (calls, email, documents, calendar), each deliberately restricted to metadata fields, plus a refusal list of the content-reading capabilities a reasonable competitor might ship. Data lives on the device, in an encrypted SQLite file, synced through Apple's end-to-end-encrypted private database, never transiting ClaimHour-operated servers. Egress happens only during user-triggered export, and exports carry approved time entries only. ABA Formal Opinion 512 (2024) shaped the trade: content-reading tools require a matter-by-matter client-consent workflow that the solo practitioners we build for refuse to run, and metadata-only tools do not. The architecture is the product.
If you want to see what this buys you in practice — a digest of billable moments captured from the last week of your actual work — join the waitlist. If you want a closer look at exactly what the tool does and does not collect, the privacy policy spells out the same boundaries this post does, in a form a state bar reviewer can cite.
Frequently asked
What exactly does ClaimHour mean by "metadata only"?
For calls: duration, counterparty phone number, and direction. No audio, no transcript, no summary. For email: per-thread send and receive counts plus subject-line keyword matches against the user's matter list. No bodies, no attachments. For documents: edit-window time per open file whose path matches a matter folder. No file contents. For calendar: event start/end plus matter-matched titles. No event notes. That is the entire data boundary.
Why does ABA Formal Opinion 512 (2024) matter here?
Formal Opinion 512 treats content-reading AI tools as requiring affirmative client consent and careful vendor due diligence under Rule 1.6. Metadata-only tools do not ingest privileged content, so they fall outside the opinion's core concern. ClaimHour's architecture was designed against that boundary — anything that would require the opinion's consent workflow is specifically off the capture path.
Where does the captured data physically live?
On the user's device. ClaimHour writes to a SQLite file inside macOS application support, encrypted with SQLCipher keyed to the device Keychain. The iOS companion maintains a parallel file inside the app sandbox and syncs through CloudKit's end-to-end-encrypted private database. Nothing reaches ClaimHour servers unless the user explicitly exports an approved time entry.
Can metadata still leak privileged information in the aggregate?
In theory, yes — the fact that a lawyer called a particular number at a particular time can, combined with public-docket information, reveal representation. In practice, the same information already exists in the lawyer's phone bill, calendar, and billing system. ClaimHour does not create a new disclosure surface that did not already exist, and the meaningful privilege line is between metadata and content. ClaimHour sits unambiguously on the metadata side.
What happens during an export — is that when data leaves the device?
Exports are the only egress path. When the user exports to QuickBooks IIF, LawPay, FreshBooks, or CSV, ClaimHour assembles the approved time entries locally and writes the destination file. For LawPay and FreshBooks, the export POSTs from the user's device directly to the billing vendor's API using the user's own credentials — ClaimHour's servers are not in the path. For QuickBooks IIF and CSV, the file is written to disk.
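As a rough picture of the file-based path, the sketch below serializes approved entries to CSV and skips everything else. The entry dictionary keys, the column layout, and the rounding to tenths of an hour are all assumptions for illustration; they are not ClaimHour's export format or its QuickBooks IIF layout.

```python
import csv
import io


def export_csv(entries: list[dict]) -> str:
    """Serialize approved entries only; unapproved candidates are skipped."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["matter", "date", "hours", "narrative"])
    for entry in entries:
        if not entry.get("approved"):
            continue
        # Assumed policy: plain rounding to the nearest tenth of an hour.
        hours = round(entry["duration_s"] / 3600, 1)
        writer.writerow([entry["matter"], entry["date"], f"{hours:.1f}", entry["narrative"]])
    return buffer.getvalue()
```

The function takes finished entries as input and has no access to the capture store, which mirrors the claim that exports carry approved time entries and not the underlying capture events.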
Is this HIPAA-compliant or HIPAA-covered?
ClaimHour is deliberately not HIPAA-covered. A HIPAA-covered tool would need to process protected health information; the architecture specifically cannot. Family lawyers handling medical records, for example, will find the records never enter the pipeline — the tool sees the filename and edit-window duration, nothing more. Narrower capture, lower regulatory surface, no Business Associate Agreement required.
Further reading
- Why US solo lawyers leak $30,000 a year in unbilled hours — the commercial companion piece
- The ClaimHour launch essay — the 1,600-word opening argument for the product
- The ClaimHour privacy policy — the same data boundaries, in bar-citable form
- Billables.ai alternative — the direct comparison with the closest content-reading competitor
- Automatic time tracking for attorneys — the buyer's education on passive capture
- Time tracking without a practice management system — the wedge, commercially
- Clio vs ClaimHour — PMS-level head-to-head with three-year cost math
- Embed the leak calculator on your site — the interactive $30k-a-year math