
The Trust Score Spectrum

When to act, when to flag, when to block — a practical guide to interpreting trust scores in production.

Luke Swestun·February 3, 2026

A trust score without a decision framework is just a number. Knowing that a claim scored 72 out of 100 is useful, but knowing what to do with that score — whether to accept, flag, escalate, or block — is where the real value lives. This article provides a practical framework for interpreting trust scores in production, building trust policies that match your risk tolerance, and handling the edge cases that every verification system encounters.

The Decision Framework

SignalStack's trust scoring system (/docs/guides/trust-scoring) outputs scores on a 0-100 scale, but the mapping from score to action is entirely configurable. The default thresholds provide a starting point, but production deployments should calibrate them against their specific data, risk tolerance, and regulatory requirements.

The Four Zones

Trust decisions fall into four zones, each corresponding to a range of scores and a recommended action pattern:

Green Zone: 85-100 (Automated Acceptance)

Claims in this range have strong corroboration from multiple independent sources, high source authority, and recent timestamps. The evidence chain shows no contradictions, and the historical accuracy of the sources is excellent. These claims can be accepted without human review.

Example: A verified business registration query returns records matching the provided company name, address, and registration number from the official government registry. The source authority is 98 (government domain), recency is 95 (data updated today), and corroboration is 92 (matches three independent databases). Aggregate trust score: 94.

"The green zone is where automation earns its keep. If you are requiring human review for claims scoring above 90, you are leaving efficiency on the table. The evidence chain is your proof that the decision was sound." — Alex Rivera, SignalStack

Yellow Zone: 65-84 (Flag for Review)

Claims in this range are likely valid but have one or more dimensions that reduce confidence. The source authority may be strong while the recency is poor, or the corroboration may be weak because only two independent sources were found. These claims should be flagged with their evidence chain for human review, but the review process should be streamlined: the reviewer should be able to see exactly which dimension is pulling the score down and make a rapid decision.

Example: An invoice from a known supplier arrives with a trust score of 72. The source authority is 95 (verified supplier, 5-year history), but the recency is 45 (the invoice is dated 18 months ago) and the corroboration is 60 (the invoice number exists in the system but a matching PO cannot be found). The evidence chain shows the exact discrepancy: the invoice date falls outside the standard 90-day billing cycle for this supplier. A human reviewer can quickly determine whether this is a legitimate delayed invoice or an attempt to reuse an old invoice template.
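To make the "which dimension is pulling the score down" step concrete, here is a minimal sketch that ranks the dimensions of the invoice example. The function name, the dictionary keys, and the review floor of 70 are illustrative assumptions, not part of SignalStack's API.

```python
def weakest_dimensions(dimensions: dict[str, float], floor: float = 70.0) -> list[tuple[str, float]]:
    """Return the dimensions scoring below `floor`, lowest first.

    `floor` is a hypothetical review cutoff chosen for this sketch.
    """
    below = [(name, score) for name, score in dimensions.items() if score < floor]
    return sorted(below, key=lambda item: item[1])


# Dimension scores from the invoice example above.
invoice = {"source_authority": 95, "recency": 45, "corroboration": 60}

print(weakest_dimensions(invoice))
# [('recency', 45), ('corroboration', 60)]
```

Showing the reviewer this short list first, with the evidence chain attached, keeps the review focused on the invoice date and the missing PO rather than on the supplier's already strong history.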

Orange Zone: 40-64 (Escalate for Priority Review)

Claims in this range have significant issues. Multiple dimensions are below threshold, or one critical dimension (typically source authority or corroboration) is severely deficient. These claims require priority human review, and the default action should be to pause any automated processing until the review is complete.

Example: A claim verification for a news event returns a trust score of 48. The source authority is 20 (the claim originates from an anonymous social media account), corroboration is 35 (only one other source mentions the event, and that source is a blog with no editorial standards), and the internal consistency is 55 (the timestamp claims the event occurred yesterday, but no major news outlet has reported it). The evidence chain clearly shows why each dimension scored poorly. The claim is routed to priority review with an instruction to treat it as unsubstantiated until confirmed.

Red Zone: 0-39 (Automatic Block)

Claims in this range are almost certainly fraudulent or deeply unreliable. The trust score is low across multiple dimensions, and at least one dimension is likely below 20. These claims should be automatically rejected with a detailed evidence report that can be used for investigation, reporting, or legal action.

Example: A document analysis request for an identity document returns a trust score of 22. The media provenance check detects AI generation with 94% confidence (score: 6). The metadata analysis reveals that the document was created with a generative model at 3:47 AM GMT from an IP address in a different country than the claimed issuing authority (consistency score: 18). The document is automatically blocked and the evidence is logged for fraud investigation.
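The four zones above reduce to a lookup from score to action. The sketch below uses the default boundaries from this article (85, 65, 40); the `Zone` enum and `classify` function are hypothetical names for illustration, and a real deployment would load its calibrated thresholds from configuration.

```python
from enum import Enum


class Zone(Enum):
    GREEN = "accept"
    YELLOW = "flag"
    ORANGE = "escalate"
    RED = "block"


# Lower bound of each default zone, highest first.
DEFAULT_THRESHOLDS = [(85, Zone.GREEN), (65, Zone.YELLOW), (40, Zone.ORANGE), (0, Zone.RED)]


def classify(score: float, thresholds=DEFAULT_THRESHOLDS) -> Zone:
    """Return the first zone whose lower bound the score meets or exceeds."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be in [0, 100], got {score}")
    for lower_bound, zone in thresholds:
        if score >= lower_bound:
            return zone
    raise ValueError("thresholds must include a lower bound of 0")


# The four example scores from the zone descriptions.
for example_score in (94, 72, 48, 22):
    zone = classify(example_score)
    print(example_score, zone.name, zone.value)
```

Passing the thresholds as a parameter keeps the mapping configurable, so a calibrated policy can swap in its own boundaries without touching the decision code.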

Building Trust Policies

Trust policies map trust score ranges to actions. SignalStack supports flexible policy definitions that can vary by verification type, data category, dollar amount, or any other parameter your application requires. Here are examples of real trust policies deployed by SignalStack customers:

E-commerce payment verification — Accept payments with business verification scores ≥ 85. Flag scores 65-84 for manual review if payment exceeds $5,000. Block scores < 65 for all payments.
Supplier onboarding — Accept supplier registrations with document analysis scores ≥ 80 AND business verification scores ≥ 90. Flag any document with media provenance score < 70 for deepfake investigation.
Content moderation — Automatically approve user-generated claims with trust scores ≥ 75. Flag scores 50-74 for human review. Reject scores < 50, and redirect the user to the appeals process documented in the evidence report.
Financial reporting — Accept claim verification scores ≥ 90 for automated data ingestion. Flag all scores below 90 for analyst review. Never auto-accept claims with recency scores < 30, regardless of aggregate trust score.
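Two of the policies above are fully specified by score alone, so they can be sketched as plain functions. This is an illustration of the rules as written, not SignalStack's policy configuration API; the function names and action strings are assumptions, and routing a low-recency claim to "flag" is an interpretation of "never auto-accept".

```python
def content_moderation_action(aggregate: float) -> str:
    """Content moderation policy: approve >= 75, flag 50-74, reject below 50."""
    if aggregate >= 75:
        return "approve"
    if aggregate >= 50:
        return "flag"
    return "reject"


def financial_reporting_action(aggregate: float, recency: float) -> str:
    """Financial reporting policy: accept only at aggregate >= 90,
    and never auto-accept when the recency score is below 30."""
    if recency < 30:
        # Assumption: a claim that cannot be auto-accepted goes to analyst review.
        return "flag"
    return "accept" if aggregate >= 90 else "flag"


print(content_moderation_action(72))       # flag
print(financial_reporting_action(93, 25))  # flag: recency floor overrides the aggregate
print(financial_reporting_action(93, 80))  # accept
```

The financial reporting rule shows why policies sometimes need per-dimension guards: an aggregate score can look strong while a single dimension makes the claim unsafe to ingest automatically.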

The /docs/guides/trust-scoring documentation covers the full policy configuration API, including conditional rules, time-based thresholds, and override mechanisms for verified escalation paths.

Handling Edge Cases

No threshold system is perfect. Here are the most common edge cases and how to handle them:

The Borderline Claim (Score: 84-86)

Claims that land right on a threshold boundary can cause inconsistent behavior, especially in high-volume systems. SignalStack supports a "sticky threshold" configuration that applies hysteresis: a claim scoring 86 that comes from a source with a recent history of scores in the yellow zone is treated as yellow until the source establishes a track record in the green zone. This prevents thrashing in automated decision systems.
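The hysteresis idea can be sketched as a small per-source tracker. This is one possible reading of the behavior described above, not SignalStack's implementation: the class name, the 2-point borderline band, and the 5-score history window are all assumptions, and the threshold defaults to the green boundary of 85.

```python
from collections import deque


class StickyThreshold:
    """Per-source hysteresis around a single acceptance threshold (illustrative)."""

    def __init__(self, threshold: float = 85.0, margin: float = 2.0, history: int = 5):
        self.threshold = threshold
        self.margin = margin      # width of the borderline band above the threshold
        self.history = history    # number of recent raw scores that form a track record
        self._recent: dict[str, deque] = {}

    def is_green(self, source_id: str, score: float) -> bool:
        recent = self._recent.setdefault(source_id, deque(maxlen=self.history))
        borderline = self.threshold <= score < self.threshold + self.margin
        track_record = len(recent) == self.history and all(s >= self.threshold for s in recent)
        # Borderline scores count as green only once the source has a full
        # window of scores at or above the threshold.
        green = score >= self.threshold and (not borderline or track_record)
        recent.append(score)
        return green


sticky = StickyThreshold(threshold=85, margin=2, history=3)
for past_score in (70, 72, 75):
    sticky.is_green("supplier-a", past_score)

print(sticky.is_green("supplier-a", 86))  # False: borderline, and recent history is yellow
```

In this sketch a clearly green score (above the band) is accepted immediately, while a borderline score has to be backed by the source's recent history. That keeps a source hovering around the boundary from flipping between zones on every request.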

The High-Authority Lie

Sometimes a high-authority source publishes false information: a government website with outdated data, or a reputable news outlet with an uncorrected error. The trust score may be misleadingly high because the source authority dimension dominates. SignalStack's approach is to flag claims where the source authority is high (> 90) but the corroboration is only moderate or lower (< 70) for additional cross-referencing. This catches the cases where a single authoritative source is wrong.
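As a rough sketch, the check described above is a two-condition guard on the dimension scores. The function name is hypothetical; the cutoffs of 90 and 70 come from the paragraph above.

```python
def needs_cross_reference(source_authority: float, corroboration: float) -> bool:
    """Flag claims backed by a strong source that few other sources confirm."""
    return source_authority > 90 and corroboration < 70


# Invoice example: authority 95, corroboration 60.
print(needs_cross_reference(95, 60))  # True

# Business registration example: authority 98, corroboration 92.
print(needs_cross_reference(98, 92))  # False
```

Running this guard alongside the zone mapping, rather than folding it into the aggregate score, keeps a dominant authority dimension from masking thin corroboration.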

The Rapidly Changing Claim

Some claims have a short shelf life — stock prices, breaking news, sports scores. A trust score of 90 for a stock price from 5 seconds ago is correct; the same score for a stock price from 5 minutes ago may be dangerously stale. SignalStack's recency dimension handles this automatically, but the trust policy should also consider whether the claim type requires real-time verification or can tolerate some latency. Configurable recency requirements per claim type are documented at /docs/guides/trust-scoring.
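A policy-side freshness check per claim type might look like the following sketch. The claim type names and maximum ages are made-up values for illustration; the real configuration options are the ones documented at /docs/guides/trust-scoring.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical maximum ages per claim type; tune these to your own data.
MAX_AGE = {
    "stock_price": timedelta(seconds=15),
    "sports_score": timedelta(minutes=1),
    "business_registration": timedelta(days=30),
}


def is_fresh(claim_type: str, observed_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True if the observation is within the allowed age for its claim type.

    Raises KeyError for a claim type with no configured maximum age.
    """
    now = now or datetime.now(timezone.utc)
    return now - observed_at <= MAX_AGE[claim_type]


now = datetime(2026, 2, 3, 12, 0, 0, tzinfo=timezone.utc)
print(is_fresh("stock_price", now - timedelta(seconds=5), now))  # True
print(is_fresh("stock_price", now - timedelta(minutes=5), now))  # False
```

A claim that fails this check can be re-verified or downgraded to review regardless of its aggregate score, which covers the case where a high score was accurate when issued but has since gone stale.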

Run a two-week shadow evaluation before enforcing any trust policy in production. During this period, log the trust score and the recommended action, but continue using your existing decision logic. At the end of two weeks, analyze how many of your automated acceptances would have been flagged by the trust score system — and how many of those flags would have caught actual fraud. This data lets you calibrate your thresholds with confidence. Most teams adjust their thresholds by 5-10 points after the shadow evaluation period. The /pricing page includes volume-based plans that support this evaluation phase cost-effectively.
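The end-of-period analysis can be sketched as a pass over the shadow log. The record fields below are assumptions about what you would log; `confirmed_fraud` in particular depends on having ground truth from your own investigations, and the sample records are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ShadowRecord:
    legacy_accepted: bool    # what the existing decision logic did
    recommended_action: str  # what the trust policy would have done
    confirmed_fraud: bool    # ground truth established after the fact


def summarize(records: list[ShadowRecord]) -> dict[str, int]:
    """Count how the trust policy would have changed legacy acceptances."""
    accepted = [r for r in records if r.legacy_accepted]
    would_flag = [r for r in accepted if r.recommended_action != "accept"]
    return {
        "legacy_accepted": len(accepted),
        "would_have_flagged": len(would_flag),
        "flags_catching_fraud": sum(r.confirmed_fraud for r in would_flag),
        "fraud_missed_by_both": sum(
            r.confirmed_fraud for r in accepted if r.recommended_action == "accept"
        ),
    }


records = [
    ShadowRecord(True, "accept", False),
    ShadowRecord(True, "flag", True),
    ShadowRecord(True, "flag", False),
    ShadowRecord(True, "accept", True),
    ShadowRecord(False, "block", True),
]
print(summarize(records))
```

A high share of flags that catch no fraud suggests the thresholds are too strict, while fraud missed by both systems suggests they are too lenient. Either result points to the direction of the adjustment.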

Conclusion

The trust score spectrum transforms a single number into a decision framework. By mapping score ranges to specific actions (accept, flag, escalate, block) and building configurable trust policies that match your organization's risk profile, you can automate the vast majority of verification decisions while maintaining human oversight for the cases that need it. The key is to treat trust policies as living configurations that evolve with your data, your risk tolerance, and your regulatory environment. Start with the default thresholds, run a shadow evaluation, and iterate. The full trust scoring documentation at /docs/guides/trust-scoring and the volume-based plans at /pricing are the places to start when building a production-grade verification system.

Luke Swestun

Founder & CEO

Luke Swestun is the founder of SignalStack. He writes about trust infrastructure, hallucination detection, and building AI agents that can verify before they act.
