
Building an Evidence Timeline

Grimm's Gadgets | June 4, 2026 | 8 min read

Don't paste a pile of bullet points and call it an actor timeline. Record what was reported, where it came from, how good the source is, and what *you* concluded, all kept separate. Then it survives the "how do you know?" question.

Evidence Timeline: TLDR

Why the pasted-summary way fails: It can't tell you which report a claim came from. The first time someone challenges it, you've got nothing to point at.

Every entry = 5 fields:

  • Event date: when it happened, not when the article ran.
  • What was reported: the source's words, not your spin.
  • Source: the actual report, specific enough to find again.
  • Source tier: vendor first-hand telemetry is not the same as an anonymous forum post.
  • Your assessment: flagged as yours, with a confidence level. Can be blank.

The one rule: your inference never goes in the "what was reported" box.

Build it in 5 quick passes:

  1. Drop every claim into the 5 fields. Merge duplicates. Log gaps as their own entries.
  2. Tier the sources. Vendor telemetry on top, unverified forum post at the bottom.
  3. Sort by event date. Order is not cause: a leak on the 3rd before ransomware on the 11th proves nothing by itself.
  4. Tag ATT&CK to the reported behavior, not to a hunch.
  5. Write conclusions in calibrated language (moderate is not high). Point each one at its evidence.

30-second example, SILVERSCALE-117: Vendor telemetry gives persistence, C2 (a JA3 match links it to older activity), and phishing. Solid. A forum credential dump and a second-hand ransomware report are weak, and the timeline flags them as weak. Nothing covers how the actor got from access to impact, so that gap is a to-do, not an assumption. Asked "how solid is the ransomware claim?" the answer is exact: disclosure feed, uncorroborated, moderate confidence.

Payoff: Holds up under questioning. Weak sources look weak. Confidence stays honest. Clean handoff (mark the TLP and go).

Where ThreatSpire fits: Its Evidence Timeline does this automatically: ingest, dedupe, tier, ATT&CK-tag, keep current per actor. It enforces the habit. You still have to have it.

Bottom line: Keep what was reported separate from what you concluded, and note where each piece came from. That's the whole trick.


Building an Evidence Timeline: From Scattered Reporting to a Defensible Actor Picture

An analyst gets asked to put together what a ransomware crew has been up to. They open a dozen tabs: two vendor research blogs, a news write-up, a disclosure feed, a forum screenshot a colleague flagged. The highlights go into a document in rough date order, and that document becomes the actor's timeline.

It works until a hunt lead asks where the ransomware deployment on the 11th came from and how solid it is. The document can't answer. It never recorded which report each line came from, and it never separated what a vendor actually observed from what got inferred during a fast skim. The picture might be correct, but there's no way to demonstrate that, and an actor assessment you can't trace back to its sources won't hold up the first time someone leans on it.

A pasted summary tells you what the team thinks the actor did. An evidence timeline records something more useful: what was reported, who reported it, how much that source is worth, and what you assess from it, with the assessment kept apart from the reporting it rests on. Hold those two things apart and the rest is mechanics.

The unit is the reported event, not the headline

A timeline built from headlines fuses two things that need to stay separate: what a source said happened, and the meaning you assigned to it. "SILVERSCALE-117 pivoted to manufacturing targets" reads like a fact, but it's a reported event plus an inference about intent plus a trend claim. Once they're welded into one line, you can't pull them back apart when someone challenges any one of them.

Build from the reported event up. Each entry carries five fields, and they don't bleed into each other:

  • Event date. When the activity happened, not when the article ran. A vendor blog posted yesterday might describe a campaign from three weeks ago. Track publication date separately if it's useful; mix the two and your timeline starts following publishing cycles instead of adversary behavior.
  • Reported activity. What the source described, in the source's terms, without your gloss.
  • Source. Specific enough to return to. Publisher and the actual report, not "a vendor blog."
  • Source tier. How much weight the source carries. A named vendor's first-hand telemetry and an anonymous forum post are not worth the same.
  • Assessment. What you make of it, flagged as yours, with a confidence level. This field can be empty. Plenty of reported events mean nothing alone and only matter in combination.

One rule holds the structure together: an inference never goes in the reported-activity field. The moment "actor is escalating against critical infrastructure" lands where the reported behavior belongs, that entry has stopped being evidence and become opinion.
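One way to keep the five fields from bleeding into each other is to give each its own typed slot. The sketch below is a minimal illustration, not ThreatSpire's schema: the class names, the three-level tier, and the optional `published` field are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Tier(Enum):
    HIGH = "high"      # e.g. a named vendor's first-hand telemetry
    MEDIUM = "medium"
    LOW = "low"        # e.g. an unverified forum post


class Confidence(Enum):
    LOW = "low"
    MODERATE = "moderate"
    HIGH = "high"


@dataclass(frozen=True)
class Assessment:
    """The analyst's own reading; it cannot exist without a confidence level."""
    text: str
    confidence: Confidence


@dataclass(frozen=True)
class TimelineEntry:
    event_date: date                          # when the activity happened
    reported_activity: str                    # the source's terms, no gloss
    source: str                               # publisher plus the specific report
    tier: Tier                                # how much weight the source carries
    assessment: Optional[Assessment] = None   # may legitimately be empty
    published: Optional[date] = None          # tracked apart from event_date


# An entry with no assessment yet: reported, sourced, tiered, nothing inferred.
entry = TimelineEntry(
    event_date=date(2026, 4, 22),
    reported_activity="Registry run-key persistence on compromised endpoints",
    source="Vendor-X research",
    tier=Tier.HIGH,
)
```

Because the assessment is its own object, there is no convenient place to slip an inference into `reported_activity`, and a conclusion cannot be recorded without stating a confidence level.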

Building it, in five passes

1. Normalize the scatter into events

Reporting shows up in different shapes on different clocks. The first pass is tedious: get each claim into the five fields, split event date from publication date, and deduplicate. Two vendors describing the same phishing wave is one event, recorded once against the better-sourced report, with the second noted as corroboration. Log it as two separate events and you've turned a single campaign into a crime spree that never happened.

Record what's missing, too. If the reporting jumps from initial access straight to ransomware with nothing between, that silence is an entry. A logged gap is something you can go collect against later; a gap you smooth over is what a reviewer finds first.
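The deduplication and gap-logging described above could be sketched roughly as follows. Everything here is illustrative: the activity keys, the numeric tier rank, and the second vendor ("Vendor-Y blog") are hypothetical, and in practice deciding that two reports describe the same event is an analyst judgment rather than a string match.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw claims: (event_date, normalized activity key, source, tier rank).
# In this sketch a lower tier rank means a stronger source.
claims = [
    (date(2026, 5, 26), "phishing-iso-nordic-logistics", "Vendor-X research", 1),
    (date(2026, 5, 26), "phishing-iso-nordic-logistics", "Vendor-Y blog", 2),
    (date(2026, 4, 22), "run-key-persistence", "Vendor-X research", 1),
]


def merge_claims(claims):
    """Collapse claims about the same event into one entry plus corroboration."""
    grouped = defaultdict(list)
    for event_date, key, source, rank in claims:
        grouped[(event_date, key)].append((rank, source))

    events = []
    for (event_date, key), reports in sorted(grouped.items()):
        reports.sort()  # strongest source first
        events.append({
            "event_date": event_date,
            "activity": key,
            "source": reports[0][1],                              # anchoring report
            "corroborated_by": [src for _, src in reports[1:]],   # the rest
        })
    return events


def gap_entry(after, before, note):
    """A logged gap is an entry too: something to collect against later."""
    return {"event_date": None, "activity": f"GAP: {note}",
            "between": (after, before), "source": None, "corroborated_by": []}


events = merge_claims(claims)
events.append(gap_entry("initial access", "ransomware",
                        "no reporting on lateral movement or privilege escalation"))
```

The two phishing reports come out as a single event anchored on the stronger source, and the gap sits in the same list as the events so it cannot be quietly dropped.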

2. Establish provenance and source tier

For each source, ask two questions separately, the way the Admiralty system does: how reliable is the source in general, and how credible is this specific claim? A normally solid vendor can publish a thin attribution. A sketchy forum can occasionally be right. Score them independently and the reporting sorts into tiers without an argument later. A vendor describing its own telemetry sits near the top, reputable news below it, disclosure and leak feeds lower, and an unverified forum post lower still, present in the timeline but never load-bearing. When two sources disagree, the tier decides which one anchors the entry. ThreatSpire applies the same logic when it weights a vendor advisory above an unverified post.
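A rough sketch of the two-question scoring, using the Admiralty letters for source reliability (A to F) and digits for information credibility (1 to 6). Folding the pair into high, medium, and low tiers, and the thresholds used to do it, are assumptions for illustration only; a team would set its own cut-offs.

```python
RELIABILITY = tuple("ABCDEF")   # A = completely reliable ... E = unreliable, F = cannot be judged
CREDIBILITY = tuple("123456")   # 1 = confirmed ... 5 = improbable, 6 = cannot be judged


def tier(reliability: str, credibility: str) -> str:
    """Fold two independent ratings into a coarse tier (illustrative thresholds)."""
    if reliability not in RELIABILITY or credibility not in CREDIBILITY:
        raise ValueError("expected a rating pair such as ('B', '2')")
    if reliability == "F" or credibility == "6":
        return "low"                       # unjudgeable: listed, never load-bearing
    r = RELIABILITY.index(reliability)     # 0 (A) .. 4 (E)
    c = CREDIBILITY.index(credibility)     # 0 (1) .. 4 (5)
    worst = max(r, c)                      # the weaker axis caps the tier
    if worst <= 1:
        return "high"
    if worst == 2:
        return "medium"
    return "low"


# A normally solid vendor publishing a thin attribution still lands low.
solid_source_thin_claim = tier("A", "4")
own_telemetry = tier("A", "2")
```

Taking the weaker of the two axes is one conservative choice among several; the point is only that the two ratings are recorded separately before they are combined.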

3. Order by event date, then separate sequence from cause

Order events by event date, and watch for the mistake this whole method exists to prevent: reading sequence as causation. A credential dump posted on the 3rd followed by a ransomware deployment on the 11th does not establish that those credentials enabled that intrusion. The timeline shows order. Whether one caused the other is an assessment, and it belongs in the assessment field with a confidence level, not in the spine of the timeline as reported fact.
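To make that separation concrete, here is a small sketch in which sorting produces only an order, and a causal claim has to be created as its own assessment record with a stated basis. The event IDs and the `causal_link` helper are hypothetical names invented for this example.

```python
from datetime import date

events = [
    {"id": "E2", "event_date": date(2026, 5, 11),
     "reported": "Ransomware deployed at two manufacturing subsidiaries"},
    {"id": "E1", "event_date": date(2026, 5, 3),
     "reported": "Stolen credentials posted to forum thread tied to actor alias"},
]

# The spine of the timeline: event-date order and nothing more.
timeline = sorted(events, key=lambda e: e["event_date"])


def causal_link(cause, effect, confidence, basis):
    """Record 'cause enabled effect' as an assessment, never as a reported fact."""
    if cause["event_date"] > effect["event_date"]:
        raise ValueError("a cause cannot postdate its effect")
    if not basis:
        # Ordering is the one thing every pair of events already has.
        raise ValueError("ordering alone is not evidence; cite what links the two events")
    return {
        "type": "assessment",
        "claim": f"{cause['id']} enabled {effect['id']}",
        "confidence": confidence,
        "basis": list(basis),
    }
```

Calling `causal_link(timeline[0], timeline[1], "low", [])` raises, which is the intended behavior: with no linking evidence, the timeline shows the 3rd before the 11th and says nothing else.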

4. Map to ATT&CK at the event level

Technique mapping belongs to individual events, tied to behavior a source actually described, not applied across the narrative because the campaign feels like a known group. Reported registry run-key persistence maps to T1547.001 because a source observed it. Reported beaconing over web protocols to a fast-flux cluster maps to T1071.001 for the same reason; the fast-flux resolution itself is a separate technique (T1568.001) and gets tagged only if a source described it. If you're mapping a technique you can't tie to a reported behavior, you're mapping a guess, and it goes in the assessment field labeled as one.
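Event-level tagging can be sketched as a function that only attaches a technique when the analyst can quote the reported behavior it rests on. The `tag_event` helper and the dictionary layout are assumptions for illustration; the substring check is a crude stand-in for the analyst pointing at the source's wording.

```python
import re

# ATT&CK technique IDs look like T1547 or T1547.001.
TECHNIQUE_ID = re.compile(r"^T\d{4}(\.\d{3})?$")


def tag_event(event, technique_id, reported_behavior=None):
    """Attach a technique to an event only if it ties to reported behavior."""
    if not TECHNIQUE_ID.match(technique_id):
        raise ValueError(f"not a technique ID: {technique_id!r}")
    tagged = dict(event)
    if reported_behavior and reported_behavior in event["reported_activity"]:
        tagged["techniques"] = event.get("techniques", []) + [technique_id]
    else:
        # No reported behavior behind it: this is a guess, and it is labeled as one.
        tagged["assessment_notes"] = event.get("assessment_notes", []) + [
            f"Hypothesized {technique_id}; not described by the source"
        ]
    return tagged


event = {"reported_activity": "Registry run-key persistence on compromised endpoints"}
observed = tag_event(event, "T1547.001", reported_behavior="run-key persistence")
guessed = tag_event(observed, "T1021.001")  # lateral movement nobody reported
```

After both calls, `T1547.001` sits with the reported event, while `T1021.001` appears only as an assessment note.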

5. Layer the assessment with calibrated confidence

Now write the conclusions, in calibrated language. "Moderate confidence" and "high confidence" are not interchangeable hedges. Pick a scale (the ICD 203 bands work) and use it honestly. Each assessment should point back at the specific events behind it and name the ones that would change your mind if they turned out differently.
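One hedged way to hold an assessment to its sourcing is to require evidence pointers and cap confidence at what the best supporting tier can bear. The cap rule, the event IDs, and the class names below are assumptions for the sketch, not part of ICD 203 or of any tool.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Confidence(IntEnum):
    LOW = 1
    MODERATE = 2
    HIGH = 3


# Assumed rule for this sketch: the strongest supporting tier sets the ceiling.
TIER_CEILING = {"high": Confidence.HIGH, "medium": Confidence.MODERATE, "low": Confidence.LOW}


@dataclass
class Assessment:
    statement: str
    confidence: Confidence
    evidence: list                                        # IDs of supporting events
    would_change_if: list = field(default_factory=list)   # what would change your mind


def validate(assessment, tier_by_event):
    """Reject conclusions with no evidence or more confidence than the sourcing bears."""
    if not assessment.evidence:
        raise ValueError("an assessment must point at the events behind it")
    ceiling = max(TIER_CEILING[tier_by_event[e]] for e in assessment.evidence)
    if assessment.confidence > ceiling:
        raise ValueError(f"sourcing supports at most {ceiling.name.lower()} confidence")
    return True


tiers = {"E-0511": "medium"}   # hypothetical ID for the second-hand ransomware report
claim = Assessment(
    statement="Ransomware was deployed at two manufacturing subsidiaries",
    confidence=Confidence.MODERATE,
    evidence=["E-0511"],
    would_change_if=["vendor telemetry confirms or contradicts the deployment"],
)
ok = validate(claim, tiers)
```

Raising the same claim to `Confidence.HIGH` without adding better-tiered evidence fails validation.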

A worked example

SILVERSCALE-117, reported across several sources over six weeks. The timeline, abbreviated:

| Event date | Reported activity | Source | Tier | Assessment |
|---|---|---|---|---|
| 22 Apr 2026 | Registry run-key persistence on compromised endpoints (T1547.001) | Vendor-X research | High (vendor telemetry) | Observed; consistent with later activity |
| 03 May 2026 | Stolen credentials posted to darkforum thread tied to actor alias | Disclosure feed | Low (unverified) | Alias linkage only; do not assume these creds drove later intrusions |
| 11 May 2026 | Ransomware deployed at two manufacturing subsidiaries (T1486) | Disclosure feed | Medium | Impact reported second-hand; corroboration pending |
| 18 May 2026 | Beaconing to fast-flux .top cluster, JA3 matches prior campaign (T1071.001) | Vendor-X research | High (vendor telemetry) | Strong infrastructure link to earlier SILVERSCALE activity |
| 26 May 2026 | Phishing wave vs. Nordic logistics, weaponized ISO lures (T1566.001) | Vendor-X research | High (vendor telemetry) | Observed; new loader variant |

The persistence, C2, and phishing entries are high-tier: a named vendor's own telemetry, each tied to a specific behavior and an ATT&CK technique. The JA3 match on the 18th is what actually links this run to earlier SILVERSCALE activity, so the attribution rests on that entry rather than on the actor's name turning up in a headline.

The 3rd and the 11th are weaker, and the timeline makes that visible. The credential dump is an unverified forum post; it earns a row because the alias is worth tracking, but it anchors nothing. The ransomware on the 11th came from a disclosure feed reporting impact second-hand, still uncorroborated. So when the hunt lead asks where the ransomware claim came from, the answer is specific: a disclosure feed, not yet backed by vendor telemetry, which is why confidence in that detail is moderate rather than high. And the 3rd-before-11th ordering is not evidence the dumped credentials drove the deployment.

The gap is a finding in its own right. Nothing in the reporting covers how the actor got from initial access to impact: no lateral movement, no privilege escalation, no exfil. You record the absence instead of assuming the techniques you'd expect to see. That line tells the next analyst what to go collect.

Every claim ties to a source and a tier, the weak links read as weak, the causal leap is refused, and the gap is on the page. When someone asks how you know, you point at the rows.
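"Pointing at the rows" can be as simple as a lookup. The sketch below encodes the five rows from the table as plain tuples and answers the hunt lead's question from the data; the `provenance` function and the tuple layout are illustrative assumptions.

```python
from datetime import date

# (event date, reported activity, technique, source, tier, assessment)
ROWS = [
    (date(2026, 4, 22), "Registry run-key persistence on compromised endpoints",
     "T1547.001", "Vendor-X research", "high", "Observed; consistent with later activity"),
    (date(2026, 5, 3), "Stolen credentials posted to darkforum thread tied to actor alias",
     None, "Disclosure feed", "low",
     "Alias linkage only; do not assume these creds drove later intrusions"),
    (date(2026, 5, 11), "Ransomware deployed at two manufacturing subsidiaries",
     "T1486", "Disclosure feed", "medium", "Impact reported second-hand; corroboration pending"),
    (date(2026, 5, 18), "Beaconing to fast-flux .top cluster, JA3 matches prior campaign",
     "T1071.001", "Vendor-X research", "high",
     "Strong infrastructure link to earlier SILVERSCALE activity"),
    (date(2026, 5, 26), "Phishing wave vs. Nordic logistics, weaponized ISO lures",
     "T1566.001", "Vendor-X research", "high", "Observed; new loader variant"),
]


def provenance(event_date):
    """Answer 'how do you know?' for the entry on a given event date."""
    for d, activity, _technique, source, tier, assessment in ROWS:
        if d == event_date:
            return f"{activity}: reported by {source} (tier: {tier}). Assessment: {assessment}"
    return "No entry for that date; if something should be there, log it as a gap."


# Only high-tier rows are allowed to anchor a conclusion in this sketch.
load_bearing = [row for row in ROWS if row[4] == "high"]
answer = provenance(date(2026, 5, 11))
```

The answer for the 11th names the disclosure feed and its medium tier, and the credential dump never appears among the load-bearing rows.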

What you get out of it

A timeline built this way holds up under questioning, because each conclusion traces to a cited, tiered source. It keeps thin sourcing visible instead of setting a forum post in the same font as a vendor advisory. It keeps confidence honest, because the structure makes you write "moderate" when the sourcing is moderate. And it hands off without a verbal briefing: the next analyst, the hunt lead, or the engineer writing a detection gets your evidence and your reasoning as separate things and can re-examine the assessment on its own terms. Mark it with the right TLP level and dissemination takes care of itself.

This is the shape of ThreatSpire's Evidence Timeline. It pulls in reporting from news, vendor research, and disclosure feeds, deduplicates it, tiers it by source, tags it to ATT&CK, and keeps it current per actor. The tool enforces the discipline; it doesn't supply it. The analyst who keeps reporting and assessment separate will produce defensible work in a spreadsheet, and better software won't rescue the one who doesn't.

None of this is sophisticated. It comes down to a habit: keep what was reported separate from what you concluded, and note where each piece came from. The work holds up because of that separation, not because of anything clever in how it's written.
