The Problem in One Sentence
Brand24 makes decisions, but the decisions don’t remember themselves. The same questions resurface because the answers live in someone’s Slack thread, someone’s head, or someone’s last day.
Why Traditional Solutions Fail
| Approach | Why it doesn’t stick |
|---|---|
| “Let’s use Confluence/Notion” | Creates a second place to look. People default to Slack. Wiki decays within 90 days. |
| “Let’s document everything” | Writing docs is overhead. Nobody writes them under deadline pressure. They go stale. |
| “Let’s record meetings” | 45-minute recordings nobody rewatches. The decision is in minute 37. |
| “Let’s use AI search” | Searching garbage returns garbage faster. The input quality problem is upstream. |
The core issue is not storage. It is capture discipline - making it effortless to log the decision at the moment it’s made, and effortless to find it at the moment it’s needed.
The Architecture: Three Layers
CAPTURE ──────────> MEMORY ──────────> RETRIEVAL
(where decisions    (where they        (how people
 enter the system)   live forever)      find them)
Layer 1 - Capture (works inside tools Brand24 already uses)
The Decision Card. One structured template. Eight fields. Takes 60 seconds to fill.
| Field | Example |
|---|---|
| What we decided | Switch billing provider from Stripe to Paddle |
| Why | VAT handling for EU customers - saves 15h/month finance |
| What we considered | Option A: Stripe Tax (too expensive at scale) / Option B: manual handling (doesn’t scale past 500 customers) / Option C [chosen]: Paddle (best fit - handles EU VAT natively) |
| Second choice | Option A (Stripe Tax) - would reconsider if price drops below $500/mo |
| Who decided | @tomek, @ania, approved by @michal in #payments 2026-04-10 |
| What would reverse this | Paddle fees exceed 5% of MRR OR Stripe Tax drops below $500/mo |
| Confidence | 0.85 (strong data on costs; unknown: Paddle support quality) |
| Expires / review date | 2026-10-10 (6 months) |
Why alternatives matter. The card doesn’t just record what was decided - it records what was rejected and why. When someone asks “why don’t we just use Stripe Tax?” six months later, the answer is already there. The second choice is explicit because it’s the one most likely to be relitigated. This pattern comes from a decision log I’ve maintained for 200+ decisions - the single most valuable field is not “what we decided” but “what we almost decided instead.”
Where it lives: a Slack workflow triggered by /decision in any channel. The card posts to a dedicated #decisions channel AND writes to the memory layer. Zero context-switching. Zero new tools to learn. 60 seconds of effort produces a permanent, searchable, AI-retrievable record.
Meeting capture variant: At the end of every meeting, one person fills the card. Not a transcript. Not minutes. One card per decision made. If no decision was made, the meeting was information exchange and doesn’t need a card (this distinction matters - it prevents documentation bloat).
Layer 2 - Memory (the knowledge graph)
Decision Cards are text. Text is searchable but not relatable. The memory layer adds structure - turning isolated cards into a navigable graph where decisions connect to each other, to the people who made them, and to the topics they affect.
The MemPalace architecture. I’ve built and used a semantic knowledge graph called MemPalace for 18 months of daily work. It stores 20,000+ entries across a graph of rooms (topics), drawers (entries), and typed relationships - with semantic search, deduplication-by-design (the system searches before writing to prevent duplicates), and knowledge graph facts that link entities together. The architecture is proven. Here is how it maps to Brand24:
┌──────────┐      ┌─────────────────────────────────────┐
│  Slack   │      │           KNOWLEDGE GRAPH           │
│ /decision│─────>│                                     │
└──────────┘      │ [Decision]──supersedes──>[Decision] │
                  │     │                        │      │
┌──────────┐      │  made_by               depends_on   │
│ Meeting  │─────>│     │                        │      │
│  notes   │      │ [Person]               [Decision]   │
└──────────┘      │     │                        │      │
                  │  member_of                topic     │
                  │     │                        │      │
                  │ [Team]───────owns──────>[Topic]     │
                  └─────────────────────────────────────┘
Four node types and their typed relationships:
| Node type | Example | Relationships |
|---|---|---|
| Decision | “Switch to Paddle” | supersedes, depends_on, contradicts, relates_to |
| Person | @tomek | made_by (links decisions to their author) |
| Team | #payments | owns (links teams to topic areas) |
| Topic | “Billing” | topic (clusters decisions into navigable domains) |
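The schema above can be sketched as plain adjacency triples; a production version would sit in a real graph store (the implementation table below names Neo4j or Dgraph), and the node IDs here are illustrative:

```python
# Minimal sketch of the four node types and typed edges as triples.
# Node IDs and edge data are illustrative, not a prescribed schema.
nodes = {
    "DEC-047":   {"type": "Decision", "title": "Switch to Paddle"},
    "DEC-031":   {"type": "Decision", "title": "Stay on Stripe"},
    "@tomek":    {"type": "Person"},
    "#payments": {"type": "Team"},
    "Billing":   {"type": "Topic"},
}

edges = [
    ("DEC-047", "supersedes", "DEC-031"),
    ("DEC-047", "made_by", "@tomek"),
    ("DEC-047", "topic", "Billing"),
    ("@tomek", "member_of", "#payments"),
    ("#payments", "owns", "Billing"),
]

def follow(source: str, relation: str) -> list[str]:
    """Return targets reachable from `source` via edges of `relation`."""
    return [t for s, r, t in edges if s == source and r == relation]

def decided_by(person: str) -> list[str]:
    """'Show me everything @tomek decided' - walk made_by edges backward."""
    return [s for s, r, t in edges if r == "made_by" and t == person]
```

Every query in the next section is a walk over edges like these; the node data itself stays on the Decision Card.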
What this enables that flat search cannot:
- “Show me everything @tomek decided” - follows the made_by edges from a person node. When Tomek goes on vacation, his replacement sees the full decision trail.
- “What decisions does the Paddle migration depend on?” - follows depends_on edges. Shows the chain of linked decisions. If one upstream decision changes, the downstream ones surface for review.
- “What did we supersede?” - the supersedes edge preserves history. The old decision isn’t deleted; it’s marked as replaced. You can always see what used to be true and why it changed.
- Deduplication by search-before-write - when someone types /decision and starts entering “billing provider,” the system searches the graph first and surfaces: “Existing decision DEC-047: Switch to Paddle (April 10). Update this one instead?” This prevents the fragmentation that kills wikis.
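The search-before-write check can be sketched in a few lines. Word-set overlap stands in for real semantic search here, and the threshold value is an assumption:

```python
# Sketch of search-before-write: before saving a new card, score the
# draft title against existing titles and surface near-duplicates.
# Jaccard word overlap is a stand-in for embedding-based search.
def jaccard(a: str, b: str) -> float:
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def search_before_write(draft_title: str, existing: dict, threshold: float = 0.3):
    """Return (card_id, title) pairs similar enough to prompt an update."""
    scored = [(cid, t, jaccard(draft_title, t)) for cid, t in existing.items()]
    scored.sort(key=lambda h: -h[2])
    return [(cid, t) for cid, t, s in scored if s >= threshold]

existing = {
    "DEC-047": "switch billing provider to Paddle",
    "DEC-019": "kill the mobile app",
}
matches = search_before_write("billing provider switch", existing)
# Any match prompts: "Existing decision DEC-047 ... Update this one instead?"
```

If the list is non-empty, the workflow offers an update or an explicit supersede instead of silently creating a duplicate node.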
The Decision Timeline View
Beyond search, the graph enables a visual timeline - decisions as nodes on a horizontal axis, filterable by team, topic, person, or confidence level.
2026-Q1                                2026-Q2
──●────────●────────●──────────●────────●──
  │        │        │          │        │
  │        │        │          │        └─ DEC-052: API rate limits
  │        │        │          │             confidence: 0.9
  │        │        │          │             second choice: per-IP throttling
  │        │        │          │
  │        │        │          └─ DEC-047: Switch to Paddle [ACTIVE]
  │        │        │               supersedes DEC-031
  │        │        │               confidence: 0.85
  │        │        │               second choice: Stripe Tax
  │        │        │               review: 2026-10-10
  │        │        │
  │        │        └─ DEC-031: Stay on Stripe [SUPERSEDED by DEC-047]
  │        │
  │        └─ DEC-028: Hire senior backend
  │             confidence: 0.7
  │             second choice: promote junior + contractor
  │
  └─ DEC-019: Kill the mobile app
       confidence: 0.95
       second choice: outsource maintenance only
Each node expands to show:
- The chosen option (highlighted)
- Alternatives considered with reasons for/against each
- The explicit second choice (the “Option B” - the one most likely to be revisited)
- Confidence score (color-coded: green >0.8, yellow 0.5-0.8, red <0.5)
- Status: ACTIVE / SUPERSEDED / EXPIRED / UNDER REVIEW
- Review date (if set)
Filters: by team, by topic, by person, by date range, by confidence, by status. A product manager sees only product decisions. A new hire in engineering sees only engineering decisions from the last 6 months. The CEO sees everything, color-coded by confidence.
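The filter logic and the confidence colour bands are simple enough to sketch directly. The card dicts and field names here are illustrative stand-ins:

```python
# Sketch of timeline filtering plus the colour bands described above
# (green >0.8, yellow 0.5-0.8, red <0.5). Card data is illustrative.
def confidence_color(score: float) -> str:
    if score > 0.8:
        return "green"
    if score >= 0.5:
        return "yellow"
    return "red"

def filter_timeline(cards, team=None, topic=None, status=None):
    """Return cards matching every filter that is set; None means 'any'."""
    out = []
    for c in cards:
        if team and c["team"] != team:
            continue
        if topic and c["topic"] != topic:
            continue
        if status and c["status"] != status:
            continue
        out.append(c)
    return out

cards = [
    {"id": "DEC-047", "team": "#payments", "topic": "Billing",
     "status": "ACTIVE", "confidence": 0.85},
    {"id": "DEC-031", "team": "#payments", "topic": "Billing",
     "status": "SUPERSEDED", "confidence": 0.6},
]
active_billing = filter_timeline(cards, topic="Billing", status="ACTIVE")
```

A product manager's view is just `filter_timeline(cards, team="#product")`; the CEO's view passes no filters and colours every node by `confidence_color`.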
Implementation options (ascending complexity):
| Option | Tool | Effort | Best for |
|---|---|---|---|
| Minimal | Google Sheet + Apps Script + Slack bot (flat, no graph) | 1 week | <50 people, immediate start, prove the capture habit first |
| Standard | JSONL store + vector embeddings + Slack integration + simple timeline UI | 3-4 weeks | 50-200 people; adds search and timeline without full graph |
| MemPalace | Multi-tenant semantic graph (Neo4j or Dgraph) + vector search + Slack + timeline + AI retrieval | 6-8 weeks | Full architecture; graph relationships, deduplication, onboarding digests, cross-domain reasoning |
My recommendation for Brand24: Start Minimal in week 1 (one team, Google Sheet, Slack /decision command). Prove the capture habit works. Then build toward the MemPalace option over weeks 5-8 using the data already captured. The graph is only as good as the data in it - and the data comes from the capture habit, not the technology.
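The Minimal option's core loop is small enough to sketch: append a record, search it later. The actual Minimal build would be a Google Sheet + Apps Script (JavaScript); Python is used here only as a stand-in, and the file path and record shape are assumptions:

```python
import json

# Sketch of a minimal flat decision log: append-only JSONL plus
# substring search. A Sheet + Apps Script version is the same two
# operations. The path "decisions.jsonl" is a hypothetical default.
LOG = "decisions.jsonl"

def log_decision(record: dict, path: str = LOG) -> None:
    """Append one decision card as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

def search_decisions(query: str, path: str = LOG) -> list[dict]:
    """Case-insensitive substring search over all logged cards."""
    try:
        with open(path, encoding="utf-8") as f:
            rows = [json.loads(line) for line in f if line.strip()]
    except FileNotFoundError:
        return []
    q = query.lower()
    return [r for r in rows if q in json.dumps(r).lower()]
```

That is the entire week-1 surface area: everything else (graph, embeddings, timeline) layers on top of data captured this way.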
Layer 3 - Retrieval (AI-native from day one)
The retrieval layer is where AI earns its keep. Not as a chatbot. As a search interface that understands context.
Three retrieval modes:
1. Direct search: “What did we decide about billing?” → returns the Decision Card, not a Slack thread. Takes 3 seconds instead of 15 minutes of scrolling.
2. Contextual suggestion: When someone starts a Slack thread about billing, the system surfaces: “Related decision from 2026-04-10: Switch to Paddle. Made by @tomek. Review date: October.” This prevents re-litigation. The person can challenge the decision, but they know it exists.
3. Onboarding digest: New hire joins the #product team. System generates: “Here are the 12 active decisions in your domain, ordered by impact. 3 are up for review this quarter.” The new hire has institutional context on day 1 instead of month 3.
Layer 4 - Decision Quality (the AI doesn’t just store - it challenges)
The first three layers solve the memory problem: decisions are captured, stored, and findable. But Brand24’s problem statement also names “powtarzalne decyzje” - repetitive decisions. Repetitive decisions are not just a retrieval failure (“we didn’t know we already decided this”). They are also a quality failure (“we decided this, but nobody was confident enough to defend it when challenged”).
The fourth layer adds AI-assisted decision analysis at the moment of decision-making - not after.
When someone fills the Decision Card, the system runs three checks before saving:
Check 1 - Prior Decision Search (automatic)
Before the card is saved, the system searches the graph: “Has this topic been decided before?” If yes, it surfaces the prior decision with its alternatives, confidence, and review date. The person either updates the existing decision or explicitly supersedes it. This is the deduplication layer - it catches the “we already decided this” case.
Check 2 - Adversary Test (AI-generated)
The system takes the chosen option and generates the strongest counter-argument. Not a generic “are you sure?” - a specific, steel-manned objection based on the alternatives the person listed.
Example:
You chose: Switch to Paddle for EU VAT handling
Strongest counter: Stripe Tax launched a flat-rate EU option in March 2026 at $0.50/transaction. At Brand24’s transaction volume (~3,000/month), that’s $1,500/month vs Paddle’s 5% of revenue. Have you priced Stripe Tax at current rates, not the rates from when this was last discussed?
This is the Zbigniew principle: every decision gets tested by an adversary before it’s locked in, not after. The adversary isn’t trying to block the decision - it’s trying to make sure the deciding team has seen the strongest version of the argument against it. If the counter-argument doesn’t change the decision, the confidence score goes up. If it does, the team just avoided an expensive mistake.
Check 3 - Bias Scan (pattern detection)
The system checks the decision card text for common cognitive bias patterns:
| Signal in the card text | Likely bias | System response |
|---|---|---|
| “We’ve already invested three months in…” | Sunk cost fallacy | “Would you make this same decision if you were starting from zero today?” |
| “Everyone agrees this is the right approach” | Groupthink | “Who in the room has the strongest reason to disagree? Have they spoken?” |
| “It should only take two weeks” | Planning fallacy | “What took twice as long as expected last quarter? Apply that multiplier.” |
| “The competitor is doing it” | Bandwagon effect | “Does the competitor have the same constraints, team size, and customer base?” |
| “We need to move fast on this” | Urgency bias | “What is the actual cost of deciding next week instead of today?” |
The bias scan doesn’t block the decision. It adds a one-line flag to the card: bias_flags: [planning_fallacy, urgency_bias]. Over time, the team can see which biases they’re most prone to - and the patterns become self-correcting.
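The Tier 1 scan can be sketched as phrase patterns mapped to flags. The patterns below are illustrative trigger phrases, not the full detector set:

```python
import re

# Sketch of the bias scan: each flag is keyed to a trigger phrase from
# the table above. Patterns are illustrative; a real scan would use
# many variants per bias. The scan flags - it never blocks.
BIAS_PATTERNS = {
    "sunk_cost": r"already invested",
    "groupthink": r"everyone agrees",
    "planning_fallacy": r"should only take",
    "bandwagon": r"competitor is doing",
    "urgency_bias": r"move fast",
}

def bias_scan(card_text: str) -> list[str]:
    """Return a flag for every pattern found in the card text."""
    text = card_text.lower()
    return [flag for flag, pat in BIAS_PATTERNS.items()
            if re.search(pat, text)]

flags = bias_scan(
    "We've already invested three months in this and need to move fast."
)
```

The result is exactly the `bias_flags` line on the card; aggregating those lists over months is what makes the team's recurring biases visible.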
Two Tiers: Everyday Decisions vs. High-Stakes Decisions
Not every decision needs the same depth. The system recognizes two tiers:
Tier 1 - Everyday decisions (80% of all decisions, 90 seconds total)
Which Slack channel to use. Whether to attend a conference. Which bug to prioritize. Sprint scope. The Decision Card + adversary test + bias scan is sufficient. Fast, lightweight, gets the habit going.
Tier 2 - High-stakes decisions (20%, reserved for what shapes the company)
Pricing model changes. Market entry. Senior hires. Technology stack. Partnerships. Acquisition offers. These get the full analytical engine - a structured multi-lens analysis that draws from three deep toolkits:
Toolkit A - Cognitive Biases Library (43 detectors)
The system includes a library of 43 cognitive biases organized by decision context. The bias scan in Tier 1 catches the 5 most common. Tier 2 runs the full set relevant to the decision type:
| Decision type | Key biases to test | What they catch |
|---|---|---|
| Pricing / Revenue | Anchoring, Framing Effect, Endowment Effect, Loss Aversion | “We can’t lower the price” when the anchor is arbitrary |
| Hiring | Halo Effect, Similarity Bias, Courtesy Bias, Recency Bias | Hiring people who feel right instead of who are right |
| Product / Roadmap | Sunk Cost, IKEA Effect, Planning Fallacy, Survivorship Bias | Building what we’ve invested in instead of what users need |
| Strategy / Market | Confirmation Bias, Bandwagon Effect, Optimism Bias, Status Quo Bias | Following the market instead of reading it |
| Technology | Shiny Object Syndrome, Not-Invented-Here, Automation Bias | Adopting tools because they’re new, not because they solve the problem |
Each bias detector is a specific question, not a label. “Anchoring detected” is useless. “Your price benchmark is last year’s contract - would you set the same price if you were pricing from scratch today?” is actionable.
Toolkit B - Thinking Tools (26 mental models)
For Tier 2 decisions, the system offers structured thinking tools - mental models that force the decision-maker to look at the problem from a specific angle:
| Model | When to use it | The question it forces |
|---|---|---|
| Inversion | Any decision | “What would make this fail? Work backward from failure.” |
| Second-Order Effects | Product, strategy | “What happens AFTER the first consequence?” |
| Circle of Competence | Build vs buy, hiring | “Do we actually know this domain, or are we assuming we do?” |
| Margin of Safety | Budget, timeline, hiring | “What buffer do we need if our estimates are wrong by 2x?” |
| Opportunity Cost | Any “should we do X” | “What are we NOT doing by doing this?” |
| Reversibility Test | Any decision | “Can we undo this in 2 weeks? If yes, decide fast. If no, decide carefully.” |
The full set of 26 models is available in the system. For any given decision, the AI recommends 2-3 relevant models based on the decision type and context. The decision-maker doesn’t need to know all 26 - the system selects and applies them.
Toolkit C - PARDES: The Five-Reader Engine (for the decisions that define the company)
For the highest-stakes decisions - the ones where getting it wrong costs a quarter or a key person - the system offers a structured five-lens analysis called PARDES. Each lens reads the same decision from a different angle:
| Lens | Question | What it catches |
|---|---|---|
| Surface (Peshat) | What does the data literally say? | Hard numbers, verified facts, no interpretation yet |
| Pattern (Remez) | What converges from independent signals? | Multiple data sources pointing the same direction |
| Interpretation (Drash) | What must be true for this to make sense? Cui bono? | Hidden assumptions, who benefits from each option |
| Adversary (Sod) | What is the strongest argument AGAINST the preferred option? | Steel-manned counter-argument that must be addressed before deciding |
| Emergence | What appears from combining the other four that none showed alone? | The insight that only exists at the intersection |
Example - “Should we raise prices 20%?”
| Lens | Finding |
|---|---|
| Surface | Current MRR $X. Churn at Y%. Competitor Z charges 30% more. |
| Pattern | Three independent signals: support tickets mention “cheap” (customers undervalue), competitor raised prices and retained 94%, cost of new features exceeds current margin. |
| Interpretation | The pricing is subsidizing growth that has already happened. The customers who would churn at +20% are the customers with the highest support cost. |
| Adversary | “Raising prices during an economic downturn signals tone-deafness. Churn prediction models assume historical patterns; the macro environment is different. What if the 6% who churned from the competitor were the ones with the loudest social media voices?” |
| Emergence | The price increase is correct but the timing is the variable. Increase for new customers now; grandfather existing for 6 months; use the transition period to reduce support cost for the segment most likely to churn. The answer was not “yes/no to 20%” but “yes, phased, with a support intervention for the at-risk segment.” |
PARDES takes 30-60 minutes for a senior decision. It produces a one-page analysis that the decision card links to. It is not for every decision - it is for the 5-10 decisions per quarter that shape the company’s trajectory.
The Human Layer: Structured Dissent (AI challenges the logic; this protocol challenges the silence)
The AI adversary test catches bad reasoning. It cannot catch the person in the room who sees the problem but doesn’t speak. The most expensive decisions in any organization are not the ones where the analysis was wrong - they’re the ones where someone knew it was wrong and said nothing because the social cost of dissent was too high.
This is a solved problem. Multiple disciplines have built protocols for it. The system integrates the best of them:
Management 3.0 Delegation Poker - for the “who decides” question:
Before a Tier 2 decision, each person involved privately selects a card from seven levels:
| Level | Meaning |
|---|---|
| 1 - Tell | “I will decide and inform you” |
| 2 - Sell | “I will decide and explain why” |
| 3 - Consult | “I will ask your input, then decide” |
| 4 - Agree | “We decide together by consensus” |
| 5 - Advise | “You decide, I advise” |
| 6 - Inquire | “You decide, then tell me” |
| 7 - Delegate | “You decide, I don’t need to know” |
Cards are revealed simultaneously. The highest and lowest card holders explain their reasoning. This eliminates anchoring (nobody sees the boss’s card first) and surfaces the gap between how the team thinks authority is distributed and how it actually is. If the engineering lead picks “7 - Delegate” and the CEO picks “1 - Tell” on the same decision, the system has just surfaced a structural misalignment that no amount of Slack threading would have revealed.
The delegation level is recorded on the Decision Card. Over time, the graph shows patterns: “Engineering decisions average level 5 (team decides). Pricing decisions average level 2 (leadership decides and sells). Hiring decisions are bimodal (3 and 5, no consensus on who decides).” The bimodal ones are the decisions most likely to be relitigated.
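The bimodality pattern the graph surfaces can be computed directly from the recorded levels. This is a sketch under simple assumptions (a category is "bimodal" when two equally frequent levels sit at least two apart); the level data is illustrative:

```python
from collections import Counter

# Sketch of per-category delegation statistics. The bimodality rule
# (two tied modes >= 2 levels apart) is an illustrative heuristic,
# not a prescribed definition.
def delegation_profile(levels: list[int]) -> dict:
    counts = Counter(levels)
    top = max(counts.values())
    modes = [lvl for lvl, c in counts.items() if c == top]
    bimodal = len(modes) >= 2 and max(modes) - min(modes) >= 2
    return {"avg": sum(levels) / len(levels), "bimodal": bimodal}

profiles = {
    # engineering clusters around level 5 (team decides) - stable
    "engineering": delegation_profile([5, 5, 6, 4, 5]),
    # hiring splits between 3 (consult) and 5 (advise) - contested
    "hiring": delegation_profile([3, 3, 5, 5, 3, 5]),
}
```

Categories flagged bimodal are the ones where the team disagrees about who decides - the decisions most likely to be relitigated.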
Red Team Protocol (US Army doctrine) - for the “has this been stress-tested” question:
For Tier 2 decisions, one person is explicitly assigned the Red Team role: their job is to argue against the preferred option. Not as a personality trait (“she’s always the contrarian”) but as a structural assignment that rotates. This week, Tomek is the red team. Next week, Ania is. The role is the license.
This comes directly from US Army doctrine (The Red Team Handbook) and from the older Devil’s Advocate protocol formalized by Pope Sixtus V in 1587 for canonization decisions. In both cases, the insight is the same: dissent that depends on individual courage will fail. Dissent that is structurally mandated will succeed. The brave person who speaks up is a hero. A system that depends on heroes is a fragile system. A system that assigns the adversary role is antifragile.
The red team finding is recorded on the Decision Card alongside the adversary test. The AI generates a counter-argument from logic; the human red team generates a counter-argument from experience, politics, and the things the data doesn’t show.
The Concern Round - for the “is anyone uncomfortable” question:
Before a Tier 2 decision is finalized, the system prompts a structured round where each participant answers one question:
“What is your remaining concern about this decision, if any? If none, say ‘none.’”
The round proceeds in order. Each person speaks once. Nobody is interrupted. The concern is logged on the card.
This protocol draws from three traditions that independently discovered the same structure:
- Quaker decision-making (clerking → sense of the meeting → testing concerns → discernment → silence): the “testing concerns” phase is structurally identical - every member voices remaining reservations before the meeting “settles” into a decision.
- Sociocracy (consent-based governance): a decision proceeds not when everyone agrees, but when no one has a paramount objection. The distinction matters - consent is not consensus. “I wouldn’t choose this, but I can live with it and it won’t harm the organization” is consent. The concern round captures that gradient.
- Special operations after-action reviews: rank is explicitly suspended. The most junior operator’s observation about what went wrong carries equal structural weight to the commander’s. The protocol makes this suspension formal, not aspirational.
The power of the concern round is that it converts silence from ambiguous (does silence mean agreement or suppression?) to explicit (“none” is an active statement, not a default). Over time, the decision graph shows which topics produce the most concerns, which people’s concerns most often predict problems, and which decisions had zero concerns and still failed (the groupthink signature).
Why this layer matters for Brand24 specifically
Brand24 is a data company. It sells signal detection to its customers. Applying the same principle internally - detecting weak signals in your own decision-making process before they become expensive mistakes - is not just good practice. It’s the thesis of the company applied to itself.
The system layers six challenge mechanisms that work together:
| Layer | What it catches | Source |
|---|---|---|
| AI Adversary Test | Logical flaws, missing data, stale assumptions | Automated, runs on every card |
| Bias Scan | Cognitive bias patterns in the decision text | Automated, flags without blocking |
| Delegation Poker | Misalignment on who should be deciding | Human protocol, Tier 2 only |
| Red Team | Experience-based objections the data can’t show | Human role, rotated weekly |
| Concern Round | Suppressed doubt that would otherwise stay silent | Human protocol, before finalizing |
| PARDES Five-Reader | Emergent insight from combining all angles | Structured analysis, highest-stakes only |
The two-tier system matches how decisions actually flow: most are fast and need only the AI challenge (Tier 1, 90 seconds). Some are consequential and need the full human + AI toolkit (Tier 2, 30-60 minutes). The system knows the difference and offers the right depth at the right moment.
The Process Change (the hard part)
Technology is 20% of this. The other 80% is a single habit change:
Every decision gets a card. No exceptions. 60 seconds.
This is enforced not by policy memos but by making the card easier than the alternative. The alternative is: someone asks the same question in 3 months, nobody remembers, the team re-discusses for 45 minutes, and reaches the same conclusion. The card costs 60 seconds. The re-discussion costs 45 minutes times 4 people. The math does the enforcement.
Three process rules (proven in 18 months of daily use):
1. Search before decide. Before opening a new discussion, search the decision log. If a prior decision exists, the conversation starts from “should we change this?” not “what should we do?” This alone eliminates 30-40% of repeated discussions.
2. One decision, one card. Not meeting minutes. Not a summary doc. One card per decision. This prevents the “it’s somewhere in the Q1 planning doc” problem. Atomic decisions are findable. Paragraph-length summaries are not.
3. Expiry dates are mandatory. Every decision either has a review date or is marked permanent. No decision lives in undefined limbo. When the review date arrives, the system surfaces it. The team either reaffirms, revises, or kills it. This prevents institutional inertia from masquerading as institutional knowledge.
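The expiry sweep is one query: surface every card whose review date has arrived, with permanent cards exempt. A minimal sketch, with illustrative card data:

```python
from datetime import date

# Sketch of the review-date sweep. A review_date of None means the
# card is marked "permanent" and is never surfaced.
def due_for_review(cards: list[dict], today: date) -> list[str]:
    return [c["id"] for c in cards
            if c.get("review_date") is not None and c["review_date"] <= today]

cards = [
    {"id": "DEC-047", "review_date": date(2026, 10, 10)},
    {"id": "DEC-019", "review_date": None},  # marked permanent
]
due = due_for_review(cards, today=date(2026, 10, 12))
```

Run daily, the sweep posts each due card back into its team channel with three buttons: reaffirm, revise, or kill.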
90-Day Roadmap
| Week | Action | Outcome |
|---|---|---|
| 1-2 | Build Slack /decision workflow. Deploy to one team (suggest: product or engineering). | Capture mechanism live. First 20-30 decision cards logged. |
| 3-4 | Collect feedback. Adjust card fields. Deploy to 2-3 more teams. | ~100 cards. First “I found the answer in the decision log” moments. |
| 5-8 | Connect to AI retrieval layer (Claude API or equivalent). Enable contextual suggestions in Slack. | Search and suggestions active. Repeated-discussion rate measurable. |
| 9-12 | Build topic graph. Generate onboarding digests. First knowledge audit (what topics have decisions? what topics don’t?). | Full system operational. Baseline metrics for knowledge retention. |
What This Is Not
This is not a wiki project. Wikis are where knowledge goes to die, maintained by the one person who cares until they leave.
This is not an AI chatbot project. Chatbots answer questions about knowledge that has already been captured. This system captures the knowledge in the first place.
This is a decision infrastructure project. It makes the organization’s decisions visible, searchable, and persistent - regardless of who made them, where the conversation happened, or whether the person who made them still works here.
The AI is the retrieval layer, not the capture layer. Humans make the decisions. The system makes the decisions findable. That distinction is the whole architecture.
Beyond Internal Use: The Productization Opportunity
Everything described above solves Brand24’s internal knowledge problem. But it also creates something Brand24 could sell.
Brand24 already sells signal detection to marketers: “here’s what the internet is saying about your brand.” The decision intelligence toolkit is the adjacent product: “here’s how to make better decisions about what to do with those signals.”
Every company that uses Brand24 for social listening faces the same downstream problem - the signals arrive, meetings happen, decisions are made, and three months later nobody remembers what was decided or why. Brand24 captures the external signals. The decision toolkit captures the internal response. Together, they close the loop from signal to action to institutional memory.
What this could look like as a product feature:
- Brand24 customer receives a spike in negative sentiment about a product launch
- The team discusses response options in Slack
- /decision captures the response strategy with alternatives considered
- Adversary test challenges: “Your competitor had a similar crisis last quarter and chose option B - here’s what happened”
- Six months later, the next product launch team searches “how did we handle the sentiment spike?” and gets the Decision Card, not a Slack archive
The moat: every tool on the market helps companies find past decisions (Notion, Guru, Confluence). None of them help companies make better decisions at the moment of deciding. The adversary test, bias scan, PARDES engine, and structured dissent protocols - that is the layer nobody has built. The product is the challenge layer, not the archive.
Brand24 is uniquely positioned to build this because:
- You already have the customer base (companies that need to act on intelligence)
- You already have the AI infrastructure (NLP, sentiment analysis, signal detection)
- The internal deployment becomes the proof-of-concept for the external product
- The pricing model extends naturally: social listening (current) + decision intelligence (new tier)
This turns the knowledge management project from a cost center into a product R&D investment.
Built on patterns from 18 months of human-AI partnership research (RAZEM framework), a live semantic knowledge graph with 20,000+ entries (MemPalace), and a decision log with 200+ structured entries including explicit alternatives, confidence scores, and second-choice tracking. The protocols draw from Management 3.0 (Delegation Poker for authority clarity), US Army Red Team doctrine (structural dissent as an assignment, not a personality trait), Quaker decision-making (the testing-concerns phase), sociocratic consent governance, and special operations after-action review (rank suspension during evaluation). The “alternatives + second choice + confidence” card design comes from a log I’ve maintained daily since November 2025 - the most valuable field turned out to be not “what I decided” but “what I almost decided instead and why I didn’t.” This proposal scales that insight to organizational level.