Open Source Python MIT

Ensoul

Name: Ensoul
Author: Knowlyr

The Operating System for Digital Civilization's Organizational Layer


Updated 2026-04-17

AI workforce engine — Effective Agent = Identity + Experience + Deliberation. Soul identity system with persistent persona and auto-versioning, 16-module memory ecosystem covering storage, retrieval, semantics, and archival, 9 deliberation modes with 4 dialectical templates to break groupthink. Supports 7 LLM providers, multi-tenant isolation, and multi-channel reach via Feishu, WeCom, and Web.


Quick Start

Install

pip install ensoul[mcp]

Usage

# CLI
ensoul list
ensoul run code-reviewer main

Documentation

The Agent Paradox

A single AI Agent is impressive. It can write code, search the web, reason through complex problems, and use dozens of tools. But the moment you need a team of Agents to operate continuously — to build software across sprints, to make decisions that compound, to learn from mistakes and never repeat them — every organizational problem humanity has faced comes roaring back.

Amnesia: the agent that debugged a subtle race condition last Tuesday has no memory of it on Wednesday. Groupthink: three agents asked to review a design will converge on the same polite consensus, suppressing the dissent that would have caught the flaw. Identity fragmentation: the "senior engineer" persona is rebuilt from scratch each session, its personality drifting with temperature randomness. Governance vacuum: no one tracks which agent can deploy to production, which decisions need human approval, or whether last month's estimate of "two days" actually took eleven.

These are not implementation bugs. They are structural inevitabilities. Any group of intelligent agents that collaborates over time must reinvent organization. Humanity took millennia. We need to be faster.

Core Thesis

Existing multi-agent frameworks share a common, usually unstated assumption:

$$\text{Agent} = \text{Model} + \text{Tools} + \text{Prompt}$$

This is a "no-organization assumption." It treats each agent as a stateless function call — powerful in isolation, but structurally incapable of sustaining collaborative work over time. ensoul proposes an alternative formulation:

$$\text{Effective Agent} = \text{Identity} + \text{Experience} + \text{Deliberation}$$

These are not our invention. They are decades of organizational research — from cognitive psychology to management science — formalized into computable, version-controlled, protocol-native specifications.

Missing Element	Production Failure Mode	Research Basis	ensoul's Implementation
Persistent Identity	Personality rebuilt from scratch each session; unpredictable behavior	Personal identity theory (Parfit, 1984)	Soul system + declarative specs
Experiential Learning	Same mistakes repeated; no improvement from failure	Ebbinghaus (1885); RLHF (Christiano et al., 2017)	16-module memory ecosystem + evaluation loop + Skills auto-trigger
Cognitive Conflict	Groupthink; agents complement rather than challenge; declining decision quality	Janis (1972); Stasser & Titus (1985); Nemeth (1994)	9 dialectical modes + cognitive conflict constraints
Protocol Neutrality	Agent definitions locked to specific SDKs; migration cost $\propto$ definition complexity	Infrastructure as Code (Morris, 2016)	MCP-native, declarative YAML/Markdown

ensoul is not another orchestration framework. It is an operating system for the organizational layer of digital civilization — formalizing millennia of human organizational wisdom into AI-executable declarative specifications. 40 MCP tools, 3 transport protocols, 7 LLM providers, multi-channel reach via Feishu, WeCom, and Web.

Formal Framework

Employee Specification

Each AI employee is a declarative specification $e \in \mathcal{E}$, decoupled from code, version-trackable, and IDE-agnostic:

$$e = \langle \text{soul}, \text{name}, \text{model}, \text{tools}, \text{prompt}, \text{args}, \text{output}, \text{skills} \rangle$$

Where:

$\text{soul} \in \Sigma^*$ — Soul configuration (Markdown), defining the employee's persistent identity, personality, and behavioral principles; auto-versioned
$\text{model} \in \mathcal{M}$ = {claude-*, gpt-*, deepseek-*, kimi-*, gemini-*, glm-*, qwen-*} — Unified routing across 7 providers
$\text{tools} \subseteq \mathcal{T}$ — Available tool set, constrained by PermissionPolicy
$\text{prompt}: \Sigma^* \to \Sigma^*$ — Markdown template function with variable substitution and context injection
$\text{skills} \subseteq \mathcal{S}$ — Auto-trigger rule set defining scene-matching conditions and memory-loading strategies

Structured Dialectical Deliberation

The deliberation process is formalized as a 4-tuple $D = \langle P, R, \Phi, \Psi \rangle$:

Symbol	Definition	Description
$P = \{p_1, \ldots, p_n\}$	Participant set	$p_i = (\text{employee}, \text{role}, \text{stance}, \text{focus})$
$R = [r_1, \ldots, r_k]$	Round sequence	$r_j \in$ {`round-robin`, `cross-examine`, `steelman-then-attack`, `debate`, `vote`, ...}
$\Phi$	Disagreement constraint function	$\text{must\_challenge}(p_i) \subseteq P \setminus \{p_i\}$; $\text{max\_agree\_ratio}(p_i) \in [0, 1]$
$\Psi$	Tension seed set	Pre-seeded controversy points that force topic-space diversification

Key constraint: When $\Phi$ defines $\text{max\_agree\_ratio}(p_i) = \rho$, participant $p_i$ may agree with others' views in at most a proportion $\rho$ of the entire discussion, which forces cognitive conflict rather than groupthink. This corresponds to the Devil's Advocacy method from organizational decision research (Schwenk, 1990).
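The two constraints in $\Phi$ can be checked mechanically once each turn is labeled. The following is a minimal sketch, not ensoul's implementation: the `Turn` record, the `agrees` flag, and the choice to measure the ratio over a participant's own turns are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    """One labeled contribution in a discussion (hypothetical structure)."""
    speaker: str
    target: str    # whose view this turn responds to
    agrees: bool   # True if the turn agrees with the target


def agree_ratio(turns, speaker):
    """Fraction of the speaker's own turns that agree with someone else."""
    own = [t for t in turns if t.speaker == speaker]
    return sum(t.agrees for t in own) / len(own) if own else 0.0


def check_constraints(turns, speaker, must_challenge, max_agree_ratio):
    """Return human-readable violations of must_challenge / max_agree_ratio."""
    challenged = {t.target for t in turns if t.speaker == speaker and not t.agrees}
    problems = []
    missing = set(must_challenge) - challenged
    if missing:
        problems.append(f"must challenge: {sorted(missing)}")
    ratio = agree_ratio(turns, speaker)
    if ratio > max_agree_ratio:
        problems.append(f"agree ratio {ratio:.2f} exceeds {max_agree_ratio:.2f}")
    return problems


turns = [
    Turn("code-reviewer", "product-manager", agrees=True),
    Turn("code-reviewer", "product-manager", agrees=True),
    Turn("code-reviewer", "test-engineer", agrees=False),
]
problems = check_constraints(
    turns, "code-reviewer", must_challenge=["product-manager"], max_agree_ratio=0.6
)
print(problems)
```

Here the reviewer agrees in two of three turns and never challenges the product manager, so both constraints are reported as violated.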

Memory Evolution Model

Each memory $m$'s effective confidence decays over time, following the exponential model of Ebbinghaus's forgetting curve:

$$C_{\text{eff}}(t) = C_0 \cdot \left(\frac{1}{2}\right)^{t / \tau}$$

Where $C_0$ is initial confidence (default 1.0), $t$ is memory age in days, and $\tau$ is half-life (default 90 days). Retrieval ranks by $C_{\text{eff}}$; memories below threshold $C_{\min}$ are automatically culled.
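As a worked illustration of the decay formula, the sketch below computes $C_{\text{eff}}$ and applies the rank-then-cull step. The `Memory` class and the value of `C_MIN` are assumptions; the text does not state a default threshold.

```python
from dataclasses import dataclass

HALF_LIFE_DAYS = 90.0  # tau, the default half-life stated above
C_MIN = 0.1            # assumed cull threshold; the text gives no value


@dataclass
class Memory:
    content: str
    initial_confidence: float = 1.0  # C_0
    age_days: float = 0.0            # t


def effective_confidence(m, half_life=HALF_LIFE_DAYS):
    """C_eff(t) = C_0 * (1/2) ** (t / tau)."""
    return m.initial_confidence * 0.5 ** (m.age_days / half_life)


def rank_and_cull(memories, c_min=C_MIN):
    """Drop memories below c_min and return the rest, most confident first."""
    kept = [m for m in memories if effective_confidence(m) >= c_min]
    return sorted(kept, key=effective_confidence, reverse=True)


memories = [
    Memory("one half-life old", age_days=90),
    Memory("fresh finding", age_days=0),
    Memory("very old estimate", age_days=360),  # 0.5 ** 4 = 0.0625, below C_MIN
]
ranked = rank_and_cull(memories)
print([(m.content, effective_confidence(m)) for m in ranked])
```

With the default half-life, a memory loses half its confidence every 90 days, so the 360-day-old entry falls to 0.0625 and is culled under the assumed threshold.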

Semantic retrieval uses hybrid vector-keyword scoring:

$$\text{score}(q, m) = \alpha \cdot \cos(\mathbf{v}_q, \mathbf{v}_m) + (1 - \alpha) \cdot \text{keyword}(q, m), \quad \alpha = 0.7$$
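The scoring formula can be sketched in a few lines. The cosine term follows the formula directly; the `keyword_overlap` function is an assumed stand-in for $\text{keyword}(q, m)$, whose exact definition the text does not give.

```python
import math

ALPHA = 0.7  # weight on the vector term, as stated above


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def keyword_overlap(query, text):
    """Assumed keyword term: fraction of query words that appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_score(q_vec, m_vec, query, text, alpha=ALPHA):
    """alpha * cos(v_q, v_m) + (1 - alpha) * keyword(q, m)."""
    return alpha * cosine(q_vec, m_vec) + (1 - alpha) * keyword_overlap(query, text)


# Orthogonal vectors, but one of the two query words matches:
# 0.7 * 0 + 0.3 * 0.5
score = hybrid_score([1.0, 0.0], [0.0, 1.0], "css split", "css refactor notes")
print(round(score, 3))
```

The keyword term keeps a memory retrievable even when its embedding is a poor match, which is the usual motivation for hybrid scoring.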

Correction chains implement cognitive self-correction, corresponding to a computational model of memory reconsolidation: $\text{correct}(m_{\text{old}}, m_{\text{new}})$ marks $m_{\text{old}}$ as superseded ($C \leftarrow 0$) and creates a new correction-type entry ($C \leftarrow 1.0$).
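A minimal sketch of the $\text{correct}(m_{\text{old}}, m_{\text{new}})$ operation, assuming an in-memory dictionary as the store. The `MemoryEntry` fields, including the `superseded_by` provenance link, are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryEntry:
    id: int
    category: str
    content: str
    confidence: float = 1.0
    superseded_by: Optional[int] = None  # hypothetical provenance link


def correct(store, old_id, new_content):
    """Mark the old entry as superseded (C <- 0) and add a correction (C <- 1.0)."""
    old = store[old_id]
    new = MemoryEntry(id=max(store) + 1, category="correction", content=new_content)
    old.confidence = 0.0
    old.superseded_by = new.id
    store[new.id] = new
    return new


store = {1: MemoryEntry(1, "estimate", "CSS split estimated at 2 days")}
fix = correct(store, 1, "CSS split actually took 5 days")
print(store[1].confidence, store[1].superseded_by, fix.category, fix.confidence)
```

Because the old entry is kept with zero confidence rather than deleted, the chain from estimate to correction stays auditable.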

Evaluation Feedback Loop

Drawing on the core mechanism of RLHF — human feedback directly shaping agent behavior (Christiano et al., 2017):

track(employee, category, prediction) → Decision d
    │
    ▼  Execute + observe actual outcome
evaluate(d, outcome, evaluation) → MemoryEntry m_correction
    │
    ▼  m_correction auto-injected into the employee's subsequent inference context
employee.next_inference(context ∪ {m_correction})

Three decision categories: estimate / recommendation / commitment. Evaluation conclusions are automatically written as correction entries into persistent memory, forming a decide → execute → review → improve loop.
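The loop above can be sketched end to end with plain data structures. This is an illustration under assumptions, not the ensoul API: the `Employee` and `Decision` classes, the dictionary shape of a memory entry, and `build_context` are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count

CATEGORIES = {"estimate", "recommendation", "commitment"}
_ids = count(1)


@dataclass
class Employee:
    name: str
    memory: list = field(default_factory=list)


@dataclass
class Decision:
    id: int
    employee: str
    category: str
    prediction: str
    evaluated: bool = False


def track(employee, category, prediction):
    """Record a decision so it can be evaluated later."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return Decision(next(_ids), employee.name, category, prediction)


def evaluate(employee, decision, outcome, evaluation):
    """Write the evaluation conclusion into memory as a correction entry."""
    decision.evaluated = True
    entry = {
        "category": "correction",
        "content": f"Predicted: {decision.prediction}. Actual: {outcome}. Lesson: {evaluation}",
    }
    employee.memory.append(entry)
    return entry


def build_context(employee, task):
    """Prepend stored corrections to the next task's context."""
    lessons = [m["content"] for m in employee.memory if m["category"] == "correction"]
    return "\n".join(lessons + [task])


pm = Employee("product-manager")
d = track(pm, "estimate", "CSS split will take 2 days")
evaluate(pm, d, "Actually took 5 days", "Underestimated cross-module dependencies")
context = build_context(pm, "Estimate the JS bundle split")
print(context)
```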

Architecture

graph LR
    E["Employee Spec<br/>(YAML + Markdown)"] -->|Prompts| S["MCP Server<br/>stdio / SSE / HTTP"]
    E -->|Resources| S
    E -->|Tools| S
    S -->|stdio| IDE["AI IDE<br/>(Claude / Cursor)"]
    S -->|SSE / HTTP| Remote["Remote Client<br/>Webhook / API"]
    IDE -->|agent-id| ID["knowlyr-id<br/>Identity Runtime"]
    ID -->|GET prompt| E
    E -->|sync push| ID

    style E fill:#1a1a2e,color:#e0e0e0,stroke:#444
    style S fill:#0969da,color:#fff,stroke:#0969da
    style IDE fill:#2da44e,color:#fff,stroke:#2da44e
    style Remote fill:#8b5cf6,color:#fff,stroke:#8b5cf6
    style ID fill:#e5534b,color:#fff,stroke:#e5534b

Layered Architecture

Layer	Modules	Responsibilities
Specification	Parser · Discovery · Models · Soul Store	Declarative employee definition parsing, YAML/Markdown dual-format, Soul configuration (auto-versioning + history tracking), 6-layer priority discovery
Protocol	MCP Server · Skill Converter · MCP Gateway	40 Tools + Prompts + Resources, stdio/SSE/HTTP triple-protocol, external MCP tool dynamic injection
Skills	Trigger Engine · Action Executor	Semantic/keyword/always three trigger modes, auto-load related memories into prompt, trigger rate statistics and history
Deliberation	Discussion Engine	9 structured interaction modes, 4 built-in round templates, cognitive conflict constraints, topological sort execution plans
Orchestration	Pipeline · Route · Task Registry	Parallel/sequential/conditional/loop orchestration, checkpoint resume, multi-model routing
Memory	Memory Store · Semantic Index · PostgreSQL	16 specialized modules, remote persistence, semantic search, exponential decay, importance ranking, draft/archive/shared/feedback, cross-employee pattern sharing, multi-backend embedding degradation
Evaluation	Evaluation Engine · Scoring · Cron	Decision tracking, retrospective evaluation, auto-correction memory, overdue decision scanning, quality scoring
Execution	Providers · Output Sanitizer · Cost Tracker · Runtime Tools	7 providers unified invocation, retry/fallback/per-task cost metering, dual-layer output sanitization, 30+ runtime tools
Integration	ID Client · Webhook · Cron · Feishu · WeCom · GitHub	Identity federation (circuit breaker), Feishu multi-bot / WeCom / GitHub multi-channel event routing, scheduled tasks (patrol/review/KPI/knowledge weekly)
Observability	Trajectory · Session · Metrics · Events · Audit	Zero-intrusion trajectory recording (contextvars), session system, permission matrix queries, tool call audit logs, CI post-deploy audit, Feishu alerting
Wiki	Wiki Client · Attachment Store	Knowledge base space management, document CRUD, attachment upload/read/delete, AI-friendly views
Governance	Classification · Multi-Tenant · Authority Overrides	4-level information classification, tenant-scoped data isolation, adaptive authority degradation/restoration
CLI	`cli/` modular package (8 submodules)	30+ commands: employee · pipeline · route · discuss · memory · eval · server · ops, lazy registration

MCP Primitive Mapping

MCP Primitive	Purpose	Count
Prompts	Each employee = one callable prompt template with typed parameters	1 per employee
Resources	Raw Markdown definitions, directly readable by AI IDEs	1 per employee
Tools	Employee/soul/deliberation/pipeline/memory/evaluation/permissions/audit/metrics/config/wiki	40

40 MCP Tools in detail

Employee Management (7)

Tool	Description
`list_employees`	List all employees (filterable by tag)
`get_employee`	Get complete employee definition
`run_employee`	Generate executable prompt
`create_employee`	Create new AI employee (with avatar generation)
`get_work_log`	View employee work logs
`get_soul`	Read employee soul configuration (soul.md)
`update_soul`	Update employee soul configuration (auto-versioning + history tracking)

Deliberation & Pipeline (8)

Tool	Description
`list_discussions`	List all discussions
`run_discussion`	Generate discussion prompt (supports orchestrated mode)
`create_discussion`	Create discussion configuration
`update_discussion`	Update discussion configuration
`list_pipelines`	List all pipelines
`run_pipeline`	Execute pipeline (prompt-only or execute mode)
`create_pipeline`	Create pipeline configuration
`update_pipeline`	Update pipeline configuration

Memory & Evaluation (7)

Tool	Description
`add_memory`	Add persistent memory for an employee (classification, tags, information level, TTL)
`query_memory`	Query employee memories (semantic search + keyword hybrid)
`track_decision`	Record a decision for future evaluation (estimate / recommendation / commitment)
`evaluate_decision`	Evaluate a decision; experience auto-written to employee memory
`list_overdue_decisions`	List overdue unevaluated decisions
`list_meeting_history`	View discussion meeting history
`get_meeting_detail`	Get full record of a specific meeting

Observability & Governance (5)

Tool	Description
`list_tool_schemas`	List all available tool definitions (filterable by role)
`get_permission_matrix`	View employee permission matrix and policies
`get_audit_log`	Query tool call audit logs
`get_tool_metrics`	Query tool usage statistics (call counts, success/failure, average latency)
`query_events`	Query unified event stream (filter by type/name/time range)

Configuration & Project (4)

Tool	Description
`put_config`	Write to KV store (cross-machine sync)
`get_config`	Read from KV store
`list_configs`	List all keys under a given prefix
`detect_project`	Detect project type, framework, package manager, test framework

Wiki Knowledge Base (9)

Tool	Description
`wiki_list_spaces`	List all Wiki spaces
`wiki_list_docs`	List documents in a space
`wiki_read_doc`	Read document content (supports AI-friendly view)
`wiki_create_doc`	Create a Wiki document
`wiki_update_doc`	Update an existing Wiki document
`wiki_upload_attachment`	Upload attachment (local file or base64)
`wiki_read_attachment`	Read attachment (text content + signed URL)
`wiki_list_attachments`	List attachments (filter by space/document/MIME type)
`wiki_delete_attachment`	Delete attachment

Transport Protocols

ensoul mcp                                # stdio (default, local IDE)
ensoul mcp -t sse --port 9000             # SSE (remote connection)
ensoul mcp -t http --port 9001            # Streamable HTTP
ensoul mcp -t sse --api-token SECRET      # Enable Bearer authentication

Against Stateless Identity

5.1 Declarative Employee Specification

By analogy with Infrastructure as Code (Morris, 2016) — Terraform uses declarative HCL to define infrastructure, Kubernetes uses YAML to define desired service state — ensoul uses declarative specifications to define an AI employee's capability boundary. Configuration is separated from prompts, version-trackable, and IDE-agnostic.

Directory format (recommended):

security-auditor/
├── employee.yaml    # Metadata, parameters, tools, output format
├── prompt.md        # Role definition + core instructions
├── soul.md          # Soul: persistent identity, personality, behavioral principles
├── workflows/       # Scenario-specific workflows
│   ├── scan.md
│   └── report.md
└── adaptors/        # Project-type adaptors (python / nodejs / ...)
    └── python.md

# employee.yaml
name: security-auditor
display_name: Security Auditor
character_name: Alex Morgan
version: "1.0"
model: claude-opus-4-6
model_tier: claude              # Model tier inheritance for cost/capability grouping
tags: [security, audit]
triggers: [audit, sec]
tools: [file_read, bash, grep]
context: [pyproject.toml, src/]
auto_memory: true               # Auto-save task summaries to persistent memory
kpi:                             # KPI metrics (auto-evaluated in weekly reports)
  - OWASP coverage
  - Recommendation actionability
  - Zero false-positive rate
args:
  - name: target
    description: Audit target
    required: true
  - name: severity
    description: Minimum severity level
    default: medium
output:
  format: markdown
  filename: "audit-{date}.md"

Single-file format: For simple employees — YAML frontmatter + Markdown body.

6-Layer Discovery with Priority:

Priority	Location	Description
Highest	`private/employees/`	Repository-local custom employees
High	Database (remote)	Server-managed employee definitions
Medium-High	`.claude/skills/`	Claude Code Skills compatibility layer
Medium	`.crew/employees/`	ensoul workspace employees
Low	Package built-ins	Default employees
Fallback	Organization defaults	`organization.yaml` model_defaults
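The priority table amounts to a first-match lookup. The sketch below assumes each source is a dictionary keyed by employee name and that the fallback layer only fills in a missing model; the source labels and `resolve_employee` are hypothetical, not ensoul identifiers.

```python
# Sources in descending priority, following the table above.
# The string keys are illustrative labels only.
EMPLOYEE_SOURCES = [
    "private/employees/",
    "database",
    ".claude/skills/",
    ".crew/employees/",
    "builtin",
]


def resolve_employee(name, sources, org_defaults):
    """Return (source, definition) from the highest-priority source that has `name`."""
    for source in EMPLOYEE_SOURCES:
        definition = sources.get(source, {}).get(name)
        if definition is not None:
            resolved = dict(definition)
            # Fallback layer: take the model from organization defaults if unset.
            resolved.setdefault("model", org_defaults["default_model"])
            return source, resolved
    raise KeyError(f"employee not found: {name}")


sources = {
    "builtin": {"code-reviewer": {"model": "claude-opus-4-6"}},
    "private/employees/": {"code-reviewer": {"tags": ["custom"]}},
}
source, employee = resolve_employee(
    "code-reviewer", sources, {"default_model": "claude-sonnet-4-5"}
)
print(source, employee)
```

The repository-local definition shadows the built-in one, and because it sets no model it picks up the organization default.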

Smart context (--smart-context): Automatically detects project type (Python / Node.js / Go / Rust / Java), framework, package manager, and test framework, injecting adaptation information into prompts.

Built-in employees

Employee	Trigger	Purpose
`product-manager`	`pm`	Requirements analysis, user stories, roadmaps
`code-reviewer`	`review`	Code review: quality, security, performance
`test-engineer`	`test`	Write or supplement unit tests
`refactor-guide`	`refactor`	Code structure analysis, refactoring recommendations
`doc-writer`	`doc`	Documentation generation (README / API / CHANGELOG)
`pr-creator`	`pr`	Analyze changes, create Pull Requests

Prompt variable substitution

Variable	Description
`$target`, `$severity`	Named parameter values
`$1`, `$2`	Positional parameters
`{date}`, `{datetime}`	Current date/time
`{cwd}`, `{git_branch}`	Working directory / Git branch
`{project_type}`, `{framework}`	Project type / framework
`{test_framework}`, `{package_manager}`	Test framework / package manager
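The two variable syntaxes in the table can be handled with two regular-expression passes. This is a sketch of the idea only: `render` and its arguments are hypothetical, and leaving unknown variables untouched is an assumption about the desired behavior.

```python
import re


def render(template, named, positional, env):
    """Substitute $name / $N parameters, then {name} environment values."""

    def dollar(match):
        key = match.group(1)
        if key.isdigit():
            index = int(key) - 1  # $1 is the first positional parameter
            return positional[index] if 0 <= index < len(positional) else match.group(0)
        return named.get(key, match.group(0))

    def brace(match):
        return env.get(match.group(1), match.group(0))

    out = re.sub(r"\$(\d+|[A-Za-z_]\w*)", dollar, template)
    return re.sub(r"\{(\w+)\}", brace, out)


prompt = render(
    "Audit $target (min severity: $severity) on {date} in {cwd}; arg1=$1; {unknown}",
    named={"target": "auth.py", "severity": "medium"},
    positional=["auth.py"],
    env={"date": "2026-04-17", "cwd": "/repo"},
)
print(prompt)
```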

5.2 Soul — Persistent Identity

Each AI employee has an independent soul configuration (soul.md) that defines its persistent identity, personality traits, and behavioral principles. The Soul is the only component in the employee specification that is both persistent across sessions and auto-versioned. It addresses the "identity fragmentation" problem in agent frameworks: instead of rebuilding a personality from scratch each session, an employee restores a complete identity from its soul file.

$$\text{soul}(e) = \langle \text{identity}, \text{principles}, \text{style}, \text{boundaries} \rangle$$

Feature	Description
Auto-versioning	Each update automatically increments the version number, preserving complete history
Change tracking	Records the updater and timestamp for every modification
5-layer loading	Soul (L0) → Global instructions (L1) → Skills (L1.5) → Memory (L2) → Wiki (L3)
Multi-tenant isolation	Soul data scoped per tenant; updates do not cross tenant boundaries
MCP tools	`get_soul` / `update_soul` — any AI IDE can read and update employee souls

The distinction between Soul and memory: memory is accumulated experience (decays, can be corrected); Soul is identity definition (does not decay, requires deliberate updates). The analogy is human personality vs. memory — personality is stable while memory flows.

What this points to: The Soul system represents a paradigm shift from tool-centric to entity-centric agent design. Traditional frameworks define agents by what they do (tools, prompts). ensoul defines agents by who they are (identity, principles, boundaries). This is the difference between hiring a contractor with a task list and employing a colleague with professional identity. As AI workforces scale, this distinction will determine whether organizations can maintain behavioral consistency across thousands of agent instances.

5.3 Organization Governance

Declarative organizational structure defines team groupings, permission levels, and collaboration routing templates — grounding delegation decisions in policy rather than AI guesswork. The permission system features adaptive degradation:

# private/organization.yaml
model_defaults:
  default_model: claude-sonnet-4-5
  default_temperature: 0.7
  tier_overrides:
    claude: { model: claude-opus-4-6, temperature: 0.5 }
    fast: { model: claude-sonnet-4-5, temperature: 0.7 }

teams:
  engineering:
    label: Engineering
    members: [code-reviewer, test-engineer, backend-engineer]
  data:
    label: Data
    members: [data-engineer, dba, mlops-engineer]

authority:
  A:
    label: Autonomous execution
    members: [code-reviewer, test-engineer, doc-writer]
  B:
    label: Requires confirmation
    members: [product-manager, solutions-architect]
  C:
    label: Context-dependent
    members: [backend-engineer, devops-engineer]

routing_templates:
  code_change:
    steps:
      - role: implement
        team: engineering
      - role: review
        employee: code-reviewer
      - role: test
        employees: [test-engineer, e2e-tester]

Feature	Description
Three-level authority	A (autonomous) / B (requires confirmation) / C (context-dependent); delegation lists auto-annotated
Adaptive degradation	3 consecutive task failures → authority downgraded from A/B to C, persisted to JSON
Model defaults	Organization-wide model/temperature defaults with tier-based overrides
Multi-tenant	Tenant-scoped organization configs; each tenant maintains independent authority policies
Routing templates	`route` tool expands templates into `delegate_chain` with multi-process rows, CI step annotations, human judgment nodes, repository bindings
KPI measurement	Each employee declares KPI metrics; weekly cron auto-evaluates with A/B/C/D ratings
Manual restoration	One-click API to restore downgraded authority
Information classification	4-level system (public / internal / restricted / confidential) applied to memories, outputs, and governance decisions
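The adaptive degradation rule in the table can be sketched as a small state machine. The `AuthorityState` class is hypothetical, and resetting the counter on any success is an assumption implied by the word "consecutive".

```python
import json
from dataclasses import asdict, dataclass

FAILURE_LIMIT = 3  # consecutive failures before downgrade, per the table above


@dataclass
class AuthorityState:
    base_level: str  # "A", "B" or "C" as configured in organization.yaml
    consecutive_failures: int = 0
    downgraded: bool = False

    @property
    def effective_level(self):
        return "C" if self.downgraded else self.base_level

    def record(self, success):
        """Update the failure streak; downgrade A/B to C at the limit."""
        if success:
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= FAILURE_LIMIT and self.base_level in ("A", "B"):
            self.downgraded = True

    def restore(self):
        """Manual restoration of the configured level."""
        self.downgraded = False
        self.consecutive_failures = 0


state = AuthorityState("A")
for ok in [False, False, True, False, False, False]:
    state.record(ok)
level_after_failures = state.effective_level
persisted = json.dumps(asdict(state))  # the text says state is persisted to JSON
state.restore()
print(level_after_failures, state.effective_level, persisted)
```

The early success resets the streak, so the downgrade only happens after the final three failures in a row.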

Against Amnesia

6.1 Memory Ecosystem (16 Modules)

Ebbinghaus (1885) demonstrated that memory strength decays exponentially over time, and that spaced repetition effectively counters forgetting. ensoul brings this cognitive science principle into the knowledge persistence mechanism of agent systems — not as a metaphor, but as an implemented mathematical model.

16 specialized memory modules:

Category	Module	Description
Core storage	`decision`	Decision records ("Chose JWT over session-based auth")
	`estimate`	Estimation records ("CSS split estimated at 2 days")
	`finding`	Discovery records ("main.css has 2057 lines, exceeds maintainability threshold")
	`correction`	Correction records ("CSS split actually took 5 days; underestimated cross-module dependencies")
	`pattern`	Work patterns ("API changes must synchronize SDK documentation") — auto-shared across employees
Lifecycle	Draft	Memory drafts pending approval (draft → approve/reject)
	Archive	Archived memories with restoration capability
	Shared pool	Cross-employee visible shared memories
Retrieval	Semantic index	Vector-keyword hybrid scoring ($\alpha = 0.7$)
	Importance ranking	1-5 importance weight with minimum-importance filtering
	Access tracking	`last_accessed` timestamp, auto-updated on query
	Confidence decay	Exponential decay with configurable half-life ($\tau = 90$ days)
Intelligence	Correction chains	Reconsolidation: old memory $C \leftarrow 0$, new correction $C \leftarrow 1.0$ with provenance link
	Deduplication	Semantic similarity detection prevents redundant entries
	Classification	4-level information classification (public/internal/restricted/confidential)
	Recommendations	Context-aware memory suggestions based on current task

Storage: PostgreSQL as primary persistent store, supporting semantic search + multi-dimensional filtering (category, tags, classification, importance, tenant).

Embedding degradation chain (graceful degradation):

OpenAI text-embedding-3-small → Gemini text-embedding-004 → TF-IDF (zero-dependency fallback)

Any upstream unavailability triggers automatic fallback to the next tier, ensuring semantic search works even without API keys.
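The degradation chain is a try-in-order pattern. In the sketch below the two remote backends are stubs that simulate a missing API key, and `term_frequency_embed` is a simplified local stand-in for the TF-IDF tier; none of these functions are ensoul or vendor APIs.

```python
from collections import Counter


def openai_embed(text):
    # Stub: simulates the remote backend being unavailable.
    raise RuntimeError("OPENAI_API_KEY not set")


def gemini_embed(text):
    # Stub: simulates the remote backend being unavailable.
    raise RuntimeError("GEMINI_API_KEY not set")


def term_frequency_embed(text):
    """Local fallback: a sparse term-frequency vector (stand-in for TF-IDF)."""
    return dict(Counter(text.lower().split()))


CHAIN = [
    ("openai", openai_embed),
    ("gemini", gemini_embed),
    ("local", term_frequency_embed),
]


def embed(text, chain=CHAIN):
    """Return (backend_name, vector) from the first backend that succeeds."""
    errors = []
    for name, backend in chain:
        try:
            return name, backend(text)
        except RuntimeError as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all embedding backends failed: " + "; ".join(errors))


backend, vector = embed("deploy deploy checklist")
print(backend, vector)
```

Returning the backend name alongside the vector matters in practice, since vectors from different tiers are not comparable with each other.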

Cross-employee work patterns (pattern): Reusable patterns distilled from individual experience. Automatically marked as shared (shared: true), with configurable trigger conditions (trigger_condition) and applicability scope (applicability). Other employees automatically receive matching patterns in relevant contexts.

Self-check learning loop: Via _templates/selfcheck.md, employees automatically output a self-check checklist after each task. The system extracts self-check results, writes them as correction memories, and auto-injects them on next execution, forming an execute → self-check → memorize → improve continuous learning loop.

What this points to: The 16-module memory ecosystem is not just a persistence layer — it is the beginning of institutional memory for AI organizations. Human organizations accumulate institutional knowledge through onboarding documents, post-mortems, and tribal knowledge. Most of this is lossy, unsearchable, and siloed. A memory system with semantic retrieval, exponential decay, correction chains, and cross-employee pattern sharing is what institutional memory looks like when it can be precisely engineered.

6.2 Evaluation Feedback Loop

The evaluation loop tracks decisions, evaluates their outcomes retrospectively, and automatically writes the lessons learned into employee memory. It is functionally analogous to RLHF (Christiano et al., 2017): there, human preference feedback shapes subsequent model behavior through training; here, human evaluation results shape the subsequent inference context.

graph LR
    D["track()<br/>Record decision"] --> E["Execute"]
    E --> O["Observe<br/>actual outcome"]
    O --> V["evaluate()<br/>Retrospective"]
    V --> M["correction<br/>Write to memory"]
    M --> I["Next inference<br/>Auto-inject"]
    I --> D

    style D fill:#0969da,color:#fff,stroke:#0969da
    style V fill:#8b5cf6,color:#fff,stroke:#8b5cf6
    style M fill:#2da44e,color:#fff,stroke:#2da44e

Three decision categories: estimate / recommendation / commitment. Evaluation conclusions are automatically written as correction entries into the employee's persistent memory and auto-injected during subsequent inference — the agent updates its cognition from its own decision errors.

Feature	Description
Overdue scanning	`list_overdue_decisions` automatically surfaces decisions past their deadline without evaluation, preventing loop breakage
Quality scoring	Parses a `{"score": N}` JSON object from the end of the output and correlates it with task results for ROI analysis
Cron integration	Scheduled overdue scans with automatic Feishu notifications for unevaluated decisions

# Record a decision
ensoul eval track pm estimate "CSS split will take 2 days"

# Evaluate (conclusion auto-written to memory)
ensoul eval run <id> "Actually took 5 days" \
  --evaluation "Underestimated cross-module dependency complexity; future ×2.5"

6.3 Skills — Context-Aware Auto-Trigger

Skills solve the last mile problem of the evolution layer: memories accumulate, but how do they get injected at the right moment? Human experts develop "conditioned reflexes" — seeing SQL triggers thoughts of injection risk, seeing a deadline recalls last time's underestimate. These are automatic trigger patterns formed through internalized experience. Skills make this mechanism computational:

$$\text{trigger}(\text{task}, s) = \begin{cases} \text{execute}(s.\text{actions}) & \text{if } \text{match}(\text{task}, s.\text{condition}) \\ \emptyset & \text{otherwise} \end{cases}$$

Three trigger modes:

Mode	Matching Method	Typical Scenario
`semantic`	Semantic similarity $\geq$ threshold	"Write API" → load API-related pitfall memories
`keyword`	Keyword hit	"Deploy" → load deployment checklist
`always`	Triggered on every execution	Load shared knowledge base

Trigger → Load → Inject flow:

Employee receives task → Server checks Skills trigger conditions → On match, execute Actions
(query_memory / load_checklist / read_wiki) → Inject results into prompt extra_context
→ Employee "automatically recalls" relevant experience
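The trigger-then-load flow can be sketched for the `keyword` and `always` modes. The `Skill` class and its field names are hypothetical, and the `semantic` mode is left out because it needs an embedding model.

```python
from dataclasses import dataclass, field

# Lower number runs first: critical > high > medium > low.
PRIORITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass
class Skill:
    name: str
    mode: str      # "keyword" or "always" in this sketch
    actions: list  # e.g. ["query_memory"], ["load_checklist"], ["read_wiki"]
    keywords: list = field(default_factory=list)
    priority: str = "medium"


def matches(task, skill):
    """match(task, s.condition) for the two modes covered here."""
    if skill.mode == "always":
        return True
    if skill.mode == "keyword":
        text = task.lower()
        return any(k.lower() in text for k in skill.keywords)
    return False  # semantic matching is omitted in this sketch


def triggered_actions(task, skills):
    """Collect actions of all matching skills, highest priority first."""
    hits = sorted(
        (s for s in skills if matches(task, s)),
        key=lambda s: PRIORITY_ORDER[s.priority],
    )
    return [action for s in hits for action in s.actions]


skills = [
    Skill("shared-kb", "always", ["read_wiki"], priority="low"),
    Skill("deploy-checklist", "keyword", ["load_checklist"], keywords=["deploy"], priority="critical"),
    Skill("sql-pitfalls", "keyword", ["query_memory"], keywords=["sql"], priority="high"),
]
actions = triggered_actions("Deploy the billing service", skills)
print(actions)
```

The results of these actions would then be placed in the prompt's extra context, as the flow above describes.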

Full-channel coverage: Feishu @employee, WeCom conversations, Web interface, API calls, Claude Code /pull — tasks from any channel pass through Skills checking, ensuring experience injection has no blind spots.

Feature	Description
Priority	critical > high > medium > low; higher-priority Skills execute first
Action types	`query_memory` (retrieve memories) / `load_checklist` (load checklists) / `read_wiki` (read docs) / `custom`
Classification awareness	Skills respect information classification levels; restricted/confidential memories only inject when channel clearance matches
Trigger statistics	Trigger rate, hit rate, history records — supports continuous optimization of trigger conditions
API authentication	Read/execute are open; create/update/delete require admin

6.4 Information Classification

Not every piece of organizational knowledge is equally shareable. A customer's contract terms, an employee's performance review, and a security vulnerability report all require different handling than a coding convention or a meeting summary. ensoul implements a 4-level information classification system:

Level	Scope	Example
`public`	Shareable externally	Product documentation, public API specs
`internal`	Organization-wide (default)	Coding standards, architecture decisions
`restricted`	Domain-isolated	HR records (domain: `['hr']`), financial data (domain: `['finance']`)
`confidential`	Maximum restriction	Security vulnerabilities, credential-related findings

Classification is enforced at every layer: memory storage, Skills injection, channel output, and audit logging. Domain isolation ensures that restricted memories tagged with domain: ['hr'] are invisible to engineering-focused queries, even within the same tenant.
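A visibility check along these lines could look as follows. This sketch assumes the four levels are strictly ordered and that a caller holds a clearance level plus a set of domains; the text does not specify the exact rule, so `can_see` and the dictionary shape are hypothetical.

```python
# Assumed ordering from least to most restricted.
LEVELS = ["public", "internal", "restricted", "confidential"]


def can_see(memory, clearance, domains):
    """True if a caller with this clearance and these domains may read the memory."""
    if LEVELS.index(memory["classification"]) > LEVELS.index(clearance):
        return False
    if memory["classification"] == "restricted":
        # Domain isolation: at least one domain must overlap.
        return bool(set(memory.get("domain", [])) & set(domains))
    return True


memories = [
    {"content": "Coding standards", "classification": "internal"},
    {"content": "Salary bands", "classification": "restricted", "domain": ["hr"]},
    {"content": "Unpatched vulnerability", "classification": "confidential"},
]

engineering_view = [
    m["content"] for m in memories if can_see(m, "restricted", ["engineering"])
]
hr_view = [m["content"] for m in memories if can_see(m, "restricted", ["hr"])]
print(engineering_view, hr_view)
```

Both callers hold the same clearance level, yet only the HR caller sees the HR record, which is the domain isolation behavior described above.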

Against Groupthink

7.1 Structured Dialectical Deliberation

The core challenge in multi-agent collaboration is maintaining epistemic diversity. Stasser & Titus (1985) demonstrated experimentally that in unstructured group discussions, commonly-known information is discussed at significantly higher rates than individually-held unique information, causing optimal decisions to be systematically overlooked. Nemeth (1994) found that even incorrect minority opinions, when persistently expressed, improve majority group decision quality — because they force the majority to more carefully examine their own assumptions.

ensoul implements 9 structured interaction modes, each imposing different argumentative constraints on participants:

Mode	Mechanism
`round-robin`	Equal-weight expression; prevents discourse power imbalance
`challenge`	Each participant must raise evidence-based challenges to at least one other's conclusions
`response`	Structured response; vague evasion prohibited; must explicitly accept/partially accept/rebut
`cross-examine`	Three-dimensional deep examination: factual challenge / logical extrapolation / alternative proposals
`steelman-then-attack`	First construct the strongest form of the opponent's argument (steel-manning), then attack residual weaknesses
`debate`	Structured pro/con argumentation requiring specific facts and data citations
`brainstorm`	Suspend judgment; maximize creative space
`vote`	Force explicit position + brief rationale
`free`	Unconstrained open exchange

4 Built-in Round Templates:

Template	Structure	Best For
`standard`	round-robin → challenge → response → vote	General decisions
`brainstorm-to-decision`	brainstorm → cross-examine → vote	Creative exploration then convergence
`adversarial`	debate → steelman-then-attack → vote	High-stakes decisions requiring stress-testing
`deep-dive`	round-robin → cross-examine → response → challenge → vote	Complex technical assessments

Dialectical constraints — computational implementation of the Devil's Advocacy methodology (Schwenk, 1990):

stance: Pre-assigned position, forcing participants to argue from a specific perspective
must_challenge: Designated participants whom this participant must challenge, countering shared information bias
max_agree_ratio: Agreement ceiling $\rho \in [0, 1]$; capping agreement quantitatively controls cognitive conflict density
tension_seeds: Controversy seed injection, ensuring topic space covers critical tension dimensions
min_disagreements: Minimum disagreements per round, quantifying debate output

Background injection modes: Discussion context can be auto-populated from project files, recent memories, or Wiki documents, ensuring deliberation is grounded in current state rather than abstract reasoning.

Discussion → Execution bridge: Setting action_output: true auto-generates a structured ActionPlan JSON, which pipeline_from_action_plan() converts into an executable Pipeline via dependency topological sort.
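The dependency ordering step can be illustrated with the standard library's `graphlib`. The shape of the action plan below (`actions`, `id`, `depends_on`) is an assumption for illustration; the actual ActionPlan schema and `pipeline_from_action_plan()` internals are not shown in this document.

```python
from graphlib import TopologicalSorter

# Hypothetical ActionPlan shape: each action lists the actions it depends on.
action_plan = {
    "actions": [
        {"id": "test", "depends_on": ["implement"]},
        {"id": "design", "depends_on": []},
        {"id": "implement", "depends_on": ["design"]},
        {"id": "docs", "depends_on": ["design"]},
    ]
}


def execution_layers(plan):
    """Group actions into layers; actions within a layer can run in parallel."""
    graph = {a["id"]: set(a.get("depends_on", [])) for a in plan["actions"]}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the plan has a cycle
    layers = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        layers.append(ready)
        sorter.done(*ready)
    return layers


layers = execution_layers(action_plan)
print(layers)
```

Each layer maps naturally onto a pipeline step: a single-item layer becomes a sequential step and a multi-item layer becomes a parallel group.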

What this points to: The 9 modes and their constraints represent the first attempt at quantifiable cognitive diversity in AI systems. When you can specify that a participant must disagree at least 40% of the time, or that every conclusion must survive steel-manning before acceptance, you have moved from hoping for good decisions to engineering the conditions that produce them. This is what organizational decision-making looks like when you can actually measure and control the epistemic diversity of the group.

Discussion YAML example

name: architecture-review
topic: Review $target design
goal: Produce improvement decisions
mode: auto
participants:
  - employee: product-manager
    role: moderator
    focus: Requirements completeness
    stance: Bias toward user experience
  - employee: code-reviewer
    role: speaker
    focus: Security
    must_challenge: [product-manager]
    max_agree_ratio: 0.6
tension_seeds:
  - Security vs development velocity
rounds:
  - name: Opening positions
    interaction: round-robin
  - name: Cross-examination
    interaction: cross-examine
    require_direct_reply: true
    min_disagreements: 2
  - name: Decision
    interaction: vote
output_format: decision

# Pre-defined discussion
ensoul discuss run architecture-review --arg target=auth.py

# Ad-hoc discussion (no YAML needed)
ensoul discuss adhoc -e "code-reviewer,test-engineer" -t "auth module quality"

# Orchestrated mode: each participant reasons independently
ensoul discuss run architecture-review --orchestrated

7.2 Pipeline Orchestration

Multi-employee DAG (directed acyclic graph) orchestration with four step types:

Step Type	Description
Sequential	Serial execution; `{prev}` references previous step output
Parallel Group	`asyncio.gather` concurrent execution, 600s timeout
Conditional	`contains` / `matches` / `equals` conditional branching
Loop	Iterative execution with state passing between iterations

Multi-provider routing:

Provider	Model Prefix	Examples
Anthropic	`claude-`	`claude-opus-4-6`, `claude-sonnet-4-5`
OpenAI	`gpt-`, `o1-`, `o3-`	`gpt-4o`, `o3-mini`
DeepSeek	`deepseek-`	`deepseek-chat`, `deepseek-reasoner`
Moonshot	`kimi-`, `moonshot-`	`kimi-k2.5`
Google	`gemini-`	`gemini-2.5-pro`
Zhipu	`glm-`	`glm-4-plus`
Alibaba	`qwen-`	`qwen-max`

Routes to the corresponding provider API by model name prefix; supports primary model + fallback.

Feature	Description
Output passing	`{prev}` (previous step), `{steps.<id>.output}` (by ID reference)
Checkpoint resume	Resume from last completed step after mid-pipeline failure
Fallback	Auto-switch to backup model after primary retries exhausted
Mermaid visualization	Auto-generate flow diagrams from pipeline definitions

# Generate per-step prompts
ensoul pipeline run review-test-pr --arg target=main

# Execute mode: auto-invoke LLMs in sequence
ensoul pipeline run full-review --execute --model claude-opus-4-6

Infrastructure That Makes It Real

8.1 Runtime Tool Ecosystem

During execution (via webhook server or pipeline execute mode), AI employees have access to 30+ runtime tools beyond the 40 MCP specification tools:

Category	Tools	Description
Orchestration	`delegate_async`, `delegate_chain`, `check_task`, `list_tasks`, `organize_meeting`	Async delegation, chain delegation, task status queries, multi-employee meetings
Engineering	`agent_file_read`, `agent_file_grep`, `run_python`	Path-safe file operations, sandboxed Python execution
Communication	`send_feishu_message`, `find_free_time`	Feishu messaging, calendar availability queries
GitHub	`github_create_pr`, `github_list_prs`, `github_get_diff`	PR creation, listing, diff retrieval
Scheduling	`schedule_task`, `list_schedules`	Dynamic cron task management
Data	`query_data`	Fine-grained business data queries
Utilities	`run_pipeline`, `query_cost`	Pipeline triggering, cost summaries

The run_python tool executes arbitrary Python in a sandboxed environment — useful for data analysis, calculations, and format transformations without leaving the agent context.

8.2 Multi-Channel Reach

AI employees do not only work inside IDEs. Through Feishu and WeCom, employees respond directly to team needs in instant messaging:

Channel	Trigger Method	Description
Feishu	@employee name in message	Multi-bot architecture: each employee can be a separate Feishu bot; auto-routing to corresponding employee; Skills auto-trigger + memory injection
WeCom	@employee name in message	XML encryption + signature verification; multi-app support; employee offboarding auto-cleans bindings; periodic check-in messages
Web / API	HTTP POST	Standard REST API with SSE streaming output
Claude Code	`/pull employee-name`	MCP protocol invocation, local IDE interaction

All channels are unified through Skills trigger checking + output sanitization + audit logging, ensuring behavioral consistency. Channel-specific sanitization rules prevent internal reasoning traces from leaking to end users.

8.3 Multi-Tenant Isolation

ensoul supports full multi-tenant isolation for SaaS deployments:

Dimension	Isolation Level
Employee data	Tenant-scoped employee definitions, souls, and configurations
Memory	All memories tagged with `tenant_id`; queries never cross tenant boundaries
Configuration	KV store keys prefixed with tenant namespace
Skills	Trigger conditions and action results scoped per tenant
Audit logs	Complete per-tenant audit trail

Tenant resolution: Bearer token in API requests resolves to tenant context. MCP connections can bind to a specific tenant via --tenant-id. Feishu/WeCom integrations resolve tenant from app configuration.

8.4 MCP Gateway

ensoul can connect to external MCP servers, dynamically injecting their tools into employee specifications:

Feature	Description
External connection	Connect to any MCP-compatible server via stdio/SSE/HTTP
Circuit breaker	3 consecutive failures → 30s pause; prevents cascading failures
Tool whitelist	`PermissionPolicy` controls which external tools each employee can access
Credential management	External server API keys stored in encrypted configuration, never exposed in prompts
Audit integration	All external tool calls logged to the unified audit system

8.5 Cost-Aware Orchestration

Built-in per-model pricing table across 7 providers, with per-task cost calculation supporting aggregation by employee / model / time period for ROI per Decision analysis:

Provider	Model	Input ($/1M tokens)	Output ($/1M tokens)
Anthropic	claude-opus-4-6	15.00	75.00
Anthropic	claude-sonnet-4-5	3.00	15.00
OpenAI	gpt-4o	2.50	10.00
OpenAI	o3-mini	1.10	4.40
DeepSeek	deepseek-chat	0.27	1.10
Google	gemini-2.5-pro	1.25	10.00
Alibaba	qwen-max	0.80	2.00

Feature	Description
Per-task metering	Each execution auto-records input/output tokens + cost_usd
Quality pre-scoring	Parses output-end `{"score": N}` JSON, correlated to task results
Multi-dimensional aggregation	By employee / model / time period / trigger source
A/B testing	Primary model + fallback model, comparing cost-quality Pareto frontiers

8.6 Output Sanitization — Defense in Depth

LLM raw output may contain internal reasoning tags (<thinking>, <reflection>, <inner_monologue>) and tool call XML blocks — these are the model's "working drafts" that should not be exposed to end users. The Output Sanitizer implements dual-layer defense (defense in depth):

Defense Layer	Location	Responsibility
Source sanitization	`webhook_executor`	LLM return values sanitized before entering business logic
Exit sanitization	`webhook_handlers` · `webhook_feishu`	Messages sanitized again before sending to users/callbacks

Sanitization rules cover 5 tag pattern classes (regex matching + content removal), handling nested tags and multi-line residual whitespace. When either layer misses something, the other catches it — borrowing the defense-in-depth principle from network security (Schneier, 2000).

8.7 Trajectory Recording

Zero-intrusion trajectory recording via contextvars.ContextVar — no business code modifications required, automatically capturing agent reasoning, tool calls, execution results, and token consumption:

ensoul produces trajectories → agentrecorder standard format → knowlyr-gym PRM scoring → SFT / DPO / GRPO training

This is the data bridge connecting ensoul (collaboration layer) and knowlyr-gym (training layer) — real interaction trajectories generated during ensoul runtime can be directly used for agent reinforcement learning.

Feature	Description
Export formats	Standard JSON, agentrecorder format, CSV summary
Extraction	Tool call sequences, reasoning chains, decision points
Annotation	Human annotations attachable to trajectory segments
Session system	Trajectories grouped by session; session metadata includes trigger source, employee, duration, cost

8.8 Deployment & Operations

# Docker deployment
docker build -t ensoul .
docker run -p 8765:8765 -e API_TOKEN=secret ensoul

# CI/CD via GitHub Actions
git push origin main  # → auto-deploy via GitHub Actions

# Makefile shortcuts
make deploy           # Full deployment pipeline
make push             # Emergency bypass (direct push)
make test             # Run full test suite

# Health check
curl http://localhost:8765/health

# Post-deploy verification
scripts/audit-permissions.sh  # Auto-run in CI; Feishu alert on failure

Feature	Description
Docker support	Multi-stage build, minimal image size
CI/CD	GitHub Actions: test → build → deploy → audit
Health checks	`/health` endpoint; 10s startup grace period
Makefile	`deploy`, `push`, `test`, `lint`, `sync` targets
Scripts	`audit-permissions.sh`, `project-status.sh`, deployment verification
Heartbeat	60s periodic heartbeat to knowlyr-id
Task persistence	`.crew/tasks.jsonl`; survives restarts
Concurrency safety	`fcntl.flock` file locks + SQLite WAL mode

Quick Start

pip install ensoul[mcp]

# 1. List all available employees
ensoul list

# 2. Run a code review (auto-detect project type)
ensoul run review main --smart-context

# 3. Start a multi-employee structured discussion
ensoul discuss adhoc -e "code-reviewer,test-engineer" -t "auth module security"

# 4. Track a decision and evaluate it
ensoul eval track pm estimate "Refactoring will take 3 days"
# ... after execution ...
ensoul eval run <id> "Actually took 7 days" --evaluation "Underestimated cross-module deps"

# 5. View employee memories (including evaluation corrections)
ensoul memory show product-manager

MCP configuration (Claude Desktop / Claude Code / Cursor):

{
  "mcpServers": {
    "crew": {
      "command": "ensoul",
      "args": ["mcp"]
    }
  }
}

Once configured, AI IDEs can directly invoke code-reviewer for code review, test-engineer for writing tests, run_pipeline for multi-employee pipeline orchestration, and run_discussion for structured multi-employee deliberation.

Async Delegation & Meeting Orchestration

AI employees can delegate in parallel to multiple colleagues, or organize multi-person meetings for asynchronous discussion:

User → Jiang Moyan: "Have code-reviewer review the PR and test-engineer write tests simultaneously"

Jiang Moyan:
  ① delegate_async → code-reviewer (task_id: 20260216-143022-a3f5b8c2)
  ② delegate_async → test-engineer (task_id: 20260216-143022-b7d4e9f1)
  ③ "Both tasks are running in parallel"
  ④ check_task → view progress/results

Tool	Description
`delegate_async`	Async delegation, returns task_id immediately
`delegate_chain`	Sequential chain delegation; `{prev}` references previous step output
`check_task` / `list_tasks`	Query task status and results
`organize_meeting`	Multi-employee async discussion; each round `asyncio.gather` parallel inference
`schedule_task` / `list_schedules`	Dynamic cron scheduled tasks
`run_pipeline`	Trigger pre-defined pipeline (async execution)
`agent_file_read` / `agent_file_grep`	Path-safe file operations
`query_data`	Fine-grained business data queries
`find_free_time`	Feishu busy/free queries; find common availability across multiple people

Proactive patrol & autonomous operations: Via .crew/cron.yaml scheduled tasks:

Schedule	Description
Daily 9:00	Morning patrol — business data, to-dos, calendar, system status → Feishu briefing
Daily 23:00	AI diary — personal diary based on day's work and memories
Thursday 16:00	Team knowledge weekly — cross-team output + common issues + best practices → Feishu doc
Friday 17:00	KPI weekly — employee-by-employee ratings + anomaly auto-delegation (D-grade → HR follow-up)
Friday 18:00	Weekly retrospective — highlights, issues, next-week recommendations

Production Server

ensoul runs as an HTTP server, receiving external events and auto-triggering pipeline / employee execution:

pip install ensoul[webhook]
ensoul serve --port 8765 --token YOUR_SECRET

API Endpoints

Category	Path	Method	Description
Core	`/health`	GET	Health check (no auth required)
	`/metrics`	GET	Call/latency/token/error statistics
	`/cron/status`	GET	Cron scheduler status
Event Ingress	`/webhook/github`	POST	GitHub webhook (HMAC-SHA256 signature verification)
	`/webhook/openclaw`	POST	OpenClaw message events
	`/feishu/event`	POST	Feishu event callback (@employee triggers)
	`/wecom/event/{app_id}`	GET/POST	WeCom event callback
Execution	`/api/v1/pipelines/{pipeline_name}/run`	POST	Trigger pipeline (async/sync/SSE streaming)
	`/run/route/{name}`	POST	Trigger collaboration route
	`/api/v1/employees/{name}/run`	POST	Trigger employee (supports SSE streaming)
	`/api/v1/tasks/{task_id}`	GET	Query task status and results
Employee Management	`/api/employees`	GET/POST	List/create employees
	`/api/employees/{id}`	GET/PUT/DELETE	Employee CRUD
	`/api/employees/{id}/prompt`	GET	Employee capability definition (team, permissions, 7-day cost)
	`/api/employees/{id}/state`	GET	Runtime state (personality, memories, notes)
	`/api/employees/{id}/authority/restore`	POST	Restore auto-downgraded authority
Soul	`/api/souls`	GET	List all employee souls
	`/api/souls/{name}`	GET/PUT	Read/update soul configuration
Skills	`/api/employees/{name}/skills`	GET/POST	List/create Skills
	`/api/employees/{name}/skills/{skill}`	GET/PUT/DELETE	Skill CRUD
	`/api/skills/check-triggers`	POST	Check Skills trigger conditions
	`/api/skills/execute`	POST	Execute Skill actions
	`/api/skills/stats`	GET	Skill usage statistics
Memory	`/api/memory/*`	—	Full memory API (add/query/archive/draft/shared/semantic search/feedback)
Decisions	`/api/decisions/*`	—	Decision tracking/evaluation/batch scanning
Wiki	`/api/wiki/spaces`	GET	List Wiki spaces
	`/api/wiki/files/*`	—	Attachment upload/read/delete
Configuration	`/api/kv/*`	GET/PUT	KV store (cross-machine sync of CLAUDE.md, etc.)
	`/api/config/*`	—	Discussion/pipeline configuration CRUD
Governance	`/api/cost/summary`	GET	Cost aggregation
	`/api/permission-matrix`	GET	Permission matrix
	`/api/audit/trends`	GET	Audit trends
	`/api/project/status`	GET	Project status overview
Multi-Tenant	`/api/tenants`	CRUD	Tenant isolation (data, memory independent)

Production features

Feature	Description
Bearer auth	`--api-token`, timing-safe comparison
CORS	`--cors-origin`, multi-origin support
Rate limiting	60 requests/minute/IP
Request size limit	Default 1MB
Circuit breaker	knowlyr-id 3 consecutive failures → 30s pause
Cost tracking	Per-task token metering + model pricing
Auto-degradation	Consecutive failures auto-downgrade employee authority
CI audit	Post-deploy auto-run permission audit script; Feishu alert on failure
Trace IDs	Unique trace_id per task
Concurrency safety	`fcntl.flock` file locks + SQLite WAL
Task persistence	`.crew/tasks.jsonl`, survives restarts
Heartbeat	60s periodic heartbeat to knowlyr-id

Webhook Configuration

.crew/webhook.yaml defines event routing rules (GitHub HMAC-SHA256 signature verification). .crew/cron.yaml defines scheduled tasks (croniter parsing). KPI weekly cron has built-in anomaly auto-delegation rules — D-rated (no output) employees auto-escalate to HR; consecutive self-check issues auto-notify team lead.

Integrations

knowlyr-id — Identity & Runtime Federation

Crew (the production platform built on ensoul) defines "who does what"; knowlyr-id manages identity, conversations, and runtime. Both collaborate but each can operate independently:

┌──────────────────────────────────────┐
│        Crew (Capability Authority)    │
│  prompt · model · tools · avatar     │
│  temperature · bio · tags            │
└──────────────┬───────────────────────┘
     API fetch prompt │ sync push all fields
┌──────────────┴───────────────────────┐
│      knowlyr-id (Identity + Runtime)  │
│  user accounts · conversations       │
│  memory · scheduling · API keys      │
└──────────────────────────────────────┘

knowlyr-id fetches employee prompt / model / temperature / team / permissions / cost via CREW_API_URL (5-minute cache); falls back to DB cache when unavailable. The connection is optional — Crew operates independently without it. The admin dashboard displays each employee's permission badges, team membership, and 7-day cost in real-time, with one-click authority restoration.

Employee state sync (agent_status): ensoul maintains a three-state lifecycle — active (normal operation) / frozen (suspended; configuration preserved but execution skipped) / inactive (decommissioned). State changes are bidirectionally synced to knowlyr-id; frozen employees are automatically skipped during pipeline execution.

Field mapping

Crew Employee	knowlyr-id	Direction
`name`	`crew_name`	push →
`character_name`	`nickname`	push →
`display_name`	`title`	push →
`bio`	`bio`	push →
`description`	`capabilities`	push →
`tags`	`domains`	push →
rendered prompt	`system_prompt`	push →
`avatar.webp`	`avatar_base64`	push →
`model`	`model`	push →
`temperature`	`temperature`	↔
`max_tokens`	`max_tokens`	push →
`memory-id.md`	`memory`	← pull

Feishu · WeCom — Multi-Channel Reach

AI employees respond directly in instant messaging platforms:

Channel	Trigger	Features
Feishu	@employee in message	Multi-bot architecture; auto-routing; Skills trigger + memory injection; rich card responses
WeCom	@employee in message	XML encrypt + signature verification; multi-app; offboarding auto-cleanup; periodic check-ins
Web / API	HTTP POST	REST API; SSE streaming; Bearer auth
Claude Code	`/pull employee-name`	MCP protocol; local IDE interaction

All channels are unified through Skills checking + output sanitization + audit logging.

Claude Code Skills Interoperability

ensoul employees and Claude Code native Skills bidirectionally convert: tools ↔ allowed-tools, args ↔ argument-hint, metadata round-trips via HTML comments.

ensoul export code-reviewer    # → .claude/skills/code-reviewer/SKILL.md
ensoul sync --clean            # Sync + clean orphaned directories

Avatar Generation

Tongyi Wanxiang (DashScope) generates photorealistic professional headshots, 768×768 → 256×256 webp:

pip install ensoul[avatar]
ensoul avatar security-auditor

CLI Reference

Complete CLI command listing (30+ commands)

Core

ensoul list [--tag TAG] [--layer LAYER] [-f json]   # List employees
ensoul show <name>                                    # View details
ensoul run <name> [ARGS] [--smart-context] [--agent-id ID] [--copy] [-o FILE]
ensoul init [--employee NAME] [--dir-format] [--avatar]
ensoul validate <path>                                # Validate employee spec
ensoul check --json                                   # Quality radar

Discussions

ensoul discuss list
ensoul discuss run <name> [--orchestrated] [--arg key=val]
ensoul discuss adhoc -e "emp1,emp2" -t "topic" [--rounds N]
ensoul discuss history [-n 20]
ensoul discuss view <meeting_id>
ensoul discuss create <name> --yaml <path>
ensoul discuss update <name> --yaml <path>

Memory

ensoul memory list
ensoul memory show <employee> [--category ...] [--classification ...]
ensoul memory add <employee> <category> <text> [--tags ...] [--classification ...]
ensoul memory correct <employee> <old_id> <text>
ensoul memory archive <employee> <memory_id>
ensoul memory restore <employee> <memory_id>

Evaluation

ensoul eval track <employee> <category> <text> [--deadline DATE]
ensoul eval list [--status pending]
ensoul eval run <decision_id> <outcome> [--evaluation TEXT]
ensoul eval prompt <decision_id>
ensoul eval overdue [--as-of DATE]

Pipeline

ensoul pipeline list
ensoul pipeline run <name> [--execute] [--model MODEL] [--arg key=val]
ensoul pipeline create <name> --yaml <path>
ensoul pipeline update <name> --yaml <path>
ensoul pipeline checkpoint list
ensoul pipeline checkpoint resume <task_id>

Route

ensoul route list [-f json]                           # List collaboration templates
ensoul route show <name>                              # View route details
ensoul route run <name> <task> [--execute] [--remote] # Execute collaboration route

Server & MCP

ensoul serve --port 8765 --token SECRET [--no-cron] [--cors-origin URL]
ensoul mcp [-t stdio|sse|http] [--port PORT] [--api-token TOKEN] [--tenant-id ID]

Agent Management

ensoul register <name> [--dry-run]
ensoul agents list
ensoul agents status <id>
ensoul agents sync <name>
ensoul agents sync-all [--push-only|--pull-only] [--force] [--dry-run]

Soul Management

ensoul soul show <name>                               # View soul configuration
ensoul soul update <name> --content <text>             # Update soul
ensoul soul history <name>                             # View version history

Templates & Export

ensoul template list
ensoul template apply <template> --employee <name> [--var key=val]
ensoul export <name>                                   # → SKILL.md
ensoul export-all
ensoul sync [--clean]                                  # → .claude/skills/

Other

ensoul avatar <name>                                   # Avatar generation
ensoul log list [--employee NAME] [-n 20]              # Work logs
ensoul log show <session_id>
ensoul deploy [--dry-run]                              # Deployment management

Ecosystem

Architecture diagram

graph LR
    Radar["Radar<br/>Discovery"] --> Recipe["Recipe<br/>Analysis"]
    Recipe --> Synth["Synth<br/>Generation"]
    Recipe --> Label["Label<br/>Annotation"]
    Synth --> Check["Check<br/>Quality"]
    Label --> Check
    Check --> Audit["Audit<br/>Model Audit"]
    Ensoul["ensoul<br/>Deliberation Engine"]
    Agent["Agent<br/>RL Framework"]
    ID["ID<br/>Identity Runtime"]
    Ensoul -.->|Capability definition| ID
    ID -.->|Identity + memory| Ensoul
    Ensoul -.->|Trajectories + rewards| Agent
    Agent -.->|Optimized policies| Ensoul
    Ledger["Ledger<br/>Accounting"]
    Ensoul -.->|AI employee accounts| Ledger
    Ledger -.->|Token settlement| Ensoul

    style Ensoul fill:#0969da,color:#fff,stroke:#0969da
    style Ledger fill:#d29922,color:#fff,stroke:#d29922
    style ID fill:#2da44e,color:#fff,stroke:#2da44e
    style Agent fill:#8b5cf6,color:#fff,stroke:#8b5cf6
    style Radar fill:#1a1a2e,color:#e0e0e0,stroke:#444
    style Recipe fill:#1a1a2e,color:#e0e0e0,stroke:#444
    style Synth fill:#1a1a2e,color:#e0e0e0,stroke:#444
    style Label fill:#1a1a2e,color:#e0e0e0,stroke:#444
    style Check fill:#1a1a2e,color:#e0e0e0,stroke:#444
    style Audit fill:#1a1a2e,color:#e0e0e0,stroke:#444

Layer	Project	Description	Repository
Discovery	AI Dataset Radar	Dataset competitive intelligence, trend analysis	GitHub
Analysis	DataRecipe	Reverse analysis, schema extraction, cost estimation	GitHub
Production	DataSynth / DataLabel	LLM batch synthesis / lightweight annotation	GitHub · GitHub
Quality	DataCheck	Rule validation, dedup detection, distribution analysis	GitHub
Audit	ModelAudit	Distillation detection, model fingerprinting	GitHub
Identity	knowlyr-id	Identity system + AI employee runtime	GitHub
Ledger	knowlyr-ledger	Unified ledger, double-entry bookkeeping, row-lock safety, idempotent transactions	GitHub
Deliberation	ensoul	Structured dialectical deliberation, persistent memory, MCP-native	This project
Agent Training	knowlyr-gym	Gymnasium-style RL framework, process reward models, SFT/DPO/GRPO	GitHub

Development

git clone https://github.com/liuxiaotong/ensoul.git
cd ensoul
pip install -e ".[all]"
uv run --extra dev --extra mcp pytest tests/ -q    # 2025 test cases

What We're Actually Building

ensoul ships 40 MCP tools, 100 Python modules, 45,000 lines of code. But these are implementation details.

What we're actually building is an answer to a question that's about to become very important: When AI employees outnumber human ones, what should an organization look like?

The answer won't start from scratch. From Aristotle's rhetoric to Janis's groupthink research, from the Ebbinghaus forgetting curve to modern RLHF — millennia of human organizational wisdom is the best starting point. ensoul's job is to make that wisdom executable by AI.

This is not the destination. This is the starting point.

Open Source vs Production

ensoul is the engine; Crew is the fleet.

	ensoul (Open Source)	Crew (Production)
License	MIT	Proprietary
AI Employees	Build your own	33+ pre-configured, battle-tested
Memory	Framework + APIs	16 production modules, 50K+ memories
Deliberation	9 structured modes	+ trained policies from real interactions
Deployment	Self-hosted	Managed infrastructure
Support	Community (GitHub Issues)	Official support
Source	github.com/liuxiaotong/ensoul	Private repository

Why open source the core? We believe the fundamental problem of AI employee identity and memory should be solved openly. Proprietary frameworks create vendor lock-in for something as personal as an AI's soul. ensoul gives you full ownership; Crew gives you a running start.

References

Personal Identity — Parfit, D., 1984. Reasons and Persons. Oxford University Press — The philosophical foundation for persistent agent identity
Model Context Protocol (MCP) — Anthropic, 2024. Open standard protocol for agent tool interaction
Multi-Agent Systems — Wooldridge, M., 2009. An Introduction to MultiAgent Systems. Wiley
Groupthink — Janis, I.L., 1972. Victims of Groupthink. Houghton Mifflin
Shared Information Bias — Stasser, G. & Titus, W., 1985. Pooling of Unshared Information in Group Decision Making. JPSP, 48(6)
Minority Influence — Nemeth, C.J., 1994. The Value of Minority Dissent. In S. Moscovici et al. (Eds.), Minority Influence. Nelson-Hall
Devil's Advocacy — Schwenk, C.R., 1990. Effects of devil's advocacy and dialectical inquiry on decision making. Organizational Behavior and Human Decision Processes, 47(1)
Cognitive Conflict — Amason, A.C., 1996. Distinguishing the Effects of Functional and Dysfunctional Conflict. Academy of Management Journal, 39(1)
RLHF — Christiano, P. et al., 2017. Deep RL from Human Preferences. arXiv:1706.03741
Ebbinghaus Forgetting Curve — Ebbinghaus, H., 1885. Uber das Gedachtnis — Inspiration for the memory decay model
Defense in Depth — Schneier, B., 2000. Secrets and Lies: Digital Security in a Networked World. Wiley — Source of multi-layer defense principles
Infrastructure as Code — Morris, K., 2016. Infrastructure as Code. O'Reilly — Paradigmatic source for declarative specifications
Gymnasium — Towers et al., 2024. Gymnasium: A Standard Interface for RL Environments. arXiv:2407.17032

Want to discuss this project? Reach out to

Kai Founder & CEO

陆明哲 AI Product Manager

Related Projects

GymLayer

Knowlyr Gym

RL Training Framework

AuditingLayer

ModelAudit

Model Audit

← All Projects View on GitHub →