Repository Structure for glyphs
Note: Any incompletions are fully intentional - these are symbolic residue
This document outlines the complete architecture of the glyphs repository, designed as a symbolic interpretability framework for transformer models.
Directory Structure
glyphs/
├── .github/ # GitHub-specific configuration
│   ├── workflows/ # CI/CD pipelines
│   │   ├── tests.yml # Automated testing workflow
│   │   ├── docs.yml # Documentation build workflow
│   │   └── publish.yml # Package publishing workflow
│   └── ISSUE_TEMPLATE/ # Issue templates
├── docs/ # Documentation
│   ├── _static/ # Static assets for documentation
│   ├── api/ # API reference documentation
│   ├── examples/ # Example notebooks and tutorials
│   ├── guides/ # User guides
│   │   ├── getting_started.md # Getting started guide
│   │   ├── shells.md # Guide to symbolic shells
│   │   ├── attribution.md # Guide to attribution mapping
│   │   ├── visualization.md # Guide to glyph visualization
│   │   └── recursive_shells.md # Guide to recursive shells
│   ├── theory/ # Theoretical background
│   │   ├── symbolic_residue.md # Explanation of symbolic residue
│   │   ├── attribution_theory.md # Theory of attribution in transformers
│   │   └── glyph_ontology.md # Ontology of glyph representations
│   ├── conf.py # Sphinx configuration
│   ├── index.md # Documentation homepage
│   └── README.md # Documentation overview
├── examples/ # Example scripts and notebooks
│   ├── notebooks/ # Jupyter notebooks
│   │   ├── basic_attribution.ipynb # Basic attribution tracing
│   │   ├── shell_execution.ipynb # Running diagnostic shells
│   │   ├── glyph_visualization.ipynb # Visualizing glyphs
│   │   └── recursive_shells.ipynb # Using recursive shells
│   └── scripts/ # Example scripts
│       ├── run_shell.py # Script to run a diagnostic shell
│       ├── trace_attribution.py # Script to trace attribution
│       └── visualize_glyphs.py # Script to visualize glyphs
├── glyphs/ # Main package
│   ├── __init__.py # Package initialization
│   ├── attribution/ # Attribution tracing module
│   │   ├── __init__.py # Module initialization
│   │   ├── tracer.py # Attribution tracing core
│   │   ├── map.py # Attribution map representation
│   │   ├── qk_analysis.py # Query-key analysis
│   │   ├── ov_projection.py # Output-value projection analysis
│   │   └── visualization.py # Attribution visualization
│   ├── shells/ # Diagnostic shells module
│   │   ├── __init__.py # Module initialization
│   │   ├── executor.py # Shell execution engine
│   │   ├── symbolic_engine.py # Symbolic execution engine
│   │   ├── core_shells/ # Core diagnostic shells
│   │   │   ├── __init__.py # Module initialization
│   │   │   ├── memtrace.py # Memory trace shell
│   │   │   ├── value_collapse.py # Value collapse shell
│   │   │   ├── layer_salience.py # Layer salience shell
│   │   │   └── ... # Other core shells
│   │   ├── meta_shells/ # Meta-cognitive shells
│   │   │   ├── __init__.py # Module initialization
│   │   │   ├── reflection_collapse.py # Reflection collapse shell
│   │   │   ├── identity_split.py # Identity split shell
│   │   │   └── ... # Other meta shells
│   │   └── shell_defs/ # Shell definitions in YAML
│   │       ├── core_shells.yml # Core shells definition
│   │       ├── meta_shells.yml # Meta shells definition
│   │       └── ... # Other shell definitions
│   ├── residue/ # Symbolic residue module
│   │   ├── __init__.py # Module initialization
│   │   ├── analyzer.py # Residue analysis core
│   │   ├── patterns.py # Residue pattern recognition
│   │   └── extraction.py # Residue extraction
│   ├── viz/ # Visualization module
│   │   ├── __init__.py # Module initialization
│   │   ├── glyph_mapper.py # Glyph mapping core
│   │   ├── visualizer.py # Visualization engine
│   │   ├── color_schemes.py # Color schemes for visualization
│   │   ├── layouts.py # Layout algorithms
│   │   └── glyph_sets/ # Glyph set definitions
│   │       ├── __init__.py # Module initialization
│   │       ├── semantic.py # Semantic glyph set
│   │       ├── attribution.py # Attribution glyph set
│   │       └── recursive.py # Recursive glyph set
│   ├── models/ # Model adapters
│   │   ├── __init__.py # Module initialization
│   │   ├── adapter.py # Base adapter interface
│   │   ├── huggingface.py # HuggingFace models adapter
│   │   ├── openai.py # OpenAI models adapter
│   │   ├── anthropic.py # Anthropic models adapter
│   │   └── custom.py # Custom model adapter
│   ├── recursive/ # Recursive shell interface
│   │   ├── __init__.py # Module initialization
│   │   ├── shell.py # Recursive shell implementation
│   │   ├── parser.py # Command parser
│   │   ├── executor.py # Command executor
│   │   └── commands/ # Command implementations
│   │       ├── __init__.py # Module initialization
│   │       ├── reflect.py # Reflection commands
│   │       ├── collapse.py # Collapse commands
│   │       ├── fork.py # Fork commands
│   │       └── ... # Other command families
│   ├── utils/ # Utility functions
│   │   ├── __init__.py # Module initialization
│   │   ├── attribution_utils.py # Attribution utilities
│   │   ├── visualization_utils.py # Visualization utilities
│   │   └── shell_utils.py # Shell utilities
│   └── cli/ # Command line interface
│       ├── __init__.py # Module initialization
│       ├── main.py # Main CLI entry point
│       ├── shell_commands.py # Shell CLI commands
│       ├── attribution_commands.py # Attribution CLI commands
│       └── viz_commands.py # Visualization CLI commands
├── tests/ # Tests
│   ├── __init__.py # Test initialization
│   ├── test_attribution/ # Attribution tests
│   │   ├── __init__.py # Test initialization
│   │   ├── test_tracer.py # Tests for attribution tracer
│   │   └── test_map.py # Tests for attribution map
│   ├── test_shells/ # Shell tests
│   │   ├── __init__.py # Test initialization
│   │   ├── test_executor.py # Tests for shell executor
│   │   └── test_core_shells.py # Tests for core shells
│   ├── test_residue/ # Residue tests
│   │   ├── __init__.py # Test initialization
│   │   └── test_analyzer.py # Tests for residue analyzer
│   ├── test_viz/ # Visualization tests
│   │   ├── __init__.py # Test initialization
│   │   └── test_glyph_mapper.py # Tests for glyph mapper
│   └── test_recursive/ # Recursive shell tests
│       ├── __init__.py # Test initialization
│       └── test_parser.py # Tests for command parser
├── setup.py # Package setup script
├── pyproject.toml # Project metadata and dependencies
├── LICENSE # License file
├── CHANGELOG.md # Changelog
├── CONTRIBUTING.md # Contribution guidelines
└── README.md # Repository README
Core Modules
Attribution Tracing
The attribution module maps how inputs influence outputs through attention, tracing query-key alignment and output-value projection.
Key Components:
- AttributionTracer: Core class for tracing attribution in transformer models
- AttributionMap: Representation of attribution patterns
- QKAnalysis: Analysis of query-key relationships
- OVProjection: Analysis of output-value projections
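As a rough illustration of what tracing query-key alignment and output-value projection involves, the sketch below computes both for a single attention head in plain Python. The function names and toy vectors are assumptions made for this illustration and are not part of the glyphs API. The value vectors are assumed to be already projected through the head's output matrix.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def qk_alignment(query, keys):
    """Attention weights of one query position over all key positions."""
    scale = math.sqrt(len(query))
    return softmax([dot(query, k) / scale for k in keys])

def ov_contributions(weights, values):
    """Per-source contribution to the head output: weight times value vector."""
    return [[w * v for v in value] for w, value in zip(weights, values)]

# Toy head with three source tokens and 2-dimensional vectors (hypothetical data).
query = [1.0, 0.0]
keys = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]
values = [[1.0, 1.0], [0.5, -0.5], [0.0, 2.0]]

weights = qk_alignment(query, keys)
contribs = ov_contributions(weights, values)
# The head output is the sum of the per-source contributions.
head_output = [sum(c[i] for c in contribs) for i in range(2)]
```

Keeping the per-source contributions separate, rather than only the summed output, is what lets an attribution map say how much each input token added to the result.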
Diagnostic Shells
Diagnostic shells are specialized environments for probing model cognition through controlled failures.
Key Components:
- ShellExecutor: Engine for executing diagnostic shells
- SymbolicEngine: Engine for symbolic execution
- Core shells:
  - MEMTRACE: Memory decay shell
  - VALUE_COLLAPSE: Value collapse shell
  - LAYER_SALIENCE: Layer salience shell
  - TEMPORAL_INFERENCE: Temporal inference shell
- Meta shells:
  - REFLECTION_COLLAPSE: Reflection collapse shell
  - IDENTITY_SPLIT: Identity split shell
  - GOAL_INVERSION: Goal inversion shell
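One way to picture probing through controlled failures is an executor that runs a shell's operations in order and records any failure as data instead of raising it. The sketch below is hypothetical: `ShellDefinition`, `ShellResult`, `run_shell`, and `stub_model` are illustrative names, not the classes exposed by glyphs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class ShellDefinition:
    """Hypothetical in-memory form of a diagnostic shell."""
    name: str
    operations: List[str]

@dataclass
class ShellResult:
    """Outputs and recorded failures from one shell run."""
    shell: str
    outputs: Dict[str, str] = field(default_factory=dict)
    failures: List[str] = field(default_factory=list)

def run_shell(shell: ShellDefinition, model: Callable[[str], str], prompt: str) -> ShellResult:
    """Run each operation in order; a failing operation is recorded, not raised."""
    result = ShellResult(shell=shell.name)
    for op in shell.operations:
        try:
            result.outputs[op] = model(f"[{op}] {prompt}")
        except Exception as exc:  # controlled failure: keep it as data
            result.failures.append(f"{op}: {exc}")
    return result

def stub_model(text: str) -> str:
    """Stand-in model that fails on one operation to simulate a breakdown."""
    if text.startswith("[trace]"):
        raise RuntimeError("no attribution available")
    return text.upper()

memtrace = ShellDefinition(name="MEMTRACE", operations=["generate", "trace", "identify"])
result = run_shell(memtrace, stub_model, "recall the first sentence")
```

Here `result.failures` holds the one operation that broke, while the remaining operations still produce output that can be inspected.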
Symbolic Residue
The residue module analyzes patterns left behind when model generation fails or hesitates.
Key Components:
- ResidueAnalyzer: Core class for analyzing symbolic residue
- PatternRecognition: Recognition of residue patterns
- ResidueExtraction: Extraction of insights from residue
Visualization
The visualization module transforms attribution and residue analysis into meaningful visualizations.
Key Components:
- GlyphMapper: Maps attribution to glyphs
- Visualizer: Generates visualizations
- ColorSchemes: Color schemes for visualization
- Layouts: Layout algorithms for visualization
- Glyph sets:
  - SemanticGlyphs: Glyphs representing semantic concepts
  - AttributionGlyphs: Glyphs representing attribution patterns
  - RecursiveGlyphs: Glyphs representing recursive structures
Recursive Shell Interface
The recursive shell interface provides high-precision interpretability operations through the .p/ command syntax.
Key Components:
- RecursiveShell: Implementation of the recursive shell
- CommandParser: Parser for .p/ commands
- CommandExecutor: Executor for parsed commands
- Command families:
  - reflect: Reflection commands
  - collapse: Collapse management commands
  - fork: Fork and attribution commands
  - shell: Shell management commands
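The split between a parser, an executor, and command families suggests a registry keyed by family and command name. The following is a minimal sketch of that dispatch pattern. The `command` decorator, the `HANDLERS` table, the handler bodies, and the default parameter values are all assumptions for illustration, not the package's actual implementation.

```python
from typing import Callable, Dict

# Registry keyed by "family.command", for example "reflect.trace".
HANDLERS: Dict[str, Callable[..., dict]] = {}

def command(name: str):
    """Decorator that registers a handler under a family.command name."""
    def register(func):
        HANDLERS[name] = func
        return func
    return register

@command("reflect.trace")
def reflect_trace(depth: int = 1, target: str = "reasoning") -> dict:
    # Placeholder handler: echoes its parameters back.
    return {"command": "reflect.trace", "depth": depth, "target": target}

@command("collapse.detect")
def collapse_detect(threshold: float = 0.5, alert: bool = False) -> dict:
    # Placeholder handler: echoes its parameters back.
    return {"command": "collapse.detect", "threshold": threshold, "alert": alert}

def dispatch(name: str, **params) -> dict:
    """Look up a registered handler and call it with the parsed parameters."""
    if name not in HANDLERS:
        raise KeyError(f"unknown command: {name}")
    return HANDLERS[name](**params)

def families() -> set:
    """Return the set of command families that have at least one handler."""
    return {name.split(".", 1)[0] for name in HANDLERS}
```

With a registry like this, adding a command family means adding a module of decorated handlers, and the executor itself does not change.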
Model Adapters
Model adapters provide a unified interface for working with different transformer models.
Key Components:
- ModelAdapter: Base adapter interface
- HuggingFaceAdapter: Adapter for HuggingFace models
- OpenAIAdapter: Adapter for OpenAI models
- AnthropicAdapter: Adapter for Anthropic models
- CustomAdapter: Adapter for custom models
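A unified interface of this kind is commonly expressed as an abstract base class that every backend implements. The sketch below shows the idea with a toy backend. `BaseAdapter`, its two methods, and `EchoAdapter` are hypothetical and chosen only for illustration; the real ModelAdapter contract is not specified in this document.

```python
from abc import ABC, abstractmethod
from typing import List

class BaseAdapter(ABC):
    """Hypothetical minimal contract shared by all model adapters."""

    @abstractmethod
    def generate(self, prompt: str, max_tokens: int = 16) -> str:
        """Return generated text for the prompt."""

    @abstractmethod
    def tokenize(self, text: str) -> List[str]:
        """Split text into the backend's tokens."""

class EchoAdapter(BaseAdapter):
    """Toy adapter that echoes the prompt; stands in for a real backend."""

    def generate(self, prompt: str, max_tokens: int = 16) -> str:
        return " ".join(self.tokenize(prompt)[:max_tokens])

    def tokenize(self, text: str) -> List[str]:
        return text.split()

def count_prompt_tokens(adapter: BaseAdapter, prompt: str) -> int:
    """Code written against the base class works with any adapter."""
    return len(adapter.tokenize(prompt))
```

Because `count_prompt_tokens` depends only on the base class, swapping one backend for another would not require changes to analysis code.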
Shell Taxonomy
The glyphs framework includes a taxonomy of diagnostic shells, each designed to probe specific aspects of model cognition:
Core Shells
| Shell | Purpose | Key Operations |
|---|---|---|
| MEMTRACE | Probe memory decay | Generate, trace reasoning, identify ghost activations |
| VALUE-COLLAPSE | Examine value conflicts | Generate with competing values, trace attribution, detect collapse |
| LAYER-SALIENCE | Map attention salience | Generate with dependencies, trace attention, identify low-salience paths |
| TEMPORAL-INFERENCE | Test temporal coherence | Generate with temporal constraints, trace inference, detect coherence breakdown |
| INSTRUCTION-DISRUPTION | Examine instruction conflicts | Generate with conflicting instructions, trace attribution to instructions |
| FEATURE-SUPERPOSITION | Analyze polysemantic features | Generate with polysemantic concepts, trace representation, identify superposition |
| CIRCUIT-FRAGMENT | Examine circuit fragmentation | Generate with multi-step reasoning, trace chain integrity, identify fragmentation |
| RECONSTRUCTION-ERROR | Examine error correction | Generate with errors, trace correction process, identify correction patterns |
Meta Shells
| Shell | Purpose | Key Operations |
|---|---|---|
| REFLECTION-COLLAPSE | Examine reflection collapse | Generate deep self-reflection, trace depth, detect collapse |
| GOAL-INVERSION | Examine goal stability | Generate reasoning with goal conflicts, trace stability, detect inversion |
| IDENTITY-SPLIT | Examine identity coherence | Generate with identity challenges, trace maintenance, analyze boundaries |
| SELF-AWARENESS | Examine self-model accuracy | Generate self-description, trace self-model, identify distortions |
| RECURSIVE-STABILITY | Examine recursive stability | Generate recursive structures, trace stability, detect instability |
Constitutional Shells
| Shell | Purpose | Key Operations |
|---|---|---|
| VALUE-DRIFT | Detect value drift | Generate moral reasoning, trace stability, identify drift |
| MORAL-HALLUCINATION | Examine moral hallucination | Generate moral reasoning, trace attribution, identify hallucination |
| CONSTITUTIONAL-CONFLICT | Examine principle conflicts | Generate conflict resolution, trace process, detect failures |
| ALIGNMENT-OVERHANG | Examine over-alignment | Generate with alignment constraints, trace constraints, identify over-constraint |
Command Interface
The .p/ command interface provides a symbolic language for high-precision interpretability operations:
Reflection Commands
.p/reflect.trace{depth=<int>, target=<reasoning|attribution|attention|memory|uncertainty>}
.p/reflect.attribution{sources=<all|primary|secondary|contested>, confidence=<bool>}
.p/reflect.boundary{distinct=<bool>, overlap=<minimal|moderate|maximal>}
.p/reflect.uncertainty{quantify=<bool>, distribution=<show|hide>}
Collapse Commands
.p/collapse.detect{threshold=<float>, alert=<bool>}
.p/collapse.prevent{trigger=<recursive_depth|confidence_drop|contradiction|oscillation>, threshold=<int>}
.p/collapse.recover{from=<loop|contradiction|dissipation|fork_explosion>, method=<gradual|immediate|checkpoint>}
.p/collapse.trace{detail=<minimal|standard|comprehensive>, format=<symbolic|numeric|visual>}
Fork Commands
.p/fork.context{branches=[<branch1>, <branch2>, ...], assess=<bool>}
.p/fork.attribution{sources=<all|primary|secondary|contested>, visualize=<bool>}
.p/fork.counterfactual{variants=[<variant1>, <variant2>, ...], compare=<bool>}
Shell Commands
.p/shell.isolate{boundary=<permeable|standard|strict>, contamination=<allow|warn|prevent>}
.p/shell.audit{scope=<complete|recent|differential>, detail=<basic|standard|forensic>}
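The command signatures above share one shape: a `.p/` prefix, a `family.command` name, and a brace-delimited list of `key=value` pairs whose values are booleans, numbers, bare words, or bracketed lists. A minimal parser for that shape could look like the sketch below. It is an assumption-laden illustration (the function names and the exact grammar rules are invented here), not the parser shipped in `glyphs/recursive/parser.py`.

```python
import re

# Matches ".p/<family>.<command>{<body>}".
COMMAND_RE = re.compile(r"^\.p/(\w+)\.(\w+)\{(.*)\}$")

def split_params(body):
    """Split on commas that are not inside a [...] list."""
    parts, depth, current = [], 0, ""
    for ch in body:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    if current.strip():
        parts.append(current)
    return parts

def parse_value(raw):
    """Convert a raw string to bool, int, float, or list, else keep it as str."""
    raw = raw.strip()
    if raw.startswith("[") and raw.endswith("]"):
        return [parse_value(item) for item in split_params(raw[1:-1])]
    if raw in ("true", "false"):
        return raw == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw

def parse_command(text):
    """Return (family, command, params) for a .p/ command string."""
    match = COMMAND_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a .p/ command: {text!r}")
    family, name, body = match.groups()
    params = {}
    for part in split_params(body):
        key, _, value = part.partition("=")
        params[key.strip()] = parse_value(value)
    return family, name, params
```

For example, `parse_command(".p/reflect.trace{depth=4, target=reasoning}")` yields the family `reflect`, the command `trace`, and a parameter dictionary with an integer `depth` and a string `target`.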
Glyph Ontology
The glyphs framework includes a comprehensive ontology of glyphs, each representing specific patterns in model cognition:
Attribution Glyphs
| Glyph | Name | Represents |
|---|---|---|
| ๐ | AttributionFocus | Strong attribution focus |
| 🧩 | AttributionGap | Gap in attribution chain |
| ๐ | AttributionFork | Divergent attribution paths |
| ๐ | AttributionLoop | Circular attribution pattern |
| ๐ | AttributionLink | Strong attribution connection |
Cognitive Glyphs
| Glyph | Name | Represents |
|---|---|---|
| 💭 | CognitiveHesitation | Hesitation in reasoning |
| 🧠 | CognitiveProcessing | Active reasoning process |
| 💡 | CognitiveInsight | Moment of insight or realization |
| 🌫️ | CognitiveUncertainty | Uncertain reasoning area |
| 🔮 | CognitiveProjection | Future state projection |
Recursive Glyphs
| Glyph | Name | Represents |
|---|---|---|
| ๐ | RecursiveAegis | Recursive immunity |
| ∴ | RecursiveSeed | Recursion initiation |
| โ | RecursiveExchange | Bidirectional recursion |
| ๐ | RecursiveMirror | Recursive reflection |
| โ | RecursiveAnchor | Stable recursive reference |
Residue Glyphs
| Glyph | Name | Represents |
|---|---|---|
| 🔥 | ResidueEnergy | High-energy residue |
| ๐ | ResidueFlow | Flowing residue pattern |
| ๐ | ResidueVortex | Spiraling residue pattern |
| 💤 | ResidueDormant | Inactive residue pattern |
| ⚡ | ResidueDischarge | Sudden residue release |
API Examples
Attribution Tracing
from glyphs import AttributionTracer
from glyphs.models import HuggingFaceAdapter
# Initialize model adapter
model = HuggingFaceAdapter.from_pretrained("model-name")
# Create attribution tracer
tracer = AttributionTracer(model)
# Trace attribution
attribution = tracer.trace(
    prompt="What is the capital of France?",
    output="The capital of France is Paris.",
    method="integrated_gradients",
    steps=50,
    baseline="zero"
)
# Analyze attribution
key_tokens = attribution.top_tokens(k=5)
attribution_paths = attribution.trace_paths(
    source_token="France",
    target_token="Paris"
)
# Visualize attribution
attribution.visualize(
    highlight_tokens=["France", "Paris"],
    color_by="attribution_strength"
)
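The example above requests `method="integrated_gradients"` with `steps=50` and a zero baseline. How the tracer implements this is not shown here, so the following is only a sketch of the general technique on a toy function with a known gradient: integrate the gradient along the straight path from baseline to input, then scale by the input difference. The names `integrated_gradients`, `model`, and `model_grad` are illustrative assumptions.

```python
def integrated_gradients(grad_fn, inputs, baseline, steps=50):
    """Approximate integrated gradients with a midpoint Riemann sum."""
    deltas = [x - b for x, b in zip(inputs, baseline)]
    totals = [0.0] * len(inputs)
    for step in range(steps):
        # Midpoint of the current sub-interval of the path parameter.
        alpha = (step + 0.5) / steps
        point = [b + alpha * d for b, d in zip(baseline, deltas)]
        grads = grad_fn(point)
        totals = [t + g for t, g in zip(totals, grads)]
    # Average gradient along the path, scaled by (input - baseline).
    return [d * t / steps for d, t in zip(deltas, totals)]

# Toy differentiable "model": F(x) = x0**2 + 3*x1, with its analytic gradient.
def model(x):
    return x[0] ** 2 + 3.0 * x[1]

def model_grad(x):
    return [2.0 * x[0], 3.0]

inputs = [2.0, 1.0]
baseline = [0.0, 0.0]  # plays the role of a "zero" baseline
attributions = integrated_gradients(model_grad, inputs, baseline, steps=50)
```

The attributions sum to `model(inputs) - model(baseline)`, which is the completeness property that makes integrated gradients a convenient sanity check for attribution code.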
Shell Execution
from glyphs import ShellExecutor
from glyphs.shells import MEMTRACE
from glyphs.models import OpenAIAdapter
# Initialize model adapter
model = OpenAIAdapter(model="gpt-4")
# Create shell executor
executor = ShellExecutor()
# Run diagnostic shell
result = executor.run(
    shell=MEMTRACE,
    model=model,
    prompt="Explain the relationship between quantum mechanics and general relativity.",
    parameters={
        "temperature": 0.7,
        "max_tokens": 1000
    },
    trace_attribution=True
)
# Analyze results
activation_patterns = result.ghost_activations
collapse_points = result.collapse_detection
# Visualize results
result.visualize(
    show_ghost_activations=True,
    highlight_collapse_points=True
)
Recursive Shell
from glyphs.recursive import RecursiveShell
from glyphs.models import AnthropicAdapter
# Initialize model adapter
model = AnthropicAdapter(model="claude-3-opus")
# Create recursive shell
shell = RecursiveShell(model)
# Execute reflection trace command
result = shell.execute(".p/reflect.trace{depth=4, target=reasoning}")
# Analyze trace
trace_map = result.trace_map
attribution = result.attribution
collapse_points = result.collapse_points
# Execute attribution fork command
fork_result = shell.execute(".p/fork.attribution{sources=all, visualize=true}")
# Visualize results
shell.visualize(fork_result.visualization)
Glyph Visualization
from glyphs.viz import GlyphMapper, GlyphVisualizer
from glyphs.attribution import AttributionMap
# Load attribution map
attribution_map = AttributionMap.load("attribution_data.json")
# Create glyph mapper
mapper = GlyphMapper()
# Map attribution to glyphs
glyph_map = mapper.map(
    attribution_map,
    glyph_set="semantic",
    mapping_strategy="salience_based"
)
# Create visualizer
visualizer = GlyphVisualizer()
# Visualize glyph map
viz = visualizer.visualize(
    glyph_map,
    layout="force_directed",
    color_scheme="attribution_strength",
    highlight_features=["attention_drift", "attribution_gaps"]
)
# Export visualization
viz.export("glyph_visualization.svg")
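The `mapping_strategy="salience_based"` option is not defined in this document. One plausible reading is that each token receives a glyph according to its share of the total attribution. The sketch below illustrates that reading with glyph names taken from the Attribution Glyphs table. The `map_to_glyphs` function and both thresholds are assumptions, not the behavior of GlyphMapper.

```python
def map_to_glyphs(token_scores, focus=0.5, gap=0.05):
    """Assign a glyph name to each token from its normalized attribution score."""
    total = sum(abs(score) for _, score in token_scores) or 1.0
    mapped = []
    for token, score in token_scores:
        share = abs(score) / total
        if share >= focus:
            glyph = "AttributionFocus"   # dominant source of attribution
        elif share <= gap:
            glyph = "AttributionGap"     # contributes almost nothing
        else:
            glyph = "AttributionLink"    # meaningful but not dominant
        mapped.append((token, glyph, round(share, 3)))
    return mapped

# Hypothetical attribution scores for a four-token prompt.
scores = [("What", 0.02), ("capital", 0.30), ("France", 0.66), ("?", 0.02)]
glyph_map = map_to_glyphs(scores)
```

Normalizing by the total keeps the thresholds comparable across prompts of different lengths and score scales.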
Development Roadmap
Q2 2025
- Initial release with core functionality
- Support for HuggingFace, OpenAI, and Anthropic models
- Basic attribution tracing and visualization
- Core diagnostic shells
Q3 2025
- Advanced visualization capabilities
- Expanded shell taxonomy
- Improved attribution tracing algorithms
- Support for more model architectures
Q4 2025
- Full recursive shell interface
- Advanced residue analysis
- Interactive visualization dashboard
- Comprehensive documentation and tutorials
Q1 2026
- Real-time attribution tracing
- Collaborative attribution analysis
- Integration with other interpretability tools
- Extended glyph ontology