BLISP Research Program

Papers

Published and Forthcoming

PAPER 1

The Grounding Gate: Verified Tool Selection for AI-Driven Research

Published Draft

Thomas Dionysopoulos

DOI: 10.5281/zenodo.20817087

When an AI system selects a computational tool, its proposal is an assertion: "this tool is appropriate." Without verification, the assertion flows directly into execution. We call this the assertion gap: the distance between a tool selection that is valid and one that is verified.

We present a grounding gate that closes this gap through evidence-carrying tool selection. Each tool call carries explicit evidence—match mode, confidence score, and a cryptographic capability hash—linking the selection to the user's terms. A deterministic verification function checks this evidence before execution; proposals lacking evidence are rejected even if they name real capabilities. The architecture enforces a wall property: no unverified tool selection reaches execution. Evidence stability is ensured by a behavior-derived identity model where discovery metadata is excluded from capability hashes by construction.
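As a rough illustration of the evidence check described above, the following Python sketch rejects any tool call that lacks evidence or whose evidence does not match the registry. The registry contents, the `sharpe_ratio` capability, the allowed match modes, and the confidence threshold are all assumptions made for this example; they are not taken from the paper or its artifacts.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical registry: capability name -> behavior fields only.
# Discovery metadata (descriptions, display names) is left out on purpose,
# so rewording it cannot change the capability hash.
REGISTRY = {
    "sharpe_ratio": {"inputs": ["returns"], "output": "scalar", "op": "mean/std"},
}

ALLOWED_MODES = {"exact", "synonym"}  # assumed match modes
MIN_CONFIDENCE = 0.8                  # assumed threshold


def capability_hash(behavior: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the behavior fields."""
    encoded = json.dumps(behavior, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Evidence:
    match_mode: str
    confidence: float
    capability_hash: str


@dataclass(frozen=True)
class ToolCall:
    tool: str
    evidence: Optional[Evidence] = None


def verify(call: ToolCall) -> bool:
    """Deterministic gate: only evidence-carrying, matching calls pass."""
    if call.evidence is None:
        return False  # names a real capability, but carries no evidence
    behavior = REGISTRY.get(call.tool)
    if behavior is None:
        return False  # unknown capability
    ev = call.evidence
    return (
        ev.match_mode in ALLOWED_MODES
        and ev.confidence >= MIN_CONFIDENCE
        and ev.capability_hash == capability_hash(behavior)
    )
```

In this sketch, a call such as `ToolCall("sharpe_ratio")` with no evidence is rejected even though the capability exists, which mirrors the wall property stated in the abstract.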

We evaluate on 30 prompts across 5 categories (4 strategy families, 9 metrics, 36 valid combinations). An assertion-only pipeline (schema validation, no verification) executes unwarranted capabilities at 23.3%; the verified pipeline reduces this to 10.0% (Fisher exact p = 0.027), eliminating them entirely on undiscoverable prompts (100% to 0%). Repeated executions produce bit-identical hashes across all 50 runs; an 8-layer execution hash decomposes provenance for fault localization without re-execution. Verification overhead is under 14 ms.

PDF Source (.tar.gz) Artifacts Zenodo

@article{dionysopoulos2026grounding,
  title   = {The Grounding Gate: Verified Tool Selection
             for AI-Driven Research},
  author  = {Dionysopoulos, Thomas},
  year    = {2026},
  doi     = {10.5281/zenodo.20817087},
  note    = {Published draft, BLISP Research Program Paper 1 v2},
  url     = {https://blisp.ai/papers/paper1.pdf}
}

PAPER 2

Canonical Execution Semantics for Stochastic Program Generators

Published Draft

Thomas Dionysopoulos

DOI: 10.5281/zenodo.20457255

When programs are generated by stochastic systems, independently generated programs that represent the same intended computation arrive in different surface forms, producing different hashes, different provenance records, and failed replay comparisons. We argue that execution systems for stochastic generators require a canonical execution boundary: an architectural invariant that partitions the pipeline into a stochastic upstream and a deterministic downstream. Four mechanisms enforce the boundary: typed specifications, a canonicalization pipeline (278 surface forms to 235 canonical operations), 8-layer execution hashing, and description/identity separation. The system is evaluated on 1,200 stochastic LLM generations with 50-run replay determinism.
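The effect of canonicalization on hashing can be sketched in a few lines of Python. The alias table below is a made-up stand-in for the paper's mapping from surface forms to canonical operations, and the single SHA-256 digest stands in for the 8-layer execution hash; both are assumptions for illustration only.

```python
import hashlib
import json

# Hypothetical alias table: surface form -> canonical operation name.
ALIASES = {
    "avg": "mean",
    "average": "mean",
    "mean": "mean",
    "stdev": "std",
    "std": "std",
}


def canonicalize(program):
    """Map each (operation, args) step to its canonical operation name."""
    canonical = []
    for op, args in program:
        key = op.strip().lower()
        if key not in ALIASES:
            raise ValueError(f"unknown operation: {op!r}")
        canonical.append([ALIASES[key], list(args)])
    return canonical


def execution_hash(program) -> str:
    """Hash the canonical form, so surface variation does not reach the hash."""
    encoded = json.dumps(canonicalize(program), separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()
```

Two generations that differ only in surface form, for example `[("AVG", (20,))]` and `[("mean", (20,))]`, hash identically after canonicalization, while a program using `std` does not.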

PDF Zenodo Artifacts

PAPER 3

Execution Categories for Stochastic Program Generators: Quotient Semantics for Deterministic Executable Identity

Published Draft

Thomas Dionysopoulos

DOI: 10.5281/zenodo.20457403

We define a registry-indexed execution category whose objects are typed executable artifacts and whose morphisms are admissible pipeline transformations. The operational equivalence generated by the system's rewrite rules forms a congruence: equivalent subexpressions remain equivalent under arbitrary well-typed pipeline composition. The resulting quotient category gives precise meaning to deterministic execution identity. Content-addressed hashing serves as a computable operational witness of quotient membership.

PDF Zenodo Artifacts

PAPER 4

Provenance Algebra for Deterministic AI Execution: Replay Semantics for Stochastic Program Generators

Published Draft

Thomas Dionysopoulos

DOI: 10.5281/zenodo.20457667

Provenance for deterministic execution systems is not metadata but a semantic factorization of execution identity. We define a provenance map over the quotient execution category, assigning to each execution equivalence class an 8-layer hash record that decomposes execution identity into semantic dependency boundaries. A dependency-indexed composition law establishes that pipeline provenance is determined by stage provenance and the declared dependency map. The resulting algebra enables replay equivalence, divergence localization, partial replay, and provenance-preserving registry evolution.
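Divergence localization over a layered hash record can be illustrated as follows. The abstract does not name the eight layers, so the sketch uses placeholder names `layer_1` through `layer_8`; the record layout and helper functions are likewise assumptions, not the paper's definitions.

```python
import hashlib

# Placeholder layer names; the actual eight layers are not listed here.
LAYERS = tuple(f"layer_{i}" for i in range(1, 9))


def layer_record(inputs: dict) -> dict:
    """Build a per-layer hash record from one string payload per layer."""
    return {
        name: hashlib.sha256(inputs[name].encode("utf-8")).hexdigest()
        for name in LAYERS
    }


def divergent_layers(a: dict, b: dict) -> list:
    """Return the layers whose hashes differ, without re-running anything."""
    return [name for name in LAYERS if a[name] != b[name]]


# Two runs that agree everywhere except in one layer's payload.
base = {name: f"payload for {name}" for name in LAYERS}
changed = dict(base, layer_3="payload changed by a registry update")

record_a = layer_record(base)
record_b = layer_record(changed)
```

Comparing `record_a` and `record_b` with `divergent_layers` points at `layer_3` alone, which is the sense in which a decomposed record localizes a fault where a single flat hash could only report that something differs.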

PDF Zenodo Artifacts

PAPER 5

Proposal Collapse and Execution Fibers in Stochastic Program Generation

Published Draft

Thomas Dionysopoulos

DOI: 10.5281/zenodo.20457990

Two distinct kinds of variation emerge when stochastic generators propose executable specifications. Surface-form variation is absorbed by canonicalization (intra-fiber). Execution ambiguity creates clean transitions between execution classes (inter-fiber). Across 2,200 proposals with controlled perturbations, synonym rewording stays within fibers (rho = 0.985), while metric/family substitutions produce zero same-fiber mass (rho = 0.000) with perfect stability (sigma = 1.000). The adjacency graph is sparse (density = 0.095).

PDF Zenodo Artifacts

PAPER 6

The Semantic Structure of Execution: An Empirical Study of Predictive Coordinates in Computational Operations

Published Draft

Thomas Dionysopoulos

DOI: 10.5281/zenodo.20612709

A single 7-valued coordinate (DependencyClass) classifies operations by data-dependency shape and predicts four independent optimizer behaviors—fusion eligibility, window semantics, pipeline position, and state management—with 99.6% accuracy (243/244 behavior predictions, z = 13.0, p < 10⁻³⁸ vs random baseline). The coordinate is not a descriptive label; it is a predictive object that determines execution behavior from semantic structure alone. Conditional mutual information analysis confirms that DependencyClass provides information about optimizer behavior beyond what operation name alone provides.

PDF Zenodo Artifacts

PAPER 7

Semantic Coordinates as Predictive Objects in Time-Series Computation

Published Draft

Thomas Dionysopoulos

DOI: 10.5281/zenodo.20706294

A frozen taxonomy trained on 61 operations generalizes to 25 unseen operations at 100% accuracy (100/100 holdout predictions) with zero recalibration. Coordinate ablation confirms that the full coordinate is minimal—removing any single dimension degrades prediction. Random baselines with equivalent cardinality achieve chance accuracy. The result establishes semantic coordinates as predictive objects: they predict optimizer behavior, not merely describe it.

PDF Zenodo Artifacts

PAPER 8

Dependency Shape Predicts Execution Behavior Across Independent Data Processing Systems

Published Draft

Thomas Dionysopoulos

DOI: 10.5281/zenodo.20706086

A frozen 8-valued dependency-shape taxonomy, built without inspecting either target system, predicts three execution behaviors (streaming eligibility, buffering requirements, warmup) in Polars (Rust, morsel-driven parallelism) and DuckDB (C++, push-based pipelines). Buffering predictions reach 96.7% accuracy in both systems, with the single shared error (filter) reflecting a classification boundary. Combined accuracy across 180 predictions is 91.1%, with zero errors from incorrect dependency-shape assignments. All errors trace to architectural choices and API conventions, not to the taxonomy itself.

PDF Zenodo Artifacts

PAPER 9

Agents Reconstruct Execution Identity Algebra Under Task Pressure

Published Draft

Thomas Dionysopoulos

DOI: 10.5281/zenodo.20706156

Independent frontier model families (Anthropic, OpenAI, Google), working on independent domains (finance, SQL, build/CI), reconstruct structurally equivalent execution-identity primitives under task pressure. Nine question tiers of increasing difficulty elicit eight primitives: normalization, canonical identity, equivalence classes, grouping, composite rewriting, replay mappings, computation DAGs, and policy checking. 7/8 primitives converge above 0.90 across 55 runs. Reconstruction is convergent, staged, and expensive (~178,000 tokens per reconstruction). A reference implementation materializes the same eight primitives as persistent, composable, domain-portable infrastructure at zero marginal query cost.

PDF Zenodo Artifacts

PAPER 10

False Hits in Parameterized Pipeline Caching: Why Safe Compositional Replay Requires Congruence

Published Draft

Thomas Dionysopoulos

DOI: 10.5281/zenodo.20815342

Parameterized computational pipelines that share input data but differ in semantics can produce false cache hits when the cache key does not induce a congruence on the pipeline space. We prove that safe compositional caching requires the cache key to satisfy a congruence property: semantically equivalent pipelines must produce the same key, and semantically distinct pipelines must not. A cache keyed on content hashes (DataHash) violates this requirement—97 false hits in a 1,000-pipeline experiment. A cache keyed on computation identity hashes (MOR_HSH, incorporating behavioral capability hashes) induces a congruence and produces zero false hits. The identity hash is the same behavior-derived hash used for verified tool selection in Paper 1.
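A toy Python example shows how a false hit arises when the cache key ignores the computation. The two pipelines, the `run` function, and `identity_key` are hypothetical; in particular, `identity_key` is a simple stand-in for a computation identity hash and is not the paper's MOR_HSH construction.

```python
import hashlib


def sha(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def run(pipeline, data):
    """Toy pipelines: sum or maximum of the first n values."""
    op, n = pipeline
    head = data[:n]
    if op == "sum_head":
        return sum(head)
    if op == "max_head":
        return max(head)
    raise ValueError(f"unknown pipeline: {op!r}")


def cached_run(cache, key, pipeline, data):
    """Execute only on a cache miss; otherwise return the stored result."""
    if key not in cache:
        cache[key] = run(pipeline, data)
    return cache[key]


def data_key(pipeline, data):
    # Key derived from the input data alone: the pipeline is ignored.
    return sha(repr(data))


def identity_key(pipeline, data):
    # Key that also covers what is computed (stand-in for an identity hash).
    return sha(repr(pipeline) + "|" + repr(data))


data = [3.0, 1.0, 4.0, 1.0, 5.0]
p_sum, p_max = ("sum_head", 3), ("max_head", 3)

by_data = {}
first = cached_run(by_data, data_key(p_sum, data), p_sum, data)
false_hit = cached_run(by_data, data_key(p_max, data), p_max, data)

by_identity = {}
cached_run(by_identity, identity_key(p_sum, data), p_sum, data)
correct = cached_run(by_identity, identity_key(p_max, data), p_max, data)
```

Under the data-only key, the second pipeline receives the first pipeline's cached sum (`false_hit` is 8.0 although the maximum is 4.0). With the identity-aware key the two pipelines occupy separate entries. This sketch only covers the "distinct pipelines get distinct keys" half of the congruence requirement; mapping equivalent pipelines to the same key would additionally need canonicalization.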

PDF Zenodo

PAPER 11

Verified AI Actions: Closing the Pre-Action Legitimacy Gap

Position Paper

Thomas Dionysopoulos

DOI: 10.5281/zenodo.20816935

Every production tool-calling protocol—MCP, OpenAI function calling, LangChain—treats tool selection as an assertion sufficient for execution. No evidence is required. No verification occurs. We identify a verification gap between what AI systems assert and what they can justify, independently named from three directions: the assertion gap in tool selection, the pre-action legitimacy gap in regulatory compliance, and the principle that “generation is not permission” in formal agent safety. We map the emerging landscape into three layers: post-hoc audit (deployed), policy gates (emerging), and semantic verification (gap). We present the first implemented and empirically evaluated system for runtime semantic verification of AI tool selection. In 180 trials, semantic verification reduced uncaught wrong-tool selection from 23.3% to 10.0% (Fisher exact p = 0.027). Identity-gated caching eliminates all 97 false hits observed under data-hash keying. A runtime grounding wall enforces that no unverified selection reaches execution.

PDF Zenodo

PAPER 12

Computational Identity

Submitted

Thomas Dionysopoulos

DOI: 10.5281/zenodo.20830084

Computing systems identify artifacts at six established layers—names, versions, content hashes, type signatures, source locations, and output fingerprints. None answers: do these two expressions describe the same computation? We define Computational Identity (CI): a deterministic, content-addressed identifier derived from the canonical planned computation graph of an expression. CI identifies what computation is planned, independent of how it is written, what data it operates on, or where it executes. We implement CI in two domains: a domain-specific computation language (97 false hits eliminated, 515 comparisons, 0 mismatches) and SQL (22/22 TPC-H equivalence classes collapsed, 56 variants, 0 false equivalences). CI provides structural equivalence—not semantic equivalence—and we characterize five concrete boundary cases.
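The idea of hashing a canonical computation graph can be sketched in Python. The nested-tuple expression format, the set of commutative operators, and the single normalization rule (sorting commutative operands) are assumptions chosen for brevity; the paper's canonical planned computation graph is not reproduced here.

```python
import hashlib
import json

# Operators whose operand order does not affect the computation (assumed set).
COMMUTATIVE = {"add", "mul", "and", "or"}


def canonical(node):
    """Recursively normalize an expression tree given as nested tuples."""
    if not isinstance(node, tuple):
        return node  # leaf: column name or literal
    op, *args = node
    args = [canonical(a) for a in args]
    if op in COMMUTATIVE:
        # Sort operands by their JSON text so order of writing is irrelevant.
        args.sort(key=json.dumps)
    return [op, *args]


def computational_identity(expr) -> str:
    """Content-addressed identifier of the canonical expression tree."""
    encoded = json.dumps(canonical(expr), separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


a, b = ("col", "a"), ("col", "b")
```

Here `("add", a, b)` and `("add", b, a)` receive the same identifier, while `("sub", a, b)` and `("sub", b, a)` do not. The identifier is structural: `("add", a, a)` and `("mul", 2, a)` compute the same values but hash differently, which is the kind of boundary case the abstract refers to when it distinguishes structural from semantic equivalence.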

PDF Zenodo

Paper 12 — Technical Notes

TECHNICAL NOTE

Supplementary Evidence and Conservative Index Characterization

Working Note

Thomas Dionysopoulos

Consolidates three experiments extending Paper 12: (A) agent tool governance with CI-augmented authorization on IronClaw (3 scenarios, 3/3 gaps detected), (B) cross-version drift detection in Polars dataframe query plans (5/5 queries drifted between v0.20 and v1.38), and (C) behavioral CI extracted by three independent language models over 45 tools (58% strict collapse, 87% majority, CI catches 88% of agent errors vs 38% for schema). Formalizes the Conservative Index Theorem: CI is sound (zero false positives), canonicalization monotonically reduces false negatives, and incompleteness is inherent (Rice’s theorem). Includes a cross-domain canonicalization table covering BLISP, SQL, Polars, and agent tools.

PDF

TECHNICAL NOTE

Computational Identity Applied to Agent Tool Governance: A Case Study on IronClaw

Case Study

Thomas Dionysopoulos

Applies CI to IronClaw, an agent runtime with deny-by-default authorization. Demonstrates three security gaps in schema-based tool authorization that CI closes: schema blindness (same schema, different computation), LLM semantic drift (same query rewritten, hallucinated filter detected), and supply chain drift (certified template modified by dependency update). All CI values are real SHA-256 hashes computed over canonicalized DataFusion logical plans.

PDF

TECHNICAL NOTE

Polars CI Case Study: Computational Identity on Dataframe Query Plans

Case Study

Thomas Dionysopoulos

Applies CI to Polars, a dataframe library with lazy evaluation. Tests two versions (0.20.31 and 1.38.1): 3/5 syntax equivalences collapse, 5/5 identical queries drift between versions (optimizer changes silently invalidate caches), and 2 boundary cases (join commutativity, node type differences) are consistent. Versioned CI makes drift explicit by construction. The study follows the same flaw-detect-fix pattern as the SQL case study in Paper 12.

PDF

Draft 13 — Candidate Paper (Experiment Complete)

DRAFT 13

Computational Identity as a Conservative Index for Execution Agreement

Draft

Thomas Dionysopoulos

Characterizes CI as a conservative index: matching identity guarantees matching execution (zero false positives), but non-matching identity does not guarantee different execution (false negatives exist). Proves that canonicalization monotonically reduces false negatives while preserving soundness (Canonicalization Monotonicity Theorem). Measures false negative rates across three domains: DSL (47 aliases, 17% reduction), SQL (56→22 equivalence classes, 61% reduction), and agent tool selection (45 tools × 3 models, 58% strict / 87% majority agreement). In a controlled agent error detection experiment (90 decisions), CI catches 88% of errors vs 38% for schema validation. CI strictly dominates: 4 CI-only catches, 0 schema-only catches. Includes out-of-sample evaluation on 30 held-out tools (57% collapse, consistent with main experiment). Status: experiment complete, results preserved, writing stopped pending decision on standalone paper vs Paper 12 appendix.

PDF

Build Provenance

Artifact	Value
Git commit	106b5fd
Version	2.0 (revised title, abstract, grounding wall)
DOI	10.5281/zenodo.20817087
Registry snapshot	236 capabilities (DIC at tag)
SHA-256 (PDF)	a54f49aaed2effcf702d693e327f525caf499fd9a4abb3fa028718cb0633bb4c
Prompts evaluated	30 (5 categories, 4 families, 9 metrics)
Hash stability	50 runs, bit-identical
Test suite	1,600 tests, 0 failures (incl. 9 grounding wall property tests)

Citation

How to Cite

If you reference the BLISP research program or any individual paper, please use the following.

Paper 1

@article{dionysopoulos2026grounding,
  title   = {The Grounding Gate: Verified Tool Selection
             for AI-Driven Research},
  author  = {Dionysopoulos, Thomas},
  year    = {2026},
  doi     = {10.5281/zenodo.20817087},
  note    = {Published draft, BLISP Research Program Paper 1 v2},
  url     = {https://blisp.ai/papers/paper1.pdf}
}

Paper 2

@article{dionysopoulos2026canonical,
  title   = {Canonical Execution Semantics for Stochastic Program
             Generators},
  author  = {Dionysopoulos, Thomas},
  year    = {2026},
  doi     = {10.5281/zenodo.20457255},
  note    = {Published draft, BLISP Research Program Paper 2},
  url     = {https://blisp.ai/papers/paper2.pdf}
}

Paper 3

@article{dionysopoulos2026categories,
  title   = {Execution Categories for Stochastic Program Generators:
             Quotient Semantics for Deterministic Executable Identity},
  author  = {Dionysopoulos, Thomas},
  year    = {2026},
  doi     = {10.5281/zenodo.20457403},
  note    = {Published draft, BLISP Research Program Paper 3},
  url     = {https://blisp.ai/papers/paper3.pdf}
}

Paper 4

@article{dionysopoulos2026provenance,
  title   = {Provenance Algebra for Deterministic {AI} Execution:
             Replay Semantics for Stochastic Program Generators},
  author  = {Dionysopoulos, Thomas},
  year    = {2026},
  doi     = {10.5281/zenodo.20457667},
  note    = {Published draft, BLISP Research Program Paper 4},
  url     = {https://blisp.ai/papers/paper4.pdf}
}

Paper 5

@article{dionysopoulos2026fibers,
  title   = {Proposal Collapse and Execution Fibers in Stochastic
             Program Generation},
  author  = {Dionysopoulos, Thomas},
  year    = {2026},
  doi     = {10.5281/zenodo.20457990},
  note    = {Published draft, BLISP Research Program Paper 5},
  url     = {https://blisp.ai/papers/paper5.pdf}
}

Paper 6

@article{dionysopoulos2026semantic,
  title   = {The Semantic Structure of Execution: An Empirical Study of
             Predictive Coordinates in Computational Operations},
  author  = {Dionysopoulos, Thomas},
  year    = {2026},
  doi     = {10.5281/zenodo.20612709},
  note    = {Published draft, BLISP Research Program Paper 6},
  url     = {https://doi.org/10.5281/zenodo.20612709}
}

Paper 7

@article{dionysopoulos2026predictive,
  title   = {Semantic Coordinates as Predictive Objects in Time-Series
             Computation},
  author  = {Dionysopoulos, Thomas},
  year    = {2026},
  doi     = {10.5281/zenodo.20706294},
  note    = {Published draft, BLISP Research Program Paper 7},
  url     = {https://doi.org/10.5281/zenodo.20706294}
}

Paper 8

@article{dionysopoulos2026transfer,
  title   = {Dependency Shape Predicts Execution Behavior Across
             Independent Data Processing Systems},
  author  = {Dionysopoulos, Thomas},
  year    = {2026},
  doi     = {10.5281/zenodo.20706086},
  note    = {Published draft, BLISP Research Program Paper 8},
  url     = {https://doi.org/10.5281/zenodo.20706086}
}

Paper 9

@article{dionysopoulos2026convergence,
  title   = {Agents Reconstruct Execution Identity Algebra Under
             Task Pressure},
  author  = {Dionysopoulos, Thomas},
  year    = {2026},
  doi     = {10.5281/zenodo.20706156},
  note    = {Published draft, BLISP Research Program Paper 9},
  url     = {https://doi.org/10.5281/zenodo.20706156}
}

Paper 10

@article{dionysopoulos2026falsehits,
  title   = {False Hits in Parameterized Pipeline Caching: Why Safe
             Compositional Replay Requires Congruence},
  author  = {Dionysopoulos, Thomas},
  year    = {2026},
  doi     = {10.5281/zenodo.20815342},
  note    = {Published draft, BLISP Research Program Paper 10},
  url     = {https://blisp.ai/papers/paper10.pdf}
}

Paper 11

@article{dionysopoulos2026verifiedactions,
  title   = {Verified {AI} Actions: Closing the Pre-Action
             Legitimacy Gap},
  author  = {Dionysopoulos, Thomas},
  year    = {2026},
  doi     = {10.5281/zenodo.20816935},
  note    = {Position paper, BLISP Research Program Paper 11},
  url     = {https://blisp.ai/papers/paper11.pdf}
}

Research Program

@misc{blisp2026research,
  title        = {BLISP Research Program: Admissibility, Execution,
                  Provenance, and Capability-Grounded AI Systems},
  author       = {Dionysopoulos, Thomas},
  year         = {2026},
  howpublished = {\url{https://blisp.ai/papers}},
  note         = {11-paper program; all papers published}
}

Paper Assets

Asset	Format	Link
Paper 1 — PDF	PDF, 13 pages	paper1.pdf
Paper 1 — Source	LaTeX tarball	paper1-source.tar.gz
Paper 1 — Artifacts	Prompts, verification scripts	artifacts/
Paper 2 — PDF	PDF, 23 pages	paper2.pdf
Paper 3 — PDF	PDF, 14 pages	paper3.pdf
Paper 4 — PDF	PDF, 15 pages	paper4.pdf
Paper 5 — PDF	PDF, 12 pages	paper5.pdf
Paper 6 — PDF	PDF	paper6.pdf
Paper 7 — PDF	PDF, 14 pages	paper7.pdf
Paper 8 — PDF	PDF, 17 pages	paper8.pdf
Paper 9 — PDF	PDF, 17 pages	paper9.pdf
Paper 10 — PDF	PDF	paper10.pdf
Paper 11 — PDF	PDF, 5 pages	paper11.pdf
All Papers — Artifacts	Zenodo packages + source	blisp-research
