Provenio Art Intelligence

Research Data

Supplementary materials for: Provenio: A Deployed Multi-Ontology Art Knowledge Graph with MCP-Based LLM Grounding for Cultural Heritage Queries

ISWC 2026 In-Use Track submission · Hoyeob Kim (Georgia Institute of Technology / Mildo Group)

Datasets

Pilot Query Set (20 queries + ground truth)

Download (15 KB)

Expert-curated art historical queries across four categories: provenance reconstruction, attribution dispute, iconographic identification, and market comparables. Each query includes ground-truth key entities compiled from primary sources.

pilot_queries.json

Benchmark Results

Download (82 KB)

Full model response logs for Claude Sonnet 4.6 and Codex/GPT-5.4, with per-query entity accuracy scores, abstention annotations, and hallucination flags.

pilot_20260427_025751.json

MCP Tool Interface Schema

Download (4 KB)

JSON Schema definitions for all five Provenio-MCP tools: artwork_provenance, artist_network, iconographic_search, market_signal, critical_reception. Input/output contracts with PROV-O uncertainty fields.

mcp_tool_schema.json

Academic API Access

Non-commercial academic research access to the Provenio KG API is available at no cost for verified institutional email addresses.

Contact: ceo@provenio.art

Benchmark Methodology

The pilot benchmark comprises 20 art historical queries across four categories (provenance reconstruction, attribution dispute, iconographic identification, market comparables), with 5 queries per category.

Ground truth for each query was compiled from primary sources: auction house records, museum collection catalogs, peer-reviewed attribution studies, and legal case records. Entities were cross-checked against the Provenio KG for internal consistency.

Hallucination Rate (HR) is reported for HIGH/VERY HIGH risk queries. Entity Accuracy (EA) is macro-averaged across all 20 queries using case-insensitive substring matching with romanization normalization. Abstention Rate (AR) reflects manually annotated explicit hedging phrases.
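The Entity Accuracy scoring described above can be sketched as follows. This is a minimal illustration, not the scoring script used for the paper: the function names are hypothetical, and "romanization normalization" is approximated here by stripping diacritics via Unicode NFKD decomposition, which is an assumption about what that step involves.

```python
import unicodedata


def normalize(text: str) -> str:
    """Case-fold and strip diacritics (assumed stand-in for romanization normalization)."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def entity_accuracy(response: str, gold_entities: list[str]) -> float:
    """Fraction of ground-truth entities found as substrings of the response."""
    if not gold_entities:
        return 0.0
    haystack = normalize(response)
    hits = sum(1 for entity in gold_entities if normalize(entity) in haystack)
    return hits / len(gold_entities)


def macro_entity_accuracy(items: list[tuple[str, list[str]]]) -> float:
    """Macro average: mean of per-query scores, each query weighted equally."""
    scores = [entity_accuracy(response, gold) for response, gold in items]
    return sum(scores) / len(scores) if scores else 0.0


# Illustrative inputs only; these are not taken from pilot_queries.json.
example = [
    ("Sold by Galerie Müller to the LOUVRE.", ["Galerie Muller", "Louvre", "Prado"]),
    ("No relevant names here.", ["Uffizi"]),
]
score = macro_entity_accuracy(example)
```

Macro averaging gives every query the same weight regardless of how many ground-truth entities it lists, so a query with two entities influences the final figure as much as one with ten.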

Statistical tests: Mann-Whitney U = 309, p = 0.0030 (rank-biserial r = 0.545, large effect; conservative independent-samples assumption); Fisher's exact test for abstention asymmetry, p = 0.0022. Wilson 95% CI for a 10% HR at n = 20: [2.8%, 29.2%].
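Two of the reported figures can be checked by hand. The sketch below assumes both groups in the Mann-Whitney test contain 20 scores (one per query) and that the 10% hallucination rate corresponds to 2 of 20 queries; neither assumption is stated explicitly in the text, and the function names are hypothetical. It uses the standard rank-biserial conversion r = 2U / (n1 · n2) − 1 and the plain Wilson score interval without continuity correction.

```python
import math


def rank_biserial_from_u(u: float, n1: int, n2: int) -> float:
    """Rank-biserial correlation from the Mann-Whitney U statistic."""
    return 2.0 * u / (n1 * n2) - 1.0


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (no continuity correction)."""
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Assumed group sizes: 20 scores per model.
effect_size = rank_biserial_from_u(309, 20, 20)

# Assumed count: 10% of 20 queries = 2 flagged responses.
lower, upper = wilson_interval(2, 20)
```

Under these assumptions the effect size comes out at 0.545, matching the reported value, and the Wilson lower bound is about 2.8%. The upper bound from this plain formula is about 30.1% rather than the 29.2% quoted, so the quoted figure presumably reflects a different interval variant or rounding.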

Citation

@inproceedings{kim2026provenio,
  author    = {Kim, Hoyeob},
  title     = {Provenio: A Deployed Multi-Ontology Art Knowledge
               Graph with MCP-Based LLM Grounding for Cultural
               Heritage Queries},
  booktitle = {Proceedings of ISWC 2026 In-Use Track},
  year      = {2026},
  publisher = {Springer}
}