A dynamic primitive arena for ARC-style grid tasks: ARC JSON loader, category routing, wide candidate families, MDL selector, transparent DCC trace, LOO validation, and visual failure mining.
easy_signature, macro, symmetry, motion, integer_expand, and crop_like→core/object are useful lanes. Blind global chain2 is weak; routed chain2 has real but narrow value.

This is a private research prototype for failure mining, primitive discovery, routing, and MDL/DCC diagnostics. It is not an ARC Prize submission, not an official benchmark claim, and not an exposure of proprietary MDLxDCC core internals.
The useful measurements here are train-fit exacts, HIGH confidence, LOO survival, new primitive-family wins, near-miss structure, exact per runtime, and whether the arena learns where to spend DCC budget.
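To make the difference between a train-fit exact and LOO survival concrete, here is a minimal sketch. It assumes LOO means "refit on all train pairs but one, then check the held-out pair"; the arena's actual definition may differ. `fit_color_map`, `train_fit_exact`, and `loo_score` are hypothetical names, and the recolor family is a toy stand-in, not one of the arena's primitives.

```python
from typing import Callable, Dict, List, Optional, Tuple

Grid = List[List[int]]
Pair = Tuple[Grid, Grid]
Transform = Callable[[Grid], Grid]
Family = Callable[[List[Pair]], Optional[Transform]]


def fit_color_map(pairs: List[Pair]) -> Optional[Transform]:
    """Toy family (hypothetical): learn one cell-wise color mapping, or fail."""
    mapping: Dict[int, int] = {}
    for src, dst in pairs:
        if len(src) != len(dst) or any(len(a) != len(b) for a, b in zip(src, dst)):
            return None  # shapes differ, so a pure recolor cannot fit
        for src_row, dst_row in zip(src, dst):
            for a, b in zip(src_row, dst_row):
                if mapping.setdefault(a, b) != b:
                    return None  # one input color would need two outputs
    # Colors never seen during fitting are left unchanged.
    return lambda grid: [[mapping.get(c, c) for c in row] for row in grid]


def train_fit_exact(family: Family, pairs: List[Pair]) -> bool:
    """True if the family fitted on all pairs reproduces every target."""
    model = family(pairs)
    return model is not None and all(model(src) == dst for src, dst in pairs)


def loo_score(family: Family, pairs: List[Pair]) -> float:
    """Fraction of pairs predicted correctly when held out of fitting."""
    if len(pairs) < 2:
        return 0.0  # nothing left to fit on
    hits = 0
    for i, (src, dst) in enumerate(pairs):
        model = family(pairs[:i] + pairs[i + 1:])
        if model is not None and model(src) == dst:
            hits += 1
    return hits / len(pairs)


# Each color appears in several pairs, so the rule survives hold-out.
robust = [([[1, 2]], [[3, 4]]), ([[2, 1]], [[4, 3]]), ([[1, 1]], [[3, 3]])]
# Each color appears in only one pair: exact on train, but memorized.
fragile = [([[1]], [[3]]), ([[2]], [[4]])]

print(train_fit_exact(fit_color_map, fragile), loo_score(fit_color_map, fragile))
print(train_fit_exact(fit_color_map, robust), loo_score(fit_color_map, robust))
```

The `fragile` case is the kind of row that fits every train pair yet scores zero under hold-out, which is why train-fit exacts and LOO are tracked as separate measurements.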
The local arc_report.html pages are valuable because they show train input, target, winner prediction, exact/near-miss status, categories, candidate counts, and failure buckets. Instead of embedding every run report, the v0.4 package should carry a compact atlas plus links back to per-run reports.
Package rule: place the generated key bundle beside this page as arc_v04_key_bundle/. Keep the huge original out_arc_v04_overnight_w10 outside the MDLxDCC package unless needed locally.
| Run | Cat | Family cat | Tasks | Exact | HIGH | LOO avg | Exact/hour | Runtime | Chain2 exact | Exact families |
|---|---|---|---|---|---|---|---|---|---|---|
| 28_cat_symmetry_wide | symmetry | all | 160 | 14 | 12 | 0.091 | 6.5 | 2:09:37 | 0 | erode_color:1; fill_holes:2; line_connect:1; object_to_marker:1; outline_foreground:1; ray_cast:1; recolor_by_map:1; scale_up_integer:1 |
| 32_cat_easy_signature_wide | easy_signature | all | 160 | 14 | 12 | 0.084 | 150.5 | 0:05:35 | 0 | erode_color:1; global_color_replace:1; line_connect:1; recolor_by_map:2; remove_background:1; rotate:1; scale_up_integer:2; substitution_expand:1 |
| 21_cat_macro_wide | macro | all | 160 | 14 | 11 | 0.082 | 6.5 | 2:08:16 | 0 | crop_largest_object:1; crop_non_bg_bbox:1; diagonal_connect:1; extract_object:1; fill_holes:1; gravity_pack:1; line_connect:2; recolor_by_map:1 |
| 15_cat_crop_like_wide | crop_like | all | 160 | 11 | 10 | 0.080 | 1.9 | 5:40:03 | 0 | block_compress:1; compress_blank_rows_cols:1; crop_largest_object:2; crop_non_bg_bbox:1; extract_object:1; pad_resize:1; remove_background:1; tile_repeat:3 |
| 11_cat_in_canvas_wide | in_canvas | all | 160 | 11 | 9 | 0.066 | 2.6 | 4:12:26 | 0 | diagonal_connect:1; fill_holes:1; gravity_pack:1; line_connect:2; outline_foreground:1; recolor_by_map:1; rotate:1; symmetry_complete:1 |
| 10_cat_same_shape_wide | same_shape | all | 160 | 11 | 9 | 0.066 | 2.4 | 4:36:48 | 0 | diagonal_connect:1; fill_holes:1; gravity_pack:1; line_connect:2; outline_foreground:1; recolor_by_map:1; rotate:1; symmetry_complete:1 |
| 45_paint_family_color_morph | recolor_or_paint | color,morphology,line | 220 | 11 | 9 | 0.050 | 5.0 | 2:12:52 | 2 | chain2:2; diagonal_connect:1; erode_color:1; fill_holes:1; line_connect:2; outline_foreground:1; ray_cast:1; recolor_by_map:1 |
| 17_cat_sequence_wide | sequence | all | 160 | 10 | 9 | 0.064 | 3.9 | 2:35:35 | 0 | crop_largest_object:1; crop_non_bg_bbox:1; extract_object:1; fill_holes:1; gravity_pack:1; line_connect:1; recolor_by_map:1; substitution_expand:1 |
| 24_cat_frame_wide | frame | all | 160 | 10 | 9 | 0.063 | 4.7 | 2:08:05 | 0 | block_compress:1; crop_largest_object:1; crop_non_bg_bbox:1; extract_object:1; outline_foreground:1; substitution_expand:2; tile_repeat:3 |
| 51_crop_family_core_object | crop_like | core,object | 186 | 10 | 8 | 0.060 | 107.8 | 0:05:34 | 1 | chain2:1; crop_largest_object:2; crop_non_bg_bbox:1; extract_object:1; pad_resize:1; remove_background:1; tile_repeat:3 |
| 23_cat_multicolor_object_wide | multicolor_object | all | 160 | 10 | 8 | 0.057 | 4.4 | 2:15:57 | 0 | block_compress:1; crop_largest_object:1; extract_object:1; gravity_pack:1; pad_resize:1; recolor_by_map:1; rotate:1; scale_up_integer:1 |
| 30_cat_hard_wide | hard | all | 120 | 8 | 7 | 0.066 | 1.2 | 6:34:19 | 0 | block_compress:1; compress_blank_rows_cols:1; crop_largest_object:1; extract_object:1; pad_resize:1; tile_repeat:2; wire_connect:1 |
| 20_cat_recolor_or_paint_wide | recolor_or_paint | all | 160 | 8 | 6 | 0.047 | 3.7 | 2:10:09 | 0 | diagonal_connect:1; fill_holes:1; line_connect:2; outline_foreground:1; recolor_by_map:1; symmetry_complete:1; wire_connect:1 |
| 25_cat_holes_wide | holes | all | 160 | 6 | 6 | 0.048 | 2.1 | 2:54:49 | 0 | crop_largest_object:1; extract_object:1; fill_holes:1; mirror:1; pad_resize:1; tile_repeat:1 |
| Run | Cat | Family cat | Tasks | Exact | HIGH | LOO avg | Exact/hour | Runtime | Chain2 exact | Exact families |
|---|---|---|---|---|---|---|---|---|---|---|
| 43_expand_family_macro | expand | macro | 68 | 4 | 4 | 0.070 | 348.0 | 0:00:41 | 0 | scale_up_integer:2; substitution_expand:2 |
| 49_motion_family_motion_object | motion | motion,object | 19 | 2 | 2 | 0.105 | 346.6 | 0:00:21 | 0 | extract_multicolor_object:1; translate_object:1 |
| 60_full_integer_expand_macro | integer_expand | macro | 38 | 4 | 4 | 0.113 | 167.9 | 0:01:26 | 0 | scale_up_integer:2; substitution_expand:2 |
| 14_cat_integer_expand_wide | integer_expand | all | 38 | 4 | 4 | 0.121 | 154.4 | 0:01:33 | 0 | scale_up_integer:2; substitution_expand:2 |
| 32_cat_easy_signature_wide | easy_signature | all | 160 | 14 | 12 | 0.084 | 150.5 | 0:05:35 | 0 | erode_color:1; global_color_replace:1; line_connect:1; recolor_by_map:2; remove_background:1; rotate:1; scale_up_integer:2; substitution_expand:1 |
| 51_crop_family_core_object | crop_like | core,object | 186 | 10 | 8 | 0.060 | 107.8 | 0:05:34 | 1 | chain2:1; crop_largest_object:2; crop_non_bg_bbox:1; extract_object:1; pad_resize:1; remove_background:1; tile_repeat:3 |
| 27_cat_motion_wide | motion | all | 19 | 5 | 5 | 0.307 | 68.4 | 0:04:23 | 0 | crop_non_bg_bbox:1; mirror:1; remove_background:1; rotate:1; translate_object:1 |
| 44_color_family_color | color | color | 220 | 2 | 1 | 0.007 | 51.8 | 0:02:19 | 1 | chain2:1; recolor_by_map:1 |
| 46_object_family_object | object | object | 220 | 4 | 3 | 0.025 | 49.6 | 0:04:50 | 0 | extract_multicolor_object:2; extract_object:1; translate_object:1 |
| 48_morphology_family_morph | morphology | morphology | 220 | 3 | 2 | 0.017 | 45.0 | 0:04:00 | 1 | chain2:1; fill_holes:2 |
| 47_multicolor_family_object_color | multicolor_object | object,color | 220 | 3 | 1 | 0.009 | 33.1 | 0:05:26 | 0 | extract_multicolor_object:1; extract_object:1; recolor_by_map:1 |
| 62_full_color_family | color | color | 80 | 1 | 0 | 0.006 | 26.8 | 0:02:14 | 0 | recolor_by_map:1 |
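The Exact/hour column in the table above can be approximately recomputed from the Exact and Runtime columns. The sketch below shows the arithmetic on three rows copied from that table; the helper names are hypothetical, and small differences from the published values are expected because the table was presumably computed from unrounded runtimes.

```python
from typing import List, Tuple


def runtime_to_hours(runtime: str) -> float:
    """Convert an H:MM:SS runtime string to fractional hours."""
    hours, minutes, seconds = (int(part) for part in runtime.split(":"))
    return hours + minutes / 60 + seconds / 3600


def exact_per_hour(exact: int, runtime: str) -> float:
    """Exact task count divided by wall-clock hours (0.0 for zero runtime)."""
    hours = runtime_to_hours(runtime)
    return exact / hours if hours > 0 else 0.0


# (run, exact, runtime) copied from the table above.
rows: List[Tuple[str, int, str]] = [
    ("51_crop_family_core_object", 10, "0:05:34"),
    ("27_cat_motion_wide", 5, "0:04:23"),
    ("32_cat_easy_signature_wide", 14, "0:05:35"),
]

# Rank lanes by efficiency, best first.
ranked = sorted(rows, key=lambda r: exact_per_hour(r[1], r[2]), reverse=True)
for name, exact, runtime in ranked:
    print(f"{name}: {exact_per_hour(exact, runtime):.1f} exact/hour")
```

Sorting by this ratio rather than by raw exact count is what surfaces the short, cheap lanes ahead of multi-hour wide runs with similar exact totals.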
32_cat_easy_signature_wide gives 14 exact in 5:35; 51_crop_family_core_object gives 10 exact in 5:34; 43_expand_family_macro gives 4 exact in 41 seconds. These are the lanes v0.5 should remember and route toward first.

The 50-task global baseline with chain2 and the no-chain2 ablation both produced 4 exact / 3 HIGH. The no-chain2 run used roughly half the runtime. That keeps chain2 out of the global default.
But routed chain2 produced 5 exact rows across 5 unique tasks, with 3 robust LOO=1.0 rows. The useful cases are mostly color/morph/crop compositions such as line_connect → component_recolor or fill_holes → global_color_replace.
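One way to read "routed" is as a gate: chain2 runs only when the task's categories match lanes where it has already paid off, and then only up to a per-family cap. The sketch below is an assumption about how such a gate could look, not the arena's code. The category set is taken from the runs in this page that report a non-zero Chain2 exact count; the cap values, `DEFAULT_CAP`, and the function names are hypothetical, and `exact-only` is left unimplemented because its semantics are not described here.

```python
import argparse
from typing import Dict, Set

POLICIES = ("off", "always", "routed", "exact-only")

# Categories of the runs on this page with a non-zero "Chain2 exact" count.
ROUTED_CATEGORIES: Set[str] = {"recolor_or_paint", "crop_like", "color", "morphology"}

# Hypothetical per-family caps on chain2 candidates; the numbers are placeholders.
FAMILY_CAPS: Dict[str, int] = {"color": 8, "morphology": 8, "core": 4, "object": 4}
DEFAULT_CAP = 2


def build_parser() -> argparse.ArgumentParser:
    """CLI sketch with routed as the default policy."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--chain2-policy", choices=POLICIES, default="routed")
    return parser


def chain2_budget(policy: str, task_categories: Set[str], family: str) -> int:
    """Number of chain2 candidates to try for one family (0 means skip)."""
    if policy == "off":
        return 0
    cap = FAMILY_CAPS.get(family, DEFAULT_CAP)
    if policy == "always":
        return cap
    if policy == "routed":
        # Spend chain2 budget only where category evidence exists.
        return cap if task_categories & ROUTED_CATEGORIES else 0
    raise NotImplementedError(f"semantics of policy {policy!r} are not specified here")


args = build_parser().parse_args([])
print(args.chain2_policy)
print(chain2_budget(args.chain2_policy, {"crop_like", "frame"}, "core"))
print(chain2_budget(args.chain2_policy, {"line"}, "line"))
```

Keeping the gate as a pure function of policy, categories, and family makes it easy to log why chain2 was or was not attempted for a given task.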
--chain2-policy off|always|routed|exact-only. Default should be routed, with category evidence and per-family caps.

| Exact win family | Rows |
|---|---|
| tile_repeat | 20 |
| fill_holes | 17 |
| substitution_expand | 16 |
| line_connect | 16 |
| recolor_by_map | 15 |
| wire_connect | 14 |
| scale_up_integer | 12 |
| crop_largest_object | 12 |
| extract_object | 11 |
| translate_object | 9 |
| diagonal_connect | 8 |
| outline_foreground | 7 |
| gravity_pack | 6 |
| symmetry_complete | 6 |
| crop_non_bg_bbox | 6 |
| pad_resize | 6 |
| Near-miss family | Rows |
|---|---|
| global_color_replace | 747 |
| identity | 690 |
| pad_resize | 537 |
| line_connect | 341 |
| erode_color | 240 |
| fill_holes | 232 |
| translate_object | 187 |
| diagonal_connect | 184 |
| frame_extract | 172 |
| crop_color_bbox | 170 |
| row_col_project | 156 |
| tile_repeat | 135 |
| extract_multicolor_object | 125 |
| crop_largest_object | 103 |
| component_recolor | 91 |
| chain2 | 82 |
Row counts are not unique-task counts because the same task can appear in several category slices. Still, the repeated winners are informative: tile_repeat, fill_holes, substitution_expand, line_connect, recolor_by_map, wire_connect, scale_up_integer, crop_largest_object, and extract_object are the current strongest families.
Across the 40-run bundle, v0.4 finds 39 unique exact train-fit task ids; 33 have best LOO=1.0. Baseline global runs found 4; the rest came from slicing and routing.
| First run that found task | New exact task ids | Robust LOO=1.0 | Task ids |
|---|---|---|---|
| 01_dev50_wide_all | 4 | 3 | 007bbfb7, 00d62c1b, 070dd51e, 0d3d703e |
| 10_cat_same_shape_wide | 8 | 7 | 1e0a9b12, 1f876c06, 22168020, 22eb0ac0, 25ff71a9, 3c9b0459, 4347f46a, 496994bd |
| 13_cat_expand_wide | 3 | 3 | 5b6cbef5, 60c09cac, c59eb873 |
| 15_cat_crop_like_wide | 11 | 10 | 1cf80156, 1f85a75f, 2013d3e2, 23b5c85d, 28bf18c6, 5614dbcf, 5bd6f4ac, 68b67ca3, 73182012, a740d043, be94b721 |
| 25_cat_holes_wide | 1 | 1 | 67a3c6ac |
| 28_cat_symmetry_wide | 4 | 4 | 623ea044, 6f8cd79b, 88a10436, a5313dff |
| 32_cat_easy_signature_wide | 3 | 2 | 9dfd6313, aabf363d, c8f0f002 |
| 44_color_family_color | 1 | 1 | 62ab2642 |
| 45_paint_family_color_morph | 2 | 2 | 0d87d2a6, 7b6016b9 |
| 48_morphology_family_morph | 1 | 0 | 91714a58 |
| 51_crop_family_core_object | 1 | 0 | b9b7f026 |
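Because row counts repeat tasks across slices, the table above credits each task to the first run that solved it. A minimal sketch of that attribution follows, assuming rows arrive in run order and that "robust" means the task's best LOO over all rows is 1.0. The function name is hypothetical and the rows are placeholders, not data from the bundle.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, str, float]  # (run, task_id, loo) for one exact row


def first_run_table(rows: List[Row]) -> Dict[str, Dict[str, object]]:
    """Credit each task id to the first run that found it exactly."""
    first_run: Dict[str, str] = {}
    best_loo: Dict[str, float] = {}
    for run, task_id, loo in rows:
        first_run.setdefault(task_id, run)  # keeps only the earliest run
        best_loo[task_id] = max(best_loo.get(task_id, 0.0), loo)

    table: Dict[str, Dict[str, object]] = {}
    for task_id, run in first_run.items():
        entry = table.setdefault(run, {"new": [], "robust": 0})
        entry["new"].append(task_id)
        if best_loo[task_id] >= 1.0:
            entry["robust"] += 1
    return table


# Placeholder rows in run order (illustrative only).
rows = [
    ("run_01", "task_a", 1.0),
    ("run_01", "task_b", 0.5),
    ("run_02", "task_a", 1.0),  # repeat of task_a, not new
    ("run_02", "task_c", 0.0),
    ("run_03", "task_b", 1.0),  # lifts task_b's best LOO to 1.0
]

table = first_run_table(rows)
unique_tasks = sum(len(entry["new"]) for entry in table.values())
print(len(rows), "exact rows,", unique_tasks, "unique tasks")
for run, entry in table.items():
    print(run, entry["new"], entry["robust"])
```

Note that `run_03` contributes no new task but still changes the robust count credited to `run_01`, since robustness is judged on the best LOO seen anywhere in the bundle.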
| Failure bucket | Rows |
|---|---|
| needs_depth2_composition | 4985 |
| near_miss_small_rule_gap | 2801 |
| needs_natural_law_primitive | 1795 |
| needs_in_canvas_rule | 1180 |
| needs_relational_object_reasoning | 1086 |
| needs_wire_crossing_or_signal_priority | 302 |
| needs_crop_then_recolor_or_mask_cleanup | 220 |
| needs_substitution_or_morphogenesis | 206 |
| needs_crop_resize_or_canvas_transform | 137 |
| unknown_or_small_missing_step | 25 |
| needs_transform_then_crop_or_recolor | 17 |
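To see how concentrated the failure rows are, the counts in the table above can be turned into shares. The sketch below uses the counts exactly as listed; as with the win tables, these are rows, not unique tasks, so the shares describe row volume only.

```python
# Failure bucket row counts copied from the table above.
buckets = {
    "needs_depth2_composition": 4985,
    "near_miss_small_rule_gap": 2801,
    "needs_natural_law_primitive": 1795,
    "needs_in_canvas_rule": 1180,
    "needs_relational_object_reasoning": 1086,
    "needs_wire_crossing_or_signal_priority": 302,
    "needs_crop_then_recolor_or_mask_cleanup": 220,
    "needs_substitution_or_morphogenesis": 206,
    "needs_crop_resize_or_canvas_transform": 137,
    "unknown_or_small_missing_step": 25,
    "needs_transform_then_crop_or_recolor": 17,
}

total = sum(buckets.values())
shares = {name: count / total for name, count in buckets.items()}

# Print buckets from largest to smallest with their share of all rows.
for name, count in sorted(buckets.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {count} rows ({shares[name]:.1%})")
print(f"total: {total} rows")
```

On these counts, needs_depth2_composition alone accounts for roughly 39% of the 12,754 failure rows, and the top two buckets together for about 61%.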
| Run | Cat | Family cat | Tasks | Exact | HIGH | LOO avg | Exact/hour | Runtime | Chain2 exact | Exact families |
|---|---|---|---|---|---|---|---|---|---|---|
| 81_holdout_line_wide | line | line | 100 | 0 | 0 | 0.003 | 0.0 | 3:51:30 | 0 | |
| 41_sequence_family_line | sequence | line | 220 | 1 | 1 | 0.006 | 0.4 | 2:49:12 | 0 | line_connect:1 |
| 52_holes_family_morph_natural | holes | morphology,natural | 220 | 2 | 2 | 0.018 | 0.4 | 5:00:51 | 0 | fill_holes:2 |
| 31_cat_slow_risk_quick | slow_risk | all | 116 | 2 | 2 | 0.022 | 0.6 | 3:29:37 | 0 | crop_largest_object:1; wire_connect:1 |
| 30_cat_hard_wide | hard | all | 120 | 8 | 7 | 0.066 | 1.2 | 6:34:19 | 0 | block_compress:1; compress_blank_rows_cols:1; crop_largest_object:1; extract_object:1; pad_resize:1; tile_repeat:2; wire_connect:1 |
| 40_line_family_line | line | line | 220 | 4 | 3 | 0.019 | 1.4 | 2:50:41 | 0 | diagonal_connect:1; line_connect:2; wire_connect:1 |
| 61_full_line_family | line | line | 80 | 2 | 2 | 0.025 | 1.9 | 1:03:26 | 0 | diagonal_connect:1; wire_connect:1 |
| 15_cat_crop_like_wide | crop_like | all | 160 | 11 | 10 | 0.080 | 1.9 | 5:40:03 | 0 | block_compress:1; compress_blank_rows_cols:1; crop_largest_object:2; crop_non_bg_bbox:1; extract_object:1; pad_resize:1; remove_background:1; tile_repeat:3 |
| 26_cat_morphology_wide | morphology | all | 160 | 6 | 6 | 0.048 | 2.0 | 2:57:28 | 0 | crop_largest_object:1; extract_object:1; fill_holes:1; mirror:1; pad_resize:1; tile_repeat:1 |
| 25_cat_holes_wide | holes | all | 160 | 6 | 6 | 0.048 | 2.1 | 2:54:49 | 0 | crop_largest_object:1; extract_object:1; fill_holes:1; mirror:1; pad_resize:1; tile_repeat:1 |
needs_depth2_composition is too broad. It appears almost everywhere and must be split into object matching, macro-grid, line completion, marker instruction, crop/resize, color-role transfer, local cellular rule, and canvas transformation buckets. identity and global_color_replace near-misses are also too noisy and should be treated as diagnostic hints, not strong routes.
- arc_az.json and optional task cache storing task signature hash, categories, winning families, failed families, near-misses, runtime cost, and best candidate summaries.
- --cat-report, and --chain2-policy routed; skip expensive lanes when shape evidence says impossible.
- --cat line,color --cat-mode all and not:slow_risk.
- exact_but_loo_bad, identity_fallback_suspect, and near_miss_small_rule_gap subtypes.
- arc_report.html inside the MDLxDCC package.

Canonical source files remain the arena code, batch file, summarizer, and JSON outputs. The compact key bundle is the MDLxDCC-facing artifact.
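The arc_az.json task cache mentioned above could be shaped roughly as follows. This is a sketch under stated assumptions: the field names mirror the list of stored items, but the record layout, the `TaskRecord` class, the helper functions, and the choice to hash the serialized train pairs as the "task signature" are all hypothetical, as is the placeholder task id in the usage lines.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class TaskRecord:
    """One cache entry per task (hypothetical layout)."""
    signature_hash: str
    categories: List[str] = field(default_factory=list)
    winning_families: List[str] = field(default_factory=list)
    failed_families: List[str] = field(default_factory=list)
    near_misses: List[str] = field(default_factory=list)
    runtime_seconds: float = 0.0
    best_candidates: List[str] = field(default_factory=list)


def signature_hash(train_pairs: list) -> str:
    """Stable short hash of the train pairs (assumed signature definition)."""
    payload = json.dumps(train_pairs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def save_cache(path: str, cache: Dict[str, TaskRecord]) -> None:
    """Write the cache as JSON keyed by task id."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({tid: asdict(rec) for tid, rec in cache.items()}, handle, indent=2)


def load_cache(path: str) -> Dict[str, TaskRecord]:
    """Read the cache back into TaskRecord objects."""
    with open(path, "r", encoding="utf-8") as handle:
        return {tid: TaskRecord(**rec) for tid, rec in json.load(handle).items()}


# Placeholder task and values, for illustration only.
pairs = [{"input": [[0, 1]], "output": [[1, 0]]}]
record = TaskRecord(
    signature_hash=signature_hash(pairs),
    categories=["same_shape"],
    winning_families=["mirror"],
    runtime_seconds=1.5,
)
cache = {"example_task": record}
```

A later run could then call `load_cache`, compare each task's signature hash, and skip families already recorded under `failed_families`, which is one way the arena could learn where to spend DCC budget.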
python arc_mdlxdcc_arena_v0_4.py --data-dir D:\8Z\8z\ARC\ARC-AGI-2 --split dev --max-tasks 50 --outdir out_arc_v04_quick_w10 --mode both --fresh --budget wide --workers 10 --progress-every 5
python summarize_arc_overnight.py out_arc_v04_overnight_w10
python arc_collect_keydata_v04.py out_arc_v04_overnight_w10 --out out_arc_v04_keybundle
Expected local package link targets: arc_v04_key_bundle/arc_visual_atlas.html, arc_v04_key_bundle/arc_report_hub.html, arc_v04_key_bundle/run_index.csv, and arc_v04_key_bundle/task_index.csv.