Incremental report.
		| @@ -59,6 +59,110 @@ | |||||||
|  |  | ||||||
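| > Note: for the complete reproduction commands and a fuller summary of the experiments, see `docs/description/Performance_Benchmark.md`. | > Note: for the complete reproduction commands and a fuller summary of the experiments, see `docs/description/Performance_Benchmark.md`. | ||||||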
|  |  | ||||||
|  | ### 3.4 Three-Dimensional Benchmark (Backbone × Attention × Single/FPN, CPU, 512×512, runs=3) | ||||||
|  |  | ||||||
|  | For side-by-side comparison, the full three-dimensional benchmark table is included here: | ||||||
|  |  | ||||||
|  | | Backbone         | Attention | Single Mean ± Std (ms) | FPN Mean ± Std (ms) | | ||||||
|  | |------------------|-----------|-----------------------:|--------------------:| | ||||||
|  | | vgg16            | none      | 351.65 ± 1.88          | 719.33 ± 3.95       | | ||||||
|  | | vgg16            | se        | 349.76 ± 2.00          | 721.41 ± 2.74       | | ||||||
|  | | vgg16            | cbam      | 354.45 ± 1.49          | 744.76 ± 29.32      | | ||||||
|  | | resnet34         | none      | 90.99 ± 0.41           | 117.22 ± 0.41       | | ||||||
|  | | resnet34         | se        | 90.78 ± 0.47           | 115.91 ± 1.31       | | ||||||
|  | | resnet34         | cbam      | 96.50 ± 3.17           | 111.09 ± 1.01       | | ||||||
|  | | efficientnet_b0  | none      | 40.45 ± 1.53           | 127.30 ± 0.09       | | ||||||
|  | | efficientnet_b0  | se        | 46.48 ± 0.26           | 142.35 ± 6.61       | | ||||||
|  | | efficientnet_b0  | cbam      | 47.11 ± 0.47           | 150.99 ± 12.47      | | ||||||
|  |  | ||||||
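|  | Key takeaway: on CPU, ResNet34 offers the most robust trade-off between speed and FPN overhead (FPN adds roughly 15% to 29% over single-scale, depending on the attention module). EfficientNet-B0 is the fastest in single-scale mode, but its FPN path takes roughly 3× as long as its single-scale path. | ||||||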
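
The relative FPN cost mentioned above can be recomputed directly from the table. The following is a minimal sketch, not part of the benchmark scripts: the helper name `fpn_overhead_percent` and the `CPU_MEAN_MS` dictionary are illustrative assumptions, and only the mean latencies are copied from the table (standard deviations are ignored).

```python
# Mean latencies in ms copied from the CPU table: (single-scale, FPN).
# The dictionary layout is an illustrative assumption, not the script's format.
CPU_MEAN_MS = {
    ("vgg16", "none"): (351.65, 719.33),
    ("vgg16", "se"): (349.76, 721.41),
    ("vgg16", "cbam"): (354.45, 744.76),
    ("resnet34", "none"): (90.99, 117.22),
    ("resnet34", "se"): (90.78, 115.91),
    ("resnet34", "cbam"): (96.50, 111.09),
    ("efficientnet_b0", "none"): (40.45, 127.30),
    ("efficientnet_b0", "se"): (46.48, 142.35),
    ("efficientnet_b0", "cbam"): (47.11, 150.99),
}


def fpn_overhead_percent(single_ms: float, fpn_ms: float) -> float:
    """Extra cost of the FPN path relative to single-scale, in percent."""
    return (fpn_ms - single_ms) / single_ms * 100.0


# Print one line per (backbone, attention) combination.
for (backbone, attention), (single_ms, fpn_ms) in CPU_MEAN_MS.items():
    overhead = fpn_overhead_percent(single_ms, fpn_ms)
    print(f"{backbone:16s} {attention:5s} {overhead:+7.1f}%")
```

Because runs=3 gives a small sample, the per-row percentages are best read as rough indicators rather than precise measurements.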
|  |  | ||||||
|  | ### 3.5 GPU Breakdown (with Attention, A100, 512×512, runs=5) | ||||||
|  |  | ||||||
|  | The table below breaks down GPU latency by attention type: | ||||||
|  |  | ||||||
|  | | Backbone           | Attention | Single Mean ± Std (ms) | FPN Mean ± Std (ms) | | ||||||
|  | |--------------------|-----------|-----------------------:|--------------------:| | ||||||
|  | | vgg16              | none      | 4.53 ± 0.02            | 8.51 ± 0.002        | | ||||||
|  | | vgg16              | se        | 3.80 ± 0.01            | 7.12 ± 0.004        | | ||||||
|  | | vgg16              | cbam      | 3.73 ± 0.02            | 6.95 ± 0.09         | | ||||||
|  | | resnet34           | none      | 2.32 ± 0.04            | 2.73 ± 0.007        | | ||||||
|  | | resnet34           | se        | 2.33 ± 0.01            | 2.73 ± 0.004        | | ||||||
|  | | resnet34           | cbam      | 2.46 ± 0.04            | 2.74 ± 0.004        | | ||||||
|  | | efficientnet_b0    | none      | 3.69 ± 0.07            | 4.38 ± 0.02         | | ||||||
|  | | efficientnet_b0    | se        | 3.76 ± 0.06            | 4.37 ± 0.03         | | ||||||
|  | | efficientnet_b0    | cbam      | 3.99 ± 0.08            | 4.41 ± 0.02         | | ||||||
|  |  | ||||||
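|  | Key takeaway: on GPU, attention has only a small effect on latency. ResNet34 remains the best choice for both single-scale and FPN, with an FPN overhead of about +18% (2.32 ms to 2.73 ms without attention). | ||||||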
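
As a quick check of the overhead figure, using the ResNet34 rows of the table above and taking overhead as (FPN − Single) / Single: without attention, $(2.73 - 2.32) / 2.32 \times 100\% \approx 17.7\%$; with SE, $(2.73 - 2.33) / 2.33 \times 100\% \approx 17.2\%$; with CBAM, $(2.74 - 2.46) / 2.46 \times 100\% \approx 11.4\%$. The quoted +18% therefore corresponds to the no-attention row.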
|  |  | ||||||
|  | ### 3.6 Comparison Method and JSON Structure (Methodology Supplement) | ||||||
|  |  | ||||||
|  | - Speedup (`speedup_percent`): $(\text{SW\_time} - \text{FPN\_time}) / \text{SW\_time} \times 100\%$, where SW denotes the sliding-window baseline. | ||||||
|  | - Memory saving (`memory_saving_percent`): $(\text{SW\_mem} - \text{FPN\_mem}) / \text{SW\_mem} \times 100\%$. | ||||||
|  | - Accuracy safeguard: the match count must not drop significantly (for example, FPN_matches ≥ SW_matches × 0.95). | ||||||
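
The three criteria above can be sketched in a few lines of Python. This is an illustration of the formulas only, not the benchmark script's code: the function names, the `min_ratio` parameter, and the sample inputs are all assumptions chosen for the example.

```python
def speedup_percent(sw_time_ms: float, fpn_time_ms: float) -> float:
    """(SW_time - FPN_time) / SW_time * 100; positive when FPN is faster."""
    return (sw_time_ms - fpn_time_ms) / sw_time_ms * 100.0


def memory_saving_percent(sw_mem_mb: float, fpn_mem_mb: float) -> float:
    """(SW_mem - FPN_mem) / SW_mem * 100; positive when FPN uses less memory."""
    return (sw_mem_mb - fpn_mem_mb) / sw_mem_mb * 100.0


def matches_preserved(fpn_matches: int, sw_matches: int,
                      min_ratio: float = 0.95) -> bool:
    """True when FPN keeps at least min_ratio of the sliding-window matches."""
    return fpn_matches >= sw_matches * min_ratio


# Made-up inputs for illustration; these are not measured results.
print(f"speedup: {speedup_percent(400.0, 300.0):.2f}%")
print(f"memory saving: {memory_saving_percent(1000.0, 800.0):.2f}%")
print(f"matches preserved: {matches_preserved(96, 100)}")
```

Both percentages are relative to the sliding-window baseline, so a negative value means FPN was slower or used more memory.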
|  |  | ||||||
|  | Example structure of the JSON emitted by the script (abridged): | ||||||
|  |  | ||||||
|  | ```json | ||||||
|  | { | ||||||
|  |   "timestamp": "2025-10-20 14:30:45", | ||||||
|  |   "config": "configs/base_config.yaml", | ||||||
|  |   "model_path": "path/to/model_final.pth", | ||||||
|  |   "layout_path": "test_data/layout.png", | ||||||
|  |   "template_path": "test_data/template.png", | ||||||
|  |   "device": "cuda:0", | ||||||
|  |   "fpn": { | ||||||
|  |     "method": "FPN", | ||||||
|  |     "mean_time_ms": 245.32, | ||||||
|  |     "std_time_ms": 12.45, | ||||||
|  |     "gpu_memory_mb": 1024.5, | ||||||
|  |     "num_runs": 5 | ||||||
|  |   }, | ||||||
|  |   "sliding_window": { | ||||||
|  |     "method": "Sliding Window", | ||||||
|  |     "mean_time_ms": 352.18, | ||||||
|  |     "std_time_ms": 18.67 | ||||||
|  |   }, | ||||||
|  |   "comparison": { | ||||||
|  |     "speedup_percent": 30.35, | ||||||
|  |     "memory_saving_percent": 21.14, | ||||||
|  |     "fpn_faster": true, | ||||||
|  |     "meets_speedup_target": true, | ||||||
|  |     "meets_memory_target": true | ||||||
|  |   } | ||||||
|  | } | ||||||
|  | ``` | ||||||
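
A report of this shape can be sanity-checked by recomputing the speedup from the two mean times. The sketch below assumes only the field names shown in the example; the embedded JSON is a trimmed copy of it, and the 0.1 percentage-point tolerance is an arbitrary choice to absorb rounding in the reported values.

```python
import json

# Trimmed copy of the example report; only the fields used below are kept.
REPORT_JSON = """
{
  "fpn": {"mean_time_ms": 245.32, "std_time_ms": 12.45},
  "sliding_window": {"mean_time_ms": 352.18, "std_time_ms": 18.67},
  "comparison": {"speedup_percent": 30.35, "fpn_faster": true}
}
"""


def recompute_speedup(report: dict) -> float:
    """Recompute speedup_percent from the mean times stored in the report."""
    sw_ms = report["sliding_window"]["mean_time_ms"]
    fpn_ms = report["fpn"]["mean_time_ms"]
    return (sw_ms - fpn_ms) / sw_ms * 100.0


report = json.loads(REPORT_JSON)
recomputed = recompute_speedup(report)
reported = report["comparison"]["speedup_percent"]

# Tolerance is an assumption: the reported value may be rounded or derived
# from unrounded timings.
consistent = abs(recomputed - reported) < 0.1
print(f"reported={reported:.2f}% recomputed={recomputed:.2f}% consistent={consistent}")
```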
|  |  | ||||||
|  | ### 3.7 Reproduction Commands (Portable) | ||||||
|  |  | ||||||
|  | CPU attention comparison: | ||||||
|  |  | ||||||
|  | ```zsh | ||||||
|  | PYTHONPATH=. uv run python tests/benchmark_attention.py \ | ||||||
|  |   --device cpu --image-size 512 --runs 10 \ | ||||||
|  |   --backbone resnet34 --places backbone_high desc_head | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | Three-dimensional benchmark: | ||||||
|  |  | ||||||
|  | ```zsh | ||||||
|  | PYTHONPATH=. uv run python tests/benchmark_grid.py \ | ||||||
|  |   --device cpu --image-size 512 --runs 3 \ | ||||||
|  |   --backbones vgg16 resnet34 efficientnet_b0 \ | ||||||
|  |   --attentions none se cbam \ | ||||||
|  |   --places backbone_high desc_head | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | GPU three-dimensional benchmark (if available): | ||||||
|  |  | ||||||
|  | ```zsh | ||||||
|  | PYTHONPATH=. uv run python tests/benchmark_grid.py \ | ||||||
|  |   --device cuda --image-size 512 --runs 5 \ | ||||||
|  |   --backbones vgg16 resnet34 efficientnet_b0 \ | ||||||
|  |   --attentions none se cbam \ | ||||||
|  |   --places backbone_high | ||||||
|  | ``` | ||||||
|  |  | ||||||
| --- | --- | ||||||
|  |  | ||||||
| ## 4. Data and Training Recommendations (Actionable Recommendations) | ## 4. Data and Training Recommendations (Actionable Recommendations) | ||||||
|   | |||||||
	 Jiao77