This blog post draws on the tutorial and datasets from 同济子豪兄 (Tongji Zihao); click through to the original link.
Inference with pre-trained semantic segmentation models. This post covers two approaches: a single command line, and the Python API.
1. Single-command-line approach
Running one command at a time is slower, but it is easy to follow and makes the meaning of each step explicit, which suits beginners.
Enter the mmsegmentation root directory
import os
os.chdir('mmsegmentation')
Load a test image
from PIL import Image
# Image.open('data/street_uk.jpeg')
Download the assets into the data directory. If you hit the error Unable to establish SSL connection., just re-run the cell.
# London street photo
!wget https://www.6hu.cc/wp-content/uploads/2023/12/224869-JdyRbw.jpeg -P data
# bedroom photo
!wget https://www.6hu.cc/wp-content/uploads/2023/12/224869-rgRUpi.jpg -P data
# Shanghai driving street-view video, source: https://www.youtube.com/watch?v=ll8TgCZ0plk
!wget https://zihao-download.obs.cn-east-3.myhuaweicloud.com/detectron2/traffic.mp4 -P data
# street video, shot 2022-03-30
!wget https://zihao-openmmlab.obs.cn-east-3.myhuaweicloud.com/20220713-mmdetection/images/street_20220330_174028.mp4 -P data
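If wget keeps failing with SSL errors, the same downloads can be done from plain Python with the standard library. A minimal sketch; the helper names `url_filename` and `download_to` are my own, not part of the tutorial:

```python
import os
import urllib.request

def url_filename(url):
    """Filename part of a URL, e.g. '.../traffic.mp4' -> 'traffic.mp4'."""
    return url.rsplit('/', 1)[-1]

def download_to(url, out_dir='data'):
    """Download url into out_dir (created if missing); skip if already present."""
    os.makedirs(out_dir, exist_ok=True)
    dst = os.path.join(out_dir, url_filename(url))
    if not os.path.exists(dst):
        urllib.request.urlretrieve(url, dst)
    return dst
```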
Pick a pre-trained semantic segmentation model's config file and checkpoint weights file from the Model Zoo.
Model Zoo: github.com/open-mmlab/…
Note: config files and checkpoint files correspond one-to-one; make sure the pair matches.
- Segformer, pre-trained on the Cityscapes dataset
configs/segformer/segformer_mit-b5_8xb1-160k_cityscapes-1024x1024.py
https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b5_8x1_1024x1024_160k_cityscapes/segformer_mit-b5_8x1_1024x1024_160k_cityscapes_20211206_072934-87a052ec.pth
- Segformer, pre-trained on the ADE20K dataset
configs/segformer/segformer_mit-b5_8xb2-160k_ade20k-640x640.py
https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b5_640x640_160k_ade20k/segformer_mit-b5_640x640_160k_ade20k_20210801_121243-41d2845b.pth
Segformer, pre-trained on the Cityscapes dataset
Cityscapes semantic segmentation dataset: www.cityscapes-dataset.com
19 classes:
'road', 'sidewalk', 'building', 'wall', 'fence', 'pole', 'traffic light', 'traffic sign', 'vegetation', 'terrain', 'sky', 'person', 'rider', 'car', 'truck', 'bus', 'train', 'motorcycle', 'bicycle'
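Predictions come back as integer class ids, so a small id-to-name lookup (built here by hand from the list above) helps when reading masks later:

```python
# Cityscapes class ids 0-18, in the order listed above
CITYSCAPES_CLASSES = [
    'road', 'sidewalk', 'building', 'wall', 'fence', 'pole',
    'traffic light', 'traffic sign', 'vegetation', 'terrain', 'sky',
    'person', 'rider', 'car', 'truck', 'bus', 'train',
    'motorcycle', 'bicycle',
]
ID2CLASS = dict(enumerate(CITYSCAPES_CLASSES))
print(ID2CLASS[13])  # car
```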
!python demo/image_demo.py \
    data/street_uk.jpeg \
    configs/segformer/segformer_mit-b5_8xb1-160k_cityscapes-1024x1024.py \
    https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b5_8x1_1024x1024_160k_cityscapes/segformer_mit-b5_8x1_1024x1024_160k_cityscapes_20211206_072934-87a052ec.pth \
    --out-file outputs/B1_uk_segformer.jpg \
    --device cuda:0 \
    --opacity 0.5
Image.open('outputs/B1_uk_segformer.jpg')
Segformer, pre-trained on the ADE20K dataset
ADE20K: 150 classes
groups.csail.mit.edu/vision/data…
!python demo/image_demo.py \
    data/bedroom.jpg \
    configs/segformer/segformer_mit-b5_8xb2-160k_ade20k-640x640.py \
    https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b5_640x640_160k_ade20k/segformer_mit-b5_640x640_160k_ade20k_20210801_121243-41d2845b.pth \
    --out-file outputs/B1_Segformer_ade20k.jpg \
    --device cuda:0 \
    --opacity 0.5
# Image.open('outputs/B1_Segformer_ade20k.jpg')
If you get RuntimeError: CUDA out of memory., the GPU has run out of memory.
Fixes: switch to a GPU instance with more memory, use a model with a smaller memory footprint, shrink the input image, or restart the kernel.
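For the "shrink the input image" fix, the target size can be computed so the longer edge stays under a budget while keeping the aspect ratio. A sketch; `shrunk_size` is a hypothetical helper and 1024 is an arbitrary budget:

```python
def shrunk_size(w, h, max_side=1024):
    """Largest (w, h) with the same aspect ratio whose longer edge is <= max_side."""
    scale = min(1.0, max_side / max(w, h))
    return round(w * scale), round(h * scale)

# The street image used below is 2250x1500:
print(shrunk_size(2250, 1500))  # (1024, 683)
# Resize with e.g. cv2.resize(img_bgr, shrunk_size(2250, 1500)) before inference.
```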
2. Python API approach
Inference with a pre-trained semantic segmentation model on a single image, via the Python API.
Video walkthrough by 同济子豪兄: space.bilibili.com/1900783
Enter the mmsegmentation root directory
import os
os.chdir('mmsegmentation')
Import packages
import numpy as np
import cv2
from mmseg.apis import init_model, inference_model, show_result_pyplot
import mmcv
import matplotlib.pyplot as plt
%matplotlib inline
Load the model
# model config file
config_file = 'configs/segformer/segformer_mit-b5_8xb1-160k_cityscapes-1024x1024.py'
# model checkpoint weights file
checkpoint_file = 'https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b5_8x1_1024x1024_160k_cityscapes/segformer_mit-b5_8x1_1024x1024_160k_cityscapes_20211206_072934-87a052ec.pth'
Again, note the one-to-one config/checkpoint correspondence.
model = init_model(config_file, checkpoint_file, device='cuda:0')
Load the test image
img_path = 'data/street_uk.jpeg'
img_bgr = cv2.imread(img_path)
img_bgr.shape
# Output:
(1500, 2250, 3)
plt.imshow(img_bgr[:,:,::-1])
plt.show()
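cv2.imread returns channels in BGR order, so [:, :, ::-1] reverses the channel axis to RGB before Matplotlib draws it. A one-pixel check:

```python
import numpy as np

bgr = np.zeros((1, 1, 3), dtype=np.uint8)
bgr[0, 0] = (255, 0, 0)                  # pure blue in BGR order
rgb = bgr[:, :, ::-1]                    # reverse the last (channel) axis
print(tuple(int(v) for v in rgb[0, 0]))  # (0, 0, 255): still blue, now in RGB order
```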
Run semantic segmentation inference
result = inference_model(model, img_bgr)
result.keys()
# Output: ['seg_logits', 'pred_sem_seg']
Semantic segmentation result: qualitative class labels
# class ids 0-18, 19 classes in total
result.pred_sem_seg.data.shape
# Output: torch.Size([1, 1500, 2250])
np.unique(result.pred_sem_seg.data.cpu())
# Output: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 13, 15])
result.pred_sem_seg.data
# Output: tensor([[[10, 10, 10, ..., 10, 10, 10],
[10, 10, 10, ..., 10, 10, 10],
[10, 10, 10, ..., 10, 10, 10],
...,
[ 0, 0, 0, ..., 0, 0, 0],
[ 0, 0, 0, ..., 0, 0, 0],
[ 0, 0, 0, ..., 0, 0, 0]]], device='cuda:0')
pred_mask = result.pred_sem_seg.data[0].detach().cpu().numpy()
plt.imshow(pred_mask)
plt.show()
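Since pred_mask is a plain (H, W) integer array, per-class pixel counts come straight from np.unique. Sketch with a toy mask (the real one is 1500x2250):

```python
import numpy as np

toy_mask = np.array([[0, 0, 10],
                     [13, 10, 10]])
ids, counts = np.unique(toy_mask, return_counts=True)
print(dict(zip(ids.tolist(), counts.tolist())))  # {0: 2, 10: 3, 13: 1}
```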
Semantic segmentation result: quantitative confidence
# confidence scores (logits)
result.seg_logits.data.shape
# Output: torch.Size([19, 1500, 2250])
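pred_sem_seg is just the argmax of seg_logits over the class axis, and a softmax over that same axis turns the logits into per-pixel confidences. Toy demonstration with 3 classes on a 2x2 image (the real shape is [19, 1500, 2250]), in NumPy rather than PyTorch:

```python
import numpy as np

logits = np.array([[[2.0, 0.1], [0.0, 0.0]],   # class 0
                   [[1.0, 3.0], [5.0, 0.5]],   # class 1
                   [[0.0, 1.0], [1.0, 2.0]]])  # class 2
pred = logits.argmax(axis=0)       # per-pixel class id, shape (2, 2)
print(pred.tolist())               # [[0, 1], [1, 2]]
# softmax over the class axis gives the confidence of the chosen class per pixel
e = np.exp(logits - logits.max(axis=0, keepdims=True))
confidence = (e / e.sum(axis=0)).max(axis=0)
```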
Visualize the prediction: method 1
plt.figure(figsize=(14, 8))
plt.imshow(img_bgr[:,:,::-1])
plt.imshow(pred_mask, alpha=0.55) # alpha: overlay opacity; smaller is closer to the original image
plt.axis('off')
plt.savefig('outputs/B2-1.jpg')
plt.show()
Visualize the prediction: method 2 (side by side with the original image)
plt.figure(figsize=(14, 8))
plt.subplot(1,2,1)
plt.imshow(img_bgr[:,:,::-1])
plt.axis('off')
plt.subplot(1,2,2)
plt.imshow(img_bgr[:,:,::-1])
plt.imshow(pred_mask, alpha=0.6) # alpha: overlay opacity; smaller is closer to the original image
plt.axis('off')
plt.savefig('outputs/B2-2.jpg')
plt.show()
Visualize the prediction: method 3
Uses the color scheme defined in mmseg/datasets/cityscapes.py, via the library's own API.
from mmseg.apis import show_result_pyplot
img_viz = show_result_pyplot(model, img_path, result, opacity=0.8, title='MMSeg', out_file='outputs/B2-3.jpg')
# Output: /environment/miniconda3/lib/python3.7/site-packages/mmengine/visualization/visualizer.py:196: UserWarning: Failed to add <class 'mmengine.visualization.vis_backend.LocalVisBackend'>, please provide the `save_dir` argument.
warnings.warn(f'Failed to add {vis_backend.__class__}, '
opacity controls the overlay opacity; smaller is closer to the original image.
img_viz.shape
# Output: (1500, 2250, 3)
plt.figure(figsize=(14, 8))
plt.imshow(img_viz)
plt.show()
Visualize the prediction: method 4 (with a legend)
from mmseg.datasets import cityscapes
import numpy as np
import mmcv
from PIL import Image
# get the class names and color palette
classes = cityscapes.CityscapesDataset.METAINFO['classes']
palette = cityscapes.CityscapesDataset.METAINFO['palette']
opacity = 0.15 # blend weight of the original image; larger is closer to the original
# colorize the segmentation map with the palette
# seg_map = result[0].astype('uint8')
seg_map = pred_mask.astype('uint8')
seg_img = Image.fromarray(seg_map).convert('P')
seg_img.putpalette(np.array(palette, dtype=np.uint8))
from matplotlib import pyplot as plt
import matplotlib.patches as mpatches
plt.figure(figsize=(14, 8))
im = plt.imshow(((np.array(seg_img.convert('RGB')))*(1-opacity) + mmcv.imread(img_path)*opacity) / 255)
# create one legend patch per color
patches = [mpatches.Patch(color=np.array(palette[i])/255., label=classes[i]) for i in range(len(classes))]
plt.legend(handles=patches, bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0., fontsize='large')
plt.savefig('outputs/B2-4.jpg')
plt.show()
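The PIL 'P'-mode trick above can also be done with NumPy fancy indexing: indexing a (num_classes, 3) palette array with an integer mask yields an (H, W, 3) color image in one step. Sketch with the first three Cityscapes colors:

```python
import numpy as np

palette = np.array([[128, 64, 128],   # road
                    [244, 35, 232],   # sidewalk
                    [70, 70, 70]],    # building
                   dtype=np.uint8)
mask = np.array([[0, 1],
                 [2, 0]])
color_mask = palette[mask]            # fancy indexing: (H, W) -> (H, W, 3)
print(color_mask.shape)               # (2, 2, 3)
```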