1. Parameter settings
1.1 Create a new model
dnf.yml
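The contents of the dataset config are not shown here. As a minimal sketch, assuming it follows the same layout as the repo's data/coco.yaml (train, val, nc, names), the text could be generated from the class list in section 2 so that nc always equals the number of names. The train and val paths below are hypothetical placeholders, and `build_dataset_yaml` is an illustrative helper, not part of the repo.

```python
# Class names copied from the list in section 2 (list index = class id).
CLASS_NAMES = [
    "qigong", "door", "monster1", "monster2", "monster3", "monster4",
    "boss", "material", "money", "box", "option", "ban",
    "reddoor", "jianying", "renying",
]


def build_dataset_yaml(names, train_dir, val_dir):
    """Return dataset-config text with nc derived from the names list."""
    quoted = ", ".join(f"'{n}'" for n in names)
    return (
        f"train: {train_dir}\n"
        f"val: {val_dir}\n"
        f"nc: {len(names)}  # number of classes\n"
        f"names: [{quoted}]\n"
    )


if __name__ == "__main__":
    # Hypothetical image folders; point these at your own dataset.
    print(build_dataset_yaml(CLASS_NAMES, "./data/images/train", "./data/images/valid"))
```

Deriving nc from the list avoids a mismatch between the class count and the names, which the training script is likely to reject.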
1.2 Modify the training model
yolov7.yaml
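If depth_multiple and width_multiple in the parameters below are set to 1.0, training fails with RuntimeError: CUDA out of memory. If the two files train.cache and valid.cache have already been generated, delete them. In my case they were generated in the labels folder.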
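The cache cleanup can be sketched as follows. This is a minimal example, assuming the cache files end in .cache and sit somewhere under the dataset directory; `remove_label_caches` is an illustrative helper, and it only lists files unless dry_run is turned off.

```python
from pathlib import Path


def remove_label_caches(root, dry_run=True):
    """Find *.cache files under root; delete them unless dry_run is True."""
    root = Path(root)
    if not root.is_dir():
        return []
    found = sorted(root.rglob("*.cache"))
    for path in found:
        if not dry_run:
            path.unlink()
    return found


if __name__ == "__main__":
    # "data" is the dataset directory used in this post; adjust as needed.
    for cache_file in remove_label_caches("data", dry_run=True):
        print("stale cache:", cache_file)
```

Run it once with dry_run=True to check what would be removed, then again with dry_run=False.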
# parameters
nc: 12  # number of classes
# depth_multiple: 1.0  # model depth multiple
# width_multiple: 1.0  # layer channel multiple
depth_multiple: 0.67  # model depth multiple
width_multiple: 0.75  # layer channel multiple
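To get a feel for why smaller multipliers ease memory pressure, here is a rough sketch. It assumes the model parser scales layer repeats and output channels in the YOLOv5 style: repeats are rounded with a floor of 1, and channels are rounded up to a multiple of 8. The helper names are mine, not the repo's.

```python
import math


def scale_depth(repeats, depth_multiple):
    """Scale a layer's repeat count; layers with a single repeat are unchanged."""
    return max(round(repeats * depth_multiple), 1) if repeats > 1 else repeats


def scale_width(channels, width_multiple, divisor=8):
    """Scale output channels and round up to a multiple of divisor."""
    return math.ceil(channels * width_multiple / divisor) * divisor


if __name__ == "__main__":
    for channels in (64, 128, 256, 512, 1024):
        print(channels, "->", scale_width(channels, 0.75))
    print("repeats 3 ->", scale_depth(3, 0.67))
```

Because a convolution's weight count grows with input channels times output channels, a width multiple of 0.75 shrinks those layers to a little over half their original size, which is roughly why the reduced setting fits in GPU memory when 1.0 does not.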
2. Create labels
I put the label data under the data directory.
- images
- labels
0 qigong
1 door
2 monster1
3 monster2
4 monster3
5 monster4
6 boss
7 material
8 money
9 box
10 option
11 ban
12 reddoor
13 jianying
14 renying
- train.py
- yolo7.py
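No label file is shown in this post. As a minimal sketch, assuming the usual YOLO text format (one line per box: class id, then x_center, y_center, width and height, all normalized to the image size), a pixel-space box can be converted like this. The box coordinates and image size in the example are made up.

```python
def to_yolo_line(class_id, box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # Hypothetical example: a "door" (class 1) box in a 1280x720 screenshot.
    print(to_yolo_line(1, (320, 180, 640, 540), 1280, 720))
```

Each image in images would then have a .txt file of the same base name in labels, holding one such line per object.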
3. Training
python train.py --workers 8 --device 0 --batch-size 32 --data data/coco.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7 --hyp data/hyp.scratch.p5.yaml
python train.py --workers 8 --device 0 --batch-size 32 --data data/coco.yaml --img 640 640 --cfg cfg/training/yolov7x.yaml --weights '' --name yolov7x --hyp data/hyp.scratch.p5.yaml
python train.py --epoch 10 --device 0 --batch-size 32 --data data/dnf.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7dnf --hyp data/hyp.scratch.p5.yaml
# Train 200 epochs
python train.py --epoch 40 --device 0 --data data/dnf.yaml --cfg cfg/training/yolov7.yaml --weights ''
python train.py --epoch 40 --device 0 --data data/dnf.yaml --cfg cfg/training/yolov7.yaml --weights '' --img 1280 1280
python train.py --epoch 2 --device 0 --data data/dnf.yaml --cfg cfg/training/yolov7e6.yaml --weights ''
Resume training
# Train 100 epochs
python train.py --epoch 10 --device 0 --data data/dnf.yaml --cfg cfg/training/yolov7.yaml --weights ./runs/train/exp2/weights/best.pt
python train.py --data ./data/dnf.yaml --cfg ./models/yolov5s.yaml --weights '' --epoch 100 --device 0
python train.py --data ./data/dnf.yaml --cfg ./models/yolov5m.yaml --weights '' --epoch 300 --device 0
python train.py --data ./data/dnf.yaml --cfg ./models/yolov5m.yaml --weights ./runs/train/exp6/weights/best.pt --epoch 300 --device 0
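The resume commands above hard-code a run folder such as exp2 or exp6. As a small sketch, assuming each run writes its weights to runs/train/<run name>/weights/best.pt as in those commands, the most recent checkpoint can be looked up instead of typed by hand. `latest_best_weights` is an illustrative helper, not part of either repo.

```python
from pathlib import Path


def latest_best_weights(runs_dir="runs/train"):
    """Return the most recently modified best.pt under runs_dir, or None."""
    runs_dir = Path(runs_dir)
    if not runs_dir.is_dir():
        return None
    candidates = list(runs_dir.glob("*/weights/best.pt"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    weights = latest_best_weights()
    print(weights if weights else "no best.pt found under runs/train")
```

The printed path can be passed to --weights when continuing training.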
Error
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
Solution
The fix most commonly suggested online is:
import os
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
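However, this treats the symptom rather than the cause, and the same problem may show up again in the next project.
The problem arises mainly because the torch package ships a file named libiomp5md.dll, which conflicts with the file of the same name in the Anaconda environment, so one of the two has to be deleted.
Search the Anaconda folder for libiomp5md.dll and delete the copy that belongs to the Anaconda package (the one selected in the screenshot).
If you are working in a specific Python environment, however, delete the corresponding file in that environment instead.
That is:
- If you are in Anaconda's base environment: delete ..\Anaconda3\Library\bin\libiomp5md.dll
- If you are in a named env (for example, one called work): delete ..\Anaconda3\envs\work\Library\bin\libiomp5md.dll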
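Before deleting anything, it helps to see every copy of the DLL that the active environment contains. The following is a minimal sketch, assuming it is run with the environment's own Python so that sys.prefix points at the base install or at the named env. `find_openmp_copies` is an illustrative helper and it only lists files.

```python
import sys
from pathlib import Path


def find_openmp_copies(prefix, name="libiomp5md.dll"):
    """Return every file called `name` found below `prefix`."""
    prefix = Path(prefix)
    if not prefix.is_dir():
        return []
    return sorted(p for p in prefix.rglob(name) if p.is_file())


if __name__ == "__main__":
    # sys.prefix is the root of the active environment (base or a named env).
    for path in find_openmp_copies(sys.prefix):
        print(path)
```

If two copies show up, for example one under Library\bin and one inside the torch package, renaming the Library\bin copy (say, to libiomp5md.dll.bak) instead of deleting it keeps the change easy to undo.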