This document is based on openvino_2022.2.
I. Introduction
The OpenVINO™ toolkit is a comprehensive toolkit for quickly developing applications and solutions that solve a variety of tasks, including emulation of human vision, automatic speech recognition, natural language processing, recommendation systems, and more.
It was released in 2018 and is open source and free for commercial use.
1. OpenVINO™ Toolkit
- Enables CNN-based deep learning inference at the edge
- Supports heterogeneous execution across Intel® CPUs, Intel® Integrated Graphics, Intel® Neural Compute Stick 2, and Intel® Vision Accelerator Design with Intel® Movidius™ VPUs
- Speeds time-to-market via an easy-to-use library of computer vision functions and pre-optimized kernels
- Includes optimized calls for computer vision standards, including OpenCV* and OpenCL™
2. OpenVINO™ Toolkit Components
- Model Optimizer
  - A cross-platform command-line tool that imports models from mainstream deep learning frameworks; model files may come from TensorFlow, PyTorch, Caffe, MXNet, ONNX, and other deep learning frameworks and tools. The Model Optimizer converts and optimizes the imported model and exports it as intermediate representation (IR) files.
- Inference Engine
  - A unified set of C++/Python APIs that allows high-performance inference on many hardware types, including Intel® CPU, Intel® Integrated Graphics, Intel® Neural Compute Stick 2, and Intel® Vision Accelerator Design with Intel® Movidius™ Vision Processing Units (VPUs). A minimal Python sketch follows this component list.
  - Inference Engine samples: a set of simple console applications that demonstrate how to use the Inference Engine in your own applications.
- Deep Learning Workbench (DL Workbench)
  - A web-based graphical environment and the official OpenVINO™ GUI, designed to make it easier to produce pretrained deep learning computer vision and natural language processing models.
  - Post-training Optimization Tool: a tool for calibrating a model and then executing it with INT8 precision.
  - Additional tools: a set of tools for working with models, including the Benchmark App, Cross Check Tool, and Compile Tool.
- Open Model Zoo
  - Includes deep learning solutions for a variety of vision problems, including object recognition, face recognition, pose estimation, text detection, and action recognition.
  - Additional tools: a set of tools for working with models, including the Accuracy Checker utility and the Model Downloader.
  - Pretrained model documentation: documentation for the pretrained models provided in the Open Model Zoo GitHub repository.
  - TensorFlow pretrained model library
- Deep Learning Streamer (DL Streamer): a streaming analytics framework based on GStreamer for building graphs of media analytics components. DL Streamer can be installed with the Intel® Distribution of OpenVINO™ toolkit installer.
- OpenCV: a community version of OpenCV compiled for Intel® hardware.
- Intel® Media SDK: (only in the Intel® Distribution of OpenVINO™ toolkit for Linux)
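The Inference Engine component above is what an application actually calls at runtime. As a rough, hedged illustration only, a minimal Python sketch of that API is shown below; the file name model.xml and the zero-filled input are placeholders, not part of the toolkit.
# Minimal sketch of the Python runtime API ("model.xml" and the dummy input are placeholders).
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")               # hypothetical IR file
compiled_model = core.compile_model(model, "CPU")  # or "GPU", "AUTO", ...
input_layer = compiled_model.input(0)
dummy = np.zeros(list(input_layer.shape), dtype=np.float32)
result = compiled_model([dummy])[compiled_model.output(0)]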
3. OpenVINO™ Toolkit Workflow
- Supported deployment devices
  - Intel® CPU (e.g. Intel® Core™ i7-1165G7)
  - dGPU (e.g. Intel® Iris® Xe MAX): discrete graphics
  - iGPU (e.g. Intel® UHD Graphics 620 (iGPU)): integrated graphics
  - Intel® Movidius™ Myriad™ X VPU (e.g. Intel® Neural Compute Stick 2 (Intel® NCS2))
  - GNA (Gaussian & Neural Accelerator, integrated into the processor): designed for AI speech and audio applications such as neural noise suppression.
II. Installing OpenVINO Components
1. Environment Requirements
- Operating system
- Ubuntu 18.04 long-term support (LTS), 64-bit
- Ubuntu 20.04 long-term support (LTS), 64-bit
- Hardware
- 6th to 12th generation Intel® Core™ processors and Intel® Xeon® processors
- 3rd generation Intel® Xeon® Scalable processor (formerly code named Cooper Lake)
- Intel® Xeon® Scalable processor (formerly Skylake and Cascade Lake)
- Intel Atom® processor with support for Intel® Streaming SIMD Extensions 4.1 (Intel® SSE4.1)
- Intel Pentium® processor N4200/5, N3350/5, or N3450/5 with Intel® HD Graphics
- Intel® Iris® Xe MAX Graphics
- Intel® Neural Compute Stick 2
- Intel® Vision Accelerator Design with Intel® Movidius™ VPUs
2. Download and Installation
Go to the Intel® Distribution of OpenVINO™ Toolkit download page and choose either OpenVINO Development Tools or OpenVINO Runtime.
1) Installing OpenVINO Development Tools with pip
Installing OpenVINO Development Tools also installs OpenVINO Runtime.
# Step 1: Create and activate virtual environment
python3 -m venv openvino_env
source openvino_env/bin/activate
# Step 2: Upgrade pip to latest version
python -m pip install --upgrade pip
# Step 3: Download and install the package
pip install openvino-dev[ONNX,tensorflow2,mxnet,kaldi,caffe,pytorch]==2022.2.0
# An openvino_env folder appears in the current directory
$ tree openvino_env/ -L 2
openvino_env/
├── bin
│ ├── accuracy_check
│ ├── activate
│ ├── activate.csh
│ ├── activate.fish
│ ├── Activate.ps1
│ ├── backend-test-tools
│ ├── benchmark_app # benchmark a model
│ ├── check-model
│ ├── check-node
│ ├── convert_annotation
│ ├── convert-caffe2-to-onnx
│ ├── convert-onnx-to-caffe2
│ ├── cpuinfo
│ ├── easy_install
│ ├── easy_install-3.8
│ ├── estimator_ckpt_converter
│ ├── f2py
│ ├── f2py3
│ ├── f2py3.8
│ ├── google-oauthlib-tool
│ ├── huggingface-cli
│ ├── imagecodecs
│ ├── imageio_download_bin
│ ├── imageio_remove_bin
│ ├── import_pb_to_tensorboard
│ ├── lsm2bin
│ ├── markdown_py
│ ├── mo # Model Optimizer
│ ├── nib-conform
│ ├── nib-convert
│ ├── nib-dicomfs
│ ├── nib-diff
│ ├── nib-ls
│ ├── nib-nifti-dx
│ ├── nib-roi
│ ├── nib-stats
│ ├── nib-tck2trk
│ ├── nib-trk2tck
│ ├── nltk
│ ├── normalizer
│ ├── omz_converter # Open Model Zoo tool: convert pretrained models to IR files
│ ├── omz_data_downloader # Open Model Zoo tool: download datasets
│ ├── omz_downloader # Open Model Zoo tool: download pretrained models
│ ├── omz_info_dumper
│ ├── omz_quantizer
│ ├── opt_in_out
│ ├── parrec2nii
│ ├── pip
│ ├── pip3
│ ├── pip3.10
│ ├── pip3.8
│ ├── pot # Post-training Optimization Tool
│ ├── pydicom
│ ├── pyrsa-decrypt
│ ├── pyrsa-encrypt
│ ├── pyrsa-keygen
│ ├── pyrsa-priv2pub
│ ├── pyrsa-sign
│ ├── pyrsa-verify
│ ├── python -> python3
│ ├── python3 -> /usr/bin/python3
│ ├── saved_model_cli
│ ├── skivi
│ ├── tensorboard
│ ├── tflite_convert
│ ├── tf_upgrade_v2
│ ├── tiff2fsspec
│ ├── tiffcomment
│ ├── tifffile
│ ├── toco
│ ├── toco_from_protos
│ ├── tqdm
│ ├── transformers-cli
│ └── wheel
├── include
├── lib
│ └── python3.8
├── lib64 -> lib
├── pyvenv.cfg
└── share
├── doc
└── python-wheels
├── appdirs-1.4.3-py2.py3-none-any.whl
├── CacheControl-0.12.6-py2.py3-none-any.whl
├── certifi-2019.11.28-py2.py3-none-any.whl
├── chardet-3.0.4-py2.py3-none-any.whl
├── colorama-0.4.3-py2.py3-none-any.whl
├── contextlib2-0.6.0-py2.py3-none-any.whl
├── distlib-0.3.0-py2.py3-none-any.whl
├── distro-1.4.0-py2.py3-none-any.whl
├── html5lib-1.0.1-py2.py3-none-any.whl
├── idna-2.8-py2.py3-none-any.whl
├── ipaddr-2.2.0-py2.py3-none-any.whl
├── lockfile-0.12.2-py2.py3-none-any.whl
├── msgpack-0.6.2-py2.py3-none-any.whl
├── packaging-20.3-py2.py3-none-any.whl
├── pep517-0.8.2-py2.py3-none-any.whl
├── pip-20.0.2-py2.py3-none-any.whl
├── pkg_resources-0.0.0-py2.py3-none-any.whl
├── progress-1.5-py2.py3-none-any.whl
├── pyparsing-2.4.6-py2.py3-none-any.whl
├── requests-2.22.0-py2.py3-none-any.whl
├── retrying-1.3.3-py2.py3-none-any.whl
├── setuptools-44.0.0-py2.py3-none-any.whl
├── six-1.14.0-py2.py3-none-any.whl
├── toml-0.10.0-py2.py3-none-any.whl
├── urllib3-1.25.8-py2.py3-none-any.whl
├── webencodings-0.5.1-py2.py3-none-any.whl
└── wheel-0.34.2-py2.py3-none-any.whl
$ pip list
Package Version
---------------------------- -----------
...
google-auth 2.13.0
google-auth-oauthlib 0.4.6
google-pasta 0.2.0
...
keras 2.9.0
...
mxnet 1.7.0.post2
...
numpy 1.23.1
onnx 1.11.0
opencv-python 4.6.0.66
openvino 2022.2.0
openvino-dev 2022.2.0
openvino-telemetry 2022.1.1
...
scikit-image 0.19.3
scikit-learn 0.24.2
scipy 1.5.4
...
tensorboard 2.9.1
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorflow 2.9.1
tensorflow-estimator 2.9.0
tensorflow-io-gcs-filesystem 0.27.0
...
torch 1.8.1
torchvision 0.9.1
tqdm 4.64.1
transformers 4.23.1
...
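After the pip installation above, a quick sanity check from inside the virtual environment can confirm that the runtime imports and sees at least the CPU device. This is only a suggested check, not an official verification step.
# Quick post-install check (run inside openvino_env): import the runtime and list devices.
from openvino.runtime import Core
print(Core().available_devices)   # expected to include at least 'CPU'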
2) Installing OpenVINO Development Tools with Docker
Pull the image from Docker Hub.
# Intel CPU
docker run -it --rm openvino/ubuntu18_dev
# Intel GPU
docker run -it --rm --device /dev/dri openvino/ubuntu18_dev
# NCS2 (single VPU)
docker run -it --rm --device-cgroup-rule='c 189:* rmw' -v /dev/bus/usb:/dev/bus/usb openvino/ubuntu18_dev
# HDDL (multiple VPUs)
docker run -it --rm --device=/dev/ion:/dev/ion -v /var/tmp:/var/tmp openvino/ubuntu18_dev
Container notes
# The container starts in its working directory, e.g. /opt/intel/openvino_2022.2.0.7713
$ tree -L 2
.
|-- docs
| |-- OpenVINO-GetStarted-online.html
| |-- OpenVINO-Install-Linux-online.html
| |-- OpenVINO-OpenVX-documentation.html
| |-- OpenVINO-documentation-online.html
| |-- licensing
|-- extras
| |-- opencv
|-- install_dependencies
| |-- 97-myriad-usbboot.rules
| |-- install_NCS_udev_rules.sh
| |-- install_NEO_OCL_driver.sh
| |-- install_openvino_dependencies.sh
|-- licensing
| |-- DockerImage_readme.txt
| |-- third-party-programs-docker-dev.txt
| |-- third-party-programs-docker-runtime.txt
|-- python
| |-- python3.6
| |-- python3.7
| |-- python3.8
| |-- python3.9
|-- runtime
| |-- 3rdparty
| |-- cmake
| |-- include
| |-- lib
| |-- version.txt
|-- samples
| |-- c
| | |-- CMakeLists.txt
| | |-- build_samples.sh
| | |-- common
| | |-- hello_classification
| | |-- hello_nv12_input_classification
| |-- cpp
| | |-- CMakeLists.txt
| | |-- benchmark_app
| | |-- build
| | |-- build_samples.sh
| | |-- classification_sample_async
| | |-- common
| | |-- hello_classification
| | |-- hello_nv12_input_classification
| | |-- hello_query_device
| | |-- hello_reshape_ssd
| | |-- model_creation_sample
| | |-- samples_bin
| | |-- speech_sample
| | |-- thirdparty
| |-- python
| |-- classification_sample_async
| |-- hello_classification
| |-- hello_query_device
| |-- hello_reshape_ssd
| |-- model_creation_sample
| |-- requirements.txt
| |-- setup.cfg
| |-- speech_sample
|-- setupvars.sh
|-- tools
|-- cl_compiler
|-- compile_tool
|-- deployment_manager
|-- requirements.txt
|-- requirements_caffe.txt
|-- requirements_kaldi.txt
|-- requirements_mxnet.txt
|-- requirements_onnx.txt
|-- requirements_pytorch.txt
|-- requirements_tensorflow.txt
|-- requirements_tensorflow2.txt
$ ls /usr/local/bin/omz*
/usr/local/bin/omz_converter /usr/local/bin/omz_downloader /usr/local/bin/omz_quantizer
/usr/local/bin/omz_data_downloader /usr/local/bin/omz_info_dumper
OpenVINO™ toolkit component comparison
2021 | 2022 |
---|---|
Inference Engine Runtime | Evolved into OpenVINO™ Runtime |
Samples | Retained but streamlined: samples duplicated by OMZ demos were removed, keeping only the samples that illustrate API usage |
Dev tools, including MO, POT, DLWB, and the download/conversion tools from OMZ [Note 2] | No longer included by default; installed separately via pip |
Non-dev tools, including deployment manager, compile_tool, etc. | Retained |
OpenCV | No longer included by default; downloaded and installed via a separately provided script |
DL Workbench download/installation script | Removed from the installer package; installed separately via pip |
DL Streamer | Removed from the installer package; installed separately via APT |
Media SDK | Media SDK evolved into oneVPL [Note 3] and was removed from the installer package |
Demo applications (from OMZ) | Removed from the installer package |
3) Installing DL Workbench with Docker
https://docs.openvino.ai/latest/workbench_docs_Workbench_DG_Run_Locally.html#windows
# Manage Docker as a non-root user
$ sudo groupadd docker
$ sudo usermod -aG docker $USER
$ newgrp docker # activate the changes to groups
$ docker ps
$ docker pull openvino/workbench:latest
$ docker run -p 0.0.0.0:5665:5665 --name workbench -it --rm openvino/workbench:latest
waiting for server to start..... done
server started
waiting for server to shut down..... done
server stopped
[Workbench] PostgreSQL init process complete.
[Workbench] PostgreSQL applying migrations...
waiting for server to start..... done
server started
Open a browser and go to http://127.0.0.1:5665.
III. Using OpenVINO Components
1. Using the openvino-dev container
Based on OpenVINO Development Tools.
1) Basic container usage
# Initialize the OpenVINO environment variables
$ source /opt/intel/openvino/setupvars.sh
# Initialize the OpenVINO OpenCV environment variables; otherwise video streams cannot be read
$ source /opt/intel/openvino/extras/opencv/setupvars.sh
# Query device information
$ cd /opt/intel/openvino_2022.2.0.7713/samples/python/hello_query_device
$ python3 hello_query_device.py
[ INFO ] Available devices:
[ INFO ] CPU :
[ INFO ] SUPPORTED_PROPERTIES:
[ INFO ] AVAILABLE_DEVICES:
[ INFO ] RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 1, 1
[ INFO ] RANGE_FOR_STREAMS: 1, 8
[ INFO ] FULL_DEVICE_NAME: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
[ INFO ] OPTIMIZATION_CAPABILITIES: WINOGRAD, FP32, FP16, INT8, BIN, EXPORT_IMPORT
[ INFO ] CACHE_DIR:
[ INFO ] NUM_STREAMS: 1
[ INFO ] AFFINITY: Affinity.CORE
[ INFO ] INFERENCE_NUM_THREADS: 0
[ INFO ] PERF_COUNT: False
[ INFO ] INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ] PERFORMANCE_HINT: PerformanceMode.UNDEFINED
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] GPU :
[ INFO ] SUPPORTED_PROPERTIES:
[ INFO ] AVAILABLE_DEVICES: 0
[ INFO ] RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 2, 1
[ INFO ] RANGE_FOR_STREAMS: 1, 2
[ INFO ] OPTIMAL_BATCH_SIZE: 1
[ INFO ] MAX_BATCH_SIZE: 1
[ INFO ] FULL_DEVICE_NAME: Intel(R) Iris(R) Xe Graphics [0x9a49] (iGPU)
[ INFO ] DEVICE_UUID: UNSUPPORTED TYPE
[ INFO ] DEVICE_TYPE: Type.INTEGRATED
[ INFO ] DEVICE_GOPS: UNSUPPORTED TYPE
[ INFO ] OPTIMIZATION_CAPABILITIES: FP32, BIN, FP16, INT8
[ INFO ] GPU_DEVICE_TOTAL_MEM_SIZE: UNSUPPORTED TYPE
[ INFO ] GPU_UARCH_VERSION: 12.0.0
[ INFO ] GPU_EXECUTION_UNITS_COUNT: 96
[ INFO ] GPU_MEMORY_STATISTICS: UNSUPPORTED TYPE
[ INFO ] PERF_COUNT: False
[ INFO ] MODEL_PRIORITY: Priority.MEDIUM
[ INFO ] GPU_HOST_TASK_PRIORITY: Priority.MEDIUM
[ INFO ] GPU_QUEUE_PRIORITY: Priority.MEDIUM
[ INFO ] GPU_QUEUE_THROTTLE: Priority.MEDIUM
[ INFO ] GPU_ENABLE_LOOP_UNROLLING: True
[ INFO ] CACHE_DIR:
[ INFO ] PERFORMANCE_HINT: PerformanceMode.UNDEFINED
[ INFO ] COMPILATION_NUM_THREADS: 8
[ INFO ] NUM_STREAMS: 1
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] DEVICE_ID: 0
2) Running the OpenVINO samples
Run the Python samples directly:
# CPU
docker run -it --rm <image_name>
/bin/bash -c "cd ~ && omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && python3 /opt/intel/openvino/samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp CPU"
# GPU
docker run -itu root:root --rm --device /dev/dri:/dev/dri <image_name>
/bin/bash -c "omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && python3 samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp GPU"
# MYRIAD
docker run -itu root:root --rm --device-cgroup-rule='c 189:\* rmw' -v /dev/bus/usb:/dev/bus/usb <image_name>
/bin/bash -c "omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && python3 samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp MYRIAD"
# HDDL
docker run -itu root:root --rm --device=/dev/ion:/dev/ion -v /var/tmp:/var/tmp -v /dev/shm:/dev/shm <image_name>
/bin/bash -c "omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && umask 000 && python3 samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp HDDL"
Build and run the C++ samples:
# Inside the container
$ cd /opt/intel/openvino_2022.2.0.7713/samples/cpp
# Build the samples
$ ./build_samples.sh
$ tree samples_bin/
samples_bin/
|-- benchmark_app
|-- classification_sample_async
|-- hello_classification
|-- hello_nv12_input_classification
|-- hello_query_device
|-- hello_reshape_ssd
|-- model_creation_sample
|-- speech_sample
3) Command-line usage
# List the available pretrained models
$ omz_downloader --print_all
Sphereface
aclnet
aclnet-int8
action-recognition-0001
age-gender-recognition-retail-0013
alexnet
......
yolo-v3-onnx
yolo-v3-tf
yolo-v3-tiny-onnx
yolo-v3-tiny-tf
yolo-v4-tf
yolo-v4-tiny-tf
yolof
yolox-tiny
# Test running a model with OpenVINO
$ cd /opt/intel/openvino_2022.2.0.7713/samples/python/hello_classification/
# 1. Download a pretrained model
$ omz_downloader --name alexnet
$ tree public/alexnet/
|-- alexnet.caffemodel
|-- alexnet.prototxt
|-- alexnet.prototxt.orig
# 2. Convert the model
$ omz_converter --name alexnet
$ tree public/alexnet/
|-- FP16
| |-- alexnet.bin
| |-- alexnet.mapping
| -- alexnet.xml
|-- FP32
| |-- alexnet.bin
| |-- alexnet.mapping
| -- alexnet.xml
|-- alexnet.caffemodel
|-- alexnet.prototxt
|-- alexnet.prototxt.orig
# 3. Run the model
$ curl -O https://storage.openvinotoolkit.org/data/test_data/images/banana.jpg
$ python3 hello_classification.py public/alexnet/FP16/alexnet.xml banana.jpg CPU/GPU/AUTO
# 4. Benchmark the model
$ benchmark_app -m public/alexnet/FP16/alexnet.xml -i banana.jpg -d CPU/GPU -niter 128 -api sync/async
Latency:
Median: 34.38 ms
AVG: 34.60 ms
MIN: 19.57 ms
MAX: 69.10 ms
Throughput: 115.30 FPS
# Container resource usage
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
f8e127db77d8 openvino-ubuntu18_dev 786.64% 1.324GiB / 7.383GiB 17.93% 0B / 0B 483MB / 74.4MB 19
# Check Intel GPU utilization
sudo apt-get install -y intel-gpu-tools
sudo intel_gpu_top
intel-gpu-top - 0/ 0 MHz; 100% RC6; ----- (null); 0 irqs/s
IMC reads: ------ (null)/s
IMC writes: ------ (null)/s
ENGINE BUSY MI_SEMA MI_WAIT
Render/3D/0 99.65% | | 0% 0%
Blitter/0 0.00% | | 0% 0%
Video/0 0.00% | | 0% 0%
Video/1 0.00% | | 0% 0%
VideoEnhance/0 0.00% | | 0% 0%
2. Using the openvino_notebooks samples
https://github.com/openvinotoolkit/openvino_notebooks/blob/main/README_cn.md
1) Installing Jupyter in the container
Reference: remote server (Ubuntu 20.04) + using Jupyter remotely inside a Docker container.
Install Jupyter and start jupyter-notebook inside the openvino-dev container environment described above.
apt-get update
apt-get install vim
pip install jupyter
# Generate the Jupyter Notebook configuration file
jupyter-notebook --generate-config
# Edit the configuration file
vim ~/.jupyter/jupyter_notebook_config.py
# Allow access via any IP bound to the server
c.NotebookApp.ip = '*'
# Port used for access
c.NotebookApp.port = 8888 # must match the container port exposed earlier
# Do not open a browser automatically
c.NotebookApp.open_browser = False
# Allow remote access
c.NotebookApp.allow_remote_access = True
# Start Jupyter
$ jupyter notebook -ip 0.0.0.0 --allow-root --port 8888 --no-browser
......
http://127.0.0.1:8888/?token=xxx
Open the notebook in a browser and log in with the token, e.g. http://192.168.1.10:8888/
2) Downloading and using the openvino_notebooks project samples
apt-get install git
cd ~; git clone https://github.com/openvinotoolkit/openvino_notebooks
Open a sample .ipynb file in the browser notebook, e.g. openvino_notebooks/notebooks/001-hello-world/001-hello-world.ipynb.
IV. Model Processing
OpenVINO™ supports multiple model formats and allows converting them into its own OpenVINO IR.
1. OpenVINO model processing tools
https://docs.openvino.ai/latest/omz_tools_downloader.html
- mo: the Model Optimizer converts pretrained deep learning models (TensorFlow, PyTorch, PaddlePaddle, MXNet, Caffe, Kaldi, or ONNX) into the OpenVINO Intermediate Representation (IR) format.
  - .xml - describes the entire model topology: each layer, its connectivity, and its parameter values.
  - .bin - contains the trained weights and biases of each layer.
  - Features:
    - Convert
    - Optimize
    - Convert weights and biases
- pot: the Post-training Optimization Tool quantizes weights and activations from floating-point precision to integer precision (for example, 8-bit) for inference.
  - Different hardware platforms support different integer precisions and quantization parameters; POT abstracts this complexity by introducing the concept of a "target device".
  - Quantization requires an unannotated dataset.
- Open Model Zoo tools: one-step processing for Open Model Zoo models.
  - omz_downloader: downloads model files from online sources.
  - omz_converter: converts models in other formats into the IR format.
  - omz_quantizer: quantizes full-precision IR models into low-precision versions.
  - omz_info_dumper: prints information about the models in a stable, machine-readable format.
  - omz_data_downloader: copies dataset data from the installation location.
- benchmark_app: the Benchmark C++ Tool evaluates deep learning inference performance on supported devices.
2. Converting various model formats
OpenVINO IR (Intermediate Representation): OpenVINO™'s own format.
ONNX, PaddlePaddle: directly supported formats; OpenVINO provides C++ and Python APIs to import them into OpenVINO Runtime directly, without any prior conversion.
TensorFlow, PyTorch, MXNet, Caffe, Kaldi: indirectly supported formats that must first be converted to one of the formats listed above. Use the Model Optimizer to convert these formats to OpenVINO IR; in some cases another converter is needed as an intermediary.
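As a quick illustration of the "no prior conversion" path for the directly supported formats, the following minimal Python sketch reads an ONNX file straight into the runtime; the file name model.onnx is a placeholder.
# Sketch: ONNX (and PaddlePaddle) models can be read by OpenVINO Runtime directly,
# without running mo first. "model.onnx" is a placeholder file name.
from openvino.runtime import Core

core = Core()
model = core.read_model("model.onnx")        # direct import, no IR needed
compiled_model = core.compile_model(model, "CPU")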
1) mo parameter description
# Optional arguments:
--framework {onnx,mxnet,tf,kaldi,caffe,paddle}
# Framework-agnostic parameters:
--input_model INPUT_MODEL, -w INPUT_MODEL, -m INPUT_MODEL
--model_name MODEL_NAME, -n MODEL_NAME
Output IR file name
--output_dir OUTPUT_DIR, -o OUTPUT_DIR
--input_shape INPUT_SHAPE
Shape of the model input node; the input shape can also be set with the --input parameter
--scale SCALE, -s SCALE
All inputs of the original network are divided by this value.
--reverse_input_channels
Swap the input color channels, RGB → BGR
--log_level {CRITICAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}
Logger level
--input INPUT
Quoted string of comma-separated input node information, including name, shape, data type, etc.
--output OUTPUT
Specify the output node(s) of the model
--mean_values MEAN_VALUES, -ms MEAN_VALUES
Mean value to subtract for each channel of the input image
--scale_values SCALE_VALUES
Scale value to divide by for each channel of the input image
--source_layout SOURCE_LAYOUT
Layout of the input or output of the model in the framework. Layout can be specified in
the short form, e.g. nhwc, or in complex form, e.g. "[n,h,w,c]". Example for many names:
"in_name1([n,h,w,c]),in_name2(nc),out_name1(n),out_name2(nc)". Layout can be partially
defined, "?" can be used to specify undefined layout for one dimension, "..." can be used
to specify undefined layout for multiple dimensions, for example "?c??", "nc...", "n...c",
etc.
--target_layout TARGET_LAYOUT
Same as --source_layout, but specifies target layout that will be in the model after
processing by ModelOptimizer.
--layout LAYOUT Combination of --source_layout and --target_layout. Can't be used with either of them. If
model has one input it is sufficient to specify layout of this input, for example --layout
nhwc. To specify layouts of many tensors, names must be provided, for example: --layout
"name1(nchw),name2(nc)". It is possible to instruct ModelOptimizer to change layout, for
example: --layout "name1(nhwc->nchw),name2(cn->nc)". Also "*" in long layout form can be
used to fuse dimensions, for example "[n,c,...]->[n*c,...]".
--data_type {FP16,FP32,half,float}
Data type; this parameter determines the precision of the model.
--transform TRANSFORM
Apply additional transformations. Usage: "--transform
transformation_name1[args],transformation_name2..." where [args] is key=value pairs
separated by semicolon. Examples: "--transform LowLatency2" or "--transform
LowLatency2[use_const_initializer=False]" or "--transform "MakeStateful[param_res_names={'
input_name_1':'output_name_1','input_name_2':'output_name_2'}]"" Available
transformations: "LowLatency2", "MakeStateful"
--disable_fusing [DEPRECATED] Turn off fusing of linear operations to Convolution.
--disable_resnet_optimization
[DEPRECATED] Turn off ResNet optimization.
--finegrain_fusing FINEGRAIN_FUSING
[DEPRECATED] Regex for layers/operations that won't be fused. Example: --finegrain_fusing
Convolution1,.*Scale.*
--enable_concat_optimization
[DEPRECATED] Turn on Concat optimization.
--extensions EXTENSIONS
Paths or a comma-separated list of paths to libraries (.so or .dll) with extensions. For
the legacy MO path (if `--use_legacy_frontend` is used), a directory or a comma-separated
list of directories with extensions are supported. To disable all extensions including
those that are placed at the default location, pass an empty string.
--batch BATCH, -b BATCH
Input batch size
--version Version of Model Optimizer
--silent Prevent any output messages except those that correspond to log level equals ERROR, that
can be set with the following option: --log_level. By default, log level is already ERROR.
--freeze_placeholder_with_value FREEZE_PLACEHOLDER_WITH_VALUE
Replaces input layer with constant node with provided value, for example:
"node_name->True". It will be DEPRECATED in future releases. Use --input option to specify
a value for freezing.
--static_shape Enables IR generation for fixed input shape (folding `ShapeOf` operations and shape-
calculating sub-graphs to `Constant`). Changing model input shape using the OpenVINO
Runtime API in runtime may fail for such an IR.
--disable_weights_compression
[DEPRECATED] Disable compression and store weights with original precision.
--progress Enable model conversion progress display.
--stream_output Switch model conversion progress display to a multiline mode.
--transformations_config TRANSFORMATIONS_CONFIG
Use the configuration file with transformations description. File can be specified as
relative path from the current directory, as absolute path or as arelative path from the
mo root directory
--use_new_frontend Force the usage of new Frontend of Model Optimizer for model conversion into IR. The new
Frontend is C++ based and is available for ONNX* and PaddlePaddle* models. Model optimizer
uses new Frontend for ONNX* and PaddlePaddle* by default that means `--use_new_frontend`
and `--use_legacy_frontend` options are not specified.
--use_legacy_frontend
Force the usage of legacy Frontend of Model Optimizer for model conversion into IR. The
legacy Frontend is Python based and is available for TensorFlow*, ONNX*, MXNet*, Caffe*,
and Kaldi* models.
2) Converting ONNX models
mo --input_model <INPUT_MODEL>.onnx
3) Converting PaddlePaddle models
mo --input_model <INPUT_MODEL>.pdmodel
# Example
mo --input_model=yolov3.pdmodel --input=image,im_shape,scale_factor --input_shape=[1,3,608,608],[1,2],[1,2] --reverse_input_channels --output=save_infer_model/scale_0.tmp_1,save_infer_model/scale_1.tmp_1
4) Converting PyTorch models
Export the PyTorch model to ONNX first, then convert the ONNX model to OpenVINO IR.
import torch
# Instantiate your model. This is just a regular PyTorch model that will be exported in the following steps.
model = SomeModel()
# Evaluate the model to switch some operations from training mode to inference.
model.eval()
# Create dummy input for the model. It will be used to run the model inside export function.
dummy_input = torch.randn(1, 3, 224, 224)
# Call the export function
torch.onnx.export(model, (dummy_input, ), 'model.onnx')
- Starting with PyTorch 1.8.1, not all PyTorch operations can be exported to ONNX opset 9, which is used by default. When exporting to the default opset 9 does not work, it is recommended to export the model to opset 11 or higher.
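If the default opset fails, the opset can be pinned explicitly at export time; the line below is a small illustrative variation of the export call above (it assumes the model and dummy_input objects from that snippet).
# Sketch: pin the ONNX opset explicitly (assumes `model` and `dummy_input` from the export example above)
torch.onnx.export(model, (dummy_input, ), 'model.onnx', opset_version=11)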
5) Converting Caffe models
mo --input_model <INPUT_MODEL>.caffemodel
# Caffe-specific parameters:
--input_proto INPUT_PROTO, -d INPUT_PROTO
Deploy-ready prototxt file that contains the topology structure and layer attributes
--caffe_parser_path CAFFE_PARSER_PATH
Path to the Python Caffe parser generated from caffe.proto
-k K Path to the custom layer mapping file CustomLayersMapping.xml
--disable_omitting_optional
Disable omitting optional attributes (for custom layers). Use this option if you want to transfer all attributes of a custom layer to the IR. The default behavior is to transfer attributes with default values and user-defined attributes to the IR.
--enable_flattening_nested_params
Enable flattening of optional nested parameters (for custom layers). Use this option if you want to transfer attributes of custom layers to the IR with flattened nested parameters. The default behavior is to transfer attributes without flattening nested parameters.
# Example
mo --input_model bvlc_alexnet.caffemodel --input_proto bvlc_alexnet.prototxt
# If the caffemodel and prototxt are in the same path, specifying input_model is enough.
mo --input_model bvlc_alexnet.caffemodel -k CustomLayersMapping.xml --disable_omitting_optional --enable_flattening_nested_params
6) Converting TensorFlow models
- TensorFlow*-specific parameters
--input_model_is_text
TensorFlow*: treat the input model file as a text protobuf format. If not specified, the Model Optimizer treats it as a binary file by default.
--input_checkpoint INPUT_CHECKPOINT
TensorFlow*: variables file to load.
--input_meta_graph INPUT_META_GRAPH
Tensorflow*: a file with a meta-graph of the model before freezing
--saved_model_dir SAVED_MODEL_DIR
TensorFlow*: directory with a model in SavedModel format of TensorFlow 1.x or 2.x version.
--saved_model_tags SAVED_MODEL_TAGS
Group of tag(s) of the MetaGraphDef to load, in string format, separated by ','. If the tag-set contains multiple tags, all tags must be passed in.
--tensorflow_custom_operations_config_update TENSORFLOW_CUSTOM_OPERATIONS_CONFIG_UPDATE
TensorFlow*: update the configuration file with node name patterns with input/output nodes information.
--tensorflow_use_custom_operations_config TENSORFLOW_USE_CUSTOM_OPERATIONS_CONFIG
Use the configuration file with custom operation description.
--tensorflow_object_detection_api_pipeline_config TENSORFLOW_OBJECT_DETECTION_API_PIPELINE_CONFIG
TensorFlow*: path to the pipeline configuration file used to generate model created with help of Object Detection API.
--tensorboard_logdir TENSORBOARD_LOGDIR
TensorFlow*: dump the input graph to a given directory that should be used with TensorBoard.
--tensorflow_custom_layer_libraries TENSORFLOW_CUSTOM_LAYER_LIBRARIES
TensorFlow*: comma separated list of shared libraries with TensorFlow* custom operations implementation.
--disable_nhwc_to_nchw
[DEPRECATED] Disables the default translation from NHWC to NCHW. Since 2022.1 this option is deprecated and used only to maintain backward compatibility with previous releases.
- For TensorFlow 1 models
# Converting Frozen Model Format
mo --input_model <INPUT_MODEL>.pb
# Converting Non-Frozen Model Formats
# 1. Checkpoint format: contains inference_graph.pb and checkpoint_file.ckpt
mo --input_model <INFERENCE_GRAPH>.pb --input_checkpoint <INPUT_CHECKPOINT>
# 2. MetaGraph format: contains model_name.meta, model_name.index, model_name.data-00000-of-00001, and (optionally) checkpoint_file.ckpt
mo --input_meta_graph <INPUT_META_GRAPH>.meta
# 3. SavedModel format: a directory containing a .pb file and the variables, assets, and assets.extra subdirectories
mo --saved_model_dir <SAVED_MODEL_DIRECTORY>
- Exporting the Frozen Model Format
import tensorflow as tf
from tensorflow.python.framework import graph_io
frozen = tf.compat.v1.graph_util.convert_variables_to_constants(sess, sess.graph_def, ["name_of_the_output_node"])
graph_io.write_graph(frozen, './', 'inference_graph.pb', as_text=False)
- For TensorFlow 2 models
  - SavedModel format: a directory containing a .pb file and the variables and assets subdirectories
mo --saved_model_dir <SAVED_MODEL_DIRECTORY>
  - Keras H5 format: must first be serialized to the SavedModel format.
import tensorflow as tf
model = tf.keras.models.load_model('model.h5')
tf.saved_model.save(model, 'model')
7) Converting MXNet models
- MXNet-specific parameters
--input_symbol INPUT_SYMBOL
Symbol file (for example, model-symbol.json) that contains a topology structure and layer attributes
--nd_prefix_name ND_PREFIX_NAME
Prefix name for args.nd and argx.nd files.
--pretrained_model_name PRETRAINED_MODEL_NAME
Name of a pretrained MXNet model without extension and epoch number. This model will be merged with args.nd and argx.nd files
--save_params_from_nd
Enable saving built parameters file from .nd files
--legacy_mxnet_model
Enable MXNet loader to make a model compatible with the latest MXNet version. Use only if your model was trained with MXNet version lower than 1.0.0
--enable_ssd_gluoncv
Enable pattern matchers replacers for converting gluoncv ssd topologies.
3. Post-training optimization
https://docs.openvino.ai/latest/pot_compression_cli_README.html
# Basic usage for DefaultQuantization
pot -q default -m <path_to_xml> -w <path_to_bin> --ac-config <path_to_AC_config_yml>
# Basic usage for AccuracyAwareQuantization
pot -q accuracy_aware -m <path_to_xml> -w <path_to_bin> --ac-config <path_to_AC_config_yml> --max-drop 0.01
V. OpenVINO Inference
Integrate OpenVINO™ with Your Application
1. Developing with the openvino.runtime API
1) Synchronous inference workflow
- Create a Core object;
from openvino.runtime import Core, Type, Layout
core = Core()
# List the available devices [optional]
devices = core.available_devices
for device in devices:
    device_name = core.get_property(device, "FULL_DEVICE_NAME")
    print(f"{device}: {device_name}")
CPU: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
GNA: GNA_SW
GPU: Intel(R) Iris(R) Xe Graphics [0x9a49] (iGPU)
- Load and compile the model;
# Read the model file; model_path is an .xml file or an .onnx file
model = core.read_model(model_path)
# Get model input/output information [optional]
input_layer = model.input(0)
output_layer = model.output(0)
print(f"input precision: {input_layer.element_type}")
print(f"input shape: {input_layer.shape}")
print(f"output precision: {output_layer.element_type}")
print(f"output shape: {output_layer.shape}")
# Integrate preprocessing steps into the model [optional]
# See: Developing with the openvino.preprocess API
# Compile the model for the target device: device_name='CPU'/'GPU'; config is optional
compiled_model = core.compile_model(model, device_name, config)
input precision: <Type: 'float32'>
input shape: {1, 3, 224, 224}
output precision: <Type: 'float32'>
output shape: {1, 1001}
- Run synchronous inference and get the results;
# Method 1: preprocessing in application code; input_data is the model input
import cv2
import numpy as np
image = cv2.imread(image_path)
N, C, H, W = input_layer.shape
resized_image = cv2.resize(src=image, dsize=(W, H))
input_data = np.expand_dims(np.transpose(resized_image, (2, 0, 1)), 0).astype(np.float32)
result = compiled_model([input_data])[output_layer]  # blocking inference
# Method 2: preprocessing integrated into the model
image = cv2.imread(image_path)
# Add N dimension
input_tensor = np.expand_dims(image, 0)
results = compiled_model.infer_new_request({0: input_tensor})  # blocking inference
2) Asynchronous inference workflow
The model loading steps are the same as above.
- Run asynchronous inference and get the results
from openvino.runtime import AsyncInferQueue, Core, InferRequest, Layout, Type
# Read input images
images = [cv2.imread(image_path) for image_path in args.input]
# Add N dimension
input_tensors = [np.expand_dims(image, 0) for image in images]
# Create an async queue with the optimal number of infer requests
infer_queue = AsyncInferQueue(compiled_model)
infer_queue.set_callback(completion_callback)
for i, input_tensor in enumerate(input_tensors):
    # Start asynchronous inference
    infer_queue.start_async({0: input_tensor}, args.input[i])  # non-blocking
# Wait for all inferences to finish
infer_queue.wait_all()
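The completion_callback used above is user-defined and is not shown in the original snippet; as an assumption-labelled sketch, a callback that simply reports which input finished could look like this:
# Hypothetical completion callback for AsyncInferQueue (not part of the original sample).
# It receives the finished InferRequest and the userdata passed to start_async().
def completion_callback(infer_request: InferRequest, image_path: str) -> None:
    output = infer_request.get_output_tensor(0).data
    print(f"{image_path}: top class id = {output.argmax()}")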
Or:
...
# Create one infer request to process the current frame
infer_request_curr = net.create_infer_request()
# Create another infer request to process the next frame
infer_request_next = net.create_infer_request()
# Get the current frame
frame_curr = cv2.imread("./data/images/bus.jpg")
# Preprocess the current frame
letterbox_img_curr, _, _ = letterbox(frame_curr, auto=False)
# Normalization + swap RB + layout from HWC to NCHW
blob = Tensor(cv2.dnn.blobFromImage(letterbox_img_curr, 1/255.0, swapRB=True))
# Feed the data into the specified input node of the model
infer_request_curr.set_tensor(input_node, blob)
# Call start_async() to start inference of the current frame without blocking
infer_request_curr.start_async()
while True:
    # Prepare the blob for the next frame's infer request
    # Feed the data into the next frame's infer request
    infer_request_next.set_tensor(input_node, blob)
    # Call start_async() to start inference of the next frame without blocking
    infer_request_next.start_async()
    # Wait for the current frame's infer request to finish
    infer_request_curr.wait()
    # Get the inference result of the current frame from output_node
    infer_result = infer_request_curr.get_tensor(output_node)
    # Postprocess the inference result
    data = torch.tensor(infer_result.data)
    # Swap the current-frame and next-frame infer requests
    infer_request_curr, infer_request_next = infer_request_next, infer_request_curr
2. Developing with the openvino.preprocess API
The preprocessing API available since OpenVINO™ 2022.1 can integrate all preprocessing steps into the execution graph, so a dGPU, VPU, or iGPU can perform data preprocessing without relying on the CPU.
1) Typical data preprocessing operations
- Change the shape of the input data: [720, 1280, 3] → [1, 3, 640, 640]
- Change the precision of the input data: U8 → f32
- Change the color channel order of the input data: BGR → RGB
- Change the layout of the input data: HWC → NCHW
- Normalize the data: subtract the mean and divide by the standard deviation (std); a manual numpy/cv2 sketch of these steps follows this list
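For comparison with the graph-integrated approach described below, the same operations done by hand on the host typically look like the following sketch; the 640x640 size and the mean/std values are illustrative placeholders, not values mandated by OpenVINO.
# Manual host-side preprocessing sketch (file name, sizes, mean, and std are placeholders).
import cv2
import numpy as np

image = cv2.imread("input.jpg")                 # HWC, BGR, uint8, e.g. [720, 1280, 3]
image = cv2.resize(image, (640, 640))           # spatial resize
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # BGR -> RGB
image = image.astype(np.float32)                # U8 -> f32
image = (image - 127.5) / 127.5                 # normalize with placeholder mean/std
image = np.transpose(image, (2, 0, 1))          # HWC -> CHW
input_data = np.expand_dims(image, 0)           # add N: [1, 3, 640, 640]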
2) Main workflow of the OpenVINO preprocessing API
- Instantiate a PrePostProcessor object
from openvino.runtime import Core, Type, Layout
from openvino.preprocess import PrePostProcessor, ColorFormat, ResizeAlgorithm
core = Core()
model = core.read_model(model_path)
ppp = PrePostProcessor(model)
- Declare the input tensor information
image = cv2.imread(image_path)
# Add N dimension
input_tensor = np.expand_dims(image, 0)
# For example: input_tensor.shape = [1, 640, 640, 3], i.e. the image dimensions in 'NHWC' order
ppp.input().tensor() \
    .set_shape([1, 640, 640, 3]) \
    .set_color_format(ColorFormat.BGR) \
    .set_element_type(Type.u8) \
    .set_layout(Layout('NHWC'))
- Specify the model data layout
# The model input layout is NCHW
ppp.input().model().set_layout(Layout('NCHW'))
# The model output layout is NHWC [optional]
ppp.output().model().set_layout(Layout('NHWC'))
- Declare the output tensor information
ppp.output().tensor() \
    .set_element_type(Type.f32) \
    .set_layout(Layout('NHWC'))
# The output tensor precision is f32; setting the layout is optional
- Define the preprocessing steps
# Or define custom preprocessing steps:
ppp.input().preprocess() \
    .convert_element_type(Type.f32) \
    .convert_color(ColorFormat.RGB) \
    .resize(ResizeAlgorithm.RESIZE_LINEAR, 224, 224) \
    .mean([0.0, 0.0, 0.0]) \
    .scale([255.0, 255.0, 255.0]) \
    .convert_layout([0, 3, 1, 2])
# convert_color: converts the input image from BGR to RGB
# resize: e.g. when the model input size is [1, 3, 224, 224]
# convert_layout: converts 'NHWC' to 'NCHW'
- Preprocessing operations supported by OpenVINO
  - convert_color, convert_element_type, convert_layout, crop, mean, resize, reverse_channels, scale, custom
- Define the postprocessing steps [optional]
ppp.output().postprocess() \
    .convert_element_type(Type.f32) \
    .convert_layout([0, 3, 1, 2])  # converts 'NHWC' to 'NCHW'
- Postprocessing operations supported by OpenVINO
  - convert_element_type, convert_layout, custom
- Integrate the preprocessing steps into the model
model = ppp.build()
![Preprocessing steps integrated into the execution graph](https://img-blog.csdnimg.cn/42d7745f14e1434cbaee46725eac5420.png)
- Export the model with the integrated preprocessing steps [optional]
from openvino.offline_transformations import serialize
serialize(model, 'xxx.xml', 'xxx.bin')
3. Auto-Device and Automatic Batching plugins
Best practices for the AUTO plugin and Automatic Batching in OpenVINO™ 2022.1
1) Auto-Device
AUTO Device (short for Automatic device selection) is a virtual plugin built on top of the CPU/GPU plugins. It is not bound to a specific type of device; it can be any supported CPU, GPU, VPU (Vision Processing Unit), or GNA (Gaussian & Neural Accelerator coprocessor), or a combination of these devices.
Advantages:
- Uses the deep learning model and the characteristics of the selected devices in their best configuration.
- Gives the GPU a faster first-inference latency: the GPU plugin needs to compile the model online at runtime before inference can start. When a discrete or integrated GPU is selected, the AUTO plugin first runs inference on the CPU to hide this GPU model compilation time.
- Simple to use: developers only need to set the device_name parameter of compile_model() to "AUTO" (see the sketch below).
Device switching logic:
- The AUTO plugin selects the best compute device according to the device priority dGPU > iGPU > VPU > CPU. When the AUTO plugin selects a GPU as the best device, an inference device switch happens to hide the first-inference latency.
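A minimal sketch of this usage follows; the model file name and the performance hint shown are illustrative assumptions, not the only valid settings.
# Sketch: let AUTO pick the device; "model.xml" and the hint value are placeholders.
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")
compiled_model = core.compile_model(
    model,
    device_name="AUTO",                      # AUTO selects dGPU > iGPU > VPU > CPU
    config={"PERFORMANCE_HINT": "LATENCY"},  # or "THROUGHPUT"
)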
Precisions supported by different devices
Supported Device | Supported model precision |
---|---|
dGPU (e.g. Intel® Iris® Xe MAX) | FP32, FP16, INT8, BIN |
iGPU (e.g. Intel® UHD Graphics 620 (iGPU)) | FP32, FP16, BIN |
Intel® Movidius™ Myriad™ X VPU (e.g. Intel® Neural Compute Stick 2 (Intel® NCS2)) | FP16 |
Intel® CPU (e.g. Intel® Core™ i7-1165G7) | FP32, FP16, INT8, BIN |
2) Automatic Batching
Automatic Batching combines multiple asynchronous inference requests issued by the application, treats them as one batched inference request, and then splits the batched results and returns them to the individual requests.
When the config parameter of compile_model() is set to {"PERFORMANCE_HINT": "THROUGHPUT"}, OpenVINO™ Runtime enables Automatic Batching automatically.
PERFORMANCE_HINT | Application scenario | Auto Batching enabled? |
---|---|---|
THROUGHPUT | Non-real-time, large-batch inference workloads | Yes |
LATENCY | Real-time or near-real-time applications | No |
compiled_model = core.compile_model(model="xxx.onnx", device_name="AUTO", \
config={"PERFORMANCE_HINT": "THROUGHPUT", 'ALLOW_AUTO_BATCHING': 'YES'})
4. C++ inference example
#include <iterator>
#include <memory>
#include <sstream>
#include <string>
#include <vector>
// clang-format off
#include "openvino/openvino.hpp"
#include "samples/args_helper.hpp"
#include "samples/common.hpp"
#include "samples/classification_results.h"
#include "samples/slog.hpp"
#include "format_reader_ptr.h"
// clang-format on
/**
* @brief Main with support Unicode paths, wide strings
*/
int tmain(int argc, tchar* argv[]) {
try {
// -------- Step 1. Initialize OpenVINO Runtime Core --------
ov::Core core;
// -------- Step 2. Read a model --------
std::shared_ptr<ov::Model> model = core.read_model(model_path);
printInputAndOutputsInfo(*model);
// -------- Step 3. Set up input
// Read input image to a tensor and set it to an infer request
// without resize and layout conversions
FormatReader::ReaderPtr reader(image_path.c_str());
if (reader.get() == nullptr) {
std::stringstream ss;
ss << "Image " + image_path + " cannot be read!";
throw std::logic_error(ss.str());
}
ov::element::Type input_type = ov::element::u8;
ov::Shape input_shape = {1, reader->height(), reader->width(), 3};
std::shared_ptr<unsigned char> input_data = reader->getData();
// just wrap image data by ov::Tensor without allocating of new memory
ov::Tensor input_tensor = ov::Tensor(input_type, input_shape, input_data.get());
const ov::Layout tensor_layout{"NHWC"};
// -------- Step 4. Configure preprocessing --------
ov::preprocess::PrePostProcessor ppp(model);
// 1) Set input tensor information:
// - input() provides information about a single model input
// - reuse precision and shape from already available `input_tensor`
// - layout of data is 'NHWC'
ppp.input().tensor().set_shape(input_shape).set_element_type(input_type).set_layout(tensor_layout);
// 2) Adding explicit preprocessing steps:
// - convert layout to 'NCHW' (from 'NHWC' specified above at tensor layout)
// - apply linear resize from tensor spatial dims to model spatial dims
ppp.input().preprocess().resize(ov::preprocess::ResizeAlgorithm::RESIZE_LINEAR);
// 4) Here we suppose model has 'NCHW' layout for input
ppp.input().model().set_layout("NCHW");
// 5) Set output tensor information:
// - precision of tensor is supposed to be 'f32'
ppp.output().tensor().set_element_type(ov::element::f32);
// 6) Apply preprocessing modifying the original 'model'
model = ppp.build();
// -------- Step 5. Loading a model to the device --------
ov::CompiledModel compiled_model = core.compile_model(model, device_name);
// -------- Step 6. Create an infer request --------
ov::InferRequest infer_request = compiled_model.create_infer_request();
// -----------------------------------------------------------------------------------------------------
// -------- Step 7. Prepare input --------
infer_request.set_input_tensor(input_tensor);
// -------- Step 8. Do inference synchronously --------
infer_request.infer();
// -------- Step 9. Process output
const ov::Tensor& output_tensor = infer_request.get_output_tensor();
// Print classification results
ClassificationResult classification_result(output_tensor, {image_path});
classification_result.show();
// -----------------------------------------------------------------------------------------------------
} catch (const std::exception& ex) {
std::cerr << ex.what() << std::endl;
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
Inference modes
- Automatic device selection (AUTO): detects the available devices, selects the one best suited to the task, and configures its optimization settings. This lets you write an application once and deploy it anywhere.
  - Inference starts on the CPU while the model is being loaded onto the device best suited for the purpose, and the workload is handed over to that device once it is ready.
  - Starting on the CPU reduces first-inference latency.
- Multi-device execution (MULTI): runs inference requests for the same model on several devices in parallel.
- Heterogeneous execution (HETERO): allows executing the inference of one model on several devices.
- Automatic batching (Auto-batching): improves device utilization by grouping inference requests together, without any programming effort from the user.
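As a closing sketch (the device lists and the model file name are examples and must match the devices actually present on the machine), the MULTI and HETERO modes are selected through the device string passed to compile_model():
# Sketch: selecting MULTI / HETERO modes via the device string ("model.xml" is a placeholder).
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")
multi_model = core.compile_model(model, "MULTI:GPU,CPU")    # run requests on GPU and CPU in parallel
hetero_model = core.compile_model(model, "HETERO:GPU,CPU")  # split one model between GPU and CPU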