OpenVINO User Guide

This document is based on openvino_2022.2.

I. Introduction

The OpenVINO™ toolkit is a comprehensive toolkit for quickly developing applications and solutions that solve a variety of tasks, including emulation of human vision, automatic speech recognition, natural language processing, recommendation systems, and more.

First released in 2018; open source and free for commercial use.

Features of the OpenVINO™ 2022.2 release

1. OpenVINO™ Toolkit

  • Enables CNN-based deep learning inference at the edge
  • Supports heterogeneous execution across Intel® CPUs, Intel® integrated graphics, Intel® Neural Compute Stick 2, and Intel® Vision Accelerator Design with Intel® Movidius™ VPUs
  • Speeds time-to-market through an easy-to-use library of computer vision functions and pre-optimized kernels
  • Includes optimized calls for computer vision standards, including OpenCV* and OpenCL™

2. OpenVINO™ Toolkit Components

  • Deep Learning Model Optimizer

    • A cross-platform command-line tool that imports models from the major deep learning frameworks; model files may come from TensorFlow, PyTorch, Caffe, MXNet, ONNX, and other frameworks and tools. The Model Optimizer converts and optimizes imported models and exports them as intermediate representation (IR) files.
  • Deep Learning Inference Engine

    • A unified set of C++/Python API functions that enable high-performance inference on many hardware types, including Intel® CPUs, Intel® integrated graphics, Intel® Neural Compute Stick 2, and Intel® Vision Accelerator Design with Intel® Movidius™ Vision Processing Units (VPUs).
  • Inference Engine samples: a set of simple console applications that demonstrate how to use the Inference Engine in your applications.

  • Deep Learning Workbench (DL Workbench)

    • A web-based graphical environment and the official OpenVINO™ GUI, designed to make it easier to work with pre-trained deep learning computer vision and natural language processing models.
  • Post-Training Optimization Tool: a tool for calibrating a model and then executing it in INT8 precision.

  • Additional tools: a set of tools for working with models, including the Benchmark App, Cross Check Tool, and Compile Tool.

  • Open Model Zoo

  • Deep Learning Streamer (DL Streamer): a GStreamer-based streaming analytics framework for building graphs of media-analytics components. DL Streamer can be installed through the Intel® Distribution of OpenVINO™ toolkit installer.

  • OpenCV: a community version of OpenCV compiled for Intel® hardware

  • Intel® Media SDK: (only in the Intel® Distribution of OpenVINO™ toolkit for Linux)

3. OpenVINO™ Toolkit Workflow

(figure: OpenVINO™ toolkit workflow)
  • Supported deployment devices
    • Intel® CPU (e.g. Intel® Core™ i7-1165G7)
    • dGPU (e.g. Intel® Iris® Xe MAX), discrete graphics
    • iGPU (e.g. Intel® UHD Graphics 620 (iGPU)), integrated graphics
    • Intel® Movidius™ Myriad™ X VPU (e.g. Intel® Neural Compute Stick 2 (Intel® NCS2))
    • GNA (Gaussian & Neural Accelerator, integrated into the processor): intended for AI speech and audio applications such as neural noise cancellation.

II. Installing OpenVINO Components


1. Environment Requirements

  • Operating systems
    • Ubuntu 18.04 long-term support (LTS), 64-bit
    • Ubuntu 20.04 long-term support (LTS), 64-bit
  • Hardware
    • 6th to 12th generation Intel? Core? processors and Intel? Xeon? processors
    • 3rd generation Intel? Xeon? Scalable processor (formerly code named Cooper Lake)
    • Intel? Xeon? Scalable processor (formerly Skylake and Cascade Lake)
    • Intel Atom? processor with support for Intel? Streaming SIMD Extensions 4.1 (Intel? SSE4.1)
    • Intel Pentium? processor N4200/5, N3350/5, or N3450/5 with Intel? HD Graphics
    • Intel? Iris? Xe MAX Graphics
    • Intel? Neural Compute Stick 2
    • Intel? Vision Accelerator Design with Intel? Movidius? VPUs

2. Download and Installation

From the Intel® Distribution of OpenVINO™ Toolkit download page, choose either the OpenVINO Development Tools or the OpenVINO Runtime package.

1) Installing OpenVINO Development Tools with PIP

Installing the OpenVINO Development Tools also installs the OpenVINO Runtime.

# Step 1: Create and activate virtual environment
python3 -m venv openvino_env
source openvino_env/bin/activate
# Step 2: Upgrade pip to latest version
python -m pip install --upgrade pip
# Step 3: Download and install the package
pip install openvino-dev[ONNX,tensorflow2,mxnet,kaldi,caffe,pytorch]==2022.2.0

# An openvino_env folder appears in the current directory
$ tree openvino_env/ -L 2
openvino_env/
├── bin
│   ├── accuracy_check
│   ├── activate
│   ├── activate.csh
│   ├── activate.fish
│   ├── Activate.ps1
│   ├── backend-test-tools
│   ├── benchmark_app       # model benchmarking tool
│   ├── check-model
│   ├── check-node
│   ├── convert_annotation
│   ├── convert-caffe2-to-onnx
│   ├── convert-onnx-to-caffe2
│   ├── cpuinfo
│   ├── easy_install
│   ├── easy_install-3.8
│   ├── estimator_ckpt_converter
│   ├── f2py
│   ├── f2py3
│   ├── f2py3.8
│   ├── google-oauthlib-tool
│   ├── huggingface-cli
│   ├── imagecodecs
│   ├── imageio_download_bin
│   ├── imageio_remove_bin
│   ├── import_pb_to_tensorboard
│   ├── lsm2bin
│   ├── markdown_py
│   ├── mo                              # Model Optimizer
│   ├── nib-conform
│   ├── nib-convert
│   ├── nib-dicomfs
│   ├── nib-diff
│   ├── nib-ls
│   ├── nib-nifti-dx
│   ├── nib-roi
│   ├── nib-stats
│   ├── nib-tck2trk
│   ├── nib-trk2tck
│   ├── nltk
│   ├── normalizer
│   ├── omz_converter                   # Open Model Zoo tool: convert pre-trained models to IR files
│   ├── omz_data_downloader     # Open Model Zoo tool: download dataset data
│   ├── omz_downloader              # Open Model Zoo tool: download pre-trained models
│   ├── omz_info_dumper
│   ├── omz_quantizer
│   ├── opt_in_out
│   ├── parrec2nii
│   ├── pip
│   ├── pip3
│   ├── pip3.10
│   ├── pip3.8
│   ├── pot                                 # Post-training Optimization Tool 
│   ├── pydicom
│   ├── pyrsa-decrypt
│   ├── pyrsa-encrypt
│   ├── pyrsa-keygen
│   ├── pyrsa-priv2pub
│   ├── pyrsa-sign
│   ├── pyrsa-verify
│   ├── python -> python3
│   ├── python3 -> /usr/bin/python3
│   ├── saved_model_cli
│   ├── skivi
│   ├── tensorboard
│   ├── tflite_convert
│   ├── tf_upgrade_v2
│   ├── tiff2fsspec
│   ├── tiffcomment
│   ├── tifffile
│   ├── toco
│   ├── toco_from_protos
│   ├── tqdm
│   ├── transformers-cli
│   └── wheel
├── include
├── lib
│   └── python3.8
├── lib64 -> lib
├── pyvenv.cfg
└── share
    ├── doc
    └── python-wheels
        ├── appdirs-1.4.3-py2.py3-none-any.whl
        ├── CacheControl-0.12.6-py2.py3-none-any.whl
        ├── certifi-2019.11.28-py2.py3-none-any.whl
        ├── chardet-3.0.4-py2.py3-none-any.whl
        ├── colorama-0.4.3-py2.py3-none-any.whl
        ├── contextlib2-0.6.0-py2.py3-none-any.whl
        ├── distlib-0.3.0-py2.py3-none-any.whl
        ├── distro-1.4.0-py2.py3-none-any.whl
        ├── html5lib-1.0.1-py2.py3-none-any.whl
        ├── idna-2.8-py2.py3-none-any.whl
        ├── ipaddr-2.2.0-py2.py3-none-any.whl
        ├── lockfile-0.12.2-py2.py3-none-any.whl
        ├── msgpack-0.6.2-py2.py3-none-any.whl
        ├── packaging-20.3-py2.py3-none-any.whl
        ├── pep517-0.8.2-py2.py3-none-any.whl
        ├── pip-20.0.2-py2.py3-none-any.whl
        ├── pkg_resources-0.0.0-py2.py3-none-any.whl
        ├── progress-1.5-py2.py3-none-any.whl
        ├── pyparsing-2.4.6-py2.py3-none-any.whl
        ├── requests-2.22.0-py2.py3-none-any.whl
        ├── retrying-1.3.3-py2.py3-none-any.whl
        ├── setuptools-44.0.0-py2.py3-none-any.whl
        ├── six-1.14.0-py2.py3-none-any.whl
        ├── toml-0.10.0-py2.py3-none-any.whl
        ├── urllib3-1.25.8-py2.py3-none-any.whl
        ├── webencodings-0.5.1-py2.py3-none-any.whl
        └── wheel-0.34.2-py2.py3-none-any.whl

$ pip list
Package                      Version
---------------------------- -----------
...
google-auth                  2.13.0
google-auth-oauthlib         0.4.6
google-pasta                 0.2.0
...
keras                        2.9.0
...
mxnet                        1.7.0.post2
...
numpy                        1.23.1
onnx                         1.11.0
opencv-python                4.6.0.66
openvino                     2022.2.0
openvino-dev                 2022.2.0
openvino-telemetry           2022.1.1
...
scikit-image                 0.19.3
scikit-learn                 0.24.2
scipy                        1.5.4
...
tensorboard                  2.9.1
tensorboard-data-server      0.6.1
tensorboard-plugin-wit       1.8.1
tensorflow                   2.9.1
tensorflow-estimator         2.9.0
tensorflow-io-gcs-filesystem 0.27.0
...
torch                        1.8.1
torchvision                  0.9.1
tqdm                         4.64.1
transformers                 4.23.1
...
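
To verify the installation, a quick check (a minimal sketch; run inside the activated virtual environment):

# List the devices OpenVINO can see
from openvino.runtime import Core
print(Core().available_devices)   # e.g. ['CPU', 'GPU']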

2) Installing OpenVINO Development Tools with Docker

Pull the image from Docker Hub.

# Intel CPU
docker run -it --rm openvino/ubuntu18_dev
# Intel GPU
docker run -it --rm --device /dev/dri openvino/ubuntu18_dev
# NCS2 (a single VPU)
docker run -it --rm --device-cgroup-rule='c 189:* rmw' -v /dev/bus/usb:/dev/bus/usb openvino/ubuntu18_dev
# HDDL (multiple VPUs)
docker run -it --rm --device=/dev/ion:/dev/ion -v /var/tmp:/var/tmp openvino/ubuntu18_dev

Inside the container

# The container starts in the working directory, e.g. /opt/intel/openvino_2022.2.0.7713
$ tree -L 2
.
|-- docs
|   |-- OpenVINO-GetStarted-online.html
|   |-- OpenVINO-Install-Linux-online.html
|   |-- OpenVINO-OpenVX-documentation.html
|   |-- OpenVINO-documentation-online.html
|   |-- licensing
|-- extras
|   |-- opencv
|-- install_dependencies
|   |-- 97-myriad-usbboot.rules
|   |-- install_NCS_udev_rules.sh
|   |-- install_NEO_OCL_driver.sh
|   |-- install_openvino_dependencies.sh
|-- licensing
|   |-- DockerImage_readme.txt
|   |-- third-party-programs-docker-dev.txt
|   |-- third-party-programs-docker-runtime.txt
|-- python
|   |-- python3.6
|   |-- python3.7
|   |-- python3.8
|   |-- python3.9
|-- runtime
|   |-- 3rdparty
|   |-- cmake
|   |-- include
|   |-- lib
|   |-- version.txt
|-- samples
|   |-- c
|   |   |-- CMakeLists.txt
|   |   |-- build_samples.sh
|   |   |-- common
|   |   |-- hello_classification
|   |   |-- hello_nv12_input_classification
|   |-- cpp
|   |   |-- CMakeLists.txt
|   |   |-- benchmark_app
|   |   |-- build
|   |   |-- build_samples.sh
|   |   |-- classification_sample_async
|   |   |-- common
|   |   |-- hello_classification
|   |   |-- hello_nv12_input_classification
|   |   |-- hello_query_device
|   |   |-- hello_reshape_ssd
|   |   |-- model_creation_sample
|   |   |-- samples_bin
|   |   |-- speech_sample
|   |   |-- thirdparty
|   |-- python
|       |-- classification_sample_async
|       |-- hello_classification
|       |-- hello_query_device
|       |-- hello_reshape_ssd
|       |-- model_creation_sample
|       |-- requirements.txt
|       |-- setup.cfg
|       |-- speech_sample
|-- setupvars.sh
|-- tools
    |-- cl_compiler
    |-- compile_tool
    |-- deployment_manager
    |-- requirements.txt
    |-- requirements_caffe.txt
    |-- requirements_kaldi.txt
    |-- requirements_mxnet.txt
    |-- requirements_onnx.txt
    |-- requirements_pytorch.txt
    |-- requirements_tensorflow.txt
    |-- requirements_tensorflow2.txt
    
$ ls /usr/local/bin/omz*
/usr/local/bin/omz_converter        /usr/local/bin/omz_downloader   /usr/local/bin/omz_quantizer
/usr/local/bin/omz_data_downloader  /usr/local/bin/omz_info_dumper

OpenVINO™ toolkit component comparison, 2021 vs 2022:

  • Inference Engine Runtime → evolved into the OpenVINO™ Runtime
  • Samples → kept, but trimmed: samples duplicating OMZ demos were removed, keeping only samples that illustrate API usage
  • Dev tools (MO, POT, DL Workbench, and the OMZ download/conversion tools) → no longer included by default; installed separately via pip
  • Non-dev tools (deployment manager, compile_tool, etc.) → kept
  • OpenCV → no longer included by default; downloaded and installed via a separately provided script
  • DL Workbench download/install script → removed from the installer package; installed separately via pip
  • DL Streamer → removed from the installer package; installed separately via APT
  • Media SDK → evolved into oneVPL and removed from the installer package
  • Demo applications (from OMZ) → removed from the installer package

3) Installing DL Workbench with Docker

https://docs.openvino.ai/latest/workbench_docs_Workbench_DG_Run_Locally.html#windows

# Manage Docker as a non-root user
$ sudo groupadd docker
$ sudo usermod -aG docker $USER
$ newgrp docker # activate the changes to groups
$ docker ps

$ docker pull openvino/workbench:latest 
$ docker run -p 0.0.0.0:5665:5665 --name workbench -it --rm openvino/workbench:latest
waiting for server to start..... done
server started
waiting for server to shut down..... done
server stopped
[Workbench] PostgreSQL init process complete.
[Workbench] PostgreSQL applying migrations...
waiting for server to start..... done
server started

Open a browser and go to http://127.0.0.1:5665.

III. Using OpenVINO Components

1. Using the openvino-dev Container

Based on the OpenVINO Development Tools container image.

1) Container Basics

# Initialize the OpenVINO environment variables
$ source /opt/intel/openvino/setupvars.sh
# Initialize the OpenVINO-OpenCV environment variables; without this, capturing video streams fails
$ source /opt/intel/openvino/extras/opencv/setupvars.sh 

# Query device information
$ cd /opt/intel/openvino_2022.2.0.7713/samples/python/hello_query_device
$ python3 hello_query_device.py
[ INFO ] Available devices:
[ INFO ] CPU :
[ INFO ]    SUPPORTED_PROPERTIES:
[ INFO ]        AVAILABLE_DEVICES: 
[ INFO ]        RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 1, 1
[ INFO ]        RANGE_FOR_STREAMS: 1, 8
[ INFO ]        FULL_DEVICE_NAME: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
[ INFO ]        OPTIMIZATION_CAPABILITIES: WINOGRAD, FP32, FP16, INT8, BIN, EXPORT_IMPORT
[ INFO ]        CACHE_DIR: 
[ INFO ]        NUM_STREAMS: 1
[ INFO ]        AFFINITY: Affinity.CORE
[ INFO ]        INFERENCE_NUM_THREADS: 0
[ INFO ]        PERF_COUNT: False
[ INFO ]        INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]        PERFORMANCE_HINT: PerformanceMode.UNDEFINED
[ INFO ]        PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] GPU :
[ INFO ]    SUPPORTED_PROPERTIES:
[ INFO ]        AVAILABLE_DEVICES: 0
[ INFO ]        RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 2, 1
[ INFO ]        RANGE_FOR_STREAMS: 1, 2
[ INFO ]        OPTIMAL_BATCH_SIZE: 1
[ INFO ]        MAX_BATCH_SIZE: 1
[ INFO ]        FULL_DEVICE_NAME: Intel(R) Iris(R) Xe Graphics [0x9a49] (iGPU)
[ INFO ]        DEVICE_UUID: UNSUPPORTED TYPE
[ INFO ]        DEVICE_TYPE: Type.INTEGRATED
[ INFO ]        DEVICE_GOPS: UNSUPPORTED TYPE
[ INFO ]        OPTIMIZATION_CAPABILITIES: FP32, BIN, FP16, INT8
[ INFO ]        GPU_DEVICE_TOTAL_MEM_SIZE: UNSUPPORTED TYPE
[ INFO ]        GPU_UARCH_VERSION: 12.0.0
[ INFO ]        GPU_EXECUTION_UNITS_COUNT: 96
[ INFO ]        GPU_MEMORY_STATISTICS: UNSUPPORTED TYPE
[ INFO ]        PERF_COUNT: False
[ INFO ]        MODEL_PRIORITY: Priority.MEDIUM
[ INFO ]        GPU_HOST_TASK_PRIORITY: Priority.MEDIUM
[ INFO ]        GPU_QUEUE_PRIORITY: Priority.MEDIUM
[ INFO ]        GPU_QUEUE_THROTTLE: Priority.MEDIUM
[ INFO ]        GPU_ENABLE_LOOP_UNROLLING: True
[ INFO ]        CACHE_DIR: 
[ INFO ]        PERFORMANCE_HINT: PerformanceMode.UNDEFINED
[ INFO ]        COMPILATION_NUM_THREADS: 8
[ INFO ]        NUM_STREAMS: 1
[ INFO ]        PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]        DEVICE_ID: 0

2) Running the OpenVINO Samples

OpenVINO Samples

Run the Python samples directly:

# CPU
docker run -it --rm <image_name> \
/bin/bash -c "cd ~ && omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && python3 /opt/intel/openvino/samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp CPU"

# GPU
docker run -itu root:root  --rm --device /dev/dri:/dev/dri <image_name> \
/bin/bash -c "omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && python3 samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp GPU"

# MYRIAD
docker run -itu root:root --rm --device-cgroup-rule='c 189:\* rmw' -v /dev/bus/usb:/dev/bus/usb <image_name> \
/bin/bash -c "omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && python3 samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp MYRIAD"

# HDDL
docker run -itu root:root --rm --device=/dev/ion:/dev/ion -v /var/tmp:/var/tmp -v /dev/shm:/dev/shm <image_name> \
/bin/bash -c "omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && umask 000 && python3 samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp HDDL"

Build and run the C++ samples:

# Inside the container
$ cd /opt/intel/openvino_2022.2.0.7713/samples/cpp
# Build the samples
$ ./build_samples.sh
$ tree samples_bin/
samples_bin/
|-- benchmark_app
|-- classification_sample_async
|-- hello_classification
|-- hello_nv12_input_classification
|-- hello_query_device
|-- hello_reshape_ssd
|-- model_creation_sample
|-- speech_sample

3) Command-Line Usage

# List the available pre-trained models
$ omz_downloader --print_all
Sphereface
aclnet
aclnet-int8
action-recognition-0001
age-gender-recognition-retail-0013
alexnet
......
yolo-v3-onnx
yolo-v3-tf
yolo-v3-tiny-onnx
yolo-v3-tiny-tf
yolo-v4-tf
yolo-v4-tiny-tf
yolof
yolox-tiny

# Test running a model with OpenVINO
$ cd /opt/intel/openvino_2022.2.0.7713/samples/python/hello_classification/
# 1. Download a pre-trained model
$ omz_downloader --name alexnet
$ tree public/alexnet/
|-- alexnet.caffemodel
|-- alexnet.prototxt
|-- alexnet.prototxt.orig

# 2. Convert the model
$ omz_converter  --name alexnet
$ tree public/alexnet/
|-- FP16
|   |-- alexnet.bin
|   |-- alexnet.mapping
|   `-- alexnet.xml
|-- FP32
|   |-- alexnet.bin
|   |-- alexnet.mapping
|   `-- alexnet.xml
|-- alexnet.caffemodel
|-- alexnet.prototxt
|-- alexnet.prototxt.orig

# 3. Run the model (the last argument selects the device: CPU, GPU, or AUTO)
$ curl -O https://storage.openvinotoolkit.org/data/test_data/images/banana.jpg
$ python3 hello_classification.py public/alexnet/FP16/alexnet.xml banana.jpg CPU

# 4. Benchmark the model (-d selects the device, e.g. CPU or GPU; -api selects sync or async)
$ benchmark_app -m public/alexnet/FP16/alexnet.xml -i banana.jpg -d CPU -niter 128 -api async
Latency:
    Median:     34.38 ms
    AVG:        34.60 ms
    MIN:        19.57 ms
    MAX:        69.10 ms
Throughput: 115.30 FPS

# Container resource usage (output of docker stats)
CONTAINER ID   NAME                    CPU %     MEM USAGE / LIMIT     MEM %     NET I/O   BLOCK I/O        PIDS
f8e127db77d8   openvino-ubuntu18_dev   786.64%   1.324GiB / 7.383GiB   17.93%    0B / 0B   483MB / 74.4MB   19

# Monitor Intel GPU utilization
sudo apt-get install -y intel-gpu-tools
sudo intel_gpu_top
intel-gpu-top -    0/   0 MHz;  100% RC6; ----- (null);        0 irqs/s

      IMC reads:   ------ (null)/s
     IMC writes:   ------ (null)/s

          ENGINE      BUSY                                                                          MI_SEMA MI_WAIT
     Render/3D/0    99.65% |                                                                       |      0%      0%
       Blitter/0    0.00% |                                                                       |      0%      0%
         Video/0    0.00% |                                                                       |      0%      0%
         Video/1    0.00% |                                                                       |      0%      0%
  VideoEnhance/0    0.00% |                                                                       |      0%      0%

2. Using the openvino_notebooks Samples

https://github.com/openvinotoolkit/openvino_notebooks/blob/main/README_cn.md

1) Installing Jupyter in the Container

See: using Jupyter remotely inside a Docker container on a remote server (Ubuntu 20.04).

Install Jupyter and start jupyter-notebook inside the openvino-dev container environment described above.

apt-get update 
apt-get install vim

pip install jupyter
# Generate the Jupyter Notebook configuration file
jupyter-notebook --generate-config

# Edit the configuration file
vim ~/.jupyter/jupyter_notebook_config.py
    # Allow access from any IP bound to the server
    c.NotebookApp.ip = '*'
    # Port used for access (must match the port exposed by the container)
    c.NotebookApp.port = 8888
    # Do not open a browser automatically
    c.NotebookApp.open_browser = False
    # Allow remote access
    c.NotebookApp.allow_remote_access = True 
    
# Start Jupyter
$ jupyter notebook --ip 0.0.0.0 --allow-root --port 8888 --no-browser
        ......
        http://127.0.0.1:8888/?token=xxx

Open the notebook in a browser and log in with the token, e.g. http://192.168.1.10:8888/.

2) Downloading and Using the openvino_notebooks Samples

apt-get install git
cd ~; git clone https://github.com/openvinotoolkit/openvino_notebooks

Open a sample .ipynb file in the browser notebook, e.g. openvino_notebooks/notebooks/001-hello-world/001-hello-world.ipynb.

IV. Model Handling

OpenVINO™ supports many model formats and can convert them into its own OpenVINO IR.

1. OpenVINO Model-Handling Tools

https://docs.openvino.ai/latest/omz_tools_downloader.html

  • mo: the Model Optimizer converts pre-trained deep learning models (TensorFlow, PyTorch, PaddlePaddle, MXNet, Caffe, Kaldi, or ONNX) into the OpenVINO Intermediate Representation (IR) format.
    • .xml - describes the model topology: every layer, its connectivity, and the parameter values.
    • .bin - contains the trained weights and biases of each layer.
    • Provided functions:
      • Convert
      • Optimize
      • Convert weights and biases
  • pot: the Post-Training Optimization Tool quantizes weights and activations from floating-point precision to integer precision (for example, 8-bit) for inference.
    • Different hardware platforms support different integer precisions and quantization parameters; POT abstracts this complexity through the concept of a "target device".
    • Quantization requires an unannotated dataset.
  • Open Model Zoo tools: one-step handling of Open Model Zoo models.
    • omz_downloader: downloads model files from online sources.
    • omz_converter: converts models in other formats into the IR format.
    • omz_quantizer: quantizes full-precision IR models into low-precision versions.
    • omz_info_dumper: prints information about the models in a stable, machine-readable format.
    • omz_data_downloader: copies dataset data from the installation location.
  • benchmark_app: the Benchmark C++ Tool estimates deep learning inference performance on supported devices.

2. Converting the Various Model Formats

Supported Model Formats

  • OpenVINO IR (Intermediate Representation): OpenVINO™'s own proprietary format.

  • ONNX, PaddlePaddle: directly supported formats. OpenVINO provides C++ and Python APIs for importing them into the OpenVINO Runtime directly, without any prior conversion, as sketched below.

  • TensorFlow, PyTorch, MXNet, Caffe, Kaldi: indirectly supported formats that must first be converted to one of the formats listed above. The Model Optimizer performs the conversion from these formats to OpenVINO IR; in some cases another converter is needed as an intermediary.
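
A minimal sketch of direct import (assuming a local model.onnx file; a PaddlePaddle .pdmodel works the same way):

from openvino.runtime import Core

core = Core()
# ONNX files are read directly; no mo conversion step is needed
model = core.read_model("model.onnx")
compiled_model = core.compile_model(model, "CPU")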

1) mo Parameter Reference

# Optional arguments:
 --framework {onnx,mxnet,tf,kaldi,caffe,paddle}
# Framework-agnostic parameters:
  --input_model INPUT_MODEL, -w INPUT_MODEL, -m INPUT_MODEL
  --model_name MODEL_NAME, -n MODEL_NAME
                        Name of the output IR file
  --output_dir OUTPUT_DIR, -o OUTPUT_DIR
  --input_shape INPUT_SHAPE
                        Shape of the model's input node(s); the input shape can also be set via the --input option
  --scale SCALE, -s SCALE
                        All inputs of the original network are divided by this value.
  --reverse_input_channels
                        Swap the input channels, RGB → BGR
  --log_level {CRITICAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}
                        Logger level
  --input INPUT        
                        Quoted, comma-separated list of input node information: names, shapes, data types, etc.
  --output OUTPUT       
                        Specify the model's output node(s)
  --mean_values MEAN_VALUES, -ms MEAN_VALUES
                        Mean values to subtract from each channel of the input image
  --scale_values SCALE_VALUES
                        Scale values to apply to each channel of the input image
  --source_layout SOURCE_LAYOUT
                        Layout of the input or output of the model in the framework. Layout can be specified in
                        the short form, e.g. nhwc, or in complex form, e.g. "[n,h,w,c]". Example for many names:
                        "in_name1([n,h,w,c]),in_name2(nc),out_name1(n),out_name2(nc)". Layout can be partially
                        defined, "?" can be used to specify undefined layout for one dimension, "..." can be used
                        to specify undefined layout for multiple dimensions, for example "?c??", "nc...", "n...c",
                        etc.
  --target_layout TARGET_LAYOUT
                        Same as --source_layout, but specifies target layout that will be in the model after
                        processing by ModelOptimizer.
  --layout LAYOUT       Combination of --source_layout and --target_layout. Can't be used with either of them. If
                        model has one input it is sufficient to specify layout of this input, for example --layout
                        nhwc. To specify layouts of many tensors, names must be provided, for example: --layout
                        "name1(nchw),name2(nc)". It is possible to instruct ModelOptimizer to change layout, for
                        example: --layout "name1(nhwc->nchw),name2(cn->nc)". Also "*" in long layout form can be
                        used to fuse dimensions, for example "[n,c,...]->[n*c,...]".
  --data_type {FP16,FP32,half,float}
                        Data type; this parameter determines the precision of the model.
  --transform TRANSFORM
                        Apply additional transformations. Usage: "--transform
                        transformation_name1[args],transformation_name2..." where [args] is key=value pairs
                        separated by semicolon. Examples: "--transform LowLatency2" or "--transform
                        LowLatency2[use_const_initializer=False]" or "--transform "MakeStateful[param_res_names={'
                        input_name_1':'output_name_1','input_name_2':'output_name_2'}]"" Available
                        transformations: "LowLatency2", "MakeStateful"
  --disable_fusing      [DEPRECATED] Turn off fusing of linear operations to Convolution.
  --disable_resnet_optimization
                        [DEPRECATED] Turn off ResNet optimization.
  --finegrain_fusing FINEGRAIN_FUSING
                        [DEPRECATED] Regex for layers/operations that won't be fused. Example: --finegrain_fusing
                        Convolution1,.*Scale.*
  --enable_concat_optimization
                        [DEPRECATED] Turn on Concat optimization.
  --extensions EXTENSIONS
                        Paths or a comma-separated list of paths to libraries (.so or .dll) with extensions. For
                        the legacy MO path (if `--use_legacy_frontend` is used), a directory or a comma-separated
                        list of directories with extensions are supported. To disable all extensions including
                        those that are placed at the default location, pass an empty string.
  --batch BATCH, -b BATCH
                        Input batch size
  --version             Version of Model Optimizer
  --silent              Prevent any output messages except those that correspond to log level equals ERROR, that
                        can be set with the following option: --log_level. By default, log level is already ERROR.
  --freeze_placeholder_with_value FREEZE_PLACEHOLDER_WITH_VALUE
                        Replaces input layer with constant node with provided value, for example:
                        "node_name->True". It will be DEPRECATED in future releases. Use --input option to specify
                        a value for freezing.
  --static_shape        Enables IR generation for fixed input shape (folding `ShapeOf` operations and shape-
                        calculating sub-graphs to `Constant`). Changing model input shape using the OpenVINO
                        Runtime API in runtime may fail for such an IR.
  --disable_weights_compression
                        [DEPRECATED] Disable compression and store weights with original precision.
  --progress            Enable model conversion progress display.
  --stream_output       Switch model conversion progress display to a multiline mode.
  --transformations_config TRANSFORMATIONS_CONFIG
                        Use the configuration file with transformations description. File can be specified as
                        relative path from the current directory, as absolute path or as arelative path from the
                        mo root directory
  --use_new_frontend    Force the usage of new Frontend of Model Optimizer for model conversion into IR. The new
                        Frontend is C++ based and is available for ONNX* and PaddlePaddle* models. Model optimizer
                        uses new Frontend for ONNX* and PaddlePaddle* by default that means `--use_new_frontend`
                        and `--use_legacy_frontend` options are not specified.
  --use_legacy_frontend
                        Force the usage of legacy Frontend of Model Optimizer for model conversion into IR. The
                        legacy Frontend is Python based and is available for TensorFlow*, ONNX*, MXNet*, Caffe*,
                        and Kaldi* models.

2) Converting ONNX Models

mo --input_model <INPUT_MODEL>.onnx

3) Converting PaddlePaddle Models

mo --input_model <INPUT_MODEL>.pdmodel
# Example
mo --input_model=yolov3.pdmodel --input=image,im_shape,scale_factor --input_shape=[1,3,608,608],[1,2],[1,2] --reverse_input_channels --output=save_infer_model/scale_0.tmp_1,save_infer_model/scale_1.tmp_1

4) Converting PyTorch Models

Export the PyTorch model to ONNX first, then convert it to OpenVINO IR.

import torch

# Instantiate your model. This is just a regular PyTorch model that will be exported in the following steps.
model = SomeModel()
# Evaluate the model to switch some operations from training mode to inference.
model.eval()
# Create dummy input for the model. It will be used to run the model inside export function.
dummy_input = torch.randn(1, 3, 224, 224)
# Call the export function
torch.onnx.export(model, (dummy_input, ), 'model.onnx')

  • As of PyTorch 1.8.1, not all PyTorch operations can be exported to ONNX opset 9, which is used by default. When exporting with the default opset 9 does not work, it is recommended to export the model to opset 11 or higher, as sketched below.
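
A sketch of the same export targeting a newer opset (only opset_version is added to the call above; SomeModel is the placeholder model from the example):

import torch

model = SomeModel()
model.eval()
dummy_input = torch.randn(1, 3, 224, 224)
# Target opset 11 when an operation cannot be exported with the default opset 9
torch.onnx.export(model, (dummy_input, ), 'model.onnx', opset_version=11)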

5) Converting Caffe Models

mo --input_model <INPUT_MODEL>.caffemodel

# Caffe-specific parameters:
    --input_proto INPUT_PROTO, -d INPUT_PROTO
            Deploy-ready prototxt file that contains the topology
            structure and layer attributes
    --caffe_parser_path CAFFE_PARSER_PATH
            Path to the Python Caffe parser generated from caffe.proto
    -k K    Path to the custom layer mapping file CustomLayersMapping.xml
    --disable_omitting_optional
            Disable omitting optional attributes (used for custom layers). Use this option to transfer all attributes of a custom layer to the IR. The default behavior is to pass only the attributes that have default values and the user-defined attributes to the IR.
    --enable_flattening_nested_params
            Enable flattening of optional parameters (used for custom layers). Use this option to transfer custom layer attributes to the IR with nested parameters flattened. The default behavior is to transfer the attributes without flattening nested parameters.

# Examples
mo --input_model bvlc_alexnet.caffemodel --input_proto bvlc_alexnet.prototxt
    # If the .caffemodel and the .prototxt are in the same directory, specifying --input_model alone is sufficient.
    
mo --input_model bvlc_alexnet.caffemodel -k CustomLayersMapping.xml --disable_omitting_optional --enable_flattening_nested_params

6) Converting TensorFlow Models

  • TensorFlow*-specific parameters

      --input_model_is_text
                            TensorFlow*: treat the input model file as a text protobuf format. If not specified, the
                            Model Optimizer treats it as a binary file by default.
      --input_checkpoint INPUT_CHECKPOINT
                            TensorFlow*: variables file to load.
      --input_meta_graph INPUT_META_GRAPH
                            Tensorflow*: a file with a meta-graph of the model before freezing
      --saved_model_dir SAVED_MODEL_DIR
                            TensorFlow*: directory with a model in SavedModel format of TensorFlow 1.x or 2.x version.
      --saved_model_tags SAVED_MODEL_TAGS
                            Group of tag(s) of the MetaGraphDef to load, in string format, separated by ','. For tag-
                            set contains multiple tags, all tags must be passed in.
      --tensorflow_custom_operations_config_update TENSORFLOW_CUSTOM_OPERATIONS_CONFIG_UPDATE
                            TensorFlow*: update the configuration file with node name patterns with input/output nodes
                            information.
      --tensorflow_use_custom_operations_config TENSORFLOW_USE_CUSTOM_OPERATIONS_CONFIG
                            Use the configuration file with custom operation description.
      --tensorflow_object_detection_api_pipeline_config TENSORFLOW_OBJECT_DETECTION_API_PIPELINE_CONFIG
                            TensorFlow*: path to the pipeline configuration file used to generate model created with
                            help of Object Detection API.
      --tensorboard_logdir TENSORBOARD_LOGDIR
                            TensorFlow*: dump the input graph to a given directory that should be used with
                            TensorBoard.
      --tensorflow_custom_layer_libraries TENSORFLOW_CUSTOM_LAYER_LIBRARIES
                            TensorFlow*: comma separated list of shared libraries with TensorFlow* custom operations
                            implementation.
      --disable_nhwc_to_nchw
                            [DEPRECATED] Disables the default translation from NHWC to NCHW. Since 2022.1 this option
                            is deprecated and used only to maintain backward compatibility with previous releases.
    
  • For TensorFlow 1 models

    # Converting the frozen model format
    mo --input_model <INPUT_MODEL>.pb
    
    # Converting non-frozen model formats
    # 1. Checkpoint format: an inference_graph.pb plus a checkpoint_file.ckpt
    mo --input_model <INFERENCE_GRAPH>.pb --input_checkpoint <INPUT_CHECKPOINT>
    # 2. MetaGraph format: model_name.meta, model_name.index, model_name.data-00000-of-00001, and optionally a checkpoint_file.ckpt
    mo --input_meta_graph <INPUT_META_GRAPH>.meta
    # 3. SavedModel format: a directory with a .pb file and the variables, assets, and assets.extra subdirectories
    mo --saved_model_dir <SAVED_MODEL_DIRECTORY>
    
    • Exporting the frozen model format

      import tensorflow as tf
      from tensorflow.python.framework import graph_io
      frozen = tf.compat.v1.graph_util.convert_variables_to_constants(sess, sess.graph_def, ["name_of_the_output_node"])
      graph_io.write_graph(frozen, './', 'inference_graph.pb', as_text=False)
      
  • For TensorFlow 2 models

    • SavedModel format: a directory containing a .pb file and the variables and assets subdirectories

      mo --saved_model_dir <SAVED_MODEL_DIRECTORY>
      
    • Keras H5 format: serialize the model to the SavedModel format first.

      import tensorflow as tf
      model = tf.keras.models.load_model('model.h5')
      tf.saved_model.save(model,'model')
      

7) Converting MXNet Models

  • MXNet*-specific parameters

    MXNet-specific parameters:
      --input_symbol INPUT_SYMBOL
                            Symbol file (for example, model-symbol.json) that contains a topology structure and layer
                            attributes
      --nd_prefix_name ND_PREFIX_NAME
                            Prefix name for args.nd and argx.nd files.
      --pretrained_model_name PRETRAINED_MODEL_NAME
                            Name of a pretrained MXNet model without extension and epoch number. This model will be
                            merged with args.nd and argx.nd files
      --save_params_from_nd
                            Enable saving built parameters file from .nd files
      --legacy_mxnet_model  Enable MXNet loader to make a model compatible with the latest MXNet version. Use only if
                            your model was trained with MXNet version lower than 1.0.0
      --enable_ssd_gluoncv  Enable pattern matchers replacers for converting gluoncv ssd topologies.
    

3. Post-Training Optimization

https://docs.openvino.ai/latest/pot_compression_cli_README.html

# Basic usage for DefaultQuantization
pot -q default -m <path_to_xml> -w <path_to_bin> --ac-config <path_to_AC_config_yml>

# Basic usage for AccuracyAwareQuantization
pot -q accuracy_aware -m <path_to_xml> -w <path_to_bin> --ac-config <path_to_AC_config_yml> --max-drop 0.01

V. OpenVINO Inference

Integrate OpenVINO™ with Your Application

1. Developing with the openvino.runtime API

1) Synchronous Inference Flow

(figure: synchronous inference flow)
  1. Create a Core object.

    from openvino.runtime import Core, Type, Layout
    core = Core()
    
    # List the available devices (optional)
    devices = core.available_devices
    for device in devices:
        device_name = core.get_property(device, "FULL_DEVICE_NAME")
        print(f"{device}: {device_name}")
    

    CPU: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
    GNA: GNA_SW
    GPU: Intel(R) Iris(R) Xe Graphics [0x9a49] (iGPU)

  2. Read and compile the model.

    # Read the model file; model_path points to an .xml (IR) file or an .onnx file
    model = core.read_model(model_path)
    
    # Inspect the model's input and output information (optional)
    input_layer = model.input(0)
    output_layer = model.output(0)
    print(f"input precision: {input_layer.element_type}")
    print(f"input shape: {input_layer.shape}")
    print(f"output precision: {output_layer.element_type}")
    print(f"output shape: {output_layer.shape}")
    
    # Integrate preprocessing steps into the model (optional)
    # See: "Developing with the openvino.preprocess API" below
        
    # Compile the model for a device: device_name is e.g. 'CPU', 'GPU', or 'AUTO'; config is optional
    compiled_model = core.compile_model(model, device_name, config)
    

    input precision: <Type: 'float32'>
    input shape: {1, 3, 224, 224}

    output precision: <Type: 'float32'>
    output shape: {1, 1001}

  3. Run synchronous inference and obtain the results.

    # Method 1: preprocess in application code; input_data is the model's input
    import cv2
    import numpy as np
    image = cv2.imread(image_path)
    N, C, H, W = input_layer.shape
    resized_image = cv2.resize(src=image, dsize=(W, H))
    input_data = np.expand_dims(np.transpose(resized_image, (2, 0, 1)), 0).astype(np.float32)
    result = compiled_model([input_data])[output_layer]  # blocking inference
    
    
    # Method 2: preprocessing integrated into the model
    image = cv2.imread(image_path)
    # Add N dimension
    input_tensor = np.expand_dims(image, 0)
    results = compiled_model.infer_new_request({0: input_tensor})    # blocking inference
    

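For a classification model like the one above, a minimal post-processing sketch (assuming the 1x1001 output shape shown earlier; `result` comes from Method 1):

    import numpy as np
    probs = np.squeeze(result)        # drop the batch dimension -> shape (1001,)
    top1 = int(np.argmax(probs))      # index of the highest-scoring class
    print(f"top-1 class id: {top1}, score: {probs[top1]:.4f}")
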
2) Asynchronous Inference Flow

(figure: asynchronous inference flow)
  1. Loading the model is the same as above.

  2. Run asynchronous inference and obtain the results.

    from openvino.runtime import AsyncInferQueue, Core, InferRequest, Layout, Type
    
    # Callback invoked whenever an infer request completes
    def completion_callback(infer_request: InferRequest, image_path: str) -> None:
        print(f"{image_path}: done")
    
    # Read input images
    images = [cv2.imread(image_path) for image_path in args.input]
    # Add N dimension (resizing/preprocessing is assumed to be integrated into the model)
    input_tensors = [np.expand_dims(image, 0) for image in images]
    
    # Create an async queue with the optimal number of infer requests
    infer_queue = AsyncInferQueue(compiled_model)
    infer_queue.set_callback(completion_callback)
    
    for i, input_tensor in enumerate(input_tensors):
        # Start asynchronous inference (non-blocking)
        infer_queue.start_async({0: input_tensor}, args.input[i])
    
    # Wait for all requests to finish
    infer_queue.wait_all()
    

    ...
    # Create one infer request to process the current frame
    infer_request_curr = compiled_model.create_infer_request()
    # Create another infer request to process the next frame
    infer_request_next = compiled_model.create_infer_request()
    
    # Get the current frame
    frame_curr = cv2.imread("./data/images/bus.jpg")
    # Preprocess the frame (letterbox is a helper from the YOLO-style sample)
    letterbox_img_curr, _, _ = letterbox(frame_curr, auto=False)
    # Normalization + swap RB + layout from HWC to NCHW; Tensor comes from openvino.runtime
    blob = Tensor(cv2.dnn.blobFromImage(letterbox_img_curr, 1/255.0, swapRB=True))  
    
    # Feed the data into the model's input node
    infer_request_curr.set_tensor(input_node, blob)
    # Call start_async() to launch inference on the current frame without blocking
    infer_request_curr.start_async()
    while True:    
        # Prepare the blob for the next frame's infer request
        # (capturing and preprocessing the next frame is omitted here)
        infer_request_next.set_tensor(input_node, blob)
        # Call start_async() to launch inference on the next frame without blocking
        infer_request_next.start_async()
        
        # Wait for the current frame's infer request to finish
        infer_request_curr.wait()
        # Fetch the current frame's result from output_node
        infer_result = infer_request_curr.get_tensor(output_node)
        # Postprocess the inference result
        data = torch.tensor(infer_result.data)
        
        # Swap the current and next infer requests
        infer_request_curr, infer_request_next = infer_request_next, infer_request_curr

2. Developing with the openvino.preprocess API

Using the OpenVINO preprocessing API

Starting with OpenVINO™ 2022.1, the preprocessing API can integrate all preprocessing steps into the execution graph, so that a dGPU, VPU, or iGPU can preprocess the data itself without relying on the CPU.

1) Typical Data Preprocessing Operations

  1. Reshape the input data: [720, 1280, 3] → [1, 3, 640, 640]
  2. Change the input precision: U8 → f32
  3. Change the color channel order: BGR → RGB
  4. Change the data layout: HWC → NCHW
  5. Normalize the data: subtract the mean, divide by the standard deviation (std)

2) Main Flow of the OpenVINO Preprocessing API

  1. Instantiate a PrePostProcessor object

    from openvino.runtime import Core, Type, Layout
    from openvino.preprocess import PrePostProcessor, ColorFormat, ResizeAlgorithm
    
    core = Core()
    model = core.read_model(model_path)
    ppp = PrePostProcessor(model)
    
  2. Declare the input tensor information

    image = cv2.imread(image_path)
    # Add N dimension
    input_tensor = np.expand_dims(image, 0)  # e.g. input_tensor.shape = [1,640,640,3]
    
    # Image size is given in 'NHWC' order
    ppp.input().tensor() \
        .set_shape([1, 640, 640, 3]) \
        .set_color_format(ColorFormat.BGR) \
        .set_element_type(Type.u8) \
        .set_layout(Layout('NHWC'))  
    
  3. Specify the model's data layout

    # The model input layout is NCHW
    ppp.input().model().set_layout(Layout('NCHW'))
    
    # The model output layout is NHWC (optional)
    ppp.output().model().set_layout(Layout('NHWC'))
    
  4. Declare the output tensor information

    # Output tensor precision is f32; setting the layout is optional
    ppp.output().tensor() \
        .set_element_type(Type.f32) \
        .set_layout(Layout('NHWC'))
    
  5. Define the concrete preprocessing steps

    # Custom preprocessing steps:
    # convert precision to f32, convert BGR to RGB, resize to the model input
    # size (e.g. [1,3,224,224]), normalize, then convert 'NHWC' to 'NCHW'
    ppp.input().preprocess() \
        .convert_element_type(Type.f32) \
        .convert_color(ColorFormat.RGB) \
        .resize(ResizeAlgorithm.RESIZE_LINEAR, 224, 224) \
        .mean([0.0, 0.0, 0.0]) \
        .scale([255.0, 255.0, 255.0]) \
        .convert_layout([0, 3, 1, 2])
    
  6. Define the postprocessing steps (optional)

    # Convert the output precision to f32 and the layout from 'NHWC' to 'NCHW'
    ppp.output().postprocess() \
        .convert_element_type(Type.f32) \
        .convert_layout([0, 3, 1, 2])
    
  7. Build the preprocessing steps into the model

    model = ppp.build()
    


  8. Export the model with the integrated preprocessing steps (optional)

    from openvino.offline_transformations import serialize
    serialize(model, 'xxx.xml', 'xxx.bin')
    

3. The Auto-Device and Automatic Batching Plugins

Best practices for the AUTO plugin and automatic batching in OpenVINO™ 2022.1

1) Auto-Device

AUTO Device (short for Automatic Device Selection) is a virtual plugin built on top of the CPU/GPU plugins. It is not bound to a specific type of device; it can target any supported CPU, GPU, VPU (Vision Processing Unit), or GNA (Gaussian Neural Accelerator coprocessor), or a combination of these devices.

(figure: AUTO plugin device selection)

Advantages:

  • Uses the deep learning model and the characteristics of the selected devices in their optimal configuration.
  • Faster first-inference latency on GPU: the GPU plugin must compile the model online at runtime before inference can begin. When a discrete or integrated GPU is selected, the AUTO plugin first runs inference on the CPU to hide this GPU model-compilation time.
  • Simple to use: developers only need to set the device_name parameter of compile_model() to "AUTO".

Device selection logic:

  • The AUTO plugin selects the best compute device according to the priority dGPU > iGPU > VPU > CPU. When it selects a GPU as the best device, the inference device is switched over to it to hide the first-inference latency, as sketched below.
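
A minimal sketch of using AUTO (assuming `model` was already read with core.read_model(); the explicit candidate list and the performance hint are optional):

from openvino.runtime import Core

core = Core()
# AUTO picks the device; "AUTO:GPU,CPU" restricts and orders the candidates
compiled_model = core.compile_model(model, "AUTO:GPU,CPU",
                                    config={"PERFORMANCE_HINT": "LATENCY"})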

Precisions supported by different devices

  • dGPU (e.g. Intel® Iris® Xe MAX): FP32, FP16, INT8, BIN
  • iGPU (e.g. Intel® UHD Graphics 620 (iGPU)): FP32, FP16, BIN
  • Intel® Movidius™ Myriad™ X VPU (e.g. Intel® Neural Compute Stick 2 (Intel® NCS2)): FP16
  • Intel® CPU (e.g. Intel® Core™ i7-1165G7): FP32, FP16, INT8, BIN

2)Automatic Batching

Automatic Batching combines multiple asynchronous inference requests issued by the application, executes them as batched inference, and splits the batch results back out to the individual requests.

When the config parameter of compile_model() is set to {"PERFORMANCE_HINT": "THROUGHPUT"}, the OpenVINO™ Runtime enables automatic batching automatically.

  • PERFORMANCE_HINT = THROUGHPUT: non-real-time, high-volume inference workloads; Auto Batching is enabled.
  • PERFORMANCE_HINT = LATENCY: real-time or near-real-time applications; Auto Batching is not enabled.

compiled_model = core.compile_model(model="xxx.onnx", device_name="AUTO", \
                                   config={"PERFORMANCE_HINT": "THROUGHPUT", 'ALLOW_AUTO_BATCHING': 'YES'})
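
Automatic batching can also be requested explicitly through the virtual BATCH device (a sketch; the batch size of 16 is an arbitrary example):

# Explicitly batch requests on the GPU with a batch size of 16
compiled_model = core.compile_model(model, "BATCH:GPU(16)")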

4. C++ Inference Example

#include <iterator>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// clang-format off
#include "openvino/openvino.hpp"

#include "samples/args_helper.hpp"
#include "samples/common.hpp"
#include "samples/classification_results.h"
#include "samples/slog.hpp"
#include "format_reader_ptr.h"
// clang-format on

/**
 * @brief Main with support Unicode paths, wide strings
 */
int tmain(int argc, tchar* argv[]) {
    try {
        // -------- Step 1. Initialize OpenVINO Runtime Core --------
        ov::Core core;
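
        // Note: model_path, image_path and device_name are assumed to come from
        // command-line argument parsing, which is omitted in this excerpt.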

        // -------- Step 2. Read a model --------
        std::shared_ptr<ov::Model> model = core.read_model(model_path);
        printInputAndOutputsInfo(*model);

        // -------- Step 3. Set up input

        // Read input image to a tensor and set it to an infer request
        // without resize and layout conversions
        FormatReader::ReaderPtr reader(image_path.c_str());
        if (reader.get() == nullptr) {
            std::stringstream ss;
            ss << "Image " + image_path + " cannot be read!";
            throw std::logic_error(ss.str());
        }

        ov::element::Type input_type = ov::element::u8;
        ov::Shape input_shape = {1, reader->height(), reader->width(), 3};
        std::shared_ptr<unsigned char> input_data = reader->getData();

        // just wrap image data by ov::Tensor without allocating of new memory
        ov::Tensor input_tensor = ov::Tensor(input_type, input_shape, input_data.get());

        const ov::Layout tensor_layout{"NHWC"};

        // -------- Step 4. Configure preprocessing --------

        ov::preprocess::PrePostProcessor ppp(model);

        // 1) Set input tensor information:
        // - input() provides information about a single model input
        // - reuse precision and shape from already available `input_tensor`
        // - layout of data is 'NHWC'
        ppp.input().tensor().set_shape(input_shape).set_element_type(input_type).set_layout(tensor_layout);
        // 2) Adding explicit preprocessing steps:
        // - convert layout to 'NCHW' (from 'NHWC' specified above at tensor layout)
        // - apply linear resize from tensor spatial dims to model spatial dims
        ppp.input().preprocess().resize(ov::preprocess::ResizeAlgorithm::RESIZE_LINEAR);
        // 3) Here we suppose model has 'NCHW' layout for input
        ppp.input().model().set_layout("NCHW");
        // 4) Set output tensor information:
        // - precision of tensor is supposed to be 'f32'
        ppp.output().tensor().set_element_type(ov::element::f32);

        // 5) Apply preprocessing modifying the original 'model'
        model = ppp.build();

        // -------- Step 5. Loading a model to the device --------
        ov::CompiledModel compiled_model = core.compile_model(model, device_name);

        // -------- Step 6. Create an infer request --------
        ov::InferRequest infer_request = compiled_model.create_infer_request();
        // -----------------------------------------------------------------------------------------------------

        // -------- Step 7. Prepare input --------
        infer_request.set_input_tensor(input_tensor);

        // -------- Step 8. Do inference synchronously --------
        infer_request.infer();

        // -------- Step 9. Process output
        const ov::Tensor& output_tensor = infer_request.get_output_tensor();

        // Print classification results
        ClassificationResult classification_result(output_tensor, {image_path});
        classification_result.show();
        // -----------------------------------------------------------------------------------------------------
    } catch (const std::exception& ex) {
        std::cerr << ex.what() << std::endl;
        return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;
}

Inference Modes

  • Automatic device selection (AUTO): detects the available devices, selects the one best suited to the task, and configures its optimization settings. This makes it possible to write an application once and deploy it anywhere.

    • Inference starts on the CPU while the model is loaded onto the device best suited for the purpose, and the task is transferred to that device once it is ready.

      • Starting on the CPU reduces first-inference latency.
  • Multi-device execution (MULTI): runs inference requests on several devices in parallel.

  • Heterogeneous execution (HETERO): allows the inference of one model to be executed across multiple devices.

  • Automatic batching (Auto-batching): improves device utilization by grouping inference requests together, without any programming effort from the user. See the sketch below.
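
A minimal sketch of selecting these modes through device strings (assuming `core` and `model` as in the Python examples above):

# AUTO with an explicit, ordered candidate list
compiled_auto = core.compile_model(model, "AUTO:GPU,CPU")
# MULTI: run requests on GPU and CPU in parallel
compiled_multi = core.compile_model(model, "MULTI:GPU,CPU")
# HETERO: run supported layers on GPU, fall back to CPU for the rest
compiled_hetero = core.compile_model(model, "HETERO:GPU,CPU")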
