web-dev-qa-db-ja.com

Tensorflow 2.1で変換した後、TensorをロードできませんRT SavedModel

Tensorflow 2に実装されたYOLOv3モデル をTensor RT= NVIDIA Webサイトのチュートリアル( https:// docs .nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#worflow-with-savedmodel )。

SavedModelアプローチを使用して変換を行い、元のモデルをFP16に変換し、その結果を新しいSavedModelとして保存することに成功しました。変換が行われたのと同じプロセス内で新しいSavedModelが読み込まれると、正しく読み込まれ、画像に対して推論を実行できますが、FP16で保存されたモデルを次に読み込むときに問題が発生します。新しいプロセス。これを実行しようとすると、次のエラーが発生します。

2020-04-01 10:39:42.428094: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-04-01 10:39:42.447415: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
Coco names not found, class labels will be empty
2020-04-01 10:39:53.892453: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-04-01 10:39:53.920870: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: TITAN Xp computeCapability: 6.1
coreClock: 1.582GHz coreCount: 30 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 510.07GiB/s
2020-04-01 10:39:53.920915: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-04-01 10:39:53.920950: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-04-01 10:39:53.937043: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-04-01 10:39:53.941012: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-04-01 10:39:53.972250: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-04-01 10:39:53.976883: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-04-01 10:39:53.976919: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-04-01 10:39:53.978525: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-04-01 10:39:53.978833: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-04-01 10:39:54.112532: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2999115000 Hz
2020-04-01 10:39:54.114178: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55f3a70 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-04-01 10:39:54.114208: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-04-01 10:39:54.219842: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x555e230 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-04-01 10:39:54.219872: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): TITAN Xp, Compute Capability 6.1
2020-04-01 10:39:54.220896: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: TITAN Xp computeCapability: 6.1
coreClock: 1.582GHz coreCount: 30 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 510.07GiB/s
2020-04-01 10:39:54.220936: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-04-01 10:39:54.220948: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-04-01 10:39:54.220981: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-04-01 10:39:54.220998: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-04-01 10:39:54.221013: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-04-01 10:39:54.221029: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-04-01 10:39:54.221039: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-04-01 10:39:54.222281: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-04-01 10:39:54.232890: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-04-01 10:39:54.636732: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 Edge matrix:
2020-04-01 10:39:54.636779: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0 
2020-04-01 10:39:54.636786: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N 
2020-04-01 10:39:54.638840: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11240 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-04-01 10:40:26.366595: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-04-01 10:40:31.509694: E tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:42] DefaultLogger INVALID_ARGUMENT: getPluginCreator could not find plugin BatchedNMS_TRT version 1
2020-04-01 10:40:31.509767: E tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:42] DefaultLogger safeDeserializationUtils.cpp (259) - Serialization Error in load: 0 (Cannot deserialize plugin since corresponding IPluginCreator not found in Plugin Registry)
2020-04-01 10:40:31.513205: E tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:42] DefaultLogger INVALID_STATE: std::exception
2020-04-01 10:40:31.513262: E tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:42] DefaultLogger INVALID_CONFIG: Deserialize the cuda engine failed.
Segmentation fault (core dumped)

この問題の原因は不明であり、この問題を引き起こす唯一のスレッドはnvidia devフォーラムにあり、答えを提供していません。 ( https://forums.developer.nvidia.com/t/getplugincreator-could-not-find-plugin-batchednms-trt-version-1/84205/

したがって、私の質問は次のとおりです。読み込みコードが変換コードとは異なるプロセスで実行されたときにSavedModelが読み込まれないのはなぜですか?そして、毎回TensorRT以外のモデルから変換する必要なしに、Tensor RT=モデルをどのようにロードできますか?

これは、モデルの変換に使用されたコードと、変換されたモデルが同じプロセスで読み込まれたときに推論出力です。

コード

import os
from os.path import join as pjoin

import tensorflow as tf
import numpy as np
from tensorflow.python.framework import graph_io
from tensorflow.keras.models import load_model
from tensorflow.python.compiler.tensorrt import trt_convert as trt
from tensorflow.python.framework import convert_to_constants

from caipy_services_backend.models import Yolov3
from caipy_services_backend.models.yolov3.utils import freeze_all

# Clear any previous session.
tf.keras.backend.clear_session()


def my_input_fn():
    for _ in range(1):
        inp1 = np.random.normal(size=(1, 416, 416, 3)).astype(np.float32)
        # inp2 = np.random.normal(size=(8, 16, 16, 3)).astype(np.float32)
        yield [inp1]


def convert_saved_model_and_reload(input_saved_model_dir, output_saved_model_dir):
    conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS
    conversion_params = conversion_params._replace(
        max_workspace_size_bytes=(1 << 32))
    conversion_params = conversion_params._replace(precision_mode="FP16")
    conversion_params = conversion_params._replace(
        maximum_cached_engines=100)

    converter = tf.experimental.tensorrt.Converter(
        input_saved_model_dir=input_saved_model_dir, conversion_params=conversion_params)
    converter.convert()

    converter.build(input_fn=my_input_fn)
    converter.save(output_saved_model_dir)

    saved_model_loaded = tf.saved_model.load(
        output_saved_model_dir, tags=["serve"])
    graph_func = saved_model_loaded.signatures["serving_default"]
    frozen_func = convert_to_constants.convert_variables_to_constants_v2(
        graph_func)
    input_data = tf.convert_to_tensor(np.random.normal(size=(1, 416, 416, 3)).astype(np.float32))
    output = frozen_func(input_data)[0].numpy()
    print(output)

出力

[[[0. 0. 1. 1.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]]
WARNING:tensorflow:Unresolved object in checkpoint: (root).trt_engine_resources.TRTEngineOp_3._serialized_trt_resource_filename
WARNING:tensorflow:Unresolved object in checkpoint: (root).trt_engine_resources.TRTEngineOp_4._serialized_trt_resource_filename
WARNING:tensorflow:Unresolved object in checkpoint: (root).trt_engine_resources.TRTEngineOp_5._serialized_trt_resource_filename
WARNING:tensorflow:Unresolved object in checkpoint: (root).trt_engine_resources.TRTEngineOp_0._serialized_trt_resource_filename
WARNING:tensorflow:Unresolved object in checkpoint: (root).trt_engine_resources.TRTEngineOp_7._serialized_trt_resource_filename
WARNING:tensorflow:Unresolved object in checkpoint: (root).trt_engine_resources.TRTEngineOp_1._serialized_trt_resource_filename
WARNING:tensorflow:Unresolved object in checkpoint: (root).trt_engine_resources.TRTEngineOp_2._serialized_trt_resource_filename
WARNING:tensorflow:Unresolved object in checkpoint: (root).trt_engine_resources.TRTEngineOp_6._serialized_trt_resource_filename
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.

そして、これがエラーを引き起こしているコードです

def load_tensor_rt_model(saved_model_dir):
    saved_model_loaded = tf.saved_model.load(
        saved_model_dir, tags=["serve"])
    graph_func = saved_model_loaded.signatures["serving_default"]
    frozen_func = convert_to_constants.convert_variables_to_constants_v2(
        graph_func)
    input_data = tf.convert_to_tensor(np.random.normal(size=(1, 416, 416, 3)).astype(np.float32))
    output = frozen_func(input_data)[0].numpy()
    print(output)

この問題について何か助けていただければ幸いです。

更新:この質問で説明されている問題は、converter.build()の使用が原因で発生しています。変換をビルドせずに保存すると、問題なくロードできます。しかし、なぜビルドがこの問題を引き起こしているのかはまだわかりません。

コンピューターのスペック:

  • GPU:NVIDIA TitanXp
  • OS:Ubuntu 18.04

パッケージのバージョン:

  • NVIDIAグラフィックスドライバー:440.59
  • CUDA:10.1.243-1 AMD64
  • CUDNN:7.6.5.32-1 + cuda10.1
  • libnvinfer-dev:6.0.1-1 + cuda10.1
  • libnvinfer-plugin-dev:6.0.1-1 + cuda10.1
  • python:3.6.9
  • tensorflow:2.1.0
3
anaBad

これは、保存されたエンジンを使用して推論するときにlibnvinfer_plugin.so。*がロードされないために発生することがわかりました(convert.build()が使用されるときにロードされて使用されると思います)。

推論関数の開始時にtrt.init_libnvinfer_plugins(None,'')(tensorrtをtrtとしてインポート)を使用してプラグインの初期化を強制しましたが、これによりこの特定のエラーが解決されました。

1
Varun Sapre