MediaPipe Getting-Started Notes
0. Overview
1. Bazel Installation
1.1 Version Requirements
- MediaPipe requires Bazel version 6.5.0; download it from the Bazel 6.5.0 release page (https://github.com/bazelbuild/bazel/releases/tag/6.5.0).
1.2 Installing Bazel
# 1. Linux
>> chmod +x bazel-6.5.0-installer-linux-x86_64.sh
>> sudo ./bazel-6.5.0-installer-linux-x86_64.sh
# 2. Windows
## 2.1 Rename bazel-6.5.0-windows-x86_64.exe to bazel.exe
## 2.2 Put bazel.exe in a directory of your choice
## 2.3 Add that directory to the Windows PATH environment variable
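##     For example, from cmd (C:\tools\bazel is a hypothetical location; editing PATH in the System Properties dialog works just as well):
>> setx PATH "%PATH%;C:\tools\bazel"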
# 3. Check the version
>> bazel --version
bazel 6.5.0
2. Building MediaPipe
2.1 Downloading the Source
>> git clone --recursive git@github.com:google-ai-edge/mediapipe.git
2.2 Installing Dependencies
- (1) OpenCV: there are plenty of installation guides online, so I won't repeat them here. Either 3.x or 4.x works, since MediaPipe adapts to both internally, but you do need to point MediaPipe's OpenCV configuration at your install by editing two files: WORKSPACE and third_party/opencv_linux.BUILD. I have OpenCV 3 installed. In WORKSPACE:
new_local_repository(
name = "linux_opencv",
build_file = "@//third_party:opencv_linux.BUILD",
path = "/usr/local",
)
And in third_party/opencv_linux.BUILD:
cc_library(
name = "opencv",
srcs = [
"lib/libopencv_core.so",
"lib/libopencv_calib3d.so",
"lib/libopencv_features2d.so",
"lib/libopencv_highgui.so",
"lib/libopencv_imgcodecs.so",
"lib/libopencv_imgproc.so",
"lib/libopencv_video.so",
"lib/libopencv_videoio.so",
],
hdrs = glob([
"include/opencv2/**/*.h*",
"include/opencv/**/*.h*",
]),
includes = [
"include/",
],
visibility = ["//visibility:public"],
)
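The srcs and hdrs paths above are resolved relative to the path given in new_local_repository (/usr/local here). Before building, a quick sanity check that the files are really where the BUILD file says (assuming an OpenCV 3 install under /usr/local):
>> ls /usr/local/lib/libopencv_core.so* /usr/local/include/opencv2/core.hpp
If ls finds nothing, adjust the paths in opencv_linux.BUILD to match your layout (for example, some distros put the libraries under lib/x86_64-linux-gnu/).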
- (2) FFmpeg installation:
>> sudo apt-get update
>> sudo apt-get install ffmpeg
>> ffmpeg -version
ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil 56. 31.100 / 56. 31.100
libavcodec 58. 54.100 / 58. 54.100
libavformat 58. 29.100 / 58. 29.100
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100
2.3 Building the Source
>> cd mediapipe
>> export GLOG_logtostderr=1
>> bazel run --define MEDIAPIPE_DISABLE_GPU=1 \
mediapipe/examples/desktop/hello_world:hello_world
If you are on Ubuntu 20.04 and your Python version is 3.8.x, you also need to modify the WORKSPACE file:
python_init_repositories(
default_python_version = "system",
local_wheel_dist_folder = "dist",
local_wheel_inclusion_list = ["mediapipe*"],
local_wheel_workspaces = ["//:WORKSPACE"],
requirements = {
"3.8": "//:requirements_lock_3_8.txt",
"3.9": "//:requirements_lock_3_9.txt",
"3.10": "//:requirements_lock_3_10.txt",
"3.11": "//:requirements_lock_3_11.txt",
"3.12": "//:requirements_lock_3_12.txt",
},
)
Here requirements_lock_3_8.txt is nothing special: just copy requirements_lock.txt and rename it (it is really an ordinary requirements.txt file).
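A one-liner does it (run from the repo root):
>> cp requirements_lock.txt requirements_lock_3_8.txt
Then build again: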
>> bazel run --define MEDIAPIPE_DISABLE_GPU=1 \
mediapipe/examples/desktop/hello_world:hello_world
WARNING: /home/mirror/workspace/mediapipe/mediapipe/framework/BUILD:70:24: in cc_library rule //mediapipe/framework:calculator_cc_proto: target '//mediapipe/framework:calculator_cc_proto' depends on deprecated target '@com_google_protobuf//:cc_wkt_protos': Only for backward compatibility. Do not use.
WARNING: /home/mirror/workspace/mediapipe/mediapipe/framework/tool/BUILD:204:24: in cc_library rule //mediapipe/framework/tool:field_data_cc_proto: target '//mediapipe/framework/tool:field_data_cc_proto' depends on deprecated target '@com_google_protobuf//:cc_wkt_protos': Only for backward compatibility. Do not use.
INFO: Analyzed target //mediapipe/examples/desktop/hello_world:hello_world (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //mediapipe/examples/desktop/hello_world:hello_world up-to-date:
bazel-bin/mediapipe/examples/desktop/hello_world/hello_world
INFO: Elapsed time: 0.119s, Critical Path: 0.01s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/mediapipe/examples/desktop/hello_world/hello_world
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1733825056.505418 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505500 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505512 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505518 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505539 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505562 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505572 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505579 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505597 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505655 15716 hello_world.cc:58] Hello World!
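For orientation on what just ran: hello_world is a short C++ file that builds a trivial two-node graph and pushes ten string packets through it, which is why "Hello World!" is logged ten times. The sketch below is a condensed paraphrase of mediapipe/examples/desktop/hello_world/hello_world.cc; the names match the upstream example to the best of my knowledge, but treat it as illustrative rather than an exact copy:
#include <string>

#include "mediapipe/framework/calculator_graph.h"
#include "mediapipe/framework/port/logging.h"
#include "mediapipe/framework/port/parse_text_proto.h"
#include "mediapipe/framework/port/status.h"

absl::Status PrintHelloWorld() {
  // A trivial graph: two chained PassThroughCalculators from "in" to "out".
  mediapipe::CalculatorGraphConfig config =
      mediapipe::ParseTextProtoOrDie<mediapipe::CalculatorGraphConfig>(R"pb(
        input_stream: "in"
        output_stream: "out"
        node {
          calculator: "PassThroughCalculator"
          input_stream: "in"
          output_stream: "out1"
        }
        node {
          calculator: "PassThroughCalculator"
          input_stream: "out1"
          output_stream: "out"
        }
      )pb");

  mediapipe::CalculatorGraph graph;
  MP_RETURN_IF_ERROR(graph.Initialize(config));
  MP_ASSIGN_OR_RETURN(mediapipe::OutputStreamPoller poller,
                      graph.AddOutputStreamPoller("out"));
  MP_RETURN_IF_ERROR(graph.StartRun({}));

  // Feed ten "Hello World!" packets with increasing timestamps.
  for (int i = 0; i < 10; ++i) {
    MP_RETURN_IF_ERROR(graph.AddPacketToInputStream(
        "in", mediapipe::MakePacket<std::string>("Hello World!")
                  .At(mediapipe::Timestamp(i))));
  }
  MP_RETURN_IF_ERROR(graph.CloseInputStream("in"));

  // Drain the output stream; each packet comes back unchanged and is logged.
  mediapipe::Packet packet;
  while (poller.Next(&packet)) {
    LOG(INFO) << packet.Get<std::string>();
  }
  return graph.WaitUntilDone();
}

int main(int argc, char** argv) {
  google::InitGoogleLogging(argv[0]);
  CHECK(PrintHelloWorld().ok());
  return 0;
}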
3. Building MediaPipe Examples
3.1 Build Flags
Because MediaPipe runs its inference through XNNPack, a few XNNPack options must be disabled manually or the examples will not compile:
--define MEDIAPIPE_DISABLE_GPU=1 --define xnn_enable_avxvnni=false \
--define xnn_enable_avx512amx=false --define xnn_enable_avxvnniint8=false \
--define xnn_enable_avx512fp16=false
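Retyping these for every build gets tedious; one option (my own habit, not something the MediaPipe docs prescribe) is to append them to the .bazelrc at the repo root, so that every bazel build/run picks them up automatically:
build --define=MEDIAPIPE_DISABLE_GPU=1
build --define=xnn_enable_avxvnni=false
build --define=xnn_enable_avx512amx=false
build --define=xnn_enable_avxvnniint8=false
build --define=xnn_enable_avx512fp16=false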
3.2 Face Detection Example
# 1. Build the example
>> bazel build --define MEDIAPIPE_DISABLE_GPU=1 --define xnn_enable_avxvnni=false \
--define xnn_enable_avx512amx=false --define xnn_enable_avxvnniint8=false \
--define xnn_enable_avx512fp16=false \
mediapipe/examples/desktop/face_detection:face_detection_cpu
WARNING: /home/mirror/workspace/mediapipe/mediapipe/framework/BUILD:70:24: in cc_library rule //mediapipe/framework:calculator_cc_proto: target '//mediapipe/framework:calculator_cc_proto' depends on deprecated target '@com_google_protobuf//:cc_wkt_protos': Only for backward compatibility. Do not use.
WARNING: /home/mirror/workspace/mediapipe/mediapipe/framework/tool/BUILD:204:24: in cc_library rule //mediapipe/framework/tool:field_data_cc_proto: target '//mediapipe/framework/tool:field_data_cc_proto' depends on deprecated target '@com_google_protobuf//:cc_wkt_protos': Only for backward compatibility. Do not use.
INFO: Analyzed target //mediapipe/examples/desktop/face_detection:face_detection_cpu (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //mediapipe/examples/desktop/face_detection:face_detection_cpu up-to-date:
bazel-bin/mediapipe/examples/desktop/face_detection/face_detection_cpu
INFO: Elapsed time: 0.119s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
# 2. Run the example
>> ./bazel-bin/mediapipe/examples/desktop/face_detection/face_detection_cpu \
--calculator_graph_config_file=mediapipe/graphs/face_detection/face_detection_desktop_live.pbtxt \
--input_video_path /home/mirror/workspace/dataset/video/video_human_0.mp4
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1733831727.250858 72587 demo_run_graph_main.cc:50] Get calculator graph config contents: # MediaPipe graph that performs face mesh with TensorFlow Lite on CPU.
# CPU buffer. (ImageFrame)
input_stream: "input_video"
# Output image with rendered results. (ImageFrame)
output_stream: "output_video"
# Detected faces. (std::vector<Detection>)
output_stream: "face_detections"
# Throttles the images flowing downstream for flow control. It passes through
# the very first incoming image unaltered, and waits for downstream nodes
# (calculators and subgraphs) in the graph to finish their tasks before it
# passes through another image. All images that come in while waiting are
# dropped, limiting the number of in-flight images in most part of the graph to
# 1. This prevents the downstream nodes from queuing up incoming images and data
# excessively, which leads to increased latency and memory usage, unwanted in
# real-time mobile applications. It also eliminates unnecessarily computation,
# e.g., the output produced by a node may get dropped downstream if the
# subsequent nodes are still busy processing previous inputs.
node {
calculator: "FlowLimiterCalculator"
input_stream: "input_video"
input_stream: "FINISHED:output_video"
input_stream_info: {
tag_index: "FINISHED"
back_edge: true
}
output_stream: "throttled_input_video"
}
# Subgraph that detects faces.
node {
calculator: "FaceDetectionShortRangeCpu"
input_stream: "IMAGE:throttled_input_video"
output_stream: "DETECTIONS:face_detections"
}
# Converts the detections to drawing primitives for annotation overlay.
node {
calculator: "DetectionsToRenderDataCalculator"
input_stream: "DETECTIONS:face_detections"
output_stream: "RENDER_DATA:render_data"
node_options: {
[type.googleapis.com/mediapipe.DetectionsToRenderDataCalculatorOptions] {
thickness: 4.0
color { r: 255 g: 0 b: 0 }
}
}
}
# Draws annotations and overlays them on top of the input images.
node {
calculator: "AnnotationOverlayCalculator"
input_stream: "IMAGE:throttled_input_video"
input_stream: "render_data"
output_stream: "IMAGE:output_video"
}
I0000 00:00:1733831727.252413 72587 demo_run_graph_main.cc:56] Initialize the calculator graph.
I0000 00:00:1733831727.259628 72587 demo_run_graph_main.cc:60] Initialize the camera or load the video.
I0000 00:00:1733831727.319697 72587 demo_run_graph_main.cc:81] Start running the calculator graph.
I0000 00:00:1733831727.320504 72587 demo_run_graph_main.cc:86] Start grabbing and processing frames.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
VERBOSE: XNNPack weight cache not enabled.
INFO: Initialized TensorFlow Lite runtime.
VERBOSE: Replacing 164 out of 164 node(s) with delegate (TfLiteXNNPackDelegate) node, yielding 1 partitions for the whole graph.
W0000 00:00:1733831727.336902 72591 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
I0000 00:00:1733831733.312298 72587 demo_run_graph_main.cc:97] Empty frame, end of video reached.
I0000 00:00:1733831733.312374 72587 demo_run_graph_main.cc:145] Shutting down.
I0000 00:00:1733831733.316855 72587 demo_run_graph_main.cc:159] Success!
The annotated result looks like this:
3.3 Building Other Examples
All the other examples work much the same way: just change the final line of the build command, mediapipe/examples/desktop/face_detection:face_detection_cpu, to the target you want to build. For example, to build the face_mesh task:
# 1. Build
>> bazel build --define MEDIAPIPE_DISABLE_GPU=1 --define xnn_enable_avxvnni=false \
--define xnn_enable_avx512amx=false --define xnn_enable_avxvnniint8=false \
--define xnn_enable_avx512fp16=false mediapipe/examples/desktop/face_mesh:face_mesh_cpu
WARNING: /home/mirror/workspace/mediapipe/mediapipe/framework/BUILD:70:24: in cc_library rule //mediapipe/framework:calculator_cc_proto: target '//mediapipe/framework:calculator_cc_proto' depends on deprecated target '@com_google_protobuf//:cc_wkt_protos': Only for backward compatibility. Do not use.
WARNING: /home/mirror/workspace/mediapipe/mediapipe/framework/tool/BUILD:204:24: in cc_library rule //mediapipe/framework/tool:field_data_cc_proto: target '//mediapipe/framework/tool:field_data_cc_proto' depends on deprecated target '@com_google_protobuf//:cc_wkt_protos': Only for backward compatibility. Do not use.
INFO: Analyzed target //mediapipe/examples/desktop/face_mesh:face_mesh_cpu (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //mediapipe/examples/desktop/face_mesh:face_mesh_cpu up-to-date:
bazel-bin/mediapipe/examples/desktop/face_mesh/face_mesh_cpu
INFO: Elapsed time: 2.033s, Critical Path: 0.03s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
# 2. Run
>> ./bazel-bin/mediapipe/examples/desktop/face_mesh/face_mesh_cpu \
--calculator_graph_config_file=mediapipe/graphs/face_mesh/face_mesh_desktop_live.pbtxt \
--input_video_path /home/mirror/workspace/dataset/video/video_human_0.mp4
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1733833769.769169 82988 demo_run_graph_main.cc:50] Get calculator graph config contents: # MediaPipe graph that performs face mesh with TensorFlow Lite on CPU.
# Input image. (ImageFrame)
input_stream: "input_video"
# Output image with rendered results. (ImageFrame)
output_stream: "output_video"
# Collection of detected/processed faces, each represented as a list of
# landmarks. (std::vector<NormalizedLandmarkList>)
output_stream: "multi_face_landmarks"
# Throttles the images flowing downstream for flow control. It passes through
# the very first incoming image unaltered, and waits for downstream nodes
# (calculators and subgraphs) in the graph to finish their tasks before it
# passes through another image. All images that come in while waiting are
# dropped, limiting the number of in-flight images in most part of the graph to
# 1. This prevents the downstream nodes from queuing up incoming images and data
# excessively, which leads to increased latency and memory usage, unwanted in
# real-time mobile applications. It also eliminates unnecessarily computation,
# e.g., the output produced by a node may get dropped downstream if the
# subsequent nodes are still busy processing previous inputs.
node {
calculator: "FlowLimiterCalculator"
input_stream: "input_video"
input_stream: "FINISHED:output_video"
input_stream_info: {
tag_index: "FINISHED"
back_edge: true
}
output_stream: "throttled_input_video"
}
# Defines side packets for further use in the graph.
node {
calculator: "ConstantSidePacketCalculator"
output_side_packet: "PACKET:0:num_faces"
output_side_packet: "PACKET:1:with_attention"
node_options: {
[type.googleapis.com/mediapipe.ConstantSidePacketCalculatorOptions]: {
packet { int_value: 1 }
packet { bool_value: true }
}
}
}
# Subgraph that detects faces and corresponding landmarks.
node {
calculator: "FaceLandmarkFrontCpu"
input_stream: "IMAGE:throttled_input_video"
input_side_packet: "NUM_FACES:num_faces"
input_side_packet: "WITH_ATTENTION:with_attention"
output_stream: "LANDMARKS:multi_face_landmarks"
output_stream: "ROIS_FROM_LANDMARKS:face_rects_from_landmarks"
output_stream: "DETECTIONS:face_detections"
output_stream: "ROIS_FROM_DETECTIONS:face_rects_from_detections"
}
# Subgraph that renders face-landmark annotation onto the input image.
node {
calculator: "FaceRendererCpu"
input_stream: "IMAGE:throttled_input_video"
input_stream: "LANDMARKS:multi_face_landmarks"
input_stream: "NORM_RECTS:face_rects_from_landmarks"
input_stream: "DETECTIONS:face_detections"
output_stream: "IMAGE:output_video"
}
I0000 00:00:1733833769.770587 82988 demo_run_graph_main.cc:56] Initialize the calculator graph.
I0000 00:00:1733833769.784603 82988 demo_run_graph_main.cc:60] Initialize the camera or load the video.
I0000 00:00:1733833769.799798 82988 demo_run_graph_main.cc:81] Start running the calculator graph.
I0000 00:00:1733833769.801796 82988 demo_run_graph_main.cc:86] Start grabbing and processing frames.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
VERBOSE: XNNPack weight cache not enabled.
INFO: Initialized TensorFlow Lite runtime.
VERBOSE: Replacing 164 out of 164 node(s) with delegate (TfLiteXNNPackDelegate) node, yielding 1 partitions for the whole graph.
W0000 00:00:1733833769.816623 82991 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
VERBOSE: XNNPack weight cache not enabled.
VERBOSE: Replacing 700 out of 712 node(s) with delegate (TfLiteXNNPackDelegate) node, yielding 5 partitions for the whole graph.
W0000 00:00:1733833769.871122 82992 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
W0000 00:00:1733833769.888540 82991 landmark_projection_calculator.cc:186] Using NORM_RECT without IMAGE_DIMENSIONS is only supported for the square ROI. Provide IMAGE_DIMENSIONS or use PROJECTION_MATRIX.
I0000 00:00:1733833778.153743 82988 demo_run_graph_main.cc:97] Empty frame, end of video reached.
I0000 00:00:1733833778.153824 82988 demo_run_graph_main.cc:145] Shutting down.
I0000 00:00:1733833778.160292 82988 demo_run_graph_main.cc:159] Success!
Note that you must download the face detection model yourself, rename it to face_detection_short_range.tflite, and place it under mediapipe/modules/face_detection, or the run will fail.
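At the time of writing the model could be fetched from the mediapipe-assets bucket; treat the URL below as an assumption and check the official model card if it has moved:
>> wget -O mediapipe/modules/face_detection/face_detection_short_range.tflite \
   https://storage.googleapis.com/mediapipe-assets/face_detection_short_range.tflite
The corresponding result looks like this: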
4. References
- [1] XNNPACK: https://github.com/google/XNNPACK
- [2] MediaPipe: https://github.com/google-ai-edge/mediapipe
- [3] Issues · google-ai-edge/mediapipe: https://github.com/google-ai-edge/mediapipe/issues