MediaPipe Getting-Started Notes
0. Overview
1. Bazel Installation
1.1 Version Requirements
- MediaPipe requires Bazel version 6.5.0; download it from the Bazel 6.5.0 release page (https://github.com/bazelbuild/bazel/releases/tag/6.5.0).
1.2 Installing Bazel
# 1. Linux
>> chmod +x bazel-6.5.0-installer-linux-x86_64.sh
>> sudo ./bazel-6.5.0-installer-linux-x86_64.sh
# 2. Windows
## 2.1 Rename bazel-6.5.0-windows-x86_64.exe to bazel.exe
## 2.2 Put bazel.exe in a directory of your choice
## 2.3 Add that directory to the Windows PATH environment variable
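##     For example, from cmd (C:\tools\bazel is a hypothetical location; editing PATH in the System Properties dialog works just as well):
>> setx PATH "%PATH%;C:\tools\bazel"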
# 3. Check the version
>> bazel --version
bazel 6.5.0
2. Building MediaPipe
2.1 Downloading the Source
>> git clone --recursive git@github.com:google-ai-edge/mediapipe.git
2.2 Installing Dependencies
- (1) OpenCV: there are plenty of installation guides online, so I won't repeat them here. Either 3.x or 4.x works, since MediaPipe adapts to both internally, but you do need to point MediaPipe's OpenCV configuration at your install by editing two files: WORKSPACE and third_party/opencv_linux.BUILD. I have OpenCV 3 installed. In WORKSPACE:
new_local_repository(
name = "linux_opencv",
build_file = "@//third_party:opencv_linux.BUILD",
path = "/usr/local",
)
And in third_party/opencv_linux.BUILD:
cc_library(
name = "opencv",
srcs = [
"lib/libopencv_core.so",
"lib/libopencv_calib3d.so",
"lib/libopencv_features2d.so",
"lib/libopencv_highgui.so",
"lib/libopencv_imgcodecs.so",
"lib/libopencv_imgproc.so",
"lib/libopencv_video.so",
"lib/libopencv_videoio.so",
],
hdrs = glob([
"include/opencv2/**/*.h*",
"include/opencv/**/*.h*",
]),
includes = [
"include/",
],
visibility = ["//visibility:public"],
)
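The srcs and hdrs paths above are resolved relative to the path given in new_local_repository (/usr/local here). Before building, a quick sanity check that the files are really where the BUILD file says (assuming an OpenCV 3 install under /usr/local):
>> ls /usr/local/lib/libopencv_core.so* /usr/local/include/opencv2/core.hpp
If ls finds nothing, adjust the paths in opencv_linux.BUILD to match your layout (for example, some distros put the libraries under lib/x86_64-linux-gnu/).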
- (2) FFmpeg installation:
>> sudo apt-get update
>> sudo apt-get install ffmpeg
>> ffmpeg -version
ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil 56. 31.100 / 56. 31.100
libavcodec 58. 54.100 / 58. 54.100
libavformat 58. 29.100 / 58. 29.100
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100
2.3 Building the Source
>> cd mediapipe
>> export GLOG_logtostderr=1
>> bazel run --define MEDIAPIPE_DISABLE_GPU=1 \
mediapipe/examples/desktop/hello_world:hello_world
If you are on Ubuntu 20.04 and your Python version is 3.8.x, you also need to modify the WORKSPACE file:
python_init_repositories(
default_python_version = "system",
local_wheel_dist_folder = "dist",
local_wheel_inclusion_list = ["mediapipe*"],
local_wheel_workspaces = ["//:WORKSPACE"],
requirements = {
"3.8": "//:requirements_lock_3_8.txt",
"3.9": "//:requirements_lock_3_9.txt",
"3.10": "//:requirements_lock_3_10.txt",
"3.11": "//:requirements_lock_3_11.txt",
"3.12": "//:requirements_lock_3_12.txt",
},
)
Here requirements_lock_3_8.txt is nothing special: just copy requirements_lock.txt and rename it (it is really an ordinary requirements.txt file).
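A one-liner does it (run from the repo root):
>> cp requirements_lock.txt requirements_lock_3_8.txt
Then build again: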
>> bazel run --define MEDIAPIPE_DISABLE_GPU=1 \
mediapipe/examples/desktop/hello_world:hello_world
WARNING: /home/mirror/workspace/mediapipe/mediapipe/framework/BUILD:70:24: in cc_library rule //mediapipe/framework:calculator_cc_proto: target '//mediapipe/framework:calculator_cc_proto' depends on deprecated target '@com_google_protobuf//:cc_wkt_protos': Only for backward compatibility. Do not use.
WARNING: /home/mirror/workspace/mediapipe/mediapipe/framework/tool/BUILD:204:24: in cc_library rule //mediapipe/framework/tool:field_data_cc_proto: target '//mediapipe/framework/tool:field_data_cc_proto' depends on deprecated target '@com_google_protobuf//:cc_wkt_protos': Only for backward compatibility. Do not use.
INFO: Analyzed target //mediapipe/examples/desktop/hello_world:hello_world (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //mediapipe/examples/desktop/hello_world:hello_world up-to-date:
bazel-bin/mediapipe/examples/desktop/hello_world/hello_world
INFO: Elapsed time: 0.119s, Critical Path: 0.01s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/mediapipe/examples/desktop/hello_world/hello_world
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1733825056.505418 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505500 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505512 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505518 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505539 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505562 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505572 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505579 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505597 15716 hello_world.cc:58] Hello World!
I0000 00:00:1733825056.505655 15716 hello_world.cc:58] Hello World!
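For orientation on what just ran: hello_world is a short C++ file that builds a trivial two-node graph and pushes ten string packets through it, which is why "Hello World!" is logged ten times. The sketch below is a condensed paraphrase of mediapipe/examples/desktop/hello_world/hello_world.cc; the names match the upstream example to the best of my knowledge, but treat it as illustrative rather than an exact copy:
#include <string>

#include "mediapipe/framework/calculator_graph.h"
#include "mediapipe/framework/port/logging.h"
#include "mediapipe/framework/port/parse_text_proto.h"
#include "mediapipe/framework/port/status.h"

absl::Status PrintHelloWorld() {
  // A trivial graph: two chained PassThroughCalculators from "in" to "out".
  mediapipe::CalculatorGraphConfig config =
      mediapipe::ParseTextProtoOrDie<mediapipe::CalculatorGraphConfig>(R"pb(
        input_stream: "in"
        output_stream: "out"
        node {
          calculator: "PassThroughCalculator"
          input_stream: "in"
          output_stream: "out1"
        }
        node {
          calculator: "PassThroughCalculator"
          input_stream: "out1"
          output_stream: "out"
        }
      )pb");

  mediapipe::CalculatorGraph graph;
  MP_RETURN_IF_ERROR(graph.Initialize(config));
  MP_ASSIGN_OR_RETURN(mediapipe::OutputStreamPoller poller,
                      graph.AddOutputStreamPoller("out"));
  MP_RETURN_IF_ERROR(graph.StartRun({}));

  // Feed ten "Hello World!" packets with increasing timestamps.
  for (int i = 0; i < 10; ++i) {
    MP_RETURN_IF_ERROR(graph.AddPacketToInputStream(
        "in", mediapipe::MakePacket<std::string>("Hello World!")
                  .At(mediapipe::Timestamp(i))));
  }
  MP_RETURN_IF_ERROR(graph.CloseInputStream("in"));

  // Drain the output stream; each packet comes back unchanged and is logged.
  mediapipe::Packet packet;
  while (poller.Next(&packet)) {
    LOG(INFO) << packet.Get<std::string>();
  }
  return graph.WaitUntilDone();
}

int main(int argc, char** argv) {
  google::InitGoogleLogging(argv[0]);
  CHECK(PrintHelloWorld().ok());
  return 0;
}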
3. Building MediaPipe Examples
3.1 Build Flags
Because MediaPipe runs its inference through XNNPack, a few XNNPack options must be disabled manually or the examples will not compile:
--define MEDIAPIPE_DISABLE_GPU=1 --define xnn_enable_avxvnni=false \
--define xnn_enable_avx512amx=false --define xnn_enable_avxvnniint8=false \
--define xnn_enable_avx512fp16=false
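Retyping these for every build gets tedious; one option (my own habit, not something the MediaPipe docs prescribe) is to append them to the .bazelrc at the repo root, so that every bazel build/run picks them up automatically:
build --define=MEDIAPIPE_DISABLE_GPU=1
build --define=xnn_enable_avxvnni=false
build --define=xnn_enable_avx512amx=false
build --define=xnn_enable_avxvnniint8=false
build --define=xnn_enable_avx512fp16=false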
3.2 Face Detection Example
# 1. Build the example
>> bazel build --define MEDIAPIPE_DISABLE_GPU=1 --define xnn_enable_avxvnni=false \
--define xnn_enable_avx512amx=false --define xnn_enable_avxvnniint8=false \
--define xnn_enable_avx512fp16=false \
mediapipe/examples/desktop/face_detection:face_detection_cpu
WARNING: /home/mirror/workspace/mediapipe/mediapipe/framework/BUILD:70:24: in cc_library rule //mediapipe/framework:calculator_cc_proto: target '//mediapipe/framework:calculator_cc_proto' depends on deprecated target '@com_google_protobuf//:cc_wkt_protos': Only for backward compatibility. Do not use.
WARNING: /home/mirror/workspace/mediapipe/mediapipe/framework/tool/BUILD:204:24: in cc_library rule //mediapipe/framework/tool:field_data_cc_proto: target '//mediapipe/framework/tool:field_data_cc_proto' depends on deprecated target '@com_google_protobuf//:cc_wkt_protos': Only for backward compatibility. Do not use.
INFO: Analyzed target //mediapipe/examples/desktop/face_detection:face_detection_cpu (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //mediapipe/examples/desktop/face_detection:face_detection_cpu up-to-date:
bazel-bin/mediapipe/examples/desktop/face_detection/face_detection_cpu
INFO: Elapsed time: 0.119s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
# 2. Run the example
>> ./bazel-bin/mediapipe/examples/desktop/face_detection/face_detection_cpu \
--calculator_graph_config_file=mediapipe/graphs/face_detection/face_detection_desktop_live.pbtxt \
--input_video_path /home/mirror/workspace/dataset/video/video_human_0.mp4
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1733831727.250858 72587 demo_run_graph_main.cc:50] Get calculator graph config contents: # MediaPipe graph that performs face mesh with TensorFlow Lite on CPU.
# CPU buffer. (ImageFrame)
input_stream: "input_video"
# Output image with rendered results. (ImageFrame)
output_stream: "output_video"
# Detected faces. (std::vector<Detection>)
output_stream: "face_detections"
# Throttles the images flowing downstream for flow control. It passes through
# the very first incoming image unaltered, and waits for downstream nodes
# (calculators and subgraphs) in the graph to finish their tasks before it
# passes through another image. All images that come in while waiting are
# dropped, limiting the number of in-flight images in most part of the graph to
# 1. This prevents the downstream nodes from queuing up incoming images and data
# excessively, which leads to increased latency and memory usage, unwanted in
# real-time mobile applications. It also eliminates unnecessarily computation,
# e.g., the output produced by a node may get dropped downstream if the
# subsequent nodes are still busy processing previous inputs.
node {
calculator: "FlowLimiterCalculator"
input_stream: "input_video"
input_stream: "FINISHED:output_video"
input_stream_info: {
tag_index: "FINISHED"
back_edge: true
}
output_stream: "throttled_input_video"
}
# Subgraph that detects faces.
node {
calculator: "FaceDetectionShortRangeCpu"
input_stream: "IMAGE:throttled_input_video"
output_stream: "DETECTIONS:face_detections"
}
# Converts the detections to drawing primitives for annotation overlay.
node {
calculator: "DetectionsToRenderDataCalculator"
input_stream: "DETECTIONS:face_detections"
output_stream: "RENDER_DATA:render_data"
node_options: {
[type.googleapis.com/mediapipe.DetectionsToRenderDataCalculatorOptions] {
thickness: 4.0
color { r: 255 g: 0 b: 0 }
}
}
}
# Draws annotations and overlays them on top of the input images.
node {
calculator: "AnnotationOverlayCalculator"
input_stream: "IMAGE:throttled_input_video"
input_stream: "render_data"
output_stream: "IMAGE:output_video"
}
I0000 00:00:1733831727.252413 72587 demo_run_graph_main.cc:56] Initialize the calculator graph.
I0000 00:00:1733831727.259628 72587 demo_run_graph_main.cc:60] Initialize the camera or load the video.
I0000 00:00:1733831727.319697 72587 demo_run_graph_main.cc:81] Start running the calculator graph.
I0000 00:00:1733831727.320504 72587 demo_run_graph_main.cc:86] Start grabbing and processing frames.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
VERBOSE: XNNPack weight cache not enabled.
INFO: Initialized TensorFlow Lite runtime.
VERBOSE: Replacing 164 out of 164 node(s) with delegate (TfLiteXNNPackDelegate) node, yielding 1 partitions for the whole graph.
W0000 00:00:1733831727.336902 72591 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
I0000 00:00:1733831733.312298 72587 demo_run_graph_main.cc:97] Empty frame, end of video reached.
I0000 00:00:1733831733.312374 72587 demo_run_graph_main.cc:145] Shutting down.
I0000 00:00:1733831733.316855 72587 demo_run_graph_main.cc:159] Success!
The annotated result looks like this:
3.3 Building Other Examples
All the other examples work much the same way: just change the final line of the build command, mediapipe/examples/desktop/face_detection:face_detection_cpu, to the target you want to build. For example, to build the face_mesh task:
# 1. Build
>> bazel build --define MEDIAPIPE_DISABLE_GPU=1 --define xnn_enable_avxvnni=false \
--define xnn_enable_avx512amx=false --define xnn_enable_avxvnniint8=false \
--define xnn_enable_avx512fp16=false mediapipe/examples/desktop/face_mesh:face_mesh_cpu
WARNING: /home/mirror/workspace/mediapipe/mediapipe/framework/BUILD:70:24: in cc_library rule //mediapipe/framework:calculator_cc_proto: target '//mediapipe/framework:calculator_cc_proto' depends on deprecated target '@com_google_protobuf//:cc_wkt_protos': Only for backward compatibility. Do not use.
WARNING: /home/mirror/workspace/mediapipe/mediapipe/framework/tool/BUILD:204:24: in cc_library rule //mediapipe/framework/tool:field_data_cc_proto: target '//mediapipe/framework/tool:field_data_cc_proto' depends on deprecated target '@com_google_protobuf//:cc_wkt_protos': Only for backward compatibility. Do not use.
INFO: Analyzed target //mediapipe/examples/desktop/face_mesh:face_mesh_cpu (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //mediapipe/examples/desktop/face_mesh:face_mesh_cpu up-to-date:
bazel-bin/mediapipe/examples/desktop/face_mesh/face_mesh_cpu
INFO: Elapsed time: 2.033s, Critical Path: 0.03s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
# 2. Run
>> ./bazel-bin/mediapipe/examples/desktop/face_mesh/face_mesh_cpu \
--calculator_graph_config_file=mediapipe/graphs/face_mesh/face_mesh_desktop_live.pbtxt \
--input_video_path /home/mirror/workspace/dataset/video/video_human_0.mp4
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1733833769.769169 82988 demo_run_graph_main.cc:50] Get calculator graph config contents: # MediaPipe graph that performs face mesh with TensorFlow Lite on CPU.
# Input image. (ImageFrame)
input_stream: "input_video"
# Output image with rendered results. (ImageFrame)
output_stream: "output_video"
# Collection of detected/processed faces, each represented as a list of
# landmarks. (std::vector<NormalizedLandmarkList>)
output_stream: "multi_face_landmarks"
# Throttles the images flowing downstream for flow control. It passes through
# the very first incoming image unaltered, and waits for downstream nodes
# (calculators and subgraphs) in the graph to finish their tasks before it
# passes through another image. All images that come in while waiting are
# dropped, limiting the number of in-flight images in most part of the graph to
# 1. This prevents the downstream nodes from queuing up incoming images and data
# excessively, which leads to increased latency and memory usage, unwanted in
# real-time mobile applications. It also eliminates unnecessarily computation,
# e.g., the output produced by a node may get dropped downstream if the
# subsequent nodes are still busy processing previous inputs.
node {
calculator: "FlowLimiterCalculator"
input_stream: "input_video"
input_stream: "FINISHED:output_video"
input_stream_info: {
tag_index: "FINISHED"
back_edge: true
}
output_stream: "throttled_input_video"
}
# Defines side packets for further use in the graph.
node {
calculator: "ConstantSidePacketCalculator"
output_side_packet: "PACKET:0:num_faces"
output_side_packet: "PACKET:1:with_attention"
node_options: {
[type.googleapis.com/mediapipe.ConstantSidePacketCalculatorOptions]: {
packet { int_value: 1 }
packet { bool_value: true }
}
}
}
# Subgraph that detects faces and corresponding landmarks.
node {
calculator: "FaceLandmarkFrontCpu"
input_stream: "IMAGE:throttled_input_video"
input_side_packet: "NUM_FACES:num_faces"
input_side_packet: "WITH_ATTENTION:with_attention"
output_stream: "LANDMARKS:multi_face_landmarks"
output_stream: "ROIS_FROM_LANDMARKS:face_rects_from_landmarks"
output_stream: "DETECTIONS:face_detections"
output_stream: "ROIS_FROM_DETECTIONS:face_rects_from_detections"
}
# Subgraph that renders face-landmark annotation onto the input image.
node {
calculator: "FaceRendererCpu"
input_stream: "IMAGE:throttled_input_video"
input_stream: "LANDMARKS:multi_face_landmarks"
input_stream: "NORM_RECTS:face_rects_from_landmarks"
input_stream: "DETECTIONS:face_detections"
output_stream: "IMAGE:output_video"
}
I0000 00:00:1733833769.770587 82988 demo_run_graph_main.cc:56] Initialize the calculator graph.
I0000 00:00:1733833769.784603 82988 demo_run_graph_main.cc:60] Initialize the camera or load the video.
I0000 00:00:1733833769.799798 82988 demo_run_graph_main.cc:81] Start running the calculator graph.
I0000 00:00:1733833769.801796 82988 demo_run_graph_main.cc:86] Start grabbing and processing frames.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
VERBOSE: XNNPack weight cache not enabled.
INFO: Initialized TensorFlow Lite runtime.
VERBOSE: Replacing 164 out of 164 node(s) with delegate (TfLiteXNNPackDelegate) node, yielding 1 partitions for the whole graph.
W0000 00:00:1733833769.816623 82991 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
VERBOSE: XNNPack weight cache not enabled.
VERBOSE: Replacing 700 out of 712 node(s) with delegate (TfLiteXNNPackDelegate) node, yielding 5 partitions for the whole graph.
W0000 00:00:1733833769.871122 82992 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
W0000 00:00:1733833769.888540 82991 landmark_projection_calculator.cc:186] Using NORM_RECT without IMAGE_DIMENSIONS is only supported for the square ROI. Provide IMAGE_DIMENSIONS or use PROJECTION_MATRIX.
I0000 00:00:1733833778.153743 82988 demo_run_graph_main.cc:97] Empty frame, end of video reached.
I0000 00:00:1733833778.153824 82988 demo_run_graph_main.cc:145] Shutting down.
I0000 00:00:1733833778.160292 82988 demo_run_graph_main.cc:159] Success!
Note that you must download the face detection model yourself, rename it to face_detection_short_range.tflite, and place it under mediapipe/modules/face_detection, or the run will fail.
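At the time of writing the model could be fetched from the mediapipe-assets bucket; treat the URL below as an assumption and check the official model card if it has moved:
>> wget -O mediapipe/modules/face_detection/face_detection_short_range.tflite \
   https://storage.googleapis.com/mediapipe-assets/face_detection_short_range.tflite
The corresponding result looks like this: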
4. References
- [1] XNNPACK: https://github.com/google/XNNPACK
- [2] MediaPipe: https://github.com/google-ai-edge/mediapipe
- [3] Issues · google-ai-edge/mediapipe: https://github.com/google-ai-edge/mediapipe/issues