feat: add TopK and Gather ncnn operators for YOLOv10 deployment #6668
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,111 @@ | ||
| name: topk-linux-test | ||
| on: | ||
| push: | ||
| branches: | ||
| - topk-ci-tests | ||
|
|
||
| jobs: | ||
| x64-none: | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
| - name: build | ||
| run: | | ||
| mkdir build && cd build | ||
| cmake -DCMAKE_BUILD_TYPE=Debug -DNCNN_RUNTIME_CPU=OFF \ | ||
| -DNCNN_SSE2=OFF -DNCNN_AVX=OFF \ | ||
| -DNCNN_OPENMP=OFF -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF -DNCNN_BUILD_TESTS=ON .. | ||
| cmake --build . --target test_topk -j$(nproc) | ||
| - name: test | ||
| run: cd build && ./tests/test_topk | ||
|
|
||
| x64-sse2: | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
| - name: build | ||
| run: | | ||
| mkdir build && cd build | ||
| cmake -DCMAKE_BUILD_TYPE=Debug -DNCNN_RUNTIME_CPU=OFF \ | ||
| -DNCNN_SSE2=ON -DNCNN_AVX=OFF \ | ||
| -DNCNN_OPENMP=OFF -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF -DNCNN_BUILD_TESTS=ON .. | ||
| cmake --build . --target test_topk -j$(nproc) | ||
| - name: test | ||
| run: cd build && ./tests/test_topk | ||
|
|
||
| x64-avx2: | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
| - name: build | ||
| run: | | ||
| mkdir build && cd build | ||
| cmake -DCMAKE_BUILD_TYPE=Debug -DNCNN_RUNTIME_CPU=OFF \ | ||
| -DNCNN_SSE2=ON -DNCNN_AVX=ON -DNCNN_F16C=ON -DNCNN_FMA=ON -DNCNN_AVX2=ON \ | ||
| -DNCNN_AVX512=OFF -DNCNN_XOP=OFF -DNCNN_AVXVNNI=OFF \ | ||
| -DNCNN_OPENMP=OFF -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF -DNCNN_BUILD_TESTS=ON .. | ||
| cmake --build . --target test_topk -j$(nproc) | ||
| - name: test | ||
| run: cd build && ./tests/test_topk | ||
|
|
||
| simplestl-simplemath: | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
| - name: build | ||
| run: | | ||
| mkdir build && cd build | ||
| cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/host-c.gcc.toolchain.cmake \ | ||
| -DCMAKE_BUILD_TYPE=Debug \ | ||
| -DNCNN_SIMPLESTL=ON -DNCNN_SIMPLEMATH=ON \ | ||
| -DNCNN_OPENMP=OFF -DNCNN_THREADS=OFF \ | ||
| -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF -DNCNN_BUILD_TESTS=ON .. | ||
| cmake --build . --target test_topk -j$(nproc) | ||
| - name: test | ||
| run: cd build && ./tests/test_topk | ||
|
|
||
| linux-x86-gcc: | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
| - name: install | ||
| run: sudo apt-get update && sudo apt-get install -y gcc-multilib g++-multilib | ||
| - name: build | ||
| run: | | ||
| mkdir build && cd build | ||
| cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/host.gcc-m32.toolchain.cmake \ | ||
| -DNCNN_BUILD_TESTS=ON -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF .. | ||
| cmake --build . --target test_topk -j$(nproc) | ||
| - name: test | ||
| run: cd build && ./tests/test_topk | ||
| - name: build-nosse | ||
| run: | | ||
| mkdir build-nosse && cd build-nosse | ||
| cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/host.gcc-m32.toolchain.cmake \ | ||
| -DNCNN_RUNTIME_CPU=OFF -DNCNN_SSE2=OFF -DNCNN_AVX=OFF \ | ||
| -DNCNN_BUILD_TESTS=ON -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF .. | ||
| cmake --build . --target test_topk -j$(nproc) | ||
| - name: test-nosse | ||
| run: cd build-nosse && ./tests/test_topk | ||
|
|
||
| pnnx-onnx-topk: | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
| - uses: actions/setup-python@v5 | ||
| with: | ||
| python-version: '3.12' | ||
| - name: setup-pytorch | ||
| run: | | ||
| pip3 install torch --index-url https://download.pytorch.org/whl/cpu | ||
| pip3 install numpy packaging onnx onnxruntime | ||
| - name: build-pnnx | ||
| run: | | ||
| cd tools/pnnx | ||
| mkdir build && cd build | ||
| cmake -DCMAKE_BUILD_TYPE=Release .. | ||
| cmake --build . --config Release -j$(nproc) | ||
| - name: test-topk | ||
| run: | | ||
| cd tools/pnnx/build | ||
| ctest --output-on-failure -R test_onnx_torch_topk | ||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,121 @@ | ||||||||||||||||||||||||
| // Copyright 2025 Tencent | ||||||||||||||||||||||||
| // SPDX-License-Identifier: BSD-3-Clause | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| #include "gather.h" | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| namespace ncnn { | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| Gather::Gather() | ||||||||||||||||||||||||
| { | ||||||||||||||||||||||||
| one_blob_only = false; | ||||||||||||||||||||||||
| support_inplace = false; | ||||||||||||||||||||||||
| } | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| int Gather::load_param(const ParamDict& pd) | ||||||||||||||||||||||||
| { | ||||||||||||||||||||||||
| axis = pd.get(0, 0); | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| return 0; | ||||||||||||||||||||||||
| } | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| int Gather::forward(const std::vector<Mat>& bottom_blobs, std::vector<Mat>& top_blobs, const Option& opt) const | ||||||||||||||||||||||||
| { | ||||||||||||||||||||||||
| if (bottom_blobs.size() < 2) | ||||||||||||||||||||||||
| return -1; | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| const Mat& input_blob = bottom_blobs[0]; | ||||||||||||||||||||||||
| const Mat& index_blob = bottom_blobs[1]; | ||||||||||||||||||||||||
| const int dims = input_blob.dims; | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| // index_blob should contain int64 or int32 indices | ||||||||||||||||||||||||
| // For simplicity we treat it as float and cast | ||||||||||||||||||||||||
| const int index_size = (int)index_blob.total(); | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
|
Comment on lines +30 to +33
|
||||||||||||||||||||||||
| int positive_axis = axis < 0 ? axis + dims : axis; | ||||||||||||||||||||||||
| if (positive_axis < 0 || positive_axis >= dims) | ||||||||||||||||||||||||
| return -1; | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| int shape[4] = {1, 1, 1, 1}; | ||||||||||||||||||||||||
| shape[0] = input_blob.w; | ||||||||||||||||||||||||
| if (dims >= 2) shape[1] = input_blob.h; | ||||||||||||||||||||||||
| if (dims == 3) shape[2] = input_blob.c; | ||||||||||||||||||||||||
| if (dims == 4) shape[2] = input_blob.c; // w*h*c layout | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| const int axis_dim_size = shape[positive_axis]; | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| // Output shape matches index_blob shape | ||||||||||||||||||||||||
| const Mat& out_shape = index_blob; | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| // Allocate output (same dtype as input, shape matches index) | ||||||||||||||||||||||||
| Mat& top_blob = top_blobs[0]; | ||||||||||||||||||||||||
| top_blob.create(out_shape.w, out_shape.h, out_shape.c, input_blob.elemsize, input_blob.elempack, opt.blob_allocator); | ||||||||||||||||||||||||
|
||||||||||||||||||||||||
| top_blob.create(out_shape.w, out_shape.h, out_shape.c, input_blob.elemsize, input_blob.elempack, opt.blob_allocator); | |
| if (out_shape.dims == 1) | |
| top_blob.create(out_shape.w, input_blob.elemsize, input_blob.elempack, opt.blob_allocator); | |
| else if (out_shape.dims == 2) | |
| top_blob.create(out_shape.w, out_shape.h, input_blob.elemsize, input_blob.elempack, opt.blob_allocator); | |
| else if (out_shape.dims == 3) | |
| top_blob.create(out_shape.w, out_shape.h, out_shape.c, input_blob.elemsize, input_blob.elempack, opt.blob_allocator); | |
| else if (out_shape.dims == 4) | |
| top_blob.create(out_shape.w, out_shape.h, out_shape.d, out_shape.c, input_blob.elemsize, input_blob.elempack, opt.blob_allocator); | |
| else | |
| return -1; |
Copilot AI, Apr 11, 2026
Gather treats input and output tensors as float* (and writes via float*), but top_blob is allocated using input_blob.elemsize/elempack which may be fp16/int8/etc. This will produce incorrect results or memory corruption when elemsize != 4. Either restrict Gather to float32 with a runtime check or add proper type handling.
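The first option in the comment above (restrict Gather to float32 with a runtime check) could look roughly like the sketch below. It is a standalone illustration, not code from this PR: `BlobLayout`, `is_plain_fp32` and `check_gather_input` are hypothetical names, and the struct only mirrors the `elemsize` and `elempack` fields that the diff already reads from `input_blob`.

```cpp
#include <cstddef>

// Hypothetical stand-in for the two Mat fields the check needs.
struct BlobLayout
{
    size_t elemsize; // bytes per element (covers all packed scalars when elempack > 1)
    int elempack;    // number of scalars packed into one element
};

// True only for unpacked 32-bit elements, the one layout that is
// safe to read and write through a float* as the layer does.
inline bool is_plain_fp32(const BlobLayout& b)
{
    return b.elempack == 1 && b.elemsize == sizeof(float);
}

// Early-out in the same style as the other checks in forward():
// 0 when the input can be handled, -1 otherwise.
inline int check_gather_input(const BlobLayout& input)
{
    if (!is_plain_fp32(input))
        return -1;
    return 0;
}
```

With a guard like this at the top of `forward()`, fp16, int8 or packed inputs would be rejected with an error code instead of being reinterpreted as float.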
Copilot AI, Apr 11, 2026
Only dims 1/2/3 are handled in flat index computation. For dims==4, flat_in remains incorrect and Gather will return wrong results silently. Either implement 4D indexing (w,h,d,c with cstep) or explicitly reject dims > 3 early with a clear error code.
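For the 4D case mentioned above, a flat offset over (w, h, d, c) with `cstep` might be computed as in this sketch. It is self-contained and does not use ncnn itself: `Shape4D` and `flat_offset` are hypothetical names, and the layout assumed here is that channels are `cstep` elements apart while depth slices, rows and columns inside a channel are densely packed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the shape fields of a 4D blob.
struct Shape4D
{
    int w, h, d, c;
    size_t cstep; // elements between consecutive channels, may exceed w*h*d due to padding
};

// Offset in elements of coordinate (x, y, z, q), where x indexes w,
// y indexes h, z indexes d and q indexes c.
inline size_t flat_offset(const Shape4D& s, int x, int y, int z, int q)
{
    assert(x >= 0 && x < s.w);
    assert(y >= 0 && y < s.h);
    assert(z >= 0 && z < s.d);
    assert(q >= 0 && q < s.c);

    // Channel stride is cstep; inside a channel the data is dense.
    return (size_t)q * s.cstep + ((size_t)z * s.h + y) * s.w + x;
}
```

Because `cstep` can be larger than `w * h * d`, multiplying the channel index by `w * h * d` instead would read from the wrong location whenever padding is present. The alternative named in the comment, rejecting `dims > 3` early, avoids this computation entirely.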
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| // Copyright 2025 Tencent | ||
| // SPDX-License-Identifier: BSD-3-Clause | ||
|
|
||
| #ifndef LAYER_GATHER_H | ||
| #define LAYER_GATHER_H | ||
|
|
||
| #include "layer.h" | ||
|
|
||
| namespace ncnn { | ||
|
|
||
| class Gather : public Layer | ||
| { | ||
| public: | ||
| Gather(); | ||
|
|
||
| virtual int load_param(const ParamDict& pd); | ||
|
|
||
| virtual int forward(const std::vector<Mat>& bottom_blobs, std::vector<Mat>& top_blobs, const Option& opt) const; | ||
|
|
||
| public: | ||
| // param_0 = axis (default 0) | ||
| int axis; | ||
| }; | ||
|
|
||
| } // namespace ncnn | ||
|
|
||
| #endif // LAYER_GATHER_H |
This workflow only runs on pushes to the 'topk-ci-tests' branch, so it will not run for pull_request events or for the repository's normal branches after merge. If it is meant to provide ongoing CI coverage, add appropriate pull_request/push triggers; otherwise consider not adding it to the mainline PR.