Replies: 1 comment
Hi @rhino-098, sorry for the late reply, and thanks for the interesting use case you're exploring.
FINN already supports 1D CNNs. For example, the vgg10 model trained on the RadioML dataset in finn-examples is 1D. You can find the build script here: https://github.com/Xilinx/finn-examples/tree/main/build/vgg10-radioml.
Regarding the PyTorch definition: I'm not a PyTorch expert, but MobileNet-v1 is supported in FINN and uses depthwise separable convolutions. Its layers are 2D rather than 1D, but the same concept applies. The Brevitas-trained MobileNet-v1 PyTorch code is here: https://github.com/Xilinx/brevitas/blob/master/src/brevitas_examples/imagenet_classification/models/mobilenetv1.py. From that model structure, it appears that the depthwise and pointwise convolutions are created as separate layers with quantization in between. FINN then maps the depthwise convolution to the VVAU hardware kernel and the pointwise convolution to the MVAU hardware kernel.
Additionally, there was past work on QuartzNet by a master's student at TU Delft. The report is available here (although the build scripts were not published): https://repository.tudelft.nl/record/uuid:b6c889b1-e06f-447c-af69-55708555bf90
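To make the depthwise/pointwise split concrete, here is a minimal sketch of what a 1D version might look like in plain PyTorch. It is not taken from the Brevitas source or from your model: the class name `SeparableConv1d`, the channel counts, and the kernel size are all hypothetical, and the `nn.Identity` is only a placeholder for wherever an activation quantizer would sit in a quantized model.

```python
import torch
from torch import nn


class SeparableConv1d(nn.Module):
    """Hypothetical stand-in for a TCSConv1d layer, written as two explicit convolutions."""

    def __init__(self, in_channels, out_channels, kernel_size):
        super().__init__()
        # Depthwise step: groups == in_channels gives one filter per input channel.
        self.depthwise = nn.Conv1d(
            in_channels,
            in_channels,
            kernel_size,
            padding=kernel_size // 2,  # keeps the time length for odd kernel sizes
            groups=in_channels,
            bias=False,
        )
        # Assumption: placeholder for the quantizer between the two convolutions.
        self.inter_quant = nn.Identity()
        # Pointwise step: kernel size 1 mixes information across channels.
        self.pointwise = nn.Conv1d(in_channels, out_channels, kernel_size=1, bias=False)

    def forward(self, x):
        return self.pointwise(self.inter_quant(self.depthwise(x)))


layer = SeparableConv1d(in_channels=16, out_channels=32, kernel_size=9)
x = torch.randn(1, 16, 128)  # (batch, channels, time)
y = layer(x)
print(y.shape)
```

Keeping the two convolutions as separate modules means they should also appear as separate nodes after ONNX export, which is the structure the MobileNet-v1 example relies on. Whether your exact layer parameters (stride, dilation, padding) are covered is something you would need to verify against the FINN build flow.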
Hi everyone,
I'm exploring the possibility of accelerating a basecalling model (based on the QuartzNet/Bonito architecture) using FINN. This model makes heavy use of Time-Channel Separable 1D convolutions (TCSConv1d), which are essentially 1D depthwise separable convolutions.
I understand that FINN requires a model's layers to be decomposed into supported primitives. Before attempting a significant model re-architecture and quantization-aware retraining, I wanted to ask for your guidance on the feasibility and the best approach for this type of layer.
A typical block in the model's encoder is structured as follows, with repeating sub-blocks containing the TCSConv1d layer:
Here is a snippet of the implementation:
My main questions are:
Is the recommended approach to manually decompose each TCSConv1d layer into a standard grouped Conv1d followed by a 1x1 Conv1d within the PyTorch model before exporting to ONNX?
Does FINN have optimized hardware support for this specific pattern (1D depthwise separable convolution), or would it be treated as two independent standard convolutions?
Are there any known challenges or best practices for handling the residual connections that are also present in these blocks?
Thank you for your time and for the great work on this framework!