[Question] Support for models with Time-Channel Separable Convolutions (TCSConv1d / Depthwise Separable 1D) #1425

rhino-098 · 2025-09-21T14:20:17Z

Hi everyone,

I'm exploring the possibility of accelerating a basecalling model (based on the QuartzNet/Bonito architecture) using FINN. This model makes heavy use of Time-Channel Separable 1D convolutions (TCSConv1d), which are essentially 1D depthwise separable convolutions.
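
As a rough illustration of why the separable form matters for resource usage, the sketch below counts weights for a dense Conv1d versus a depthwise + pointwise pair. The function names and the example sizes (256 channels, kernel size 33) are hypothetical and not taken from the actual model; bias terms are ignored.

```python
def standard_conv1d_weights(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """Weights in a dense Conv1d (groups=1, no bias)."""
    return in_channels * out_channels * kernel_size


def separable_conv1d_weights(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """Weights in a depthwise Conv1d (groups=in_channels) plus a kernel-size-1 pointwise Conv1d."""
    depthwise = in_channels * kernel_size    # one k-tap filter per input channel
    pointwise = in_channels * out_channels   # channel mixing at each time step
    return depthwise + pointwise


if __name__ == "__main__":
    c_in, c_out, k = 256, 256, 33  # hypothetical layer sizes
    dense = standard_conv1d_weights(c_in, c_out, k)
    separable = separable_conv1d_weights(c_in, c_out, k)
    print(f"dense={dense} separable={separable} ratio={dense / separable:.1f}x")
```

For long kernels, the pointwise term dominates the separable count, so the saving grows roughly with the kernel size.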

I understand that FINN requires a model's layers to be decomposed into supported primitives. Before attempting a significant model re-architecture and quantization-aware retraining, I wanted to ask for your guidance on the feasibility and the best approach for this type of layer.

A typical block in the model's encoder consists of repeating sub-blocks, each containing a TCSConv1d layer. Here is a snippet of the implementation:

import torch.nn as nn
from torch.nn import Module, ModuleList, Sequential, BatchNorm1d, Dropout
import brevitas.nn as qnn

class TCSConv1d_quant(Module):
    """
    Quantized Time-Channel Separable 1D Convolution
    """
    def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=False, separable=False, quant=8, quant_act=8):
        super(TCSConv1d_quant, self).__init__()
        self.separable = separable

        if separable:
            self.depthwise = qnn.QuantConv1d(
                in_channels, in_channels, kernel_size=kernel_size, stride=stride,
                padding=padding, dilation=dilation, bias=bias, groups=in_channels,  # depthwise: one filter per input channel
                weight_bit_width=quant
            )
            self.pointwise = qnn.QuantConv1d(
                in_channels, out_channels, kernel_size=1, stride=1,
                dilation=dilation, bias=bias, padding=0,
                weight_bit_width=quant
            )
            self.quant_identity = qnn.QuantIdentity(bit_width=quant_act, return_quant_tensor=True)
        else:
            self.conv = qnn.QuantConv1d(
                in_channels, out_channels, kernel_size=kernel_size,
                stride=stride, padding=padding, dilation=dilation, bias=bias,
                weight_bit_width=quant
            )

    def forward(self, x):
        if self.separable:
            x = self.depthwise(x)
            x = self.quant_identity(x)
            x = self.pointwise(x)
        else:
            x = self.conv(x)
        return x

class Block(Module):
    """
    Quantized Block with TCSConv, BatchNorm, Activation
    """
    def __init__(self, in_channels, out_channels, activation, repeat=5, kernel_size=1, stride=1, dilation=1, dropout=0.0, residual=False, separable=False, quant=8, quant_act=8):
        super(Block, self).__init__()
        self.use_res = residual
        self.activation = activation  # applied after the residual add in forward()
        self.conv = ModuleList()
        # ... logic to build repeating sub-blocks ...
        # A sub-block looks like this:
        # [
        #     TCSConv1d_quant(...),
        #     BatchNorm1d(...),
        #     qnn.QuantReLU(...),
        #     Dropout(...)
        # ]
        if self.use_res:
            # The residual path also uses a TCSConv1d_quant and BatchNorm1d
            self.residual = Sequential(...)
    
    def forward(self, x):
        _x = x
        # Main path
        for layer in self.conv:
            _x = layer(_x)
        # Residual connection
        if self.use_res:
            _x = _x + self.residual(x)
        # Final activation
        return self.activation(_x)
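
One practical detail for the residual add in `forward`: both paths must produce the same sequence length. The sketch below uses the output-length formula documented for `torch.nn.Conv1d` to check this without building the model. The helper names are hypothetical, and the example assumes (this is not stated in the snippet above) that the residual branch uses kernel size 1 with the same stride as the main path.

```python
def conv1d_out_length(length: int, kernel_size: int, stride: int = 1,
                      padding: int = 0, dilation: int = 1) -> int:
    """Output length of a Conv1d layer, per the torch.nn.Conv1d formula."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def same_padding(kernel_size: int, dilation: int = 1) -> int:
    """Padding that preserves the length for stride 1 and an odd kernel size."""
    return dilation * (kernel_size - 1) // 2


if __name__ == "__main__":
    seq_len, k = 1000, 33  # hypothetical input length and kernel size
    main_path = conv1d_out_length(seq_len, k, padding=same_padding(k))
    residual_path = conv1d_out_length(seq_len, 1)  # assumed kernel-size-1 residual conv
    print(main_path, residual_path)  # the add only works if these match
```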

My main questions are:

1. Is the recommended approach to manually decompose each TCSConv1d layer into a depthwise Conv1d (groups=in_channels) followed by a pointwise Conv1d (kernel size 1) within the PyTorch model before exporting to ONNX?

2. Does FINN have optimized hardware support for this specific pattern (1D depthwise separable convolution), or would it be treated as two independent standard convolutions?

3. Are there any known challenges or best practices for handling the residual connections that are also present in these blocks?

Thank you for your time and for the great work on this framework!

auphelia · 2026-01-27T16:56:54Z

Maintainer

Hi @rhino-098,

Sorry for the late reply, and thanks for sharing this interesting use case.
I believe what you want to achieve should be feasible with FINN.

FINN already supports 1D CNNs. For example, the vgg10 model trained on the RadioML dataset in finn-examples is 1D. You can find the build script here: https://github.com/Xilinx/finn-examples/tree/main/build/vgg10-radioml.

Regarding the PyTorch definition: I'm not a PyTorch expert, but MobileNet-v1 is supported in FINN and uses depthwise separable convolutions. Its layers are 2D rather than 1D, but the same concept applies. The Brevitas-trained MobileNet-v1 PyTorch code is here: https://github.com/Xilinx/brevitas/blob/master/src/brevitas_examples/imagenet_classification/models/mobilenetv1.py. In that model, the depthwise and pointwise convolutions are created as separate layers with quantization in between. FINN then maps the depthwise convolution to the VVAU (vector-vector activation unit) and the pointwise convolution to the MVAU (matrix-vector activation unit) hardware kernels.
If you have more questions about PyTorch and Brevitas, I would recommend reaching out on their GitHub Discussions forum directly.
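
To illustrate why the two halves end up on different kernels, here is a plain-Python sketch of the arithmetic only. It is not FINN's implementation, it omits the quantization between the two steps, and the function names and tiny example values are made up for illustration. The depthwise step is an independent dot product per channel (the kind of work the VVAU handles), while the pointwise step is a matrix-vector product at every time step (the kind of work the MVAU handles).

```python
from typing import List

Signal = List[List[float]]  # indexed as [channel][time]


def depthwise_conv1d(x: Signal, weights: List[List[float]]) -> Signal:
    """Filter each channel with its own kernel (stride 1, no padding)."""
    out = []
    for channel, kernel in zip(x, weights):
        k = len(kernel)
        out.append([
            sum(channel[t + i] * kernel[i] for i in range(k))
            for t in range(len(channel) - k + 1)
        ])
    return out


def pointwise_conv1d(x: Signal, weights: List[List[float]]) -> Signal:
    """Kernel-size-1 convolution; weights are indexed as [out_channel][in_channel]."""
    steps = len(x[0])
    return [
        [sum(w * x[c][t] for c, w in enumerate(row)) for t in range(steps)]
        for row in weights
    ]


if __name__ == "__main__":
    x = [[1, 2, 3, 4], [0, 1, 0, 1]]      # 2 channels, 4 time steps
    dw = [[1, 1], [1, -1]]                # one 2-tap kernel per channel
    pw = [[1, 1], [2, 0], [0, 3]]         # mixes 2 channels into 3
    print(pointwise_conv1d(depthwise_conv1d(x, dw), pw))
```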

Additionally, there was past work on QuartzNet by a master's student at TU Delft. The report is available here (although the build scripts were not published): https://repository.tudelft.nl/record/uuid:b6c889b1-e06f-447c-af69-55708555bf90
