
YOLOv5 C3 Block: A Visual Review

Hugh Q Lee 2025. 3. 28. 22:26

This post reviews the first C3 block in the architecture of ultralytics' YOLOv5 from the perspective of parameter count (params#), using code and visualization.

https://user-images.githubusercontent.com/31005897/172404576-c260dcf9-76bb-4bc8-b6a9-f2d987792583.png

 

 

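Instead of applying the Bottleneck operation (including the residual connection) to all channels, the C3 block, which follows the CSP (Cross Stage Partial) structure, applies it to only half of the input feature's channels, and this is how it saves parameters. Note also that when the channels are divided into two groups and then merged back into one (concat), a 1x1 Conv layer is used on each branch to bring the branches to the same level.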
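As a rough illustration of the split-and-concat flow described above, the following sketch traces tensor shapes through the two branches. The batch size, the 8x8 spatial size, and the variable names are assumptions made for this example, and the Bottleneck itself is left out because it does not change the shape.

```python
import torch
import torch.nn as nn

# Dummy feature map: batch 1, 128 channels, 8x8 spatial size (sizes are assumptions)
x = torch.randn(1, 128, 8, 8)

# Divide the channels into two groups of 64
x1, x2 = torch.split(x, 64, dim=1)

# One 1x1 Conv per branch; the Bottleneck would follow on the second branch
branch1 = nn.Conv2d(64, 64, kernel_size=1)
branch2 = nn.Conv2d(64, 64, kernel_size=1)
y1, y2 = branch1(x1), branch2(x2)

# Merge the two branches back into 128 channels
z = torch.cat([y1, y2], dim=1)

print(x1.shape, x2.shape)  # torch.Size([1, 64, 8, 8]) torch.Size([1, 64, 8, 8])
print(z.shape)             # torch.Size([1, 128, 8, 8])
```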

1. Parameter count (params#) comparison

C3         Full Bottleneck
65,920     180,608

conv_params = input_ch × output_ch × kernel_w × kernel_h + bias, where bias = output_ch (one bias term per output channel)

 

C3's parameters: 65,920 = 3×(64×64×1×1+64) + (64×64×3×3+64) + (128×128×1×1+128)
Full Bottleneck's parameters: 180,608 = 2×(128×128×1×1+128) + (128×128×3×3+128)
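The arithmetic above can be reproduced with a few lines of plain Python. This is a minimal sketch of the conv_params formula; the function signature and the variable names c3_total and full_total are chosen for this example only.

```python
def conv_params(in_ch, out_ch, kernel_w, kernel_h, bias=True):
    """Parameter count of one 2D convolution layer."""
    weights = in_ch * out_ch * kernel_w * kernel_h
    return weights + (out_ch if bias else 0)

# C3: three 1x1 Convs on 64 channels, one 3x3 Conv on 64 channels,
# and one 1x1 Conv on 128 channels
c3_total = (
    3 * conv_params(64, 64, 1, 1)
    + conv_params(64, 64, 3, 3)
    + conv_params(128, 128, 1, 1)
)

# Full Bottleneck: two 1x1 Convs and one 3x3 Conv, all on 128 channels
full_total = 2 * conv_params(128, 128, 1, 1) + conv_params(128, 128, 3, 3)

print(c3_total)                         # 65920
print(full_total)                       # 180608
print(round(c3_total / full_total, 3))  # 0.365
```

With these settings the C3 block needs roughly 36% of the Full Bottleneck's parameters.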

2. Code implementation

import torch  # needed for torch.split and torch.cat in C3Block.forward
import torch.nn as nn
        
class BottleneckBlock(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride, padding):
        super(BottleneckBlock, self).__init__()

        self.conv1 = nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=1, padding=0)
        self.conv2 = nn.Conv2d(in_channels=out_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding)

    def forward(self, x):
        y = self.conv1(x)
        y = self.conv2(y)
        return x + y


class C3Block(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride, padding):
        super(C3Block, self).__init__()
        self.split_channels = in_channels // 2

        self.conv1_1 = nn.Conv2d(in_channels=self.split_channels, out_channels=self.split_channels, kernel_size=1, stride=1, padding=0)
        self.conv1_2 = nn.Conv2d(in_channels=self.split_channels, out_channels=self.split_channels, kernel_size=1, stride=1, padding=0)
        self.bottleneck = BottleneckBlock(in_channels=self.split_channels, out_channels=self.split_channels, kernel_size=kernel_size, stride=stride, padding=padding)
        self.conv2 = nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=1, padding=0)

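    def forward(self, x):
        # Split the input channels into two equal groups
        x1, x2 = torch.split(x, self.split_channels, dim=1)

        # Branch 1: 1x1 Conv only
        x1 = self.conv1_1(x1)

        # Branch 2: 1x1 Conv followed by the Bottleneck
        x2 = self.conv1_2(x2)
        # BottleneckBlock.forward already adds the residual, so it is not added again here
        y2 = self.bottleneck(x2)

        # Merge the two branches and mix them with a final 1x1 Conv
        z = torch.cat([x1, y2], dim=1)
        z = self.conv2(z)
        return z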
        
module = C3Block(in_channels=128, out_channels=128, kernel_size=3, stride=1, padding=1)

>
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 160, 160]           4,160
            Conv2d-2         [-1, 64, 160, 160]           4,160
            Conv2d-3         [-1, 64, 160, 160]           4,160
            Conv2d-4         [-1, 64, 160, 160]          36,928
   BottleneckBlock-5         [-1, 64, 160, 160]               0
            Conv2d-6        [-1, 128, 160, 160]          16,512
================================================================
Total params: 65,920

import torch.nn as nn
        
class FullBlock(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride, padding):
        super(FullBlock, self).__init__()

        self.conv1 = nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=1, padding=0)
        self.conv2 = nn.Conv2d(in_channels=out_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding)
        self.conv3 = nn.Conv2d(in_channels=out_channels, out_channels=out_channels, kernel_size=1, stride=1, padding=0)

    def forward(self, x):
        y = self.conv1(x)
        y = self.conv2(y)
        y = self.conv3(y)
        return x + y
        
module = FullBlock(in_channels=128, out_channels=128, kernel_size=3, stride=1, padding=1)

>
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1        [-1, 128, 160, 160]          16,512
            Conv2d-2        [-1, 128, 160, 160]         147,584
            Conv2d-3        [-1, 128, 160, 160]          16,512
================================================================
Total params: 180,608
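The per-layer Param # values in the summaries above can also be cross-checked without a summary tool, by summing the sizes of a module's parameter tensors. The sketch below does this for three of the Conv layers; count_params is a hypothetical helper name introduced for this example, not something from the original code.

```python
import torch.nn as nn

def count_params(module: nn.Module) -> int:
    """Sum the element counts of all learnable tensors in a module."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

# Layers with the same shapes as the ones listed in the summaries
conv3x3_64 = nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1)
conv1x1_128 = nn.Conv2d(128, 128, kernel_size=1)
conv3x3_128 = nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1)

print(count_params(conv3x3_64))   # 36928
print(count_params(conv1x1_128))  # 16512
print(count_params(conv3x3_128))  # 147584
```

Calling count_params on the C3Block or FullBlock instance defined above is expected to reproduce the corresponding Total params line.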

3. Visualization

Visualization of the C3 Block (CSP Bottleneck structure)

 

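Visualization of the Full Bottleneck (conventional Bottleneck structure)
Convolution comparison between the Full Bottleneck (left) and C3 (right). Note that the params# here is off by the number of bias terms (bias#).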
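One way to read the bias# discrepancy: if the figure counts only the kernel weights (an assumption, since the source does not state how the figure's numbers were obtained), then dropping the bias terms from the formula in section 1 gives weight-only totals, and the gap for each block is its number of bias terms. The helper and variable names below are made up for this sketch.

```python
def conv_weights(in_ch, out_ch, k):
    """Kernel weights of a k x k convolution, bias excluded."""
    return in_ch * out_ch * k * k

# (in_ch, out_ch, kernel) for every Conv layer in each block
c3_layers = [(64, 64, 1)] * 3 + [(64, 64, 3), (128, 128, 1)]
full_layers = [(128, 128, 1), (128, 128, 3), (128, 128, 1)]

results = {}
for name, layers in [("C3", c3_layers), ("Full Bottleneck", full_layers)]:
    weights = sum(conv_weights(*layer) for layer in layers)
    biases = sum(out_ch for _, out_ch, _ in layers)  # one bias per output channel
    results[name] = (weights, biases)
    print(name, weights, biases, weights + biases)
# C3 65536 384 65920
# Full Bottleneck 180224 384 180608
```

Under that assumption both blocks differ from their full counts by the same 384 bias terms.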

 
