YOLOv5 C3 Block: A Visual Review

Hugh Q Lee 2025. 3. 28. 22:26

This post reviews the first C3 block in the architecture of ultralytics' yolov5 from the standpoint of parameter count (params#), using code and visualization.

https://user-images.githubusercontent.com/31005897/172404576-c260dcf9-76bb-4bc8-b6a9-f2d987792583.png

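Instead of applying the Bottleneck operation (including its residual connection) to every channel, the C3 block, which follows the CSP (Cross Stage Partial) structure, applies the Bottleneck to only half of the input feature's channels, and that is where the parameter saving comes from. Also note that when the channels are divided into two groups and merged back into one (concat), each branch uses a 1x1 Conv layer to bring its features to a matching level.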
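The split-and-merge flow described above can be traced as plain channel arithmetic. The sketch below is only an illustration: `c3_channel_flow` is a hypothetical helper, not part of yolov5, and it assumes the two groups are equal halves and that every 1x1 Conv and the Bottleneck keep their branch's channel count unchanged.

```python
def c3_channel_flow(in_channels, out_channels):
    """Trace channel counts through split -> two branches -> concat -> 1x1 conv.

    Hypothetical helper for illustration; assumes an even channel count.
    """
    half = in_channels // 2
    branch_1 = half              # branch 1: 1x1 conv, channel count unchanged
    branch_2 = half              # branch 2: 1x1 conv + Bottleneck, unchanged
    merged = branch_1 + branch_2  # concat along the channel axis
    return {
        "split": (half, half),
        "branch_1": branch_1,
        "branch_2": branch_2,
        "concat": merged,
        "output": out_channels,   # final 1x1 conv maps merged -> out_channels
    }


flow = c3_channel_flow(128, 128)
print(flow)
```

With 128 input channels, only the 64 channels of the second branch ever pass through the 3x3 convolution, which is the expensive layer.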

1. Parameter count (params#) comparison

C3	Full Bottleneck
65,920	180,608

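$\text{conv\_params} = \text{input\_ch} \times \text{output\_ch} \times \text{kernel\_w} \times \text{kernel\_h} + \text{bias}$

Here bias equals output_ch, since a convolution has one bias value per output channel.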
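The formula above translates directly into a small function. This is a minimal sketch: the name `conv_params` and the `bias` flag are chosen here for illustration, and it assumes an ordinary convolution (no channel groups) with one bias per output channel.

```python
def conv_params(in_ch, out_ch, kernel_w, kernel_h, bias=True):
    """Parameter count of a 2D convolution: weights plus one bias per output channel."""
    weights = in_ch * out_ch * kernel_w * kernel_h
    return weights + (out_ch if bias else 0)


# The four distinct layer shapes that appear in this post
print(conv_params(64, 64, 1, 1))    # 4160
print(conv_params(64, 64, 3, 3))    # 36928
print(conv_params(128, 128, 1, 1))  # 16512
print(conv_params(128, 128, 3, 3))  # 147584
```

The 3x3 layer on 128 channels alone (147,584) is more than twice the size of the entire C3 block.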

$\begin{aligned} \text{C3 parameters:} \quad 65{,}920 = {} & 3 \times (64 \times 64 \times 1 \times 1 + 64) \\ & + (64 \times 64 \times 3 \times 3 + 64) \\ & + (128 \times 128 \times 1 \times 1 + 128) \end{aligned}$

$\begin{aligned} \text{Full Bottleneck parameters:} \quad 180{,}608 = {} & 2 \times (128 \times 128 \times 1 \times 1 + 128) \\ & + (128 \times 128 \times 3 \times 3 + 128) \end{aligned}$
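The same sums can be written for a general channel width, which shows how the saving scales. This is a sketch derived only from the layer layout used in this post, assuming input and output both have $c$ channels, the split is into equal halves $h = c/2$, and the main kernel is 3x3; the symbols $c$, $h$, $P_{\text{C3}}$ and $P_{\text{Full}}$ are introduced here for illustration.

$\begin{aligned} P_{\text{C3}}(c) &= 3\,(h^2 + h) + (9h^2 + h) + (c^2 + c) = 12h^2 + 4h + c^2 + c = 4c^2 + 3c \\ P_{\text{Full}}(c) &= 2\,(c^2 + c) + (9c^2 + c) = 11c^2 + 3c \end{aligned}$

At $c = 128$ this gives $4 \times 16{,}384 + 384 = 65{,}920$ and $11 \times 16{,}384 + 384 = 180{,}608$, matching the table. For large $c$ the ratio approaches $4/11 \approx 0.36$, so under these assumptions the C3 layout needs roughly a third of the parameters.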

2. Code implementation

import torch
import torch.nn as nn


class BottleneckBlock(nn.Module):
    """1x1 conv followed by a k x k conv, wrapped in a residual connection."""

    def __init__(self, in_channels, out_channels, kernel_size, stride, padding):
        super().__init__()

        self.conv1 = nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=1, padding=0)
        self.conv2 = nn.Conv2d(in_channels=out_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding)

    def forward(self, x):
        y = self.conv1(x)
        y = self.conv2(y)
        # Residual connection; needs in_channels == out_channels and an unchanged spatial size.
        return x + y


class C3Block(nn.Module):
    """C3 block: split the channels in half and apply the Bottleneck to one half only."""

    def __init__(self, in_channels, out_channels, kernel_size, stride, padding):
        super().__init__()
        self.split_channels = in_channels // 2

        self.conv1_1 = nn.Conv2d(in_channels=self.split_channels, out_channels=self.split_channels, kernel_size=1, stride=1, padding=0)
        self.conv1_2 = nn.Conv2d(in_channels=self.split_channels, out_channels=self.split_channels, kernel_size=1, stride=1, padding=0)
        self.bottleneck = BottleneckBlock(in_channels=self.split_channels, out_channels=self.split_channels, kernel_size=kernel_size, stride=stride, padding=padding)
        self.conv2 = nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=1, padding=0)

    def forward(self, x):
        # Split the input channels into two equal groups.
        x1, x2 = torch.split(x, self.split_channels, dim=1)

        # Branch 1: 1x1 conv only.
        x1 = self.conv1_1(x1)

        # Branch 2: 1x1 conv, then the Bottleneck. The Bottleneck already adds
        # its own residual, so no second addition is needed here.
        x2 = self.conv1_2(x2)
        y2 = self.bottleneck(x2)

        # Merge the two branches and mix the channels with a final 1x1 conv.
        z = torch.cat([x1, y2], dim=1)
        z = self.conv2(z)
        return z


module = C3Block(in_channels=128, out_channels=128, kernel_size=3, stride=1, padding=1)

>
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 160, 160]           4,160
            Conv2d-2         [-1, 64, 160, 160]           4,160
            Conv2d-3         [-1, 64, 160, 160]           4,160
            Conv2d-4         [-1, 64, 160, 160]          36,928
   BottleneckBlock-5         [-1, 64, 160, 160]               0
            Conv2d-6        [-1, 128, 160, 160]          16,512
================================================================
Total params: 65,920

import torch.nn as nn


class FullBlock(nn.Module):
    """Full Bottleneck: 1x1 conv, k x k conv, 1x1 conv over all channels, plus a residual."""

    def __init__(self, in_channels, out_channels, kernel_size, stride, padding):
        super().__init__()

        self.conv1 = nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=1, padding=0)
        self.conv2 = nn.Conv2d(in_channels=out_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding)
        self.conv3 = nn.Conv2d(in_channels=out_channels, out_channels=out_channels, kernel_size=1, stride=1, padding=0)

    def forward(self, x):
        y = self.conv1(x)
        y = self.conv2(y)
        y = self.conv3(y)
        # Residual connection over the whole block.
        return x + y


module = FullBlock(in_channels=128, out_channels=128, kernel_size=3, stride=1, padding=1)

>
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1        [-1, 128, 160, 160]          16,512
            Conv2d-2        [-1, 128, 160, 160]         147,584
            Conv2d-3        [-1, 128, 160, 160]          16,512
================================================================
Total params: 180,608
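The totals in the two summaries can also be cross-checked without a summary tool by counting the elements of each layer's parameter tensors. The sketch below is one way to do that, assuming PyTorch is installed; `count_params` is a helper name chosen here, and the standalone `nn.Conv2d` layers simply mirror the layer shapes listed in the summaries.

```python
import torch.nn as nn


def count_params(module):
    """Number of trainable parameters (weights and biases) in a module."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


# Standalone layers with the same shapes as the rows of the summaries
conv_1x1_64 = nn.Conv2d(64, 64, kernel_size=1)
conv_3x3_64 = nn.Conv2d(64, 64, kernel_size=3, padding=1)
conv_1x1_128 = nn.Conv2d(128, 128, kernel_size=1)
conv_3x3_128 = nn.Conv2d(128, 128, kernel_size=3, padding=1)

# C3: three 1x1 convs on 64 channels, one 3x3 on 64, one 1x1 on 128
c3_total = (3 * count_params(conv_1x1_64)
            + count_params(conv_3x3_64)
            + count_params(conv_1x1_128))

# Full Bottleneck: two 1x1 convs and one 3x3 conv, all on 128 channels
full_total = 2 * count_params(conv_1x1_128) + count_params(conv_3x3_128)

print(c3_total, full_total)  # 65920 180608
```

Calling `count_params` on a whole block instance should give the same totals, since `parameters()` walks every submodule.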

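3. Visualization

Visualization of the Full Bottleneck (the standard Bottleneck structure)

Comparison of the convolutions in the Full Bottleneck (left) and C3 (right). The parameter counts are off by the number of bias parameters.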
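To make the size of that bias discrepancy concrete, the sketch below separates weights from biases for both blocks. It assumes the discrepancy arises because only convolution weights are counted in one of the two tallies, which is an interpretation of the note rather than something stated outright; `split_totals` and the layer tuples are illustrative names.

```python
def split_totals(layers):
    """Return (weights, biases) for a list of (in_ch, out_ch, kernel) conv layers."""
    weights = sum(i * o * k * k for i, o, k in layers)
    biases = sum(o for _, o, _ in layers)  # one bias per output channel
    return weights, biases


# Layer shapes taken from the parameter breakdown in section 1
c3_layers = [(64, 64, 1)] * 3 + [(64, 64, 3), (128, 128, 1)]
full_layers = [(128, 128, 1), (128, 128, 3), (128, 128, 1)]

c3_weights, c3_biases = split_totals(c3_layers)
full_weights, full_biases = split_totals(full_layers)

print(c3_weights, c3_biases)      # 65536 384
print(full_weights, full_biases)  # 180224 384
```

Both blocks carry 384 bias values, so a weights-only count would read 65,536 for C3 and 180,224 for the Full Bottleneck.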
