Trie(트라이): 검색어 자동완성은 어떻게 0.01초 만에 나올까?

1. 프롤로그 - "왜 배열로 자동완성을 만들면 이렇게 느릴까?"

스타트업에서 검색 기능을 만들어야 했다. 사용자가 검색창에 타이핑하는 순간마다 자동완성 추천어를 띄워주는 그 기능 말이다. 처음엔 단순하게 생각했다. "DB에 있는 단어들 전부 가져와서 startsWith()로 필터링하면 되겠지?" 그런데 막상 구현해보니 사용자가 키를 하나씩 누를 때마다 수천 개의 단어를 전부 순회하면서 앞글자를 비교했다. 단어가 10만 개만 넘어가도 체감할 수 있을 만큼 느려졌다.

// 초기 구현 - O(N × M) 시간 복잡도
function autocomplete(prefix, words) {
  return words.filter(word => word.startsWith(prefix));
}

이건 말이 안 된다고 생각했다. 구글은 수십억 개의 검색어 중에서도 찰나의 순간에 추천어를 보여주는데, 내가 만든 건 고작 10만 개로도 버벅인다니. 데이터 개수($N$)에 영향받지 않는 검색 방법이 필요했다. 그때 알게 된 게 Trie(트라이)였다.

처음엔 "트리(Tree)의 오타 아닌가?" 싶었다. 발음도 "트리"처럼 읽는 사람도 있고 "트라이"라고 읽는 사람도 있어서 혼란스러웠다. 나중에 알고 보니 Retrieval(검색)에서 가운데 부분을 따온 이름이었다. 발음은 원래 "트리"와 구분하기 위해 "트라이"로 읽는 게 표준이라고 한다. 이름부터 검색에 특화되어 있다는 걸 암시하는 자료구조였다.

2. 고민 - "트리인데 왜 이렇게 다를까? 구조가 헷갈려"

Trie를 처음 봤을 때 가장 혼란스러웠던 부분은 일반적인 이진 탐색 트리(BST)와 전혀 다르게 생겼다는 점이었다. BST는 노드 하나에 값이 있고, 왼쪽/오른쪽 자식이 있다. 그런데 Trie는 노드 자체에 '문자'를 저장하는 게 아니라, 간선(Edge)에 문자를 저장하는 것처럼 동작한다. 정확히는 부모에서 자식으로 가는 경로에 문자가 매핑되어 있다.

더 혼란스러웠던 건 각 노드가 자식을 26개(알파벳 개수)나 가질 수 있다는 점이었다. "이게 트리 맞나? 그래프 아닌가?" 싶었다. 결국 받아들였던 건, Trie는 문자열의 각 글자를 경로로 표현하는 트리라는 것이다.

루트 노드는 비어있고, 거기서부터 C로 가는 경로가 있으면 C에 해당하는 자식 노드로 이동한다. 그 다음 A로 가는 경로가 있으면 또 이동한다. 마지막 글자에 도달하면 "여기가 단어의 끝이다"라는 표시(isEndOfWord)를 해둔다. 이 표시가 없으면 그냥 다른 단어의 접두사일 뿐이다. 예를 들어 "CAR"와 "CARD"가 저장되어 있다면, C-A-R 경로까지는 공유하지만 R 노드에 isEndOfWord = True가 있어야 "CAR"도 단어로 인식된다.

처음엔 이 구조가 비효율적으로 느껴졌다. "그냥 해시맵에 단어 전체를 키로 넣으면 $O(1)$로 검색되는데, 왜 이렇게 복잡하게?" 그런데 곰곰이 생각해보니 해시맵은 정확히 일치하는 단어만 찾을 수 있다. "app"으로 시작하는 모든 단어를 찾으려면 해시맵의 모든 키를 순회해야 한다. 결국 $O(N)$이다. Trie는 이 접두사 검색(Prefix Search)을 단어 길이에만 비례하는 시간($O(L)$)으로 해결한다는 게 핵심이었다.

3. 깨달음 - "결국 이거였다 - 경로 자체가 키다"

BST는 노드의 값을 비교해서 왼쪽/오른쪽을 결정한다. Trie는 비교가 필요 없다. 문자 C를 찾으려면 현재 노드의 children['C']를 보면 된다. 있으면 가고, 없으면 끝이다. 경로를 따라가는 것 자체가 검색이다.

이 깨달음이 와닿았던 순간은 직접 손으로 "CAR", "CAT", "CARD", "DO"를 Trie에 넣어보면서였다.

시각적 단계별 삽입 과정

1단계: "CAR" 삽입

Root
 └─ C
     └─ A
         └─ R [끝]

2단계: "CAT" 삽입

Root
 └─ C
     └─ A
         ├─ R [끝]
         └─ T [끝]

C와 A는 이미 있으니 재사용하고, A에서 새로운 자식 T를 추가했다. 메모리를 절약하는 원리가 이해됐다.

3단계: "CARD" 삽입

Root
 └─ C
     └─ A
         ├─ R [끝]
         │   └─ D [끝]
         └─ T [끝]

C-A-R까지는 이미 있고, R에서 D를 추가한다. R 노드의 isEndOfWord는 그대로 True이므로 "CAR"와 "CARD" 둘 다 유효한 단어로 남는다.

4단계: "DO" 삽입

Root
 ├─ C
 │   └─ A
 │       ├─ R [끝]
 │       │   └─ D [끝]
 │       └─ T [끝]
 └─ D
     └─ O [끝]

루트에서 새로운 경로 D-O가 생겼다. 전혀 다른 접두사이므로 공유되지 않는다.

이 과정을 보고 나니 Trie가 공통 접두사를 최대한 공유하는 구조라는 게 명확히 정리됐다. "CAT"과 "CAR"은 "CA"까지 같으므로 그 부분만 한 번 저장하고, 이후부터 분기한다. 이게 메모리 효율성의 핵심이다(물론 나중에 설명하겠지만, 역설적으로 메모리 낭비도 심하다).

4. 깊이 파고들기 - 노드 구조와 연산 분석

4.1 노드의 내부 구조

Trie의 노드는 보통 두 가지 요소로 구성된다.

children: 다음 문자로 가는 포인터들의 집합
- 배열 방식: Node* children[26] (알파벳 소문자만 다룬다면)
- 해시맵 방식: unordered_map<char, Node*> children (메모리 절약, 약간 느림)
- 트레이드오프: 배열은 $O(1)$ 접근이지만 빈 공간 낭비, 해시맵은 실제 사용하는 것만 저장하지만 해시 연산 오버헤드
isEndOfWord: 이 노드에서 끝나는 단어가 있는지 표시하는 플래그
- 없으면 접두사만 존재하는 중간 노드

어떤 구현은 여기에 단어 빈도수(frequency), 사용자 클릭 수(weight) 등을 추가로 저장해서 자동완성 순위를 매기기도 한다.

4.2 Insert 연산 - 글자를 따라 노드를 만든다

Insert(word):
  1. current = root
  2. for each char in word:
       if char not in current.children:
         current.children[char] = new TrieNode()
       current = current.children[char]
  3. current.isEndOfWord = True

시간 복잡도: $O(L)$ (L은 단어 길이)

각 문자마다 해시맵 조회 또는 배열 인덱싱 $O(1)$
총 L번 반복

공간 복잡도: 최악 $O(L)$ (모든 글자가 신규 노드를 만들 때)

4.3 Search 연산 - 경로를 따라가서 끝 표시 확인

Search(word):
  1. current = root
  2. for each char in word:
       if char not in current.children:
         return False
       current = current.children[char]
  3. return current.isEndOfWord

시간 복잡도: $O(L)$

여기서 중요한 점은 3번째 단계다. 경로를 끝까지 따라갔더라도 isEndOfWord가 False면 그 단어는 저장되지 않은 것이다. 예를 들어 "CARD"만 넣고 "CAR"은 안 넣었다면, "CAR"을 검색할 때 C-A-R까지는 도달할 수 있지만 R 노드의 isEndOfWord가 False이므로 존재하지 않는다고 판단한다.

4.4 StartsWith 연산 - 자동완성의 핵심

StartsWith(prefix):
  1. current = root
  2. for each char in prefix:
       if char not in current.children:
         return False
       current = current.children[char]
  3. return True  # 끝까지 도달하기만 하면 OK

시간 복잡도: $O(L)$

Search와 거의 같지만, isEndOfWord 체크를 안 한다. 접두사까지만 일치하면 된다. 이 연산을 확장해서 "이 접두사로 시작하는 모든 단어 찾기"를 구현할 수 있다. 접두사 노드까지 이동한 후, 거기서부터 DFS/BFS로 isEndOfWord == True인 모든 노드를 찾으면 된다.

4.5 Delete 연산 - 조심스러운 역추적

삭제는 조금 까다롭다. 단순히 isEndOfWord만 False로 바꾸면 되는 경우도 있지만, 그 노드가 다른 단어의 중간 경로가 아니라면 메모리 누수를 막기 위해 노드 자체를 삭제해야 한다.

Delete(word):
  1. 재귀적으로 경로를 따라가면서 삭제 플래그를 세운다
  2. 되돌아오면서(backtrack) 각 노드를 검사:
     - 자식이 없고 끝단어도 아니면 → 부모에서 연결 끊고 삭제
     - 자식이 있거나 다른 단어의 끝이면 → 유지

시간 복잡도: $O(L)$

실제 구현에서는 재귀를 사용하거나, 역추적용 스택을 만들어서 처리한다. 삭제는 실제로 자주 쓰이진 않지만(검색 로그는 보통 삭제 안 함), 동적으로 사전을 관리하는 시스템에서는 필수다.

5. 시간 복잡도 심층 비교: Trie vs HashMap vs BST

연산	Trie	HashMap	Binary Search Tree
정확 검색	$O(L)$	$O(L)$ (해시 계산)	$O(L \log N)$ (문자열 비교 포함)
접두사 검색	$O(L)$	$O(N \cdot L)$ (전체 순회)	$O(L \log N + K)$ (K는 결과 개수)
삽입	$O(L)$	$O(L)$	$O(L \log N)$
공간 복잡도	$O(ALPHABET \cdot N \cdot L)$	$O(N \cdot L)$	$O(N \cdot L)$

핵심 차이점:

데이터 개수($N$)에 독립적: Trie의 모든 연산은 단어 개수와 무관하게 단어 길이($L$)에만 비례한다
접두사 검색 압도적 우위: HashMap은 접두사 검색이 불가능(전체 순회 필요)
메모리 대가: 알파벳 크기($ALPHABET$)만큼 배수로 공간을 소비

예를 들어 영어 소문자 26개를 배열로 관리하면 노드 하나당 26개의 포인터(보통 64비트 시스템에서 각각 8바이트 = 208바이트)가 필요하다. "HELLO" 하나 저장하려고 5개 노드 × 208바이트 = 1KB 이상을 쓴다. 실제로는 대부분이 null 포인터다.

6. 치명적 단점 - 공간 복잡도 폭발

Trie를 실제에 적용하려다가 제일 먼저 부딪히는 문제가 메모리 낭비다. 이론적으로는 공통 접두사를 공유해서 효율적일 것 같지만, 실제로는 각 노드가 가능한 모든 다음 문자를 위한 포인터 공간을 미리 확보해야 한다.

실제 메모리 계산 예시

알파벳 소문자만 사용하는 영어 단어 사전을 만든다고 가정:

노드 구조: Node* children[26] + bool isEndOfWord = 26 × 8바이트 + 1바이트 ≈ 209바이트
단어 100개, 평균 길이 5글자면 → 최악 500개 노드 × 209바이트 = 약 100KB
실제로는 공유로 줄지만, 각 노드마다 대부분 사용하지 않는 포인터가 20개 이상 null로 남는다

한글을 다루면 더 심각하다. 초성 19개 × 중성 21개 × 종성 28개 = 11,172개의 조합이 가능하므로, 배열로 관리하면 노드 하나가 수십 KB를 차지한다. 이 경우 해시맵을 쓰는 게 필수다.

해시맵 vs 배열 트레이드오프

# 배열 방식 - 빠르지만 메모리 낭비
class TrieNodeArray:
    def __init__(self):
        self.children = [None] * 26  # 26개 포인터 공간
        self.is_end = False

# 해시맵 방식 - 느리지만 실제 사용만 저장
class TrieNodeDict:
    def __init__(self):
        self.children = {}  # 동적으로 필요할 때만 추가
        self.is_end = False

프로덕션에서는 대부분 해시맵 방식을 쓴다. 배열의 속도 이점($O(1)$ vs 해시맵의 평균 $O(1)$, 최악 $O(k)$, k는 키 길이)은 미미하지만, 메모리 차이는 수십 배에 달한다.

7. 최적화 - Radix Tree (Patricia Trie) - 경로 압축

메모리 낭비를 줄이기 위한 가장 유명한 최적화가 Radix Tree다. 다른 이름으로 Patricia Trie(Practical Algorithm To Retrieve Information Coded In Alphanumeric)라고도 한다.

핵심 아이디어 - 자식이 하나뿐인 체인 압축

일반 Trie에서는 "ROMANTIC"을 저장하면:

R -> O -> M -> A -> N -> T -> I -> C [끝]

8개의 노드가 필요하다. 각 노드가 209바이트면 약 1.6KB다.

Radix Tree에서는:

ROMANTIC [끝]

하나의 노드에 문자열 전체를 저장한다. 만약 나중에 "ROME"이 추가되면:

ROM
 ├─ ANTIC [끝]
 └─ E [끝]

공통 부분 "ROM"까지는 하나의 노드에 저장하고, 거기서 분기한다. 이렇게 하면 노드 개수가 대폭 줄어든다.

언제 Radix Tree를 쓸까?

긴 고유 문자열이 많을 때: URL 라우팅, 파일 경로 등
- /api/users/profile/settings/privacy 같은 경로는 중간에 분기가 거의 없음
메모리가 제한적인 환경: 임베디드 시스템, 네트워크 라우터
삽입/삭제가 드물고 검색이 많을 때: 한 번 구축하면 읽기 전용으로 쓰는 경우

실제 사례:

Linux Kernel: Radix Tree로 메모리 페이지 관리 (v4.20부터 XArray로 대체)
Ethereum: Merkle Patricia Trie로 블록체인 상태 저장
Git: Packfile index에서 객체 검색에 유사한 구조 사용
Nginx: HTTP 요청 라우팅에 Radix Tree 사용

8. 실제 응용 - 자동완성 시스템 구현

이론은 이해했으니, 실제로 동작하는 자동완성을 만들어보자.

8.1 기본 Trie 구현 (Python)

class TrieNode:
    def __init__(self):
        self.children = {}  # 해시맵으로 메모리 절약
        self.is_end_of_word = False
        self.word = None  # 자동완성용: 완성된 단어 저장

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """단어 삽입 - O(L)"""
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end_of_word = True
        node.word = word  # 나중에 수집할 때 편리

    def search(self, word):
        """정확한 단어 검색 - O(L)"""
        node = self.root
        for char in word:
            if char not in node.children:
                return False
            node = node.children[char]
        return node.is_end_of_word

    def starts_with(self, prefix):
        """접두사 존재 여부 - O(L)"""
        node = self.root
        for char in prefix:
            if char not in node.children:
                return False
            node = node.children[char]
        return True

8.2 자동완성 기능 - 접두사로 시작하는 모든 단어 찾기

    def autocomplete(self, prefix, limit=10):
        """
        접두사로 시작하는 단어 최대 limit개 반환
        시간 복잡도: O(L + K × M)
        - L: 접두사 길이 (접두사 노드까지 이동)
        - K: 결과 개수
        - M: 평균 단어 길이 (DFS 탐색)
        """
        node = self.root

        # 1단계: 접두사 끝까지 이동
        for char in prefix:
            if char not in node.children:
                return []  # 접두사 자체가 없음
            node = node.children[char]

        # 2단계: 그 노드부터 DFS로 모든 단어 수집
        results = []
        self._dfs_collect(node, results, limit)
        return results

    def _dfs_collect(self, node, results, limit):
        """DFS로 끝단어 노드 찾기"""
        if len(results) >= limit:
            return  # 충분히 모았으면 조기 종료

        if node.is_end_of_word:
            results.append(node.word)

        # 자식들을 알파벳 순서로 탐색 (정렬된 결과)
        for char in sorted(node.children.keys()):
            self._dfs_collect(node.children[char], results, limit)

8.3 삭제 연산 (재귀 버전)

    def delete(self, word):
        """
        단어 삭제 - O(L)
        빈 노드는 역추적하면서 정리
        """
        def _delete_recursive(node, word, index):
            if index == len(word):
                # 끝에 도달 - 단어 표시 제거
                if not node.is_end_of_word:
                    return False  # 애초에 없던 단어
                node.is_end_of_word = False
                node.word = None
                # 자식이 없으면 이 노드 삭제 가능
                return len(node.children) == 0

            char = word[index]
            if char not in node.children:
                return False  # 경로가 없음

            child = node.children[char]
            should_delete_child = _delete_recursive(child, word, index + 1)

            if should_delete_child:
                # 자식 노드를 삭제해야 함
                del node.children[char]
                # 현재 노드도 삭제 가능한지 확인
                return len(node.children) == 0 and not node.is_end_of_word

            return False

        _delete_recursive(self.root, word, 0)

8.4 실제 사용 예시

# 사전 구축
trie = Trie()
words = ["apple", "app", "application", "apply", "apricot",
         "banana", "band", "bandana", "can", "cat", "car"]

for word in words:
    trie.insert(word)

# 자동완성 테스트
print(trie.autocomplete("app"))
# 출력: ['app', 'apple', 'application', 'apply']

print(trie.autocomplete("ban"))
# 출력: ['banana', 'band', 'bandana']

print(trie.autocomplete("ca"))
# 출력: ['can', 'car', 'cat']

# 정확 검색
print(trie.search("app"))      # True
print(trie.search("appl"))     # False (끝표시 없음)
print(trie.search("application"))  # True

# 삭제 후 재검색
trie.delete("app")
print(trie.search("app"))      # False
print(trie.autocomplete("app"))  # ['apple', 'application', 'apply']
# "app"은 사라졌지만 경로는 남아있어서 다른 단어들은 유지됨

9. 고급 응용 - 실제로 만나는 Trie들

9.1 IP 라우팅: Longest Prefix Match

라우터는 IP 주소의 네트워크 부분을 보고 다음 홉(hop)을 결정한다. 예를 들어:

192.168.1.0/24 → Gateway A
192.168.0.0/16 → Gateway B
0.0.0.0/0 (default) → Gateway C

IP 192.168.1.50이 들어오면 가장 긴 일치 접두사인 /24를 찾아야 한다. 이를 Longest Prefix Match라고 하는데, Trie(특히 Radix Tree)로 구현하면 효율적이다.

Binary Trie for IP routing:
Root
 └─ 1 (first bit of 192 = 11000000)
     └─ 1
         └─ 0
            ... (32bit depth)

실제로는 비트 단위가 아닌 8비트(옥텟) 단위로 압축해서 Radix Tree를 만든다.

9.2 T9 키보드 - 숫자로 문자 입력

옛날 피처폰의 T9(Text on 9 keys) 입력 방식도 Trie를 활용했다.

2번: ABC
3번: DEF
...
9번: WXYZ

사용자가 4663을 누르면 GOOD, HOME, GONE 등 여러 단어가 가능하다. Trie에 각 단어를 숫자 시퀀스로 변환해서 저장하고, 빈도수가 높은 순으로 추천했다.

9.3 스펠 체커 - Levenshtein Distance와 Fuzzy Search

오타를 교정하려면 입력된 단어와 "비슷한" 단어를 찾아야 한다. Trie를 순회하면서 편집 거리(Levenshtein Distance)가 1~2 이내인 단어를 찾는다.

def fuzzy_search(trie_node, word, max_distance):
    """
    편집 거리가 max_distance 이내인 단어 찾기
    동적 프로그래밍 + Trie 순회
    """
    # DP 배열로 편집 거리 계산하면서 Trie 탐색
    # 구현 복잡도 높음 - 라이브러리 사용 권장 (SymSpell, BK-tree 등)

9.4 Ternary Search Tree: 메모리와 속도의 절충안

Trie와 BST의 중간 형태다. 각 노드가 3개의 자식(왼쪽, 중간, 오른쪽)을 가진다.

왼쪽: 현재 문자보다 작은 문자
중간: 현재 문자와 일치 (다음 글자로 진행)
오른쪽: 현재 문자보다 큰 문자

메모리는 Trie보다 적게 쓰지만, 검색 속도는 $O(L \log \Sigma)$로 약간 느리다. 실제로는 잘 안 쓰지만, 임베디드 환경에서 고려 대상이다.

9.5 Double Array Trie: 일본에서 유명한 극한 최적화

일본어 형태소 분석기 MeCab, Kuromoji 등에서 사용하는 기법이다. 두 개의 배열(BASE, CHECK)로 Trie를 표현해서 메모리를 극도로 절약한다. 구현이 매우 복잡하지만, 읽기 전용 대용량 사전에서는 최고 성능을 낸다.

10. 적용 시 체크리스트

Trie를 프로덕션에 도입하기 전에 확인할 것들:

10.1 정말 Trie가 필요한가?

접두사 검색이 핵심 기능인가? → Yes면 Trie 검토
정확 매칭만 필요한가? → HashMap이 더 나음
범위 검색(range query)이 필요한가? → B-Tree 또는 Skip List 검토
데이터가 자주 바뀌는가? → Trie는 삽입/삭제가 비교적 비쌈

10.2 메모리 예산 확인

단어 개수 × 평균 길이 × 알파벳 크기 × 포인터 크기 대략 계산
서버 RAM의 몇 %를 Trie에 할당할 수 있는지 확인
메모리 부족 시 Radix Tree 또는 외부 저장소(Redis) 활용

10.3 대안 기술 비교

기술	장점	단점	사용 사례
Trie (In-memory)	빠른 접두사 검색	메모리 多	소규모 자동완성
Elasticsearch	분산, 전문 검색	운영 복잡도	대규모 검색 엔진
Redis Sorted Set	간단, 스코어링	접두사 검색 비효율	랭킹 기반 추천
PostgreSQL LIKE	별도 구축 불필요	느림 (인덱스 미사용 시)	소규모 DB 검색
SQLite FTS5	경량, 임베디드	고급 기능 부족	모바일 앱 검색

11. 정리 - Trie를 선택하는 기준

결국 받아들인 건, Trie는 접두사 검색이라는 특정 문제에 최적화된 전문 도구라는 점이다. 만능은 아니다. 하지만 자동완성, 사전 검색, IP 라우팅처럼 "앞부분이 일치하는 것을 빠르게 찾아야 하는" 상황에서는 대체 불가능한 성능을 낸다.

메모리 문제는 해시맵 기반 노드, Radix Tree, 외부 저장소 등으로 완화할 수 있다. 중요한 건 문제의 특성을 정확히 파악하고, Trie의 강점과 약점을 이해한 상태에서 선택하는 것이다.

구글 검색창에 타이핑할 때마다 나타나는 그 추천어들. 이제는 그 뒤에서 Trie가 $O(L)$ 시간에 수백만 개의 단어를 뒤지고 있다는 걸 안다. 그게 와닿았던 순간, Trie는 단순한 자료구조가 아니라 실시간 사용자 경험을 만드는 기술로 다가왔다.

12. 용어 사전 (Glossary)

Trie: Retrieval(검색)에서 유래한 문자열 검색 특화 트리 자료구조. 발음은 "트라이".
Prefix Tree: Trie의 별칭. 공통 접두사를 공유하는 트리 구조.
Edge Label: 부모에서 자식으로 가는 간선에 할당된 문자. Trie의 핵심 개념.
isEndOfWord: 해당 노드에서 끝나는 단어가 있는지 표시하는 플래그. False면 중간 경로일 뿐.
Radix Tree (Patricia Trie): 자식이 하나뿐인 체인을 압축해서 메모리를 절약한 Trie 변형.
Longest Prefix Match: IP 라우팅에서 사용하는 기법. 가장 긴 일치 접두사를 찾는다.
Ternary Search Tree: 각 노드가 3개 자식(작음/같음/큼)을 가지는 Trie와 BST의 하이브리드.
Double Array Trie: 두 개의 배열로 Trie를 표현하는 일본 기원 최적화 기법. 극한의 메모리 효율.
Suffix Trie: 문자열의 모든 접미사를 Trie에 넣은 것. 패턴 매칭에 사용.
Fuzzy Search: 편집 거리(Levenshtein Distance) 내에서 유사한 문자열을 찾는 기법.

13. FAQ & Common Questions

Q1. 해시 테이블로도 검색하면 $O(1)$인데 왜 Trie를 써야 하나요?

해시 테이블은 정확히 일치하는 키만 찾을 수 있다. "app"으로 시작하는 모든 단어를 찾으려면 전체 키를 순회해야 하므로 $O(N)$이다. Trie는 접두사 검색이 $O(L)$로 가능하다. 또한 해시 충돌 공격에 취약하지만, Trie는 결정론적 성능을 보장한다.

Q2. Trie와 Suffix Tree의 차이는 무엇인가요?

Trie는 단어의 시작 부분(접두사)을 공유한다. Suffix Tree는 문자열의 모든 접미사를 저장해서 문자열 내 패턴 매칭에 사용한다. 예를 들어 "BANANA"의 Suffix Tree는 "BANANA", "ANANA", "NANA", "ANA", "NA", "A"를 모두 저장한다. 용도가 완전히 다르다.

Q3. 한글이나 이모지는 어떻게 처리하나요?

유니코드 코드포인트: 문자를 유니코드 값으로 변환해서 키로 사용. 해시맵 방식 필수.
자모 분리 (한글): "한글" → "ㅎ ㅏ ㄴ ㄱ ㅡ ㄹ"로 분해해서 저장. 검색 정밀도는 높지만 깊이가 깊어짐.
Normalization: 이모지는 여러 코드포인트 조합일 수 있으므로 정규화 필요 (NFD, NFC 등).

Q4. 실제로 Trie를 직접 구현해서 쓰나요?

대부분은 검색 엔진(Elasticsearch, Solr)이나 Redis의 Sorted Set, PostgreSQL의 Full-Text Search를 활용한다. 하지만 네트워크 장비(라우터), 임베디드 시스템, 게임 엔진의 명령어 파서 등에서는 직접 구현한 최적화된 Trie를 쓴다. 특히 메모리와 레이턴시가 중요한 환경에서는 Double Array Trie 같은 고급 기법을 사용한다.

Q5. Trie 노드를 배열로 만들지 해시맵으로 만들지 어떻게 결정하나요?

배열: 알파벳 개수가 작고(26개 등), 대부분의 문자가 실제 사용될 때. 속도 최우선.
해시맵: 알파벳이 크거나(유니코드), sparse한 데이터일 때. 메모리 최우선.
하이브리드: 상위 레벨은 배열(흔한 문자), 하위 레벨은 해시맵(드문 문자).

Q6. Trie의 공간 복잡도를 줄이는 다른 방법은?

Compressed Trie (Radix Tree): 설명한 경로 압축
Bitwise Trie: 비트 단위로 분기해서 깊이를 늘리고 분기 수를 줄임 (IP 라우팅)
Level Compression: 여러 레벨을 하나로 합침 (4-way Trie 등)
Cache-Oblivious Trie: 캐시 효율을 고려한 메모리 배치
Lazy Deletion: 삭제 시 노드를 실제로 지우지 않고 플래그만 변경 (메모리 파편화 감수)

Trie: The Secret Behind Instant Autocomplete

1. Prologue: "Why is my array-based search so slow?"

I was building a search feature for a startup. Users type in a search box, and we show autocomplete suggestions in real-time. My first attempt was embarrassingly simple: "Just fetch all words from the database and filter with startsWith(), right?" It worked on my laptop with a few hundred test words. But when we loaded 100,000 real product names, every keystroke triggered a full scan through all entries. The delay was noticeable, and users complained.

I looked at how Google handles billions of search queries and still suggests completions in milliseconds. There had to be a data structure where lookup time depends only on the word length, not the dictionary size. That's when I discovered the Trie (pronounced "try", derived from re-trie-val). The name itself hints at its purpose: retrieval operations.

At first, I thought it was just a typo of "Tree." Even the pronunciation was confusing—some people say "tree," others say "try." Later, I learned the official pronunciation is "try" to distinguish it from binary trees. The structure was specifically designed for string searches, and its name tells the story.

2. Initial Confusion: "It's a tree, but why does it look so different?"

When I first saw a Trie diagram, I was confused. Binary Search Trees (BST) store values in nodes and have left/right children. Tries don't store characters in nodes themselves—instead, edges (or the path from parent to child) represent characters. Or more accurately, each node has a mapping of characters to child nodes.

What threw me off was that each node could have 26 children (for lowercase English letters). "Is this even a tree? Looks more like a graph," I thought. Eventually, I accepted that a Trie is a tree where each character in a string corresponds to a path from the root.

The root node is empty. From there, if you have a path labeled C, you move to the C child node. Then if there's a path labeled A, you move to the A child. When you reach the last character, you mark that node as "end of word" (isEndOfWord). Without this marker, it's just a prefix of another word. For example, if "CAR" and "CARD" are stored, the path C-A-R is shared, but the R node must have isEndOfWord = True for "CAR" to be recognized as a complete word.

Initially, this seemed inefficient. "Why not just use a HashMap where the whole word is the key? That's $O(1)$ lookup!" But then I realized: HashMaps only work for exact matches. To find all words starting with "app", you'd have to iterate through all keys—back to $O(N)$. Tries solve this prefix search problem in time proportional only to the word length ($O(L)$), which was the breakthrough I needed.

3. The Aha Moment: "The path itself is the key"

BST nodes compare values to decide left or right. Tries don't need comparison. To find character C, you just check node.children['C']. If it exists, move to it. If not, the word doesn't exist. The act of following a path is the search.

This clicked for me when I manually inserted "CAR", "CAT", "CARD", and "DO" into a Trie on paper.

Step-by-Step Visual Insertion

Step 1: Insert "CAR"

Root
 └─ C
     └─ A
         └─ R [end]

Step 2: Insert "CAT"

Root
 └─ C
     └─ A
         ├─ R [end]
         └─ T [end]

The C and A nodes already exist, so we reuse them. We add a new child T under A. This is how memory is saved—common prefixes are shared.

Step 3: Insert "CARD"

Root
 └─ C
     └─ A
         ├─ R [end]
         │   └─ D [end]
         └─ T [end]

The path C-A-R already exists. We add D as a child of R. The R node's isEndOfWord remains True, so both "CAR" and "CARD" are valid words.

Step 4: Insert "DO"

Root
 ├─ C
 │   └─ A
 │       ├─ R [end]
 │       │   └─ D [end]
 │       └─ T [end]
 └─ D
     └─ O [end]

A completely new path D-O is created from the root. No prefix is shared with the C branch.

This visualization made it clear: Tries maximize prefix sharing. "CAT" and "CAR" share "CA", so that part is stored once, and the structure branches after. This is the key to memory efficiency—though paradoxically, Tries can also waste massive amounts of memory (more on that later).

4. Deep Dive: Node Structure and Operation Analysis

4.1 Internal Node Structure

A Trie node typically consists of two main components:

children: A collection of pointers to child nodes
- Array approach: Node* children[26] (for lowercase English only)
- HashMap approach: unordered_map<char, Node*> children (saves memory, slightly slower)
- Trade-off: Arrays give $O(1)$ access but waste space on null pointers; HashMaps only store what's used but add hashing overhead
isEndOfWord: A boolean flag indicating if a valid word ends at this node
- Without this, the node is just part of a prefix path

Some implementations add word frequency, user click weight, or other metadata for autocomplete ranking.

4.2 Insert Operation: Follow the path, create nodes as needed

Insert(word):
  1. current = root
  2. for each char in word:
       if char not in current.children:
         current.children[char] = new TrieNode()
       current = current.children[char]
  3. current.isEndOfWord = True

Time Complexity: $O(L)$ (where L = word length)

Each character: HashMap lookup or array indexing in $O(1)$
Repeated L times

Space Complexity: Worst case $O(L)$ (when all characters create new nodes)

4.3 Search Operation: Follow the path, check end marker

Search(word):
  1. current = root
  2. for each char in word:
       if char not in current.children:
         return False
       current = current.children[char]
  3. return current.isEndOfWord

Time Complexity: $O(L)$

Step 3 is crucial. Even if you successfully traverse the entire path, if isEndOfWord is False, the word doesn't exist in the Trie. For example, if only "CARD" is inserted (not "CAR"), searching for "CAR" would reach the R node but return False because R.isEndOfWord is False.

4.4 StartsWith Operation: Core of Autocomplete

StartsWith(prefix):
  1. current = root
  2. for each char in prefix:
       if char not in current.children:
         return False
       current = current.children[char]
  3. return True  # Successfully reached prefix end

Time Complexity: $O(L)$

Almost identical to Search, but doesn't check isEndOfWord. We only care if the prefix path exists. This can be extended to "find all words starting with this prefix" by performing DFS/BFS from the prefix node to collect all nodes where isEndOfWord == True.

4.5 Delete Operation: Careful Backtracking

Deletion is trickier. Simply setting isEndOfWord = False works if the node is part of another word's path, but if it's a dead-end branch, we should remove the node to prevent memory leaks.

Delete(word):
  1. Recursively traverse to the end of the word
  2. On backtrack, check each node:
     - If it has no children and isn't end-of-word → delete it
     - If it has children or marks another word's end → keep it

Time Complexity: $O(L)$

Implementations use recursion or explicit stacks for backtracking. Deletion isn't common in read-heavy systems (search logs are rarely deleted), but it's essential for dynamic dictionaries.

5. Time Complexity Deep Comparison: Trie vs HashMap vs BST

Operation	Trie	HashMap	Binary Search Tree
Exact Search	$O(L)$	$O(L)$ (hashing)	$O(L \log N)$ (string comparison included)
Prefix Search	$O(L)$	$O(N \cdot L)$ (full scan)	$O(L \log N + K)$ (K = result count)
Insert	$O(L)$	$O(L)$	$O(L \log N)$
Space Complexity	$O(ALPHABET \cdot N \cdot L)$	$O(N \cdot L)$	$O(N \cdot L)$

Key Differences:

Independent of N: All Trie operations depend only on word length ($L$), not dictionary size ($N$)
Prefix Search Dominance: HashMap cannot do prefix search efficiently (requires full iteration)
Memory Cost: Space consumption multiplied by alphabet size ($ALPHABET$)

For example, with 26 lowercase letters stored in arrays, each node needs 26 pointers (typically 8 bytes each on 64-bit systems = 208 bytes). Storing "HELLO" requires 5 nodes × 208 bytes = over 1KB, most of which are null pointers.

6. The Fatal Flaw: Space Complexity Explosion

The first wall I hit when trying to use Tries in production was memory waste. Theoretically, sharing common prefixes should be efficient, but in practice, each node must pre-allocate space for all possible next characters.

Real Memory Calculation Example

Suppose we build an English dictionary (lowercase letters only):

Node structure: Node* children[26] + bool isEndOfWord = 26 × 8 bytes + 1 byte ≈ 209 bytes
100 words, average length 5 → worst case 500 nodes × 209 bytes ≈ 100KB
Even with prefix sharing reducing node count, each node has 20+ unused null pointers

For languages like Korean, it's worse. Korean has 19 initial consonants × 21 medial vowels × 28 final consonants = 11,172 possible syllables. An array-based node would be tens of KB. HashMaps become mandatory.

HashMap vs Array Trade-off

# Array approach - fast but memory-hungry
class TrieNodeArray:
    def __init__(self):
        self.children = [None] * 26  # 26 pointer slots
        self.is_end = False

# HashMap approach - slower but memory-efficient
class TrieNodeDict:
    def __init__(self):
        self.children = {}  # Only stores what's used
        self.is_end = False

Production systems almost always use HashMaps. The speed difference ($O(1)$ array access vs HashMap's average $O(1)$, worst $O(k)$ where k is key length) is negligible, but memory savings can be 10-100x.

7. Optimization: Radix Tree (Patricia Trie) - Path Compression

The most famous optimization to reduce memory waste is the Radix Tree, also known as Patricia Trie (Practical Algorithm To Retrieve Information Coded In Alphanumeric).

Core Idea: Compress Single-Child Chains

In a standard Trie, storing "ROMANTIC" requires:

R -> O -> M -> A -> N -> T -> I -> C [end]

8 nodes. At 209 bytes each, that's about 1.6KB.

In a Radix Tree:

ROMANTIC [end]

A single node stores the entire string. If "ROME" is later added:

ROM
 ├─ ANTIC [end]
 └─ E [end]

The common portion "ROM" is stored in one node, then branches. This drastically reduces node count.

When to Use Radix Trees

Long unique strings: URL routing, file paths
- Paths like /api/users/profile/settings/privacy rarely branch in the middle
Memory-constrained environments: Embedded systems, network routers
Read-heavy workloads: Once built, used as read-only

Real-World Examples:

Linux Kernel: Used Radix Trees for memory page management (replaced by XArray in v4.20+)
Ethereum: Merkle Patricia Trie for blockchain state storage
Git: Similar structures in packfile indexes for object lookup
Nginx: Radix Trees for HTTP request routing

8. Practical Application: Building an Autocomplete System

Theory is understood—now let's build a working autocomplete.

8.1 Basic Trie Implementation (Python)

class TrieNode:
    def __init__(self):
        self.children = {}  # HashMap for memory efficiency
        self.is_end_of_word = False
        self.word = None  # Store complete word for autocomplete

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """Insert word - O(L)"""
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end_of_word = True
        node.word = word  # Easy to retrieve later

    def search(self, word):
        """Exact word search - O(L)"""
        node = self.root
        for char in word:
            if char not in node.children:
                return False
            node = node.children[char]
        return node.is_end_of_word

    def starts_with(self, prefix):
        """Check if prefix exists - O(L)"""
        node = self.root
        for char in prefix:
            if char not in node.children:
                return False
            node = node.children[char]
        return True

8.2 Autocomplete Feature: Find All Words with Prefix

    def autocomplete(self, prefix, limit=10):
        """
        Return up to 'limit' words starting with prefix
        Time Complexity: O(L + K × M)
        - L: prefix length (navigating to prefix node)
        - K: number of results
        - M: average word length (DFS traversal)
        """
        node = self.root

        # Step 1: Navigate to prefix end
        for char in prefix:
            if char not in node.children:
                return []  # Prefix doesn't exist
            node = node.children[char]

        # Step 2: DFS to collect all words from this node
        results = []
        self._dfs_collect(node, results, limit)
        return results

    def _dfs_collect(self, node, results, limit):
        """DFS to find all end-of-word nodes"""
        if len(results) >= limit:
            return  # Early termination

        if node.is_end_of_word:
            results.append(node.word)

        # Traverse children in sorted order (alphabetical results)
        for char in sorted(node.children.keys()):
            self._dfs_collect(node.children[char], results, limit)

8.3 Delete Operation (Recursive Version)

    def delete(self, word):
        """
        Delete word - O(L)
        Clean up empty nodes during backtracking
        """
        def _delete_recursive(node, word, index):
            if index == len(word):
                # Reached end - remove word marker
                if not node.is_end_of_word:
                    return False  # Word didn't exist
                node.is_end_of_word = False
                node.word = None
                # Can delete node if no children
                return len(node.children) == 0

            char = word[index]
            if char not in node.children:
                return False  # Path doesn't exist

            child = node.children[char]
            should_delete_child = _delete_recursive(child, word, index + 1)

            if should_delete_child:
                # Delete child node
                del node.children[char]
                # Check if current node can be deleted
                return len(node.children) == 0 and not node.is_end_of_word

            return False

        _delete_recursive(self.root, word, 0)

8.4 Real-World Usage Example

# Build dictionary
trie = Trie()
words = ["apple", "app", "application", "apply", "apricot",
         "banana", "band", "bandana", "can", "cat", "car"]

for word in words:
    trie.insert(word)

# Autocomplete tests
print(trie.autocomplete("app"))
# Output: ['app', 'apple', 'application', 'apply']

print(trie.autocomplete("ban"))
# Output: ['banana', 'band', 'bandana']

print(trie.autocomplete("ca"))
# Output: ['can', 'car', 'cat']

# Exact search
print(trie.search("app"))      # True
print(trie.search("appl"))     # False (no end marker)
print(trie.search("application"))  # True

# Delete and re-search
trie.delete("app")
print(trie.search("app"))      # False
print(trie.autocomplete("app"))  # ['apple', 'application', 'apply']
# "app" is gone, but the path remains for other words

9. Advanced Applications: Tries in Production

9.1 IP Routing: Longest Prefix Match

Routers determine the next hop by matching the network portion of IP addresses. For example:

192.168.1.0/24 → Gateway A
192.168.0.0/16 → Gateway B
0.0.0.0/0 (default) → Gateway C

For IP 192.168.1.50, the router finds the longest matching prefix (/24). This Longest Prefix Match is efficiently implemented with Tries (especially Radix Trees).

Binary Trie for IP routing:
Root
 └─ 1 (first bit of 192 = 11000000)
     └─ 1
         └─ 0
            ... (32-bit depth)

In practice, routers use octet-level (8-bit) compressed Radix Trees.

9.2 T9 Keyboard: Number-to-Text Input

Old feature phones used T9 (Text on 9 keys):

2: ABC
3: DEF
...
9: WXYZ

Typing 4663 could mean "GOOD", "HOME", "GONE", etc. The Trie stores words converted to digit sequences, ranked by frequency.

9.3 Spell Checkers: Levenshtein Distance and Fuzzy Search

To correct typos, we need to find words "similar" to the input. Traverse the Trie while computing edit distance (Levenshtein Distance), collecting words within distance 1-2.

def fuzzy_search(trie_node, word, max_distance):
    """
    Find words within max_distance edit distance
    Uses Dynamic Programming + Trie traversal
    """
    # DP array tracks edit distance while traversing
    # Complex implementation - libraries like SymSpell or BK-tree recommended

9.4 Ternary Search Tree: Middle Ground

A hybrid of Tries and BSTs. Each node has 3 children:

Left: characters less than current
Middle: characters equal to current (proceed to next letter)
Right: characters greater than current

Uses less memory than Tries but search is $O(L \log \Sigma)$ instead of $O(L)$. Rarely used in practice but considered for embedded systems.

9.5 Double Array Trie: Japanese Extreme Optimization

Used in Japanese morphological analyzers like MeCab and Kuromoji. Represents a Trie using two arrays (BASE and CHECK) for extreme memory efficiency. Implementation is very complex, but achieves best-in-class performance for read-only large dictionaries.

10. Production Readiness Checklist

Before deploying Tries to production:

10.1 Do You Really Need a Trie?

Is prefix search core functionality? → Consider Trie
Only exact matching needed? → HashMap is better
Need range queries? → Consider B-Tree or Skip List
Frequent updates? → Tries have relatively expensive insert/delete

10.2 Memory Budget Verification

Estimate: word count × avg length × alphabet size × pointer size
What % of server RAM can be allocated to Trie?
If insufficient, consider Radix Tree or external storage (Redis)

10.3 Alternative Technology Comparison

Technology	Advantages	Disadvantages	Use Cases
Trie (In-memory)	Fast prefix search	High memory	Small autocomplete
Elasticsearch	Distributed, full-text	Operational complexity	Large search engines
Redis Sorted Set	Simple, scoring	Inefficient prefix search	Ranking-based suggestions
PostgreSQL LIKE	No extra setup	Slow (without index)	Small DB queries
SQLite FTS5	Lightweight, embedded	Limited advanced features	Mobile app search

11. Summary: When to Choose Tries

What I ultimately accepted: Tries are specialized tools optimized for the specific problem of prefix search. They're not a silver bullet. But for autocomplete, dictionary lookup, and IP routing—anywhere you need to "quickly find things matching the beginning"—they deliver irreplaceable performance.

Memory issues can be mitigated with HashMap-based nodes, Radix Trees, or external storage. The key is accurately understanding the problem characteristics and making an informed choice based on Tries' strengths and weaknesses.

Every time I type in Google's search box and suggestions appear instantly, I now know there's a Trie behind the scenes scanning millions of words in $O(L)$ time. When that clicked, Tries became more than just a data structure—they became a core technology enabling real-time user experiences.

12. Glossary

Trie: String search-specialized tree structure derived from re-trie-val. Pronounced "try".
Prefix Tree: Alternative name for Trie. Tree structure sharing common prefixes.
Edge Label: Character assigned to the edge from parent to child. Core Trie concept.
isEndOfWord: Flag indicating if a valid word ends at this node. False means it's just a path.
Radix Tree (Patricia Trie): Trie variant that compresses single-child chains to save memory.
Longest Prefix Match: Technique used in IP routing to find the longest matching prefix.
Ternary Search Tree: Hybrid of Trie and BST where each node has 3 children (less/equal/greater).
Double Array Trie: Japanese-origin optimization representing Tries with two arrays. Extreme memory efficiency.
Suffix Trie: Trie containing all suffixes of a string. Used for pattern matching.
Fuzzy Search: Finding similar strings within edit distance (Levenshtein Distance).

Trie(트라이): 검색어 자동완성은 어떻게 0.01초 만에 나올까?

관련 포스트

BFS와 DFS: 그래프 탐색의 두 축

퀵 정렬(Quick Sort): 분할 정복의 대표

이진 탐색 트리(BST): 데이터 검색의 기초와 자가 균형의 미학

CPU 스케줄링: 운영체제는 어떻게 공평함을 연기하는가? (FCFS부터 Linux CFS까지)

1. 프롤로그 - "왜 배열로 자동완성을 만들면 이렇게 느릴까?"

2. 고민 - "트리인데 왜 이렇게 다를까? 구조가 헷갈려"

3. 깨달음 - "결국 이거였다 - 경로 자체가 키다"

시각적 단계별 삽입 과정

4. 깊이 파고들기 - 노드 구조와 연산 분석

4.1 노드의 내부 구조

4.2 Insert 연산 - 글자를 따라 노드를 만든다

4.3 Search 연산 - 경로를 따라가서 끝 표시 확인

4.4 StartsWith 연산 - 자동완성의 핵심

4.5 Delete 연산 - 조심스러운 역추적

5. 시간 복잡도 심층 비교: Trie vs HashMap vs BST

6. 치명적 단점 - 공간 복잡도 폭발

실제 메모리 계산 예시

해시맵 vs 배열 트레이드오프

7. 최적화 - Radix Tree (Patricia Trie) - 경로 압축

핵심 아이디어 - 자식이 하나뿐인 체인 압축

언제 Radix Tree를 쓸까?

8. 실제 응용 - 자동완성 시스템 구현

8.1 기본 Trie 구현 (Python)

8.2 자동완성 기능 - 접두사로 시작하는 모든 단어 찾기

8.3 삭제 연산 (재귀 버전)

8.4 실제 사용 예시

9. 고급 응용 - 실제로 만나는 Trie들

9.1 IP 라우팅: Longest Prefix Match

9.2 T9 키보드 - 숫자로 문자 입력

9.3 스펠 체커 - Levenshtein Distance와 Fuzzy Search

9.4 Ternary Search Tree: 메모리와 속도의 절충안

9.5 Double Array Trie: 일본에서 유명한 극한 최적화

10. 적용 시 체크리스트

10.1 정말 Trie가 필요한가?

10.2 메모리 예산 확인

10.3 대안 기술 비교

11. 정리 - Trie를 선택하는 기준

12. 용어 사전 (Glossary)

13. FAQ & Common Questions

Trie: The Secret Behind Instant Autocomplete

1. Prologue: "Why is my array-based search so slow?"

2. Initial Confusion: "It's a tree, but why does it look so different?"

3. The Aha Moment: "The path itself is the key"

Step-by-Step Visual Insertion

4. Deep Dive: Node Structure and Operation Analysis

4.1 Internal Node Structure

4.2 Insert Operation: Follow the path, create nodes as needed

4.3 Search Operation: Follow the path, check end marker

4.4 StartsWith Operation: Core of Autocomplete

4.5 Delete Operation: Careful Backtracking

5. Time Complexity Deep Comparison: Trie vs HashMap vs BST

6. The Fatal Flaw: Space Complexity Explosion

Real Memory Calculation Example

HashMap vs Array Trade-off

7. Optimization: Radix Tree (Patricia Trie) - Path Compression

Core Idea: Compress Single-Child Chains

When to Use Radix Trees

8. Practical Application: Building an Autocomplete System

8.1 Basic Trie Implementation (Python)

8.2 Autocomplete Feature: Find All Words with Prefix

8.3 Delete Operation (Recursive Version)

8.4 Real-World Usage Example

9. Advanced Applications: Tries in Production

9.1 IP Routing: Longest Prefix Match

9.2 T9 Keyboard: Number-to-Text Input

9.3 Spell Checkers: Levenshtein Distance and Fuzzy Search

9.4 Ternary Search Tree: Middle Ground

9.5 Double Array Trie: Japanese Extreme Optimization

10. Production Readiness Checklist

10.1 Do You Really Need a Trie?

10.2 Memory Budget Verification

10.3 Alternative Technology Comparison

11. Summary: When to Choose Tries

12. Glossary