Chunking is the act of splitting long documents into bounded, meaningful units suitable for Embedding in a RAG system. Pieces that are too small lose context; pieces that are too big bloat the Context Window with irrelevant text. Typical sizes are 256–1500 Tokens. The choice of strategy strongly affects retrieval quality; common options include fixed-window splitting, the Recursive Splitter, sentence/paragraph-aware splitting, and Semantic Chunking. "Chunk overlap" is the small span of text shared by adjacent chunks to preserve context across boundaries. It is an underappreciated design knob that often determines RAG quality.
MEVZU · N°124 · ISTANBUL · YEAR I · VOL. III
Glossary · Beginner · 2022
Chunking
The process of splitting documents into meaningful, bounded-size pieces for RAG.
- EN (English term): Chunking
- TR (Turkish term): Parçalama (Chunking)
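
As a rough illustration of fixed-window chunking with overlap, here is a minimal Python sketch. It makes several assumptions that are not part of this entry: the function name `chunk_tokens` is hypothetical, whitespace-separated words stand in for real model tokens, and the toy sizes are far smaller than the 256–1500 token range used in practice.

```python
def chunk_tokens(tokens, size, overlap):
    """Split a token list into windows of at most `size` tokens.

    Each window after the first repeats the last `overlap` tokens of the
    previous one, so text near a boundary appears in both chunks.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


# Assumption: words approximate tokens; a real system would use the
# tokenizer that matches its embedding model.
words = "the quick brown fox jumps over the lazy dog again".split()
chunks = chunk_tokens(words, size=4, overlap=1)
for chunk in chunks:
    print(" ".join(chunk))
```

With `size=4` and `overlap=1`, each chunk begins with the last word of the previous one. Raising `overlap` preserves more context across boundaries at the cost of storing and embedding more duplicated text.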