Semantic chunking is a chunking approach that splits documents at topic shifts rather than at fixed sizes: an embedding is computed for each sentence or paragraph, and a new chunk starts when the cosine similarity between adjacent units drops sharply. LlamaIndex helped popularize the technique in 2024, and it can outperform a recursive splitter on noisy, multi-topic documents. The cost is higher, because every sentence or paragraph must be embedded before a boundary can be chosen, and the gain is dataset-dependent, so the decision should be driven by evals.
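The splitting rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any library's implementation: `semantic_chunks`, `toy_embed`, `VOCAB`, and the fixed `threshold` are all hypothetical names and choices made for this example. A real pipeline would pass an actual embedding model as `embed`, and would often derive the cut-off from the distribution of similarity drops (for example a percentile) instead of a constant.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_chunks(
    units: List[str],
    embed: Callable[[str], Vector],
    threshold: float = 0.3,  # assumed fixed cut-off, chosen for this toy data
) -> List[List[str]]:
    """Group adjacent units, starting a new chunk when similarity falls below threshold."""
    if not units:
        return []
    # One embedding per unit: this is where the extra cost comes from.
    vectors = [embed(u) for u in units]
    chunks: List[List[str]] = [[units[0]]]
    for i in range(1, len(units)):
        if cosine_similarity(vectors[i - 1], vectors[i]) < threshold:
            chunks.append([units[i]])    # topic shift: open a new chunk
        else:
            chunks[-1].append(units[i])  # same topic: extend the current chunk
    return chunks


# Toy stand-in for an embedding model: word counts over a tiny fixed vocabulary.
VOCAB = ["cat", "dog", "pet", "stock", "market", "price"]


def toy_embed(text: str) -> List[float]:
    words = text.lower().replace(".", "").split()
    return [float(words.count(w)) for w in VOCAB]


sentences = [
    "The cat is a pet.",
    "A dog is a pet too.",
    "The stock market fell.",
    "The market price dropped.",
]
chunks = semantic_chunks(sentences, toy_embed, threshold=0.3)
# With this toy data the only boundary falls between the pet sentences
# and the market sentences.
```

Because the embedding function is passed in as a parameter, the same splitter can be run with different models or thresholds and compared on an eval set, which is how the dataset-dependent gain would be measured in practice.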
MEVZU · N°124 · ISTANBUL · YEAR I · VOL. III
Glossary · Intermediate · 2024
Semantic Chunking
A smarter chunking method that uses embedding similarity to split documents at topic boundaries.
- EN (English term): Semantic Chunking
- TR (Turkish term): Anlamsal Parçalama