MEVZU N°129 · ISTANBUL · YEAR I, VOL. III
MEVZU N° TAG / VOL. 118
#performance
0 blog · 0 news · 6 wiki
§03
06 Wiki
§01 Glossary
TPS Wars
A period of competition that emerged in 2024 in which inference providers vied on tokens per second (TPS).
- EN: TPS Wars
- TR: TPS Savaşları
§02 Glossary
Prompt Caching
A feature that caches large, recurring prompt prefixes so that repeated requests skip reprocessing them, reducing both cost and latency.
- EN: Prompt Caching
- TR: Prompt Önbellekleme
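The idea can be sketched in a few lines. This is a toy illustration, not any provider's API: `PrefixCache` and `expensive_prefill` are hypothetical names, and the "processed state" is just a word count standing in for whatever the model would actually keep for the prefix.

```python
import hashlib


class PrefixCache:
    """Toy prefix cache: stores the (simulated) processed state of a prompt prefix."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def _key(self, prefix: str) -> str:
        # Identical prefixes hash to the same key, so they share one cache entry.
        return hashlib.sha256(prefix.encode("utf-8")).hexdigest()

    def get_or_compute(self, prefix: str, compute):
        key = self._key(prefix)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        state = compute(prefix)
        self._store[key] = state
        return state


def expensive_prefill(prefix: str) -> int:
    # Hypothetical stand-in for prefix processing; returns a whitespace "token" count.
    return len(prefix.split())


cache = PrefixCache()
system_prompt = "You are a helpful assistant. " * 50  # large recurring prefix
for question in ["What is TTFT?", "What is TPS?", "What is throughput?"]:
    prefix_state = cache.get_or_compute(system_prompt, expensive_prefill)
    # Only the short, request-specific suffix would need fresh processing here.
    new_tokens = len(question.split())
```

In this sketch the first request pays for the prefix and the next two reuse it, which is where the cost and latency savings would come from.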
§03 Glossary
Tokens Per Second (TPS)
The number of tokens a model generates per second, and the most visible metric of inference speed.
- EN: Tokens Per Second (TPS)
- TR: Saniyedeki Token (TPS)
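One way to compute the figure is from per-token arrival times. The sketch below assumes a convention that is not stated in the entry: it measures only the generation phase, from the first token to the last, so the wait before the first token is excluded. `tokens_per_second` is an illustrative helper name.

```python
def tokens_per_second(token_timestamps):
    """Generation-phase TPS from per-token arrival times, given in seconds."""
    if len(token_timestamps) < 2:
        raise ValueError("need at least two token timestamps")
    elapsed = token_timestamps[-1] - token_timestamps[0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    # N timestamps span N - 1 inter-token intervals.
    return (len(token_timestamps) - 1) / elapsed


# Five tokens arriving 0.25 s apart: four intervals over one second.
timestamps = [1.0, 1.25, 1.5, 1.75, 2.0]
tps = tokens_per_second(timestamps)
```

Other conventions divide the total token count by the full request duration, so two published TPS numbers are only comparable if they were measured the same way.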
§04 Glossary
Speculative Decoding
An inference speedup in which a small draft model proposes several tokens that the larger target model then verifies in parallel.
- EN: Speculative Decoding
- TR: Spekülatif Çözme
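A minimal sketch of one propose-and-verify step, under simplifying assumptions: both models decode greedily, tokens are plain integers, and `draft_next` / `target_next` are hypothetical toy functions rather than real models. Sampling-based variants use a probabilistic accept/reject rule instead of the exact match used here.

```python
def speculative_step(context, draft_next, target_next, k=4):
    """One greedy speculative-decoding step; returns the tokens accepted."""
    # 1. The draft model proposes k tokens autoregressively (cheap).
    proposed = []
    ctx = list(context)
    for _ in range(k):
        tok = draft_next(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # 2. The target model checks every proposed position. A real system does
    #    this in one batched forward pass; the loop here only simulates it.
    accepted = []
    ctx = list(context)
    for tok in proposed:
        expected = target_next(ctx)
        if tok != expected:
            accepted.append(expected)  # first mismatch: keep the target's token
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # 3. All k drafts matched: the same target pass also yields one extra token.
    accepted.append(target_next(ctx))
    return accepted


def target_next(ctx):
    # Toy "large" model: counts upward modulo 10.
    return (ctx[-1] + 1) % 10


def draft_next(ctx):
    # Toy "small" model: agrees with the target except after token 3.
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10


partial = speculative_step([0], draft_next, target_next, k=4)
full = speculative_step([5], draft_next, target_next, k=2)
```

In both calls the output matches what the target model would have produced on its own; the gain is that several tokens come out of a single verification pass.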
§05 Glossary
Throughput
The total number of tokens, requests, or jobs a system can process per unit of time.
- EN: Throughput
- TR: Verim (Throughput)
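As a rough illustration, throughput is measured across everything the system completed in a time window, not per request. The helper name `aggregate_throughput` and the request records below are made up for the example.

```python
def aggregate_throughput(requests, window_seconds):
    """Work completed across all requests per second of wall-clock time."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    total_tokens = sum(r["tokens"] for r in requests)
    return {
        "tokens_per_second": total_tokens / window_seconds,
        "requests_per_second": len(requests) / window_seconds,
    }


# Four requests that all finished inside the same 10-second window.
requests = [{"tokens": 200}, {"tokens": 300}, {"tokens": 500}, {"tokens": 1000}]
stats = aggregate_throughput(requests, window_seconds=10.0)
```

Because the requests overlap in time, the aggregate token rate can be far higher than the speed any single user observes.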
§06 Glossary
Time to First Token (TTFT)
The time between sending a request and receiving the first generated token.
- EN: Time to First Token (TTFT)
- TR: İlk Token Süresi (TTFT)
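A small sketch of how this could be measured on the client side, assuming the response arrives as a lazy token stream. `measure_ttft` and `fake_stream` are hypothetical; the generator's sleep stands in for network, queueing, and prompt-processing delay.

```python
import time


def measure_ttft(stream, clock=time.perf_counter):
    """Return (seconds until the first token, the first token) for a token stream."""
    start = clock()  # taken just before the stream is first consumed
    for token in stream:
        return clock() - start, token
    raise RuntimeError("stream produced no tokens")


def fake_stream():
    time.sleep(0.05)  # simulated delay before the first token is ready
    yield "Hello"
    yield " world"


ttft, first_token = measure_ttft(fake_stream())
```

The timer stops at the first token only, so a response can have a low TTFT and still be slow overall if later tokens arrive slowly.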