MEVZU N°128 · ISTANBUL · YEAR I · VOL. III
MEVZU N° TAG / VOL. 095
#metrics
0 blog · 0 news · 7 wiki
§03
07 Wiki
§01 Glossary
Tokens Per Second (TPS)
How many tokens a model generates per second — the most visible metric of inference speed.
- EN
- Tokens Per Second (TPS)
- TR
- Saniyedeki Token (TPS)
§02 Glossary
Cold Start
The slow first response when a model or service has been idle and must initialise on demand.
- EN
- Cold Start
- TR
- Soğuk Başlatma
§03 Glossary
Latency
The time between issuing a request and receiving a result.
- EN
- Latency
- TR
- Gecikme (Latency)
§04 Glossary
Throughput
The total number of tokens, requests, or jobs a system can process per unit of time.
- EN
- Throughput
- TR
- Verim (Throughput)
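As a rough illustration of the definition above, throughput can be computed by counting what a system finished inside a fixed measurement window. The sketch below is only an assumption about how such a measurement might be organised: the `CompletedRequest` record and the `throughput` helper are hypothetical names, not part of any particular serving stack.

```python
from dataclasses import dataclass


@dataclass
class CompletedRequest:
    # Hypothetical record: when the request finished (seconds since the
    # start of the measurement window) and how many tokens it produced.
    finished_at: float
    output_tokens: int


def throughput(requests, window_seconds):
    """Return (requests per second, tokens per second) over the window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    # Only count requests that completed inside the window.
    done = [r for r in requests if 0 <= r.finished_at <= window_seconds]
    total_tokens = sum(r.output_tokens for r in done)
    return len(done) / window_seconds, total_tokens / window_seconds


# Made-up sample data: three requests finishing within a 10-second window.
reqs = [
    CompletedRequest(1.0, 100),
    CompletedRequest(2.5, 300),
    CompletedRequest(9.0, 200),
]
rps, system_tps = throughput(reqs, window_seconds=10.0)
```

Unlike a per-request speed, this figure aggregates over all concurrent work, so a system can have high throughput even when each individual request is slow.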
§05 Glossary
Time to First Token (TTFT)
The time between sending a request and receiving the first generated token.
- EN
- Time to First Token (TTFT)
- TR
- İlk Token Süresi (TTFT)
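The timing metrics in this list (latency, TTFT, and TPS) can all be derived from the same set of timestamps for a single streamed response. The sketch below assumes you have recorded when the request was sent and when each token arrived; `stream_metrics` is a hypothetical helper. It also assumes one common convention for TPS, namely counting only the tokens after the first one (the decode phase). Other tools divide all tokens by the total latency instead.

```python
def stream_metrics(request_sent_at, token_times):
    """Derive TTFT, latency and decode TPS from arrival timestamps (seconds)."""
    if not token_times:
        raise ValueError("no tokens were generated")
    # Time to first token: request sent -> first token received.
    ttft = token_times[0] - request_sent_at
    # End-to-end latency: request sent -> last token received.
    latency = token_times[-1] - request_sent_at
    # Assumed convention: TPS over the decode phase only, i.e. tokens
    # after the first divided by the time elapsed since the first token.
    decode_time = token_times[-1] - token_times[0]
    tps = (len(token_times) - 1) / decode_time if decode_time > 0 else float("nan")
    return {"ttft": ttft, "latency": latency, "tps": tps}


# Made-up timestamps: first token at 0.5 s, then one token every 0.25 s.
metrics = stream_metrics(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
```

With these sample timestamps the TTFT is 0.5 s, the latency is 1.5 s, and the decode rate is 4 tokens per second.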
§06 Glossary
Model FLOPs Utilization (MFU)
The fraction of the hardware's theoretical peak FLOPS that a model actually achieves during real training; a key efficiency metric.
- EN
- Model FLOPs Utilization (MFU)
- TR
- Model FLOPs Kullanımı (MFU)
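A back-of-the-envelope MFU estimate divides the floating-point work a training run actually completes per second by the accelerator's peak rate. The sketch below relies on the widely used approximation of about 6 FLOPs per parameter per training token (forward plus backward pass, ignoring attention cost); that factor, the `mfu` helper, and all numbers are assumptions for illustration only.

```python
def mfu(n_params, tokens_per_second, peak_flops_per_second,
        flops_per_param_per_token=6.0):
    """Estimate Model FLOPs Utilization as a fraction between 0 and 1."""
    if peak_flops_per_second <= 0:
        raise ValueError("peak_flops_per_second must be positive")
    # Achieved model FLOPs per second under the ~6 * N * tokens rule of thumb.
    achieved = flops_per_param_per_token * n_params * tokens_per_second
    return achieved / peak_flops_per_second


# Hypothetical run: 1e9 parameters, 50,000 tokens/s, hardware peak 1e15 FLOP/s.
estimate = mfu(n_params=1e9, tokens_per_second=50_000,
               peak_flops_per_second=1e15)
```

For these made-up numbers the run performs roughly 3e14 FLOP/s of useful model work, about 30% of the stated peak.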
§07 Glossary
FLOPs
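Floating-point operations per second, the classic metric for raw compute power. Strictly, the per-second rate is usually written FLOPS or FLOP/s, while FLOPs with a lowercase "s" often denotes a plain count of operations.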
- EN
- FLOPs
- TR
- FLOPs (Saniyedeki Kayar Nokta İşlemi)