LLM Insights Calculator
Estimate LLM memory footprint, capacity, and latency based on your configuration.
Input Parameters
Number of GPUs
Prompt Size (tokens)
Response Size (tokens)
Number of Concurrent Requests
Calculate
Model Specifications
GLiNER-500M
Llama-7B-text2sql
TinyLlama-1.1B
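
The page does not show the formulas behind the calculator, so the following is only a rough sketch of how the inputs above might be combined into memory, capacity, and latency estimates. Everything in it is an assumption made for illustration: the `ModelSpec` fields, the Llama-7B-like values (7e9 parameters, 32 layers, hidden size 4096), 2-byte FP16 values, standard multi-head attention, and the per-GPU hardware numbers (80 GB memory, 2 TB/s bandwidth, 300 TFLOP/s). It also treats prefill as compute-bound and decoding as bound by reading the weights once per token, and it ignores activations, framework overhead, and batching effects.

```python
from dataclasses import dataclass


@dataclass
class ModelSpec:
    """Hypothetical model description; the values used below are illustrative."""
    name: str
    n_params: float           # total parameter count
    n_layers: int             # number of transformer layers
    hidden_size: int          # model (embedding) dimension
    bytes_per_value: int = 2  # assumes FP16/BF16 weights and KV cache


def weights_bytes(spec: ModelSpec) -> float:
    """Memory needed to hold the model weights."""
    return spec.n_params * spec.bytes_per_value


def kv_cache_bytes_per_token(spec: ModelSpec) -> int:
    """One key and one value vector per layer (assumes multi-head attention, no GQA)."""
    return 2 * spec.n_layers * spec.hidden_size * spec.bytes_per_value


def estimate(spec, n_gpus, prompt_tokens, response_tokens, concurrent_requests,
             gpu_mem_bytes=80e9, gpu_bw_bytes_per_s=2e12, gpu_flops=300e12):
    """Back-of-the-envelope estimate; hardware defaults are assumed, not measured."""
    tokens_per_request = prompt_tokens + response_tokens
    kv_per_request = tokens_per_request * kv_cache_bytes_per_token(spec)
    w = weights_bytes(spec)

    # Memory footprint: weights plus the KV cache of every in-flight request.
    total_bytes = w + concurrent_requests * kv_per_request

    # Capacity: how many requests fit in the memory left after loading the weights.
    free_bytes = n_gpus * gpu_mem_bytes - w
    max_requests = max(0, int(free_bytes // kv_per_request))

    # Latency: compute-bound prefill (about 2 FLOPs per parameter per token)
    # plus bandwidth-bound decoding (weights read once per generated token).
    prefill_s = 2 * spec.n_params * prompt_tokens / (n_gpus * gpu_flops)
    decode_s = response_tokens * w / (n_gpus * gpu_bw_bytes_per_s)

    return {
        "total_memory_gb": total_bytes / 1e9,
        "max_concurrent_requests": max_requests,
        "latency_s": prefill_s + decode_s,
    }


# Illustrative Llama-7B-like spec; not taken from the calculator itself.
spec = ModelSpec(name="llama-7b-like", n_params=7e9, n_layers=32, hidden_size=4096)
result = estimate(spec, n_gpus=1, prompt_tokens=512, response_tokens=128,
                  concurrent_requests=8)
print(result)
```

Under these assumptions the KV cache costs 0.5 MiB per token, so long prompts and high concurrency can use more memory than the weights themselves. A real calculator would likely also account for grouped-query attention, quantization, and tensor-parallel communication overhead.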