What is Temperature?
In LLMs, Temperature is a hyperparameter that scales the "logits" (raw scores) of the next token predictions before they are converted into probabilities. It essentially controls how much the model favors the most likely options versus exploring less likely ones.
Low Temperature
Focuses on the top results. Reliable, consistent, and factual. Great for code, math, and structured data.
High Temperature
Spreads probability to more tokens. Diverse, creative, and surprising. Great for stories, brainstorming, and poetry.
Interactive Distribution
Adjust the temperature slider to see how it reshapes the probability distribution for the next token. Watch how "the" (the most likely choice) dominates at low temperatures and loses its lead as the temperature rises.
Sample prompt: "Once upon a time, there was..."

Focused sampling (low temperature): the distribution is sharpened. The model still samples randomly, but high-probability tokens are much more likely to be chosen.

Balanced (mid temperature): a natural mix of predictability and variety.
Why different outputs? The bars show probability, but the model doesn't always pick the tallest bar. It samples randomly—like rolling a weighted die. Higher temperature = more equal weights = more unpredictable rolls. Click "Regenerate" to sample again!
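The weighted-die analogy can be sketched directly: given a probability distribution over candidate tokens, each sampling step is one call to `random.choices`. The tokens and probabilities below are illustrative, not from a real model.

```python
import random

# Illustrative next-token distribution after "Once upon a time, there was..."
tokens = ["the", "a", "an", "once", "nothing"]
probs  = [0.55, 0.25, 0.10, 0.06, 0.04]

random.seed(0)  # fixed seed so the demo is repeatable

# Each call is one roll of the weighted die: "the" wins most often,
# but lower-probability tokens still come up sometimes.
samples = random.choices(tokens, weights=probs, k=10)
print(samples)
```

Running this repeatedly without the fixed seed gives different completions each time, which is exactly why "Regenerate" produces different outputs from the same prompt.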
How it Works Mathematically
The model generates a score (logit) for every possible token. To get probabilities, we use the softmax function, modified by temperature:

p_i = exp(z_i / T) / Σ_j exp(z_j / T)

where z_i is the logit for token i and T is the temperature.
Dividing by a small T amplifies differences between scores. The highest logit dominates exponentially.
Dividing by a large T compresses all scores toward zero, making them nearly equal after exponentiation.
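Both effects are easy to see in a few lines. Below is a minimal sketch of temperature-scaled softmax over four made-up logits; the numbers are illustrative only.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw logits to probabilities, scaled by temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for four candidate tokens.
logits = [4.0, 3.0, 2.0, 1.0]

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    # Low T sharpens the distribution toward the top token;
    # high T flattens it toward uniform.
    print(f"T={t}: {[round(p, 3) for p in probs]}")
```

At T = 0.5 the top token takes almost all the probability mass; at T = 2.0 the same logits yield a much flatter distribution.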
Practical Guidelines
| Use Case | Temperature | Why? |
|---|---|---|
| Coding & Math | 0.0 - 0.2 | Errors in logic are costly; you want the most likely correct path. |
| Fact Retrieval | 0.1 - 0.4 | Reduces "hallucinations" by sticking to the most probable data points. |
| General Chat | 0.7 - 0.8 | The "sweet spot" for most models to sound natural and helpful. |
| Creative Writing | 1.0 - 1.2 | Encourages the model to use more interesting, varied vocabulary. |
| Brainstorming | 1.2 - 1.5 | Generates wild, unconventional ideas that might spark inspiration. |
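The trade-off behind these guidelines can be checked empirically: sampling many tokens from the same (illustrative, made-up) logits at different temperatures shows how vocabulary diversity grows as T rises.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Sample one token index from a temperature-scaled softmax."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # stabilize before exponentiating
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [5.0, 3.0, 2.5, 2.0, 1.0]  # illustrative scores for five tokens
rng = random.Random(42)

for t in (0.2, 0.8, 1.5):
    draws = [sample_token(logits, t, rng) for _ in range(200)]
    # At low temperature nearly every draw is the top token;
    # at high temperature the draws spread across the vocabulary.
    print(f"T={t}: {len(set(draws))} distinct tokens in 200 draws")
```

This mirrors the table: low-temperature settings stay on "the most likely correct path", while high-temperature settings explore more varied vocabulary.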
Key Takeaways
- Temperature 0 is deterministic ("greedy search") — always picks the top token
- Higher temperature increases variety and creativity but decreases coherence
- Too high a temperature (> 1.5) often results in gibberish
- Always match your temperature to the task's requirement for precision vs. creativity
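The first takeaway can be shown concretely: as T approaches 0, sampling collapses to always taking the argmax of the logits. A minimal sketch:

```python
def greedy_pick(logits):
    """Temperature 0: always take the highest-scoring token (greedy search)."""
    best = 0
    for i, z in enumerate(logits):
        if z > logits[best]:
            best = i
    return best

# Illustrative logits: index 2 has the highest score, so greedy
# decoding returns it every time, with no randomness involved.
print(greedy_pick([1.2, 0.4, 3.1, 2.8]))  # → 2
```

No matter how many times this runs, the output never changes, which is why temperature 0 is the right default when reproducibility matters.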