Frodex

Beta
Foundations
1. Introduction
2. Tokens
3. Controlling the Model

Communicating with LLMs
4. Anatomy of a Good Prompt
5. System Prompts and Personas
6. Few-Shot Learning

Structured Outputs
7. JSON Mode and Structured Output
8. Function Calling

Advanced Techniques
9. Chain of Thought Reasoning
10. Managing the Context Window
11. Embeddings and Semantic Search

Production Systems
12. Retrieval-Augmented Generation (RAG)
13. Streaming Responses
14. Evaluation and Cost Optimization
Foundations
15 min · Lesson 3 of 14

Controlling the Model: Generation Parameters

Master the parameters that control how LLMs generate text in real applications

Learning goals

  • Understand temperature and its effect on output randomness
  • Learn about top-p, max_tokens, and other generation parameters
  • Know when to use which parameters for different tasks

Temperature (0.0 - 2.0)

Temperature controls the randomness of token selection:

  • 0.0: deterministic; always picks the highest-probability token. Best for factual tasks.
  • 0.7: balanced; a good default for most tasks.
  • 1.0+: creative; more random, unexpected outputs. Good for brainstorming.

Think of temperature as the "creativity dial."
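Under the hood, temperature divides the model's logits before the softmax. A minimal Python sketch of that math (the function name and logit values are illustrative, not any provider's API):

```python
import math

def apply_temperature(logits, temperature):
    """Rescale logits by temperature, then softmax into probabilities.

    Temperature near 0 makes the top token dominate (near-deterministic);
    temperature above 1 flattens the distribution (more random picks).
    """
    scaled = [logit / temperature for logit in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 1.0]               # hypothetical next-token logits
cold = apply_temperature(logits, 0.1)  # top token gets almost all the mass
hot = apply_temperature(logits, 2.0)   # probability spreads across tokens
```

At temperature 0.1 the first token's probability is effectively 1.0, which is why low temperature behaves deterministically; at 2.0 the lower-ranked tokens get a real chance of being sampled.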

Top-P (Nucleus Sampling)

Top-P (0.0 - 1.0) limits token selection to the smallest set whose cumulative probability exceeds P:

  • 0.1: Very focused—considers only the most likely tokens
  • 0.9: Broad—considers more diverse options
  • 1.0: Considers all tokens
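The "smallest set whose cumulative probability exceeds P" rule can be sketched in a few lines; `nucleus` and the probability values below are hypothetical illustrations, not a real library call:

```python
def nucleus(probs, top_p):
    """Return indices of the smallest set of tokens whose cumulative
    probability reaches top_p, taking highest-probability tokens first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept

probs = [0.5, 0.3, 0.15, 0.05]  # hypothetical token probabilities
nucleus(probs, 0.1)  # -> [0]           only the most likely token
nucleus(probs, 0.9)  # -> [0, 1, 2]     0.5 + 0.3 + 0.15 covers 0.9
nucleus(probs, 1.0)  # -> [0, 1, 2, 3]  all tokens
```

The model then samples only from the kept set, which is why a low top-p produces focused output even without touching temperature.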

Use either temperature OR top-p for control, not both simultaneously.

Max Tokens

Sets the maximum length of the generated response. Important for:

  • Cost control: limits the output tokens you are billed for
  • Response length: caps how long the answer can run (set it too low and the answer is truncated)
  • Context management: leaves room for follow-up exchanges
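For the context-management point, a common pattern is to derive max_tokens from the model's context window rather than hard-coding it. A hedged sketch, assuming a hypothetical helper and an 8,192-token window:

```python
def max_tokens_budget(context_window, prompt_tokens, reserve_for_followups=500):
    """Pick a max_tokens value that fits the model's context window
    while leaving headroom (reserve_for_followups) for later turns."""
    available = context_window - prompt_tokens - reserve_for_followups
    return max(0, available)  # never return a negative budget

# e.g. an 8,192-token window with a 1,200-token prompt:
max_tokens_budget(8192, 1200)  # -> 6492
```

The window size, reserve, and helper name are illustrative assumptions; the point is that prompt tokens and output tokens share one budget.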

Other Parameters

Stop Sequences: strings that terminate generation when encountered. Useful for structured outputs.
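Providers typically cut the output at the first stop sequence and do not include it in the response. A simplified client-side sketch of that behavior (the helper name is made up for illustration):

```python
def truncate_at_stop(text, stop_sequences):
    """Cut generated text at the earliest stop sequence found,
    mimicking how an API would terminate generation server-side."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)  # keep the earliest match
    return text[:cut]

raw = '{"answer": 42}\nUser:'
truncate_at_stop(raw, ["\nUser:"])  # -> '{"answer": 42}'
```

Here the stop sequence `"\nUser:"` prevents the model from hallucinating the next conversational turn after a structured answer.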

Frequency Penalty (0.0 - 2.0): reduces repetition by penalizing each token in proportion to how many times it has already appeared. Higher values = less repetition.

Presence Penalty (0.0 - 2.0): encourages the model to introduce new topics by applying a flat penalty to any token that has appeared at all. Higher values = more diverse content.
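OpenAI-style APIs document the two penalties as a per-token logit adjustment: a frequency penalty scaled by how many times the token has appeared, plus a flat presence penalty if it has appeared at all. A simplified sketch of that formula (function name and values are illustrative):

```python
def penalize(logits, counts, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower each token's logit by frequency_penalty per prior occurrence,
    plus a one-time presence_penalty if the token has appeared at all."""
    return [
        logit
        - counts[i] * frequency_penalty
        - (1.0 if counts[i] > 0 else 0.0) * presence_penalty
        for i, logit in enumerate(logits)
    ]

logits = [2.0, 2.0, 2.0]
counts = [3, 1, 0]  # token 0 already appeared 3 times, token 1 once
penalize(logits, counts, frequency_penalty=0.5, presence_penalty=0.2)
# token 0 drops the most, token 2 (unseen) is untouched
```

Lower logits mean lower sampling probability, so heavily repeated tokens become progressively less likely to be chosen again.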

Common mistakes

× Using high temperature for tasks requiring accuracy: more randomness means more errors and made-up details
× Setting max_tokens too low: responses get cut off mid-sentence
× Using both temperature and top_p: they compete; use one or the other
× Not testing parameters: optimal values vary by use case

Key takeaways

+ Temperature controls randomness: low for accuracy, high for creativity
+ Top-P is an alternative to temperature for controlling diversity
+ Max tokens limits output length and controls costs
+ Always test different parameter combinations for your specific use case

Playground

Try These Experiments

Why This Experiment?

Experiment with different parameter settings on realistic tasks to see how they affect determinism, creativity, and length.


At low temperature, the model should consistently return `200`. This mirrors production use cases where you want stable, deterministic answers for factual questions.
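To reproduce this experiment against an OpenAI-style chat completions endpoint, the request payload might look like the sketch below; the model name is a placeholder, not a real model:

```python
# Hedged sketch: the payload shape follows the common OpenAI-style chat
# completions API; "example-model" is a placeholder model name.
payload = {
    "model": "example-model",
    "messages": [
        {"role": "user", "content": "What HTTP status code means OK? Reply with the number only."}
    ],
    "temperature": 0.0,  # deterministic: always pick the top token
    "max_tokens": 10,    # a short factual answer needs little room
}
# Sending this payload repeatedly should yield the same answer ("200")
# on every run, since temperature 0 removes sampling randomness.
```

Raising `temperature` toward 1.0 and re-running the same prompt is the quickest way to see the determinism disappear.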