Chat Completions

The Chat Completions API is the most widely supported text generation method across AI providers.

Basic Usage

Simple Text Input

string response = await "Hello, AI!"
    .GENCompletion()
    .ExecuteAsync();

With Model Selection

string response = await "Explain photosynthesis"
    .GENCompletion()
    .SetModel(OpenAIModel.GPT4o)
    .ExecuteAsync();

Input Types

1. String Input

Direct text prompt:
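Following the pattern shown in Basic Usage, any string can be sent directly:

string answer = await "What is the capital of France?"
    .GENCompletion()
    .ExecuteAsync();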

2. Message Input

Use Message object for more control:
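A minimal sketch; the Message constructor and ChatRole enum shown here are assumptions about the library's API, following the fluent pattern above:

Message message = new Message(ChatRole.User, "Summarize the key points of this report.");

string response = await message
    .GENCompletion()
    .ExecuteAsync();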

3. Prompt Input

Use Prompt object for reusable prompts:
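A sketch assuming a Prompt constructor that wraps reusable text:

// A Prompt holds text you intend to reuse across requests
Prompt greeting = new Prompt("Write a friendly onboarding greeting for a new user.");

string response = await greeting
    .GENCompletion()
    .ExecuteAsync();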

Configuration

Temperature

Controls randomness (0.0 = deterministic, 2.0 = very creative):
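A sketch assuming a SetTemperature setter, named by analogy with SetModel:

string response = await "Write a short poem about the sea"
    .GENCompletion()
    .SetTemperature(0.9)   // higher values produce more varied output
    .ExecuteAsync();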

Max Tokens

Limit response length:
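Assuming a SetMaxTokens setter in the same style:

string response = await "Summarize the plot of Hamlet"
    .GENCompletion()
    .SetMaxTokens(150)   // caps the length (and cost) of the reply
    .ExecuteAsync();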

System Message

Set context and behavior:
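Assuming a SetSystemMessage setter:

string response = await "How do I reset my password?"
    .GENCompletion()
    .SetSystemMessage("You are a concise, friendly support agent.")
    .ExecuteAsync();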

Top P (Nucleus Sampling)

Alternative to temperature (0.0-1.0):
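Assuming a SetTopP setter; as a rule, adjust either temperature or top-p, not both:

string response = await "Suggest a name for a coffee shop"
    .GENCompletion()
    .SetTopP(0.9)   // sample only from the top 90% of probability mass
    .ExecuteAsync();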

Frequency Penalty

Reduce word repetition (-2.0 to 2.0):
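Assuming a SetFrequencyPenalty setter:

string response = await "Write a product description for a backpack"
    .GENCompletion()
    .SetFrequencyPenalty(0.5)   // positive values discourage repeating the same words
    .ExecuteAsync();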

Presence Penalty

Encourage topic diversity (-2.0 to 2.0):
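Assuming a SetPresencePenalty setter:

string response = await "Brainstorm blog post ideas about remote work"
    .GENCompletion()
    .SetPresencePenalty(0.6)   // positive values nudge the model toward new topics
    .ExecuteAsync();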

Multi-Turn Conversations

Build conversation history:
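One possible shape, assuming a Message type with roles and that a list of messages can start a completion; all names besides GENCompletion and ExecuteAsync are illustrative:

var history = new List<Message>
{
    new Message(ChatRole.System, "You are a helpful programming tutor."),
    new Message(ChatRole.User, "What is recursion?"),
    new Message(ChatRole.Assistant, "Recursion is when a function calls itself."),
    new Message(ChatRole.User, "Show me a C# example.")
};

string reply = await history
    .GENCompletion()
    .ExecuteAsync();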

Streaming

Get real-time token-by-token responses:
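A sketch assuming a StreamAsync method that yields tokens as they arrive (the method name is an assumption):

await foreach (string token in "Tell me a short story"
    .GENCompletion()
    .StreamAsync())
{
    Console.Write(token);   // print each token as soon as it is received
}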

Provider Support

| Provider | Support | Models |
| --- | --- | --- |
| OpenAI | ✅ Full | GPT-4o, GPT-4, GPT-3.5 |
| Anthropic | ✅ Full | Claude 3.5, Claude 3 |
| Google Gemini | ✅ Full | Gemini 1.5, Gemini 1.0 |
| OpenRouter | ✅ Full | All models |
| Groq | ✅ Full | Llama 3, Mixtral |
| xAI | ✅ Full | Grok |
| Perplexity | ✅ Full | All models |
| Azure OpenAI | ✅ Full | GPT-4, GPT-3.5 |
| Ollama | ✅ Full | All local models |

Examples

Example 1: Chatbot Response
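A sketch combining the confirmed API with an assumed SetSystemMessage setter:

string userMessage = "My order hasn't arrived yet.";

string reply = await userMessage
    .GENCompletion()
    .SetModel(OpenAIModel.GPT4o)
    .SetSystemMessage("You are a friendly support agent. Keep answers under 100 words.")
    .ExecuteAsync();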

Example 2: FAQ System
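For factual FAQ answers, a low temperature keeps replies consistent (SetSystemMessage and SetTemperature are assumed setter names):

string question = "What is your refund policy?";

string answer = await question
    .GENCompletion()
    .SetSystemMessage("Answer strictly from the company FAQ. If unsure, say so.")
    .SetTemperature(0.1)
    .ExecuteAsync();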

Example 3: Contextual Help
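Context can be inlined into the prompt itself, using only the confirmed API:

string currentPage = "Billing Settings";

string help = await $"The user is on the '{currentPage}' page. Explain what they can do here."
    .GENCompletion()
    .ExecuteAsync();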

Example 4: Translation
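A minimal translation sketch using only the confirmed API:

string text = "Good morning, how are you?";

string french = await $"Translate the following text to French: {text}"
    .GENCompletion()
    .ExecuteAsync();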

Error Handling
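The library's specific exception types are not documented here, so this sketch catches broadly; narrow the catch once you know which exceptions the provider throws:

try
{
    string response = await "Hello, AI!"
        .GENCompletion()
        .ExecuteAsync();
}
catch (Exception ex)
{
    // Log and fall back rather than crashing the caller
    Console.Error.WriteLine($"Completion failed: {ex.Message}");
}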

Best Practices

✅ Do

  • Use lower temperature (0.0-0.3) for factual answers

  • Use higher temperature (0.7-1.5) for creative content

  • Set reasonable max tokens to control costs

  • Cache responses when possible

  • Handle errors gracefully

❌ Don't

  • Set temperature above 2.0 (unstable results)

  • Use very high frequency/presence penalties (degrades quality)

  • Forget to set system messages for context

  • Ignore error handling

  • Call API in tight loops without rate limiting

Next Steps
