Text generation
Generate text from text-only input
The simplest way to generate text using the Gemini API is to provide the model with a single text-only input, as shown in this example:
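The original code sample is not present in this text, so the following is a minimal sketch rather than the guide's own example. It assumes the public REST endpoint for generateContent, an API key in a GEMINI_API_KEY environment variable, and a placeholder model name; build_text_request and generate_text are hypothetical helper names introduced here for illustration.

```python
import json
import os
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-1.5-flash"  # assumption: replace with a model you have access to


def build_text_request(prompt: str) -> dict:
    """Return the JSON body for a text-only generateContent call."""
    return {"contents": [{"parts": [{"text": prompt}]}]}


def generate_text(prompt: str, api_key: str) -> str:
    """POST the prompt to generateContent and return the first candidate's text."""
    request = urllib.request.Request(
        f"{API_ROOT}/models/{MODEL}:generateContent",
        data=json.dumps(build_text_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Join the text parts of the first candidate; non-text parts are skipped.
    parts = payload["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)


# The network call only runs as a script and only when a key is configured.
if __name__ == "__main__" and os.environ.get("GEMINI_API_KEY"):
    print(generate_text("Write a story about a magic backpack.",
                        os.environ["GEMINI_API_KEY"]))
```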
In this case, the prompt ("Write a story about a magic backpack") doesn't include any output examples, system instructions, or formatting information. It's a zero-shot approach. For some use cases, a one-shot or few-shot prompt might produce output that's more aligned with user expectations. In some cases, you might also want to provide system instructions to help the model understand the task or follow specific guidelines.
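To illustrate the difference from a zero-shot prompt, the sketch below builds a request body that carries a system instruction and a few worked examples ahead of the real query. It assumes the REST field name system_instruction and an "Input:/Output:" layout chosen here purely for illustration; build_few_shot_request is a hypothetical helper, not part of any SDK.

```python
def build_few_shot_request(system_text, examples, query):
    """Build a generateContent body with a system instruction and few-shot examples.

    examples is a list of (input, output) pairs shown to the model before the query.
    """
    blocks = [f"Input: {shot_in}\nOutput: {shot_out}" for shot_in, shot_out in examples]
    # The final block leaves the output empty for the model to complete.
    blocks.append(f"Input: {query}\nOutput:")
    return {
        "system_instruction": {"parts": [{"text": system_text}]},
        "contents": [{"parts": [{"text": "\n\n".join(blocks)}]}],
    }


request_body = build_few_shot_request(
    "You write one-sentence story premises.",
    [("a talking teapot", "A teapot learns to whisper secrets to its owner."),
     ("a flying bicycle", "A bicycle carries a child above the rooftops at dusk.")],
    "a magic backpack",
)
```

With two examples this is a few-shot prompt; passing a single pair would make it one-shot, and an empty list reduces it to zero-shot with a system instruction.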
Generate text from text-and-image input
The Gemini API supports multimodal inputs that combine text with media files. The following example shows how to generate text from text-and-image input:
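The original sample is missing from this text, so here is a hedged sketch of how a text-and-image request body might be assembled. It assumes the image is sent inline as base64 under an inline_data part, which suits small files; build_image_request is a hypothetical helper name, and the resulting body would be posted to generateContent in the same way as a text-only request.

```python
import base64


def build_image_request(prompt: str, image_bytes: bytes, mime_type: str) -> dict:
    """Return a generateContent body that pairs a text prompt with one inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "contents": [{
            "parts": [
                {"text": prompt},
                {"inline_data": {"mime_type": mime_type, "data": encoded}},
            ]
        }]
    }


# Placeholder bytes stand in for a real file, e.g. open("photo.jpg", "rb").read().
sample_body = build_image_request("Describe this picture.", b"\xff\xd8\xff\xe0", "image/jpeg")
```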
As with text-only prompting, multimodal prompting can involve various approaches and refinements. Depending on the output from this example, you might want to add steps to the prompt or be more specific in your instructions. To learn more, see File prompting strategies.
Generate a text stream
By default, the model returns a response only after it has finished generating the entire text. You can make interactions feel faster by using streaming to handle partial results as they arrive instead of waiting for the complete result.
The following example shows how to implement streaming using the streamGenerateContent method to generate text from a text-only input prompt.
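The original sample is not present here; the sketch below shows one way streaming could be consumed over REST. It assumes streamGenerateContent is called with alt=sse so that each partial result arrives as a server-sent "data:" line holding a JSON object, plus the same placeholder model and GEMINI_API_KEY variable as before. iter_text_chunks and stream_text are hypothetical helper names.

```python
import json
import os
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-1.5-flash"  # assumption: replace with a model you have access to


def iter_text_chunks(lines):
    """Yield text fragments from an iterable of server-sent-event lines (bytes)."""
    for raw_line in lines:
        line = raw_line.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        event = json.loads(line[len("data:"):])
        for candidate in event.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                if "text" in part:
                    yield part["text"]


def stream_text(prompt: str, api_key: str):
    """Call streamGenerateContent and yield partial text as it arrives."""
    request = urllib.request.Request(
        f"{API_ROOT}/models/{MODEL}:streamGenerateContent?alt=sse",
        data=json.dumps({"contents": [{"parts": [{"text": prompt}]}]}).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        yield from iter_text_chunks(response)


# The network call only runs as a script and only when a key is configured.
if __name__ == "__main__" and os.environ.get("GEMINI_API_KEY"):
    for fragment in stream_text("Write a story about a magic backpack.",
                                os.environ["GEMINI_API_KEY"]):
        print(fragment, end="", flush=True)
```

Keeping the line parser separate from the HTTP call lets the parsing logic be exercised with canned input, without a key or network access.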
What's next
This guide shows how to use generateContent and streamGenerateContent to generate text outputs from text-only and text-and-image inputs. To learn more about generating text using the Gemini API, see the following resources:
Prompting with media files: The Gemini API supports prompting with text, image, audio, and video data, also known as multimodal prompting.
System instructions: System instructions let you steer the behavior of the model based on your specific needs and use cases.
Safety guidance: Sometimes generative AI models produce unexpected outputs, such as outputs that are inaccurate, biased, or offensive. Post-processing and human evaluation are essential to limit the risk of harm from such outputs.