Text generation
The simplest way to generate text using the Gemini API is to provide the model with a single text-only input.
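As a minimal sketch of a text-only request, the following uses only the Python standard library against the public REST endpoint rather than any particular SDK. The endpoint root, the model name, the `GEMINI_API_KEY` environment variable, and the helper names `build_text_request` and `generate_text` are assumptions made for illustration, not part of this guide.

```python
import json
import os
import urllib.request

# Assumptions (not from this guide): endpoint root, model name, and the
# GEMINI_API_KEY environment variable. Adjust them for your own setup.
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-1.5-flash"


def build_text_request(prompt: str) -> dict:
    """Build a generateContent request body holding a single text part."""
    return {"contents": [{"parts": [{"text": prompt}]}]}


def generate_text(prompt: str, api_key: str) -> str:
    """Send the prompt and return the text of the first candidate."""
    request = urllib.request.Request(
        f"{API_ROOT}/models/{MODEL}:generateContent",
        data=json.dumps(build_text_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    parts = body["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)


if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")
    if key:  # Only call the network when a key is configured.
        print(generate_text("Write a story about a magic backpack", key))
```

Keeping the request body in its own function makes it easy to inspect or log the payload before sending anything.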
A prompt such as "Write a story about a magic backpack" doesn't include any output examples, system instructions, or formatting information. It's a zero-shot approach. For some use cases, a one-shot or few-shot prompt might produce output that's more aligned with user expectations. In some cases, you might also want to provide system instructions to help the model understand the task or follow specific guidelines.
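To make the difference concrete, here is a hedged sketch of a prompt that carries output examples, together with a request body that carries a system instruction. The `Input:`/`Output:` layout and the helper names `build_few_shot_prompt` and `build_request` are illustrative assumptions; the `system_instruction` field follows the REST request format as I understand it, so check it against the API reference before relying on it.

```python
import json


def build_few_shot_prompt(examples, query):
    """Join (input, output) example pairs and the new input into one prompt."""
    lines = []
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
    lines.append(f"Input: {query}")
    lines.append("Output:")  # Left open for the model to complete.
    return "\n".join(lines)


def build_request(prompt, system_instruction=None):
    """Build a request body, adding a system instruction when one is given."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    if system_instruction:
        # Assumed field name from the REST request format.
        body["system_instruction"] = {"parts": [{"text": system_instruction}]}
    return body


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        [("cat", "chat"), ("house", "maison")],  # Hypothetical examples.
        "dog",
    )
    print(json.dumps(build_request(prompt, "Translate English to French."), indent=2))
```

With one example pair this is a one-shot prompt; with several it is a few-shot prompt.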
The Gemini API supports multimodal inputs that combine text with media files. For example, you can generate text from a prompt that pairs text with an image.
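A text-and-image request can be sketched as a request body with two parts: the text and the image bytes encoded as base64. This assumes the REST `inline_data` part format; the helper name `build_image_request`, the default MIME type, and the file name in the usage example are hypothetical.

```python
import base64
import json


def build_image_request(prompt: str, image_bytes: bytes,
                        mime_type: str = "image/jpeg") -> dict:
    """Build a request body that combines a text part with an inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "contents": [{
            "parts": [
                {"text": prompt},
                # Assumed part format: base64 data plus its MIME type.
                {"inline_data": {"mime_type": mime_type, "data": encoded}},
            ]
        }]
    }


if __name__ == "__main__":
    import os

    path = "photo.jpg"  # Hypothetical local file.
    if os.path.exists(path):
        with open(path, "rb") as handle:
            body = build_image_request("What's in this picture?", handle.read())
        print(len(json.dumps(body)), "bytes of JSON ready to send")
```

The resulting body can be sent to the same endpoint as a text-only request. Inline data suits small images; larger media is usually better handled through a file upload mechanism.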
As with text-only prompting, multimodal prompting can involve various approaches and refinements. Depending on the output, you might want to add steps to the prompt or be more specific in your instructions.
By default, the model returns a response after completing the entire text generation process. You can achieve faster interactions by not waiting for the entire result and instead using streaming to handle partial results.
Streaming works with a text-only input prompt: the generated text arrives as a series of partial results rather than as one complete response.
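Handling partial results can be sketched as follows, assuming the REST `streamGenerateContent` endpoint with `alt=sse`, which returns server-sent events whose `data:` lines each hold one JSON chunk. This may differ from the SDK method the guide refers to. The endpoint root, model name, `GEMINI_API_KEY` variable, and the helper names `iter_stream_text` and `stream_text` are assumptions.

```python
import json
import os
import urllib.request

# Assumptions (not from this guide): endpoint root, model name, env variable.
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-1.5-flash"


def iter_stream_text(lines):
    """Yield text fragments from server-sent-event lines (bytes or str)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # Skip blank separators and non-data fields.
        chunk = json.loads(line[len("data:"):])
        for candidate in chunk.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                if "text" in part:
                    yield part["text"]


def stream_text(prompt: str, api_key: str):
    """Open a streaming request and yield text as each chunk arrives."""
    request = urllib.request.Request(
        f"{API_ROOT}/models/{MODEL}:streamGenerateContent?alt=sse",
        data=json.dumps({"contents": [{"parts": [{"text": prompt}]}]}).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        yield from iter_stream_text(response)


if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")
    if key:  # Only call the network when a key is configured.
        for fragment in stream_text("Write a story about a magic backpack", key):
            print(fragment, end="", flush=True)
```

Separating the line parser from the network call lets you exercise the parsing logic with recorded or hand-written event lines.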
This guide shows how to generate text outputs from text-only and text-and-image inputs, with and without streaming. To learn more about generating text using the Gemini API, see the following resources:
The Gemini API supports prompting with text, image, audio, and video data, also known as multimodal prompting.
System instructions let you steer the behavior of the model based on your specific needs and use cases.
Sometimes generative AI models produce unexpected outputs, such as outputs that are inaccurate, biased, or offensive. Post-processing and human evaluation are essential to limit the risk of harm from such outputs.