Gemini Provider
Google’s Gemini provider offers multimodal capabilities including native video and audio understanding.
Configuration
import { LLM } from "@node-llm/core";
LLM.configure({
provider: "gemini",
apiKey: process.env.GEMINI_API_KEY, // Optional if set in env
});
Specific Parameters
Gemini uses generationConfig and safetySettings.
const chat = LLM.chat("gemini-1.5-pro")
.withParams({
generationConfig: {
topP: 0.8,
topK: 40,
maxOutputTokens: 8192
},
safetySettings: [
{
category: "HARM_CATEGORY_HARASSMENT",
threshold: "BLOCK_LOW_AND_ABOVE"
}
]
});
Features
- Models:
gemini-1.5-pro,gemini-1.5-flash,gemini-1.0-pro. - Multimodal: Supports Images, Audio, and Video files directly.
- Tools: Supported.
- System Instructions: Supported.
Video Support
Gemini is unique in its ability to natively process video files.
await chat.ask("What happens in this video?", {
files: ["./video.mp4"]
});