Models
Pochi uses the AI SDK to support a variety of LLM providers, including locally run models.
Pochi
After signing in, you gain access to the Pochi models listed below. Pochi uses usage-based pricing with no additional charges. For detailed pricing information, visit https://app.getpochi.com/pricing.
OpenAI Compatible
Pochi allows you to configure any LLM provider that exposes an OpenAI-compatible API by supplying the appropriate API key. To configure custom models, run the VS Code command Pochi: Open Custom Model Settings to open the settings configuration file.
{
  "$schema": "https://getpochi.com/config.schema.json",
  "providers": {
    "provider-id": {
      "apiKey": "your_api_key",
      "baseURL": "https://api.provider.com/v1",
      "models": {
        "model-id": {
          // contextWindow and maxTokens are optional; they currently
          // default to 10k and 4096 respectively.
          // "contextWindow": 131072,
          // "maxTokens": 8000
        }
      }
    }
  }
}
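Before pointing Pochi at a provider, you can sanity-check the base URL and key from outside the editor. This is a minimal sketch using the placeholder values from the example above (substitute your own); most OpenAI-compatible servers expose a GET /models route under the base URL that lists available model IDs:

```shell
# Placeholder base URL and key from the example config above.
# Lists the model IDs the endpoint advertises; a 200 response with a
# JSON body confirms the URL and key are usable.
curl -s https://api.provider.com/v1/models \
  -H "Authorization: Bearer your_api_key"
```

The model IDs returned here are the values to use as keys under "models" in the config.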
Chutes
Chutes provides access to various AI models through its API. To use Chutes, you'll need to obtain an API token from the Chutes Platform.
{
  "$schema": "https://getpochi.com/config.schema.json",
  "providers": {
    "chutes": {
      "apiKey": "your_api_token",
      "baseURL": "https://llm.chutes.ai/v1",
      "models": {
        "zai-org/GLM-4.5-FP8": {
          "name": "glm-4.5"
        }
      }
    }
  }
}
Cerebras
Cerebras serves AI models through an OpenAI-compatible API. To use Cerebras, you'll need to obtain an API key from the Cerebras Platform.
{
  "$schema": "https://getpochi.com/config.schema.json",
  "providers": {
    "cerebras": {
      "apiKey": "your_api_key",
      "baseURL": "https://api.cerebras.ai/v1",
      "models": {
        "llama3.1-8b": {
          "name": "llama-3.1-8b"
        }
      }
    }
  }
}
DeepInfra
DeepInfra provides access to a wide range of AI models through an OpenAI-compatible API. To use DeepInfra, you'll need to obtain an API token from the DeepInfra Platform.
{
  "$schema": "https://getpochi.com/config.schema.json",
  "providers": {
    "deepinfra": {
      "apiKey": "your_api_token",
      "baseURL": "https://api.deepinfra.com/v1/openai",
      "models": {
        "meta-llama/Llama-3.3-70B-Instruct": {
          "name": "llama-3.3-70b"
        }
      }
    }
  }
}
Groq
Groq is a cloud-based AI platform that provides fast inference for large language models. Configure it with your Groq API key and its OpenAI-compatible endpoint.
{
  "$schema": "https://getpochi.com/config.schema.json",
  "providers": {
    "groq": {
      "apiKey": "your_api_key",
      "baseURL": "https://api.groq.com/openai/v1",
      "models": {
        "llama3-8b-8192": {
          "name": "llama-3-8b"
        }
      }
    }
  }
}
LM Studio
LM Studio is a desktop application that lets you run and experiment with language models on your own machine. Its built-in local server exposes an OpenAI-compatible API, so no API key is needed in the config.
{
  "$schema": "https://getpochi.com/config.schema.json",
  "providers": {
    "lmstudio": {
      "baseURL": "http://127.0.0.1:1234/v1",
      "models": {
        "gemma-2b-it-q4f32_1": {
          "name": "gemma-2b"
        }
      }
    }
  }
}
Mistral
Mistral AI provides a range of powerful language models through its API. To use Mistral models, you'll need to obtain an API key from the Mistral AI Platform.
{
  "$schema": "https://getpochi.com/config.schema.json",
  "providers": {
    "mistral": {
      "apiKey": "your_api_key",
      "baseURL": "https://api.mistral.ai/v1",
      "models": {
        "mistral-small-latest": {
          "name": "mistral-small"
        }
      }
    }
  }
}
Ollama
Ollama is a tool that allows you to run large language models locally on your machine with a simple API.
{
  "$schema": "https://getpochi.com/config.schema.json",
  "providers": {
    "ollama": {
      "baseURL": "http://localhost:11434/v1",
      "models": {
        "llama3:8b": {
          "name": "llama-3-8b"
        }
      }
    }
  }
}
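A model must be pulled into Ollama before Pochi can use it. A quick way to prepare and verify the setup above, assuming Ollama is installed and running on its default port:

```shell
# Download the model referenced in the example config (run once).
ollama pull llama3:8b

# Confirm Ollama's OpenAI-compatible endpoint is serving and that the
# pulled model appears in the list.
curl -s http://localhost:11434/v1/models
```

No apiKey entry is needed, since the server runs locally without authentication.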