Sends text to the /v1/embeddings endpoint of a running llama.cpp server
and returns embedding vectors.
Usage
llamacpp_embedding(
.input,
.model = "local-model",
.server = Sys.getenv("LLAMACPP_SERVER", "http://localhost:8080"),
.api_key = Sys.getenv("LLAMACPP_API_KEY", ""),
.timeout = 120,
.dry_run = FALSE,
.max_tries = 3
)
Arguments
- .input
A character vector of texts to embed, or an LLMMessage object.
- .model
The model name (default: "local-model"). llama.cpp ignores this and serves the loaded embedding model.
- .server
Base URL of the llama.cpp server. Defaults to the LLAMACPP_SERVER environment variable, falling back to "http://localhost:8080".
- .api_key
API key for the server. Defaults to the LLAMACPP_API_KEY environment variable, falling back to an empty string.
- .timeout
Request timeout in seconds (default: 120).
- .dry_run
If TRUE, returns the request object without executing it (default: FALSE).
- .max_tries
Maximum retries (default: 3).
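The arguments above can be combined as in the following minimal sketch. It assumes the function is exported by the tidyllm package (the page itself does not name the package), and the example texts are made up for illustration. The call uses .dry_run = TRUE so that it only builds the request and needs no running server; the structure of the returned request object is not described here, so the sketch simply prints it.

```r
# Texts to embed; .input accepts a character vector.
texts <- c("The quick brown fox", "jumps over the lazy dog")

# Same environment-variable fallback that the .server default uses.
server <- Sys.getenv("LLAMACPP_SERVER", "http://localhost:8080")

# Assumption: llamacpp_embedding() is exported by the tidyllm package.
if (requireNamespace("tidyllm", quietly = TRUE)) {
  # Build the request without sending it, so no server is required.
  req <- tidyllm::llamacpp_embedding(texts, .server = server, .dry_run = TRUE)
  print(req)

  # With a llama.cpp server running at `server`, drop .dry_run to
  # send the request and receive the embeddings:
  # emb <- tidyllm::llamacpp_embedding(texts, .server = server, .timeout = 60)
}
```

Inspecting the dry-run request first is a cheap way to confirm the server URL and payload before waiting on a real call with the 120-second default timeout.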
