Calls the /v1/models endpoint of a running llama.cpp server and returns
the currently loaded model(s) as a tibble. In normal operation this is one
row; two rows appear when speculative decoding is active (main + draft model).
Usage
llamacpp_list_models(
  .server = Sys.getenv("LLAMACPP_SERVER", "http://localhost:8080"),
  .api_key = Sys.getenv("LLAMACPP_API_KEY", ""),
  .timeout = 30,
  .max_tries = 3
)