Send LLM Messages to an OpenAI Chat Completions endpoint on Azure
Source: R/api_azure_openai.R
azure_openai_chat.Rd
This function sends a message history to the Azure OpenAI Chat Completions API and returns the assistant's reply. It is a work in progress and has not been fully tested.
Usage
azure_openai_chat(
.llm,
.endpoint_url = Sys.getenv("AZURE_ENDPOINT_URL"),
.deployment = "gpt-4o-mini",
.api_version = "2024-08-01-preview",
.max_completion_tokens = NULL,
.frequency_penalty = NULL,
.logit_bias = NULL,
.logprobs = FALSE,
.top_logprobs = NULL,
.presence_penalty = NULL,
.seed = NULL,
.stop = NULL,
.stream = FALSE,
.temperature = NULL,
.top_p = NULL,
.timeout = 60,
.verbose = FALSE,
.json = FALSE,
.json_schema = NULL,
.dry_run = FALSE,
.max_tries = 3
)
Arguments
- .llm
An LLMMessage object containing the conversation history.
- .endpoint_url
Base URL for the API (default: Sys.getenv("AZURE_ENDPOINT_URL")).
- .deployment
The identifier of the model that is deployed (default: "gpt-4o-mini").
- .api_version
The version of the API to call (default: "2024-08-01-preview").
- .max_completion_tokens
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
- .frequency_penalty
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far.
- .logit_bias
A named list modifying the likelihood of specified tokens appearing in the completion.
- .logprobs
Whether to return log probabilities of the output tokens (default: FALSE).
- .top_logprobs
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position. Only meaningful when .logprobs is TRUE.
- .presence_penalty
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.
- .seed
If specified, the system will make a best effort to sample deterministically.
- .stop
Up to 4 sequences where the API will stop generating further tokens.
- .stream
If set to TRUE, the answer will be streamed to the console as it is generated (default: FALSE).
- .temperature
What sampling temperature to use, between 0 and 2. Higher values make the output more random.
- .top_p
An alternative to sampling with temperature, called nucleus sampling, in which the model considers only the tokens that make up the top_p probability mass.
- .timeout
Request timeout in seconds (default: 60).
- .verbose
Should additional information be shown after the API call (default: FALSE).
- .json
Should output be in JSON mode (default: FALSE).
- .json_schema
A JSON schema object, given as an R list, to enforce the output structure. If defined, it takes precedence over JSON mode.
- .dry_run
If TRUE, perform a dry run and return the request object (default: FALSE).
- .max_tries
Maximum number of retries for the request (default: 3).
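To make the relationship between .endpoint_url, .deployment, and .api_version more concrete, the sketch below assembles a request URL following the publicly documented Azure OpenAI REST layout. The helper build_azure_chat_url() and the resource name "my-resource" are hypothetical and not part of this package; how azure_openai_chat() builds its URL internally is not stated here, so treat this as an illustration only.

```r
# Hypothetical helper, not part of the package: assembles a Chat Completions
# URL following the Azure OpenAI REST layout
#   {endpoint}/openai/deployments/{deployment}/chat/completions?api-version={version}
build_azure_chat_url <- function(endpoint_url, deployment, api_version) {
  # Drop trailing slashes so the path does not contain "//"
  base <- sub("/+$", "", endpoint_url)
  paste0(base, "/openai/deployments/", deployment,
         "/chat/completions?api-version=", api_version)
}

# "my-resource" is a placeholder resource name
url <- build_azure_chat_url(
  endpoint_url = "https://my-resource.openai.azure.com/",
  deployment   = "gpt-4o-mini",
  api_version  = "2024-08-01-preview"
)
print(url)
```

On Azure the deployment name, not the underlying model name, selects the model, which is why .deployment takes the place of a model argument.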
Examples
if (FALSE) { # \dontrun{
# Basic usage
msg <- llm_message("What is R programming?")
result <- azure_openai_chat(msg)
# With custom parameters
result2 <- azure_openai_chat(msg,
.deployment = "gpt-4o-mini",
.temperature = 0.7,
.max_completion_tokens = 1000)
} # }
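The list-valued arguments and the dry-run mode may be easier to picture with concrete values. The following is a minimal sketch: the token ID "50256", the stop sequences, and the character-vector form of .stop are assumptions for illustration. The call itself is wrapped in if (FALSE) because it needs the package loaded and AZURE_ENDPOINT_URL set.

```r
# Token IDs are model-specific; "50256" is only a placeholder ID.
# In the OpenAI API, bias values range from -100 (effectively ban the token)
# to 100 (effectively force it).
bias <- list("50256" = -100)

# Up to 4 stop sequences, given here as a character vector (assumed form)
stops <- c("\n\n", "END")

if (FALSE) { # needs the package loaded and AZURE_ENDPOINT_URL set
  msg <- llm_message("List three uses of R.")
  # With .dry_run = TRUE the request object is returned instead of a reply,
  # which allows inspecting the request before anything is sent
  req <- azure_openai_chat(msg,
                           .logit_bias = bias,
                           .stop = stops,
                           .seed = 42,
                           .dry_run = TRUE)
  print(req)
}
```

Inspecting a dry-run request first is a cheap way to check that the optional parameters end up in the request as intended.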