Summarizers
Behind the scenes, Vectara supports selecting both the summarizer model and the prompt for the model. We offer a range of these. If you are a Scale customer, or are considering becoming one, and have questions about your options, please reach out to our support team, who can help guide you.
Summarizers have prefixes and versions, and they encapsulate both a prompt text and, potentially, specific configuration options for the generative system. vectara-summary-ext is the prefix for generative summarization of the results.
Providing the summarizer as part of the config is optional. If you do not provide a summarizer config at request time, Vectara uses the best available summarizer for your account.
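As a minimal sketch of the optional summarizer setting, the Python below builds a request body in the same shape as the JSON examples on this page and only includes summarizerPromptName when a name is supplied. The helper name build_query_body and its parameters are hypothetical, and the idea that leaving the key out is how you "do not provide" a summarizer is an assumption drawn from the paragraph above, not a confirmed API rule.

```python
import json
from typing import Optional


def build_query_body(
    query_text: str,
    customer_id: int,
    corpus_id: int,
    summarizer_prompt_name: Optional[str] = None,
    max_summarized_results: int = 5,
) -> dict:
    """Build a query request body; the summarizer name is optional (hypothetical helper)."""
    summary = {
        "responseLang": "en",
        "maxSummarizedResults": max_summarized_results,
    }
    # Assumption: omitting the key lets the platform pick the summarizer for the account.
    if summarizer_prompt_name is not None:
        summary["summarizerPromptName"] = summarizer_prompt_name
    return {
        "query": [
            {
                "query": query_text,
                "start": 0,
                "numResults": 10,
                "corpusKey": [{"customerId": customer_id, "corpusId": corpus_id}],
                "summary": [summary],
            }
        ]
    }


# No summarizer named: the key is left out of the summary config.
default_body = build_query_body("What is the infinite improbability drive?", 12345678, 1)

# Summarizer named explicitly.
explicit_body = build_query_body(
    "What is the infinite improbability drive?",
    12345678,
    1,
    summarizer_prompt_name="vectara-summary-ext-v1.2.0",
)
print(json.dumps(explicit_body, indent=2))
```

The sketch only assembles and prints the body; sending it is left out because the endpoint and authentication details are not covered in this section.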
Currently Available Summarizers
Today, the available versions are 1.2.0, which uses gpt-3.5-turbo, and 1.3.0, which uses gpt-4.0 (and is only available to our paying Scale customers). The 1.2.0 summarizer is typically faster, while 1.3.0 is typically slower but produces a more accurate summary.
These are the two official summarizers available to our users that you specify in the summarizerPromptName:
- vectara-summary-ext-v1.2.0 (gpt-3.5-turbo)
- vectara-summary-ext-v1.3.0 (gpt-4.0)
Scale customers also have access to advanced summarization customization options including custom prompts, character limits, temperature, and frequency and presence penalties.
Check out our interactive API Playground that lets Scale users experiment with these additional summarization options.
Beta Summarizers
We also have four beta summarizers available for our users to try:
- Growth and Scale: vectara-experimental-summary-ext-2023-10-23-small (gpt-3.5-turbo)
- Scale only: vectara-experimental-summary-ext-2023-10-23-med (gpt-4.0)
- Growth and Scale: vectara-experimental-summary-ext-2023-12-11-sml (gpt-3.5-turbo)
- Scale only: vectara-experimental-summary-ext-2023-12-11-large (gpt-4.0-turbo)
These beta versions are a preview of our next improved summarizers. They are experimental and not officially supported, but we are considering promoting them to GA, pending feedback from our users.
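To make the plan availability easier to check in client code, the summarizers named on this page could be kept in a small lookup table. This is only a sketch: the SUMMARIZERS table and the summarizers_for_plan helper are hypothetical names, the table covers only the Growth and Scale plans mentioned here, and listing vectara-summary-ext-v1.2.0 under Growth is an assumption based on it being described as available to users generally.

```python
from typing import List

# Names, models, and plan restrictions copied from the lists on this page.
SUMMARIZERS = {
    "vectara-summary-ext-v1.2.0": {
        "model": "gpt-3.5-turbo",
        "plans": {"Growth", "Scale"},  # assumption: not restricted to Scale
        "beta": False,
    },
    "vectara-summary-ext-v1.3.0": {
        "model": "gpt-4.0",
        "plans": {"Scale"},
        "beta": False,
    },
    "vectara-experimental-summary-ext-2023-10-23-small": {
        "model": "gpt-3.5-turbo",
        "plans": {"Growth", "Scale"},
        "beta": True,
    },
    "vectara-experimental-summary-ext-2023-10-23-med": {
        "model": "gpt-4.0",
        "plans": {"Scale"},
        "beta": True,
    },
    "vectara-experimental-summary-ext-2023-12-11-sml": {
        "model": "gpt-3.5-turbo",
        "plans": {"Growth", "Scale"},
        "beta": True,
    },
    "vectara-experimental-summary-ext-2023-12-11-large": {
        "model": "gpt-4.0-turbo",
        "plans": {"Scale"},
        "beta": True,
    },
}


def summarizers_for_plan(plan: str, include_beta: bool = True) -> List[str]:
    """Return the summarizer names usable on a plan, sorted alphabetically."""
    return sorted(
        name
        for name, info in SUMMARIZERS.items()
        if plan in info["plans"] and (include_beta or not info["beta"])
    )


print(summarizers_for_plan("Growth"))
print(summarizers_for_plan("Scale", include_beta=False))
```

A table like this goes stale as summarizers are promoted or retired, so it is best treated as a local convenience rather than a source of truth.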
Beta Summarizer Example
The following example query selects the beta GPT 4.0 summarizer (only available to Scale users):
{
  "query": [
    {
      "query": "What is the infinite improbability drive?",
      "start": 0,
      "numResults": 10,
      "corpusKey": [
        {
          "customerId": 12345678,
          "corpusId": 1
        }
      ],
      "summary": [
        {
          "summarizerPromptName": "vectara-experimental-summary-ext-2023-10-23-med",
          "responseLang": "en",
          "maxSummarizedResults": 5
        }
      ]
    }
  ]
}
Default maxSummarizedResults Limit
The default limit of maxSummarizedResults is 10 search results for Growth plans, and this limit can be extended for Scale plan users. Setting the value closer to the limit generates a more comprehensive summary, while a lower value trades some coverage for a faster response.
maxSummarizedResults Example
This summarizer example attempts to balance creating a good quality summary with a reasonably fast response by setting maxSummarizedResults to 5. To use vectara-summary-ext-v1.2.0, send it as the summarizerPromptName as follows:
{
  "query": [
    {
      "query": "What is the infinite improbability drive?",
      "start": 0,
      "numResults": 10,
      "corpusKey": [
        {
          "customerId": 12345678,
          "corpusId": 1
        }
      ],
      "summary": [
        {
          "summarizerPromptName": "vectara-summary-ext-v1.2.0",
          "responseLang": "en",
          "maxSummarizedResults": 5
        }
      ]
    }
  ]
}