Hybrid Search
Semantic search enables users to discover information based not just on keywords, but on the contextual meaning and relevance of their queries. This neural search understands the intent behind user inquiries, leveraging natural language understanding to deliver more accurate and insightful search results.
Building on these semantic search capabilities, Vectara also provides a Hybrid Search option that offers a flexible approach to text retrieval. It combines partial, exact, and Boolean text matching with neural models, blending traditional keyword-based search with semantic search in what is called a "hybrid" retrieval model.
For example, you can use this in Vectara to:
- Include exact keyword matches for occasions where a search term was absent from Vectara's training data (e.g. product SKUs)
- Disable neural retrieval entirely, and instead use exact term matching
- Incorporate typical keyword modifiers like a NOT function, exact phrase matching, and wildcard prefixes of terms
Enabling Hybrid Search
Enable hybrid search by specifying a lambda value at query time, inside the lexicalInterpolationConfig object of each corpusKey entry. This value can range from 0.0 to 1.0 (inclusive).
As you ingest data and run queries, adjust the lambda value to find the balance that gives the best answer quality.
"corpusKey": [
  {
    "customerId": 123456789,
    "corpusId": 5,
    "semantics": 0,
    "metadataFilter": "",
    "lexicalInterpolationConfig": {
      "lambda": 0.025
    },
    "dim": []
  }
]
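As a minimal sketch, the corpusKey entry above could be assembled programmatically so the lambda value is validated before a query is sent. The helper name build_corpus_key and its parameter names are hypothetical and are not part of any Vectara SDK; only the field names are taken from the fragment above, and the rest of the query body is omitted.

```python
import json


def build_corpus_key(customer_id: int, corpus_id: int, lexical_weight: float = 0.025) -> dict:
    """Build one corpusKey entry (hypothetical helper, not a Vectara SDK function)."""
    # The documented range for lambda is 0.0 to 1.0, inclusive.
    if not 0.0 <= lexical_weight <= 1.0:
        raise ValueError("lambda must be between 0.0 and 1.0 inclusive")
    return {
        "customerId": customer_id,
        "corpusId": corpus_id,
        "semantics": 0,
        "metadataFilter": "",
        "lexicalInterpolationConfig": {"lambda": lexical_weight},
        "dim": [],
    }


# Reproduce the fragment shown above; the enclosing query object is not modeled here.
corpus_key = [build_corpus_key(123456789, 5, lexical_weight=0.025)]
print(json.dumps({"corpusKey": corpus_key}, indent=2))
```

Validating the range locally keeps an out-of-range value from reaching the API, where the resulting behavior is not described in this section.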
Experimenting with different lambda values
The default value of lambda is 0, which disables exact and Boolean text matching. A value of 1 disables neural retrieval instead, relying only on Boolean and exact text matching. Experimenting with the lambda value is useful if you're trying to evaluate how a keyword system, such as one based on Elasticsearch or Solr, compares to Vectara.
💡 You can test queries with different lambda values in our API Playground and in the Vectara Console.
Vectara also supports in-between values, which tell Vectara to consider both neural matching and Boolean and exact text matching, and then blend the scores from the two scoring models. Users often see the best results with a lambda value between 0.01 and 0.1, and we typically recommend starting experimentation with a lambda value of 0.025.
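To build intuition for how a small lambda can still change the ranking, here is a sketch that assumes the blend is a simple linear interpolation between the two scores. That formula is an assumption made for illustration; this page does not state how Vectara actually combines the scores. The function names, document names, and score values are all made up.

```python
def blend_scores(neural: float, lexical: float, lam: float) -> float:
    """Assumed linear interpolation: lam=0 is neural only, lam=1 is lexical only."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda must be between 0.0 and 1.0 inclusive")
    return (1.0 - lam) * neural + lam * lexical


# Toy (neural, lexical) score pairs, invented for this example.
candidates = {
    "doc_a": (0.82, 0.10),  # strong semantic match, weak keyword match
    "doc_b": (0.78, 0.95),  # slightly weaker semantically, near-exact keyword match
}


def rank(lam: float) -> list:
    """Return document names ordered by blended score, best first."""
    return sorted(
        candidates,
        key=lambda name: blend_scores(*candidates[name], lam),
        reverse=True,
    )


# Sweep a few lambda values to see where the ordering changes.
for lam in (0.0, 0.025, 0.1, 1.0):
    print(lam, rank(lam))
```

Under this assumed formula, the two endpoints match the behavior described above: 0 ignores the lexical score entirely and 1 ignores the neural score entirely.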
Syntax Interpretation
When interpreting query strings, Vectara treats the following syntax specially.
- Words that are quoted must match exactly in that order. For example, the query "blue shoes" must match the word blue followed immediately by shoes.
- A word fragment suffixed with an asterisk * is treated as a prefix match, meaning that it matches any word of which it is a prefix. For example, Miss* matches Mississippi.
- Words prefixed with a minus - sign are excluded from the results. To extend the previous example, -Mississippi would exclude results referencing the Magnolia State. Using -Miss* would exclude references to both Mississippi and Missouri.
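The three rules above can be illustrated with a small classifier that sorts the pieces of a query string into quoted phrases, prefix matches, exclusions, and plain terms. This is only a sketch of the syntax as described here, not Vectara's parser; the function name classify_query and the result keys are hypothetical, and edge cases such as a minus sign in front of a quoted phrase are not handled.

```python
import re

# A double-quoted phrase, or any run of non-whitespace characters.
TOKEN = re.compile(r'"([^"]+)"|(\S+)')


def classify_query(query: str) -> dict:
    """Group query tokens by the special syntax they use (illustrative only)."""
    parts = {
        "phrases": [],            # "blue shoes"
        "prefixes": [],           # Miss*
        "excluded": [],           # -Mississippi
        "excluded_prefixes": [],  # -Miss*
        "terms": [],              # everything else
    }
    for match in TOKEN.finditer(query):
        phrase, word = match.group(1), match.group(2)
        if phrase is not None:
            parts["phrases"].append(phrase)
            continue
        # A leading minus marks an exclusion; a trailing asterisk marks a prefix.
        negated = word.startswith("-") and len(word) > 1
        if negated:
            word = word[1:]
        is_prefix = word.endswith("*") and len(word) > 1
        if is_prefix:
            word = word[:-1]
        if negated and is_prefix:
            parts["excluded_prefixes"].append(word)
        elif negated:
            parts["excluded"].append(word)
        elif is_prefix:
            parts["prefixes"].append(word)
        else:
            parts["terms"].append(word)
    return parts


print(classify_query('"blue shoes" Miss* -Mississippi -Miss* sale'))
```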