Hallucination Evaluation
The Vectara Factual Consistency Score automatically evaluates and detects
hallucinations in generated output. Based on a more advanced version of the
Hughes Hallucination Evaluation Model (HHEM), the Factual Consistency Score
enables you to evaluate the likelihood that an AI-generated summary is
factually consistent with the search results it is based on. This calibrated
score ranges from 0.0 to 1.0. A higher score indicates a greater probability
that the summary is factually accurate, while a lower score indicates a
greater probability of hallucinations.
For example, a score of 0.95 suggests a 95% likelihood that the summary is
free of hallucinations and aligns with the original content. A lower score of
0.40 indicates only a 40% likelihood of factual consistency, meaning the
summary is more likely to contain one or more factual inaccuracies, however
minor. We suggest starting with a threshold of 0.5 as an initial guideline.
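That guideline amounts to a simple cutoff check on the returned score. The following Python sketch is purely illustrative, assuming you already have a score in hand; the helper and its default threshold are not part of the API, and you should tune the cutoff for your own use case.

# Illustrative helper: treat the summary as likely factually consistent when
# its calibrated score meets or exceeds the chosen threshold (0.5 to start).
def is_likely_consistent(score, threshold=0.5):
    return score >= threshold

print(is_likely_consistent(0.95))  # True: ~95% likelihood of consistency
print(is_likely_consistent(0.40))  # False: more likely to contain inaccuracies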
Enable the Factual Consistency Score
In your summarization request, set the factual_consistency_score field to
true.
The Factual Consistency Score returns a calibrated value in the
factual_consistency field of the summary message. The score field contains a
value between 0.0 and 1.0.
"summary": [
{
"max_summarized_results": 3,
"response_lang": "en",
"factual_consistency_score": true
},
}
]
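For reference, here is a minimal Python sketch of a summarization request that includes this summary block. The endpoint, headers, credentials, and outer request fields (query, corpusKey) are assumptions based on the v1 query API; only the summary block itself comes from this page, so check the API reference for the exact request shape your account uses.

import requests

API_KEY = "zqt_..."       # placeholder personal API key (assumption)
CUSTOMER_ID = "1234567"   # placeholder customer ID (assumption)

body = {
    "query": [
        {
            "query": "What is the answer to life, the universe, and everything?",
            "corpusKey": [{"corpusId": 1}],   # assumed corpus reference
            "summary": [
                {
                    "max_summarized_results": 3,
                    "response_lang": "en",
                    "factual_consistency_score": True,
                }
            ],
        }
    ]
}

response = requests.post(
    "https://api.vectara.io/v1/query",        # assumed v1 query endpoint
    json=body,
    headers={"customer-id": CUSTOMER_ID, "x-api-key": API_KEY},
)
response.raise_for_status()
print(response.json())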
In the following example, the summary shows a factualConsistency score of
0.98, which indicates a 98% likelihood that the summary is factually
consistent.
"summary": [
{
"text": "According to the novel 'The Hitchhiker's Guide to the Galaxy' by Douglas Adams, the answer to the ultimate question of life, the universe, and everything is 42.",
"lang": "en",
"factualConsistency": {
"score": 0.98
"status":{
"code":"OK",
"statusDetail":"",
"cause":null
}
},
}
]
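Once the response is parsed, the score can be read back out of the structure shown above. The Python sketch below assumes the parsed JSON contains the "summary" list exactly as in the example; the helper name and the sample dictionary are hypothetical.

# Hypothetical helper: return the calibrated score from the first summary
# whose factualConsistency status is OK, or None if no score is present.
def get_factual_consistency_score(response_json):
    for summary in response_json.get("summary", []):
        fc = summary.get("factualConsistency")
        if fc and fc.get("status", {}).get("code") == "OK":
            return fc.get("score")
    return None

# Sample data mirroring the response fragment above.
example = {
    "summary": [
        {
            "text": "...",
            "lang": "en",
            "factualConsistency": {
                "score": 0.98,
                "status": {"code": "OK", "statusDetail": "", "cause": None},
            },
        }
    ]
}
print(get_factual_consistency_score(example))  # 0.98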