One approach we have used with good results is Retrieval-Augmented Generation (RAG) together with the language model; that way you can contextualize the model and use your own data as the source for its output.
Yes, as mentioned, we have tried that here as well, but were not very impressed with the result.
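For illustration, here is a minimal sketch of the RAG pattern described above: retrieve the snippets from your own documents that best match the question and put them into the prompt before calling the model. The toy keyword retriever and the `call_llm` placeholder are assumptions for the example, not a specific library or the setup used in the posts above.

```python
# Minimal RAG sketch: retrieve the most relevant snippets from your own
# documents and prepend them to the prompt so the model answers from that
# context instead of from memory alone.
from collections import Counter
import math


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (toy keyword retriever)."""
    q_vec = Counter(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda d: cosine_similarity(q_vec, Counter(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Augment the user question with the retrieved context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


# Usage: pass the prompt to whatever LLM client you use, e.g.
# answer = call_llm(build_rag_prompt("What is our refund policy?", company_docs))
```

In practice you would swap the keyword retriever for embeddings and a vector store, but the structure (retrieve, build prompt, generate) stays the same.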
The use of RAG does not completely eliminate the general challenges faced by LLMs, including hallucination.
For example, a chatbot powered by large language models (LLMs), such as ChatGPT, may embed plausible-sounding random falsehoods within its generated content. Researchers have recognized this issue, and by 2023 analysts estimated that chatbots hallucinate as much as 27% of the time,[8] with factual errors present in 46% of generated texts.[9] Detecting and mitigating these hallucinations poses significant challenges for the practical deployment and reliability of LLMs in real-world scenarios.[10][8][9]
Further reading: The hallucination phenomenon is still not completely understood. Researchers have also proposed that hallucinations are inevitable and an innate limitation of large language models.[73] Therefore, there is still ongoing research to try to mitigate their occurrence.[74] In particular, it has been shown that language models not only hallucinate but also amplify hallucinations, even models that were designed to alleviate this issue.[75]
Ji et al.[76] divide common mitigation methods into two categories: data-related methods, and modeling and inference methods. Data-related methods include building a faithful dataset, cleaning data automatically, and information augmentation, i.e. augmenting the inputs with external information. Modeling and inference methods include changes to the architecture (modifying the encoder, the attention mechanism, or the decoder in various ways), changes to the training process, such as using reinforcement learning, and post-processing methods that can correct hallucinations in the output.
Researchers have proposed a variety of mitigation measures, including getting different chatbots to debate one another until they reach consensus on an answer.[77] Another approach actively validates low-confidence generations of the model against web search results: it has been shown that a sentence is hallucinated more often when the model has already hallucinated in its previously generated sentences for the same input, and the model is instructed to create a validation question that checks the correctness of the information about the selected concept via the Bing Search API.[78] An extra layer of logic-based rules has been proposed on top of this web-search mitigation, using different ranks of web pages as a hierarchical knowledge base.[79] When no external data sources are available to validate LLM-generated responses (or the responses are already based on external data, as in RAG), model uncertainty estimation techniques from machine learning may be applied to detect hallucinations.[80]
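As a concrete illustration of the uncertainty-estimation idea at the end, one simple sketch is self-consistency sampling: ask the model the same question several times at a higher temperature and flag the answer when the samples disagree. The `generate` callable, the sample count, and the threshold below are illustrative assumptions, not the specific method from [80].

```python
# Rough self-consistency check: sample the same question several times and
# flag likely hallucination when the answers disagree. The `generate` argument
# stands in for any LLM call that accepts a prompt and a temperature.
from collections import Counter
from typing import Callable


def agreement_score(answers: list[str]) -> float:
    """Fraction of samples that match the most common (normalized) answer."""
    normalized = [a.strip().lower() for a in answers]
    most_common_count = Counter(normalized).most_common(1)[0][1]
    return most_common_count / len(normalized)


def flag_if_uncertain(
    prompt: str,
    generate: Callable[[str, float], str],
    n_samples: int = 5,
    threshold: float = 0.6,
) -> tuple[str, bool]:
    """Return (answer, flagged). flagged=True means low agreement between
    samples, so the answer should be verified against an external source."""
    samples = [generate(prompt, 0.8) for _ in range(n_samples)]
    flagged = agreement_score(samples) < threshold
    # Return the most common sample as the answer.
    best = Counter(s.strip() for s in samples).most_common(1)[0][0]
    return best, flagged
```

This only measures how consistent the model is with itself, so it catches unstable answers rather than confidently repeated errors; for those, external validation such as the web-search approach above is still needed.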