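Updated: 2024-11-21 GMT+08:00
Configuring Document Q&A (Python SDK)
The document Q&A skill answers questions based on an existing knowledge base. Three strategies are available: stuff, refine, and map-reduce.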
- Stuff: inserts all retrieved documents directly into the prompt and submits it to the model for an answer. Suitable when there are only a few documents.
    from pangukitsappdev.api.embeddings.factory import Embeddings
    from pangukitsappdev.api.llms.factory import LLMs
    from pangukitsappdev.api.memory.vector.factory import Vectors
    from pangukitsappdev.api.memory.vector.vector_config import VectorStoreConfig, ServerInfoCss
    from pangukitsappdev.skill.doc.ask import DocAskStuffSkill

    # Configure the CSS vector store
    vector_store_config = VectorStoreConfig(store_name="css",
                                            index_name="your_index_name",
                                            embedding=Embeddings.of("css"),
                                            text_key="name",
                                            vector_fields=["description"],
                                            distance_strategy="inner_product",
                                            server_info=ServerInfoCss(env_prefix="sdk.memory.css"))
    vector_api = Vectors.of("css", vector_store_config)

    # Retrieve: fetch the 4 documents most similar to the query
    # (the query asks: "Du Fu's poetry represents the peak of which school of poetic art?")
    query = "杜甫的诗代表了什么主义诗歌艺术的高峰?"
    docs = vector_api.similarity_search(query, 4)

    # Answer: pass all retrieved documents to the model in a single prompt
    doc_skill = DocAskStuffSkill(LLMs.of("pangu"))
    print(doc_skill.execute({"documents": docs, "question": query}))
- Refine: produces an answer from the first document, then loops over the remaining documents to iteratively update that answer.
    from pangukitsappdev.api.embeddings.factory import Embeddings
    from pangukitsappdev.api.llms.factory import LLMs
    from pangukitsappdev.api.memory.vector.factory import Vectors
    from pangukitsappdev.api.memory.vector.vector_config import VectorStoreConfig, ServerInfoCss
    from pangukitsappdev.skill.doc.ask import DocAskRefineSkill

    # Configure the CSS vector store
    vector_store_config = VectorStoreConfig(store_name="css",
                                            index_name="your_index_name",
                                            embedding=Embeddings.of("css"),
                                            text_key="name",
                                            vector_fields=["description"],
                                            distance_strategy="inner_product",
                                            server_info=ServerInfoCss(env_prefix="sdk.memory.css"))
    vector_api = Vectors.of("css", vector_store_config)

    # Retrieve: fetch the 4 documents most similar to the query
    # (the query asks: "Du Fu's poetry represents the peak of which school of poetic art?")
    query = "杜甫的诗代表了什么主义诗歌艺术的高峰?"
    docs = vector_api.similarity_search(query, 4)

    # Answer: refine the answer document by document
    doc_skill = DocAskRefineSkill(LLMs.of("pangu"))
    print(doc_skill.execute({"documents": docs, "question": query}))
- Map-Reduce: summarizes each document individually, then submits the summaries to the model. The summarization step is repeated if necessary.
    from pangukitsappdev.api.embeddings.factory import Embeddings
    from pangukitsappdev.api.llms.factory import LLMs
    from pangukitsappdev.api.memory.vector.factory import Vectors
    from pangukitsappdev.api.memory.vector.vector_config import VectorStoreConfig, ServerInfoCss
    from pangukitsappdev.skill.doc.ask import DocAskMapReduceSkill

    # Configure the CSS vector store
    vector_store_config = VectorStoreConfig(store_name="css",
                                            index_name="your_index_name",
                                            embedding=Embeddings.of("css"),
                                            text_key="name",
                                            vector_fields=["description"],
                                            distance_strategy="inner_product",
                                            server_info=ServerInfoCss(env_prefix="sdk.memory.css"))
    vector_api = Vectors.of("css", vector_store_config)

    # Retrieve: fetch the 4 documents most similar to the query
    # (the query asks: "Du Fu's poetry represents the peak of which school of poetic art?")
    query = "杜甫的诗代表了什么主义诗歌艺术的高峰?"
    docs = vector_api.similarity_search(query, 4)

    # Answer: summarize each document, then answer from the summaries
    doc_skill = DocAskMapReduceSkill(LLMs.of("pangu"))
    print(doc_skill.execute({"documents": docs, "question": query}))
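To make the difference between the three strategies concrete, the following is a minimal, SDK-independent sketch of how each one could drive a model. It is not how pangukitsappdev implements its skills. The function names (ask_stuff, ask_refine, ask_map_reduce), the prompt wording, the max_chars threshold, and the LLM callable type are all hypothetical, chosen only for illustration.

```python
from typing import Callable, List

# Hypothetical stand-in for a model: any function from prompt text to reply text.
LLM = Callable[[str], str]


def _require_docs(docs: List[str]) -> None:
    if not docs:
        raise ValueError("at least one document is required")


def ask_stuff(llm: LLM, docs: List[str], question: str) -> str:
    """Stuff: place every document in one prompt and ask once."""
    _require_docs(docs)
    context = "\n\n".join(docs)
    return llm(f"Context:\n{context}\n\nQuestion: {question}")


def ask_refine(llm: LLM, docs: List[str], question: str) -> str:
    """Refine: answer from the first document, then update per later document."""
    _require_docs(docs)
    answer = llm(f"Context:\n{docs[0]}\n\nQuestion: {question}")
    for doc in docs[1:]:
        answer = llm(
            f"Existing answer: {answer}\n\nNew context:\n{doc}\n\n"
            f"Question: {question}\n"
            "Update the existing answer if the new context helps."
        )
    return answer


def ask_map_reduce(llm: LLM, docs: List[str], question: str,
                   max_chars: int = 2000) -> str:
    """Map-reduce: summarize each document, re-summarize while the combined
    summaries exceed max_chars (an assumed size limit), then answer."""
    _require_docs(docs)
    # Map step: one summarization call per document.
    summaries = [llm(f"Summarize for the question '{question}':\n{doc}")
                 for doc in docs]
    # Optional extra rounds: merge summaries pairwise and summarize again.
    while len(summaries) > 1 and sum(len(s) for s in summaries) > max_chars:
        merged = ["\n".join(summaries[i:i + 2])
                  for i in range(0, len(summaries), 2)]
        summaries = [llm(f"Summarize for the question '{question}':\n{m}")
                     for m in merged]
    # Reduce step: answer from the (short) summaries in a single prompt.
    return ask_stuff(llm, summaries, question)
```

In this sketch, stuff costs one model call regardless of document count, refine costs one call per document, and map-reduce costs one call per document plus at least one final call, which is why stuff is the natural choice only when the documents fit comfortably in a single prompt.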