Updated: 2024-08-29 GMT+08:00

Document Q&A

Answers questions based on an existing knowledge base. Three strategies are supported: stuff, refine, and map-reduce.

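  • Stuff: Inserts all documents directly into the prompt and submits it to the model for an answer. Suitable when there are only a few documents.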
    import com.huaweicloud.pangu.dev.sdk.api.llms.LLMs;
    import com.huaweicloud.pangu.dev.sdk.api.memory.bo.Document;
    import com.huaweicloud.pangu.dev.sdk.skill.DocSkill;
    import com.huaweicloud.pangu.dev.sdk.api.skill.Skills;
    import com.huaweicloud.pangu.dev.sdk.api.memory.vector.Vector;
    import com.huaweicloud.pangu.dev.sdk.api.memory.vector.Vectors;
    import com.huaweicloud.pangu.dev.sdk.api.memory.config.VectorStoreConfig;
    import com.huaweicloud.pangu.dev.sdk.api.embedings.Embeddings;
    
    import java.util.List;
    
    Vector cssVector = Vectors.of(Vectors.CSS,
                VectorStoreConfig.builder()
                    .embedding(Embeddings.of(Embeddings.CSS))
                    .indexName("test-stuff-document-062102")
                    .build());
    
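    // Retrieve relevant documents from the vector store.
    // The query means: "Du Fu's poems represent the peak of which '-ism' in poetic art?"
    String query = "杜甫的诗代表了什么主义诗歌艺术的高峰?";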
    List<Document> docs = cssVector.similaritySearch(query, 4, 105);
    
    // Answer the question with the Stuff strategy
    DocSkill docSkill = Skills.Document.newDocAskStuffSkill(LLMs.of(LLMs.PANGU));
    
    System.out.println(docSkill.executeWithDocs(docs, query));
  • Refine: Answers the question from the first document, then loops over the remaining documents, updating the answer with each one.
    import com.huaweicloud.pangu.dev.sdk.api.llms.LLMs;
    import com.huaweicloud.pangu.dev.sdk.api.memory.bo.Document;
    import com.huaweicloud.pangu.dev.sdk.skill.DocSkill;
    import com.huaweicloud.pangu.dev.sdk.api.skill.Skills;
    import com.huaweicloud.pangu.dev.sdk.api.memory.vector.Vector;
    import com.huaweicloud.pangu.dev.sdk.api.memory.vector.Vectors;
    import com.huaweicloud.pangu.dev.sdk.api.memory.config.VectorStoreConfig;
    import com.huaweicloud.pangu.dev.sdk.api.embedings.Embeddings;
    
    import java.util.List;
    
    Vector cssVector = Vectors.of(Vectors.CSS,
                VectorStoreConfig.builder()
                    .embedding(Embeddings.of(Embeddings.CSS))
                    .indexName("test-stuff-document-062102")
                    .build());
    
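    // Retrieve relevant documents from the vector store.
    // The query means: "Du Fu's poems represent the peak of which '-ism' in poetic art?"
    String query = "杜甫的诗代表了什么主义诗歌艺术的高峰?";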
    List<Document> docs = cssVector.similaritySearch(query, 4, 105);
    
    // Answer the question with the Refine strategy
    DocSkill docSkill = Skills.Document.newDocAskRefineSkill(LLMs.of(LLMs.PANGU));
    
    System.out.println(docSkill.executeWithDocs(docs, query));
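  • Map-Reduce: Summarizes each document separately, then submits the summaries to the model. If necessary, the summarization step is repeated iteratively.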
    import com.huaweicloud.pangu.dev.sdk.api.llms.LLMs;
    import com.huaweicloud.pangu.dev.sdk.api.memory.bo.Document;
    import com.huaweicloud.pangu.dev.sdk.skill.DocSkill;
    import com.huaweicloud.pangu.dev.sdk.api.skill.Skills;
    import com.huaweicloud.pangu.dev.sdk.api.memory.vector.Vector;
    import com.huaweicloud.pangu.dev.sdk.api.memory.vector.Vectors;
    import com.huaweicloud.pangu.dev.sdk.api.memory.config.VectorStoreConfig;
    import com.huaweicloud.pangu.dev.sdk.api.embedings.Embeddings;
    
    import java.util.List;
    
    Vector cssVector = Vectors.of(Vectors.CSS,
                VectorStoreConfig.builder()
                    .embedding(Embeddings.of(Embeddings.CSS))
                    .indexName("test-stuff-document-062102")
                    .build());
    
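    // Retrieve relevant documents from the vector store.
    // The query means: "Du Fu's poems represent the peak of which '-ism' in poetic art?"
    String query = "杜甫的诗代表了什么主义诗歌艺术的高峰?";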
    List<Document> docs = cssVector.similaritySearch(query, 4, 105);
    
    // Answer the question with the Map-Reduce strategy
    DocSkill docSkill = Skills.Document.newDocAskMapReduceSkill(LLMs.of(LLMs.PANGU));
    
    System.out.println(docSkill.executeWithDocs(docs, query));
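
To make the difference between the three strategies concrete, the following is a minimal, SDK-independent sketch of how each one might drive the model. It is not the SDK's implementation. The class `DocQaSketch`, the prompt wording, the `maxChars` limit, and the use of a plain `Function<String, String>` as a stand-in for the model are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only; names and prompts are not part of the SDK.
class DocQaSketch {

    // Stuff: put every document into one prompt and call the model once.
    static String stuff(List<String> docs, String query, Function<String, String> llm) {
        return llm.apply("Context:\n" + String.join("\n", docs) + "\nQuestion: " + query);
    }

    // Refine: answer from the first document, then revise the answer
    // once for each remaining document.
    static String refine(List<String> docs, String query, Function<String, String> llm) {
        String answer = "";
        for (int i = 0; i < docs.size(); i++) {
            String prompt = (i == 0)
                    ? "Context:\n" + docs.get(i) + "\nQuestion: " + query
                    : "Existing answer: " + answer + "\nNew context:\n" + docs.get(i)
                            + "\nRefine the answer to: " + query;
            answer = llm.apply(prompt);
        }
        return answer;
    }

    // Map-Reduce: summarize each document on its own, then answer from the
    // summaries. While the combined summaries exceed maxChars (an assumed
    // limit), summarize adjacent pairs again before the final call.
    static String mapReduce(List<String> docs, String query,
                            Function<String, String> llm, int maxChars) {
        List<String> summaries = new ArrayList<>();
        for (String doc : docs) {
            summaries.add(llm.apply("Summarize for the question \"" + query + "\":\n" + doc));
        }
        while (summaries.size() > 1 && String.join("\n", summaries).length() > maxChars) {
            List<String> merged = new ArrayList<>();
            for (int i = 0; i < summaries.size(); i += 2) {
                int end = Math.min(i + 2, summaries.size());
                merged.add(llm.apply("Summarize:\n" + String.join("\n", summaries.subList(i, end))));
            }
            summaries = merged; // the list shrinks each round, so the loop terminates
        }
        return llm.apply("Context:\n" + String.join("\n", summaries) + "\nQuestion: " + query);
    }

    public static void main(String[] args) {
        // Stand-in model that only reports how long the prompt was.
        Function<String, String> fakeLlm = prompt -> "reply to " + prompt.length() + " chars";
        List<String> docs = Arrays.asList("doc one", "doc two", "doc three");
        System.out.println(stuff(docs, "q", fakeLlm));
        System.out.println(refine(docs, "q", fakeLlm));
        System.out.println(mapReduce(docs, "q", fakeLlm, 1000));
    }
}
```

In this sketch, stuff makes a single model call regardless of the number of documents, refine makes one call per document, and map-reduce makes one call per document plus at least one final call. That trade-off is why stuff suits small document sets, while the other two cost more calls but keep each prompt short.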
    
