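Updated: 2024-11-21 GMT+08:00
Configuring Document Summarization (Java SDK)
Summarizes content retrieved from an existing knowledge base. Three strategies are available: stuff, refine, and map-reduce.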
- Stuff: fills all documents directly into the prompt and submits it to the model in a single request. Suitable when there are only a few documents.
    import com.huaweicloud.pangu.dev.sdk.api.llms.LLMs;
    import com.huaweicloud.pangu.dev.sdk.api.memory.bo.Document;
    import com.huaweicloud.pangu.dev.sdk.skill.DocSkill;
    import com.huaweicloud.pangu.dev.sdk.api.skill.Skills;
    import com.huaweicloud.pangu.dev.sdk.api.memory.vector.Vector;
    import com.huaweicloud.pangu.dev.sdk.api.memory.vector.Vectors;
    import com.huaweicloud.pangu.dev.sdk.api.memory.config.VectorStoreConfig;
    import com.huaweicloud.pangu.dev.sdk.api.embedings.Embeddings;
    import java.util.List;

    // Open the CSS vector store with the CSS embedding and the target index
    Vector cssVector = Vectors.of(Vectors.CSS,
        VectorStoreConfig.builder()
            .embedding(Embeddings.of(Embeddings.CSS))
            .indexName("test-stuff-document-062102")
            .build());

    // Retrieve documents matching the query
    String query = "杜甫";
    List<Document> docs = cssVector.similaritySearch(query, 4, 105);

    // Summarize the retrieved documents with the stuff strategy
    DocSkill docSkill = Skills.Document.newDocSummarizeStuffSkill(LLMs.of(LLMs.PANGU));
    System.out.println(docSkill.executeWithDocs(docs));
- Refine: summarizes the first document, then loops over the remaining documents and updates the summary with each one.
    import com.huaweicloud.pangu.dev.sdk.api.llms.LLMs;
    import com.huaweicloud.pangu.dev.sdk.api.memory.bo.Document;
    import com.huaweicloud.pangu.dev.sdk.skill.DocSkill;
    import com.huaweicloud.pangu.dev.sdk.api.skill.Skills;
    import com.huaweicloud.pangu.dev.sdk.api.memory.vector.Vector;
    import com.huaweicloud.pangu.dev.sdk.api.memory.vector.Vectors;
    import com.huaweicloud.pangu.dev.sdk.api.memory.config.VectorStoreConfig;
    import com.huaweicloud.pangu.dev.sdk.api.embedings.Embeddings;
    import java.util.List;

    // Open the CSS vector store with the CSS embedding and the target index
    Vector cssVector = Vectors.of(Vectors.CSS,
        VectorStoreConfig.builder()
            .embedding(Embeddings.of(Embeddings.CSS))
            .indexName("test-stuff-document-062102")
            .build());

    // Retrieve documents matching the query
    String query = "杜甫";
    List<Document> docs = cssVector.similaritySearch(query, 4, 105);

    // Summarize the retrieved documents with the refine strategy
    DocSkill docSkill = Skills.Document.newDocSummarizeRefineSkill(LLMs.of(LLMs.PANGU));
    System.out.println(docSkill.executeWithDocs(docs));
- Map-Reduce: summarizes each document separately, then submits the resulting summaries to the model. The summarization step is repeated iteratively when necessary.
    import com.huaweicloud.pangu.dev.sdk.api.llms.LLMs;
    import com.huaweicloud.pangu.dev.sdk.api.memory.bo.Document;
    import com.huaweicloud.pangu.dev.sdk.skill.DocSkill;
    import com.huaweicloud.pangu.dev.sdk.api.skill.Skills;
    import com.huaweicloud.pangu.dev.sdk.api.memory.vector.Vector;
    import com.huaweicloud.pangu.dev.sdk.api.memory.vector.Vectors;
    import com.huaweicloud.pangu.dev.sdk.api.memory.config.VectorStoreConfig;
    import com.huaweicloud.pangu.dev.sdk.api.embedings.Embeddings;
    import java.util.List;

    // Open the CSS vector store with the CSS embedding and the target index
    Vector cssVector = Vectors.of(Vectors.CSS,
        VectorStoreConfig.builder()
            .embedding(Embeddings.of(Embeddings.CSS))
            .indexName("test-stuff-document-062102")
            .build());

    // Retrieve documents matching the query
    String query = "杜甫";
    List<Document> docs = cssVector.similaritySearch(query, 4, 105);

    // Summarize the retrieved documents with the map-reduce strategy
    DocSkill docSkill = Skills.Document.newDocSummarizeMapReduceSkill(LLMs.of(LLMs.PANGU));
    System.out.println(docSkill.executeWithDocs(docs));
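To make the difference between the three strategies concrete, the following is a minimal plain-Java sketch of their control flow. It does not use the Pangu SDK and is not a description of how DocSkill is implemented internally. The class name SummarizeSketch, the prompt wording, the maxChars limit, the pairwise merging in the map-reduce loop, and the Function<String, String> that stands in for the model are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Illustrative sketch only; the model is represented by a hypothetical prompt-to-text function. */
final class SummarizeSketch {

    // Stuff: put every document into one prompt and call the model once.
    static String stuff(List<String> docs, Function<String, String> llm) {
        return llm.apply("Summarize:\n" + String.join("\n", docs));
    }

    // Refine: summarize the first document, then update that summary
    // once for each of the remaining documents.
    static String refine(List<String> docs, Function<String, String> llm) {
        if (docs.isEmpty()) {
            return "";
        }
        String summary = llm.apply("Summarize:\n" + docs.get(0));
        for (String doc : docs.subList(1, docs.size())) {
            summary = llm.apply("Existing summary:\n" + summary
                    + "\nRefine it using:\n" + doc);
        }
        return summary;
    }

    // Map-reduce: summarize each document on its own (map), then summarize
    // the combined summaries (reduce). While the combined summaries exceed
    // maxChars (an assumed stand-in for a prompt limit), merge them in pairs
    // and summarize again before the final call.
    static String mapReduce(List<String> docs, Function<String, String> llm, int maxChars) {
        List<String> summaries = new ArrayList<>();
        for (String doc : docs) {
            summaries.add(llm.apply("Summarize:\n" + doc));
        }
        while (summaries.size() > 1
                && String.join("\n", summaries).length() > maxChars) {
            List<String> next = new ArrayList<>();
            for (int i = 0; i < summaries.size(); i += 2) {
                int end = Math.min(i + 2, summaries.size());
                next.add(llm.apply("Summarize:\n"
                        + String.join("\n", summaries.subList(i, end))));
            }
            summaries = next; // each pass roughly halves the list, so the loop ends
        }
        return llm.apply("Summarize:\n" + String.join("\n", summaries));
    }

    public static void main(String[] args) {
        // Stub model: reports the prompt size instead of calling a real service.
        Function<String, String> stub = prompt -> "[summary of " + prompt.length() + " chars]";
        List<String> docs = List.of("first document", "second document", "third document");
        System.out.println(stuff(docs, stub));
        System.out.println(refine(docs, stub));
        System.out.println(mapReduce(docs, stub, 200));
    }
}
```

In this sketch, stuff issues one model call regardless of the number of documents, refine issues one call per document in sequence, and map-reduce issues one call per document plus at least one combining call. That trade-off is why stuff suits small document sets, while the other two strategies cope better when the documents would not fit into a single prompt.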