Last updated: 2025-06-18 GMT+08:00
Quick Start
Initializing the EMS Client
This example initializes the EMS client configuration and starts the EMS service.
# Import modules
import os
import torch
import torch_npu
from ems import Ems, EmsConfig, EmsException, CcConfig, CcKvOption, KvBufferWrapper

# Initialize the Context Caching configuration
cc_config = CcConfig(rank_id=8, device_id=0, model_id="llama2-13b")

# Initialize Ems
config = EmsConfig(cc_config=cc_config)
try:
    Ems.init(config)
except EmsException as e:
    print(f"exception: {e}.")
    exit(1)
For more information about initializing the EMS client, see the Initialization section.
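The example above hard-codes rank_id, device_id, and model_id. In a multi-process deployment it may be convenient to read these values from the environment instead. The following is a minimal sketch under that assumption: the variable names EMS_RANK_ID, EMS_DEVICE_ID, and EMS_MODEL_ID and the helper cc_config_kwargs are hypothetical and are not defined by the EMS SDK. The defaults mirror the values used in this quick start.

```python
import os


def cc_config_kwargs(env=None):
    """Collect keyword arguments for CcConfig from environment variables.

    The variable names below are hypothetical (not part of the EMS SDK);
    the defaults are the values used in the quick-start example.
    """
    env = os.environ if env is None else env
    return {
        "rank_id": int(env.get("EMS_RANK_ID", "8")),
        "device_id": int(env.get("EMS_DEVICE_ID", "0")),
        "model_id": env.get("EMS_MODEL_ID", "llama2-13b"),
    }


# With the SDK installed, the result would feed the documented constructor:
#   cc_config = CcConfig(**cc_config_kwargs())
```

Passing a plain dictionary as env makes the helper easy to check without touching the real process environment.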
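Reading and Writing Context Caching
This example initializes the Context Caching configuration, obtains the Context Caching object, and then saves and loads device memory data.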
# Import modules
import os
import torch
import torch_npu
from ems import Ems, EmsConfig, EmsException, CcConfig, CcKvOption, KvBufferWrapper

# Initialize the Context Caching configuration
cc_config = CcConfig(rank_id=8, device_id=0, model_id="llama2-13b")

# Initialize Ems
config = EmsConfig(cc_config=cc_config)
try:
    Ems.init(config)
except EmsException as e:
    print(f"exception: {e}.")
    exit(1)

# Get the Context Caching object
cc = Ems.get_cc()
if cc is None:
    exit(1)

# Set the timeout for the save and load requests
option = CcKvOption(timeout=5000)

# Save a tensor
data = torch.ones(6, device="npu:1")
# Buffer size in bytes: element count times bytes per element
length = data.numel() * data.element_size()
key_list = ["hello"]
val_list = [[KvBufferWrapper(data.data_ptr(), length)]]
try:
    cc_result = cc.save(option, key_list, val_list)
except EmsException as e:
    print(f"failed to save, {e}.")
    exit(1)

# Load the saved data into a new tensor; it must have the same shape and dtype as the saved tensor
data = torch.zeros(6, device="npu:1")
val_list = [[KvBufferWrapper(data.data_ptr(), length)]]
try:
    cc_result = cc.load(option, key_list, val_list)
except EmsException as e:
    print(f"failed to load, {e}.")
    exit(1)
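The save and load calls above exchange raw buffers described by an address and a byte length, which is why the destination tensor needs the same shape and dtype as the source. The sketch below illustrates that round trip on host memory using only the Python standard library. It is a toy stand-in, not the EMS implementation: the InMemoryCache class and its (address, length) tuples are hypothetical and only mimic the key list and nested buffer list layout used in the example.

```python
import ctypes


class InMemoryCache:
    """Toy key-value cache over raw buffers; NOT the EMS implementation."""

    def __init__(self):
        self._store = {}

    def save(self, keys, buffers):
        # buffers[i] is a list of (address, byte_length) pairs for keys[i].
        for key, parts in zip(keys, buffers):
            self._store[key] = [ctypes.string_at(addr, n) for addr, n in parts]

    def load(self, keys, buffers):
        for key, parts in zip(keys, buffers):
            saved = self._store[key]
            if len(saved) != len(parts):
                raise ValueError(f"buffer count mismatch for {key!r}")
            for blob, (addr, n) in zip(saved, parts):
                # The destination must be exactly as large as the saved data.
                if len(blob) != n:
                    raise ValueError(f"size mismatch for {key!r}")
                ctypes.memmove(addr, blob, n)


# Six float32 values, mirroring torch.ones(6) with the default dtype.
src = (ctypes.c_float * 6)(*([1.0] * 6))
dst = (ctypes.c_float * 6)()  # zero-initialized destination
length = len(src) * ctypes.sizeof(ctypes.c_float)  # element count * bytes per element

cache = InMemoryCache()
cache.save(["hello"], [[(ctypes.addressof(src), length)]])
cache.load(["hello"], [[(ctypes.addressof(dst), length)]])
restored = list(dst)
```

After the round trip, restored holds the six values originally written to src. A destination buffer of a different byte length is rejected by the toy cache rather than partially overwritten.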