Pangu Large Models
The Pangu Large Models connector is used to connect to the Huawei Cloud PanguLM.
PanguLM is an all-in-one platform for large model development that integrates data management, model training, and model deployment. It supports custom model development and offers a full-lifecycle toolchain to help developers build and deploy models efficiently. Enterprises can flexibly choose the services and products that suit their model and application development needs.
Creating a PanguLM Connection
- Log in to the new ROMA Connect console.
- In the navigation pane on the left, choose Connector. On the page displayed, click New Connection.
- Select the PanguLM connector.
- In the dialog box displayed, configure the connector and click OK.
| Parameter | Description |
|---|---|
| Name | Enter the connector instance name. |
| App Key | Access key ID (AK) of the current account. Obtain the AK by referring to Access Keys. If an AK/SK pair has been generated, find the downloaded AK/SK file (such as credentials.csv). |
| App Secret | Secret access key (SK) of the current account. Obtain the SK by referring to Access Keys. If an AK/SK pair has been generated, find the downloaded AK/SK file (such as credentials.csv). |
| Description | Enter a description of the connector to help identify it. |
Action
- Q&A
- General text
Configuration Parameters
Q&A

| Parameter | Description |
|---|---|
| Content-Type | Request body MIME type. |
| project_id | Project ID. |
| deployment_id | Deployment ID of the model. |
| region_id | Region ID. |
| messages | Multi-turn dialogue. |
| role | Role. |
| content | Question-answer pair content. |
| user | Unique identifier of a customer. The value contains 1 to 64 characters. |
| stream | Whether to enable streaming mode. |
| temperature | Diversity and creativity of the generated text. The value ranges from 0 to 1, where 0 indicates the lowest diversity. Generally, a lower temperature suits deterministic tasks, while a higher temperature, such as 0.9, suits creative tasks. temperature is one of the key parameters that affect the output quality and diversity of an LLM. Other parameters, such as top_p, can also adjust the behavior and preferences of the model, but do not use them at the same time. |
| top_p | An alternative to sampling with temperature, called nucleus sampling, in which the model considers only the tokens comprising the probability mass specified by top_p. For example, 0.1 means that only the tokens comprising the top 10% probability mass are considered. Use either top_p or temperature to adjust the tendency of the generated text; do not modify both at the same time. |
| max_tokens | Length and quality of chat replies. A large max_tokens value can produce a long, complete reply, but also increases the risk of irrelevant or duplicate content. A small value produces short, concise replies, but may cut the content off or make it discontinuous. Select a value based on your scenario and requirements. The value ranges from 1 to 2048, and the default value is 16. |
| n | Number of answers generated for each question. The value can be 1 (default) or 2. If you set it to 2, the API returns an array containing two answers. |
| presence_penalty | Penalty applied to repetition in the generated text. Positive values penalize new tokens based on whether they have already appeared in the text so far, increasing the model's likelihood of talking about new topics. This helps make the output more creative and diverse by reducing repetition. The value ranges from -2 to 2. |
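For reference, the following is a minimal sketch of how the Q&A parameters above could be assembled into a request body and sent from Python. The endpoint URL, path layout, and authentication handling shown here are assumptions for illustration only; refer to the PanguLM API reference for the actual request format.

```python
import json
import requests

# Hypothetical values -- replace with the IDs configured for your deployment.
PROJECT_ID = "<project_id>"
DEPLOYMENT_ID = "<deployment_id>"
REGION_ID = "<region_id>"

# The host and path below are placeholders, not the documented endpoint;
# check the PanguLM API reference for the exact URL.
url = (
    f"https://pangu.{REGION_ID}.example.com"
    f"/v1/{PROJECT_ID}/deployments/{DEPLOYMENT_ID}/chat/completions"
)

# Request body built from the Q&A parameters described above.
payload = {
    "messages": [                      # multi-turn dialogue
        {"role": "user", "content": "What is ROMA Connect?"}
    ],
    "user": "demo-user-001",           # unique caller identifier (1 to 64 characters)
    "stream": False,                   # return the whole answer at once
    "temperature": 0.3,                # lower value -> more deterministic output
    "max_tokens": 256,                 # cap the reply length (default is 16)
    "n": 1,                            # number of answers to generate
    "presence_penalty": 0,             # no extra penalty on repeated topics
}

headers = {
    "Content-Type": "application/json",
    # Authentication (for example, AK/SK signing) is omitted here; the connector
    # handles it with the App Key and App Secret configured earlier.
}

response = requests.post(url, headers=headers, data=json.dumps(payload), timeout=30)
print(response.status_code, response.text)
```

Note that the sketch sets only temperature and leaves top_p out, following the guidance above not to adjust both parameters at the same time.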
General text

| Parameter | Description |
|---|---|
| Content-Type | Request body MIME type. |
| project_id | Project ID. |
| deployment_id | Deployment ID of the model. |
| region_id | Region ID. |
| prompt | Input text, which can contain 1 to 4,096 characters. |
| user | Unique identifier of a customer. The value contains 1 to 64 characters. |
| stream | Whether to enable streaming mode. |
| temperature | Diversity and creativity of the generated text. The value ranges from 0 to 1, where 0 indicates the lowest diversity. Generally, a lower temperature suits deterministic tasks, while a higher temperature, such as 0.9, suits creative tasks. temperature is one of the key parameters that affect the output quality and diversity of an LLM. Other parameters, such as top_p, can also adjust the behavior and preferences of the model, but do not use them at the same time. |
| top_p | An alternative to sampling with temperature, called nucleus sampling, in which the model considers only the tokens comprising the probability mass specified by top_p. For example, 0.1 means that only the tokens comprising the top 10% probability mass are considered. Use either top_p or temperature to adjust the tendency of the generated text; do not modify both at the same time. |
| max_tokens | Length and quality of the generated text. A large max_tokens value can produce a long, complete reply, but also increases the risk of irrelevant or duplicate content. A small value produces short, concise replies, but may cut the content off or make it discontinuous. Select a value based on your scenario and requirements. The value ranges from 1 to 2048, and the default value is 16. |
| n | Number of answers generated for each question. The value can be 1 (default) or 2. If you set it to 2, the API returns an array containing two answers. |
| presence_penalty | Penalty applied to repetition in the generated text. Positive values penalize new tokens based on whether they have already appeared in the text so far, increasing the model's likelihood of talking about new topics. This helps make the output more creative and diverse by reducing repetition. The value ranges from -2 to 2. |
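Similarly, a minimal sketch of a General text request built from the prompt-based parameters above. As before, the endpoint URL and path layout are assumptions for illustration; check the PanguLM API reference for the actual request format.

```python
import json
import requests

# Hypothetical values -- replace with the IDs configured for your deployment.
PROJECT_ID = "<project_id>"
DEPLOYMENT_ID = "<deployment_id>"
REGION_ID = "<region_id>"

# Placeholder endpoint for illustration only.
url = (
    f"https://pangu.{REGION_ID}.example.com"
    f"/v1/{PROJECT_ID}/deployments/{DEPLOYMENT_ID}/text/completions"
)

# Request body built from the General text parameters described above.
payload = {
    "prompt": "Write a short product description for an API gateway.",
    "user": "demo-user-001",      # unique caller identifier (1 to 64 characters)
    "stream": False,              # return the whole completion at once
    "top_p": 0.9,                 # nucleus sampling; do not combine with temperature
    "max_tokens": 200,            # cap the completion length (default is 16)
    "n": 1,                       # generate a single completion
}

headers = {"Content-Type": "application/json"}

response = requests.post(url, headers=headers, data=json.dumps(payload), timeout=30)
print(response.status_code, response.text)
```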