Word Segmentation
Introduction
This API segments the input text into words and can optionally tag each word with its part of speech.
For details about endpoints, see Endpoints.
URI
- URI format
POST /v1/{project_id}/nlp-fundamental/segment
- Parameter description
Table 1 URI parameters
| Parameter | Mandatory | Description |
|---|---|---|
| project_id | Yes | Project ID. For details about how to obtain the project ID, see Obtaining a Project ID. |
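The URI format above can be turned into a full request URL with a small helper. This is a minimal sketch: the default endpoint host is taken from the example request on this page and may differ for your region, and the function name `segment_url` is hypothetical.

```python
from urllib.parse import quote

# Assumption: this host is copied from the example request on this page.
# Check the Endpoints page for the host that applies to your region.
DEFAULT_ENDPOINT = "https://nlp-ext.ap-southeast-3.myhuaweicloud.com"


def segment_url(project_id: str, endpoint: str = DEFAULT_ENDPOINT) -> str:
    """Return the full URL of the word segmentation API for one project."""
    if not project_id:
        raise ValueError("project_id is mandatory")
    # Percent-encode the ID so that unexpected characters cannot alter the path.
    safe_id = quote(project_id, safe="")
    base = endpoint.rstrip("/")
    return f"{base}/v1/{safe_id}/nlp-fundamental/segment"
```

Calling `segment_url("my-project-id")` yields the endpoint host followed by `/v1/my-project-id/nlp-fundamental/segment`.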
Request
Table 2 describes the request parameters.
Table 2 Request parameters
| Parameter | Type | Mandatory | Description |
|---|---|---|---|
| text | String | Yes | Text to be segmented. The text must be UTF-8 encoded and contain 1 to 2,000 characters. |
| pos_switch | Integer | No | Whether to enable part-of-speech (POS) tagging. The options are 1 (yes) and 0 (no). The default value is 0. |
| lang | String | No | Language of the text. English (en) is currently supported. |
| criterion | String | No | Word segmentation criterion. The default criterion for English text is Penn TreeBank. You do not need to configure this parameter. |
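The constraints on the request parameters can be checked on the client before the call is made. The sketch below builds the JSON body and rejects values outside the documented ranges. The function name `build_segment_body` is hypothetical, and counting characters with Python's `len` is an assumption, because the service may count characters differently.

```python
import json
from typing import Optional


def build_segment_body(text: str, pos_switch: int = 0, lang: Optional[str] = None) -> str:
    """Return the JSON request body after validating the documented limits."""
    # Assumption: one Python code point counts as one character for the service.
    if not 1 <= len(text) <= 2000:
        raise ValueError("text must contain 1 to 2,000 characters")
    if type(pos_switch) is not int or pos_switch not in (0, 1):
        raise ValueError("pos_switch must be 0 or 1")
    body = {"text": text, "pos_switch": pos_switch}
    if lang is not None:
        body["lang"] = lang
    # criterion is left out because it does not need to be configured.
    return json.dumps(body, ensure_ascii=False)
```

Optional parameters are omitted from the body when they are not set, so the service applies its own defaults.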
Response
Table 3 describes the response parameters.
Table 3 Response parameters
| Parameter | Type | Description |
|---|---|---|
| words | Array of words | Word segmentation result. For details, see Table 4. |
| error_code | String | Error code returned when the API call fails. For details, see Error Code. This parameter is not included when the API call succeeds. |
| error_msg | String | Error message returned when the API call fails. This parameter is not included when the API call succeeds. |
Table 4 Data structure of words
| Parameter | Type | Description |
|---|---|---|
| content | String | Word text. |
| pos | String | Part of speech of the word. For details, see Table 5, Table 6, and Table 7. |
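A response body can be mapped to typed objects as sketched below. The names `Word`, `SegmentError`, and `parse_segment_response` are hypothetical. Treating `pos` as optional is an assumption: this page does not state whether the field is returned when `pos_switch` is 0.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    content: str
    pos: Optional[str]  # None if the response carries no pos field (assumption)


class SegmentError(Exception):
    """Raised when the response body contains error_code."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def parse_segment_response(raw: str) -> List[Word]:
    """Convert a JSON response body into a list of Word objects."""
    data = json.loads(raw)
    # error_code is only included when the API call failed.
    if "error_code" in data:
        raise SegmentError(data["error_code"], data.get("error_msg", ""))
    return [Word(item["content"], item.get("pos")) for item in data.get("words", [])]
```

Raising an exception for failed calls keeps the success path free of error checks, but returning the error fields as data would work equally well.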
Table 5 Three-class POS tags
| Class-1 POS | Class-2 POS | Class-3 POS |
|---|---|---|
| n: Noun | nr: Name of a person | - |
| | ns: Place name | nsf: Transliterated place name |
| | nt: Organization or group name | - |
| | nz: Other exclusive name | - |
| | nl: Nominal locution | - |
| | ng: Nominal morpheme | - |
| t: Time word | tg: Time morpheme | - |
| s: Locative word | - | - |
| f: Positional word | - | - |
| v: Verb | vd: Adverbial form of a verb | - |
| | vn: Gerund | - |
| | vshi: Copula verb | - |
| | vyou: Verb indicating "has/have" | - |
| | vf: Directional verb | - |
| | vx: Formal verb | - |
| | vi: Intransitive verb | - |
| | vl: Verbal locution | - |
| | vg: Verbal morpheme | - |
| a: Adjective | ad: Adverbial adjective | - |
| | an: Nominal adjective | - |
| | ag: Adjective morpheme | - |
| | al: Adjective locution | - |
| b: Distinguishing word | bl: Distinguishing locution | - |
| z: Status word | - | - |
| r: Pronoun | rr: Personal pronoun | - |
| | rz: Demonstrative pronoun | - |
| | ry: Interrogative pronoun | - |
| | rg: Pronominal morpheme | - |
| m: Numeral | mq: Number word | - |
| | mg: A, B, C, D, E, F, G, H, N, and G | - |
| q: Classifier | qv: Verbal classifier | - |
| | qt: Time classifier | - |
| d: Adverb | - | - |
| p: Preposition | pba: Preposition ba | - |
| | pbei: Preposition bei | - |
| c: Conjunction | cc: Coordinating conjunction | - |
| u: Particle | uzhe: Particle | - |
| | ule: Particle | - |
| | uguo: Particle | - |
| | ude1: Particle | - |
| | ude2: Particle | - |
| | ude3: Particle | - |
| | usuo: Particle | - |
| | udeng: Particle | - |
| | uyy: Particle | - |
| | udh: Particle | - |
| | uls: Particle | - |
| | uzhi: Particle | - |
| | ulian: Particle | - |
| e: Exclamation | - | - |
| y: Discourse word | - | - |
| o: Onomatopoeia | - | - |
| h: Prefix | - | - |
| k: Suffix | - | - |
| x: Character string | xe: Email character string | - |
| | xs: Weibo session separator | - |
| | xm: Emoticon | - |
| | xu: Website URL | - |
| w: Punctuation | wkz: Chinese left brackets | - |
| | wky: Chinese right brackets | - |
| | wyz: Chinese left quotation marks | - |
| | wyy: Chinese right quotation marks | - |
| | wj: Chinese full stop | - |
| | ww: Question marks | - |
| | wt: Exclamation marks | - |
| | wd: Commas | - |
| | wf: Semicolons | - |
| | wn: Enumeration comma | - |
| | wm: Colons | - |
| | ws: Ellipsis | - |
| | wp: Dashes | - |
| | wb: Percentile and permil | - |
| | wh: Unit | - |
Table 6 POS tags and examples (Chinese text)
| POS | Description | Example |
|---|---|---|
| AD | Adverb | word-1, word-2, word-3 |
| AS | Dynamic particle | word-4, word-5, word-6 |
| BA | "ba" structure | word-7 |
| CC | Coordinating conjunction | word-8, word-9 |
| CD | Quantifier | One, two, three |
| CS | Subordinating conjunction | Although, if, when |
| DEC | Complement or nominalization | word-10, word-11 |
| DEG | Conjunctive or possessive | word-12, word-13 |
| DER | Complement de | de |
| DEV | Adverb di | di |
| DT | Determiner | word-14, word-15, word-16 |
| ETC | word-17 | word-17, word-18 |
| FW | Loanword | A E B |
| IJ | Exclamation | word-18, word-19 |
| JJ | Modifier for noun | Big, new, small |
| LB | Long bei structure | word-20, word-21, word-22 |
| LC | Positional word | middle, upper |
| M | Classifier | Unit, year, dollar |
| MSP | Particle | Particle-1, particle-2, particle-3 |
| NN | Noun | Economy, enterprise, person |
| NR | Proper noun | China, Zhejiang |
| NT | Time noun | Present, last year |
| OD | Numeral | First, second, top |
| ON | Onomatopoeia | O |
| P | Preposition | Preposition-1, preposition-2, preposition-3 |
| PN | Pronoun | He, I, myself |
| PU | Punctuation | Chinese comma, Chinese full stop |
| SB | Short bei structure | word-23, word-24 |
| SP | Particle at the end of a sentence | Particle-1, particle-2, particle-3 |
| VA | Predicative adjective | Big, many, good |
| VC | Linking verb | Verb-1, verb-2, verb-3 |
| VE | Verb indicating "has/have" | Verb-4, verb-5, verb-6 |
| VV | Verb | Verb-7, verb-8, verb-9 |
Table 7 POS tags and examples (English text)
| POS | Description | Example |
|---|---|---|
| CC | Coordinating conjunction | and, but, or |
| CD | Cardinal number | one, two |
| DT | Determiner | a, the |
| EX | Existential there | there |
| FW | Foreign word | mea, culpa |
| IN | Preposition or subordinating conjunction | of, in, by |
| JJ | Adjective | yellow |
| JJR | Comparative form of adjectives | bigger |
| JJS | Superlative form of adjectives | wildest |
| LS | List item marker | 1, 2, One |
| MD | Modal verb | can, could, might |
| NN | Noun, countable or uncountable | llama |
| NNS | Noun, in plural form | llamas |
| NNP | Proper noun, in singular form | IBM |
| NNPS | Proper noun, in plural form | Carolinas |
| PDT | Predeterminer | all, both |
| POS | Possessive ending | 's |
| PRP | Personal pronoun | I, me, you |
| PRP$ | Possessive pronoun | my, your, yours |
| RB | Adverb | quickly |
| RBR | Comparative form of adverbs | faster |
| RBS | Superlative form of adverbs | fastest |
| RP | Particle | up, off |
| SYM | Symbol (mathematical or scientific) | +, %, & |
| TO | to | to |
| UH | Interjection | ah, oops |
| VB | Basic form of verbs | eat |
| VBD | Past tense of verbs | ate |
| VBG | Gerund or present participle | eating |
| VBN | Past participle | eaten |
| VBP | Non-third person singular form of verbs | eat |
| VBZ | Third person singular form of verbs | eats |
| WDT | wh-determiner | which, that |
| WP | wh-pronoun | what, who |
| WP$ | wh-possessive pronoun | whose |
| WRB | wh-adverb | how, where |
| PU | Punctuation | , . : |
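Once a response has been decoded, the tags can be looked up or used as filters. The sketch below copies a few entries from the English tag table above into a dictionary and shows two hypothetical helpers, `describe` and `nouns`. It assumes each word is a dictionary with `content` and `pos` keys, as in the words array of the response.

```python
# A few entries copied from the English tag table; extend as needed.
ENGLISH_POS = {
    "NN": "Noun, countable or uncountable",
    "VBZ": "Third person singular form of verbs",
    "DT": "Determiner",
    "JJ": "Adjective",
    "PU": "Punctuation",
}


def describe(words):
    """Pair each word with the description of its tag, or 'unknown tag'."""
    return [(w["content"], ENGLISH_POS.get(w["pos"], "unknown tag")) for w in words]


def nouns(words):
    """Return the words whose tag starts with NN (NN, NNS, NNP, NNPS)."""
    return [w["content"] for w in words if w["pos"].startswith("NN")]


sample = [
    {"content": "today", "pos": "NN"},
    {"content": "is", "pos": "VBZ"},
    {"content": "a", "pos": "DT"},
    {"content": "good", "pos": "JJ"},
    {"content": "day", "pos": "NN"},
    {"content": ".", "pos": "PU"},
]
print(nouns(sample))  # prints ['today', 'day']
```

Filtering on the `NN` prefix works because all four noun tags in the table share it.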
Example
- Example request
POST https://nlp-ext.ap-southeast-3.myhuaweicloud.com/v1/{project_id}/nlp-fundamental/segment

Request Header:
Content-Type: application/json
X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...

Request Body:
{
    "text": "today is a good day.",
    "pos_switch": 1,
    "lang": "en",
    "criterion": "PKU"
}
- Example response
- Successful response example
{
    "words": [
        {"content": "today", "pos": "NN"},
        {"content": "is", "pos": "VBZ"},
        {"content": "a", "pos": "DT"},
        {"content": "good", "pos": "JJ"},
        {"content": "day", "pos": "NN"},
        {"content": ".", "pos": "PU"}
    ]
}
- Failed response example
{
    "error_code": "NLP.0301",
    "error_msg": "The length of text should be in the range of 1-512"
}
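The example request can be assembled with the Python standard library as sketched below. The function names are hypothetical, the token must be obtained separately, and `send` is not exercised here because it needs a valid token and network access. Reading the error body from `HTTPError` assumes that failed calls return a non-2xx status with a JSON body, which this page does not confirm.

```python
import json
import urllib.error
import urllib.request


def build_segment_request(url: str, token: str, text: str,
                          pos_switch: int = 1, lang: str = "en") -> urllib.request.Request:
    """Assemble the POST request from the example without sending it."""
    payload = json.dumps({"text": text, "pos_switch": pos_switch, "lang": lang})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),  # the text must be UTF-8 encoded
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON body."""
    try:
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # Assumption: failed calls carry error_code and error_msg as JSON.
        return json.loads(err.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the headers and body easy to inspect before any network traffic occurs.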
Status Code
For details about status codes, see Status Code.
Error Code
For details about error codes, see Error Code.