Processing Complex JSON Data
This section describes how to use LTS's data processing feature to process complex JSON data.
Processing Complex JSON Data with Multiple Sub-keys as Arrays
Logs generated by programs are written in JSON format and carry statistical information. Typically, such a log contains basic information plus multiple sub-keys whose values are arrays. For example, a server writes a log every minute that records its own current status together with the status of the related server and client nodes.
- Example log
{ "content":{ "service": "search_service", "overall_status": "yellow", "servers": [ { "host": "192.0.2.1", "status": "green" }, { "host": "192.0.2.2", "status": "green" } ], "clients": [ { "host": "192.0.2.3", "status": "green" }, { "host": "192.0.2.4", "status": "red" } ] } } - Processing requirements
- Split the original log into three topics: overall_type, client_status, and server_status.
- Save different information for different topics.
- overall_type: retains the number of servers and clients, overall_status color, and service information.
- client_status: retains the host address, status, and service information.
- server_status: retains the host address, status, and service information.
- Solution: The following steps describe the processing syntax. The statements in steps 1 to 7 must be used together.
1. Assign the three topic values to the log and split it by the topic field. After the split, three logs are generated that are identical except for the value of topic.
e_set("topic", "server_status,client_status,overall_type")
e_split("topic")
The log format after processing is as follows:
topic: server_status // The other two logs have topic set to client_status and overall_type.
content: { ...Same as above... }
2. Expand the JSON in the content field at the first layer and delete the content field.
e_json('content', depth=1)
e_drop_fields("content")
The log format after processing is as follows:
topic: overall_type // The other two logs have topic set to client_status and server_status; their other fields are the same.
clients: [{"host": "192.0.2.3", "status": "green"}, {"host": "192.0.2.4", "status": "red"}]
overall_status: yellow
servers: [{"host": "192.0.2.1", "status": "green"}, {"host": "192.0.2.2", "status": "green"}]
service: search_service
3. For logs whose topic is overall_type, calculate client_count and server_count.
e_if(e_search("topic==overall_type"), e_compose( e_set("client_count", json_select(v("clients"), "length([*])", default=0)), e_set("server_count", json_select(v("servers"), "length([*])", default=0)) ))The log after processing is as follows:
topic: overall_type
server_count: 2
client_count: 2
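For reference, the counting performed by json_select with the JMESPath expression "length([*])" can be illustrated in plain Python. This is only a minimal sketch of the same logic (standard json module, not the LTS DSL), assuming the expanded clients/servers fields carry their arrays as JSON strings:
import json

# Assumed example values: the servers/clients fields expanded in step 2,
# carried as JSON strings.
servers = '[{"host": "192.0.2.1", "status": "green"}, {"host": "192.0.2.2", "status": "green"}]'
clients = '[{"host": "192.0.2.3", "status": "green"}, {"host": "192.0.2.4", "status": "red"}]'

# "length([*])" counts the array elements; in plain Python this is
# json.loads() followed by len(), with 0 as the fallback for an empty value.
server_count = len(json.loads(servers)) if servers else 0
client_count = len(json.loads(clients)) if clients else 0
print(server_count, client_count)  # 2 2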
4. Discard the clients and servers fields, which are no longer needed in the overall_type logs.
e_if(e_search("topic==overall_type"), e_drop_fields("clients", "servers")) - Further split the logs whose topic is server_status.
e_if(e_search("topic==server_status"), e_split("servers")) e_if(e_search("topic==server_status"), e_json("servers", depth=1))The first log after processing is as follows:
topic: server_status
servers: {"host": "192.0.2.1", "status": "green"}
host: 192.0.2.1
status: green
The second log after processing is as follows:
topic: server_status
servers: {"host": "192.0.2.2", "status": "green"}
host: 192.0.2.2
status: green
6. Retain only the relevant fields by dropping the servers and clients fields from the server_status logs.
e_if(e_search("topic==server_status"), e_compose(e_drop_fields("servers"),e_drop_fields("clients"))) - Further split the logs whose topic is client_status and delete unnecessary fields.
e_if(e_search("topic==client_status"), e_split("clients")) e_if(e_search("topic==client_status"), e_json("clients", depth=1))The first log after processing is as follows:
topic: client_status
host: 192.0.2.3
status: green
The second log after processing is as follows:
topic: client_status
host: 192.0.2.4
status: red
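The array splitting used in steps 5 and 7 (e_split on an array field followed by e_json with depth=1) can be hard to visualize. The following plain-Python sketch approximates the same effect; it is an illustration only, not the LTS DSL, and the helper name split_on_array_field is made up for this example:
import copy
import json

def split_on_array_field(log, field):
    # Approximate e_split(field) followed by e_json(field, depth=1):
    # one log per array element, with the element's keys promoted to top level.
    value = log[field]
    elements = json.loads(value) if isinstance(value, str) else value
    result = []
    for element in elements:
        new_log = copy.deepcopy(log)
        new_log[field] = element   # e_split keeps one element per log
        new_log.update(element)    # e_json(depth=1) expands its keys
        result.append(new_log)
    return result

log = {
    "topic": "server_status",
    "servers": [{"host": "192.0.2.1", "status": "green"},
                {"host": "192.0.2.2", "status": "green"}],
}
for item in split_on_array_field(log, "servers"):
    print(item["topic"], item["host"], item["status"])
# server_status 192.0.2.1 green
# server_status 192.0.2.2 green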
- Combine the preceding syntax as follows:
# Overall splitting
e_set("topic", "server_status,client_status,overall_type")
e_split("topic")
e_json('content', depth=1)
e_drop_fields("content")
# Process overall_type logs.
e_if(e_search("topic==overall_type"),
     e_compose(
         e_set("client_count", json_select(v("clients"), "length([*])", default=0)),
         e_set("server_count", json_select(v("servers"), "length([*])", default=0))
     ))
e_if(e_search("topic==overall_type"), e_drop_fields("clients", "servers"))
# Process server_status logs.
e_if(e_search("topic==server_status"), e_split("servers"))
e_if(e_search("topic==server_status"), e_json("servers", depth=1))
e_if(e_search("topic==server_status"), e_compose(e_drop_fields("servers"), e_drop_fields("clients")))
# Process client_status logs.
e_if(e_search("topic==client_status"), e_split("clients"))
e_if(e_search("topic==client_status"), e_json("clients", depth=1))
e_if(e_search("topic==client_status"), e_compose(e_drop_fields("servers"), e_drop_fields("clients")))
Processing result: the pipeline produces the five logs shown in the preceding steps, that is, one overall_type log with the counts, two server_status logs, and two client_status logs.
topic: overall_type
server_count: 2
client_count: 2

topic: server_status
host: 192.0.2.1
status: green

topic: server_status
host: 192.0.2.2
status: green

topic: client_status
host: 192.0.2.3
status: green

topic: client_status
host: 192.0.2.4
status: red
Processing Complex JSON Data with Multi-Layer Array Object Nesting
The following takes a complex JSON object with multiple layers of array nesting as an example. Each login record in the login_histories array of every object under users needs to be split into a separate login event.
- Raw log
{ "content":{ "users": [ { "name": "user1", "login_histories": [ { "date": "2019-10-10 0:0:0", "login_ip": "192.0.2.6" }, { "date": "2019-10-10 1:0:0", "login_ip": "192.0.2.6" }, { ...More login information... } ] }, { "name": "user2", "login_histories": [ { "date": "2019-10-11 0:0:0", "login_ip": "192.0.2.7" }, { "date": "2019-10-11 1:0:0", "login_ip": "192.0.2.9" }, { ...More login information... } ] }, { ...More users... } ] } } - Expected split log
name: user1
date: 2019-10-10 1:0:0
login_ip: 192.0.2.6

name: user1
date: 2019-10-10 0:0:0
login_ip: 192.0.2.6

name: user2
date: 2019-10-11 0:0:0
login_ip: 192.0.2.7

name: user2
date: 2019-10-11 1:0:0
login_ip: 192.0.2.9

...More logs...
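Conceptually, the expected result is a flattening of the nested structure: one event per (user, login record) pair. The following plain-Python sketch describes that target shape only (standard library, not the LTS DSL), using an abbreviated copy of the raw log above:
import json

# Abbreviated raw log; values mirror the example above.
raw = {
    "content": {
        "users": [
            {"name": "user1", "login_histories": [
                {"date": "2019-10-10 0:0:0", "login_ip": "192.0.2.6"},
                {"date": "2019-10-10 1:0:0", "login_ip": "192.0.2.6"}]},
            {"name": "user2", "login_histories": [
                {"date": "2019-10-11 0:0:0", "login_ip": "192.0.2.7"},
                {"date": "2019-10-11 1:0:0", "login_ip": "192.0.2.9"}]},
        ]
    }
}

# One output event per (user, login record) pair.
events = [
    {"name": user["name"], "date": login["date"], "login_ip": login["login_ip"]}
    for user in raw["content"]["users"]
    for login in user["login_histories"]
]
print(json.dumps(events, indent=2))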
- Solution
- Split and expand users in content.
e_split("content", jmes='users[*]', output='item') e_json("item",depth=1)Logs returned after processing:content:{...Same as above...} item: {"name": "user1", "login_histories": [{"date": "2019-10-10 0:0:0", "login_ip": "192.0.2.6"}, {"date": "2019-10-10 1:0:0", "login_ip": "192.0.2.6"}]} login_histories: [{"date": "2019-10-10 0:0:0", "login_ip": "192.0.2.6"}, {"date": "2019-10-10 1:0:0", "login_ip": "192.0.2.6"}] name: user1 content:{...Same as above...} item: {"name": "user2", "login_histories": [{"date": "2019-10-11 0:0:0", "login_ip": "192.0.2.7"}, {"date": "2019-10-11 1:0:0", "login_ip": "192.0.2.9"}]} login_histories: [{"date": "2019-10-11 0:0:0", "login_ip": "192.0.2.7"}, {"date": "2019-10-11 1:0:0", "login_ip": "192.0.2.9"}] name: user2 - Split and then expand login_histories.
e_split("login_histories") e_json("login_histories", depth=1)Logs returned after processing:
content: {...Same as above...}
date: 2019-10-11 0:0:0
item: {"name": "user2", "login_histories": [{"date": "2019-10-11 0:0:0", "login_ip": "192.0.2.7"}, {"date": "2019-10-11 1:0:0", "login_ip": "192.0.2.9"}]}
login_histories: {"date": "2019-10-11 0:0:0", "login_ip": "192.0.2.7"}
login_ip: 192.0.2.7
name: user2

content: {...Same as above...}
date: 2019-10-11 1:0:0
item: {"name": "user2", "login_histories": [{"date": "2019-10-11 0:0:0", "login_ip": "192.0.2.7"}, {"date": "2019-10-11 1:0:0", "login_ip": "192.0.2.9"}]}
login_histories: {"date": "2019-10-11 1:0:0", "login_ip": "192.0.2.9"}
login_ip: 192.0.2.9
name: user2

content: {...Same as above...}
date: 2019-10-10 1:0:0
item: {"name": "user1", "login_histories": [{"date": "2019-10-10 0:0:0", "login_ip": "192.0.2.6"}, {"date": "2019-10-10 1:0:0", "login_ip": "192.0.2.6"}]}
login_histories: {"date": "2019-10-10 1:0:0", "login_ip": "192.0.2.6"}
login_ip: 192.0.2.6
name: user1

content: {...Same as above...}
date: 2019-10-10 0:0:0
item: {"name": "user1", "login_histories": [{"date": "2019-10-10 0:0:0", "login_ip": "192.0.2.6"}, {"date": "2019-10-10 1:0:0", "login_ip": "192.0.2.6"}]}
login_histories: {"date": "2019-10-10 0:0:0", "login_ip": "192.0.2.6"}
login_ip: 192.0.2.6
name: user1
- Delete irrelevant fields.
e_drop_fields("content", "item", "login_histories")Logs returned after processing:
{ "date": "2019-10-10 0:0:0", "name": "user1", "login_ip": "192.0.2.6" } { "date": "2019-10-10 1:0:0", "name": "user1", "login_ip": "192.0.2.6" } { "date": "2019-10-11 0:0:0", "name": "user2", "login_ip": "192.0.2.7" } { "date": "2019-10-11 1:0:0", "name": "user2", "login_ip": "192.0.2.9" } - The DSL rules are as follows:
e_split("content", jmes='users[*]', output='item') e_json("item",depth=1) e_split("login_histories") e_json("login_histories", depth=1) e_drop_fields("content", "item", "login_histories")Summary: For the preceding requirements, split logs, expand logs, and delete irrelevant information.