What Is APM
O&M Challenges
In the cloud era, applications built on microservice architectures are increasingly diverse, which leads to more application exceptions. Application O&M faces the following challenges:
- Distributed applications have complex relationships. As a result, it is hard to keep applications running normally and to quickly locate faults and performance bottlenecks.
- Users choose to leave due to poor experience. If O&M personnel cannot detect and trace services with poor experience in real time, or diagnose application exceptions in a timely manner, user experience will be greatly affected.
- There are a large number of widely distributed applications in the service system. Calls across systems, regions, and applications are frequent. Enterprises urgently need to reduce application management and O&M costs and improve O&M efficiency.
Introduction to APM
Huawei Cloud Application Performance Management (APM) helps O&M personnel quickly identify application performance bottlenecks and locate root causes of faults, ensuring user experience.
You only need to install Agents for your applications, and APM then monitors them comprehensively. APM can quickly locate error APIs and slow APIs, reproduce call parameters, and detect system bottlenecks, which facilitates online diagnosis. Currently, APM supports Java applications. The following table lists the application monitoring capabilities of APM.
Capability | Description |
---|---|
Non-intrusive collection of application performance data | You do not need to modify application code. Instead, you only need to deploy an APM Agent package and modify application startup parameters to monitor applications. |
Application metric monitoring | APM automatically monitors application metrics, such as JVM, JavaMethod, URL, Exception, Tomcat, HttpClient, MySQL, Redis, and Kafka. |
Application topology | APM automatically generates call relationships between distributed applications based on dynamic analysis and intelligent computing of remote procedure call (RPC) information. |
Tracing | After multiple applications are connected to APM, APM automatically samples requests, and collects the call relationships between services and the health status of intermediate calls for automatic tracing. |
Metric drill-down analysis | APM enables you to drill down and analyze metrics such as application response time, number of requests, and error rate, and view metrics by application, component, environment, database, middleware, or other dimensions. |
Error or slow URL tracing | APM identifies error or slow URLs based on URL tracing, and automatically associates them with corresponding APIs, such as SQL and MQ APIs. |
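To make the drill-down idea concrete, the following is a minimal sketch of how request records could be grouped by one dimension to obtain the number of requests, average response time, and error rate. It does not show APM's internal implementation: the `RequestRecord` type, its field names, the `drill_down` function, and the 400 ms slow threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """One sampled request. All field names are hypothetical."""
    application: str
    url: str
    duration_ms: float
    is_error: bool


def drill_down(records, dimension, slow_threshold_ms=500.0):
    """Group records by one dimension and compute per-group metrics."""
    groups = defaultdict(list)
    for record in records:
        groups[getattr(record, dimension)].append(record)

    summary = {}
    for key, group in groups.items():
        count = len(group)
        avg_ms = sum(r.duration_ms for r in group) / count
        error_rate = sum(1 for r in group if r.is_error) / count
        summary[key] = {
            "requests": count,
            "avg_response_ms": avg_ms,
            "error_rate": error_rate,
            # Assumed rule: a group is "slow" if its average exceeds the threshold.
            "slow": avg_ms > slow_threshold_ms,
        }
    return summary


# Illustrative data only, not real measurements.
records = [
    RequestRecord("order-service", "/orders", 120.0, False),
    RequestRecord("order-service", "/orders", 880.0, True),
    RequestRecord("order-service", "/health", 5.0, False),
    RequestRecord("user-service", "/users", 40.0, False),
]

by_url = drill_down(records, "url", slow_threshold_ms=400.0)
by_app = drill_down(records, "application", slow_threshold_ms=400.0)
```

Calling `drill_down` with a different `dimension` (here `"url"` or `"application"`) reuses the same records to produce another view, which is the essence of viewing metrics by application, component, or other dimensions.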
- Access to APM: Applications need to implement AK/SK authentication to connect to APM.
- O&M data collection: APM can collect data about applications, basic resources, and user experience from Agents in non-intrusive mode.
- Service implementation: APM supports application metric monitoring, application topology, tracing, and intelligent alarm reporting.
- Service expansion:
  - You can quickly diagnose application performance exceptions based on the application topology and tracing of APM, and make judgments based on the application O&M metrics of Application Operations Management (AOM).
  - After identifying performance bottlenecks, you can use CodeArts PerfTest to implement association analysis and generate performance reports.
  - Based on the historical metric data learned using intelligent algorithms, APM associates metrics for analysis from multiple dimensions, extracts the context data of both normal and abnormal services for comparison, and locates root causes through cluster analysis.
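The AK/SK authentication mentioned in the list above follows a general pattern: the access key (AK) identifies the caller, and the secret key (SK) signs the request without being sent over the network. The sketch below illustrates that general pattern with HMAC-SHA256. It is not Huawei Cloud's actual signing scheme: the header names, the string-to-sign layout, and the `sign_request` and `verify_request` functions are assumptions for illustration only.

```python
import hashlib
import hmac


def sign_request(access_key, secret_key, method, path, timestamp):
    """Build hypothetical authentication headers for one request."""
    # Assumed string-to-sign layout: method, path, and timestamp on separate lines.
    string_to_sign = "\n".join([method.upper(), path, timestamp])
    signature = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    return {
        "X-Access-Key": access_key,   # hypothetical header name
        "X-Timestamp": timestamp,     # hypothetical header name
        "X-Signature": signature,     # hypothetical header name
    }


def verify_request(headers, secret_key, method, path):
    """Recompute the signature on the server side and compare in constant time."""
    expected = sign_request(
        headers["X-Access-Key"], secret_key, method, path, headers["X-Timestamp"]
    )
    return hmac.compare_digest(expected["X-Signature"], headers["X-Signature"])


# Placeholder credentials, for illustration only.
headers = sign_request("example-ak", "example-sk", "post", "/report", "20240101T000000Z")
accepted = verify_request(headers, "example-sk", "POST", "/report")
rejected = verify_request(headers, "example-sk", "POST", "/other")
```

Because only the signature travels with the request, a receiver that knows the SK can detect both a wrong key and a tampered path, as the `rejected` case shows.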
Advantages
- Connects to applications without having to modify code, and collects data in a non-intrusive mode.
  - APM Agents collect service call, service inventory, and call KPI data.
- Delivers high throughput (hundreds of millions of API calls), ensuring premium experience.
- Provides open APIs to query O&M data, offers collection standards, and supports independent development.
- Reports alarms using Artificial Intelligence (AI) threshold detection and machine learning based on historical baseline data, and supports root cause analysis.
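As a rough illustration of threshold detection against a historical baseline, the sketch below derives an alarm threshold from past response times using the mean plus a multiple of the standard deviation. This is a simple statistical stand-in, not the AI or machine learning method that APM uses; the function names, the multiplier `k`, and the sample data are assumptions.

```python
from statistics import mean, stdev


def baseline_threshold(history, k=3.0):
    """Return mean + k * sample standard deviation of the historical values."""
    return mean(history) + k * stdev(history)


def is_anomalous(value, history, k=3.0):
    """Flag a new value that exceeds the threshold derived from the baseline."""
    return value > baseline_threshold(history, k)


# Illustrative response times in milliseconds, not real measurements.
history_ms = [100.0, 102.0, 98.0, 101.0, 99.0]

threshold = baseline_threshold(history_ms)
spike_flagged = is_anomalous(180.0, history_ms)
normal_flagged = is_anomalous(103.0, history_ms)
```

A larger `k` makes the rule less sensitive, trading fewer false alarms for slower detection. A production system would typically also account for trends and periodic patterns in the baseline.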