Known Bot Detection
Known bot detection is the first step. It compares the user agent (UA) keywords carried in each request with the UA signature database in bot protection. If a request comes from a known bot (known client), it is handled according to the configured protective action.
Based on open-source UA signature intelligence from the Internet and the WAF anti-crawler UA signature library, WAF can detect the following 10 types of known bots.
| Type | Description |
|---|---|
| Search engine bots | Search engines use web crawlers to aggregate and index online content (such as web pages, images, and other file types) so that they can provide search results in real time. |
| Online scanners | An online scanner typically scans assets on the Internet for viruses or for vulnerabilities caused by configuration errors or programming defects, and exploits these weak points to launch attacks. Typical scanners include Nmap, sqlmap, and WPSec. |
| Web crawlers | Popular crawler tools or services on the Internet. They are often used to capture web pages and extract content to meet user requirements. Typical examples are Scrapy, Pyspider, and Prerender. |
| Website development and monitoring bots | Some companies use bots to provide services that help web developers monitor the status of their sites. These bots can check the availability of links and domain names, connection and page loading times for requests from different geographical locations, DNS resolution issues, and more. |
| Business analysis and marketing bots | Companies offering business analysis and marketing services use bots to evaluate website content, conduct audience and competitor analysis, support online advertising and marketing campaigns, and optimize website or web page rankings in search engine results. |
| News and social media bots | News and social media platforms allow users to browse trending news, share ideas, and interact with each other online. Many enterprises' marketing strategies include operating pages on these platforms and interacting with consumers about products or services. Some companies use bots to collect data from these platforms for insights into media trends and products, enriching the online experience. |
| Screenshot bots | Some companies use bots to provide website screenshot services. These bots can take complete, full-length screenshots of online content such as website and social network posts, news, and forum and blog posts. |
| Academic and research bots | Some universities and companies use bots to collect data from various websites for academic or research purposes, including reference search, semantic analysis, and specific types of search engines. |
| RSS feed readers | RSS uses the standard XML web feed format to publish content. Some Internet services use bots to aggregate information from RSS feeds. |
| Online archivers | Some organizations, such as Wikipedia, use bots to periodically crawl and archive copies of valuable online information and content. These web archiving services are very similar to search engines, but the data they provide is not up to date. They are mainly used for research. |
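
The keyword comparison described at the top of this section can be illustrated with a minimal Python sketch. It is not the WAF implementation: the signature dictionary, the action names (`allow`, `block`, `log_only`, `continue`), and the function names are all hypothetical, and the keywords are only a small illustrative sample covering three of the bot types.

```python
from typing import Optional

# Hypothetical signature database: bot type -> UA keywords.
# Illustrative sample only, not the actual WAF signature library.
UA_SIGNATURES = {
    "Search engine bots": ["googlebot", "bingbot"],
    "Online scanners": ["nmap", "sqlmap"],
    "Web crawlers": ["scrapy", "pyspider", "prerender"],
}

# Hypothetical protective action configured for each bot type.
PROTECTIVE_ACTIONS = {
    "Search engine bots": "allow",
    "Online scanners": "block",
    "Web crawlers": "log_only",
}


def classify_user_agent(user_agent: str) -> Optional[str]:
    """Return the known-bot type whose keyword appears in the UA, if any."""
    ua = user_agent.lower()  # case-insensitive keyword comparison
    for bot_type, keywords in UA_SIGNATURES.items():
        if any(keyword in ua for keyword in keywords):
            return bot_type
    return None


def handle_request(user_agent: str) -> str:
    """Map a request's UA to the configured action for its bot type."""
    bot_type = classify_user_agent(user_agent)
    if bot_type is None:
        # Not a known bot: assumed to pass on to later detection steps.
        return "continue"
    return PROTECTIVE_ACTIONS[bot_type]


if __name__ == "__main__":
    print(handle_request("sqlmap/1.7.2#stable (https://sqlmap.org)"))
```

Substring matching on a lowercased UA string keeps the sketch short. A real signature library would likely use more precise patterns, because a UA header is client-supplied and easy to spoof.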