This article explains how "following" navigation works in XPath, details the differences between the following and following-sibling axes and where each applies, and offers practical node-location strategies and anti-scraping countermeasures drawn from engineering practice.
1. Core Concepts
XPath has no function named "follow". The behavior comes from axes, which select nodes in a given direction relative to the current (context) node in the document tree. Two axes matter most here:
following axis
Definition: Selects all nodes that come after the current node in document order, at any depth, excluding the current node's own descendants (and attribute and namespace nodes).
Syntax example: //div[@id='header']/following::p
Typical scenarios:
<div id="header">Title</div>
<p>Paragraph 1</p>
<section>
<p>Paragraph 2</p>
</section>
The above expression will match both paragraph 1 and paragraph 2 because both appear after <div id="header">.
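The semantics of the following axis can be sketched in plain Python. This is an emulation, not an XPath engine: the standard library's ElementTree does not implement the following axis, so the hypothetical helper `following` below walks the tree in document order by hand. The `<body>` wrapper is an assumption added so the snippet parses as a single document.

```python
import xml.etree.ElementTree as ET

# The snippet above, wrapped in a <body> root (an assumption) so it parses as one document.
DOC = """
<body>
  <div id="header">Title</div>
  <p>Paragraph 1</p>
  <section>
    <p>Paragraph 2</p>
  </section>
</body>
"""

def following(root, context):
    """Emulate the following:: axis for element nodes.

    Returns elements that start after `context` in document order,
    skipping the descendants of `context` itself.
    """
    own_subtree = {id(n) for n in context.iter()}  # context plus its descendants
    seen_context = False
    result = []
    for node in root.iter():                       # iter() walks in document order
        if node is context:
            seen_context = True
        elif seen_context and id(node) not in own_subtree:
            result.append(node)
    return result

root = ET.fromstring(DOC)
header = root.find(".//div[@id='header']")
paragraphs = [n.text for n in following(root, header) if n.tag == "p"]
print(paragraphs)  # ['Paragraph 1', 'Paragraph 2']
```

Both paragraphs are returned even though the second one is nested inside `<section>`, which is the cross-level behavior described above.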
following-sibling axis
Definition: Selects only the nodes that share the current node's parent and come after it in document order; it never moves up or down a level.
Syntax example: //li[@class='target']/following-sibling::li
Typical scenarios:
<ul>
<li>Project A</li>
<li class="target">Project B</li>
<li>Project C</li>
<li>Project D</li>
</ul>
This expression selects Project C and Project D. Project A is excluded because it precedes the target, and any node outside the <ul> is excluded because it is not a sibling.
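The same list can be used to sketch what following-sibling does. Again this is an emulation in standard-library Python rather than a real XPath evaluation; `following_siblings` is a hypothetical helper, and because ElementTree elements carry no parent pointer, the parent is passed in explicitly.

```python
import xml.etree.ElementTree as ET

DOC = """
<ul>
  <li>Project A</li>
  <li class="target">Project B</li>
  <li>Project C</li>
  <li>Project D</li>
</ul>
"""

def following_siblings(parent, context):
    """Emulate following-sibling:: for the element children of `parent`."""
    children = list(parent)                  # direct children, in document order
    start = children.index(context) + 1      # everything after the context node
    return children[start:]

root = ET.fromstring(DOC)                    # root is the <ul> element
target = root.find("li[@class='target']")
later = [li.text for li in following_siblings(root, target) if li.tag == "li"]
print(later)  # ['Project C', 'Project D']
```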
2. Functional Comparison and Selection Strategy
Two related axes are also worth knowing:
preceding axis: Selects all nodes that come before the current node in document order, excluding its ancestors; useful for searching backwards from a known node.
ancestor axis: Selects every ancestor of the current node, up to the root; suitable for locating the container that wraps a target element.
Selection suggestions:
If you need to find later content across levels (such as price fields scattered through a product details page), start with the following axis.
For structured data such as tables and lists (such as fields in the same row of a financial report), the following-sibling axis is more precise and usually cheaper to evaluate.
When IDs or class names are dynamically generated or obfuscated, anchor on a stable feature of a nearby element (such as a data-testid attribute) and use the following axis from there.
3. Engineering Practice and Anti-scraping Countermeasures
Scenario 1: E-commerce price monitoring
Requirement: Capture the price element after the product title (may be nested in multiple layers of <div>).
Solution:
//h2[contains(text(),'Product Name')]/following::span[@class='price'][1]
Technical points:
The following axis is not bound by nesting depth, so it reaches the price wherever it sits after the title.
The [1] predicate keeps only the nearest matching span after the title, which avoids collecting later, unrelated price elements.
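To make the effect of the [1] predicate concrete, here is a minimal standard-library sketch that emulates the expression above. The page markup, the product name and both prices are invented for illustration, and `first_following` is a hypothetical helper, not part of any library.

```python
import xml.etree.ElementTree as ET

# Hypothetical product markup: the price sits several levels away from the title.
PAGE = """
<main>
  <div class="title-wrap"><h2>Product Name X100</h2></div>
  <div class="meta">
    <div class="inner"><span class="price">$19.99</span></div>
  </div>
  <div class="related"><span class="price">$5.00</span></div>
</main>
"""

def first_following(root, context, matches):
    """Return the first element after `context` (document order) that
    satisfies `matches`, skipping descendants of `context`."""
    skip = {id(n) for n in context.iter()}
    passed = False
    for node in root.iter():
        if node is context:
            passed = True
        elif passed and id(node) not in skip and matches(node):
            return node          # stop at the nearest match, like [1]
    return None

root = ET.fromstring(PAGE)
title = next(h for h in root.iter("h2") if "Product Name" in (h.text or ""))
price = first_following(
    root, title,
    lambda n: n.tag == "span" and n.get("class") == "price",
)
print(price.text)  # $19.99
```

Without the early return, the second price under the "related" block would also be collected, which is the duplication the [1] predicate guards against.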
Scenario 2: IP protection for high-frequency data collection
Problem: Platforms such as LinkedIn trigger IP blocking for high-frequency requests.
Countermeasures:
Use a proxy IP pool (such as abcproxy's residential proxies) to rotate the source IP of requests.
Use the following-sibling axis to locate targets precisely, so that each page is parsed once and fewer requests are wasted.
Set a random interval between requests (2-10 seconds) to mimic the pace of a human user.
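The rotation and pacing steps above could be sketched as follows, using only the Python standard library. The proxy URLs are placeholders, not real endpoints; actual hosts, ports and credentials would come from the proxy provider's dashboard. The function names are illustrative assumptions.

```python
import random
import time
import urllib.request

# Placeholder proxy endpoints (assumptions); replace with values from your provider.
PROXIES = [
    "http://user:pass@proxy-1.example.com:8000",
    "http://user:pass@proxy-2.example.com:8000",
    "http://user:pass@proxy-3.example.com:8000",
]

def build_opener_for(proxy_url):
    """Return an opener that sends HTTP and HTTPS traffic through one proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)

def next_delay(low=2.0, high=10.0):
    """Random pause in seconds, matching the 2-10 second range above."""
    return random.uniform(low, high)

def fetch_all(urls, proxies=PROXIES, sleep=time.sleep):
    """Fetch each URL through a randomly chosen proxy, pausing between requests."""
    pages = {}
    for url in urls:
        opener = build_opener_for(random.choice(proxies))
        with opener.open(url, timeout=30) as resp:
            pages[url] = resp.read()
        sleep(next_delay())      # randomized gap before the next request
    return pages
```

Choosing the proxy per request keeps the sketch simple; a pool that retires blocked IPs or pins one IP per session would be a reasonable extension.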
Scenario 3: Dynamic rendering page adaptation
Challenge: Pages generated by frameworks such as React/Vue need to wait for JavaScript rendering to complete.
Solution design:
Use Selenium or Playwright to drive a headless browser and load the complete DOM.
Add an explicit wait (such as Selenium's WebDriverWait) to ensure the target element has finished loading.
Use the following axis to locate dynamically generated recommended-content blocks.
4. Performance Optimization and Common Pitfalls
Performance pitfalls:
The full-document scanning nature of the following axis may cause slow queries on large pages.
Optimization plan:
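//div[@id='content-area']//h2/following::div[contains(@class,'target')][ancestor::div[@id='content-area']]
Start from an anchor inside a known container (the h2 here is an illustrative anchor) and keep only matches that still have that container as an ancestor. Writing //following:: directly after the container does not narrow the search, because the following axis selects nodes that come after the container, not nodes inside it. Where the target is a sibling or a descendant of a known node, following-sibling or a plain descendant step (//div[@id='content-area']//div[contains(@class,'target')]) avoids the document-wide scan altogether.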
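The effect of scoping a search to a container can be shown with the standard library's ElementTree, which supports descendant steps but only a subset of XPath. The page below is invented for illustration, and because ElementTree has no contains() function the sketch matches the class attribute exactly.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: two matching divs inside the content area, two outside it.
PAGE = """
<body>
  <div id="sidebar"><div class="target">ad</div></div>
  <div id="content-area">
    <h2>Results</h2>
    <div class="target">row 1</div>
    <div class="target">row 2</div>
  </div>
  <div id="footer"><div class="target">footer link</div></div>
</body>
"""

root = ET.fromstring(PAGE)

# Document-wide search: every matching div on the page.
everywhere = root.findall(".//div[@class='target']")

# Scoped search: resolve the container once, then search only its subtree.
container = root.find(".//div[@id='content-area']")
scoped = container.findall(".//div[@class='target']")

print(len(everywhere), len(scoped))   # 4 2
print([d.text for d in scoped])       # ['row 1', 'row 2']
```

The scoped query visits only the container's subtree and returns only the rows that belong to it, which addresses both the cost and the false matches of a page-wide scan.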
Dynamic content invalidation:
If an XPath expression fails on some pages, the DOM may not be ready yet because content loads asynchronously. Add a retry mechanism or adjust the wait strategy.
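A retry mechanism of the kind mentioned above might look like the following sketch. It is library-agnostic: `retry` is a hypothetical helper, and `find_node` is a stand-in for whatever call actually queries the DOM. With Selenium, for instance, one would likely pass its NoSuchElementException as `retry_on`; that pairing is an assumption, not something the article specifies.

```python
import time

def retry(action, attempts=4, base_delay=0.5, retry_on=(LookupError,), sleep=time.sleep):
    """Call `action` until it succeeds or `attempts` is exhausted.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between tries.
    """
    for attempt in range(attempts):
        try:
            return action()
        except retry_on:
            if attempt == attempts - 1:
                raise                      # out of attempts: surface the error
            sleep(base_delay * 2 ** attempt)

# Stand-in for "query the DOM": fails twice, then finds the node.
calls = {"n": 0}

def find_node():
    calls["n"] += 1
    if calls["n"] < 3:
        raise LookupError("element not in DOM yet")
    return "<div class='target'>"

waits = []
result = retry(find_node, sleep=waits.append)  # record waits instead of sleeping
print(result, waits)  # <div class='target'> [0.5, 1.0]
```

The growing delay gives slow pages more time on each attempt without stalling fast ones.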
5. Technological Evolution and Expanded Applications
Smart positioning tools:
Modern browser developer tools can generate XPath automatically, but the generated paths often depend on fragile hierarchical structure. Rewrite them by hand as more robust, axis-based expressions.
Working with CSS selectors:
Simple scenarios: Prefer CSS selectors (such as div.target + ul for an adjacent sibling).
Complex scenarios: Switch to XPath axes for cross-level positioning (such as following::ul), which CSS selectors cannot express directly.
AI-assisted positioning:
Some testing tools (such as Testim.io) generate XPath automatically through visual recognition, but the resulting expressions are hard to read and need manual verification and optimization.
As a professional proxy IP service provider, abcproxy offers a range of high-quality proxy IP products, including residential proxies, datacenter proxies, static ISP proxies, Socks5 proxies, and unlimited residential proxies, suited to many application scenarios. If you are looking for a reliable proxy IP service, visit the abcproxy official website for more details.