Use XPath selectors to find elements and nodes
Problem
You want to find or manipulate elements using XPath selectors.
Solution
Use the Element.selectXpath(String xpath) and Element.selectXpath(String, Class<T>) methods:
Document doc = Jsoup.connect("https://jsoup.org/").get();
Elements elements = doc.selectXpath("//div[@class='col1']/p");
// Each P element in div.col1
List<TextNode> textNodes = doc.selectXpath("//a/text()", TextNode.class);
// Each TextNode in every A element
Description
jsoup supports XPath selectors using the Element.selectXpath(String xpath) method. By default, XPath 1.0 expressions are supported. You can also provide an alternate XPathFactory implementation for other versions.
The Element.selectXpath(String, Class<T>) method enables selecting for specific node types, such as TextNode, DataNode, LeafNode, etc.
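As a minimal sketch of the two overloads, the following parses a small inline HTML string instead of fetching a URL, then selects elements and text nodes. The class name, method names, and HTML are made up for illustration, and the sketch assumes a jsoup version that provides selectXpath is on the classpath.

```java
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.TextNode;
import org.jsoup.select.Elements;

// Hypothetical demo class; the HTML below is illustrative, not taken from jsoup.org.
public class XpathNodeTypesSketch {
    static final String HTML =
        "<div class='col1'>"
      + "<p>First <a href='/a'>one</a></p>"
      + "<p>Second <a href='/b'>two</a></p>"
      + "</div>";

    // Element selection: each P element that is a child of div.col1.
    static Elements paragraphs(Document doc) {
        return doc.selectXpath("//div[@class='col1']/p");
    }

    // Typed selection: each TextNode directly inside an A element.
    static List<TextNode> linkTexts(Document doc) {
        return doc.selectXpath("//a/text()", TextNode.class);
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        System.out.println("Paragraphs: " + paragraphs(doc).size());
        for (TextNode text : linkTexts(doc)) {
            System.out.println("Link text: " + text.text());
        }
    }
}
```

The untyped overload returns Elements, so an expression such as //a/text() that matches text nodes needs the typed overload with TextNode.class.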
You can experiment with different XPath selectors on Try jsoup.
This XPath cheatsheet has helpful comparisons between CSS and XPath selectors.
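One difference worth keeping in mind when translating between the two is class matching. The sketch below, using a hypothetical class name and made-up HTML, compares a CSS class selector with two XPath expressions. It assumes standard XPath 1.0 semantics, where @class='col1' compares the whole attribute value rather than a single class token.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Hypothetical demo class; the HTML is illustrative only.
public class CssVsXpathSketch {
    static final String HTML =
        "<div class='col1'><p>A</p></div>"
      + "<div class='col1 wide'><p>B</p></div>";

    // CSS: .col1 matches any element whose class list contains the token col1.
    static int cssCount(Document doc) {
        return doc.select("div.col1 > p").size();
    }

    // XPath: @class='col1' matches only when the whole attribute value is exactly "col1".
    static int xpathExactCount(Document doc) {
        return doc.selectXpath("//div[@class='col1']/p").size();
    }

    // XPath 1.0 idiom for token matching, closer to the CSS class selector.
    static int xpathTokenCount(Document doc) {
        return doc.selectXpath(
            "//div[contains(concat(' ', normalize-space(@class), ' '), ' col1 ')]/p").size();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        System.out.println("CSS div.col1 > p:      " + cssCount(doc));
        System.out.println("XPath @class='col1':   " + xpathExactCount(doc));
        System.out.println("XPath class token:     " + xpathTokenCount(doc));
    }
}
```

With this input, the CSS selector and the token-matching XPath expression select both paragraphs, while the exact-match XPath expression selects only the first.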