Dart parse HTML
last modified January 28, 2024
In this article we show how to parse HTML documents in Dart. We use the html package.
$ dart pub add html
We need to add the html library to the project.
Dart parse local HTML file
In the first example, we parse an HTML file located on the disk.
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>My title</title>
</head>
<body>
    <h1>My Document</h1>
    <p>
        A simple document.
    </p>
</body>
</html>
This is the file to be parsed.
import 'dart:io';
import 'package:html/parser.dart';

void main() {
  var file = new File('index.html');
  var data = file.readAsStringSync();

  var doc = parse(data);
  print(doc.head!.innerHtml);
}

We load the file and parse it. We print the inner HTML content of the head tag.
import 'dart:io';
import 'package:html/parser.dart';
We import the IO library and the HTML parser.
var file = new File('index.html');
var data = file.readAsStringSync();
We read the contents of the file.
var doc = parse(data);
We parse the data into a document. The document can then be processed with its member functions and properties.
print(doc.head!.innerHtml);
We get and print the contents of the head tag.

$ dart main.dart
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>My title</title>
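Besides head, the parsed document exposes other parts of the tree, such as body and the child elements of a node. The following is a minimal sketch rather than part of the original example: it parses an inline string instead of index.html, and the helper name bodyChildTags is made up for illustration.

```dart
import 'package:html/parser.dart';

// Hypothetical helper: returns the tag names of the direct
// children of the body element.
List<String> bodyChildTags(String html) {
  final doc = parse(html);
  final body = doc.body;

  if (body == null) {
    return [];
  }

  // children contains only element nodes; localName is the tag name.
  return [for (final e in body.children) e.localName ?? ''];
}

void main() {
  // Inline markup standing in for index.html.
  const data = '''
<html>
<body>
<h1>My Document</h1>
<p>A simple document.</p>
</body>
</html>
''';

  print(bodyChildTags(data));
}
```

The sketch checks body for null instead of using the ! operator, so it does not throw when the property is missing.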
Dart parse HTML from web page
In the second example, we parse an HTML document from an external web page.
import 'package:http/http.dart' as http;
import 'package:html/parser.dart';

Future<String> fetchData() async {
  final resp = await http.get(Uri.http('webcode.me'));

  if (resp.statusCode == 200) {
    return resp.body;
  } else {
    throw Exception('Failed to fetch data');
  }
}

void main() async {
  var data = await fetchData();

  var doc = parse(data);
  print(doc.querySelector('title')!.text);
}

We make a GET request to webcode.me and retrieve its home page. We parse the page and get its title.
print(doc.querySelector('title')!.text);
We retrieve the text content of the title tag with querySelector.

$ dart main.dart
My html page
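The ! operator in the example throws if the page has no title element, because querySelector returns null when nothing matches. A more defensive variant might look like the sketch below. The helper name titleOf and the fallback text are assumptions for illustration, and the sketch works on inline strings rather than a fetched page.

```dart
import 'package:html/parser.dart';

// Hypothetical helper: returns the title text, or a fallback
// string when the document has no title element.
String titleOf(String html) {
  final doc = parse(html);

  // ?. skips the text lookup when querySelector returns null;
  // ?? then supplies the fallback value.
  return doc.querySelector('title')?.text ?? '(no title)';
}

void main() {
  print(titleOf('<html><head><title>My html page</title></head></html>'));
  print(titleOf('<html><body><p>No title here</p></body></html>'));
}
```

Whether to fall back to a default or to fail loudly depends on the program; the original example fails on a missing title, which can be preferable in short scripts.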
Dart html querySelectorAll
The querySelectorAll member function finds all descendant elements of this document that match the specified group of selectors.
import 'package:http/http.dart' as http;
import 'package:html/parser.dart';

Future<String> fetchData() async {
  final resp = await http.get(Uri.parse('http://webcode.me/os.html'));

  if (resp.statusCode == 200) {
    return resp.body;
  } else {
    throw Exception('Failed to fetch data');
  }
}

void main() async {
  var data = await fetchData();

  var doc = parse(data);
  var lis = doc.querySelectorAll('li');

  lis.forEach((e) => {print(e.text)});
}

The program finds all li tags in the external HTML document and prints their text contents.
var lis = doc.querySelectorAll('li');
With querySelectorAll, we get all the li tags.
lis.forEach((e) => {print(e.text)});
We go through the list and print each element's text content.

$ dart main.dart
Solaris
FreeBSD
Debian
NetBSD
Windows
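The elements returned by querySelectorAll carry attributes as well as text. As a rough sketch, the following collects the href attribute of each anchor in a document. The helper name linkTargets and the sample markup are invented for illustration and do not reflect the contents of the os.html page.

```dart
import 'package:html/parser.dart';

// Hypothetical helper: collects the href attribute of every
// anchor element that has one.
List<String> linkTargets(String html) {
  final doc = parse(html);
  final targets = <String>[];

  for (final a in doc.querySelectorAll('a')) {
    // The attributes map returns null for a missing attribute.
    final href = a.attributes['href'];

    if (href != null) {
      targets.add(href);
    }
  }

  return targets;
}

void main() {
  // Illustrative markup; the last anchor has no href.
  const data = '''
<ul>
  <li><a href="/solaris">Solaris</a></li>
  <li><a href="/freebsd">FreeBSD</a></li>
  <li><a>No link</a></li>
</ul>
''';

  for (final t in linkTargets(data)) {
    print(t);
  }
}
```

A plain for-in loop is used here instead of forEach; both visit every element in the list.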
Source
Dart html library documentation
In this article we have covered HTML parsing in Dart.