ArchiveBox/archivebox/search
Ross Williams 310b4d1242 Add htmltotext extractor
Saves HTML text nodes and selected element attributes in
`htmltotext.txt` for each Snapshot. Primarily intended to be used
for search indexing.
2023-10-23 21:42:32 -04:00
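
The commit message above describes saving HTML text nodes and selected element attributes for search indexing. A minimal sketch of that idea, using only the standard-library `html.parser`, might look like the following. It is not the extractor's actual code: the names `TextCollector`, `html_to_text`, `INDEXED_ATTRS`, and `SKIPPED_TAGS`, as well as the choice of which attributes to keep, are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Hypothetical selection of attributes worth indexing, keyed by tag name.
# The real extractor's list may differ.
INDEXED_ATTRS = {"a": ["title"], "img": ["alt", "title"]}

# Tags whose contents are not human-readable text (assumption).
SKIPPED_TAGS = {"script", "style"}


class TextCollector(HTMLParser):
    """Collects visible text nodes and a few selected attribute values."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside a skipped tag

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
            return
        if self._skip_depth:
            return
        wanted = INDEXED_ATTRS.get(tag, [])
        for name, value in attrs:
            # Attribute values can be None (e.g. <input disabled>).
            if name in wanted and value:
                self.chunks.append(value.strip())

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return
        text = data.strip()
        if text:
            self.chunks.append(text)


def html_to_text(html):
    """Return one text chunk per line, in document order."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

The returned string could then be written to a per-Snapshot text file and handed to whichever search backend is configured; how that wiring is done in this package is not shown in the listing.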
backends      bail out on sonic indexing after 5 errors    2021-04-10 05:18:03 -04:00
__init__.py   refactor: Remove setup_django from search    2020-12-11 16:43:48 -05:00
utils.py      Add htmltotext extractor                     2023-10-23 21:42:32 -04:00