From f184a5522f5c7b926b5cabbacea9e5345bebe61b Mon Sep 17 00:00:00 2001
From: Nick Sweeting
Date: Tue, 30 Jan 2024 02:02:02 -0800
Subject: [PATCH] more small README changes

---
 README.md | 58 +++++++++++++++++++++++++++----------------------------
 1 file changed, 29 insertions(+), 29 deletions(-)

diff --git a/README.md b/README.md
index 9dd8a1af..6d2f6c62 100644
--- a/README.md
+++ b/README.md
@@ -611,20 +611,20 @@ docker run -it -v $PWD:/data archivebox/archivebox add --depth=1 'https://exampl
 - The official ArchiveBox Browser Extension
-  Provides realtime archiving of all browsing history or selected pages only from Chrome/Chromium/Firefox browsers
+  Provides realtime archiving of browsing history or selected pages from Chrome/Chromium/Firefox browsers
-- Manual imports of URLs from RSS, JSON, CSV, TXT, SQL, HTML, Markdown
-  ArchiveBox supports injecting URLs in [any other text-based format...](https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#Import-a-list-of-URLs-from-a-text-file)
+- Manual imports of URLs from RSS, JSON, CSV, TXT, SQL, HTML, Markdown, etc. files
+  ArchiveBox supports injesting URLs in [any text-based format...](https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#Import-a-list-of-URLs-from-a-text-file)
-- Exported [browser history](https://github.com/ArchiveBox/ArchiveBox/wiki/Quickstart#2-get-your-list-of-urls-to-archive) or [browser bookmarks](https://github.com/ArchiveBox/ArchiveBox/wiki/Quickstart#2-get-your-list-of-urls-to-archive) from any browser
+- Manually exported [browser history](https://github.com/ArchiveBox/ArchiveBox/wiki/Quickstart#2-get-your-list-of-urls-to-archive) or [browser bookmarks](https://github.com/ArchiveBox/ArchiveBox/wiki/Quickstart#2-get-your-list-of-urls-to-archive) from any browser
   See instructions for: Chrome, Firefox, Safari, IE, Opera, and more...
-- Links exported from bookmarking services or social media sites (e.g. Twitter bookmarks, Reddit saved posts, etc.)
-  See instructions for: Pocket, Pinboard, Instapaper, Shaarli, Delicious, Reddit Saved, Wallabag, Unmark.it, OneTab, Firefox Sync, and more...
-
 - [MITM Proxy](https://mitmproxy.org/) archiving with [`archivebox-proxy`](https://github.com/ArchiveBox/archivebox-proxy)
   Provides [realtime archiving](https://github.com/ArchiveBox/ArchiveBox/issues/577) of all traffic from any device going through the proxy.
+- Links from bookmarking services or social media (e.g. Twitter bookmarks, Reddit saved posts, etc.)
+  See instructions for: Pocket, Pinboard, Instapaper, Shaarli, Delicious, Reddit Saved, Wallabag, Unmark.it, OneTab, Firefox Sync, and more...
+
@@ -679,7 +679,7 @@ It uses all available methods out-of-the-box, but you can disable extractors and
  • Article Text: article.html/json Article text extraction using Readability & Mercury
  • Archive.org Permalink: archive.org.txt A link to the saved site on archive.org
- • Audio & Video: media/ all audio/video files + playlists, including subtitles & metadata with youtube-dl (or yt-dlp)
+ • Audio & Video: media/ all audio/video files + playlists, including subtitles & metadata w/ yt-dlp
  • Source Code: git/ clone of any repository found on GitHub, Bitbucket, or GitLab links
  • More coming soon! See the Roadmap...
@@ -737,7 +737,7 @@ To achieve high-fidelity archives in as many situations as possible, ArchiveBox
 > Under-the-hood, ArchiveBox uses [Django](https://www.djangoproject.com/start/overview/) to power its [Web UI](https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#ui-usage) and [SQlite](https://www.sqlite.org/locrsf.html) + the filesystem to provide [fast & durable metadata storage](https://www.sqlite.org/locrsf.html) w/ [determinisitc upgrades](https://stackoverflow.com/a/39976321/2156113).
-For the actual archiving, ArchiveBox bundles industry-standard tools like [Google Chrome](https://github.com/ArchiveBox/ArchiveBox/wiki/Chromium-Install), [`wget`, `yt-dlp`, `readability`, etc.](#dependencies) internally, and its operation can be [tuned, secured, and extended](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration) as-needed for many different applications.
+ArchiveBox bundles industry-standard tools like [Google Chrome](https://github.com/ArchiveBox/ArchiveBox/wiki/Chromium-Install), [`wget`, `yt-dlp`, `readability`, etc.](#dependencies) internally, and its operation can be [tuned, secured, and extended](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration) as-needed for many different applications.
@@ -788,7 +788,7 @@ Installing directly on **Windows without Docker or WSL/WSL2/Cygwin is not offici
 ## Archive Layout
-All of ArchiveBox's state (SQLite DB, archived assets, config, logs, etc.) is stored in a single folder (`data/`).
+All of ArchiveBox's state (SQLite DB, content, config, logs, etc.) is stored in a single folder per collection.
@@ -824,11 +824,11 @@ Each snapshot subfolder ./archive/TIMESTAMP/ includes a static
 Learn More
@@ -864,9 +864,9 @@ The paths in the static exports are relative, make sure to keep them next to you
 Learn More
@@ -917,11 +917,11 @@ archivebox config --set CHROME_BINARY=chromium # ensure it's using Chromium
 Learn More
@@ -954,10 +954,10 @@ https://127.0.0.1:8000/archive/*
 Learn More
@@ -975,7 +975,7 @@ For various reasons, many large sites (Reddit, Twitter, Cloudflare, etc.) active