mirror of https://github.com/ArchiveBox/ArchiveBox.git, synced 2025-05-27 13:14:24 -04:00
Merge pull request #107 from f0086/import-only-new-links
Optionally import only new links
commit 678ce229c4
4 changed files with 32 additions and 6 deletions
@@ -142,6 +142,11 @@ You can run it in parallel by using the `resume` feature, or by manually splitti

Users have reported running it with 50k+ bookmarks with success (though it will take more RAM while running).

If you already imported a huge list of bookmarks and want to import only new
bookmarks, you can use the `ONLY_NEW` environment variable. This is useful if
you want to import a bookmark dump periodically and want to skip broken links
which are already in the index.
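The `ONLY_NEW` behavior described above amounts to de-duplicating the imported list against the existing index. A minimal sketch of that idea in Python (the function and field names here are illustrative, not ArchiveBox's actual internals):

```python
def filter_new_links(all_links, index_links):
    """Keep only links whose URL is not already present in the index.

    Illustrative sketch of the ONLY_NEW behavior; ArchiveBox's real
    de-duplication lives in its own link-parsing code.
    """
    indexed_urls = {link['url'] for link in index_links}
    return [link for link in all_links if link['url'] not in indexed_urls]
```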

## Configuration

You can tweak parameters via environment variables, or by editing `config.py` directly:
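The new option can be set the same way as the other environment variables in this README (a sketch, assuming the `./archive` entrypoint and the `export.html` bookmark dump referenced elsewhere in the document):

```shell
env ONLY_NEW=True ./archive export.html
```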
@@ -160,6 +165,7 @@ env CHROME_BINARY=google-chrome-stable RESOLUTION=1440,900 FETCH_PDF=False ./arc

**Archive Options:**
- maximum allowed download time per link: `TIMEOUT` values: [`60`]/`30`/`...`
- import only new links: `ONLY_NEW` values: `True`/[`False`]
- archive methods (values: [`True`]/`False`):
  - fetch page with wget: `FETCH_WGET`
  - fetch images/css/js with wget: `FETCH_WGET_REQUISITES` (`True` is highly recommended)
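Each option above maps onto a single environment variable with a bracketed default. A hedged sketch of how such boolean and integer settings are typically parsed (the `env_bool` helper is illustrative, not ArchiveBox's actual `config.py`):

```python
import os

def env_bool(name, default):
    # Read a boolean setting from the environment, falling back to the
    # bracketed default from the option list (illustrative helper).
    return os.getenv(name, str(default)).lower() in ('1', 'true', 'yes')

ONLY_NEW   = env_bool('ONLY_NEW', False)    # default [False]
FETCH_WGET = env_bool('FETCH_WGET', True)   # default [True]
TIMEOUT    = int(os.getenv('TIMEOUT', '60'))  # default [60] seconds
```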