mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2025-05-23 11:17:02 -04:00
Update README.md
parent
07db61bf4c
commit
21b28d392c
1 changed files with 10 additions and 10 deletions
20
README.md
@@ -37,20 +37,20 @@ google-chrome --version && which wget && which python3 && echo "[√] All depend
 
 **2. Run the archive script:**
 
-1. Download your export file e.g. `ril_export.html` from https://getpocket.com/export
-2. Clone the repo `git clone https://github.com/pirate/pocket-archive-stream`
+1. Get your HTML export file from [Pocket](https://getpocket.com/export), [Pinboard](https://pinboard.in/export/), [Chrome Bookmarks](https://support.google.com/chrome/answer/96816?hl=en), [Firefox Bookmarks](https://support.mozilla.org/en-US/kb/export-firefox-bookmarks-to-backup-or-transfer), or [Safari Bookmarks](http://i.imgur.com/AtcvUZA.png)
+2. Clone this repo `git clone https://github.com/pirate/pocket-archive-stream`
 3. `cd pocket-archive-stream/`
-4. `./archive.py ~/Downloads/ril_export.html [pocket|pinboard|bookmarks]`
+4. `./archive.py ~/Downloads/exported_file.html [pocket|pinboard|chrome]`
 
-It produces a folder `pocket/` containing an `index.html`, and archived copies of all the sites,
-organized by timestamp. For each sites it saves:
+It produces a folder `archive/` containing an `index.html`, and archived copies of all the sites,
+organized by starred timestamp. For each sites it saves:
 
 - wget of site, e.g. `en.wikipedia.org/wiki/Example.html` with .html appended if not present
 - `sreenshot.png` 1440x900 screenshot of site using headless chrome
 - `output.pdf` Printed PDF of site using headless chrome
 - `archive.org.txt` A link to the saved site on archive.org
 
-You can tweak parameters like screenshot size, file paths, timeouts, etc. in `archive.py`.
+You can tweak parameters like screenshot size, file paths, timeouts, dependencies, at the top of `archive.py`.
 You can also tweak the outputted html index in `index_template.html`. It just uses python
 format strings (not a proper templating engine like jinja2), which is why the CSS is double-bracketed `{{...}}`.
 
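The README text in the hunk above notes that the index template is filled with plain Python format strings, which is why literal CSS braces have to be doubled. A minimal sketch of that behaviour follows; the template string, the `render_index` helper, and the `title` and `rows` fields are hypothetical stand-ins for illustration, not taken from `index_template.html` or `archive.py`.

```python
# Hypothetical stand-in for index_template.html; the real template's
# fields and markup are not shown in this diff.
TEMPLATE = """<style>
body {{ font-family: sans-serif; }}
</style>
<h1>{title}</h1>
<ul>{rows}</ul>"""


def render_index(title, links):
    """Fill the template with str.format; {{ and }} come out as literal braces."""
    rows = "".join(
        '<li><a href="{0}">{0}</a></li>'.format(url) for url in links
    )
    return TEMPLATE.format(title=title, rows=rows)


html = render_index("Archive", ["https://example.com"])
print(html)
```

With single braces around the CSS rule, `str.format` would try to treat the rule body as a replacement field and raise an error, so the doubling is required rather than stylistic.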
@@ -80,14 +80,14 @@ will run fast subsequent times because it only downloads new links that haven't
 ## Publishing Your Archive
 
 The archive is suitable for serving on your personal server, you can upload the
-archive to `/var/www/pocket` (or pinboard) and allow people to access your saved copies of sites.
+archive to `/var/www/archive` and allow people to access your saved copies of sites.
 
 
 Just stick this in your nginx config to properly serve the wget-archived sites:
 
 ```nginx
-location /pocket/ {
-alias /var/www/pocket/;
+location /archive/ {
+alias /var/www/archive/;
 index index.html;
 autoindex on;
 try_files $uri $uri/ $uri.html =404;
@@ -96,7 +96,7 @@ location /pocket/ {
 
 Make sure you're not running any content as CGI or PHP, you only want to serve static files!
 
-Urls look like: `https://sweeting.me/pocket/archive/1493350273/en.wikipedia.org/wiki/Dining_philosophers_problem`
+Urls look like: `https://sweeting.me/archive/archive/1493350273/en.wikipedia.org/wiki/Dining_philosophers_problem`
 
 ## Info
 