mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2025-05-20 18:05:19 -04:00
Update README.md
This commit is contained in:
parent
67baea172e
commit
28e85e0b95
1 changed files with 5 additions and 5 deletions
10
README.md
10
README.md
|
@ -1070,13 +1070,13 @@ Because ArchiveBox is designed to ingest a large volume of URLs with multiple co
|
|||
<br/>
|
||||
|
||||
<ul>
|
||||
<li><strong>ArchiveBox can use anywhere from ~1gb per 1000 articles, to ~50gb per 1000 articles</strong>, mostly dependent on whether you're saving audio & video using <code>SAVE_MEDIA=True</code> and whether you lower <code>MEDIA_MAX_SIZE=750mb</code>.</li>
|
||||
<li>Disk usage can be reduced by using a compressed/[deduplicated](https://www.ixsystems.com/blog/ixsystems-and-klara-systems-celebrate-valentines-day-with-a-heartfelt-donation-of-fast-dedupe-to-openzfs-and-truenas/) filesystem like <a href="https://www.reddit.com/r/zfs/comments/t9cexx/a_simple_real_world_zfs_compression_speed_an/">ZFS</a>/BTRFS, or by turning off extractors methods you don't need. You can also deduplicate content with a tool like <a href="https://github.com/adrianlopezroche/fdupes">fdupes</a> or <a href="https://github.com/pauldreik/rdfind">rdfind</a>.
|
||||
<li><strong>ArchiveBox can use anywhere from ~1gb per 1000 Snapshots, to ~50gb per 1000 Snapshots</strong>, mostly dependent on whether you're saving audio & video using <code>SAVE_MEDIA=True</code> and whether you lower <code>MEDIA_MAX_SIZE=750mb</code>.</li>
|
||||
<li>Disk usage can be reduced by using a compressed/<a href="https://www.ixsystems.com/blog/ixsystems-and-klara-systems-celebrate-valentines-day-with-a-heartfelt-donation-of-fast-dedupe-to-openzfs-and-truenas/">deduplicated</a> filesystem like <a href="https://www.reddit.com/r/zfs/comments/t9cexx/a_simple_real_world_zfs_compression_speed_an/">ZFS</a>/BTRFS, or by turning off extractors methods you don't need. You can also deduplicate content with a tool like <a href="https://github.com/adrianlopezroche/fdupes"><code>fdupes</code></a> or <a href="https://github.com/pauldreik/rdfind"><code>rdfind</code></a>.
|
||||
</li>
|
||||
<li><strong>Don't store large collections on older filesystems like EXT3/FAT</strong> as they may not be able to handle more than 50k directory entries in the `data/archive/` folder.
|
||||
<li><strong>Don't store large collections on older filesystems like EXT3/FAT</strong> as they may not be able to handle more than 50k directory entries in the <code>data/archive/</code> folder.
|
||||
</li>
|
||||
<li><strong>Try to keep the `data/index.sqlite3` file on local drive (not a network mount)</strong> or SSD for maximum performance, however the `data/archive/` folder can be on a network mount or slower HDD.</li>
|
||||
<li>If using Docker or NFS/SMB/FUSE for the `data/archive/` folder, you may need to set [`PUID` & `PGID`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#puid--pgid) and [disable `root_squash`](https://github.com/ArchiveBox/ArchiveBox/issues/1304) on your fileshare server.
|
||||
<li><strong>Try to keep the <code>data/index.sqlite3</code> file on local drive (not a network mount)</strong> or SSD for maximum performance, however the <code>data/archive/</code> folder can be on a network mount or slower HDD.</li>
|
||||
<li>If using Docker or NFS/SMB/FUSE for the `data/archive/` folder, you may need to set <a href="https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#puid--pgid"><code>PUID</code> & <code>PGID</code></a> and <a href="https://github.com/ArchiveBox/ArchiveBox/issues/1304">disable <code>root_squash</code></a> on your fileshare server.
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue