docs: restructure documentation

This rewrite follows the principles of https://diataxis.fr/

Co-authored-by: Erik Michelson <github@erik.michelson.eu>
Signed-off-by: Philip Molares <philip.molares@udo.edu>
Signed-off-by: Erik Michelson <github@erik.michelson.eu>
2025-05-24 03:57:06 -04:00 · 2023-07-02 22:31:04 +02:00 · 2023-07-02 22:31:04 +02:00 · e07cd62596
commit e07cd62596
parent e0dd24ed29
68 changed files with 1163 additions and 315 deletions
--- a/docs/content/concepts/api-auth.md
+++ b/docs/content/concepts/api-auth.md
@@ -0,0 +1,64 @@
+# API Authentication
+
+!!! info "Design Document"
+    This is a design document, explaining the design and vision for a HedgeDoc 2
+    feature. It is not a user guide and may or may not be fully implemented.
+
+## Public API
+
+All requests to the public API require authentication using a [bearer token][bearer-token].
+
+This token can be generated using the profile page in the frontend
+(which in turn uses the private API to generate the token).
+
+### Token generation
+
+When a new token is requested via the private API, the backend generates a 64-byte secret of
+cryptographically secure data and returns it as a base64url-encoded string,
+along with an identifier. That string can then be used by clients as a bearer token.
+
+A SHA-512 hash of the secret is stored in the database. To validate tokens, the backend computes
+the hash of the provided secret and checks it against the stored hash for the provided identifier.
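
The generation and validation steps described above can be sketched with Node's built-in `crypto` module. This is a minimal illustration, not the actual HedgeDoc implementation: the `StoredToken` shape, the function names, the 8-byte identifier and the hex encoding of the stored hash are all assumptions, as is hashing the encoded string rather than the raw bytes.

```typescript
import { createHash, randomBytes, timingSafeEqual } from 'node:crypto';

// Hypothetical shape of the database record; the field names are assumptions.
interface StoredToken {
  keyId: string;
  hash: string; // hex-encoded SHA-512 hash of the secret
}

// Generate 64 bytes of cryptographically secure random data and return them
// base64url-encoded, together with an identifier and the record to store.
function generateToken(): { secret: string; record: StoredToken } {
  const secret = randomBytes(64).toString('base64url');
  const keyId = randomBytes(8).toString('base64url'); // assumed identifier format
  const hash = createHash('sha512').update(secret).digest('hex');
  return { secret, record: { keyId, hash } };
}

// Hash the provided secret and compare it with the stored hash.
function validateToken(secret: string, record: StoredToken): boolean {
  const provided = createHash('sha512').update(secret).digest();
  const stored = Buffer.from(record.hash, 'hex');
  // timingSafeEqual throws on length mismatch, so check the length first.
  return provided.length === stored.length && timingSafeEqual(provided, stored);
}
```

Only `record` would be persisted in this sketch; the `secret` is handed to the client once and never stored.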
+
+#### Choosing a hash function
+
+Unfortunately, there does not seem to be any explicit documentation about our exact use-case.
+Most docs describe classic password-saving scenarios and recommend bcrypt, scrypt or argon2.
+These hashing functions are deliberately slow in order to hinder brute-force or dictionary attacks,
+which would expose the original, user-provided password, which may have been reused across multiple services.
+
+We have a very different scenario:
+Our API tokens are 64 bytes of cryptographically strong pseudorandom data.
+Brute-force or dictionary attacks are therefore virtually impossible, and tokens are not
+reused across multiple services.
+We therefore only need to guard against one scenario:
+An attacker gains read-only access to the database. Saving only hashes in the database prevents the
+attacker from authenticating themselves as a user. The hash function does not need to be very slow,
+as the randomness of the original token prevents inverting the hash. The function actually needs to
+be reasonably fast, as the hash must be computed on every request to the public API.
+SHA-512 (or alternatively SHA3) fits this use-case.
+
+## Private API
+
+The private API uses a session cookie to authenticate the user.
+Sessions are handled using [passport.js](https://www.passportjs.org/).
+
+The backend hands out a new session token after the user has successfully authenticated
+using one of the supported authentication methods:
+
+- Username & Password (`local`)
+- LDAP
+- SAML
+- OAuth2
+- GitLab
+- GitHub
+- Facebook
+- Twitter
+- Dropbox
+- Google
+
+The `SessionGuard`, which is added to each (appropriate) controller method of the private API,
+checks if the provided session is still valid and provides the controller method
+with the correct user.
+
+[bearer-token]: https://datatracker.ietf.org/doc/html/rfc6750
--- a/docs/content/concepts/config.md
+++ b/docs/content/concepts/config.md
@@ -0,0 +1,101 @@
+# Config
+
+!!! info "Design Document"
+    This is a design document, explaining the design and vision for a HedgeDoc 2
+    feature. It is not a user guide and may or may not be fully implemented.
+
+The configuration of HedgeDoc 2 is handled entirely by environment variables.
+Most of these variables are prefixed with `HD_` (for HedgeDoc).
+NestJS, the framework we use, reads the variables from the environment and also from
+the `.env` file in the root of the project.
+
+## How the config code works
+
+The config of HedgeDoc is split up into **nine** different modules:
+
+`app.config.ts`
+: General configuration of the app
+
+`auth.config.ts`
+: Which authentication providers are available and which options are set
+
+`csp.config.ts`
+: Configuration for [Content Security Policy][csp]
+
+`customization.config.ts`
+: Config to customize the instance and set instance specific links
+
+`database.config.ts`
+: Which database should be used
+
+`external-services.config.ts`
+: Which external services are activated and where they can be called
+
+`hsts.config.ts`
+: Configuration for [HTTP Strict-Transport-Security][hsts]
+
+`media.config.ts`
+: Where media files are being stored
+
+`note.config.ts`
+: Configuration for notes
+
+Each of those files (except `auth.config.ts` which is discussed later) consists of three parts:
+
+1. An interface
+2. A Joi schema
+3. A default export
+
+### Interface
+
+The interface just describes which options the configuration has and how the rest of HedgeDoc can
+use them. All enums that are used in here are put in their own files with the extension `.enum.ts`.
+
+### Joi Schema
+
+We use [Joi][joi] to validate each provided configuration. This makes sure the user's configuration
+is sound and produces helpful error messages otherwise.
+
+The most important part here is that each value ends with `.label()`. This names the
+environment variable that corresponds to each config option. It's very important that each config
+option is assigned the correct label to have meaningful error messages that benefit the user.
+
+Everything else about how Joi works and how you should write schemas can
+be read in [their documentation][joi-doc].
+
+### A default export
+
+The default exports are used by NestJS to provide the values to the rest of the application.
+We mostly do four things here:
+
+1. Populate the config interface with environment variables, creating the config object.
+2. Validate the config object against the Joi schema.
+3. Polish the error messages from Joi and present them to the user (if any occur).
+4. Return the validated config object.
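
The four steps above can be sketched without any dependencies. In this illustration a list of hand-written checks stands in for the Joi schema, and the option names, defaults and the `HD_EXAMPLE_*` variables are hypothetical; the real config files use Joi and NestJS.

```typescript
// Hypothetical config interface; the options are assumptions for illustration.
interface ExampleConfig {
  port: number;
  logLevel: string;
}

// Stand-in for a Joi schema: each check returns an error message that names
// the environment variable, which is the role `.label()` plays in Joi.
type Check = (config: ExampleConfig) => string | undefined;

const checks: Check[] = [
  (c) =>
    Number.isInteger(c.port) && c.port > 0 && c.port < 65536
      ? undefined
      : '"HD_EXAMPLE_PORT" must be a valid port',
  (c) =>
    ['error', 'warn', 'info', 'debug'].includes(c.logLevel)
      ? undefined
      : '"HD_EXAMPLE_LOG_LEVEL" must be one of error, warn, info, debug',
];

function loadExampleConfig(env: Record<string, string | undefined>): ExampleConfig {
  // 1. Populate the config interface from environment variables.
  const config: ExampleConfig = {
    port: Number(env['HD_EXAMPLE_PORT'] ?? '3000'),
    logLevel: env['HD_EXAMPLE_LOG_LEVEL'] ?? 'warn',
  };
  // 2. Validate the config object.
  const errors = checks
    .map((check) => check(config))
    .filter((message): message is string => message !== undefined);
  // 3. Polish the error messages and present them to the user.
  if (errors.length > 0) {
    throw new Error('Invalid configuration:\n' + errors.map((m) => ` - ${m}`).join('\n'));
  }
  // 4. Return the validated config object.
  return config;
}
```

Calling `loadExampleConfig(process.env)` would then either return a validated object or fail with a message that names the offending variable.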
+
+## How `auth.config.ts` works
+
+Because it's possible to configure some authentication providers multiple times
+(e.g. multiple LDAPs or GitLabs), we use user-defined environment variable names.
+With user-defined names, it's not possible to put the correct labels in the schema
+or build the config objects as we do in every other file.
+
+Therefore, we have two big extra steps in the default export:
+
+1. To populate the config object we have some code at the top of the default export to gather all
+   configured variables into arrays.
+2. The error messages are piped into the util method `replaceAuthErrorsWithEnvironmentVariables`.
+   This replaces the error messages of the form `gitlab[0].providerName`
+   with `HD_AUTH_GITLAB_<nameOfFirstGitlab>_PROVIDER_NAME`. To do this, the util function receives
+   the error, the name of the config option (e.g. `'gitlab'`), the appropriate prefix
+   (e.g. `'HD_AUTH_GITLAB_'`), and an array of the user-defined names.
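
The rewriting described in step 2 can be sketched as follows. This is a hypothetical re-implementation for illustration only: `replaceAuthErrorSketch` and `toUpperSnakeCase` are made-up names, and the real `replaceAuthErrorsWithEnvironmentVariables` may differ in signature and behavior.

```typescript
// Convert a camelCase option name to UPPER_SNAKE_CASE,
// e.g. providerName -> PROVIDER_NAME.
function toUpperSnakeCase(name: string): string {
  return name.replace(/([a-z0-9])([A-Z])/g, '$1_$2').toUpperCase();
}

// Replace paths like `gitlab[0].providerName` in an error message with the
// matching environment variable name. `optionName` is assumed to contain no
// regex metacharacters.
function replaceAuthErrorSketch(
  message: string,
  optionName: string,
  prefix: string,
  names: string[],
): string {
  const pattern = new RegExp(`${optionName}\\[(\\d+)\\]\\.(\\w+)`, 'g');
  return message.replace(pattern, (match: string, index: string, key: string) => {
    const name = names[Number(index)];
    // Leave the text untouched if the index has no user-defined name.
    return name === undefined ? match : `${prefix}${name}_${toUpperSnakeCase(key)}`;
  });
}
```

For example, with the names `['FIRST']`, a message about `gitlab[0].providerName` would mention `HD_AUTH_GITLAB_FIRST_PROVIDER_NAME` instead.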
+
+## Mocks
+
+Some config files also have a `.mock.ts` file which defines the configuration for the e2e tests.
+Those files just contain the default export and return the mock config object.
+
+[csp]: https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP
+[hsts]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Strict-Transport-Security
+[joi]: https://joi.dev/
+[joi-doc]: https://joi.dev/api
--- a/docs/content/concepts/events.md
+++ b/docs/content/concepts/events.md
@@ -0,0 +1,18 @@
+# Events
+
+!!! info "Design Document"
+    This is a design document, explaining the design and vision for a HedgeDoc 2
+    feature. It is not a user guide and may or may not be fully implemented.
+
+In HedgeDoc 2, we use an event system based on [EventEmitter2][eventemitter2].
+It's used to reduce circular dependencies between different services and inform these services
+about changes.
+
+HedgeDoc's system is basically [the system NestJS offers][nestjs/eventemitter].  
+The config for the `EventEmitterModule` is stored in `events.ts` and
+exported as `eventModuleConfig`. In the same file, enums for the event keys are defined.
+Each of these events is expected to be sent with an additional value.
+In the enum definition a comment should tell you what exactly this value should be.
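
The pattern of enum-keyed events with a documented payload can be sketched like this. Node's built-in `EventEmitter` is used here as a stand-in for EventEmitter2 and the NestJS module, and the enum name, keys and payloads are hypothetical.

```typescript
import { EventEmitter } from 'node:events';

// Hypothetical event keys; the real enums live in `events.ts`.
enum ExampleNoteEvent {
  /** Sent with the id of the note whose permissions changed. */
  PERMISSION_CHANGE = 'note.permission_change',
  /** Sent with the id of the note that was deleted. */
  DELETION = 'note.deletion',
}

const emitter = new EventEmitter();
const deletedNotes: string[] = [];

// A service subscribes to the event instead of depending on the emitting service.
emitter.on(ExampleNoteEvent.DELETION, (noteId: string) => {
  deletedNotes.push(noteId);
});

// Another service announces the change together with the documented value.
emitter.emit(ExampleNoteEvent.DELETION, 'example-note-id');
```

Because neither side imports the other, this kind of indirection avoids the circular dependency that a direct service call would create.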
+
+[eventemitter2]: https://github.com/EventEmitter2/EventEmitter2
+[nestjs/eventemitter]: https://docs.nestjs.com/techniques/events
--- a/docs/content/concepts/index.md
+++ b/docs/content/concepts/index.md
@@ -0,0 +1,39 @@
+# Core concepts
+
+Core concepts explain the internal structure of HedgeDoc by providing
+background information and explanations. They are especially useful for contributing to HedgeDoc.
+
+<!-- markdownlint-disable no-inline-html -->
+<div class='topic-container'>
+    <a href='/concepts/notes/'>
+        <div class='topic'>
+            <span>📝</span>
+            <span>Notes</span>
+        </div>
+    </a>
+    <a href='/concepts/user-profiles/'>
+        <div class='topic'>
+            <span>🙎</span>
+            <span>User Profiles</span>
+        </div>
+    </a>
+    <a href='/concepts/config/'>
+        <div class='topic'>
+            <span>🛠️</span>
+            <span>Config</span>
+        </div>
+    </a>
+    <a href='/concepts/api-auth/'>
+        <div class='topic'>
+            <span>🤖️</span>
+            <span>API Auth</span>
+        </div>
+    </a>
+    <a href='/concepts/events/'>
+        <div class='topic'>
+            <span>🎩</span>
+            <span>Events</span>
+        </div>
+    </a>
+</div>
+<!-- markdownlint-enable no-inline-html -->
--- a/docs/content/concepts/notes.md
+++ b/docs/content/concepts/notes.md
@@ -0,0 +1,109 @@
+# Notes
+
+!!! info "Design Document"
+    This is a design document, explaining the design and vision for a HedgeDoc 2
+    feature. It is not a user guide and may or may not be fully implemented.
+
+Each note in HedgeDoc 2 contains the following information:
+
+- publicId (`b604x5885k9k01bq7tsmawvnp0`)
+  <!-- markdownlint-disable proper-names -->
+- a list of aliases (`[hedgedoc-2, hedgedoc-next]`)
+  <!-- markdownlint-enable proper-names -->
+- groupPermissions
+- userPermissions
+- viewCount (`0`)
+- owner
+- revisions
+- authorColors
+- historyEntries
+- description (`All you never wanted to know about notes`)
+- title (`Notes`)
+- tags (`[features, cool, update]`)
+- version
+
+The `publicId` is the default way of identifying a note. It will be a randomly generated
+128-bit value encoded with [base32-encode][base32-encode] using the Crockford variant and converted
+to lowercase. This variant of base32 is used because it results in ids that use only one case of
+alphanumeric characters and other URL-safe characters. We convert the id to lowercase because we
+want to minimize case confusion.
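
As a rough illustration of this scheme, the sketch below encodes 16 random bytes with a hand-rolled Crockford base32 encoder, which stands in for the `base32-encode` package. The function names are hypothetical, and the handling of the trailing bits (padding with zeros) is an assumption.

```typescript
import { randomBytes } from 'node:crypto';

// Crockford's base32 alphabet omits I, L, O and U to avoid confusion.
const CROCKFORD_ALPHABET = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';

function encodeCrockford(data: Uint8Array): string {
  let bits = 0; // number of not yet encoded bits buffered in `value`
  let value = 0; // bit buffer; only the lowest `bits` bits are meaningful
  let output = '';
  for (let i = 0; i < data.length; i++) {
    value = (value << 8) | data[i];
    bits += 8;
    while (bits >= 5) {
      output += CROCKFORD_ALPHABET.charAt((value >>> (bits - 5)) & 31);
      bits -= 5;
    }
  }
  if (bits > 0) {
    // Pad the remaining bits with zeros on the right to fill a 5-bit group.
    output += CROCKFORD_ALPHABET.charAt((value << (5 - bits)) & 31);
  }
  return output;
}

// 128 random bits, Crockford base32 encoded and converted to lowercase.
function generatePublicId(): string {
  return encodeCrockford(randomBytes(16)).toLowerCase();
}
```

Since 128 bits do not divide evenly into 5-bit groups, the result is 26 characters long, the same length as the example id shown in the list above.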
+
+`aliases` are the other way of identifying a note. There can be any number of them, and the owner
+of the note is able to add or remove them. All aliases are just strings (especially to accommodate
+the old identifiers from HedgeDoc 1, [see below](#conversion-of-hedgedoc-1-notes)), but new aliases
+added with HedgeDoc 2 will only allow characters matching this regex: `[a-z0-9\-_]`. This is done to
+once again prevent case confusion. One of the aliases can be set as the primary alias, which will be
+used as the identifier for the history entry.
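
A check for new aliases could look like the sketch below. The text only gives the character class, so anchoring it to the whole string and requiring at least one character are assumptions, as are the names `NEW_ALIAS_PATTERN` and `isValidNewAlias`.

```typescript
// Assumption: the character class applies to the entire alias,
// and an empty alias is not allowed.
const NEW_ALIAS_PATTERN = /^[a-z0-9\-_]+$/;

function isValidNewAlias(alias: string): boolean {
  return NEW_ALIAS_PATTERN.test(alias);
}
```

Aliases imported from HedgeDoc 1 would bypass such a check, since they are stored as plain strings.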
+
+`groupPermissions` and `userPermissions` each hold a list of the appropriate permissions.
+Each permission holds a reference to a note and a user/group and specifies what the user/group
+is allowed to do.
+Permissions are additive: a user who only has the right to read a note via one group,
+but has the right to write via a different group or directly as a user, is able to write to the note.
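
The additive rule can be illustrated with a small sketch. The permission shapes, the `canEdit` flag and the function name are assumptions made for this example, and the owner's special rights are left out.

```typescript
// Hypothetical, simplified permission records for a single note.
interface UserPermission {
  username: string;
  canEdit: boolean;
}

interface GroupPermission {
  groupName: string;
  canEdit: boolean;
}

type Access = 'none' | 'read' | 'write';

// Collect every permission that applies to the user and take the most
// permissive one: any write grant wins over read-only grants.
function effectiveAccess(
  username: string,
  memberOf: string[],
  userPermissions: UserPermission[],
  groupPermissions: GroupPermission[],
): Access {
  const applicable = [
    ...userPermissions.filter((p) => p.username === username),
    ...groupPermissions.filter((p) => memberOf.includes(p.groupName)),
  ];
  if (applicable.some((p) => p.canEdit)) {
    return 'write';
  }
  return applicable.length > 0 ? 'read' : 'none';
}
```

A user who is in a read-only group and also in a group with write access therefore ends up with `'write'`.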
+
+The `viewCount` is a simple counter that holds how often the read-only view of the note in question
+was requested.
+
+`owner` is the user that created the note or later got ownership of the note. The current owner is
+able to change the owner of the note to someone else. The owner of a note is the only person that
+can perform the following actions:
+
+- delete the note
+- modify `aliases`
+- remove all `revisions`
+
+The `revisions` hold all revisions of the note. These are the changes to the note content and by
+whom they were performed.
+
+The `authorColors` specify, for each pair of user and note, which color should be used
+to highlight that user in the note.
+
+The `historyEntries` hold the history entries this note is referenced in. They are mainly here
+for the purpose of deleting the history entries on note deletion.
+
+`description`, `tags` and `title` each hold information specified in the [frontmatter][frontmatter]
+of the note. They are extracted and saved in the database to allow the history page to show them and
+do a search for tags without having to do a full-text search or having to parse the tags of
+each note on search.
+While `description` and `tags` are only specified by the [frontmatter][frontmatter], the title is
+taken from the first of the following sources that is specified:
+
+- the content of the *title* field of the [frontmatter][frontmatter] of the note
+- the content of the *title* field in the *opengraph* field of the [frontmatter][frontmatter]
+  of the note
+- the first level 1 heading of the note
+
+All mentioned fields are extracted from the note content by the backend on save or update.
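
The title fallback order can be sketched as a small function. The `Frontmatter` shape and the function name are hypothetical; treating blank strings as unspecified and falling back to an empty title are assumptions not stated in the text.

```typescript
// Hypothetical, reduced view of a parsed frontmatter.
interface Frontmatter {
  title?: string;
  opengraph?: { title?: string };
}

// Return the first specified candidate in the documented order:
// frontmatter title, opengraph title, first level 1 heading.
function resolveTitle(frontmatter: Frontmatter, firstHeading?: string): string {
  const candidates = [frontmatter.title, frontmatter.opengraph?.title, firstHeading];
  const found = candidates.find((c) => c !== undefined && c.trim() !== '');
  return found ?? '';
}
```

For a note without any frontmatter, `resolveTitle({}, 'Notes')` falls through to the heading.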
+
+`version` specifies if a note is an old HedgeDoc 1 note, or a new HedgeDoc 2 note.
+This is mainly used to redirect old notes from <https://md.example.org/noteid>
+to <https://md.example.org/n/noteid>.
+
+## Deleting Notes
+
+- The owner of a note may delete it.
+  - By default, this also removes all revisions and all files that were uploaded to that note.
+  - The owner may choose to skip deleting associated uploads, leaving them without a note.
+  - The frontend should show a list of all uploads that will be affected
+    and provide a method of skipping deletion.
+- The owner of a note may delete all revisions. This effectively purges the edit
+  history of a note.
+
+## Conversion of HedgeDoc 1 notes
+
+First we want to define some terms of the HedgeDoc 1 notes:
+
+- **noteId**: This refers to the auto-generated id for new notes.
+  (<https://demo.hedgedoc.org/Q_Iz5T_lQWGYxne0sbMtwg>)
+
+- **shortId**: This refers to the auto-generated short id which is used for "published" notes and
+  slide presentation mode. (<https://demo.hedgedoc.org/s/61ZHI6HGE>)
+
+- **alias**: This refers to user-defined URLs for notes on instances with Free-URL mode enabled.
+  (<https://md.kif.rocks/lowercase>)
+
+The noteId, shortId and alias of each HedgeDoc 1 note are saved as HedgeDoc 2 aliases.
+Each note gets a newly generated publicId.
+
+[frontmatter]: https://jekyllrb.com/docs/front-matter/
+[base32-encode]: https://www.npmjs.com/package/base32-encode
--- a/docs/content/concepts/user-profiles.md
+++ b/docs/content/concepts/user-profiles.md
@@ -0,0 +1,99 @@
+# User Profiles and Authentication
+
+!!! info "Design Document"
+    This is a design document, explaining the design and vision for a HedgeDoc 2
+    feature. It is not a user guide and may or may not be fully implemented.
+
+Each user in HedgeDoc 2 has a profile
+which contains the following information:
+
+- username (`janedoe`)
+- display name (`Jane Doe`)
+- email address, optional (`janedoe@example.com`)
+- profile picture, optional
+- the date the user was created
+- the date the profile was last updated
+
+HedgeDoc 2 supports multiple authentication methods per user.  
+These are called *identities* and each identity is backed by an
+auth provider (like OAuth, SAML, LDAP or internal auth).
+
+One of a user's identities may be marked as *sync source*.  
+This identity is used to automatically update profile attributes like the
+display name or profile picture on every login. If a sync source exists, the
+user cannot manually edit their profile information.
+If an external provider was used to create the account,
+it is automatically used as sync source.
+
+The administrator may globally set one or more auth providers as sync source,
+e.g. to enforce that all profile information comes from the corporate
+LDAP and is the same across multiple applications.  
+If global sync sources exist, new accounts can only be created using
+these auth providers. The auth provider that was used to create the account
+is automatically set as sync source and cannot be changed by the user.
+This effectively pins the account to this provider.
+
+## Example: Corporate LDAP
+
+The administrator wants to allow users to log in via the corporate LDAP
+and Google. Login must only be possible for users present in LDAP and
+all users must be displayed as they are in the LDAP.
+
+The admin therefore sets up two login providers:
+
+- corporate LDAP, marked as global sync source
+- Google OAuth login
+
+If a new user tries to log in via Google, they will not be found in the
+database. The frontend detects that a global sync source exists and
+suggests logging in via LDAP first.
+
+After a new user has created their account by logging in via LDAP, they can use
+the 'add a new login method' feature in their profile page to link their
+Google account and use it to log in afterwards.
+
+## Example: Username Conflict
+
+HedgeDoc is configured with two auth providers.
+
+- A user logs in using auth provider A.
+- The backend receives the profile information from provider A and notices that the username
+  in the profile already exists in the database, but no identity for this provider-username
+  combination exists.
+- The backend creates a new user with another username to solve the username conflict.
+- The frontend warns the user that the username provided by the auth provider is already taken
+  and that another username has been generated. It also offers to instead link the new auth provider
+  (in this case A) with the existing auth provider (in this case B).
+- If the user chooses the latter option, the frontend sends a request to delete the newly created
+  user to the backend.
+- The user can then log in with auth provider B and link provider A using the "link auth provider"
+  feature in the profile page.
+
+### Handling of sync sources and username conflicts
+
+#### Global sync sources
+
+If at the time of logging in with auth provider A, *only* A is configured as a *global* sync source,
+the backend cannot automatically create a user with another username.
+
+This is because:
+
+- Creating new accounts is only possible with a sync source auth provider.
+- Setting an auth provider as sync-source entails that profile information the auth provider
+  provides must be saved into the local HedgeDoc database.
+- As the username the auth provider provides already exists in the database, a new user cannot
+  be created with that username.
+
+In this case, the frontend should show the user a notice that they should contact an admin
+to resolve the issue.
+
+!!! warning
+    Admins must ensure that usernames are unique across all auth providers set as a global sync
+    source. Otherwise, if e.g. a user `johndoe` exists in both LDAPs configured as sync source,
+    only the first one to log in can use HedgeDoc.
+
+#### Local sync sources
+
+If auth provider A is configured as a sync source by the user, syncing is automatically disabled,
+and a notice is shown. Re-enabling the sync source is not possible until the username conflict is
+resolved, e.g. by changing the username in the auth provider.