path: root/keyserver
Age, Commit message, Author
2022-10-27Refactor `claimRemoteKeys`Neil Alexander
2022-10-24Fix slow querying of cross-signing signaturesNeil Alexander
2022-10-20Mutex protect query keys response (#2812)devonh
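The log does not show the change itself, so what follows is only a minimal sketch of the general technique the subject names: guarding a shared response map with a `sync.Mutex` while several goroutines write to it. The `queryResponse` type, its fields, and `collect` are hypothetical and are not Dendrite's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// queryResponse is a hypothetical stand-in for a key query response that
// several goroutines fill in concurrently.
type queryResponse struct {
	mu         sync.Mutex
	deviceKeys map[string]string // user ID -> key material (simplified)
}

// set records the result for one user while holding the lock, so that
// concurrent writers never touch the map at the same time.
func (r *queryResponse) set(userID, key string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.deviceKeys[userID] = key
}

// collect handles every user in its own goroutine and merges the results.
func collect(userIDs []string) map[string]string {
	res := &queryResponse{deviceKeys: make(map[string]string)}
	var wg sync.WaitGroup
	for _, id := range userIDs {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			res.set(id, "key-for-"+id)
		}(id)
	}
	wg.Wait() // all writers are done, so reading the map is safe now
	return res.deviceKeys
}

func main() {
	fmt.Println(len(collect([]string{"@a:x", "@b:y", "@c:z"})))
}
```

Without the lock, the Go runtime can abort such a program with a "concurrent map writes" fatal error.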
2022-10-19Fix lock contentionNeil Alexander
2022-10-19Fix concurrent map write in key serverNeil Alexander
2022-10-07Add test for `QueryDeviceMessages` (#2773)Till
Adds tests for `QueryDeviceMessages` and also includes some optimizations to reduce allocations in the DB layer.
2022-10-06Always return `one_time_key_counts` on `/keys/upload` (#2769)Till
The OTK count is [required](https://spec.matrix.org/v1.4/client-server-api/#post_matrixclientv3keysupload) in responses to `/keys/upload`, so return those.
2022-10-05Demote `Failed to query device keys for some users` warning to `level=debug`Neil Alexander
Many of these warnings are due to dead servers and are quite annoying when they fill up the logs.
2022-10-03Stop CPU burn in `PerformMarkAsStaleIfNeeded`Neil Alexander
2022-09-30Allow more time for device list updates (#2749)Neil Alexander
This updates the device list updater so that it has a context per request, rather than a global 30 seconds for the entire server. The global timeout meant that talking to a slow remote server or requesting a lot of user IDs was pretty much guaranteed to fail. It also uses the process context to allow correct cancellation when Dendrite wants to shut down cleanly.
2022-09-30Add `/_dendrite/admin/refreshDevices/{userID}` (#2746)Till
Allows immediately querying `/devices/{userID}` over federation to (hopefully) resolve E2EE issues.
2022-09-20Mark device list as stale, if we don't have the requesting device (#2728)Till
This hopefully makes E2EE chats a little bit more reliable by re-syncing devices if we don't have the `requesting_device_id` in our database. (As seen in [Synapse](https://github.com/matrix-org/synapse/blob/c52abc1cfdd9e5480cdb4a03d626fe61cacc6573/synapse/handlers/devicemessage.py#L157-L201))
2022-09-13Check unique constraint errors when manually inserting migrations (#2712)Till
This should avoid unnecessary logging on startup if the migration (where we need `InsertMigration`) was already executed. This now checks for "unique constraint errors" for SQLite and Postgres and fails the startup process if the migration couldn't be manually inserted for some other reason.
2022-09-09Fix database transaction for keyserver `DeleteDeviceKeys`Neil Alexander
2022-09-09Change detection of already executed migrations (#2665)Till
This changes the detection of already executed migrations for the roomserver state block and keychange refactor. It now uses schema tables provided by the database engine to check if the column was already removed. We now also store the migration in the migrations table. This should stop e.g. Postgres from logging errors like `ERROR: column "event_nid" does not exist at character 8`.
2022-09-08Fix issue with stale device lists (#2702)Till
We were only sending the last entry to the worker, so most likely missed updates.
2022-09-07Add HTTP status code to FederationClientError (#2699)Till
Also ensures we wait on more HTTP status codes.
2022-09-07Avoid unneeded JSON operations (#2698)Till
We were `json.Unmarshal`ing the EDU and `json.Marshal`ing right before sending the EDU to the stream. Those are now removed and the consumer does `json.Unmarshal` once.
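As a rough illustration of the idea (not the actual Dendrite code), the sketch below forwards the raw bytes to a stream untouched and decodes them exactly once on the consumer side. The channel standing in for the stream, the `edu` struct, and the `publish` and `consume` helpers are assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// edu is a simplified, hypothetical view of an EDU. The content stays raw
// so that it is decoded only by whoever actually needs it.
type edu struct {
	Type    string          `json:"edu_type"`
	Content json.RawMessage `json:"content"`
}

// publish forwards the payload as received, with no decode/encode round trip.
func publish(stream chan<- []byte, raw []byte) {
	stream <- raw
}

// consume performs the single json.Unmarshal on the receiving side.
func consume(stream <-chan []byte) (edu, error) {
	var e edu
	err := json.Unmarshal(<-stream, &e)
	return e, err
}

func main() {
	stream := make(chan []byte, 1) // buffered stand-in for the real stream
	publish(stream, []byte(`{"edu_type":"m.device_list_update","content":{"user_id":"@alice:example.org"}}`))

	e, err := consume(stream)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(e.Type, string(e.Content))
}
```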
2022-09-07Re-add waitTime if we're not blacklisted and no RetryAfter was specified.Till Faelligen
2022-09-07Add a SigningKeyUpdate producer (#2697)Till
This adds a new stream for signing key updates, which should ensure we don't lose any updates over federation.
2022-09-07Handle errors differently in the `DeviceListUpdater` (#2695)Till
`If a device list update goes missing, the server resyncs on the next one` was failing because a previous test would receive a `waitTime` of 1h, resulting in the test timing out. This now tries to handle the returned errors differently, e.g. by using the default `waitTime` of 2s. Also doesn't try further users in the list, if one of the errors would cause a longer `waitTime`.
2022-08-31Allow batching in `JetStreamConsumer` (#2686)Neil Alexander
This allows us to receive more than one message from NATS at a time if we want.
2022-08-29Race in keyserver initialization (#2619)Brian Meek
Signed-off-by: Brian Meek <brian@hntlabs.com>
2022-08-19Enforce device list backoffs (#2653)Neil Alexander
This ensures that if the device list updater is already backing off a node, we don't try to call processServer again for that server just because the server name arrived in the channel. Otherwise we can keep trying to hit a remote server that is offline or not behaving every second, and that spams the logs too.
2022-08-11Generic-based internal HTTP API (#2626)Neil Alexander
* Generic-based internal HTTP API (tested out on a few endpoints in the federation API)
* Add `PerformInvite`
* More tweaks
* Fix metric name
* Fix LookupStateIDs
* Lots of changes to clients
* Some serverside stuff
* Some error handling
* Use paths as metric names
* Revert "Use paths as metric names"
  This reverts commit a9323a6a343f5ce6461a2e5bd570fe06465f1b15.
* Namespace metric names
* Remove duplicate entry
* Remove another duplicate entry
* Tweak error handling
* Some more tweaks
* Update error behaviour
* Some more error tweaking
* Fix API path for `PerformDeleteKeys`
* Fix another path
* Tweak federation client proxying
* Fix another path
* Don't return typed nils
* Some more tweaks, not that it makes any difference
* Tweak federation client proxying
* Maybe fix the key backup test
2022-08-08Fix issues with migrations not getting executed (#2628)Till
* Fix issues with migrations not getting executed
* Check actual postgres error
* Return error if it's not "column does not exist"
2022-08-05Do not use `ioutil` as it is deprecated (#2625)Neil Alexander
2022-08-05Fix linter issues (#2624)Till
* Try that again
* All hail the mighty linter?
* And once again
* goimport all the things
2022-08-05Add race testing to tests, and fix a few small race conditions in the tests (#2587)Brian Meek
* Add race testing to tests, and fix a few small race conditions in the tests
* Enable run-sytest on MacOS
* Remove deadlock detecting mutex, per code review feedback
* Remove autoformatting related changes and a closure that is not needed
* Adjust to importing nats client as 'natsclient'
  Signed-off-by: Brian Meek <brian@hntlabs.com>
* Clarify the use of gooseMutex to protect goose internal state
  Signed-off-by: Brian Meek <brian@hntlabs.com>
* Remove no longer needed mutex for guarding goose
  Signed-off-by: Brian Meek <brian@hntlabs.com>
2022-08-03Fix syncapi shared users query & device lists (#2614)Till
* Fix query issue, only add "changed" users if we actually share a room
* Avoid log spam if context is done
* Undo changes to filterSharedUsers
* Add logging again..
* Fix SQLite shared users query
* Change query to include invited users
2022-07-25Update database migrations, remove goose (#2264)Till
* Add new db migration
* Update migrations Remove goose
* Add possibility to test direct upgrades
* Try to fix WASM test
* Add checks for specific migrations
* Remove AddMigration Use WithTransaction Add Dendrite version to table
* Fix linter issues
* Update tests
* Update comments, outdent if
* Namespace migrations
* Add direct upgrade tests, skipping over one version
* Split migrations
* Update go version in CI
* Fix copy&paste mistake
* Use contexts in migrations
Co-authored-by: kegsay <kegan@matrix.org>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-07-05Use new testrig for key changes tests (#2552)Till
* Use new testrig for tests
* Log the error message
2022-06-15Add `InputDeviceListUpdate` to the keyserver, remove old input API (#2536)Neil Alexander
* Add `InputDeviceListUpdate` to the keyserver, remove old input API
* Fix copyright
* Log more information when a device list update fails
2022-06-01Reduce error levels on device list updateNeil Alexander
2022-05-17bugfix: E2EE device keys could sometimes not be sent to remote servers (#2466)kegsay
* Fix flakey sytest 'Local device key changes get to remote servers'
* Debug logs
* Remove internal/test and use /test only
  Remove a lot of ancient code too.
* Use FederationRoomserverAPI in more places
* Use more interfaces in federationapi; begin adding regression test
* Linting
* Add regression test
* Unbreak tests
* ALL THE LOGS
* Fix a race condition which could cause events to not be sent to servers
  If a new room event which rewrites state arrives, we remove all joined hosts then re-calculate them. This wasn't done in a transaction so for a brief period we would have no joined hosts. During this interim, key change events which arrive would not be sent to destination servers. This would sporadically fail on sytest.
* Unbreak new tests
* Linting
2022-05-11Fix OTK upload spam (#2448)Till
* Fix OTK spam
* Update comment
* Optimize selectKeysCountSQL to only return max 100 keys
* Return CurrentPosition if the request timed out
* Revert "Return CurrentPosition if the request timed out"
  This reverts commit 7dbdda964189f5542048c06ce5ffc6d4da1814e6.
Co-authored-by: kegsay <kegan@matrix.org>
2022-05-09One NATS instance per `BaseDendrite` (#2438)Neil Alexander
* One NATS instance per `BaseDendrite`
* Fix roomserver
2022-05-09Add `(user_id, device_id)` index on OTK table (#2435)Neil Alexander
2022-05-06Clean up interface definitions (#2427)kegsay
* tidy up interfaces
* remove unused GetCreatorIDForAlias
* Add RoomserverUserAPI interface
* Define more interfaces
* Use AppServiceInternalAPI for consistent naming
* clean up federationapi constructor a bit
* Fix monolith in -http mode
2022-05-05Define component interfaces based on consumers (2/2) (#2425)kegsay
* convert remaining interfaces
* Tidy up the userapi interfaces
2022-05-05Define component interfaces based on consumers (1/2) (#2423)kegsay
* Specify interfaces used by appservice, do half of clientapi
* convert more deps of clientapi to finer-grained interfaces
* Convert mediaapi and rest of clientapi
* Somehow this got missed
2022-05-05syncapi: define specific interfaces for internal HTTP communications (#2416)kegsay
* syncapi: use finer-grained interfaces when making the syncapi
* Use specific interfaces for syncapi-roomserver interactions
* Define query access token api for shared http auth code
2022-05-03Global database connection pool (for monolith mode) (#2411)Neil Alexander
* Allow monolith components to share a single database pool
* Don't yell about missing connection strings
* Rename field
* Setup tweaks
* Fix panic
* Improve configuration checks
* Update config
* Fix lint errors
* Update comments
2022-04-29Device list display name fixes (#2405)Neil Alexander
* Get device names from `unsigned` in `/user/devices`
* Fix display name updates
* Fix bug
* Fix another bug
2022-04-28Ensure signature map exists (fixes #2393) (#2397)Neil Alexander
2022-04-26Fix bug when uploading device signatures (#2377)Neil Alexander
* Find the complete key ID when uploading signatures
* Try that again
* Try splitting the right thing
* Don't do it for device keys
* Refactor `QuerySignatures`
* Revert "Refactor `QuerySignatures`"
  This reverts commit c02832a3e92569f64f180dec1555056dc8f8c3e3.
* Both requested key IDs and master/self/user keys
* Fix uniqueness
* Try tweaking GMSL
* Update GMSL again
* Revert "Update GMSL again"
  This reverts commit bd6916cc379dd8d9e3f38d979c6550bd658938aa.
* Revert "Try tweaking GMSL"
  This reverts commit 2a054524da9d64c6a2a5228262fbba5fde28798c.
* Database migrations
2022-04-25Only call key update process functions if there are updates, don't send things to ourselves over federationNeil Alexander
2022-04-22Fix retrieving cross-signing signatures in `/user/devices/{userId}` (#2368)Neil Alexander
* Fix retrieving cross-signing signatures in `/user/devices/{userId}`
  We need to know the target device IDs in order to get the signatures and we weren't populating those.
* Fix up signature retrieval
* Fix SQLite
* Always include the target's own signatures as well as the requesting user
2022-04-04Slower federation warm-up (#2320)Neil Alexander
* Wake destination queues gradually, rather than all at once
* Delay device list updates too
* Maximum two minute warmup period
2022-03-29Remove eduserver (#2306)S7evinK
* Move receipt sending to own JetStream producer
* Move SendToDevice to producer
* Remove most parts of the EDU server
* Fix SendToDevice & copyrights
* Move structs, cleanup EDU Server traces
* Use HeadersOnly subscription
* Missing file
* Fix linter issues
* Move consumers to own files
* Rename durable consumer; Consumer cleanup
* Docs/config cleanup