path: root/syncapi
Age  Commit message  Author
2022-01-27  Roomserver/federation input refactor (#2104)  Neil Alexander
* Put federation client functions into their own file * Look for missing auth events in RS input * Remove retrieveMissingAuthEvents from federation API * Logging * Sorta transplanted the code over * Use event origin failing all else * Don't get stuck on mutexes: * Add verifier * Don't mark state events with zero snapshot NID as not existing * Check missing state if not an outlier before storing the event * Reject instead of soft-fail, don't copy roominfo so much * Use synchronous contexts, limit time to fetch missing events * Clean up some commented out bits * Simplify `/send` endpoint significantly * Submit async * Report errors on sending to RS input * Set max payload in NATS to 16MB * Tweak metrics * Add `workerForRoom` for tidiness * Try skipping unmarshalling errors for RespMissingEvents * Track missing prev events separately to avoid calculating state when not possible * Tweak logic around checking missing state * Care about state when checking missing prev events * Don't check missing state for create events * Try that again * Handle create events better * Send create room events as new * Use given event kind when sending auth/state events * Revert "Use given event kind when sending auth/state events" This reverts commit 089d64d271b5fca8c104e1554711187420dbebca. 
* Only search for missing prev events or state for new events * Tweaks * We only have missing prev if we don't supply state * Room version tweaks * Allow async inputs again * Apply backpressure to consumers/synchronous requests to hopefully stop things being overwhelmed * Set timeouts on roomserver input tasks (need to decide what timeout makes sense) * Use work queue policy, deliver all on restart * Reduce chance of duplicates being sent by NATS * Limit the number of servers we attempt to reduce backpressure * Some review comment fixes * Tidy up a couple things * Don't limit servers, randomise order using map * Some context refactoring * Update gmsl * Don't resend create events * Set stateIDs length correctly or else the roomserver thinks there are missing events when there aren't * Exclude our own servername * Try backing off servers * Make excluding self behaviour optional * Exclude self from g_m_e * Update sytest-whitelist * Update consumers for the roomserver output stream * Remember to send outliers for state returned from /gme * Make full HTTP tests less upsetti * Remove 'If a device list update goes missing, the server resyncs on the next one' from the sytest blacklist * Remove debugging test * Fix blacklist again, remove unnecessary duplicate context * Clearer contexts, don't use background in case there's something happening there * Don't queue up events more than once in memory * Correctly identify create events when checking for state * Fill in gaps again in /gme code * Remove `AuthEventIDs` from `InputRoomEvent` * Remove stray field Co-authored-by: Kegan Dougal <kegan@matrix.org>
2022-01-21  Remodel how device list change IDs are created (#2098)  kegsay
* Remodel how device list change IDs are created Previously we made them using the offset Kafka supplied. We don't run Kafka anymore, so now we make the SQL table assign the change ID via an AUTOINCREMENTing ID. Redesign the `keyserver_key_changes` table to have `UNIQUE(user_id)` so we don't accumulate key changes forevermore, we now have at most 1 row per user which contains the highest change ID. This needs a SQL migration. * Ensure we bump the change ID on sqlite * Actually read the DeviceChangeID not the Offset in syncapi * Add SQL migrations * Prepare after migration; fixup dendrite-upgrade-test logging * Use higher version numbers; fix sqlite query to increment better * Default 0 on postgres * fixup postgres migration on fresh dendrite instances
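The table redesign described in this commit can be illustrated with a small in-memory model. This is only a sketch of the idea (one row per user, holding the highest change ID from an auto-incrementing counter), not Dendrite's actual storage code; the names `keyChangeStore`, `bump` and `changedSince` are hypothetical and do not come from the repository.

```go
package main

import (
	"fmt"
	"sort"
)

// keyChangeStore is a hypothetical in-memory stand-in for the
// keyserver_key_changes table: at most one row per user, holding
// the highest change ID assigned to that user.
type keyChangeStore struct {
	nextID int64            // mimics an auto-incrementing ID column
	latest map[string]int64 // user_id -> highest change ID (UNIQUE(user_id))
}

func newKeyChangeStore() *keyChangeStore {
	return &keyChangeStore{latest: make(map[string]int64)}
}

// bump records a key change for userID and returns the new change ID.
// An existing row for the user is overwritten rather than appended,
// so the store never grows beyond one entry per user.
func (s *keyChangeStore) bump(userID string) int64 {
	s.nextID++
	s.latest[userID] = s.nextID
	return s.nextID
}

// changedSince returns, in sorted order, the users whose latest
// change ID is strictly greater than since.
func (s *keyChangeStore) changedSince(since int64) []string {
	var users []string
	for user, id := range s.latest {
		if id > since {
			users = append(users, user)
		}
	}
	sort.Strings(users)
	return users
}

func main() {
	s := newKeyChangeStore()
	s.bump("@alice:example.org")
	s.bump("@bob:example.org")
	s.bump("@alice:example.org") // replaces alice's earlier row

	fmt.Println("rows:", len(s.latest))
	fmt.Println("changed since 2:", s.changedSince(2))
}
```

A sync client that last saw change ID 2 would, under this model, be told only about users whose row now carries a higher ID, which is why keeping a single row per user is sufficient.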
2022-01-20  BREAKING: Remove Partitioned Stream Positions (#2096)  kegsay
* go mod tidy * Break complement to check it fails CI * Remove partitioned stream positions This was used by the device list stream position. The device list position now corresponds to the `Offset`, and the partition is always 0, in prep for removing reliance on Kafka topics for device list changes. * Linting * Migrate old style tokens to new style because element-web doesn't soft-logout on 4xx errors on /sync
2022-01-07  NATS JetStream tweaks (#2086)  Neil Alexander
* Use named NATS durable consumers * Build fixes * Remove dupe call to SetFederationAPI * Use namespaced consumer name * Fix namespacing * Fix unit tests hopefully
2022-01-05  Add NATS JetStream support (#1866)  S7evinK
* Add NATS JetStream support Update shopify/sarama * Fix addresses * Don't change Addresses in Defaults * Update saramajetstream * Add missing error check Keep typing events for at least one minute * Use all configured NATS addresses * Update saramajetstream * Try setting up with NATS * Make sure NATS uses own persistent directory (TODO: make this configurable) * Update go.mod/go.sum * Jetstream package * Various other refactoring * Build fixes * Config tweaks, make random jetstream storage path for CI * Disable interest policies * Try to sane default on jetstream base path * Try to use in-memory for CI * Restore storage/retention * Update nats.go dependency * Adapt changes to config * Remove unneeded TopicFor * Dep update * Revert "Remove unneeded TopicFor" This reverts commit f5a4e4a339b6f94ec215778dca22204adaa893d1. * Revert changes made to streams * Fix build problems * Update nats-server * Update go.mod/go.sum * Roomserver input API queuing using NATS * Fix topic naming * Prometheus metrics * More refactoring to remove saramajetstream * Add missing topic * Don't try to populate map that doesn't exist * Roomserver output topic * Update go.mod/go.sum * Message acknowledgements * Ack tweaks * Try to resume transaction re-sends * Try to resume transaction re-sends * Update to matrix-org/gomatrixserverlib@91dadfb * Remove internal.PartitionStorer from components that don't consume keychanges * Try to reduce re-allocations a bit in resolveConflictsV2 * Tweak delivery options on RS input * Publish send-to-device messages into correct JetStream subject * Async and sync roomserver input * Update dendrite-config.yaml * Remove roomserver tests for now (they need rewriting) * Remove roomserver test again (was merged back in) * Update documentation * Docker updates * More Docker updates * Update Docker readme again * Fix lint issues * Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset) * Don't report event rejection errors 
via `/send`, since apparently this is upsetting tests that don't expect that * Go 1.16 instead of Go 1.13 for upgrade tests and Complement * Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that" This reverts commit 368675283fc44501f227639811bdb16dd5deef8c. * Don't report any errors on `/send` to see what fun that creates * Fix panics on closed channel sends * Enforce state key matches sender * Do the same for leave * Various tweaks to make tests happier Squashed commit of the following: commit 13f9028e7a63662759ce7c55504a9d2423058668 Author: Neil Alexander <neilalexander@users.noreply.github.com> Date: Tue Jan 4 15:47:14 2022 +0000 Do the same for leave commit e6be7f05c349fafbdddfe818337a17a60c867be1 Author: Neil Alexander <neilalexander@users.noreply.github.com> Date: Tue Jan 4 15:33:42 2022 +0000 Enforce state key matches sender commit 85ede6d64bf10ce9b91cdd6d80f87350ee55242f Author: Neil Alexander <neilalexander@users.noreply.github.com> Date: Tue Jan 4 14:07:04 2022 +0000 Fix panics on closed channel sends commit 9755494a98bed62450f8001d8128e40481d27e15 Author: Neil Alexander <neilalexander@users.noreply.github.com> Date: Tue Jan 4 13:38:22 2022 +0000 Don't report any errors on `/send` to see what fun that creates commit 3bb4f87b5dd56882febb4db5621db484c8789b7c Author: Neil Alexander <neilalexander@users.noreply.github.com> Date: Tue Jan 4 13:00:26 2022 +0000 Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that" This reverts commit 368675283fc44501f227639811bdb16dd5deef8c. 
commit fe2673ed7be9559eaca134424e403a4faca100b0 Author: Neil Alexander <neilalexander@users.noreply.github.com> Date: Tue Jan 4 12:09:34 2022 +0000 Go 1.16 instead of Go 1.13 for upgrade tests and Complement commit 368675283fc44501f227639811bdb16dd5deef8c Author: Neil Alexander <neilalexander@users.noreply.github.com> Date: Tue Jan 4 11:51:45 2022 +0000 Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that commit b028dfc08577bcf52e6cb498026e15fa5d46d07c Author: Neil Alexander <neilalexander@users.noreply.github.com> Date: Tue Jan 4 10:29:08 2022 +0000 Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset) * Merge in NATS Server v2.6.6 and nats.go v1.13 into the in-process connection fork * Add `jetstream.WithJetStreamMessage` to make ack/nak-ing less messy, use process context in consumers * Fix consumer component name in federation API * Add comment explaining where streams are defined * Tweaks to roomserver input with comments * Finish that sentence that I apparently forgot to finish in INSTALL.md * Bump version number of config to 2 * Add comments around asynchronous sends to roomserver in processEventWithMissingState * More useful error message when the config version does not match * Set version in generate-config * Fix version in config.Defaults Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2021-12-03  Cherry-pick typing fix from #2061  Neil Alexander
Co-authored-by: Tommie Gannert <tommie@gannert.se>
2021-11-16  Guard in all key consumers  Neil Alexander
2021-11-03  Reduce CPU usage of SelectStateInRange (#2038)  Neil Alexander
2021-11-02  Run gofmt on dendrite - apply go 1.17 preferred build tags (#2021)  PiotrKozimor
2021-09-08  - Removed double imports (#1989)  Ryan W
- Lower cased error messages Signed-off-by: Ryan Whittington <twentybitdev@gmail.com> Co-authored-by: kegsay <kegan@matrix.org>
2021-08-18  Delete device keys/signatures from key server when deleting devices (#1979)  Neil Alexander
* Delete device keys/signatures from key server when deleting device from user API * Move loop to within database transaction * Don't fall over deleting no rows
2021-08-17  Cross-signing fixes, notifications via sync, federation (#1974)  Neil Alexander
* Initial work on signing key update EDUs * Fix build * Produce/consume EDUs * Producer logging * Only produce key change notifications for local users * Better naming * Try to notify sync * Enable feature * Use key change topic * Don't bother verifying signatures, validate key lengths if we can, notifier fixes * Copyright notices * Remove tests from whitelist until matrix-org/sytest#1117 * Some review comment fixes * Update to matrix-org/gomatrixserverlib@f9416ac * Remove unneeded parameter
2021-08-06  Cross-signing validation for self-sigs, expose signatures over `/user/keys/query` and `/user/devices/{userId}` (#1962)  Neil Alexander
* Enable unstable feature again * Try to verify when a device signs a key * Try to verify when a key signs a device * It's the self-signing key, not the master key * Fix error * Try to verify master key uploads * Actually we can't guarantee we can do that so nevermind * Add signatures into /devices/list request * Fix nil pointer * Reprioritise map creation * Don't skip devices that don't have signatures * Add some debug logging * Fix logic error in QuerySignatures * Fix bugs * Expose master and self-signing keys on /devices/list hopefully * maps are tedious * Expose signatures via /keys/query * Upload signatures when uploading keys * Fixes * Disable the feature again
2021-08-04  Cross-signing groundwork (#1953)  Neil Alexander
* Cross-signing groundwork * Update to matrix-org/gomatrixserverlib#274 * Fix gobind builds, which stops unit tests in CI from yelling * Some changes from review comments * Fix build by passing in UIA * Update to matrix-org/gomatrixserverlib@bec8d22 * Process master/self-signing keys from devices call * nolint * Enum-ify the key type in the database * Process self-signing key too * Fix sanity check in device list updater * Fix check * Fix sytest, hopefully * Fix build
2021-07-22  Don't set prev state when it is the same as the event it replaces (#1936)  Neil Alexander
2021-07-20  Only include go-sqlite3 on the relevant binaries (#1900)  Neil Alexander
* Only include go-sqlite3 on the relevant binaries * The driver name is always sqlite3 now * Update to matrix-org/go-sqlite3-js@e537baa
2021-07-20  Rename Riot to Element (#1874)  S7evinK
* s/riot/element/g Signed-off-by: Till Faelligen <tfaelligen@gmail.com> * fix formatting Co-authored-by: kegsay <kegan@matrix.org> Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2021-07-14  bugfix: retire invites even when we cannot talk to the remote server to make/send_leave (#1918)  kegsay
* bugfix: retire invites even when we cannot talk to the remote server to make/send_leave Also modify the leave response in /sync to include a fake event as this is ultimately what clients (and sytest) will use to determine leave-ness. * hash the event ID * Base64 not hex
2021-04-26  Don't return immediately when there's nothing to sync  Neil Alexander
2021-03-24  Add Sentry support (#1803)  Kegsay
* Add Sentry support * Use HTTP Sentry properly maybe * Capture panics * Log fed Sentry stuff correctly * British english linter
2021-03-03  Increase gocyclo complexity to 25 (and remove all but 2 golint directives related to it) (#1783)  Neil Alexander
2021-02-17  Don't exclude an event from sync if it was previously not excluded (#1767)  Neil Alexander
2021-02-04  Don't re-request state events that are already in the timeline (#1739)  Neil Alexander
* Don't request state events if we already have the timeline events (Postgres only) * Rename variable * nocyclo * Add SQLite * Tweaks * Revert query change * Don't dedupe if asking for full state * Update query
2021-02-04  Fix ON CONFLICT on sync API account data (#1745) (#1750)  Neil Alexander
2021-01-29  Complete sync performance (#1741)  Neil Alexander
* Parallelise PDU stream fetching for complete sync * Fixes * Fixes * Worker queue * Workers * Don't populate device list changes on complete sync * Don't fast-forward typing notifications either on complete sync * Revert "Don't fast-forward typing notifications either on complete sync" This reverts commit 01471f78431cdd840915111f71bd2b5176e584a8. * Comments
2021-01-26  Graceful shutdowns (#1734)  Neil Alexander
* Initial graceful stop * Fix dendritejs * Use process context for outbound federation requests in destination queues * Reduce logging * Fix log level
2021-01-22  Peeking over federation via MSC2444 (#1391)  Matthew Hodgson
* a very very WIP first cut of peeking via MSC2753. doesn't yet compile or work. needs to actually add the peeking block into the sync response. checking in now before it gets any bigger, and to gather any initial feedback on the vague shape of it. * make PeekingDeviceSet private * add server_name param * blind stab at adding a `peek` section to /sync * make it build * make it launch * add peeking to getResponseWithPDUsForCompleteSync * cancel any peeks when we join a room * spell out how to run outside of docker if you want speed * fix SQL * remove unnecessary txn for SelectPeeks * fix s/join/peek/ cargocult fail * HACK: Track goroutine IDs to determine when we write by the wrong thread To use: set `DENDRITE_TRACE_SQL=1` then grep for `unsafe` * Track partition offsets and only log unsafe for non-selects * Put redactions in the writer goroutine * Update filters on writer goroutine * wrap peek storage in goid hack * use exclusive writer, and MarkPeeksAsOld more efficiently * don't log ascii in binary at sql trace... * strip out empty room deltas * re-add txn to SelectPeeks * re-add accidentally deleted field * reject peeks for non-worldreadable rooms * move perform_peek * fix package * correctly refactor perform_peek * WIP of implementing MSC2444 * typo * Revert "Merge branch 'kegan/HACK-goid-sqlite-db-is-locked' into matthew/peeking" This reverts commit 3cebd8dbfbccdf82b7930b7b6eda92095ca6ef41, reversing changes made to ed4b3a58a7855acc43530693cc855b439edf9c7c. 
* (almost) make it build * clean up bad merge * support SendEventWithState with optional event * fix build & lint * fix build & lint * reinstate federated peeks in the roomserver (doh) * fix sql thinko * todo for authenticating state returned by /peek * support returning current state from QueryStateAndAuthChain * handle SS /peek * reimplement SS /peek to prod the RS to tell the FS about the peek * rename RemotePeeks as OutboundPeeks * rename remote_peeks_table as outbound_peeks_table * add perform_handle_remote_peek.go * flesh out federation doc * add inbound peeks table and hook it up * rename ambiguous RemotePeek as InboundPeek * rename FSAPI's PerformPeek as PerformOutboundPeek * setup inbound peeks db correctly * fix api.SendEventWithState with no event * track latestevent on /peek * go fmt * document the peek send stream race better * fix SendEventWithRewrite not to bail if handed a non-state event * add fixme * switch SS /peek to use SendEventWithRewrite * fix comment * use reverse topo ordering to find latest extrem * support postgres for federated peeking * go fmt * back out bogus go.mod change * Fix performOutboundPeekUsingServer * Fix getAuthChain -> GetAuthChain * Fix build issues * Fix build again * Fix getAuthChain -> GetAuthChain * Don't repeat outbound peeks for the same room ID to the same servers * Fix lint * Don't omitempty to appease sytest Co-authored-by: Kegan Dougal <kegan@matrix.org> Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2021-01-20  Add sync API memberships table (#1726)  Neil Alexander
2021-01-19  Basic sync filtering (#1721)  Neil Alexander
* Add some filtering (postgres only for now) * Fix build error * Try to use request filter * Use default filter as a template when retrieving from the database * Remove unused struct * Update sytest-whitelist * Add filtering to SelectEarlyEvents * Fix Postgres selectEarlyEvents query * Attempt filtering on SQLite * Test limit, set field for limit/order in prepareWithFilters * Remove debug logging, add comments * Tweaks, debug logging * Separate SQLite stream IDs * Fix filtering in current state table * Fix lock issues * More tweaks * Current state requires room ID * Review comments
2021-01-18  Log event ID on consumer errors (fixes #1714)  Neil Alexander
2021-01-13  Simplify send-to-device messaging (#1702)  Neil Alexander
* Simplify send-to-device messaging * Don't return error if there's no work to do * Remove SQLite migrations for now * Tweak Postgres migrations * Tweaks * Fixes * Cleanup separately * Fix SQLite migration
2021-01-13  Sync fixes (#1709)  Neil Alexander
* omitempty some fields in sync * Add a few more * Don't send push rules over and over again in incremental sync * Further tweaks
2021-01-13  Update /messages pagination token behaviour (#1708)  Neil Alexander
* Tweak pagination tokens * start should be the specified from * Don't reverse start and end * Tweak getStartEnd again * Update sytest-whitelist * NOTSPEC: Re-add iOS end of topology
2021-01-09  Tweak ApplyUpdates (#1691)  Neil Alexander
2021-01-08  Sync refactor, Part 1 (#1688)  Neil Alexander
* It's half-alive * Wakeups largely working * Other tweaks, typing works * Fix bugs, add receipt stream * Delete notifier, other tweaks * Dedupe a bit, add a template for the invite stream * Clean up, add templates for other streams * Don't leak channels * Bring forward some more PDU logic, clean up other places * Add some more wakeups * Use addRoomDeltaToResponse * Log tweaks, typing fixed? * Fix timed out syncs * Don't reset next batch position on timeout * Add account data stream/position * End of day * Fix complete sync for receipt, typing * Streams package * Clean up a bit * Complete sync send-to-device * Don't drop errors * More lightweight notifications * Fix typing positions * Don't advance position on remove again unless needed * Device list updates * Advance account data position * Use limit for incremental sync * Limit fixes, amongst other things * Remove some fmt.Println * Tweaks * Re-add notifier * Fix invite position * Fixes * Notify account data without advancing PDU position in notifier * Apply account data position * Get initial position for account data * Fix position update * Fix complete sync positions * Review comments @Kegsay * Room consumer parameters
2020-12-21  fix imports (#1665)  6543
* fix imports Signed-off-by: 6543 <6543@obermui.de> * add sqlite driver import back Signed-off-by: 6543 <6543@obermui.de> * rm import of userapi/storage/accounts/sqlite3/storage.go
2020-12-18  Ensure we wake for our own device list updates (#1661)  Neil Alexander
* Make sure we wake up for our own key changes * Whitelist 'Users receive device_list updates for their own devices'
2020-12-18  More sane next batch handling, typing notification tweaks, give invites their own stream position, device list fix (#1641)  Neil Alexander
* Update sync responses * Fix positions, add ApplyUpdates * Fix MarshalText as non-pointer, PrevBatch is optional * Increment by number of read receipts * Merge branch 'master' into neilalexander/devicelist * Tweak typing * Include keyserver position tweak * Fix typing next position in all cases * Tweaks * Fix typo * Tweaks, restore StreamingToken.MarshalText which somehow went missing? * Rely on positions from notifier rather than manually advancing them * Revert "Rely on positions from notifier rather than manually advancing them" This reverts commit 53112a62cc3bfd9989acab518e69eeb27938117a. * Give invites their own position, fix other things * Fix test * Fix invites maybe * Un-whitelist tests that look to be genuinely wrong * Use real receipt positions * Ensure send-to-device uses real positions too
2020-12-16  Add event ID index on current state table (helps performance) (#1649)  Neil Alexander
2020-12-16  Add start_stream to /messages (#1648)  Kegsay
2020-12-16  NOTSPEC: Make ?from= optional in /messages (#1647)  Kegsay
2020-12-16  Add prometheus metrics for destination queues, sync requests  Neil Alexander
Squashed commit of the following: commit 7ed1c6cfe67429dbe378a763d832c150eb0f781d Author: Neil Alexander <neilalexander@users.noreply.github.com> Date: Wed Dec 16 14:53:27 2020 +0000 Updates commit 8442099d08760b8d086e6d58f9f30284e378a2cd Author: Neil Alexander <neilalexander@users.noreply.github.com> Date: Wed Dec 16 14:43:18 2020 +0000 Add some sync statistics commit ffe2a11644ed3d5297d1775a680886c574143fdb Author: Neil Alexander <neilalexander@users.noreply.github.com> Date: Wed Dec 16 14:37:00 2020 +0000 Fix backing off display commit 27443a93855aa60a49806ecabbf9b09f818301bd Author: Neil Alexander <neilalexander@users.noreply.github.com> Date: Wed Dec 16 14:28:43 2020 +0000 Add some destination queue metrics
2020-12-15  De-map device list positions in streaming tokens (#1642)  Neil Alexander
* De-map device list positions in streaming tokens * Fix lint error * Tweak toOffset
2020-12-11  Give receipts their own stream ID in the database (#1631)  Neil Alexander
* Give read receipts their own database sequence * Give receipts their own stream ID * Change migration names * Reset sequences * Add max receipt queries, missing stream_id table entry for SQLite
2020-12-10  Refactor sync tokens (#1628)  Neil Alexander
* Refactor sync tokens * Comment out broken notifier test * Update types, sytest-whitelist * More robust token checking * Remove New functions for streaming tokens * Export Logs in StreamingToken * Fix tests
2020-12-09  Don't recalculate event ID so often in sync (#1624)  Neil Alexander
* Don't bail so quickly in fetchMissingStateEvents * Don't recalculate event IDs so often in sync API * Add comments * Fix comments * Update to matrix-org/gomatrixserverlib@eb6a890
2020-12-03  Peeking updates (#1607)  Neil Alexander
* Add unpeek * Don't allow peeks into encrypted rooms * Fix send tests * Update consumers
2020-12-02  Top-level setup package (#1605)  Neil Alexander
* Move config, setup, mscs into "setup" top-level folder * oops, forgot the EDU server * Add setup * goimports
2020-12-02  Send client events to appservices (#1603)  Neil Alexander
* Send client events to appservices * FormatSync instead of FormatAll
2020-12-01  syncapi/requestpool: fix initial sync logic error in appendAccountData() (#1594)  Ariadne Conill
* requestpool: fix initial sync logic error in appendAccountData() In initial sync, req.since is no longer nil, but instead, req.since.PDUPosition() and req.since.EDUPosition() returns 0. This ensures forgotten rooms do not come back as zombies. * syncapi/requestpool: reintroduce req.since == nil check