Age  Commit message  Author
2022-02-04  Fix CPU spin from key change consumer when an invalid message is supplied (#2146)  Neil Alexander
2022-02-04  Version 0.6.1 (#2145)  v0.6.1  Neil Alexander
2022-02-04  Remove sarama/saramajetstream dependencies (#2138)  S7evinK
* Remove dependency on saramajetstream & sarama Signed-off-by: Till Faelligen <tfaelligen@gmail.com> * Remove internal.ContinualConsumer from federationapi * Remove internal.ContinualConsumer from syncapi * Remove internal.ContinualConsumer from keyserver * Move to new Prepare function * Remove saramajetstream & sarama dependency * Delete unneeded file * Remove duplicate import * Log error instead of silently ignoring it * Move `OffsetNewest` and `OffsetOldest` into keyserver types, change them to be more sane values * Fix comments Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-02-04  Remove roomserver input deadlines (#2144)  Neil Alexander
It isn't really clear that the deadlines actually help in any way. Currently we can use up our 2 minutes doing something, run out of context time and then return an error, which causes the transaction to roll back and forget everything we've done. If the message came to us from NATS then we will probably end up retrying, only to land in the same situation. We'd be a lot better off if we just spent the time reconciling the problem in the first place, and then we're much less likely to need to fetch those missing auth or prev events in the future. Also includes matrix-org/gomatrixserverlib#287 so we don't wait so long for servers that are obviously dead.
2022-02-04  Full roomserver input transactional isolation (#2141)  Neil Alexander
* Add transaction to all database tables in roomserver, rename latest events updater to room updater, use room updater for all RS input * Better transaction management * Tweak order * Handle cases where the room does not exist * Other fixes * More tweaks * Fill some gaps * Fill in the gaps * good lord it gets worse * Don't roll back transactions when events rejected * Pass through errors properly * Fix bugs * Fix incorrect error check * Don't panic on nil txns * Tweaks * Hopefully fix panics for good in SQLite this time * Fix rollback * Minor bug fixes with latest event updater * Some review comments * Revert "Some review comments" This reverts commit 0caf8cf53e62c33f7b83c52e9df1d963871f751e. * Fix a couple of bugs * Clearer commit and rollback results * Remove unnecessary prepares
2022-02-02  Fix panic from closing the input channel before the workers complete (it'll get GC'd either way)  Neil Alexander
2022-02-02  Use background contexts during federated join for clarity (#2134)  Neil Alexander
* Use background contexts for clarity * Don't wait for the context to expire before trying to return * Actually we don't really need a goroutine here
2022-02-02  Use pull consumers (#2140)  Neil Alexander
* Pull consumers * Pull consumers * Only nuke consumers if they are push consumers * Clean up old consumers * Better error handling * Update comments
2022-02-02  PerformInvite: bugfix and rejig control flow (#2137)  kegsay
* PerformInvite: bugfix and rejig control flow Local clients would not be notified of invites to rooms Dendrite had already joined in all cases due to not returning an `api.OutputNewInviteEvent` for local invites. We now do this. This was an easy mistake to make due to the control flow of the function which doesn't handle the happy case at the end of the function and instead forks the function depending on if the invite was via federation or not. This has now been changed to handle the federated invite as if it were an error (in that we check it, do it and bail out) rather than outstay our welcome. This ends up with the local invite being the happy case, which now both sends an `InputRoomEvent` to the roomserver _and_ a `api.OutputNewInviteEvent` is returned. * Don't send invite pokes in PerformInvite * Move event ID into logger
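The control flow this commit describes (handle the federated case early and return, leaving the local invite as the final happy path) can be sketched roughly as follows. The types and field names are simplified, hypothetical stand-ins, not Dendrite's real `PerformInvite` signature or API types.

```go
package main

// invite is a hypothetical, simplified invite request.
type invite struct {
	eventID       string
	viaFederation bool // assumption: true when the invite goes via federation
}

// result is a hypothetical summary of what performInvite did.
type result struct {
	sentToFederation bool
	sentToRoomserver bool
	outputInvites    []string // event IDs that local clients are told about
}

// performInvite treats the federated invite like an error branch: check it,
// do it and bail out. Whatever falls through is the local invite.
func performInvite(inv invite) result {
	var res result
	if inv.viaFederation {
		res.sentToFederation = true
		return res
	}
	// Happy path at the end of the function: send the event to the
	// roomserver and also return an output invite so that local clients
	// are notified.
	res.sentToRoomserver = true
	res.outputInvites = append(res.outputInvites, inv.eventID)
	return res
}
```

Keeping the happy path last makes it harder to forget the final step, which is the kind of omission the commit message attributes to the earlier forked structure.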
2022-02-01  Support CA certificates in CI (#2136)  kegsay
* Support CA setting in generate-keys * Set DNS names correctly * Use generate-config -server not sed
2022-02-01  Fix JetStream paths for P2P demo builds  Neil Alexander
2022-01-31  More logging tweaks  Neil Alexander
2022-01-31  Improve roomserver logging  Neil Alexander
2022-01-31  Roomserver fixes (#2133)  Neil Alexander
* Improve server selection somewhat * Remove things from the map when we're done * Be less panicky about auth event signatures in case they are not fatal after all * Accept HasState in all cases * Send join asynchronously * Revert "Send join asynchronously" This reverts commit 5b685bfcd0b1150a66c7b1e70fb3a3eda509efd1. * Joins and leaves use background context
2022-01-31  Update to matrix-org/gomatrixserverlib#286  Neil Alexander
2022-01-31  Allow uppercase username on login (#2126)  Hoernschen
* ADD jetstream folder to gitignore * CHANGE login to check on uppercase if lowercase not exists Co-authored-by: kegsay <kegan@matrix.org>
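A minimal sketch of the lookup order suggested by "check on uppercase if lowercase not exists": try the lowercased localpart first, then fall back to the name exactly as typed. The `accounts` map and `resolveLocalpart` helper are hypothetical and stand in for whatever account lookup the login path really uses.

```go
package main

import "strings"

// resolveLocalpart returns the stored localpart that matches what the user
// typed. It prefers the lowercase form and only falls back to the original
// casing when no lowercase account exists.
func resolveLocalpart(accounts map[string]bool, typed string) (string, bool) {
	lower := strings.ToLower(typed)
	if accounts[lower] {
		return lower, true
	}
	if accounts[typed] {
		return typed, true
	}
	return "", false
}
```

With accounts `alice` and `Bob` registered, typing `Alice` resolves to `alice`, while typing `Bob` resolves to the mixed-case account `Bob`.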
2022-01-31  Tweak roomserver logging for rejected events  Neil Alexander
2022-01-31  Revert Prometheus client upgrades altogether  Neil Alexander
2022-01-31  Update prometheus client  Neil Alexander
2022-01-31  Update to matrix-org/gomatrixserverlib@801c51af9f29e3630c8d83b0772c7ba52c0d8908  Neil Alexander
2022-01-31  Tweak some logging (#2130)  Neil Alexander
* Modify some log levels * Update gomatrixserverlib to matrix-org/gomatrixserverlib@336334f * Update gomatrixserverlib to matrix-org/gomatrixserverlib@cde7ac8 * Demote warning about key change producer * Add more useful roomserver logging * Further tweaking
2022-01-31  Revert consumer change  Neil Alexander
2022-01-31  Only limit context for fetching missing auth/prev events (#2131)  Neil Alexander
2022-01-28  Update Sarama to fix 32-bit builds (#2120)  v0.6.0  Neil Alexander
2022-01-28  Require Go 1.16 (#2122)  Neil Alexander
2022-01-28  Version 0.6 (#2117)  v0.6  Neil Alexander
* Bump version, release notes * Update changelog * Update changelog
2022-01-28  Call hooks for outliers (#2119)  Neil Alexander
* Move hook call when processing room events * Fix build * Call hooks for outliers too
2022-01-28  Move hook call when processing room events (#2118)  Neil Alexander
* Move hook call when processing room events * Fix build
2022-01-28  Add debug logging for incoming CSAPI calls on authentication failure (#2116)  kegsay
* Add debug logging for incoming CSAPI calls on authentication failure Will help to debug Complement failures, and just generally useful. * Update httpapi.go Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-01-28  "Enable" remote room search (#2099)  S7evinK
* "Enable" remote room search Signed-off-by: Till Faelligen <tfaelligen@gmail.com> * Update go.mod * Fix formatting
2022-01-28  Don't flood Sentry with context cancelled/deadline exceeded errors (#2115)  Neil Alexander
2022-01-28  Upgrade dependencies (#2074)  Neil Alexander
* Upgrade dependencies * Revert gjson/sjson due to panics * Revert sarama as it requires Go 1.16 * Revert quic-go as it requires Go 1.16 * Revert sarama again
2022-01-28  Update gomatrixserverlib  Neil Alexander
2022-01-27  Try federation when backfill fails to find events in the database (#2113)  Neil Alexander
* Try to backfill via federation in error cases * Cleaner retry for backfill * Simpler condition
2022-01-27  Roomserver/federation input refactor (#2104)  Neil Alexander
* Put federation client functions into their own file * Look for missing auth events in RS input * Remove retrieveMissingAuthEvents from federation API * Logging * Sorta transplanted the code over * Use event origin failing all else * Don't get stuck on mutexes: * Add verifier * Don't mark state events with zero snapshot NID as not existing * Check missing state if not an outlier before storing the event * Reject instead of soft-fail, don't copy roominfo so much * Use synchronous contexts, limit time to fetch missing events * Clean up some commented out bits * Simplify `/send` endpoint significantly * Submit async * Report errors on sending to RS input * Set max payload in NATS to 16MB * Tweak metrics * Add `workerForRoom` for tidiness * Try skipping unmarshalling errors for RespMissingEvents * Track missing prev events separately to avoid calculating state when not possible * Tweak logic around checking missing state * Care about state when checking missing prev events * Don't check missing state for create events * Try that again * Handle create events better * Send create room events as new * Use given event kind when sending auth/state events * Revert "Use given event kind when sending auth/state events" This reverts commit 089d64d271b5fca8c104e1554711187420dbebca. 
* Only search for missing prev events or state for new events * Tweaks * We only have missing prev if we don't supply state * Room version tweaks * Allow async inputs again * Apply backpressure to consumers/synchronous requests to hopefully stop things being overwhelmed * Set timeouts on roomserver input tasks (need to decide what timeout makes sense) * Use work queue policy, deliver all on restart * Reduce chance of duplicates being sent by NATS * Limit the number of servers we attempt to reduce backpressure * Some review comment fixes * Tidy up a couple things * Don't limit servers, randomise order using map * Some context refactoring * Update gmsl * Don't resend create events * Set stateIDs length correctly or else the roomserver thinks there are missing events when there aren't * Exclude our own servername * Try backing off servers * Make excluding self behaviour optional * Exclude self from g_m_e * Update sytest-whitelist * Update consumers for the roomserver output stream * Remember to send outliers for state returned from /gme * Make full HTTP tests less upsetti * Remove 'If a device list update goes missing, the server resyncs on the next one' from the sytest blacklist * Remove debugging test * Fix blacklist again, remove unnecessary duplicate context * Clearer contexts, don't use background in case there's something happening there * Don't queue up events more than once in memory * Correctly identify create events when checking for state * Fill in gaps again in /gme code * Remove `AuthEventIDs` from `InputRoomEvent` * Remove stray field Co-authored-by: Kegan Dougal <kegan@matrix.org>
2022-01-26  Use std logging when running under CI  Kegan Dougal
2022-01-25  Exclude our own server name in `GetJoinedHostsForRooms` (#2110)  Neil Alexander
* Exclude our own servername * Make excluding self behaviour optional
2022-01-25  Increase maximum message size to 16MB (#2109)  Neil Alexander
2022-01-24  Add Complement to GHA (#2108)  kegsay
* Add Complement to GHA * Only run on push on master
2022-01-24  Update bridge FAQ & README (#2106)  S7evinK
* Update bridge FAQ Signed-off-by: Till Faelligen <tfaelligen@gmail.com> * Update README
2022-01-24  Update to matrix-org/gomatrixserverlib@f3e2ef8 (matrix-org/matrix-doc#3667)  Neil Alexander
2022-01-21  Expand issue template (#2103)  kegsay
2022-01-21  Add `Forward extremities remain so even after the next events are populated as outliers` to `sytest-whitelist`  Neil Alexander
2022-01-21  Document log levels (#2101)  kegsay
2022-01-21  Update monolith-sample.conf (#2087)  FORCHA
* Update monolith-sample.conf: replaced undefined monolith value with server_name (my.hostname.com) value, in reference to this issue: https://github.com/matrix-org/dendrite/issues/2078 * Update monolith-sample.conf: changed IP to location of monolith server Co-authored-by: kegsay <kegan@matrix.org>
2022-01-21  Fix #2027 by gracefully handling stub rooms (#2100)  kegsay
The server ACL code on startup will grab all known rooms from the rooms_table and then call `GetStateEvent` with each found room ID to find the server ACL event. This can fail for stub rooms, which will be present in the rooms table. Previously this would result in an error being returned and the server failing to start (!). Now we just return no event for stub rooms.
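The change described above can be sketched in Go as follows. The `room` type, the in-memory `rooms` map and both functions are hypothetical simplifications, not Dendrite's storage layer; only the event type `m.room.server_acl` comes from the Matrix specification. The point is that a stub room yields "no event" rather than an error, so the startup loop no longer aborts.

```go
package main

import "errors"

// room is a hypothetical in-memory room record. A stub room has an entry
// in the rooms table but no usable state.
type room struct {
	isStub bool
	state  map[string]string // event type -> event content
}

var errRoomNotFound = errors.New("room not found")

// getStateEvent returns nil, nil for stub rooms and for rooms without the
// requested event, and only returns an error for unknown room IDs.
func getStateEvent(rooms map[string]room, roomID, evType string) (*string, error) {
	r, ok := rooms[roomID]
	if !ok {
		return nil, errRoomNotFound
	}
	if r.isStub {
		return nil, nil // no event, but not a failure
	}
	if ev, ok := r.state[evType]; ok {
		return &ev, nil
	}
	return nil, nil
}

// loadServerACLs mimics the startup scan over every known room.
func loadServerACLs(rooms map[string]room) (map[string]string, error) {
	acls := make(map[string]string)
	for roomID := range rooms {
		ev, err := getStateEvent(rooms, roomID, "m.room.server_acl")
		if err != nil {
			return nil, err
		}
		if ev != nil {
			acls[roomID] = *ev
		}
	}
	return acls, nil
}
```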
2022-01-21  Remodel how device list change IDs are created (#2098)  kegsay
* Remodel how device list change IDs are created Previously we made them using the offset Kafka supplied. We don't run Kafka anymore, so now we make the SQL table assign the change ID via an AUTOINCREMENTing ID. Redesign the `keyserver_key_changes` table to have `UNIQUE(user_id)` so we don't accumulate key changes forevermore; we now have at most 1 row per user which contains the highest change ID. This needs a SQL migration. * Ensure we bump the change ID on sqlite * Actually read the DeviceChangeID not the Offset in syncapi * Add SQL migrations * Prepare after migration; fixup dendrite-upgrade-test logging * Use higher version numbers; fix sqlite query to increment better * Default 0 on postgres * fixup postgres migration on fresh dendrite instances
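The table semantics described above (a monotonically increasing change ID, with at most one row per user holding that user's highest ID) can be modelled in memory as a rough sketch. The `keyChanges` type and its methods are hypothetical and only imitate the behaviour of the redesigned `keyserver_key_changes` table; they are not the actual SQL or Go storage code.

```go
package main

import "sort"

// keyChanges imitates a table with an auto-incrementing change ID and a
// UNIQUE(user_id) constraint: one entry per user, holding the latest ID.
type keyChanges struct {
	nextID int64
	latest map[string]int64 // user ID -> highest change ID
}

func newKeyChanges() *keyChanges {
	return &keyChanges{latest: make(map[string]int64)}
}

// recordChange behaves like an upsert: it assigns a fresh change ID and
// overwrites any previous entry for the same user.
func (k *keyChanges) recordChange(userID string) int64 {
	k.nextID++
	k.latest[userID] = k.nextID
	return k.nextID
}

// changedSince returns, in sorted order, the users whose latest change ID
// is greater than the given position.
func (k *keyChanges) changedSince(since int64) []string {
	var users []string
	for userID, id := range k.latest {
		if id > since {
			users = append(users, userID)
		}
	}
	sort.Strings(users)
	return users
}
```

Recording changes for the same user repeatedly never grows the map beyond one entry for that user, which mirrors the "don't accumulate key changes" goal.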
2022-01-20  BREAKING: Remove Partitioned Stream Positions (#2096)  kegsay
* go mod tidy * Break complement to check it fails CI * Remove partitioned stream positions This was used by the device list stream position. The device list position now corresponds to the `Offset`, and the partition is always 0, in prep for removing reliance on Kafka topics for device list changes. * Linting * Migrate old style tokens to new style because element-web doesn't soft-logout on 4xx errors on /sync
2022-01-07  NATS JetStream tweaks (#2086)  Neil Alexander
* Use named NATS durable consumers * Build fixes * Remove dupe call to SetFederationAPI * Use namespaced consumer name * Fix namespacing * Fix unit tests hopefully
2022-01-07  Fix panic at startup if roomserver was not given federation API reference by the time NATS consumes an event, tweak backpressure metrics  Neil Alexander