path: root/iri.c
Age  Commit message  Author
2024-06-10  iri: don't error on a '..' component at the start of the path  Omar Polo
I chose to error out of paranoia, but the algorithm defined in RFC3986 allows for them. So we should instead remove the leading '..' component and continue handling the rest of the path. Fixes https://github.com/omar-polo/gmid/issues/12
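The behaviour described above can be sketched as follows. This is a minimal illustration, not gmid's actual path_clean(): the name `clean_path_sketch` and the choice to drop trailing slashes are assumptions made for the example. It walks the path once, and a '..' with nothing left to remove is discarded rather than reported as an error.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, not gmid's path_clean(): normalize a relative
 * path in place in a single pass.  Empty and "." segments are dropped,
 * ".." removes the previously emitted segment, and a ".." with nothing
 * left to remove is silently discarded.  Trailing slashes are dropped
 * too, which is a simplification of this sketch.
 */
static void
clean_path_sketch(char *path)
{
	char *in = path, *out = path;

	assert(path != NULL);

	while (*in != '\0') {
		char *seg = in;
		size_t len;

		/* find the end of the current segment */
		while (*in != '\0' && *in != '/')
			in++;
		len = (size_t)(in - seg);
		if (*in == '/')
			in++;		/* skip the separator */

		if (len == 0 || (len == 1 && seg[0] == '.'))
			continue;	/* "//" or "./": nothing to emit */

		if (len == 2 && seg[0] == '.' && seg[1] == '.') {
			/* back up over the previously emitted segment */
			while (out > path && *--out != '/')
				;
			continue;
		}

		if (out > path)
			*out++ = '/';
		memmove(out, seg, len);	/* regions may overlap */
		out += len;
	}
	*out = '\0';
}
```

With this sketch, "../foo/bar" becomes "foo/bar" and "a/b/../c" becomes "a/c". The loop always advances `in`, so a trailing '..' cannot make it spin.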
2024-06-10  remove stale comment  Omar Polo
2024-06-10  refactor path_clean()  Omar Polo
Instead of doing multiple passes over the string, use a modified version of canonpath() from kern_pledge.c that does it all in a single pass.
2024-05-29  iri: add support for raw IPv6 addresses  Omar Polo
2022-11-29  more is*() unsigned char cast  Omar Polo
continuation of 6130e0eeac9db4fa8e6fe5934ec2d0ab202f979e
2022-11-27  add an implicit fastcgi parameter: GEMINI_SEARCH_STRING  Omar Polo
it’s the QUERY_STRING decoded if it’s a search-string (i.e. not a key-value pair). It’s useful for scripts to avoid percent-decoding the query string in the most common case of a query, because key-value pairs are not common in Gemini query strings. Idea from a discussion with Allen Sobot.
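A rough sketch of the idea follows. The function names are hypothetical, and so is the test for "search string" (here: the query contains no '='); the commit message does not say how gmid makes that decision.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Value of an ASCII hex digit, or -1 if c is not one. */
static int
hexval(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = tolower(c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/*
 * Hypothetical helper: if the query has no '=' (assumed here to mean
 * "search string"), percent-decode it in place and return 1.  Return 0
 * and leave it untouched if it looks like key=value pairs.  Return -1
 * on a malformed escape or an encoded NUL; the buffer content is then
 * unspecified.
 */
static int
search_string_sketch(char *q)
{
	char *in, *out;

	assert(q != NULL);

	if (strchr(q, '=') != NULL)
		return 0;

	for (in = out = q; *in != '\0'; in++) {
		if (*in == '%') {
			int hi = hexval((unsigned char)in[1]);
			/* only look at in[2] if in[1] was a hex digit */
			int lo = hi == -1 ? -1 : hexval((unsigned char)in[2]);

			if (hi == -1 || lo == -1 || (hi == 0 && lo == 0))
				return -1;
			*out++ = (char)(hi * 16 + lo);
			in += 2;
		} else
			*out++ = *in;
	}
	*out = '\0';
	return 1;
}
```

A script receiving the decoded value no longer has to handle "hello%20world" itself; it would see "hello world".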
2022-11-17  always cast is*() arguments to unsigned char  Omar Polo
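For context, a small illustration of why the cast matters. The helper name `count_alnum` is made up for this example. The <ctype.h> functions take an int that must be either EOF or representable as unsigned char, and plain char may be signed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the alphanumeric bytes in a string. */
static size_t
count_alnum(const char *s)
{
	size_t n = 0;

	assert(s != NULL);

	for (; *s != '\0'; s++) {
		/*
		 * Where char is signed, a byte >= 0x80 converts to a
		 * negative int, which is undefined behaviour for
		 * isalnum().  The cast keeps the value in 0..UCHAR_MAX.
		 */
		if (isalnum((unsigned char)*s))
			n++;
	}
	return n;
}
```

This matters for an IRI parser in particular, since IRIs routinely contain UTF-8 bytes above 0x7f.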
2022-07-04  copyright years  Omar Polo
2022-07-04  encode file names in the directory index  Omar Polo
Spotted the hard way by cage
2022-07-04  bugfix: allow @ and : in paths  Omar Polo
gmid would disallow the '@' and ':' characters in paths (unless percent-encoded.) Issue reported by freezr.
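For reference, RFC 3986 defines the characters allowed unescaped in a path segment (`pchar`) as the unreserved set, the sub-delims, ':' and '@'. A minimal ASCII-only check could look like the sketch below. The name `is_pchar_sketch` is hypothetical, percent-encoded triplets are not handled, and the extra non-ASCII characters that RFC 3987 permits in IRIs are left out.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: is c allowed unescaped in a path segment per
 * RFC 3986 (pchar, without the pct-encoded alternative)?  ASCII only.
 */
static int
is_pchar_sketch(int c)
{
	/* the set must stay non-empty for strchr() to work on */
	static const char extra[] = "-._~" "!$&'()*+,;=" ":@";

	assert(sizeof(extra) > 1);

	if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
	    (c >= '0' && c <= '9'))
		return 1;

	/* strchr() would also match the terminating NUL, so exclude it */
	return c != '\0' && strchr(extra, c) != NULL;
}
```

Note that ':' and '@' are accepted, while '/', '?' and '#' are not, since they delimit segments, the query and the fragment.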
2021-10-18  fmt  Omar Polo
2021-10-02  drop now unused trim_req_iri  Omar Polo
2021-09-24  change struct initialization  Omar Polo
makes more explicit which fields we're setting. (and kill an extra empty line)
2021-09-24  use memset(3) rather than bzero(3)  Omar Polo
There's no difference, but bzero(3) says
> STANDARDS
> The bzero() function conforms to the X/Open System Interfaces option of
> the IEEE Std 1003.1-2004 (“POSIX.1”) specification. It was removed from
> the standard in IEEE Std 1003.1-2008 (“POSIX.1”), which recommends
> using memset(3) instead.
so here we are.
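The replacement is mechanical. A minimal sketch, using a made-up `struct iri_sketch` rather than gmid's real structure:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a parsed IRI; not gmid's struct iri. */
struct iri_sketch {
	char	host[64];
	int	port;
};

static void
iri_sketch_reset(struct iri_sketch *iri)
{
	assert(iri != NULL);

	/* standard C equivalent of bzero(iri, sizeof(*iri)) */
	memset(iri, 0, sizeof(*iri));
}
```

Using `sizeof(*iri)` rather than naming the type keeps the call correct if the pointer's type ever changes.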
2021-07-07  style(9)-ify  Omar Polo
2021-06-16  make sure l is always initialized  Omar Polo
I can't think of cases where we reach serialize_iri and path is NULL, but let's stay on the safe side and initialize l. gcc 8 found this, clang didn't.
2021-04-12  fix IRI-parsing bug  Omar Polo
Some specially crafted IRIs can cause a denial of service (DoS). IRIs which have a trailing `..' segment and resolve to a valid IRI (i.e. a .. that's not escaping the root directory) will make the server process loop forever. This is """just""" a DoS vulnerability; it doesn't expose anything sensitive or give an attacker anything else.
2021-02-12  fix various compilation errors  Omar Polo
Include gmid.h as first header in every file, as it then includes config.h (that defines _GNU_SOURCE for instance). Fix also a warning about unsigned vs signed const char pointers in openssl.
2021-02-07  [cgi] split the query in words if needed and add them to the argv  Omar Polo
2021-02-06  [iri] accept also : and @  Omar Polo
again, to be RFC3986 compliant.
2021-02-05  don't %-decode the query  Omar Polo
2021-02-01  bring the CGI implementation in par with GLV-1.12556  Omar Polo
2021-01-31  ensure iri.host isn't NULL  Omar Polo
2021-01-29  accept a wider range of UNICODE codepoints while parsing hostnames  Omar Polo
2021-01-28  legibility: use p[n] instead of (*(p + n))  Omar Polo
2021-01-27  trim_req_iri: set error string  Omar Polo
2021-01-21  trim initial forward slashes  Omar Polo
this parses gemini://example.com///foo into an IRI whose path is "foo". I'm not 100% sure this is standard-compliant but:
1. it seems a logical consequence of the URI/IRI cleaning algo (where we drop sequential slashes)
2. practically speaking, serving a file for a sequence of forward slashes doesn't really make sense, even in the case of CGI scripts
2021-01-16  wording  Omar Polo
2021-01-15  check also that the port number matches  Omar Polo
2021-01-15  style  Omar Polo
2021-01-15  normalize host name when parsing the IRI  Omar Polo
RFC3986 3.2.2 "Host" says that
> Although host is case-insensitive, producers and normalizers should
> use lowercase for registered names and hexadecimal addresses for the
> sake of uniformity, while only using uppercase letters for
> percent-encodings.
so we cope with that.
2021-01-13  normalize schema when parsing the IRI  Omar Polo
RFC3986 in section 3.1 "Scheme" says that
> Although schemes are case-insensitive, the canonical form is
> lowercase and documents that specify schemes must do so with
> lowercase letters. An implementation should accept uppercase
> letters as equivalent to lowercase in scheme names (e.g., allow
> "HTTP" as well as "http") for the sake of robustness but should only
> produce lowercase scheme names for consistency.
so we cope with that. The other possibility would have been to use strcasecmp instead of strcmp when checking on the protocol, but since the "case" version, although popular, is not part of any standard AFAIK, I prefer to downcase while parsing and be done with it.
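The "downcase while parsing, then compare with strcmp" approach might look like the sketch below. Both function names are hypothetical, and the buffer is assumed to be writable and NUL-terminated.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: lowercase a NUL-terminated string in place. */
static void
downcase_sketch(char *s)
{
	assert(s != NULL);

	for (; *s != '\0'; s++)
		*s = (char)tolower((unsigned char)*s);
}

/*
 * Normalize the scheme once, then use a plain strcmp().  After this
 * call the buffer holds the canonical lowercase form, so later
 * comparisons and serialization need no case-insensitive logic.
 */
static int
is_gemini_scheme(char *scheme)
{
	downcase_sketch(scheme);
	return strcmp(scheme, "gemini") == 0;
}
```

The same in-place downcasing could serve for the host name, which RFC 3986 also asks normalizers to lowercase.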
2021-01-11  remove infinite loop  Omar Polo
2021-01-11  s/uri/iri since we accept IRIs  Omar Polo