Age | Commit message | Author |
|
I choose to out of paranoia, but the algorithm defined in RFC3986
allows for them. So, we should rather remove the leading '..'
component and continue to handle the rest of the path.
Fixes https://github.com/omar-polo/gmid/issues/12
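
For illustration only, a minimal sketch of dropping leading '..' components and keeping the rest of the path. The helper name drop_leading_dotdot() is hypothetical and this is not the code from the commit; it only shows the idea on a relative, NUL-terminated path.

```c
#include <string.h>

/*
 * Hypothetical helper (not gmid's code): strip every leading "../"
 * component, or a lone "..", from a relative path in place and leave
 * the remainder untouched.
 */
static void
drop_leading_dotdot(char *path)
{
	char *p = path;

	for (;;) {
		if (p[0] == '.' && p[1] == '.' && p[2] == '/')
			p += 3;		/* skip "../" */
		else if (p[0] == '.' && p[1] == '.' && p[2] == '\0')
			p += 2;		/* the path ends with ".." */
		else
			break;
	}

	/* shift the remainder (terminator included) to the front */
	memmove(path, p, strlen(p) + 1);
}
```

A name that merely starts with two dots, such as "..foo", is left alone because the third character is neither '/' nor the terminator.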
|
Instead of doing multiple passes over the string, use a modified
version of canonpath() from kern_pledge.c that does it all in a
single pass.
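
A rough sketch of what a single-pass cleaner can look like. This is neither canonpath() from kern_pledge.c nor gmid's modified version; clean_path() is a hypothetical name, and the output convention (components joined by '/', no leading or trailing slash) is an assumption made for the example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical single-pass cleaner: drops empty and "." components
 * and resolves ".." against what has already been written.  Works in
 * place; returns 0 on success, -1 if ".." would climb above the root
 * (the buffer may be partially rewritten in that case).
 */
static int
clean_path(char *path)
{
	char	*r = path;	/* read cursor */
	char	*w = path;	/* write cursor, never ahead of r */
	size_t	 len;

	for (;;) {
		while (*r == '/')	/* skip runs of slashes */
			r++;
		if (*r == '\0')
			break;

		/* measure the next component */
		for (len = 0; r[len] != '\0' && r[len] != '/'; len++)
			;

		if (len == 1 && r[0] == '.') {
			/* "." adds nothing */
		} else if (len == 2 && r[0] == '.' && r[1] == '.') {
			if (w == path)
				return -1;	/* escaping the root */
			/* forget the last component written */
			while (w > path && *--w != '/')
				;
		} else {
			if (w != path)
				*w++ = '/';
			memmove(w, r, len);
			w += len;
		}
		r += len;
	}

	*w = '\0';
	return 0;
}
```

Every iteration either consumes at least one byte of input or stops at the terminator, so the loop cannot spin on a trailing ".." segment.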
|
continuation of 6130e0eeac9db4fa8e6fe5934ec2d0ab202f979e
|
It’s the decoded QUERY_STRING when the query is a search string (i.e.
not a key-value pair). It lets scripts avoid percent-decoding the
query string in the most common case, since key-value query strings
are uncommon in Gemini.
Idea from a discussion with Allen Sobot.
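
A sketch of the idea, under assumptions: the query counts as a search string when it contains no literal '=', and decoding happens in place. decode_search_string() and hexval() are hypothetical names, not gmid's API, and rejecting %00 is a choice made only for this example.

```c
#include <string.h>

/* value of a hex digit, or -1 if c is not one */
static int
hexval(unsigned char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical helper: returns 0 and leaves q untouched if it looks
 * like key-value pairs (contains '='), 1 after percent-decoding it in
 * place, or -1 on a malformed escape or an encoded NUL (q may be
 * partially rewritten in that case).
 */
static int
decode_search_string(char *q)
{
	char	*r, *w;
	int	 hi, lo;

	if (strchr(q, '=') != NULL)
		return 0;

	for (r = w = q; *r != '\0'; r++) {
		if (*r != '%') {
			*w++ = *r;
			continue;
		}
		if ((hi = hexval((unsigned char)r[1])) == -1)
			return -1;
		if ((lo = hexval((unsigned char)r[2])) == -1)
			return -1;
		if (hi == 0 && lo == 0)
			return -1;	/* refuse an embedded NUL */
		*w++ = (char)(hi * 16 + lo);
		r += 2;
	}
	*w = '\0';
	return 1;
}
```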
|
Spotted the hard way by cage
|
gmid would disallow the '@' and ':' characters in paths (unless
percent-encoded). Issue reported by freezr.
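
For reference, RFC3986 defines pchar as unreserved / pct-encoded / sub-delims / ":" / "@". A small sketch of the single-byte part of that rule follows; is_pchar() is a hypothetical name, and percent-encoded triplets and the non-ASCII characters allowed in IRIs are assumed to be handled elsewhere.

```c
#include <string.h>

/*
 * Hypothetical predicate: non-zero if c may appear literally in a
 * path segment according to the RFC3986 "pchar" rule.
 */
static int
is_pchar(unsigned char c)
{
	/* ALPHA / DIGIT */
	if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
	    (c >= '0' && c <= '9'))
		return 1;

	/* rest of unreserved, then sub-delims, then ':' and '@' */
	return c != '\0' && strchr("-._~" "!$&'()*+,;=" ":@", c) != NULL;
}
```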
|
makes more explicit which fields we're setting.
(and kill an extra empty line)
|
There's no difference, but bzero(3) says
STANDARDS
The bzero() function conforms to the X/Open System Interfaces option of
the IEEE Std 1003.1-2004 (“POSIX.1”) specification. It was removed from
the standard in IEEE Std 1003.1-2008 (“POSIX.1”), which recommends using
memset(3) instead.
so here we are.
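
The change is mechanical; a tiny sketch with a made-up struct (the names state and reset_state() are hypothetical and do not come from gmid):

```c
#include <stddef.h>
#include <string.h>

/* made-up struct, just something to clear */
struct state {
	char	buf[8];
	size_t	len;
};

static void
reset_state(struct state *s)
{
	/* was: bzero(s, sizeof(*s)); */
	memset(s, 0, sizeof(*s));
}
```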
|
I can't think of cases where we reach serialize_iri and path is NULL,
but let's stay on the safe side and initialize l. gcc 8 found this,
clang didn't.
|
Some specially crafted IRIs can cause a denial of service (DoS).
IRIs which have a trailing `..' segment and resolve to a valid IRI
(i.e. a .. that's not escaping the root directory) will make the
server process loop forever.
This is """just""" a DoS vulnerability: it doesn't expose anything
sensitive or give an attacker anything else.
|
Include gmid.h as the first header in every file, as it then includes
config.h (which defines _GNU_SOURCE, for instance).
Also fix a warning about unsigned vs signed const char pointers in
openssl.
|
again, to be RFC3986 compliant.
|
This parses gemini://example.com///foo into an IRI whose path is
"foo". I'm not 100% sure this is standard-compliant but:
1. it seems a logical consequence of the URI/IRI cleaning algo (where
we drop sequential slashes)
2. practically speaking, a sequence of forward slashes doesn't really
make sense when serving files, even in the case of CGI scripts
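
To make the effect concrete, a sketch that drops leading slashes and squeezes inner runs down to one. squeeze_slashes() is a hypothetical name and this is not the parser code itself.

```c
/*
 * Hypothetical helper: remove leading slashes and collapse every
 * other run of '/' into a single one, in place.
 */
static void
squeeze_slashes(char *path)
{
	char *r = path;		/* read cursor */
	char *w = path;		/* write cursor */

	while (*r == '/')	/* "///foo" starts at "foo" */
		r++;

	while (*r != '\0') {
		*w++ = *r;
		if (*r == '/') {
			while (*r == '/')	/* skip the whole run */
				r++;
		} else
			r++;
	}
	*w = '\0';
}
```

With this, the path part "///foo" of gemini://example.com///foo becomes "foo".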
|
RFC3986 3.2.2 "Host" says that
> Although host is case-insensitive, producers and normalizers should
> use lowercase for registered names and hexadecimal addresses for the
> sake of uniformity, while only using uppercase letters for
> percent-encodings.
so we cope with that.
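
A sketch of one way to normalize a host along those lines: lowercase everything except the two hex digits of a percent-encoding, which are uppercased. normalize_host() is a hypothetical name, and whether gmid also touches the percent-encodings is an assumption, not something this commit states.

```c
#include <ctype.h>

/*
 * Hypothetical helper: lowercase the host in place, but uppercase the
 * hex digits of every "%XX" triplet, as RFC3986 3.2.2 suggests.
 */
static void
normalize_host(char *host)
{
	unsigned char *p = (unsigned char *)host;

	while (*p != '\0') {
		if (p[0] == '%' && isxdigit(p[1]) && isxdigit(p[2])) {
			p[1] = (unsigned char)toupper(p[1]);
			p[2] = (unsigned char)toupper(p[2]);
			p += 3;
		} else {
			*p = (unsigned char)tolower(*p);
			p++;
		}
	}
}
```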
|
RFC3986 in section 3.1 "Scheme" says that
> Although schemes are case-insensitive, the canonical form is
> lowercase and documents that specify schemes must do so with
> lowercase letters. An implementation should accept uppercase
> letters as equivalent to lowercase in scheme names (e.g., allow
> "HTTP" as well as "http") for the sake of robustness but should only
> produce lowercase scheme names for consistency.
so we cope with that. The other possibility would have been to use
strcasecmp instead of strcmp when checking on the protocol, but since
the "case" version, although popular, is not part of any standard
AFAIK I prefer downcasing while parsing and be done with it.
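
A sketch of downcasing while parsing, so that a plain strcmp is enough afterwards. (For what it's worth, strcasecmp is specified by POSIX in <strings.h> but is not part of ISO C.) parse_scheme() and its signature are hypothetical, not gmid's parser; it follows the RFC3986 grammar scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) and assumes the C locale.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy the scheme of iri into buf, lowercased.
 * Returns a pointer just past the ':' on success, NULL if the scheme
 * is malformed or does not fit into buf.
 */
static const char *
parse_scheme(const char *iri, char *buf, size_t buflen)
{
	size_t		i;
	unsigned char	c;

	if (!isalpha((unsigned char)iri[0]))
		return NULL;

	for (i = 0; (c = (unsigned char)iri[i]) != ':'; i++) {
		/* this also rejects the terminator: no ':' found */
		if (!isalnum(c) && c != '+' && c != '-' && c != '.')
			return NULL;
		if (i + 1 >= buflen)
			return NULL;	/* keep room for the NUL */
		buf[i] = (char)tolower(c);
	}
	buf[i] = '\0';
	return iri + i + 1;
}
```

After parse_scheme("GEMINI://example.com/", buf, sizeof(buf)) the buffer holds "gemini", so strcmp(buf, "gemini") == 0 matches regardless of how the client spelled it.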
|