# gmid

gmid is a Gemini server written with security in mind.  I initially
wrote it to serve static files, but it has grown into a featureful
server that can be used either from the command line to serve local
directories

    gmid docs  # serve the directory docs over gemini

or as a traditional daemon

    gmid -c /etc/gmid.conf


## Features

(random order)

 - IRI support (RFC3987)
 - punycode support
 - dual stack (IPv4 and IPv6)
 - automatic certificate generation for config-less mode
 - CGI scripts
 - (very) low memory footprint
 - small codebase, easily hackable
 - virtual hosts
 - per-location rules
 - optional directory listings
 - configurable mime types
 - sandboxed by default on OpenBSD, Linux and FreeBSD
 - chroot support


## Drawbacks

 - not suited for very busy hosts.  If you receive a high number of
   connections per second, you'd probably want to run multiple gmid
   instances behind relayd/haproxy, or use a different server.


## Internationalisation (IRIs, UNICODE, punycode, all that stuff)

Even though the current Gemini specification doesn't mention anything
in this regard, I do think these are important things, so I tried
to implement them in the most user-friendly way I could think of.

For starters, gmid has full support for IRI (RFC3987 —
Internationalized Resource Identifiers).  IRIs are a superset of URIs,
so there aren't incompatibilities with URI-only clients.
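To illustrate why URI-only clients are unaffected: RFC 3987 (section
3.1) maps an IRI to a URI by percent-encoding the UTF-8 bytes of each
non-ASCII character.  The Python sketch below shows that mapping for a
path.  The helper name `iri_path_to_uri_path` and the sample path are
made up for the example, and this is not gmid's (C) implementation.

```python
from urllib.parse import quote, unquote


def iri_path_to_uri_path(path: str) -> str:
    """Hypothetical helper: map an IRI path to its URI form.

    quote() encodes as UTF-8 by default and leaves ASCII letters,
    digits, '_.-~' and (here) '/' untouched.
    """
    return quote(path, safe="/")


if __name__ == "__main__":
    iri_path = "/caff\u00e8.gmi"          # "/caffè.gmi"
    uri_path = iri_path_to_uri_path(iri_path)
    print(uri_path)                       # /caff%C3%A8.gmi
    print(unquote(uri_path) == iri_path)  # True
```

A plain ASCII path comes out unchanged, which is the sense in which
IRIs are a superset of URIs.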

There is also full support for punycode.  In theory, the user doesn't
even need to know that punycode is a thing.  The hostname in the
configuration file can (and must) be in the decoded form (e.g. `naïve`
and not `xn--nave-6pa`); gmid will do the rest.
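As a rough illustration of the mapping between the two forms, the
Python sketch below converts a hostname with Python's built-in `idna`
codec.  It only demonstrates the encoding and says nothing about how
gmid implements it; the function names and the `.example` domain are
placeholders.

```python
def to_ascii(hostname: str) -> str:
    """Decoded hostname -> ASCII (punycode) form, label by label."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname: str) -> str:
    """ASCII (punycode) hostname -> decoded form."""
    return hostname.encode("ascii").decode("idna")


if __name__ == "__main__":
    print(to_ascii("na\u00efve.example"))      # xn--nave-6pa.example
    print(to_unicode("xn--nave-6pa.example"))  # naïve.example
```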

The only missing piece is UNICODE normalisation of the IRI path: gmid
doesn't do that (yet).
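To see why normalisation matters, note that the same visible text can
be spelled with different code points.  The Python sketch below
(illustration only, with a made-up file name) compares the precomposed
(NFC) and decomposed (NFD) spellings of `è`.  They render identically
but percent-encode to different byte sequences, so a server that
compares paths byte by byte would treat them as two different paths.

```python
import unicodedata
from urllib.parse import quote

name = "caff\u00e8.gmi"                   # 'è' as one code point (U+00E8)
nfc = unicodedata.normalize("NFC", name)
nfd = unicodedata.normalize("NFD", name)  # 'e' followed by U+0300

print(nfc == nfd)   # False
print(quote(nfc))   # caff%C3%A8.gmi
print(quote(nfd))   # caffe%CC%80.gmi

# Normalising both spellings to the same form makes them compare equal.
print(unicodedata.normalize("NFC", nfd) == nfc)  # True
```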


## Building

gmid depends on a POSIX libc, OpenSSL/LibreSSL and libtls (provided
either by LibreSSL or libretls).  At build time, flex and yacc (or GNU
bison) are also needed.

The build is as simple as

    ./configure
    make

If the configure script fails to pick up something, please open an
issue or notify me via email.

To install, execute:

    make install

### Docker

If you have trouble installing LibreSSL or libretls, you can use
Docker to build a `gmid` image with:

    docker build -t gmid .

and then run it with something along the lines of

    docker run --rm -it -p 1965:1965 \
        -v /path/to/gmid.conf:...:ro \
        -v /path/to/docs:/var/gemini \
        gmid -c .../gmid.conf

(ellipses used for brevity)

### Local libretls

This is **NOT** recommended; please try to port LibreSSL/LibreTLS to
your distribution of choice or use Docker instead.

However, it's possible to statically link `gmid` to a locally installed
libretls quite easily.  (It's how I test gmid on Fedora, for instance.)

Let's say you have compiled and installed libretls in `$LIBRETLS`,
then you can build `gmid` with

    ./configure CFLAGS="-I$LIBRETLS/include" \
                LDFLAGS="$LIBRETLS/lib/libtls.a -lssl -lcrypto -lpthread"
    make

### Testing

Execute

    make regress

to start the suite.  Keep in mind that the regression tests will
create files inside the `regress` directory and bind the 10965 port.


## Architecture/Security considerations

gmid is composed of two processes: a listener and an executor.  The
listener process is the only one that needs internet access and is
sandboxed.  When a CGI script needs to be executed, the executor
(outside of the sandbox) sets up a pipe and gives one end to the
listener, while the other is bound to the CGI script's standard output.
This way, it is still possible to execute CGI scripts without
restrictions even in the presence of a sandboxed network process.
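
The pipe hand-off can be sketched in a few lines of Python (3.9 or
later, Unix only).  This is a simplified illustration, not gmid's C
code: both roles run in one process for brevity, the function names are
invented, and passing the descriptor over a Unix socket is an
assumption, although it is consistent with the `sendfd` and `recvfd`
pledges used on OpenBSD.

```python
import os
import socket
import subprocess
import sys


def executor_spawn(chan: socket.socket, argv: list) -> subprocess.Popen:
    """Executor role: start the script and hand the pipe's read end over."""
    read_end, write_end = os.pipe()
    # The script's standard output is bound to the write end of the pipe.
    proc = subprocess.Popen(argv, stdout=write_end)
    os.close(write_end)  # the child holds its own copy
    # Pass the read end over the Unix socket (SCM_RIGHTS under the hood).
    socket.send_fds(chan, [b"fd"], [read_end])
    os.close(read_end)   # the copy in flight stays valid
    return proc


def listener_collect(chan: socket.socket) -> bytes:
    """Listener role: receive the descriptor and read the script's output."""
    _msg, fds, _flags, _addr = socket.recv_fds(chan, 16, 1)
    with os.fdopen(fds[0], "rb") as script_output:
        return script_output.read()  # until the script closes its stdout


if __name__ == "__main__":
    executor_chan, listener_chan = socket.socketpair()
    script = [sys.executable, "-c", "print('hello from the script')"]
    proc = executor_spawn(executor_chan, script)
    print(listener_collect(listener_chan))
    proc.wait()
```

The point of the design is that the listener never calls `exec` itself:
it only reads from a descriptor it was given, so its sandbox can stay
tight while the script runs unrestricted elsewhere.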

On OpenBSD, the listener runs with the `stdio recvfd rpath inet`
pledges, while the executor has `stdio sendfd proc exec`; both have
unveiled only the served directories.

On FreeBSD, the listener process is sandboxed with `capsicum(4)`.

On Linux, a `seccomp(2)` filter is installed in the listener to allow
only certain syscalls; see [sandbox.c](sandbox.c) for more information
on the BPF program.

In any case, you are invited to run gmid inside some sort of
container/jail/chroot.