Before delving in here, do note this is largely guidelines. I'm known to not
follow my own code standards. This is largely just to help keep anyone who wants
to help out on the same page. Ultimately, if you need to break a "rule", break
it. Just make sure you know what you're breaking and why.

Intellectual property
=====================

Regarding proprietary programs:

Don't reference them. We don't need to get in trouble.

Regarding free programs:

Be careful referencing them, especially if they use a GPL-style license. I
don't need that fucking licensing nightmare getting dragged into this. If it
uses a 0BSD-style license, you can freely reference it. I just ask that you
properly credit the original. Also, try not to straight up copy it. I'm going
for original code here.

Trademarks:

Don't bother with acknowledgements. Just make sure it's written in a way to
not be mistaken for our own trademark.
|
|
|
|
Program design
==============

Which language:

C. We use C around here. It's honestly the only language I'm still comfortable
with. I'm out of practice with anything else. Though, for what we do here, C is
kinda the most suitable language we have anyways.

In particular, though, of what I'm even familiar with, I don't like
object-oriented programming, so no C++ or Java. Java's too bloated anyways.
Same for Clojure. FORTRAN's not really useful here, though I do want to have
a library for FORTRAN. And Prolog is really only useful as a proof language.
So, really, C is the only thing I want to use (except where we need asm).

Compatibility:

We do compatibility here. Currently, we're working off POSIX/the Single UNIX
Specification, the ISO C standard, and the Filesystem Hierarchy Standard.
Everything implemented should *not* conflict with these standards. We can add
things here and there, where needed or where it might be fun. But anything added
should either be something not specified by the standard (like the init system)
or something that is otherwise optional, e.g. a game.

Non-standard features:

Don't use them. For instance, don't use any GNU C extensions. You should be
writing standards-compliant code. Specifically, we use `-std=c99`. This said,
extensions may be unavoidable in places. The packing requirements for i386
interrupt structs, for instance, seemingly necessitate __attribute__((packed)),
but that may just be a lack of understanding of C structure packing on my part.
Maybe, just maybe, that could be done without. But, yeah. Unless absolutely
necessary, don't use extensions beyond the POSIX extensions.

More on C standards:

We use C99. However, there are some exceptions. For instance, we never use
single-line comments (//). Instead, we always use block comments (/* */).

Other things not to do include using trigraphs (why do those even still exist?)
and writing anything in pre-standard style.

Conditional compilation:

I guess use if statements in functions where possible. Outside of functions,
of course, use #if/#ifdef/#ifndef. In general, that's what you should do.
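Here's a rough sketch of what that might look like. USE_COLOR and
highlight_prefix() are made-up names for illustration, not real FENIX
options or functions.

```c
/* Hypothetical build option. Outside of a function, #ifndef is the tool. */
#ifndef USE_COLOR
#define USE_COLOR 0
#endif

/* Returns the text to print before a highlighted word. */
const char *highlight_prefix(void) {
  /*
   * Inside a function, a plain if on the constant means both branches
   * still get compiled and checked, whichever one ends up being used.
   */
  if(USE_COLOR) {
    return "\033[1m";
  } else {
    return "";
  }
}
```

The point of the plain if is that the unused branch can't silently rot the
way code hidden behind an #ifdef can.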
|
|
|
|
Writing Programs
================

Standards:

We obey POSIX around here. We want to implement all stuff found in the standards
referenced above. We don't want to deviate where it matters. Where does it
matter? Well, if the standard has something to say about it, it matters. For
instance, the POSIX standard specifies that cat has the following form:
    cat [-u] [file...]
Don't add shit to that. We don't need `cat -v`.

If the standard doesn't have something to say, then consider the following:
1. Are we considering implementation details?
2. Is this optional, insofar as it's just a program that a user might install?

In the first case, for instance, POSIX doesn't say how printf should do its
thing. It says what printf needs to support (format specifiers, etc.) and what
results it should put out, but it doesn't specify the internals. It's a black
box. Here's what goes in, here's what goes out. What happens in the middle?
That's up to us to decide. Anything like that is fair game. POSIX and SUS don't
specify the initialization process/sequence for a UNIX system. Thus, how we
actually start up the system, our init program, is completely up to us.

In the second case, just make sure it's optional. Don't make it a key part of
the system. Like, if we were to add a COBOL compiler, don't make it important.
You should be able to remove the COBOL compiler from the system without it
breaking things. It should basically function like a software package that the
user'd install themselves. (In fact, you may just want to make it an installable
software package.) Do keep in mind for this kind of stuff, if there's a
standard, you should probably follow it. Like, if we do a COBOL compiler, follow
the COBOL standards. Please.

Robustness:

In general, arbitrary limits should be avoided. Try to accommodate, e.g., long
file names. It's okay to use, say, limits.h to get a limit for something like
path name lengths, but try not to hard-code a cap of your own on that kinda
stuff.

Keep all non-printing characters found in files. Try to support different
locales and character encodings, like UTF-8. There should be functions to help
you deal with that stuff.

Check for errors when making calls, especially system calls. In cases of errors,
print a fucking error message. Nothing is more frustrating for an end user than
having something fail and not being told why. (This was frustrating for me
during my first attempt at moving FENIX's kernel to the higher half. grub-file
tells you whether it's a valid multiboot header but not why it fails if it
isn't. I still have no clue why it wasn't working.)

Check memory allocation calls (malloc, calloc, realloc) for NULL return. Don't
just assume it allocated successfully. Be careful with realloc. Don't just
assume it's going to keep stuff in place.
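A minimal sketch of the realloc bit, assuming a hypothetical helper called
grow_buffer() (it's not an existing FENIX function):

```c
#include <stdlib.h>

/*
 * Grows *buffer to new_size bytes. Returns 0 on success and -1 on failure.
 * On failure, *buffer is untouched and still needs to be free'd.
 * new_size is assumed to be non-zero.
 */
int grow_buffer(char **buffer, size_t new_size) {
  /* Go through a second pointer so a NULL return can't lose the block. */
  char *moved = realloc(*buffer, new_size);

  if(moved == NULL) {
    return -1;
  }
  /* realloc may have moved the block, so the old address is now invalid. */
  *buffer = moved;
  return 0;
}
```

Writing `buffer = realloc(buffer, size)` directly would leak the old block
whenever realloc fails, which is why the result goes through `moved` first.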
|
|
|
|
If you call free, don't try to reference that memory again. Assume that the
moment you free'd it, something else immediately snapped that memory up.

If a malloc/calloc/realloc fails, that should be a fatal error. The only
exception should be in the login shell and related important system processes,
like TTY. If those have failed calls, that needs to be dealt with somehow.
After all, these are important bits. So, I don't know quite how to deal with
that, but definitely keep that in mind.
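For ordinary utilities, the fatal error case could be wrapped up something
like this. alloc_or_die() is a name made up for this sketch, and the message
wording is just an example.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocates size bytes (size must be non-zero) or
 * kills the program with a message. Not for the login shell or TTY code,
 * which need to recover somehow instead of dying.
 */
void *alloc_or_die(const char *program, size_t size) {
  void *memory = malloc(size);

  if(memory == NULL) {
    fprintf(stderr, "%s: could not allocate %lu bytes\n", program,
      (unsigned long)size);
    exit(EXIT_FAILURE);
  }
  return memory;
}
```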
|
|
|
|
When declaring variables, most should be immediately initialized. For instance,
if you create a variable for a flag, it should be initialized to a sane default.
For example, if you want a flag for "hey, use this option", it should be
initialized to whatever the default should be (off or on). This is especially
important for static variables, which may be referenced by a different call
without it being initialized, which would be bad!

In the case of impossible conditions, print something to tell the user that
the program's hit a bug. Sure, someone's going to have to go in with the source
and a debugger and find that bug, but at least let the user know that a bug's
occurred. That way, the program doesn't just stop and they have no idea why.
You don't need to give a detailed explanation (though you may want to give some
info that might be useful to someone who wants to debug), but at least give
them something like "Oops, we've hit a bug! Killing <program>..." so they know
that what they tried to do led to a bug.
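As a sketch, an impossible default case might be handled like so. The enum,
the page heights, and page_lines() are all invented for the example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical option values for some print utility. */
enum paper_size { paper_a4, paper_letter };

/* Returns the (made-up) page height in lines for a paper size. */
int page_lines(const char *program, enum paper_size size) {
  switch(size) {
    case paper_a4:
      return 70;
    case paper_letter:
      return 66;
    default:
      /* Can't happen, so if it does, it's our bug. Say so. */
      fprintf(stderr, "Oops, we've hit a bug! Killing %s...\n", program);
      fprintf(stderr, "(bad paper size %d in page_lines)\n", (int)size);
      abort();
  }
}
```

The second message is the "info that might be useful to someone who wants to
debug" part. It's optional.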
|
|
|
|
For exit values, don't just use error counts. Different errors should have
different error values. In general, the POSIX standard will give you an idea
of what error values matter, and you can define others as they're needed.
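A rough sketch for a hypothetical cmp-like utility. POSIX's cmp uses 0 for
identical files, 1 for different files, and >1 for errors. Splitting the
error case into 2 and 3 is this example's own choice, as are all the names.

```c
/* Exit values for a hypothetical cmp-like utility. */
enum exit_value {
  exit_same = 0,
  exit_different = 1,
  exit_usage_error = 2, /* our own choice; POSIX only says >1 */
  exit_file_error = 3   /* likewise */
};

/* Picks the exit value. The worst problem wins. */
enum exit_value pick_exit_value(int bad_usage, int open_failed,
                                int files_differ) {
  if(bad_usage) {
    return exit_usage_error;
  } else if(open_failed) {
    return exit_file_error;
  } else if(files_differ) {
    return exit_different;
  }
  return exit_same;
}
```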
|
|
|
|
For temporary files, you should generally put them in /tmp, but probably have
some way for the user to tell the program where they might actually want temp
files to go. Check TMPDIR, maybe? (By the way, be careful with this stuff. You
might hit security issues.)
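One way this could go, as a sketch only. open_temp_file() and the "fenix"
file name prefix are assumptions for the example. mkstemp() is the POSIX
call that picks the name and creates the file in one step, which dodges the
usual "someone else created that name first" problem.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <stdlib.h>

/*
 * Creates a temp file in $TMPDIR (or /tmp if that's unset or empty) and
 * returns its file descriptor, or -1 on error. path receives the name.
 */
int open_temp_file(char *path, size_t path_size) {
  const char *directory = getenv("TMPDIR");
  int length = 0;

  if(directory == NULL || directory[0] == '\0') {
    directory = "/tmp";
  }
  length = snprintf(path, path_size, "%s/fenixXXXXXX", directory);
  if(length < 0 || (size_t)length >= path_size) {
    return -1; /* The name didn't fit in the caller's buffer. */
  }
  /* mkstemp replaces the XXXXXX and creates and opens the file. */
  return mkstemp(path);
}
```

The caller gets to close the descriptor and remove the file when it's done.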
|
|
|
|
Library behaviour:

Try to make functions reentrant. In general, if a function needs to be
thread-safe, it'll be reentrant, but even if it doesn't, it's not the worst
idea to make it reentrant. Try to avoid even dynamic memory, though if it can't
be avoided, then so be it.
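A small sketch of the idea. line_counter and count_lines() are invented for
the example. The state lives with the caller instead of in a static variable,
and there's no dynamic memory involved.

```c
#include <stddef.h>

/* Caller-owned state, instead of a static variable inside the function. */
typedef struct _f_line_counter {
  unsigned long lines;
} line_counter;

/*
 * Adds the newlines found in text (length bytes) to counter. No hidden
 * state, so callers with their own counters can't step on each other.
 */
void count_lines(line_counter *counter, const char *text, size_t length) {
  for(size_t i = 0; i < length; i++) {
    if(text[i] == '\n') {
      counter->lines++;
    }
  }
}
```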
|
|
|
|
Also, some naming conventions for you:
- If it's a POSIX function/macro/whatever, use that name
  - Structs should be struct posix_name, not just posix_name
- If it's a function/whatever for making something work, start it with a _
  - For instance _do_thing()

Basically, if it's not supposed to be user-facing, start it with _. Otherwise,
use a normal name, probably the one given in the standard.

Formatting error messages:

In compilers, give error messages like so:
    <sourcefile>:<lineno>: <message>
Line numbers start from 1.

If it's a program, give an error message that tells the user the program
and the issue. You'll probably want to also include a PID. For example:
    <program> (<PID>): <message>

You might be able to get away with omitting the program and PID.
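A sketch of the program message format. format_error() is a hypothetical
helper, not something that exists in FENIX.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <unistd.h>

/*
 * Formats "<program> (<PID>): <message>" into buffer. Returns whatever
 * snprintf reports, so a value >= size means the message got cut off.
 */
int format_error(char *buffer, size_t size, const char *program,
                 const char *message) {
  return snprintf(buffer, size, "%s (%ld): %s", program, (long)getpid(),
    message);
}
```

The result would then normally go to stderr with something like
`fprintf(stderr, "%s\n", buffer)`.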
|
|
|
|
Interface standards:

Don't make behaviour specific to the name used. Even if we were to add GNU awk
compatibility to our awk implementation, don't make whether it uses GNU awk
mode dependent on whether you call awk or gawk. Make it a flag. In general,
though, this probably won't be an issue.

Don't make behaviour dependent on device. For instance, don't make output
differ depending on whether we're using a VTTY or a proper physical TTY. Leave
the specifics to the low-level interfaces (unless, of course, you're working on
those low-level interfaces, in which case you should do what you need to do).
The only exception is checking whether a thing is running interactively before
printing out terminal control characters for things like color. Like, in cal,
someone may just want to redirect that into a file. It shouldn't include the
codes used to change the color of the current day. Also, if you're outputting
binary data, maybe don't just send it to stdout. Ask, probably.
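The interactive check could look something like this. print_today() is a
made-up function, and reverse video is just one guess at how a cal might mark
the current day.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <unistd.h>

/*
 * Prints the current day for a cal-like program, highlighted only when
 * the stream is a terminal. The escape codes are the usual ANSI ones for
 * reverse video and reset.
 */
void print_today(FILE *stream, int day) {
  if(isatty(fileno(stream))) {
    fprintf(stream, "\033[7m%2d\033[0m", day);
  } else {
    /* Redirected to a file or a pipe, so keep the codes out of it. */
    fprintf(stream, "%2d", day);
  }
}
```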
|
|
|
|
Finding executables and stuff:

Start with argv[0]. If it's a path name, it should be basename(argv[0]).

That's all I have to say, really.
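For what it's worth, a tiny sketch of the argv[0] bit. program_name() is a
hypothetical wrapper; basename() itself is the POSIX function from libgen.h.

```c
#include <libgen.h>

/*
 * Returns the name to use in messages. POSIX basename() is allowed to
 * modify its argument and may return a pointer into it, so don't expect
 * argv0 to be pristine afterwards.
 */
const char *program_name(char *argv0) {
  return basename(argv0);
}
```

Called as `program_name(argv[0])`, "/usr/bin/cat" comes out as "cat".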
|
|
|
|
CLIs:

Follow POSIX guidelines for options. I've broken this next rule quite a bit,
but maybe make use of getopt() for parsing arguments, instead of what I've been
doing in manually searching for them.
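Here's a sketch of getopt() handling the cat synopsis from earlier.
parse_cat_options() is invented for the example and isn't how the actual
cat.c does it.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <unistd.h>

/*
 * Parses options for a cat-like utility: cat [-u] [file...]
 * Returns the index of the first file operand, or -1 on a bad option.
 */
int parse_cat_options(int argc, char *argv[], int *using_unbuffered_output) {
  int option = 0;

  *using_unbuffered_output = 0;
  while((option = getopt(argc, argv, "u")) != -1) {
    if(option == 'u') {
      *using_unbuffered_output = 1; /* -u */
    } else {
      /* getopt returns '?' for anything not in the option string. */
      fprintf(stderr, "usage: %s [-u] [file...]\n", argv[0]);
      return -1;
    }
  }
  return optind;
}
```

Everything from argv[optind] onwards is a file operand, so the caller can
just loop over the rest.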
|
|
|
|
We're not doing long-named options. The idea is nice, but for now, we're
sticking with the POSIX standard. Maybe we'll add 'em in one day. (I mean, I
guess we could use them in non-POSIX utilities, but still.)

Memory usage:

Try to keep it reasonably low. Obviously, we don't need to keep it that low, but
don't go using all the RAM just to print out a message.

Valgrind's not the worst tool to play with. You might not need to worry about
all the messages it gives, but in general, try to keep it quiet, unless that
would really fuck up things otherwise. In other words, if it bitches at you
about not freeing up memory before exiting, add in the necessary free()s.

File usage:

In general, /etc is where configuration for system-level stuff goes. Runtime
created files should go in /var/cache or /tmp. You can also use /var/lib.
Files may be stored in /usr, but be prepared for a read-only /usr. Don't assume
you can write to /usr. In general, for system scope files (i.e. everything
outside of /home), refer to the filesystem hierarchy.
|
|
|
|
Style and other important things about C
========================================

Formatting:

Keep lines to 80 characters or less (especially since FENIX currently only
supports 80 character lines).

The open brace for any code block should go on the line where it starts.
    int main(void) {
      if(test) {
        do {
        } while(test);
      }
    }

There should be a space before any open brace or after any close brace (if
something follows said close brace).

Keep any function definitions to one line. Like this:
    int main(void)
not:
    int
    main(void)
If a function definition is longer than 80 characters, you are allowed to split
it.

For function bodies, use the following standards:

No space between function/keyword and paren. Like this:
    int main(void)
    if(test)
    for(int i = 0; i < 10; i++)
    printf("Hello, World!\n");

When splitting expressions, split after the operator:
    if(condition1 && condition2 &&
       condition3)

Do-whiles, as hinted at above, should have the end brace and while on the same
line, like so:
    do {
      thing();
    } while(test);

For indentation, spaces, 2 spaces specifically. And, yeah. Use your brain on
how to actually use that. (The exception to this rule is makefiles, which
require tabs and a specific indentation style. Look up more on makefiles for
that information.) I know that tabs are technically better for accessibility,
but I've already written so much code with 2-space indents, and I really don't
feel like going back and fixing every single file, and I'd like everything to
remain consistent. If you want to replace every single space indent with tabs,
let me know. I'll give you my blessing and we'll use tabs for all new code.
Until then, sorry, I guess.
|
|
|
|
Comments:

Try to start programs and headers with a description of what it is. For example,
from stdlib.h:
    /*
     * <stdlib.h> - standard library definitions
     *
     * This header is a part of the FENIX C Library and is free software.
     * You can redistribute and/or modify it subject to the terms of the
     * Clumsy Wolf Public License v4. For more details, see the file COPYING.
     *
     * The FENIX C Library is distributed WITH NO WARRANTY WHATSOEVER. See
     * The CWPL for more details.
     */

In general, it should say what the file/program is, what it does, and include
that free software, no warranty header.

Try to write comments in English, but if you can't then write it in your native
language. I'd prefer you use English, since that's my native language. I also
kinda understand Spanish and know a small bit of Danish. But, if you can't do
English or Spanish, write in what you know. Just romanize your comment. For
instance, if you're writing in Japanese, please use romaji instead of normal
Japanese script. Otherwise, things get weird.

Probably not the worst idea to have a comment on what functions do, but don't
feel like you *need* to add them, especially if the function has a sensible
name. Like, if the function is called do_x(), you don't need a comment that
says that it "Does X".

Please only use one space after a sentence. It'll annoy me otherwise. Otherwise,
I'm not your English teacher. You're [sic] grammar doesn't need to be perfect,
as long as i [sic] (and others) can tell what you mean.

When commenting code, try not to comment the obvious. In general, if you need
to have a comment on what a variable does, you need to re-name the variable.
If you need to comment on what a block of code is doing, consider whether it
needs to be that complicated or if you can simplify it. This isn't always true,
but in general, try to keep to that rule of thumb.

If a variable is supposed to be a command-line flag, maybe include a comment
for what option it's supposed to be (e.g. /* -b */) if the name doesn't make it
otherwise obvious (e.g. the using_unbuffered_output variable in cat.c
corresponding to the -u flag). If your flags are those kinda octal things
(04 for -x, 02 for -y, 01 for -z), definitely include a comment on what
corresponds to what (/* 04: -x, 02: -y, 01: -z */).
|
|
|
|
Using C constructs:

Always declare the type of objects. Explicitly declare all arguments to
functions and declare the return type of the function. So, it's `int var`, not
just `var`, and it's `int main(void)`, not `main()`.

When it comes to compiler options, use -Wall. Try to get rid of any warnings you
can, but if you can't, don't fret about it.

Be careful with linting tools. In general, don't bother with them, unless you
think it'll help with a bug. Don't just run linting tools, though. Like, you
generally don't need to bother casting malloc(). If your tool is telling you
that you need to, you can probably ignore it.

extern declarations should go at the start of a file or in a header. Don't put
externs inside functions if you can avoid it.

Declare as many variables as you need. Sure, you can just carefully reuse the
same variable for different things, but it's probably better to just make
another variable. (The exception to this rule is counters like i, j, etc,
unless you specifically need to preserve the value of one of them.)

Avoid shadowing. If you declare a global variable, don't declare local variables
with the same name.

Declarations should either be on the same line or in multiple declarations.
So, do this:
    int foo, bar;
Or this:
    int foo;
    int bar;
Not this:
    int foo,
        bar;

In if-else statements, *always use braces*. Please. It makes it much clearer
as to what belongs to what. It'll keep you from ending up in a situation where
you've accidentally got a function call outside of an if-statement that should
actually be inside it. Also, keep else if on a single line.

Typedef your structs in the declaration. Basically, your structure declarations
should probably look like this:
    typedef struct _f_foo {
      /* Stuff goes here */
    } foo;
|
|
|
|
Names:

Don't be terse in your naming. Give it a descriptive English name. Like, name
it `do_ending_newline`, not just `d`.

If it's only used shortly for an obvious purpose, you can ignore this. Like,
you can continue with for(int i = 0...). You don't need to name that variable
something else like counter. We're programmers. We know what i is used for.

Try not to use too many abbreviations in names. You can if you need to, but
you should try not to.

Use underscores to separate words in identifiers (snake_case), not CamelCase.

For names with constant values, you can probably decide on your own whether it's
better to use an enum or #define. If you use a #define, the name should be in
all uppercase ("DEFINE_CONST"). Enums should use lowercase ("enum_const").

For file names, I guess try to keep them short (14 characters), but don't feel
like you need to. It's not the worst idea for working with older UNIXes. Just
don't feel the need to be compatible with DOS 8.3 filenames. We don't care about
Microsoft's shit. Just other UNIXes.
|
|
|
|
Portability:

FENIX doesn't necessarily need to be portable, but should be portable purely
as a result of being completely to standard. So, our programs should be able to
run on any system that is standards compliant.

In general, you should use features in the standard. Don't try to write an
interface in a program if you can just use an interface in POSIX. Again, if
you're doing kernel or libc dev, you can kinda fudge this, but for util dev,
definitely use the standard interfaces.

Don't worry about supporting non-UNIX systems. Windows? Don't worry about it.
If you can do something using pure ISO C, then it doesn't hurt, but don't
overcomplicate it if you can just use a POSIX function instead.

Porting between CPUs:

For a start, we're not worried about anything not 32-bit or higher. 16-bit?
Not a concern. In general, though, anything architecture dependent should be
in an arch dir. So, the actual low-level kernel code? arch/i386 (or whatever
arch you're writing for).

In general, assume that your types will be as defined in limits.h.

Don't assume endianness. Be careful about that stuff.
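One way to be careful, as a sketch: build values out of bytes with shifts
instead of casting a buffer to an integer pointer. read_be32() and
write_be32() are names made up for the example.

```c
#include <stdint.h>

/* Reads a 32-bit big-endian value from 4 bytes, whatever the host is. */
uint32_t read_be32(const unsigned char *bytes) {
  return ((uint32_t)bytes[0] << 24) | ((uint32_t)bytes[1] << 16) |
         ((uint32_t)bytes[2] << 8) | (uint32_t)bytes[3];
}

/* Writes value into 4 bytes in big-endian order. */
void write_be32(unsigned char *bytes, uint32_t value) {
  bytes[0] = (unsigned char)((value >> 24) & 0xFF);
  bytes[1] = (unsigned char)((value >> 16) & 0xFF);
  bytes[2] = (unsigned char)((value >> 8) & 0xFF);
  bytes[3] = (unsigned char)(value & 0xFF);
}
```

The shifts work on values, not on memory layout, so the same code gives the
same bytes on little-endian and big-endian machines.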
|
|
|
|
Calling system functions:

Just call the POSIX functions. Use standard interfaces to stuff.

Internationalization:

Um. Not sure how to handle this right now. If you want to port into another
language, get in touch.

Character set:

Try to stick to ASCII. (This is why I asked for romaji earlier.)

If you need non-ASCII, use UTF-8. Try to avoid this if possible.

Quote characters:

Be careful. Quotes are 0x22 ('"') and 0x27 ('''). Don't let your computer fuck
that stuff up.

If you're internationalizing, though, use the right quotes for stuff. Like, in
French, you'd want «».

mmap:

Don't assume it works for everything or fails for everything.

Try it on a file you want to use. If it fails, use read() and write().
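A rough sketch of the read side of that fallback. load_file() is a
hypothetical helper, the size is assumed to be known and non-zero, and
interrupted reads (EINTR) aren't handled here.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Loads size bytes from an open file. Tries mmap first and falls back to
 * read(). *was_mapped says whether to munmap() or free() the result.
 * Returns NULL on failure.
 */
void *load_file(int fd, size_t size, int *was_mapped) {
  void *data = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
  size_t total = 0;

  if(data != MAP_FAILED) {
    *was_mapped = 1;
    return data;
  }
  /* mmap didn't work for this one (a pipe, say), so read it instead. */
  *was_mapped = 0;
  data = malloc(size);
  if(data == NULL) {
    return NULL;
  }
  while(total < size) {
    ssize_t count = read(fd, (char *)data + total, size - total);

    if(count <= 0) {
      free(data);
      return NULL;
    }
    total += (size_t)count;
  }
  return data;
}
```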
|
|
|
|
Documentation
=============

Man pages:

The primary method of documenting stuff for FENIX is in the man pages. All else
is secondary. Man pages should have a fairly standard format of:
    Title
    Synopsis
    Description
    Options
    Examples
    Author
    Copyright

The title is fairly basic. Name the thing and give a short description. For
instance: head - copy the first part of files.

The synopsis is just a listing of how the program works. So, for head, it's
    head [-n number] file ...
Any optional arguments should be in square brackets. Mutually exclusive options
should be in the same brackets separated by pipes ([-s|-v]). Options that don't
take a further parameter (unlike -n in the above example) can be grouped
together if the software can take them that way ([-elm] means you can do -e, -l
and/or -m; -el, -lm or -em; or -elm). Variable type stuff, like file or number
should be """italicized""" (\fIvariable\fR). If you want to note that you can
repeat the last argument, e.g. take multiple files, use ellipses.

Description should give a full description of how it works. Tell what it does,
note any oddities or behaviour to be noted, and give a quick rundown of how
options change things.

Options should list each option one by one and explain it in detail.

Examples should give at least one example of how to use it and what the example
would do.

The author should be whoever wrote the program. List the original author(s).
So, you may notice plenty of utils have the author as "Kat", since I, Kat, am
the one who wrote them.

Copyright should contain the following:
    Copyright (C) 2019 The FENIX Project

    This software is free software. Feel free to modify it
    and/or pass it around.
A lot of older man pages will list the copyright as:
    Copyright (C) 2019 Katlynn Richey
Not the worst idea to update that if you see it.

You might also want to include the package of a thing, if it's got one. For
example, utilities include:
    This <util> implementation is a part of the fenutils package.
|
|
|
|
Physical manual:

The secondary documentation is the manual written in roff. Practically, it
documents the same kinda things. Name, synopsis, options, etc. It also includes
some other stuff, like See Also, for related stuff; Diagnostics, for what all
errors it can produce; and Bugs, for any bugs in a program. Additionally, the
author field is different, and the details for that, along with other bits,
can be found in the Intro.

License for the manuals:

The manuals are tentatively under CC-BY. Maybe I'll write my own license, like
with the CWPL. For now, though, it's CC-BY.

Credits:

In man pages (including physical), include the original author of the program,
not the author of the man page. (Generally, you should be writing the man
pages yourself anyways.)

On the title page of the physical manual, all the folx behind the project should
be named. For now, this is fairly small. If it gets too big, we'll give them
all credit within the first few pages and have the author on the manual as
"The FENIX Project Manual Team".

On other manuals:

Don't copy other people's manuals wholesale. Try to write it yourself. The
exception is in working with the POSIX standard. Don't copy that wholesale, but
you can base your man page on the relevant POSIX page. Just know that the
POSIX page probably won't give you a complete man page!
|
|
|
|
Releases
========

I'll worry about that later.