Initial commit
This commit is contained in:
commit
b686116502
33 changed files with 2722 additions and 0 deletions
543
notes/notes on code standards and style
Normal file
|
@ -0,0 +1,543 @@
|
|||
Intellectual property
|
||||
=====================
|
||||
|
||||
Regarding proprietary programs:
|
||||
|
||||
Don't reference them. We don't need to get in trouble.
|
||||
|
||||
Regarding free programs:
|
||||
|
||||
Be careful referencing them, especially if they use a GPL-style license. I
|
||||
don't need that fucking licensing nightmare getting dragged into this. If it
|
||||
uses a 0BSD-style license, you can freely reference it, even copy it. I just
|
||||
ask that you properly credit the original.
|
||||
|
||||
Trademarks:
|
||||
|
||||
Don't bother with acknowledgements. Just make sure it's written in a way to
|
||||
not be mistaken for our own trademark.
|
||||
|
||||
Program design
|
||||
==============
|
||||
|
||||
Which language:
|
||||
|
||||
C. We use C around here. It's honestly the only language I'm still comfortable
|
||||
with. I'm out of practice with anything else. Though, for what we do here, C is
|
||||
kinda the most suitable language we have anyways.
|
||||
|
||||
In particular, though, of what I'm even familiar with, I don't like
|
||||
object-oriented programming, so no C++ or Java. Java's too bloated anyways.
|
||||
Same for Clojure. FORTRAN's not really useful here, though I do want to have
|
||||
a library for FORTRAN. And Prolog is really only useful as a proof language.
|
||||
So, really, C is the only thing I want to use (except where we need asm).
|
||||
|
||||
Compatibility:
|
||||
|
||||
We do compatibility here. Currently, we're working off POSIX/the Single UNIX
|
||||
Specification, the ISO C standard, and the Filesystem Hierarchy Standard.
|
||||
Everything implemented should *not* conflict with these standards. We can add
|
||||
things here and there, where needed or where it might be fun. But anything added
|
||||
should either be something not specified by the standard (like the init system)
|
||||
or that is otherwise optional, e.g. a game.
|
||||
|
||||
Non-standard features:
|
||||
|
||||
Don't use them. For instance, don't use any GNU C extensions. You should be
|
||||
writing standards-compliant code. Specifically, we use `-std=c99`.
|
||||
|
||||
More on C standards:
|
||||
|
||||
We use C99. However, there are some exceptions. For instance, we never use
|
||||
single-line comments (//). Instead, we always use block comments (/* */).
|
||||
|
||||
Other things not to do include using trigraphs (why do those even still exist?)
|
||||
and writing anything in pre-standard style.
|
||||
|
||||
Conditional compilation:
|
||||
|
||||
I guess use if statements in functions where possible. Outside of functions,
|
||||
of course, use #if/#ifdef/#ifndef. In general, that's what you should do.
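
A rough sketch of that split, assuming a hypothetical ENABLE_TRACE macro and a
made-up traced_add() function (neither is part of FENIX). The #ifndef sits
outside the function, and the check inside the function is a plain if, so both
branches always get compiled and checked:

```c
#include <stdio.h>

/* Hypothetical build-time switch; pass -DENABLE_TRACE=1 to turn it on. */
#ifndef ENABLE_TRACE
#define ENABLE_TRACE 0
#endif

/* Adds two numbers, optionally tracing the call to stderr. */
int traced_add(int left, int right) {
  int sum = left + right;
  if(ENABLE_TRACE) {
    fprintf(stderr, "traced_add(%d, %d) = %d\n", left, right, sum);
  }
  return sum;
}
```

Any reasonable compiler should drop the dead branch when ENABLE_TRACE is 0, so
this usually costs nothing at run time.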
|
||||
|
||||
Writing Programs
|
||||
================
|
||||
|
||||
Standards:
|
||||
|
||||
We obey POSIX around here. We want to implement all stuff found in the standards
|
||||
referenced above. We don't want to deviate where it matters. Where does it
|
||||
matter? Well, if the standard has something to say about it, it matters. For
|
||||
instance, the POSIX standard specifies that cat has the following form:
|
||||
cat [-u] [file...]
|
||||
Don't add shit to that. We don't need `cat -v`.
|
||||
|
||||
If the standard doesn't have something to say, then consider the following:
|
||||
1. Are we considering implementation details?
|
||||
2. Is this optional, insofar as it's just a program that a user might install?
|
||||
In the first case, for instance, POSIX doesn't say how printf should do its
|
||||
thing. It says what printf needs to support (format specifiers, etc.) and what
|
||||
results it should put out, but it doesn't specify the internals. It's a black
|
||||
box. Here's what goes in, here's what goes out. What happens in the middle?
|
||||
That's up to us to decide. Anything like that is fair game. POSIX and SUS don't
|
||||
specify the initialization process/sequence for a UNIX system. Thus, how we
|
||||
actually start up the system, our init program, is completely up to us.
|
||||
|
||||
In the second case, just make sure it's optional. Don't make it a key part of
|
||||
the system. Like, if we were to add a COBOL compiler, don't make it important.
|
||||
You should be able to remove the COBOL compiler from the system without it
|
||||
breaking things. It should basically function like a software package that the
|
||||
user'd install themselves. (In fact, you may just want to make it an installable
|
||||
software package.) Do keep in mind for this kind of stuff, if there's a
|
||||
standard, you should probably follow it. Like, if we do a COBOL compiler, follow
|
||||
the COBOL standards. Please.
|
||||
|
||||
Robustness:
|
||||
|
||||
In general, arbitrary limits should be avoided. Try to accommodate, e.g., long
file names. It's okay to use, say, limits.h to get a limit for something like
path name lengths, but where you can, try not to have a cap on that kinda stuff
at all.
|
||||
|
||||
Keep all non-printing characters found in files. Try to support different
|
||||
locales and character encodings, like UTF-8. There should be functions to help
|
||||
you deal with that stuff.
|
||||
|
||||
Check for errors when making calls, especially system calls. In cases of errors,
|
||||
print a fucking error message. Nothing is more frustrating for an end user than
|
||||
having something fail and not being told why. (This was frustrating for me
|
||||
during my first attempt at moving FENIX's kernel to the higher half. grub-file
|
||||
tells you whether it's a valid multiboot header but not why it fails if it
|
||||
isn't. I still have no clue why it wasn't working.)
|
||||
|
||||
Check memory allocation calls (malloc, calloc, realloc) for NULL return. Don't
|
||||
just assume it allocated successfully. Be careful with realloc. Don't just
|
||||
assume it's going to keep stuff in place.
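
One possible way to handle the realloc case, as a minimal sketch. The name
grow_buffer() is made up for illustration, and it assumes new_size is greater
than zero. The result goes into a temporary first, so a failed call doesn't
lose the original pointer:

```c
#include <stdlib.h>

/*
 * Grows *buffer to new_size bytes (new_size must be nonzero). Returns 0 on
 * success, or -1 on failure, in which case *buffer is untouched and valid.
 */
int grow_buffer(char **buffer, size_t new_size) {
  char *grown = realloc(*buffer, new_size);
  if(grown == NULL) {
    return -1;
  }
  /* realloc() may have moved the data, so forget the old address. */
  *buffer = grown;
  return 0;
}
```

After a successful call, any other pointer into the old block has to be treated
as stale.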
|
||||
|
||||
If you call free, don't try to reference that memory again. Assume that the
|
||||
moment you free'd it, something else immediately snapped that memory up.
|
||||
|
||||
If a malloc/calloc/realloc fails, that should be a fatal error. The only
|
||||
exception should be in the login shell and related important system processes,
|
||||
like TTY. If those have failed calls, that needs to be dealt with somehow.
|
||||
After all, these are important bits. So, I don't know quite how to deal with
|
||||
that, but definitely keep that in mind.
|
||||
|
||||
When declaring variables, most should be immediately initialized. For instance,
|
||||
if you create a variable for a flag, it should be initialized to a sane default.
|
||||
For example, if you want a flag for "hey, use this option", it should be
|
||||
initialized to whatever the default should be (off or on). This is especially
|
||||
important for static variables, which may be referenced by a different call
|
||||
without it being initialized, which would be bad!
|
||||
|
||||
In the case of impossible conditions, print something to tell the user that
|
||||
the program's hit a bug. Sure, someone's going to have to go in with the source
|
||||
and a debugger and find that bug, but at least let the user know that a bug's
|
||||
occurred. That way, the program doesn't just stop and they have no idea why.
|
||||
You don't need to give a detailed explanation (though you may want to give some
|
||||
info that might be useful to someone who wants to debug), but at least give
|
||||
them something like "Oops, we've hit a bug! Killing <program>..." so they know
|
||||
that what they tried to do led to a bug.
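
Something like the following would do, as a sketch only. report_bug() and its
parameters are hypothetical names, not an existing FENIX interface:

```c
#include <stdio.h>

/* Tells the user that program hit a bug; detail may be NULL. */
void report_bug(FILE *stream, const char *program, const char *detail) {
  fprintf(stream, "Oops, we've hit a bug! Killing %s...\n", program);
  if(detail != NULL) {
    fprintf(stream, "Debugging info: %s\n", detail);
  }
}
```

You'd typically call this with stderr from the default branch of a switch that
should never be reached, then call abort().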
|
||||
|
||||
For exit values, don't just use error counts. Different errors should have
|
||||
different error values. In general, the POSIX standard will give you an idea
|
||||
of what error values matter, and you can define others as they're needed.
|
||||
|
||||
For temporary files, you should generally put them in /tmp, but probably have
|
||||
some way for the user to tell the program where they might actually want temp.
|
||||
files to go. Check TMPDIR, maybe? (By the way, be careful with this stuff. You
|
||||
might hit security issues.)
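
A minimal sketch of the TMPDIR check, assuming that an unset or empty TMPDIR
means "use /tmp". The function name temp_directory() is made up:

```c
#include <stdlib.h>

/* Returns the directory to use for temporary files. */
const char *temp_directory(void) {
  const char *directory = getenv("TMPDIR");
  if(directory == NULL || directory[0] == '\0') {
    directory = "/tmp";
  }
  return directory;
}
```

On the security side, it's probably safest to create the file itself with
POSIX mkstemp(), which creates and opens it in one step, instead of building a
name and opening it later.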
|
||||
|
||||
Library behaviour:
|
||||
|
||||
Try to make functions reentrant. In general, if a function needs to be
|
||||
thread-safe, it'll be reentrant, but even if it doesn't, it's not the worst
|
||||
idea to make it reentrant. Try to avoid even dynamic memory, though if it can't
be avoided, then so be it.
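
One common way to get there, sketched with a made-up describe_count()
function: keep no static state and no allocation, and have the caller hand in
the storage.

```c
#include <stdio.h>

/*
 * Writes "<count> file" or "<count> files" into the caller's buffer.
 * Returns 0 on success or -1 if the buffer is too small.
 */
int describe_count(char *buffer, size_t size, int count) {
  int length = snprintf(buffer, size, "%d %s", count,
                        count == 1 ? "file" : "files");
  if(length < 0 || (size_t)length >= size) {
    return -1;
  }
  return 0;
}
```

Since everything the function touches belongs to the caller, two threads can
call it at the same time as long as they pass different buffers.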
|
||||
|
||||
Also, some naming convention for you:
|
||||
- If it's a POSIX function/macro/whatever, use that name
|
||||
- Structs should be struct posix_name, not just posix_name
|
||||
- If it's a function/whatever for making something work, start it with a _
|
||||
- For instance _do_thing()
|
||||
|
||||
Basically, if it's not supposed to be user-facing, start it with _. Otherwise,
|
||||
use a normal name, probably with one given in the standard.
|
||||
|
||||
Formatting error messages:
|
||||
|
||||
In compilers, give error messages like so:
|
||||
<sourcefile>:<lineno>: <message>
|
||||
Line numbers start from 1.
|
||||
|
||||
If it's a program, give an error message that tells the user the program
|
||||
and the issue. You'll probably want to also include a PID. For example:
|
||||
<program> (<PID>): <message>
|
||||
|
||||
You might be able to get away with omitting the program and PID.
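
As a sketch of the program format above (format_error() is a hypothetical
helper, not something FENIX already provides):

```c
#include <stdio.h>

/*
 * Builds "<program> (<PID>): <message>" in buffer. Returns 0 on success
 * or -1 if the message did not fit.
 */
int format_error(char *buffer, size_t size, const char *program, long pid,
                 const char *message) {
  int length = snprintf(buffer, size, "%s (%ld): %s", program, pid, message);
  if(length < 0 || (size_t)length >= size) {
    return -1;
  }
  return 0;
}
```

A caller would usually pass (long)getpid() from unistd.h for the PID and then
write the finished line to stderr.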
|
||||
|
||||
Interface standards:
|
||||
|
||||
Don't make behaviour specific to the name used. Even if we were to add GNU awk
|
||||
compatibility to our awk implementation, don't make whether it uses GNU awk
|
||||
mode dependent on whether you call awk or gawk. Make it a flag. In general,
|
||||
though, this probably won't be an issue.
|
||||
|
||||
Don't make behaviour dependent on device. For instance, don't make output
|
||||
differ depending on whether we're using a VTTY or a proper physical TTY. Leave
|
||||
the specifics to the low-level interfaces (unless, of course, you're working on
|
||||
those low-level interfaces, in which case you should do what you need to do).
|
||||
The only exception is checking whether a thing is running interactively before
|
||||
printing out terminal control characters for things like color. Like, in cal,
|
||||
someone may just want to redirect that into a file. It shouldn't include the
|
||||
codes used to change the color of the current day. Also, if you're outputting
|
||||
binary data, maybe don't just send it to stdout. Ask, probably.
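
The interactive check could look roughly like this, using POSIX fileno() and
isatty(). The wrapper name should_use_color() is made up for the example:

```c
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <unistd.h>

/* Returns 1 if stream is a terminal, so color codes are safe to print. */
int should_use_color(FILE *stream) {
  int descriptor = fileno(stream);
  if(descriptor < 0) {
    return 0;
  }
  return isatty(descriptor) ? 1 : 0;
}
```

With this, something like cal would only print the codes for the current day
when should_use_color(stdout) is nonzero, so redirecting into a file gives
plain text.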
|
||||
|
||||
Finding executables and stuff:
|
||||
|
||||
Start with argv[0]. If it's a path name, it should be basename(argv[0]).
|
||||
|
||||
That's all I have to say, really.
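
One thing to watch for: POSIX allows basename() to modify the string it's
given, so it's safer to hand it a copy. A small sketch, with program_name()
being a hypothetical helper:

```c
#include <libgen.h>
#include <string.h>

/*
 * Copies argv0 into buffer and returns its final path component. The
 * result may point into buffer or into basename()'s own storage.
 */
const char *program_name(const char *argv0, char *buffer, size_t size) {
  if(argv0 == NULL || size == 0) {
    return "unknown";
  }
  strncpy(buffer, argv0, size - 1);
  buffer[size - 1] = '\0';
  return basename(buffer);
}
```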
|
||||
|
||||
GUIs:
|
||||
|
||||
We're not really concerned about them at this point in time. When we get to
|
||||
that, we're gonna probably be running off the X Window System. We'll probably
|
||||
have our own WM/DE. But, again, we're not worried about that right now. Right
|
||||
now, we're focused on just getting the console level stuff up.
|
||||
|
||||
CLIs:
|
||||
|
||||
Follow POSIX guidelines for options. I've broken this next rule quite a bit,
|
||||
but maybe make use of getopt() for parsing arguments, instead of what I've been
|
||||
doing in manually searching for them.
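
For reference, a minimal getopt() sketch for a head-style "[-n number]"
option. The function name parse_head_options() and its return convention are
assumptions made for this example, not existing FENIX code:

```c
#define _POSIX_C_SOURCE 200809L

#include <stdlib.h>
#include <unistd.h>

/*
 * Parses "[-n number]" in the style of head. Stores the number in
 * *line_count and returns the index of the first operand, or -1 on a
 * usage error.
 */
int parse_head_options(int argc, char *argv[], long *line_count) {
  int option = 0;
  while((option = getopt(argc, argv, "n:")) != -1) {
    if(option == 'n') {
      char *end = NULL;
      long number = strtol(optarg, &end, 10);
      if(end == optarg || *end != '\0' || number < 0) {
        return -1;
      }
      *line_count = number;
    } else {
      /* getopt() has already printed a diagnostic for '?'. */
      return -1;
    }
  }
  return optind;
}
```

main() would call this once with its own argc and argv, then treat everything
from the returned index onwards as file operands.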
|
||||
|
||||
We're not doing long-named options. The idea is nice, but for now, we're
|
||||
sticking with the POSIX standard. Maybe we'll add 'em in one day. (I mean, I
|
||||
guess we could use them in non-POSIX utilities, but still.)
|
||||
|
||||
Memory usage:
|
||||
|
||||
Try to keep it reasonably low. Obviously, we don't need to keep it that low, but
|
||||
don't go using all the RAM just to print out a message.
|
||||
|
||||
Valgrind's not the worst tool to play with. You might not need to worry about
|
||||
all the messages it gives, but in general, try to keep it quiet, unless that
|
||||
would really fuck up things otherwise. In other words, if it bitches at you
|
||||
about not freeing up memory before exiting, add in the necessary free()s.
|
||||
|
||||
File usage:
|
||||
|
||||
In general, /etc is where configuration for system-level stuff goes. Runtime
|
||||
created files should go in /var/cache or /tmp. You can also use /var/lib.
|
||||
Files may be stored in /usr, but be prepared for a read-only /usr. Don't assume
|
||||
you can write to /usr.
|
||||
|
||||
Style and other important things about C
|
||||
========================================
|
||||
|
||||
Formatting:
|
||||
|
||||
Keep lines to 80 characters or less (especially since FENIX currently only
|
||||
supports 80 character lines).
|
||||
|
||||
The open brace for any code block should go on the line where it starts.
|
||||
int main(void) {
  if(test) {
    do {
    } while(test);
  }
}
|
||||
|
||||
There should be a space before any open brace or after any close brace (if
|
||||
something follows said close brace).
|
||||
|
||||
Keep any function definitions to one line. Like this:
|
||||
int main(void)
|
||||
not:
|
||||
int
|
||||
main(void)
|
||||
If a function definition is longer than 80 characters, you are allowed to split
|
||||
it.
|
||||
|
||||
For function bodies, use the following standards:
|
||||
|
||||
No space between function/keyword and paren. Like this:
|
||||
int main(void)
|
||||
if(test)
|
||||
for(int i = 0; i < 10; i++)
|
||||
printf("Hello, World!\n");
|
||||
|
||||
When splitting expressions, split after the operator:
|
||||
if(condition1 && condition2 &&
   condition3)
|
||||
|
||||
Do-whiles, as hinted at above, should have the end brace and while on the same
|
||||
line, like so:
|
||||
do {
  thing();
} while(test);
|
||||
|
||||
For indentation, spaces, 2 spaces specifically. And, yeah. Use your brain on
|
||||
how to actually use that. (The exception to this rule is makefiles, which
|
||||
require tabs and a specific indentation style. Look up more on makefiles for
|
||||
that information.)
|
||||
|
||||
Comments:
|
||||
|
||||
Try to start programs and headers with a description of what it is. For example,
|
||||
from stdlib.h:
|
||||
/*
 * <stdlib.h> - standard library definitions
 *
 * This header is a part of the FENIX C Library and is free software.
 * You can redistribute and/or modify it subject to the terms of the
 * Clumsy Wolf Public License v4. For more details, see the file COPYING.
 *
 * The FENIX C Library is distributed WITH NO WARRANTY WHATSOEVER. See
 * The CWPL for more details.
 */
|
||||
|
||||
In general, it should say what the file/program is, what it does, and include
|
||||
that free software, no warranty header.
|
||||
|
||||
Try to write comments in English, but if you can't then write them in your
native language. I'd prefer you use English, since that's my native language. I
also kinda understand Spanish and know a small bit of Danish. But, if you can't
do English or Spanish, write in what you know. Just romanize your comment. For
instance, if you're writing in Japanese, please use romaji instead of normal
Japanese script. Otherwise, things get weird.
|
||||
|
||||
Probably not the worst idea to have a comment on what functions do, but don't
|
||||
feel like you *need* to add them, especially if the function has a sensible
|
||||
name. Like, if the function is called do_x(), you don't need a comment that
|
||||
says that it "Does X".
|
||||
|
||||
Please only use one space after a sentence. It'll annoy me otherwise. Also, use
|
||||
complete sentences, capitalize sentences unless they start with a lowercase
|
||||
identifier, etc., etc.
|
||||
|
||||
When commenting code, try not to comment the obvious. In general, if you need
|
||||
to have a comment on what a variable does, you need to re-name the variable.
|
||||
If you need to comment on what a block of code is doing, consider whether it
|
||||
needs to be that complicated or if you can simplify it. This isn't always true,
|
||||
but in general, try to keep to that rule of thumb.
|
||||
|
||||
Using C constructs:
|
||||
|
||||
Always declare the type of objects. Explicitly declare all arguments to
|
||||
functions and declare the return type of the function. So, it's `int var`, not
|
||||
just `var`, and it's `int main(void)`, not `main()`.
|
||||
|
||||
When it comes to compiler options, use -Wall. Try to get rid of any warnings you
|
||||
can, but if you can't, don't fret about it.
|
||||
|
||||
Be careful with linting tools. In general, don't bother with them, unless you
|
||||
think it'll help with a bug. Don't just run linting tools, though. Like, you
|
||||
generally don't need to bother casting malloc(). If your tool is telling you
|
||||
that you need to, you can probably ignore it.
|
||||
|
||||
extern declarations should go at the start of a file or in a header. Don't put
|
||||
externs inside functions if you can avoid it.
|
||||
|
||||
Declare as many variables as you need. Sure, you can just carefully reuse the
|
||||
same variable for different things, but it's probably better to just make
|
||||
another variable. (The exception to this rule is counters like i, j, etc.,
|
||||
unless you specifically need to preserve the value of one of them.)
|
||||
|
||||
Avoid shadowing. If you declare a global variable, don't declare local variables
|
||||
with the same name.
|
||||
|
||||
Declarations should either be on the same line or in multiple declarations.
|
||||
So, do this:
|
||||
int foo, bar;
|
||||
Or this:
|
||||
int foo;
|
||||
int bar;
|
||||
Not this:
|
||||
int foo,
    bar;
|
||||
|
||||
In if-else statements, *always use braces*. Please. It makes it much clearer
|
||||
as to what belongs to what. It'll keep you from ending up in a situation where
|
||||
you've accidentally got a function call outside of an if-statement that should
|
||||
actually be inside it. Also, keep else if on a single line.
|
||||
|
||||
Typedef your structs in the declaration. Basically, your structure declarations
|
||||
should probably look like this:
|
||||
typedef struct _f_foo {
  /* Stuff goes here */
} foo;
|
||||
|
||||
Names:
|
||||
|
||||
Don't be terse in your naming. Give it a descriptive English name. Like, name
|
||||
it `do_ending_newline`, not just `d`.
|
||||
|
||||
If it's only used shortly for an obvious purpose, you can ignore this. Like,
|
||||
you can continue with for(int i = 0...). You don't need to name that variable
|
||||
something else like counter. We're programmers. We know what i is used for.
|
||||
|
||||
Try not to use too many abbreviations in names. You can if you need to, but
|
||||
you should try not to.
|
||||
|
||||
Use underscores to separate words in identifiers (snake_case), not CamelCase.
|
||||
|
||||
If a variable is supposed to be a command-line flag, maybe include a comment
|
||||
for what option it's supposed to be (e.g. /* -b */). If your flags are those
|
||||
kinda octal things (04 for -x, 02 for -y, 01 for -z), definitely include a
|
||||
comment on what corresponds to what (/* 04: -x, 02: -y, 01: -z */).
|
||||
|
||||
For names with constant values, you can probably decide on your own whether it's
|
||||
better to use an enum or #define. If you use a #define, the name should be in
|
||||
all uppercase ("DEFINE_CONST"). Enums should use lowercase ("enum_const").
|
||||
|
||||
For file names, I guess try to keep them short (14 characters), but don't feel
|
||||
like you need to. It's not the worst idea for working with older UNIXes. Just
|
||||
don't feel the need to be compatible with DOS 8.3 filenames. We don't care about
|
||||
Microsoft's shit. Just other UNIXes.
|
||||
|
||||
Portability:
|
||||
|
||||
FENIX doesn't necessarily need to be portable, but should be portable purely
|
||||
as a result of being completely to standard. So, our programs should be able to
|
||||
run on any system that is standards compliant.
|
||||
|
||||
In general, you should use features in the standard. Don't try to write an
|
||||
interface in a program if you can just use an interface in POSIX. Again, if
|
||||
you're doing kernel or libc dev, you can kinda fudge this, but for util dev,
|
||||
definitely use the standard interfaces.
|
||||
|
||||
Don't worry about supporting non-UNIX systems. Windows? Don't worry about it.
|
||||
If you can do something using pure ISO C, then it doesn't hurt, but don't
|
||||
overcomplicate it if you can just use a POSIX function instead.
|
||||
|
||||
Porting between CPUs:
|
||||
|
||||
For a start, we're not worried about anything not 32-bit or higher. 16-bit?
|
||||
Not a concern. In general, though, anything architecture dependent should be
|
||||
in an arch dir. So, the actual low-level kernel code? arch/i386 (or whatever
|
||||
arch you're writing for).
|
||||
|
||||
In general, assume that your types will be as defined in limits.h.
|
||||
|
||||
Don't assume endianness. Be careful about that stuff.
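
For example, when pulling a 32-bit big-endian value out of a byte buffer, it's
safer to build it with shifts than to cast the buffer to an integer pointer. A
sketch, with read_big_endian_32() being a name made up for illustration:

```c
#include <stdint.h>

/* Reads a 32-bit big-endian value, whatever the host's byte order is. */
uint32_t read_big_endian_32(const unsigned char *bytes) {
  return ((uint32_t)bytes[0] << 24) |
         ((uint32_t)bytes[1] << 16) |
         ((uint32_t)bytes[2] << 8) |
         (uint32_t)bytes[3];
}
```

This gives the same answer on little-endian and big-endian machines, and it
doesn't care how the buffer is aligned.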
|
||||
|
||||
Calling system functions:
|
||||
|
||||
Just call the POSIX functions. Use standard interfaces to stuff.
|
||||
|
||||
Internationalization:
|
||||
|
||||
Um. Not sure how to handle this right now. If you want to port into another
|
||||
language, get in touch.
|
||||
|
||||
Character set:
|
||||
|
||||
Try to stick to ASCII. (This is why I asked for romaji earlier.)
|
||||
|
||||
If you need non-ASCII, use UTF-8. Try to avoid this if possible.
|
||||
|
||||
Quote characters:
|
||||
|
||||
Be careful. Quotes are 0x22 ('"') and 0x27 ('''). Don't let your computer fuck
|
||||
that stuff up.
|
||||
|
||||
If you're internationalizing, though, use the right quotes for stuff. Like, in
|
||||
French, you'd want «».
|
||||
|
||||
mmap:
|
||||
|
||||
Don't assume it works for everything or fails for everything.
|
||||
|
||||
Try it on a file you want to use. If it fails, use read() and write().
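
A rough sketch of that try-then-fall-back pattern for reading. The function
load_descriptor() and its was_mapped flag are assumptions for this example,
and the read() loop is kept simple (it treats any short or interrupted read as
a failure):

```c
#define _POSIX_C_SOURCE 200809L

#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns a pointer to the first size bytes of the open file, or NULL on
 * error. Sets *was_mapped so the caller knows whether to release the data
 * with munmap() (nonzero) or free() (zero).
 */
void *load_descriptor(int descriptor, size_t size, int *was_mapped) {
  void *data = mmap(NULL, size, PROT_READ, MAP_PRIVATE, descriptor, 0);
  if(data != MAP_FAILED) {
    *was_mapped = 1;
    return data;
  }

  /* mmap() refused this file, so fall back to plain reads. */
  *was_mapped = 0;
  char *buffer = malloc(size > 0 ? size : 1);
  if(buffer == NULL) {
    return NULL;
  }
  size_t total = 0;
  while(total < size) {
    ssize_t count = read(descriptor, buffer + total, size - total);
    if(count <= 0) {
      free(buffer);
      return NULL;
    }
    total += (size_t)count;
  }
  return buffer;
}
```

The fallback reads from the descriptor's current offset, so this assumes the
caller hasn't already consumed part of the file.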
|
||||
|
||||
Documentation
|
||||
=============
|
||||
|
||||
Man pages:
|
||||
|
||||
The primary method of documenting stuff for FENIX is in the man pages. All else
|
||||
is secondary. Man pages should have a fairly standard format of:
|
||||
Title
|
||||
Synopsis
|
||||
Description
|
||||
Options
|
||||
Examples
|
||||
Author
|
||||
Copyright
|
||||
|
||||
The title is fairly basic. Name the thing and give a short description. For
|
||||
instance: head - copy the first part of files.
|
||||
|
||||
The synopsis is just a listing of how the program is invoked. So, for head, it's
|
||||
head [-n number] file ...
|
||||
Any optional arguments should be in square brackets. Mutually exclusive options
should be in the same brackets separated by pipes ([-s|-v]). Options that don't
take a further parameter (unlike -n in the above example) can be grouped
together if the software can take them that way ([-elm] means you can pass -e,
-l, and/or -m on their own, or combined as -el, -lm, -em, or -elm). Variable
type stuff, like file or number, should be """italicized""" (\fIvariable\fR).
If you want to note that you can repeat the last argument, e.g. take multiple
files, use an ellipsis.
|
||||
|
||||
Description should give a full description of how it works. Tell what it does,
|
||||
note any oddities or behaviour to be noted, and give a quick rundown of how
|
||||
options change things.
|
||||
|
||||
Options should list each option one by one and explain it in detail.
|
||||
|
||||
Examples should give at least one example of how to use it and what the example
|
||||
would do.
|
||||
|
||||
The author should be whoever wrote the program. List the original author(s).
|
||||
So, you may notice plenty of utils have the author as "Kat", since I, Kat, am
|
||||
the one who wrote them.
|
||||
|
||||
Copyright should contain the following:
|
||||
Copyright (C) 2019 The FENIX Project
|
||||
|
||||
This software is free software. Feel free to modify it
|
||||
and/or pass it around.
|
||||
A lot of older man pages will list the copyright as:
|
||||
Copyright (C) 2019 Katlynn Richey
|
||||
Not the worst idea to update that if you see it.
|
||||
|
||||
You might also want to include the package of a thing, if it's got one. For
|
||||
example, utilities include:
|
||||
This <util> implementation is a part of the fenutils package.
|
||||
|
||||
Physical manual:
|
||||
|
||||
The secondary documentation is the manual written in roff. Practically, it
|
||||
documents the same kinda things. Name, synopsis, options, etc. It also includes
|
||||
some other stuff, like See Also, for related stuff; Diagnostics, for what all
|
||||
errors it can produce; and Bugs, for any bugs in a program. Additionally, the
|
||||
author field is different, and the details for that, along with other bits,
|
||||
can be found in the Intro.
|
||||
|
||||
License for the manuals:
|
||||
|
||||
The manuals are tentatively under CC-BY. Maybe I'll write my own license, like
|
||||
with the CWPL. For now, though, it's CC-BY.
|
||||
|
||||
Credits:
|
||||
|
||||
In man pages (including physical), include the original author of the program,
|
||||
not the author of the man page. (Generally, you should be writing the man
|
||||
pages yourself anyways.)
|
||||
|
||||
On the title page of the physical manual, all the folx behind the project should
|
||||
be named. For now, this is fairly small. If it gets too big, we'll give them
|
||||
all credit within the first few pages and have the author on the manual as
|
||||
"The FENIX Project Manual Team".
|
||||
|
||||
On other manuals:
|
||||
|
||||
Don't copy other people's manuals wholesale. Try to write it yourself. The
|
||||
exception is in working with the POSIX standard. Don't copy that wholesale, but
|
||||
you can base your man page on the relevant POSIX page. Just know that the
|
||||
POSIX page probably won't give you a complete man page!
|
||||
|
||||
Releases
|
||||
========
|
||||
|
||||
I'll worry about that later.
|