fediglam/CONTRIBUTING.md
2022-10-11 20:21:44 -07:00

Overview

Packages

  • main: executable package, includes base server and controllers
    • TODO: Consider moving controllers to own package
  • api: primary package, implements server logic
  • util: utility packages
  • sql: SQL library
  • http: HTTP Server

Code Standards

  • We currently target zig 0.10

    • This is not currently released!
    • You need a recent-ish build of zig from git
    • When the stable release of 0.10 drops, we will pin to that and will only move up when a newer version comes out
  • Follow the Zig style guide

    • Except where it doesn't make sense to.
  • Line length: Try to keep it below 100 columns

    • Except where it would harm other aspects. Examples:
      • Keep full, copy-pastable URLs on a single line. Always. This lets you click on the URL in your editor (if it supports it)
      • Keep error messages on a single line when possible, to promote greppability
  • Whenever possible, limit the scope of variables. Don't be afraid of subscopes

  • Prefer to pass by value when you're not conceptually modifying anything.

    • The compiler is smart enough to turn it into a *const T when needed
  • Also prefer to return by value (take advantage of Result Location Semantics).

  • RUN zig fmt BEFORE SUBMITTING A PR, IDEALLY ON EVERY SAVE

  • Unit tests should go in the same directory as the code they are testing.

  • Integration tests should go in tests/

  • When making changes to DB queries, please test them against both SQLite and Postgres

    • Queries should, whenever possible, run on both database engines.
    • If you absolutely need to do something different, switch on db.engineType()
  • Document who owns memory/pointers/slices when they are created.
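
A few of the points above (subscopes, passing and returning by value, documenting ownership) are easier to see in code. The following is only a sketch: `Point`, `translate`, and `makeHandle` are hypothetical names invented for illustration and do not exist in the codebase.

```zig
const std = @import("std");

// Hypothetical type, for illustration only.
const Point = struct { x: i32, y: i32 };

// Takes `p` by value since nothing is conceptually modified, and returns by value.
fn translate(p: Point, dx: i32, dy: i32) Point {
    return Point{ .x = p.x + dx, .y = p.y + dy };
}

/// Builds "@name". The caller owns the returned slice and must free it with `alloc`.
fn makeHandle(alloc: std.mem.Allocator, name: []const u8) ![]u8 {
    return std.fmt.allocPrint(alloc, "@{s}", .{name});
}

test "scoping, by-value passing, and documented ownership" {
    const moved = translate(Point{ .x = 1, .y = 2 }, 10, 0);
    try std.testing.expectEqual(@as(i32, 11), moved.x);

    // Only `sum` escapes the subscope; `acc` and `i` stay local to it.
    const sum = blk: {
        var acc: i32 = 0;
        var i: i32 = 1;
        while (i <= 3) : (i += 1) acc += i;
        break :blk acc;
    };
    try std.testing.expectEqual(@as(i32, 6), sum);

    const handle = try makeHandle(std.testing.allocator, "alice");
    defer std.testing.allocator.free(handle); // we own it, so we free it
    try std.testing.expectEqualStrings("@alice", handle);
}
```

The doc comment on `makeHandle` is the part that matters most: whoever reads the call site should not have to open the function body to find out who frees the result.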

Read zig zen

Package Standards

As much as possible, packages (aside from main) should avoid depending on other packages aside from std (which is linked automatically) and util. This makes build.zig much simpler and avoids the problem of tangled dependency graphs.

Within a package, you should also avoid importing using relative upwards paths (so, avoid @import("../../my_thing.zig")). This is done both for simplicity's sake and because it makes refactoring less annoying. It is the intra-package equivalent of avoiding tangled dependency graphs.

Package Overview

main package

  • TODO: consider moving controllers and api into different packages
  • controllers/**.zig:
    • Transforms HTTP to/from API calls
    • Turns error codes into HTTP statuses
  • migrations.zig:
    • Defines database migrations to apply
    • Should be run on startup

api package

  • lib.zig:
    • Makes sure API call is allowed with the given user/host context
  • services/**.zig: Performs action associated with API call
    • Transforms DB models into API models
    • Most data validation

util package

Components:

  • Uuid: UUID utils (random uuid generation, equality, parsing, printing)
    • Uuid.eql
    • Uuid.randV4
    • UUIDs are serialized to their string representation for JSON, db
  • iters.zig
    • PathIter: Path segment iterator
    • QueryIter: Query parameter iterator
    • SqlStmtIter
  • Url: URL utils (parsing)
    • TODO: This isn't used anywhere yet and kinda sucks. Scrap?
  • ciutf8: case-insensitive UTF-8
    • TODO: Scrap this, replace with ICU library
  • DateTime: Time utils
  • deepClone(alloc, orig)/deepFree(alloc, to_free)
    • Utils for cloning and freeing basic data structs
    • Clones/frees any strings/sub structs within the value
  • getThreadPrng/seedThreadPrng
  • comptimeJoinWithPrefix: metaprogramming helper
  • jsonSerializeEnumAsString: json helper
    • add pub const jsonSerialize = util.jsonSerializeEnumAsString to your enum

sql package

Currently supports 2 SQL engines:

  • Postgres (Meant for production use)
  • SQLite (Meant for development use)

TODO: Allow compiling the program without support for one or the other engine using build options

There's nothing stopping you from using Postgres in dev and SQLite in prod, but the SQLite backend is more prone to panic if an error happens instead of returning proper errors.

Usage Example

// open transaction context
// only necessary if you need transactional semantics, otherwise
// just call query methods directly on the database and use the
// implied transaction
var tx = try db.beginOrSavepoint();
errdefer tx.rollback();
{
    var results = try tx.query(
        Tuple(&.{[]const u8}),
        "SELECT username FROM account WHERE community_id = $1",
        .{community_id},
        allocator,
    );
    defer results.finish();
    
    std.log.info("Listing users", .{});
    while (try results.row(allocator)) |row| {
        // Don't forget to free values if applicable!
        defer allocator.free(row[0]);
        // do some stuff with the username idk
        std.log.info("- {s}", .{row[0]});
    }
    std.log.info("Done", .{});
}
// commit transaction
try tx.commitOrRelease();

queryRow

If you need exactly one row, use queryRow. It calls finish() and returns the row for you.

const row = try db.queryRow(
    Tuple(&.{[]const u8}),
    "SELECT username FROM account WHERE id = $1",
    .{user_id},
    allocator,
);
defer allocator.free(row[0]);
std.log.info("Username: {s}", .{row[0]});

queryRow returns error.NoRows or error.TooManyRows if the query does not return exactly one row. Consider adding a LIMIT 1 clause on all queries you use with queryRow:

const row = db.queryRow(
    Tuple(&.{[]const u8}),
    "SELECT username FROM account WHERE id = $1 LIMIT 1",
    .{user_id},
    allocator,
) catch |err| switch (err) {
    // early return on error
    error.NoRows => return error.UserNotFound,
    else => return err,
};
defer allocator.free(row[0]);
std.log.info("Username: {s}", .{row[0]});

Avoiding SQL Injection

DO *NOT*
UNDER *ANY* CIRCUMSTANCES
*EVER*
PUT USER-SUPPLIED TEXT DIRECTLY INTO THE QUERY TEXT

The SQL library intentionally does not provide an escape function, because you should not be using one anyway.

Instead, use query arguments.

const row = try db.queryRow(
    std.meta.Tuple(&.{[]const u8, []const u8}),
    \\SELECT account.username, community.host
    \\FROM account JOIN community
    \\  ON account.community_id = community.id
    \\WHERE account.id = $1 AND community.id = $2
    \\LIMIT 1
,
    // In SQL, parameters are referenced by `$N` where N refers to the
    // *one*-indexed argument number. Note that this means that `args[0]`
    // will be bound to `$1` and `args[1]` will be bound to `$2`.
    .{user_id, community_id},
    allocator,
);

std.log.info("user handle: @{s}@{s}", .{row[0], row[1]});

In this case, the db library performs any necessary string escaping on the arguments before they are used in the query.

Additionally, while you cannot put user-supplied text into the query, you can, in exceptional cases, put program-supplied text into the query conditionally based on user input. For example, parsing the user text into an enum and then putting @tagName(val) into the query.
This should be used judiciously, and only where necessary to avoid code explosion (for example, in complicated query methods taking multiple optional arguments).

In general, query text should be calculated at comptime whenever possible.
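
As a rough sketch of the enum approach, assuming a hypothetical `SortColumn` enum and made-up helper names (`orderedQuery`, `queryForUserInput`), none of which exist in the sql package: the user's text is only ever used to pick an enum tag, and every possible query string is assembled at comptime from program-supplied text.

```zig
const std = @import("std");

// Hypothetical whitelist of columns a caller may sort by.
const SortColumn = enum { username, created_at };

// `col` is comptime-known, so the concatenation happens at compile time.
fn orderedQuery(comptime col: SortColumn) []const u8 {
    return "SELECT id FROM account ORDER BY " ++ @tagName(col);
}

// Returns null when the input does not name a known column.
// The user text itself never ends up in the query text.
fn queryForUserInput(input: []const u8) ?[]const u8 {
    const col = std.meta.stringToEnum(SortColumn, input) orelse return null;
    return switch (col) {
        .username => orderedQuery(.username),
        .created_at => orderedQuery(.created_at),
    };
}

test "only whitelisted text reaches the query" {
    try std.testing.expectEqualStrings(
        "SELECT id FROM account ORDER BY created_at",
        queryForUserInput("created_at").?,
    );
    try std.testing.expect(queryForUserInput("id; DROP TABLE account") == null);
}
```

Values that are not part of the query structure (IDs, names, and so on) should still go through query arguments as described above.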

Query Argument types

Query arguments must be one of:

  • A tuple containing the types to pass (shown above)
  • The monovalue {} (of type void), signifying zero parameters
    • Technically this has the same behavior as passing an empty tuple .{}. But because every tuple is a different type to the compiler, this could subvert memoization efforts, causing compilation to slow down and potentially bloating the binary. Use your helper methods!

Unused arguments

In general, arguments should be used in the query. While the Postgres backend does not currently verify that all arguments are used in the query, the SQLite backend does.

Some queries may benefit from being able to pass in arguments that are not used in the actual query. For example, to prevent branching paths of different argument tuple types. In this case, you can use the queryWithOptions method:

const query_to_execute_that_probably_varies_at_runtime = "SELECT id FROM account WHERE username = $3";
var results = try db.queryWithOptions(
    std.meta.Tuple(&.{[]const u8}),
    query_to_execute_that_probably_varies_at_runtime,
    .{unused_arg_1, unused_arg_2, username},
    .{ .prep_allocator = allocator, .ignore_unused_arguments = true },
);
// use the query results

Row Result types

Query result types are somewhat more flexible than argument types. They can be one of:

  • A tuple type containing the types to retrieve (shown above)
  • A struct type containing the names of the columns to retrieve:
const handle = try db.queryRow(
    struct { username: []const u8, host: []const u8 },
    \\SELECT account.username AS username, community.host AS host
    \\FROM account JOIN community
    \\  ON account.community_id = community.id
    \\WHERE account.id = $1 AND community.id = $2
    \\LIMIT 1
,
    .{user_id, community_id},
    allocator,
);

std.log.info("user handle: @{s}@{s}", .{handle.username, handle.host});

Note that this may require you to rename some of your result columns using AS.

Helper Methods

Executable on db or tx
  • query(RowType, sql, args, allocator)
  • queryRow(RowType, sql, args, allocator)
  • exec(sql, args, allocator)
    • Used for queries that don't return any rows, calls finish() for you
  • insert("accounts", Account{ ... }, allocator)
    • Equivalent to exec("INSERT INTO accounts(.....) VALUES (......)", .{...}, allocator);
  • queryWithOptions(RowType, sql, args, .{ advanced options and stuff... });
  • const tx = try db.beginOrSavepoint()
    • Creates a transaction or savepoint (like a nested transaction) context
  • tx.commitOrRelease()
    • Commits this transaction or releases the savepoint as necessary
  • tx.rollback()
    • Note that this does not return an error union, and does not need to be prefixed with try. This is because rollback is supposed to be an operation that cannot fail (except in the case of connection losses, in which case the DB should have done a rollback for us anyway). If connection errors do occur, then tx.rollback() will log them to the console and swallow the error value.
    • If you want to check that the rollback completed successfully, you need to know whether it is a savepoint or a transaction being rolled back. Then call either try tx.rollbackTx() or try tx.rollbackSavepoint() as appropriate.

In all cases, you can supply a null allocator if you know for certain that the query engine does not need to allocate a buffer to bind the arguments or save the return data. For example, if you are executing a query that does not have arguments.

http

Implements a basic HTTP/1.1 server.

This module sucks and needs refactoring

General organization:

  • request/parser.zig - parses incoming requests
  • server/response.zig:ResponseStream
    • Attempts to write response body in one go at the end if possible.
    • If the body is too big, it uses Transfer-Encoding: chunked and transfers the body piece-by-piece.
    • Create one using server/response.zig:open(...)
  • routing.zig - handles route matching. See usage (currently in main.zig) in the main package
    • Can currently match based on HTTP method and on path.
    • Path format:
      • "/path/to/my/endpoint" - static path
      • "/posts/:post_id/reacts/:react_id" - path with 2 arguments (post_id and react_id)
        • Arguments get passed to handler functions through the RouteArgs param
        • There is not currently a way of typing args
    • This whole thing needs a refactor
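
To illustrate the path format only, here is a minimal sketch of matching a ":name" pattern against a request path. This is not the code in routing.zig: `nextSegment` and `matchRoute` are hypothetical, and the captured values land in a plain slice rather than in RouteArgs.

```zig
const std = @import("std");

// Returns the next non-empty '/'-separated segment and advances `rest` past it.
fn nextSegment(rest: *[]const u8) ?[]const u8 {
    while (rest.*.len > 0 and rest.*[0] == '/') rest.* = rest.*[1..];
    if (rest.*.len == 0) return null;
    const end = std.mem.indexOfScalar(u8, rest.*, '/') orelse rest.*.len;
    const seg = rest.*[0..end];
    rest.* = rest.*[end..];
    return seg;
}

// Returns the number of captured arguments, or null if the path does not match.
// Captured values are slices into `path`; nothing is allocated.
fn matchRoute(pattern: []const u8, path: []const u8, args: [][]const u8) ?usize {
    var pat = pattern;
    var req = path;
    var n: usize = 0;
    while (nextSegment(&pat)) |p| {
        const seg = nextSegment(&req) orelse return null;
        if (p[0] == ':') {
            // Argument segment: capture whatever the request has here.
            if (n == args.len) return null;
            args[n] = seg;
            n += 1;
        } else if (!std.mem.eql(u8, p, seg)) {
            // Static segment: must match exactly.
            return null;
        }
    }
    // Leftover request segments mean the path is longer than the pattern.
    if (nextSegment(&req) != null) return null;
    return n;
}

test "path with 2 arguments" {
    var args: [2][]const u8 = undefined;
    const n = matchRoute("/posts/:post_id/reacts/:react_id", "/posts/abc/reacts/42", &args).?;
    try std.testing.expectEqual(@as(usize, 2), n);
    try std.testing.expectEqualStrings("abc", args[0]);
    try std.testing.expectEqualStrings("42", args[1]);
}
```

Note that the captured arguments are untyped byte slices here too, which mirrors the current limitation mentioned above.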