Union deserialization behavior is confusing #86

Closed
opened 2022-12-16 07:17:39 +00:00 by heartles · 5 comments
Owner

Right now, unions have the behavior where the union itself is treated as invisible, its members are dumped into its parent scope. For example, the type:

const Data = struct {
    foo: []const u8,
    bar: union(enum) {
        qux: []const u8,
        baz: usize,
    },
};

Is treated as if the parameter foo must be provided and exactly one of qux or baz must also be provided. So the json value {"foo": "abc", "qux": "xyz"} would be valid and would deserialize to .{ .foo = "abc", .bar = .{ .qux = "xyz" } }.

There are a few problems with this approach:

  • It's weird and implicit
  • It's counter to the way std.json handles unions, which is to treat its members as invisible

So for example, the json value {"foo": "abc", "bar": "xyz"} would deserialize to the same value.

Right now, unions have the behavior where the union itself is treated as invisible, its members are dumped into its parent scope. For example, the type: ``` const Data = struct { foo: []const u8, bar: union(enum) { qux: []const u8, baz: usize, }, }; ``` Is treated as if the parameter `foo` must be provided and exactly one of `qux` or `baz` must also be provided. So the json value `{"foo": "abc", "qux": "xyz"}` would be valid and would deserialize to `.{ .foo = "abc", .bar = .{ .qux = "xyz" } }`. There are a few problems with this approach: - It's weird and implicit - It's counter to the way `std.json` handles unions, which is to treat its members as invisible So for example, the json value `{"foo": "abc", "bar": "xyz"}` would deserialize to the same value.
heartles added the
backend
label 2022-12-16 07:17:39 +00:00
heartles reopened this issue 2022-12-16 07:19:04 +00:00
Author
Owner

This behavior was originally written for pagination code, which often has the following idiom:

const Args = struct {
    const OrderBy = enum { created_at, name, ... };
    ....
    order_by: OrderBy,
    prev: ?struct {
        id: []const u8,
        value: union(OrderBy) {
            created_at: DateTime,
            name: []const u8,
            ...
        },
    },
}

Where the idea is that prev.value must match order_by, and the two will be compared in sql like (id, <col>) > ($id, $value). This allows the user to pass parameters like (using query string syntax): ?order_by=created_at&prev.id=ID&prev.created_at=DATETIME

This could be rewritten to use the std.json format, with some duplication, to be like:

const Args = struct {
    const OrderBy = enum { created_at, ... };
    ....
    order_by: OrderBy,
    prev: ?union(OrderBy) {
        created_at: struct {
            id: []const u8,
            created_at: DateTime,
        },
        name: struct {
            id: []const u8,
            name: []const u8,
        },
    },
}
This behavior was originally written for pagination code, which often has the following idiom: ``` const Args = struct { const OrderBy = enum { created_at, name, ... }; .... order_by: OrderBy, prev: ?struct { id: []const u8, value: union(OrderBy) { created_at: DateTime, name: []const u8, ... }, }, } ``` Where the idea is that prev.value must match order_by, and the two will be compared in sql like `(id, <col>) > ($id, $value)`. This allows the user to pass parameters like (using query string syntax): `?order_by=created_at&prev.id=ID&prev.created_at=DATETIME` This could be rewritten to use the `std.json` format, with some duplication, to be like: ``` const Args = struct { const OrderBy = enum { created_at, ... }; .... order_by: OrderBy, prev: ?union(OrderBy) { created_at: struct { id: []const u8, created_at: DateTime, }, name: struct { id: []const u8, name: []const u8, }, }, } ```
Author
Owner

While this may seem important for parsing ActivityPub types, we will need to handle them specially without using the serialization library. This is because, while they do have defined fields in the spec, if we only used the fields in the spec we would lose any extension information.

While this may seem important for parsing ActivityPub types, we will need to handle them specially without using the serialization library. This is because, while they do have defined fields in the spec, if we only used the fields in the spec we would lose any extension information.
Author
Owner

Another issue is for the handling of some of the webpage POST calls (especially within the drive system).

There are a number of actions the user can take on a drive entry (delete, move, create subdir, edit, upload), all of which are forced to share the POST method and all of which have different parameters:

const Mkdir = struct { name: []const u8 };
const Upload = struct { file: FormFile, description: []const u8, sensitive: bool };
const Delete = void;
const Move = struct { dest: []const u8 };
const Edit = struct { description: []const u8, sensitive: bool, content_type: []const u8, filename: []const u8 };

Right now these are all mapped to the same POST /drive/:path* endpoint.

This raises some questions:

  • If we put these all in the same endpoint, how do we determine which to deserialize to?
  • Is it worth punting and just mapping these to different endpoints out-of-tree?
Another issue is for the handling of some of the webpage POST calls (especially within the drive system). There are a number of actions the user can take on a drive entry (delete, move, create subdir, edit, upload), all of which are forced to share the POST method and all of which have different parameters: ``` const Mkdir = struct { name: []const u8 }; const Upload = struct { file: FormFile, description: []const u8, sensitive: bool }; const Delete = void; const Move = struct { dest: []const u8 }; const Edit = struct { description: []const u8, sensitive: bool, content_type: []const u8, filename: []const u8 }; ``` Right now these are all mapped to the same `POST /drive/:path*` endpoint. This raises some questions: - If we put these all in the same endpoint, how do we determine which to deserialize to? - Is it worth punting and just mapping these to different endpoints out-of-tree?
Author
Owner

I fixed the deserialization syntax to work like std.json as of f47c1ee751

I fixed the deserialization syntax to work like std.json as of f47c1ee751aa1c6144d263d6f056179cf724c9c1
Author
Owner

The solution i went with for the webpge POST thing i mentioned was to put the action in the query string, then add a thing that lets the body deserialization type come from that.

The solution i went with for the webpge POST thing i mentioned was to put the action in the query string, then add a thing that lets the body deserialization type come from that.
Sign in to join this conversation.
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: heartles/fediglam#86
No description provided.