litespeed-quic/src/liblsquic/lsquic_packet_out.h


/* Copyright (c) 2017 - 2020 LiteSpeed Technologies Inc. See LICENSE. */
/*
* lsquic_packet_out.h -- Structure and routines dealing with packet_out
*/
#ifndef LSQUIC_PACKET_OUT_H
#define LSQUIC_PACKET_OUT_H 1
#include <sys/queue.h>
struct malo;
struct lsquic_conn;
struct lsquic_engine_public;
struct lsquic_mm;
struct lsquic_stream;
struct network_path;
struct parse_funcs;
struct bwp_state;
/* Each frame_rec is associated with one packet_out. packet_out can have
* zero or more frame_rec structures. frame_rec keeps a pointer to a stream
* that has STREAM, CRYPTO, or RST_STREAM frames inside packet_out.
* `fe_frame_type' specifies the type of the frame; if this value is zero
* (this happens when a frame is elided), values of the other struct members
* are not valid. `fe_off' indicates where inside packet_out->po_data the
* frame begins and `fe_len' is its length.
*
* We need this information for four reasons:
* 1. A stream is not destroyed until all of its STREAM and RST_STREAM
* frames are acknowledged. This is to make sure that we do not exceed
* the maximum allowed number of streams.
* 2. When a packet is resubmitted, STREAM frames for a stream that has
* been reset are not to be resubmitted.
* 3. A buffered packet may have to be split before it is scheduled (this
* occurs if we incorrectly guessed the number of bytes required to
* encode the packet number and the actual number would make the packet
* larger than the max).
* 4. A lost or scheduled packet may need to be resized (down) when the path
* changes or the MTU is reduced due to an RTO.
*
* In IETF QUIC, all frames are recorded. In gQUIC, only STREAM, RST_STREAM,
* ACK, and STOP_WAITING are recorded. The latter two are recorded so that
* the ACK-deleting code in the send controller (see po_regen_sz) is the
* same for both QUIC versions.
*/
struct frame_rec {
union {
struct lsquic_stream *stream;
uintptr_t data;
} fe_u;
#define fe_stream fe_u.stream
unsigned short fe_off,
fe_len;
enum quic_frame_type fe_frame_type;
};
#define frec_taken(frec) ((frec)->fe_frame_type)
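/* Illustration only, not part of the original header: a minimal standalone
 * sketch of how a frame record locates a frame inside po_data and how an
 * elided record is recognized via frec_taken(). The enum values, the
 * lsquic_stream stand-in, and the helpers make_stream_frec(), frec_end(),
 * and frec_elide() are assumptions made for this example.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for types defined elsewhere in lsquic (assumed values). */
enum quic_frame_type { QUIC_FRAME_INVALID = 0, QUIC_FRAME_STREAM = 1 };
struct lsquic_stream { int id; };

struct frame_rec {
    union {
        struct lsquic_stream *stream;
        uintptr_t             data;
    }                    fe_u;
#define fe_stream fe_u.stream
    unsigned short       fe_off,
                         fe_len;
    enum quic_frame_type fe_frame_type;
};

#define frec_taken(frec) ((frec)->fe_frame_type)

/* Hypothetical helper: build a record for a STREAM frame that starts at
 * byte `off' of po_data and is `len' bytes long.
 */
struct frame_rec
make_stream_frec (struct lsquic_stream *stream, unsigned short off,
                                                        unsigned short len)
{
    struct frame_rec frec;
    frec.fe_stream     = stream;
    frec.fe_off        = off;
    frec.fe_len        = len;
    frec.fe_frame_type = QUIC_FRAME_STREAM;
    return frec;
}

/* Hypothetical helper: offset of the first byte past the frame. */
unsigned
frec_end (const struct frame_rec *frec)
{
    return (unsigned) frec->fe_off + frec->fe_len;
}

/* Hypothetical helper: eliding a frame zeroes its type; after this, the
 * other members must not be relied upon.
 */
void
frec_elide (struct frame_rec *frec)
{
    frec->fe_frame_type = QUIC_FRAME_INVALID;
}
```

/* Under these assumptions, a record made with offset 10 and length 20 is
 * "taken" and ends at byte 30; after frec_elide() it is no longer taken.
 */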
struct frame_rec_arr {
TAILQ_ENTRY(frame_rec_arr) next_stream_rec_arr;
struct frame_rec frecs[
( 64 /* Efficient size for malo allocator */
- sizeof(TAILQ_ENTRY(frame_rec)) /* next_stream_rec_arr */
) / sizeof(struct frame_rec)
];
};
TAILQ_HEAD(frame_rec_arr_tailq, frame_rec_arr);
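/* Illustration only, not part of the original header: a standalone sketch
 * of the sizing arithmetic above. The array length is chosen so that the
 * whole frame_rec_arr fits in 64 bytes; the actual number of records
 * depends on the platform's pointer and enum sizes. FREC_ARR_BUDGET and
 * the two helper functions are names invented for this example, and the
 * enum is a stand-in for the real definition.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

enum quic_frame_type { QUIC_FRAME_INVALID = 0, QUIC_FRAME_STREAM = 1 };
struct lsquic_stream;

struct frame_rec {
    union {
        struct lsquic_stream *stream;
        uintptr_t             data;
    }                    fe_u;
    unsigned short       fe_off,
                         fe_len;
    enum quic_frame_type fe_frame_type;
};

#define FREC_ARR_BUDGET 64      /* Same constant as used in the header */

struct frame_rec_arr {
    TAILQ_ENTRY(frame_rec_arr)  next_stream_rec_arr;
    struct frame_rec            frecs[
        ( FREC_ARR_BUDGET
        - sizeof(TAILQ_ENTRY(frame_rec))    /* Size of the list linkage */
        ) / sizeof(struct frame_rec)
    ];
};

/* Number of frame records that fit into one array element. */
size_t
frec_arr_capacity (void)
{
    return sizeof(((struct frame_rec_arr *) 0)->frecs)
                                            / sizeof(struct frame_rec);
}

/* Total size of one array element, linkage included. */
size_t
frec_arr_size (void)
{
    return sizeof(struct frame_rec_arr);
}
```

/* The integer division rounds the record count down, so the structure
 * should never exceed the 64-byte budget, whatever the record size is.
 */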
typedef struct lsquic_packet_out
{
/* `po_next' is used for packets_out, unacked_packets and expired_packets
* lists.
*/
TAILQ_ENTRY(lsquic_packet_out)
po_next;
lsquic_time_t po_sent; /* Time sent */
lsquic_packno_t po_packno;
lsquic_packno_t po_ack2ed; /* If packet has ACK frame, value of
* largest acked in it.
*/
struct lsquic_packet_out
*po_loss_chain; /* Circular linked list */
enum quic_ft_bit po_frame_types; /* Bitmask of QUIC_FRAME_* */
enum packet_out_flags {
/* TODO XXX Phase out PO_MINI in favor of a more specialized flag:
* we only need an indicator that a packet contains STREAM frames
* but no associated frecs. This type of packet is only created
* by GQUIC mini conn.
*/
PO_MINI = (1 << 0), /* Allocated by mini connection */
PO_HELLO = (1 << 1), /* Packet contains SHLO or CHLO data */
PO_SENT = (1 << 2), /* Packet has been sent (mini only) */
PO_ENCRYPTED= (1 << 3), /* po_enc_data has encrypted data */
PO_FREC_ARR = (1 << 4),
#define POBIT_SHIFT 5
PO_BITS_0 = (1 << 5), /* PO_BITS_0 and PO_BITS_1 encode the */
PO_BITS_1 = (1 << 6), /* packet number length. See macros below. */
PO_NONCE = (1 << 7), /* Use value in `po_nonce' to generate header */
PO_VERSION = (1 << 8), /* Use value in `po_ver_tag' to generate header */
PO_CONN_ID = (1 << 9), /* Include connection ID in public header */
PO_REPACKNO = (1 <<10), /* Regenerate packet number */
PO_NOENCRYPT= (1 <<11), /* Do not encrypt data in po_data */
PO_VERNEG = (1 <<12), /* Version negotiation packet. */
PO_STREAM_END
= (1 <<13), /* STREAM frame reaches the end of the packet: no
* further writes are allowed.
*/
PO_SCHED = (1 <<14), /* On scheduled queue */
PO_SENT_SZ = (1 <<15),
PO_LONGHEAD = (1 <<16),
#define POIPv6_SHIFT 20
PO_IPv6 = (1 <<20), /* Set if pmi_allocate was passed is_ipv6=1,
* otherwise unset.
*/
PO_MTU_PROBE= (1 <<21), /* Special loss and ACK rules apply */
#define POPNS_SHIFT 22
PO_PNS_HSK = (1 <<22), /* PNS bits contain the value of the */
PO_PNS_APP = (1 <<23), /* packet number space. */
PO_RETRY = (1 <<24), /* Retry packet */
PO_RETX = (1 <<25), /* Retransmitted packet: don't append to it */
PO_POISON = (1 <<26), /* Used to detect opt-ACK attack */
PO_LOSS_REC = (1 <<27), /* This structure is a loss record */
/* Only one of PO_SCHED, PO_UNACKED, or PO_LOST can be set. If pressed
* for room in the enum, we can switch to using two bits to represent
* this information.
*/
PO_UNACKED = (1 <<28), /* On unacked queue */
PO_LOST = (1 <<29), /* On lost queue */
#define POSPIN_SHIFT 30
PO_SPIN_BIT = (1 <<30), /* Value of the spin bit */
} po_flags;
unsigned short po_data_sz; /* Number of usable bytes in data */
unsigned short po_enc_data_sz; /* Number of usable bytes in po_enc_data */
unsigned short po_sent_sz; /* If PO_SENT_SZ is set, real size of sent buffer. */
/* TODO Revisit po_regen_sz once gQUIC is dropped. Now that all frames
* are recorded, we have more flexibility where to place ACK frames; they
* no longer really have to be at the beginning of the packet, since we
* can locate them.
*/
unsigned short po_regen_sz; /* Number of bytes at the beginning
* of data containing bytes that are
* not to be retransmitted, e.g. ACK
* frames.
*/
unsigned short po_n_alloc; /* Total number of bytes allocated in po_data */
unsigned short po_token_len;
enum header_type po_header_type:8;
unsigned char po_dcid_len; /* If PO_ENCRYPTED is set */
enum {
POL_GQUIC = 1 << 0, /* Used for logging */
#define POLEV_SHIFT 1
POL_ELBIT_0 = 1 << 1, /* EL bits encode the crypto level. */
POL_ELBIT_1 = 1 << 2,
#define POKP_SHIFT 3
POL_KEY_PHASE= 1 << 3,
#define POECN_SHIFT 4
POL_ECNBIT_0 = 1 << 4,
POL_ECNBIT_1 = 1 << 5,
POL_LOG_QL_BITS = 1 << 6,
POL_SQUARE_BIT = 1 << 7,
POL_LOSS_BIT = 1 << 8,
#ifndef NDEBUG
POL_HEADER_PROT = 1 << 9, /* Header protection applied */
#endif
POL_LIMITED = 1 << 10, /* Used to credit sc_next_limit if needed. */
POL_FACKED = 1 << 11, /* Lost due to FACK check */
} po_lflags:16;
unsigned char *po_data;
/* A lot of packets contain only one frame. Thus, `one' is used first.
* If this is not enough, any number of frame_rec_arr structures can be
* allocated to handle more frame records.
*/
    union {
        struct frame_rec            one;
        struct frame_rec_arr_tailq  arr;
    }                  po_frecs;

    /* If PO_ENCRYPTED is set, this points to the buffer that holds encrypted
     * data.
     */
    unsigned char     *po_enc_data;
    lsquic_ver_tag_t   po_ver_tag;      /* Set if PO_VERSION is set */
    unsigned char     *po_nonce;        /* Used to generate header if PO_NONCE is set */
    const struct network_path
                      *po_path;
#define po_token po_nonce
    struct bwp_state  *po_bwp_state;
} lsquic_packet_out_t;
/* This makes sure these bit names are not used: they exist only for
 * convenience in gdb output.
 */
#define PO_PNS_HSK
#define PO_PNS_APP
/* The size of lsquic_packet_out_t could be further reduced:
*
* po_ver_tag could be encoded as a few bits representing enum lsquic_version
* in po_flags. The cost is a bit of complexity. This will save us four bytes.
*/
#define lsquic_packet_out_avail(p) ((unsigned short) \
((p)->po_n_alloc - (p)->po_data_sz))
#define lsquic_packet_out_packno_bits(p) (((p)->po_flags >> POBIT_SHIFT) & 0x3)
#define lsquic_packet_out_set_packno_bits(p, b) do { \
(p)->po_flags &= ~(0x3 << POBIT_SHIFT); \
(p)->po_flags |= ((b) & 0x3) << POBIT_SHIFT; \
} while (0)
#define lsquic_packet_out_ipv6(p) ((int)(((p)->po_flags >> POIPv6_SHIFT) & 1))
#define lsquic_packet_out_set_ipv6(p, b) do { \
(p)->po_flags &= ~(1 << POIPv6_SHIFT); \
(p)->po_flags |= ((b) & 1) << POIPv6_SHIFT; \
} while (0)
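/* Usage sketch for the accessors above (illustrative only; `packet_out' is
 * a hypothetical pointer to an lsquic_packet_out_t).  The setter clears the
 * field and stores the low bits of its argument, so a value that fits into
 * the field is expected to read back unchanged:
 *
 *      lsquic_packet_out_set_packno_bits(packet_out, 2);
 *      assert(2 == lsquic_packet_out_packno_bits(packet_out));
 *
 *      lsquic_packet_out_set_ipv6(packet_out, 1);
 *      assert(1 == lsquic_packet_out_ipv6(packet_out));
 */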
#define lsquic_packet_out_spin_bit(p) (((p)->po_flags & PO_SPIN_BIT) > 0)
#define lsquic_packet_out_square_bit(p) (((p)->po_lflags & POL_SQUARE_BIT) > 0)
#define lsquic_packet_out_loss_bit(p) (((p)->po_lflags & POL_LOSS_BIT) > 0)
#define lsquic_packet_out_set_spin_bit(p, b) do { \
(p)->po_flags &= ~PO_SPIN_BIT; \
(p)->po_flags |= ((b) & 1) << POSPIN_SHIFT; \
} while (0)
#define lsquic_po_header_length(lconn, po_flags, dcid_len, header_type) ( \
    (lconn)->cn_pf->pf_packout_max_header_size(lconn, po_flags, dcid_len, \
                                                            header_type))
#define lsquic_packet_out_total_sz(lconn, p) (\
(lconn)->cn_pf->pf_packout_size(lconn, p))
#if __GNUC__
#if LSQUIC_EXTRA_CHECKS
#define lsquic_packet_out_sent_sz(lconn, p) ( \
    __builtin_expect(((p)->po_flags & PO_SENT_SZ), 1) ? \
    (assert(((p)->po_flags & PO_HELLO /* Avoid client DCID change */) \
        || (p)->po_sent_sz == lsquic_packet_out_total_sz(lconn, p)), \
        (p)->po_sent_sz) : lsquic_packet_out_total_sz(lconn, p))
#   else
#define lsquic_packet_out_sent_sz(lconn, p) ( \
    __builtin_expect(((p)->po_flags & PO_SENT_SZ), 1) ? \
    (p)->po_sent_sz : lsquic_packet_out_total_sz(lconn, p))
#endif
#else
#   define lsquic_packet_out_sent_sz(lconn, p) ( \
    (p)->po_flags & PO_SENT_SZ ? \
    (p)->po_sent_sz : lsquic_packet_out_total_sz(lconn, p))
#endif
#define lsquic_packet_out_verneg(p) \
(((p)->po_flags & (PO_NOENCRYPT|PO_VERNEG|PO_RETRY)) == (PO_NOENCRYPT|PO_VERNEG))
#define lsquic_packet_out_pubres(p) \
(((p)->po_flags & (PO_NOENCRYPT|PO_VERNEG|PO_RETRY)) == PO_NOENCRYPT )
#define lsquic_packet_out_retry(p) \
(((p)->po_flags & (PO_NOENCRYPT|PO_VERNEG|PO_RETRY)) == (PO_NOENCRYPT|PO_RETRY) )
#define lsquic_packet_out_set_enc_level(p, level) do { \
    (p)->po_lflags &= ~(3 << POLEV_SHIFT); \
    (p)->po_lflags |= (level) << POLEV_SHIFT; \
} while (0)
#define lsquic_packet_out_enc_level(p) (((p)->po_lflags >> POLEV_SHIFT) & 3)
#define lsquic_packet_out_set_kp(p, kp) do { \
    (p)->po_lflags &= ~(1 << POKP_SHIFT); \
    (p)->po_lflags |= (kp) << POKP_SHIFT; \
} while (0)
#define lsquic_packet_out_kp(p) (((p)->po_lflags >> POKP_SHIFT) & 1)
#define lsquic_packet_out_set_pns(p, pns) do { \
    (p)->po_flags &= ~(3 << POPNS_SHIFT); \
    (p)->po_flags |= (pns) << POPNS_SHIFT; \
} while (0)
#define lsquic_packet_out_pns(p) (((p)->po_flags >> POPNS_SHIFT) & 3)
#define lsquic_packet_out_set_ecn(p, ecn) do { \
    (p)->po_lflags &= ~(3 << POECN_SHIFT); \
    (p)->po_lflags |= (ecn) << POECN_SHIFT; \
} while (0)
#define lsquic_packet_out_ecn(p) (((p)->po_lflags >> POECN_SHIFT) & 3)
struct packet_out_frec_iter {
    lsquic_packet_out_t         *packet_out;
    struct frame_rec_arr        *cur_frec_arr;
    unsigned                     frec_idx;
    int                          impl_idx;
};
struct frame_rec *
lsquic_pofi_first (struct packet_out_frec_iter *pofi, lsquic_packet_out_t *);
struct frame_rec *
lsquic_pofi_next (struct packet_out_frec_iter *pofi);
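/* Usage sketch for the iterator (an assumption drawn from the prototypes
 * above, not from the implementation): both functions are taken to return
 * NULL when there are no more frame records.  `packet_out' is a
 * hypothetical pointer to an lsquic_packet_out_t.
 *
 *      struct packet_out_frec_iter pofi;
 *      struct frame_rec *frec;
 *
 *      for (frec = lsquic_pofi_first(&pofi, packet_out); frec;
 *                                          frec = lsquic_pofi_next(&pofi))
 *      {
 *          // Inspect one frame record here
 *      }
 */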
lsquic_packet_out_t *
lsquic_packet_out_new (struct lsquic_mm *, struct malo *, int use_cid,
const struct lsquic_conn *, enum packno_bits,
const lsquic_ver_tag_t *, const unsigned char *nonce,
const struct network_path *, enum header_type);
void
lsquic_packet_out_destroy (lsquic_packet_out_t *,
struct lsquic_engine_public *, void *peer_ctx);
int
lsquic_packet_out_add_frame (struct lsquic_packet_out *,
struct lsquic_mm *, uintptr_t data, enum quic_frame_type,
unsigned short off, unsigned short len);
int
lsquic_packet_out_add_stream (lsquic_packet_out_t *packet_out,
struct lsquic_mm *mm,
struct lsquic_stream *new_stream,
enum quic_frame_type,
unsigned short off, unsigned short len);
unsigned
lsquic_packet_out_elide_reset_stream_frames (lsquic_packet_out_t *,
lsquic_stream_id_t);
void
lsquic_packet_out_chop_regen (lsquic_packet_out_t *);
void
lsquic_packet_out_ack_streams (struct lsquic_packet_out *);
void
lsquic_packet_out_zero_pad (struct lsquic_packet_out *);
size_t
lsquic_packet_out_mem_used (const struct lsquic_packet_out *);
int
lsquic_packet_out_turn_on_fin (struct lsquic_packet_out *,
const struct parse_funcs *, const struct lsquic_stream *);
int
lsquic_packet_out_equal_dcids (const struct lsquic_packet_out *,
const struct lsquic_packet_out *);
void
lsquic_packet_out_pad_over (struct lsquic_packet_out *packet_out,
enum quic_ft_bit frame_types);
#endif