Test case and fix for lsquic rechist list problem (#410)

* Test case and debug for lsquic rechist list problem

Add:
- test case
- debug rechist print function (not called)
Modify:
- rechist_test_sanity to be more robust

Test case:
Added test case which produces a corrupt rechist.  The rechist
grows and then is emptied, then a rechist is produced that is nominally
empty (rh_n_used == 0) but has a circular list in rh_elems.  E.g.  (using
debug print at the problem site):

lsquic_rechist_received: insert before done
0x7ffe65ce75a0: cutoff 9 l. acked 12087061905995 masks 1 alloced 4 used 1 max ranges 0 head 0
  (0)[9-9] (0)[9-9] (0)[9-9] (0)[9-9] (0)[9-9] (0)[9-9] (0)[9-9]
test_rechist: /tmp/lsquic/src/liblsquic/lsquic_rechist.c:272: rechist_test_sanity: Assertion `rechist->rh_n_used == n_elems' failed.

Debug print:
Added 'rechist_dump' for ease of debugging, marked as unused.
Useful for calling from gdb or adding to code temporarily.

rechist_test_sanity:
Make it more robust to bad rechists by limiting
how many list items it will follow, to prevent infinite loops or walking off
the end of the list.

* rechist: Fix bug in lsquic_rechist_received causing circular list to be created

Fix bug in lsquic_rechist_received demonstrated by the test6 test-case.

When the list is empty, i.e.  rh_n_used == 0, the 'first_elem' path should
be used.  However, the test for this was using rh_n_alloced, which would
cause the code to continue on incorrectly.  One possibility is that it goes
to insert_before and creates a circular list in the rechist with the head at
index 0 having a next of 0.

This causes history to be lost I think.

The list at this point is still 'empty', however a following call could go
to the insert_before path and then insert an elem pointing to the circular
list, and bump up rh_n_used. Now you have a circular list.

Weird things can happen now.  Notably, ACK generation will exhaust the
packet buffer and generate an error, and so cause connections to be
prematurely aborted.

Fix lsquic_rechist_received to use rh_n_used. Also modify
rechist_alloc_elem to NULL the next pointer for robustness.

It would be nice for rechist_free_elem to also invalidate the elem, but it
is meant for relinking elems in the list and should preserve the next
pointer.
This commit is contained in:
Paul Jakma 2022-08-26 14:15:35 +01:00 committed by GitHub
parent 0aa00f2369
commit 188c055a9c
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
3 changed files with 92 additions and 6 deletions

View file

@ -195,6 +195,44 @@ test5 (void)
lsquic_rechist_cleanup(&rechist);
}
/* Regression test for bug where rechist ends up empty (rh_used == 0)
* with invalid head entry with a self-referential next, and
* lsquic_rechist_received would follow it because it didn't check rh_used.
*/
static void
test6 (void)
{
lsquic_rechist_t rechist;
char buf[256];
long int time = 12087061905875;
lsquic_rechist_init(&rechist, 0, 0);
for (int i = 0; i <= 3; i++)
lsquic_rechist_received(&rechist, i, (time += (i*10)));
lsquic_rechist_stop_wait(&rechist, 2);
lsquic_rechist_received(&rechist, 4, (time += 10));
lsquic_rechist_stop_wait(&rechist, 3);
lsquic_rechist_received(&rechist, 5, (time += 10));
lsquic_rechist_stop_wait(&rechist, 3);
lsquic_rechist_received(&rechist, 6, (time += 10));
lsquic_rechist_stop_wait(&rechist, 9);
lsquic_rechist_received(&rechist, 7, (time += 10));
lsquic_rechist_received(&rechist, 8, (time += 10));
lsquic_rechist_received(&rechist, 9, (time += 10));
rechist2str(&rechist, buf, sizeof(buf));
lsquic_rechist_cleanup(&rechist);
}
static void
test_rand_sequence (unsigned seed, unsigned max)
@ -393,6 +431,8 @@ main (void)
test4();
test5();
test6();
for (i = 0; i < 10; ++i)
test_rand_sequence(i, 0);