IMO this model is simpler to understand (borrowed from the LR0 patch D127357).
It also makes error recovery easier to implement, as we have a simple list of
head nodes lying around to recover from when needed.
(It's not quite as nice as LR0 in this respect though).
It's slightly slower (2.24 -> 2.12 MB/s on my machine, about 5%) but nowhere
near as bad as LR0.
However:
- I think we'd have to eat a little performance loss anyway to implement error recovery.
- this frees up some complexity budget for optimizations like fastpath push/pop (this + fastpath is already faster than head)
- I haven't changed the data structure here and it's now pretty dumb, so we can make it faster
nit: change NewHeads to a pointer? From the caller's side, glrShift(OldHeads, ..., &NewHeads) makes it clearer that NewHeads is the output of the function.
I think it would be clearer still if glrShift just returned NewHeads, but I understand we want to avoid a temporary object for performance reasons.
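
To make the nit concrete, here is a rough sketch of the three shapes being compared. Everything in it is a hypothetical stand-in: Node, CanShift, shiftByRef, shiftByPtr and shiftByValue are not the types or functions in this patch, and the "shift" is a toy filter rather than the real GLR shift.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a GSS node; not the real type from the patch.
struct Node {
  int State;
  bool CanShift; // toy flag: whether this head survives the "shift"
};

using Heads = std::vector<const Node *>;

// Shape 1: reference out-parameter. At the call site the output argument
// looks exactly like an input: shiftByRef(Old, New).
void shiftByRef(const Heads &OldHeads, Heads &NewHeads) {
  NewHeads.clear(); // keeps the existing capacity, so the buffer is reused
  for (const Node *N : OldHeads)
    if (N->CanShift)
      NewHeads.push_back(N);
}

// Shape 2: pointer out-parameter. The '&' at the call site marks the
// output: shiftByPtr(Old, &New).
void shiftByPtr(const Heads &OldHeads, Heads *NewHeads) {
  assert(NewHeads && "output vector must be non-null");
  NewHeads->clear();
  for (const Node *N : OldHeads)
    if (N->CanShift)
      NewHeads->push_back(N);
}

// Shape 3: return by value. Clearest to read, but each call builds a
// fresh vector, so the caller cannot reuse an existing buffer.
Heads shiftByValue(const Heads &OldHeads) {
  Heads NewHeads;
  for (const Node *N : OldHeads)
    if (N->CanShift)
      NewHeads.push_back(N);
  return NewHeads;
}
```

All three produce the same heads; the difference is only how obvious the output is at the call site and whether the caller can keep reusing one vector across calls.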