This avoids temporaries and memcpy calls when computing large expressions.
It is essentially a poor man's expression template, but a single generic apply
call seems easier to maintain than the full expression template machinery.
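
For readers who have not seen the pattern, here is a minimal sketch of what a single generic apply call can look like. It is not the code under review: `apply_into` and `all_have_size` are hypothetical names, the element type is fixed to `double`, and the sketch assumes all operands have the same length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Base case: no operands left to check.
inline bool all_have_size(std::size_t) { return true; }

// True when every operand has exactly n elements.
template <class V, class... Vs>
bool all_have_size(std::size_t n, V const &v, Vs const &...vs)
{
  return v.size() == n && all_have_size(n, vs...);
}

// Hypothetical apply_into: evaluates op element by element and writes each
// result directly into out, so no intermediate vector is allocated or copied.
template <class Op, class... Vs>
void apply_into(std::vector<double> &out, Op op, Vs const &...vs)
{
  assert(all_have_size(out.size(), vs...)); // assumption: equal lengths
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = op(vs[i]...);
}
```

With overloaded operators that return vectors, `a * b + c` would first materialise `a * b` as a temporary. Here the whole expression is passed as one callable, for example `apply_into(r, [](double x, double y, double z) { return x * y + z; }, a, b, c);`, and evaluated in a single loop.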
Shouldn't this be max_element rather than min_element?
(Alternatively, could we change this class so that operations on differently-sized vectors are illegal? The current behavior is hard to reason about.)
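
A rough sketch of the stricter option, assuming the call in question picks the result length from the operand sizes (that is a guess from the comment, not something confirmed against the class). `checked_size` is a hypothetical helper: instead of choosing between the smallest and largest size, it rejects any mismatch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns the common length of all operands and throws
// if they differ, so neither min_element nor max_element has to be chosen.
std::size_t checked_size(std::vector<std::size_t> const &sizes)
{
  assert(!sizes.empty()); // precondition: at least one operand
  auto mm = std::minmax_element(sizes.begin(), sizes.end());
  if (*mm.first != *mm.second)
    throw std::invalid_argument("operands have different sizes");
  return *mm.first;
}
```

For operand sizes {3, 5}, min_element would give 3 (silently ignoring the tail of the longer vector) and max_element would give 5 (requiring some rule for the missing elements). The check above makes that case an error instead.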