I've been trying to come up with a simple and clean implementation for
ReLU. TOSA uses clamp, which is probably the end goal, but that means
TableGen work to make it efficient (attributes, lowering only min or max).
For now, max is a reasonable named op in its own right, beyond just ReLU,
so we can start using it for tiling and fusion, and once that works, we
create a more complete clamp op that doesn't need a whole tensor filled
with zeroes or ones to implement the different activation functions.
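
As a rough sketch (made-up shapes and SSA names, assuming the max named op
lands as proposed), ReLU at the linalg level would be an element-wise max
against a zero-filled tensor:

```mlir
// Materialise the zero tensor that ReLU compares against.
%c0    = arith.constant 0.0 : f32
%empty = tensor.empty() : tensor<4x8xf32>
%zeros = linalg.fill ins(%c0 : f32) outs(%empty : tensor<4x8xf32>) -> tensor<4x8xf32>

// ReLU(%x) = element-wise max(%x, 0).
%init = tensor.empty() : tensor<4x8xf32>
%relu = linalg.max ins(%x, %zeros : tensor<4x8xf32>, tensor<4x8xf32>)
                   outs(%init : tensor<4x8xf32>) -> tensor<4x8xf32>
```

That whole zero-filled tensor is exactly the overhead the future
clamp-with-attributes op would avoid.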
As with other named ops, we start by "requiring" explicit type casts,
broadcasts, and zero-filled constant tensors, leaving that complexity to a
more elaborate pattern-matcher, and we can slowly simplify with attributes
or structured matchers (ex. PDL) in the future.
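
For comparison, this is roughly what the attribute-based form already looks
like in TOSA (generic syntax; attribute names may differ between TOSA
versions):

```mlir
// ReLU expressed as a clamp whose bounds are attributes,
// so no zero-filled tensor is needed.
%relu = "tosa.clamp"(%x) {
          min_fp = 0.0 : f32, max_fp = 3.40282347E+38 : f32,
          min_int = 0 : i64, max_int = 2147483647 : i64
        } : (tensor<4x8xf32>) -> tensor<4x8xf32>
```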