This revision generalizes lowering the sparse_tensor.insert op into actual code that directly operates on the memrefs of a sparse storage scheme. The current insertion strategy does *not* rely on a cursor anymore, with introduces some testing overhead for each insertion (but still proportional to the rank, as before). Over time, we can optimize the code generation, but this version enables us to finish the effort to migrate from library to actual codegen.
Things to do:
(1) carefully deal with (un)ordered and (not)unique
(2) omit overhead when not needed
(3) optimize and specialize
(4) try to avoid the pointer "cleanup" (at HasInserts), and make sure the storage scheme is consistent at every insertion point (so that it can "escape" without concerns).
So, the current implementation only works for sorted dimension, right?
Otherwise you will need a loop the check presence here, right?