Add a basic implementation of getGatherScatterOpCost to BasicTTIImpl.
The implementation estimates the cost of scalarizing the loads/stores,
the cost of packing/extracting the individual lanes and the cost of
only selecting enabled lanes.
This more accurately reflects the current cost on targets like AArch64.