This change adds constant-value detection to the assign() methods, which optimizes the assignment of string values that are known at compile time.
The compiler resolves the detection check at compile time and keeps only the constant or the non-constant code path. The assign() methods are removed from the external template for the unstable ABI.
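As a rough illustration of the dispatch described above, here is a minimal sketch using a toy string type. It assumes a GCC or Clang toolchain (for `__builtin_constant_p`); `TinyString`, `kInlineCap`, `assign_short` and `assign_generic` are hypothetical names chosen for this example and are not the libc++ implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical toy type for illustration only; this is not libc++'s basic_string.
class TinyString {
public:
    // Assumed inline (SSO) capacity, including the terminating NUL.
    static constexpr std::size_t kInlineCap = 23;

    // Inline entry point. When the compiler can prove the length is a
    // constant, the check below folds away and only one path remains.
    void assign(const char* s) {
#if defined(__GNUC__) || defined(__clang__)
        if (__builtin_constant_p(std::strlen(s)) && std::strlen(s) < kInlineCap) {
            assign_short(s, std::strlen(s));  // constant, fits inline
            return;
        }
#endif
        assign_generic(s, std::strlen(s));  // non-constant (or large) value
    }

    std::string str() const {
        return is_inline_ ? std::string(inline_buf_, size_) : heap_;
    }
    bool is_inline() const { return is_inline_; }

private:
    // Fast path: copy into the inline buffer.
    void assign_short(const char* s, std::size_t n) {
        assert(n < kInlineCap);
        std::memcpy(inline_buf_, s, n);
        inline_buf_[n] = '\0';
        size_ = n;
        is_inline_ = true;
    }

    // General path: stands in for the single external implementation.
    void assign_generic(const char* s, std::size_t n) {
        if (n < kInlineCap) {
            assign_short(s, n);
            return;
        }
        heap_.assign(s, n);
        size_ = n;
        is_inline_ = false;
    }

    char inline_buf_[kInlineCap] = {};
    std::string heap_;  // stand-in for heap storage
    std::size_t size_ = 0;
    bool is_inline_ = true;
};
```

Both paths produce the same result; the constant check only affects which code the optimizer keeps at a given call site, so behavior does not depend on whether the builtin folds.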
All assignment calls now fall back to the __assign_inlined() implementation, which allows future changes where we do not call into 'grow_by_and_replace' but instead specialize for SSO --> non-SSO and non-SSO --> non-SSO growth.
Note that this change does NOT take an early branch in the assign() methods. We may consider adding early branches; however, they turn the compiler's inlining heuristics against inlining the code, even though the constant optimization should be clear-cut. (This is likely only an issue for non-FDO builds, but for now we use a single external implementation.)
Benchmark STABLE / V1:
--------------------------------------------------------------------------
Benchmark                                      Time        CPU  Iterations
--------------------------------------------------------------------------
BM_StringAssignAsciiz_Empty_Opaque          6.61 ns    6.62 ns   105451520
BM_StringAssignAsciiz_Empty_Transparent     6.43 ns    6.43 ns   108793856
BM_StringAssignAsciiz_Small_Opaque          8.61 ns    8.61 ns    81317888
BM_StringAssignAsciiz_Small_Transparent     8.33 ns    8.33 ns    84029440
BM_StringAssignAsciiz_Large_Opaque          19.7 ns    19.7 ns    35692544
BM_StringAssignAsciiz_Large_Transparent     19.2 ns    19.2 ns    36343808
BM_StringAssignAsciiz_Huge_Opaque           1696 ns    1696 ns      401408
BM_StringAssignAsciiz_Huge_Transparent      1743 ns    1743 ns      397312
BM_StringAssignAsciizMix_Opaque             11.8 ns    11.8 ns    59224064
BM_StringAssignAsciizMix_Transparent        11.9 ns    11.9 ns    58802176
Benchmark UNSTABLE / V2 without this change:
--------------------------------------------------------------------------
Benchmark                                      Time        CPU  Iterations
--------------------------------------------------------------------------
BM_StringAssignAsciiz_Empty_Opaque          6.24 ns    6.25 ns   109424640
BM_StringAssignAsciiz_Empty_Transparent     5.92 ns    5.93 ns   118013952
BM_StringAssignAsciiz_Small_Opaque          7.60 ns    7.60 ns    92065792
BM_StringAssignAsciiz_Small_Transparent     7.80 ns    7.81 ns    89632768
BM_StringAssignAsciiz_Large_Opaque          19.9 ns    19.9 ns    35254272
BM_StringAssignAsciiz_Large_Transparent     19.1 ns    19.1 ns    36786176
BM_StringAssignAsciiz_Huge_Opaque           1717 ns    1717 ns      389120
BM_StringAssignAsciiz_Huge_Transparent      1723 ns    1722 ns      397312
BM_StringAssignAsciizMix_Opaque             11.2 ns    11.2 ns    62922752
BM_StringAssignAsciizMix_Transparent        11.1 ns    11.1 ns    62586880
Benchmark UNSTABLE / V2 with this change:
--------------------------------------------------------------------------
Benchmark                                      Time        CPU  Iterations
--------------------------------------------------------------------------
BM_StringAssignAsciiz_Empty_Opaque          4.92 ns    4.93 ns   136773632
BM_StringAssignAsciiz_Empty_Transparent    0.834 ns   0.835 ns   837238784
BM_StringAssignAsciiz_Small_Opaque          7.02 ns    7.03 ns    99655680
BM_StringAssignAsciiz_Small_Transparent     1.10 ns    1.10 ns   634368000
BM_StringAssignAsciiz_Large_Opaque          18.3 ns    18.3 ns    38576128
BM_StringAssignAsciiz_Large_Transparent     14.3 ns    14.3 ns    48939008
BM_StringAssignAsciiz_Huge_Opaque           1716 ns    1716 ns      393216
BM_StringAssignAsciiz_Huge_Transparent      1688 ns    1688 ns      409600
BM_StringAssignAsciizMix_Opaque             9.96 ns    9.95 ns    70467584
BM_StringAssignAsciizMix_Transparent        4.34 ns    4.34 ns   164167680
Nitpick: and__assign_maybe_alias is missing a space