For an example of this, see the Intel Optimization Reference Manual,
section 2.5.2.2 ("Instruction Fetch Unit"):

    An instruction fetch is a 16-byte aligned lookup through the ITLB
    into the instruction cache and instruction prefetch buffers. A hit
    in the instruction cache causes 16 bytes to be delivered to the
    instruction predecoder.

The model here (fetching exactly 16 bytes every cycle) is incomplete:
it assumes that every instruction is aligned and that every fetch hits
in the instruction cache. It is still an improvement.
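
As a rough illustration of the simplified model, the sketch below
computes the cycle in which each instruction has been fully fetched,
given only the encoded instruction sizes. It is not the code from this
change: the function name, the 0-based cycle numbering, and the
assumption that the stream starts on a 16-byte boundary are all
hypothetical choices made for the example.

```python
FETCH_BYTES_PER_CYCLE = 16  # fetch width taken from the quoted manual text


def fetch_cycles(instr_sizes, fetch_width=FETCH_BYTES_PER_CYCLE):
    """Return the 0-based cycle in which each instruction's last byte arrives.

    Assumptions (illustrative only): the stream starts at a fetch-aligned
    address, every fetch hits in the instruction cache, and exactly
    `fetch_width` bytes are delivered per cycle.
    """
    cycles = []
    end_offset = 0  # byte offset just past the current instruction
    for size in instr_sizes:
        if size <= 0:
            raise ValueError("instruction size must be positive")
        end_offset += size
        # The last byte sits at end_offset - 1; integer division gives
        # the index of the 16-byte window that contains it.
        cycles.append((end_offset - 1) // fetch_width)
    return cycles
```

Under these assumptions, three 8-byte instructions need two fetch
cycles: the first two share cycle 0 and the third arrives in cycle 1.
An instruction that straddles a 16-byte boundary is only complete in
the later cycle.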
Instruction fetch is usually not a bottleneck, as stalls elsewhere will
hide it, but it can make a difference in some cases. Further work would
be to classify these cases in the report output: if there are
unutilised issue slots while there is no backend stall, we are frontend
bound.
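
The classification rule could look roughly like the sketch below. This
is only an illustration of the rule as stated, not an existing report
feature: the function names, the label strings, and the per-cycle
inputs (instructions issued, issue width, backend stall flag) are
hypothetical.

```python
from collections import Counter


def classify_cycle(issued, issue_width, backend_stalled):
    """Label one cycle using the rule described in the text.

    All parameters are hypothetical inputs for this sketch:
    issued          -- instructions issued in this cycle
    issue_width     -- maximum instructions issuable per cycle
    backend_stalled -- True if the backend reported a stall this cycle
    """
    if issued >= issue_width:
        return "fully-utilised"
    if backend_stalled:
        return "backend-bound"
    # Unutilised issue slots with no backend stall: the frontend did not
    # supply enough instructions.
    return "frontend-bound"


def summarise(cycles, issue_width):
    """Count labels over an iterable of (issued, backend_stalled) pairs."""
    return Counter(
        classify_cycle(issued, issue_width, stalled)
        for issued, stalled in cycles
    )
```

For example, with an issue width of 4, a cycle that issues 2
instructions without a backend stall would be counted as
frontend-bound, while the same cycle with a backend stall would be
counted as backend-bound.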