When supported, compressed instructions (RVC) relax code address alignment from 4 to 2 bytes. They also make it possible to intermix compressed and non-compressed instructions.

Does it make any sense, in terms of transfer performance from RAM to the i-cache (instruction cache), to lay out the code so that runs of consecutive RVC instructions always have an even length?

Or is it totally irrelevant? Any reference?

1 Answer

Branch targets should be aligned, but otherwise there is no difference generally.

From the FU740-C000 manual (the SoC in the SiFive Unmatched):

3.2.5 Instruction Fetch Unit

The S7 instruction fetch unit is responsible for keeping the pipeline fed with instructions from memory. The instruction fetch unit delivers up to 8 bytes of instructions per clock cycle to support superscalar instruction execution. Fetches are always word-aligned and there is a one-cycle penalty for branching to a 32-bit instruction that is not word-aligned.

The S7 implements the standard Compressed (C) extension to the RISC‑V architecture, which allows for 16-bit RISC‑V instructions. As four 16-bit instructions can be fetched per cycle, the instruction fetch unit can be idle when executing programs comprised mostly of compressed 16-bit instructions. This reduces memory accesses and power consumption.

(...)

3.2.6 Branch Prediction

(...)

The BHT is a correlating predictor that supports long branch histories. The BTB has one-cycle latency, so that correctly predicted branches and direct jumps result in no penalty, provided the target is 8-byte aligned.

It is thus advisable to align branch targets to 8 bytes, or at least to align 32-bit instructions that are branch targets to 4 bytes. The manual mentions no penalty for unaligned 32-bit instructions within a sequential instruction stream, so most likely there is none.
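To make the two quoted rules concrete, here is a minimal sketch of a toy checker. It only encodes the two sentences quoted from the manual and is not a model of the actual S7 pipeline; the function name `branch_target_notes` and its message strings are made up for illustration.

```python
def branch_target_notes(addr: int, insn_bytes: int) -> list[str]:
    """Toy check of a branch target against the two S7 rules quoted above.

    addr       -- byte address of the branch target
    insn_bytes -- size of the instruction at the target: 2 (RVC) or 4
    Returns a list of notes; an empty list means neither rule is triggered.
    """
    if insn_bytes not in (2, 4):
        raise ValueError("instruction size must be 2 or 4 bytes")
    if addr % 2 != 0:
        raise ValueError("with RVC, instructions must be 2-byte aligned")

    notes = []
    # Rule from 3.2.5: branching to a 32-bit instruction that is not
    # word-aligned costs one cycle.
    if insn_bytes == 4 and addr % 4 != 0:
        notes.append("32-bit target not word-aligned: one-cycle fetch penalty")
    # Rule from 3.2.6: the no-penalty statement for predicted branches is
    # conditioned on the target being 8-byte aligned.
    if addr % 8 != 0:
        notes.append("target not 8-byte aligned: BTB no-penalty condition not met")
    return notes


if __name__ == "__main__":
    for addr, size in [(0x1000, 4), (0x1004, 4), (0x1002, 4), (0x1002, 2)]:
        print(hex(addr), size, branch_target_notes(addr, size))
```

In real code you would not check this by hand but ask the toolchain to pad, for example with the GNU assembler directive `.balign 8` in front of a hot label.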

3 Comments

@EnzoR I have reverted your edit as it is incorrect. The number of bytes fetched from RAM to instruction cache is usually one cache line, which is in the ballpark of 64 or 128 bytes. It makes no sense to align to that size usually.
"Branch targets should be aligned": to 16 bits for RVC and 32 bits for RV?
I explain this in detail in the last paragraph. Read again.
