Summary of changes from 5.61 to 5.64
- Added instruction scheduling.
- Various other performance enhancements in low-level code generation.
- Various CSE enhancements.
- Copy-propagation optimisation added to register allocator.
- Now defaults to tuning performance for XScale, while remaining ARMv2
  compatible (i.e. -cpu XScale -arch 2).
- BLX opcode added to inline assembler.
- Enums can now use smaller containers rather than always being int-sized.
- Some hard-coded function size limits increased.
- Various bug fixes.
Extra notes
Scheduling
The scheduler re-orders instructions to tune performance for the chosen CPU.
Regardless of the precise CPU the code is tuned for, any scheduling is
generally preferable to none for all CPUs after the ARM7. It also schedules
for the FPA if an ARM7 is selected.
The compiler presently has detailed knowledge of scheduling for the ARM7,
ARM7M, ARM9, ARM9E, StrongARM and XScale cores. The CPU to optimise for is
selected via the -cpu command-line option; the default is XScale. If
backwards compatibility is required, the -arch option should be used in
conjunction with -cpu.
Scheduling can be disabled on a per-function basis with
#pragma no_optimise_schedule
or on a per-file basis with the command-line option -zpl0.
Scheduling is also disabled if debugging is enabled with -g.
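As a rough illustration of the per-function form, the sketch below places the
pragma ahead of a single function. The placement before the definition is an
assumption, as is the function itself (checksum is a hypothetical example);
these notes do not say whether the pragma stays in effect for later functions
in the same file.

```c
/* Assumption: the pragma is written before the function it should affect.
   Compilers that do not recognise it will normally just ignore it. */
#pragma no_optimise_schedule

/* Hypothetical function whose instruction order we want left alone. */
unsigned int checksum(const unsigned char *data, unsigned int len)
{
    unsigned int sum = 0;
    unsigned int i;

    /* Shift-and-xor over every byte in the buffer. */
    for (i = 0; i < len; i++)
        sum = (sum << 1) ^ data[i];

    return sum;
}
```

If every function in a file should be left unscheduled, the -zpl0 option
described above avoids repeating the pragma.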
Enums
Enums can now be different sizes, depending on their values - they can be
contained in a signed char, unsigned char, signed short, unsigned short,
signed int or unsigned int.
For backwards compatibility, this is not the default: all enums are still
stored as ints unless told otherwise. The command-line option -fy selects
this behaviour explicitly.
To enable variable-sized enums, use the option -ft.
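The following is a minimal sketch of how a container might be chosen from an
enum's smallest and largest values. It is not the compiler's actual rule,
which these notes do not spell out. It assumes 8-bit chars, 16-bit shorts and
32-bit ints, and that the containers are tried in the order listed above; the
names container_t and pick_container are hypothetical.

```c
/* Hypothetical labels for the six containers named in the notes. */
typedef enum {
    CONTAINER_SCHAR,
    CONTAINER_UCHAR,
    CONTAINER_SSHORT,
    CONTAINER_USHORT,
    CONTAINER_SINT,
    CONTAINER_UINT,
    CONTAINER_NONE      /* range does not fit any listed container */
} container_t;

/* Return the first listed container able to hold every value in [lo, hi].
   Assumption: containers are tried in the order the notes list them. */
container_t pick_container(long long lo, long long hi)
{
    if (lo >= -128 && hi <= 127)
        return CONTAINER_SCHAR;
    if (lo >= 0 && hi <= 255)
        return CONTAINER_UCHAR;
    if (lo >= -32768 && hi <= 32767)
        return CONTAINER_SSHORT;
    if (lo >= 0 && hi <= 65535)
        return CONTAINER_USHORT;
    if (lo >= -2147483648LL && hi <= 2147483647LL)
        return CONTAINER_SINT;
    if (lo >= 0 && hi <= 4294967295LL)
        return CONTAINER_UINT;
    return CONTAINER_NONE;
}
```

Under these assumptions an enum with values 0 to 200 would need an unsigned
char, while adding an enumerator of -1 would push it up to a signed short.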
BLX
The BLX opcode is similar to BL, except that it permits calls via a pointer.
The output will be either a BLX instruction or a MOV LR,PC;
MOV/BX PC, pair, depending on whether the selected CPU/architecture
supports BLX and/or BX. Like BL, it takes optional register sets {input
regs},{output regs},{corrupt regs}.
It should not be used for non-pointer calls - BL will be more efficient.
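To make the pointer versus non-pointer distinction concrete, here is the same
contrast at the C level. This is only an analogy: it does not show the inline
assembler syntax, and all function names are hypothetical.

```c
/* Hypothetical functions used only for illustration. */
static int twice(int x)  { return 2 * x; }
static int square(int x) { return x * x; }

/* Direct call: the target is known at compile time, which is the
   case the notes say BL handles more efficiently. */
int call_direct(int x)
{
    return twice(x);
}

/* Call via a pointer: the target is only known at run time, which is
   the case BLX is intended for. */
int call_indirect(int (*fn)(int), int x)
{
    return fn(x);
}

/* Example use of the indirect form. */
int apply_square(int x)
{
    return call_indirect(square, x);
}
```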