Hongyu Wang 70bccee5fa [SPR] Enable small loop unrolling for O2
Modern processors has multiple way instruction decoders
For x86, icelake/zen3 has 5 uops, so for small loop with <= 4
instructions (usually has 3 uops with a cmp/jmp pair that can be
macro-fused), the decoder would have 2 uops bubble for each iteration
and the pipeline could not be fully utilized.

Therefore, this patch enables loop unrolling for small size loop at O2
to fullfill the decoder as much as possible. It turns on rtl loop
unrolling when targetm.loop_unroll_adjust exists and O2 plus speed only.
In x86 backend the default behavior is to unroll small loops with less
than 4 insns by 1 time.

This improves 548.exchange2 by 9% on icelake and 7.4% on zen3 with
0.9% codesize increment. For other benchmarks the variants are minor
and overall codesize increased by 0.2%.

The kernel image size increased by 0.06%, and no impact on eembc.

gcc/ChangeLog:

	* common/config/i386/i386-common.cc (ix86_optimization_table):
	Enable small loop unroll at O2 by default.
	* config/i386/i386.cc (ix86_loop_unroll_adjust): Adjust unroll
	factor if -munroll-only-small-loops enabled and -funroll-loops/
	-funroll-all-loops are disabled.
	* config/i386/i386.h (struct processor_costs): Add 2 field
	small_unroll_ninsns and small_unroll_factor.
	* config/i386/i386.opt: Add -munroll-only-small-loops.
	* doc/invoke.texi: Document -munroll-only-small-loops.
	* loop-init.cc (pass_rtl_unroll_loops::gate): Enable rtl
	loop unrolling for -O2-speed and above if target hook
	loop_unroll_adjust exists.
	(pass_rtl_unroll_loops::execute): Set UAP_UNROLL flag
	when target hook loop_unroll_adjust exists.
	* config/i386/x86-tune-costs.h: Update all processor costs
	with small_unroll_ninsns = 4 and small_unroll_factor = 2.

gcc/testsuite/ChangeLog:

	# Please ety/loop-1.c: Add additional option
	-mno-unroll-only-small-loops.
	* gcc.target/i386/pr86270.c: Add -mno-unroll-only-small-loops.
	* gcc.target/i386/pr93002.c: Likewise.

url:https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=071e428c24ee8c1ed062597a093708bba29509c9
2022-11-30 09:16:59 +08:00
2022-11-02 10:28:23 +08:00
2022-08-01 14:28:55 +00:00
2022-08-01 14:28:55 +00:00

gcc-12

Description

gcc multi-version toolset for openEuler

Software Architecture

Software architecture description

Installation

  1. xxxx
  2. xxxx
  3. xxxx

Instructions

  1. xxxx
  2. xxxx
  3. xxxx

Contribution

  1. Fork the repository
  2. Create Feat_xxx branch
  3. Commit your code
  4. Create Pull Request

Gitee Feature

  1. You can use Readme_XXX.md to support different languages, such as Readme_en.md, Readme_zh.md
  2. Gitee blog blog.gitee.com
  3. Explore open source project https://gitee.com/explore
  4. The most valuable open source project GVP
  5. The manual of Gitee https://gitee.com/help
  6. The most popular members https://gitee.com/gitee-stars/
Description
No description provided
Readme 245 MiB
Languages
Diff 100%