On Kunpeng-920, strcmp performance degrades only when the strings
first differ somewhere in characters 16 to 23, or when the string is
only 16 to 23 characters long. This case shows 2 misses per iteration,
which indicates a branch predictor issue.
In this scenario, strcmp performance is 300% worse than expected.
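
The scenario above could be checked with a small timing harness along
the following lines. This is only a sketch, not the benchmark behind
the numbers in this message: the helper names make_pair and
time_strcmp, the buffer size and the fill characters are all
assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define BUF_LEN 64  /* hypothetical buffer size, large enough for the test lengths */

/* Fill a and b with the same `len` characters, then make them differ at
   index `diff_at` (pass diff_at >= len to keep the strings equal).
   Both buffers must hold at least BUF_LEN bytes. */
static void make_pair(char *a, char *b, size_t len, size_t diff_at)
{
    assert(len < BUF_LEN);
    memset(a, 'x', len);
    memset(b, 'x', len);
    a[len] = '\0';
    b[len] = '\0';
    if (diff_at < len)
        b[diff_at] = 'y';
}

/* Return the average wall-clock nanoseconds per strcmp call. */
static double time_strcmp(const char *a, const char *b, long iters)
{
    /* Call through a volatile function pointer so the compiler cannot
       replace strcmp with a builtin or hoist the call out of the loop. */
    int (*volatile cmp)(const char *, const char *) = strcmp;
    struct timespec t0, t1;
    volatile int sink = 0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++)
        sink ^= cmp(a, b);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    (void)sink;

    return ((double)(t1.tv_sec - t0.tv_sec) * 1e9
            + (double)(t1.tv_nsec - t0.tv_nsec)) / (double)iters;
}
```

Calling make_pair with diff_at swept from 0 to 31, and with len swept
over the same range for equal strings, then comparing the time_strcmp
results, should show whether the 16 to 23 range stands out on a given
machine. Timings are noisy, so several runs per point are advisable.
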
Fortunately, this problem can be solved by modifying the alignment of the functions.
Signed-off-by: Yang Yanchao <yangyanchao6@huawei.com>