[Zlib-devel] [2/8][RFC V3 Patch] Add PPC Altivec Adler32 implementation

Jan Seiffert kaffeemonster at googlemail.com
Sun Apr 24 12:22:21 EDT 2011


This adds a PPC Altivec version of Adler32 to zlib.
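
For readers who do not have the checksum in mind, here is a minimal scalar
sketch of Adler-32 as defined in RFC 1950: two running sums, both reduced
modulo 65521. The name adler32_ref is made up for this illustration. It is
not the code from the patch and not zlib's own adler32(), which defers the
modulo for speed.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u /* largest prime smaller than 65536 */

/* Hypothetical reference routine: byte-at-a-time Adler-32.
 * `adler` is the running checksum; start with 1 for a new stream. */
static uint32_t adler32_ref(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t s1 = adler & 0xffffu;         /* 1 + sum of all bytes   */
    uint32_t s2 = (adler >> 16) & 0xffffu; /* sum of all s1 values   */
    size_t i;

    for (i = 0; i < len; i++) {
        s1 = (s1 + buf[i]) % ADLER_BASE;
        s2 = (s2 + s1) % ADLER_BASE;
    }
    return (s2 << 16) | s1;
}
```

Calling adler32_ref(1, (const unsigned char *)"Wikipedia", 9) yields
0x11E60398.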

It is written with intrinsics as described in the Altivec PIM (Programming
Interface Manual), so it should work with any PPC compiler that supports
Altivec.
But since I have not tested it with other compilers, I restrict it to GCC
for the moment.
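
A restriction like this is usually expressed as a compile-time guard. The
following is only a sketch of the idea, not the guard used in the patch: the
macro HAVE_ADLER32_VEC and the helper adler32_vec_available() are
hypothetical names. __GNUC__ and __ALTIVEC__ are the macros GCC predefines
(the latter when Altivec code generation is enabled, e.g. with -maltivec).

```c
/* Select the vector path only for GCC with Altivec enabled;
 * every other compiler or target falls back to the scalar code. */
#if defined(__GNUC__) && defined(__ALTIVEC__)
#  include <altivec.h>           /* vec_* intrinsics from the PIM */
#  define HAVE_ADLER32_VEC 1     /* hypothetical macro name */
#else
#  define HAVE_ADLER32_VEC 0
#endif

/* Hypothetical helper: reports which path was compiled in. */
static int adler32_vec_available(void)
{
    return HAVE_ADLER32_VEC;
}
```

On a non-PPC host, or without Altivec enabled, this compiles to the scalar
fallback and adler32_vec_available() returns 0.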

This is interesting not only for the PowerPC G4 & G5, but also for the IBM
POWER6 and POWER7, which again feature an Altivec unit (IBM calls it VMX;
POWER7 extends it with VSX).

Here are some numbers from a PPC G4 PowerBook:
-------- orig ------
     a: 0x0CB4B676, 10000 * 160000 bytes  t: 22200 ms
     a: 0x25BEB273, 10000 * 159999 bytes  t: 20000 ms
     a: 0x733CB174, 10000 * 159998 bytes  t: 20800 ms
     a: 0x1144AF76, 10000 * 159996 bytes  t: 21200 ms
     a: 0x3F4ECB8A, 10000 * 159992 bytes  t: 21200 ms
     a: 0x1902A382, 10000 * 159984 bytes  t: 21100 ms
-------- altivec ------
     a: 0x0CB4B676, 10000 * 160000 bytes  t: 3400 ms
     a: 0x25BEB273, 10000 * 159999 bytes  t: 3400 ms
     a: 0x733CB174, 10000 * 159998 bytes  t: 3300 ms
     a: 0x1144AF76, 10000 * 159996 bytes  t: 3400 ms
     a: 0x3F4ECB8A, 10000 * 159992 bytes  t: 3400 ms
     a: 0x1902A382, 10000 * 159984 bytes  t: 3300 ms
speedup: 6.52941
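
The speedup figure appears to be the ratio of the first-row timings
(22200 ms / 3400 ms). A small sketch of that arithmetic, plus a throughput
estimate, follows. The function names are made up for this illustration, and
the throughput assumes that t covers all runs of the given size.

```c
/* Ratio of two timings; a value above 1 means the new code is faster. */
static double speedup(double t_orig_ms, double t_new_ms)
{
    return t_orig_ms / t_new_ms;
}

/* Throughput in MB/s (1 MB = 10^6 bytes) for `runs` passes over a
 * buffer of `bytes` bytes, finished in `t_ms` milliseconds. */
static double throughput_mb_s(double runs, double bytes, double t_ms)
{
    return (runs * bytes) / (t_ms / 1000.0) / 1e6;
}
```

Under that assumption, throughput_mb_s(10000, 160000, 3400) is roughly
470 MB/s for the Altivec run and throughput_mb_s(10000, 160000, 22200) is
roughly 72 MB/s for the original code.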

The raw engine speed is insane.
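
One reason a vector unit can help here: Adler-32 can be reformulated to
consume a whole block of bytes per step. The sketch below is a plain C model
of that idea with 16-byte blocks (the width of one Altivec register). It is
an assumption about the general technique, not an excerpt from the patch, and
the name adler32_blocked is hypothetical. The two inner sums are the parts a
vector unit could compute across all 16 bytes at once.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_BASE 65521u /* Adler-32 modulus */

/* Hypothetical scalar model of block-wise Adler-32.
 * For a block b[0..15]:
 *   s2 += 16 * s1 + sum((16 - i) * b[i])
 *   s1 += sum(b[i])                                         */
static uint32_t adler32_blocked(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t s1 = adler & 0xffffu;
    uint32_t s2 = (adler >> 16) & 0xffffu;

    while (len >= 16) {
        uint32_t sum = 0, wsum = 0;
        unsigned i;

        for (i = 0; i < 16; i++) {
            sum  += buf[i];             /* plain byte sum     */
            wsum += (16u - i) * buf[i]; /* weighted byte sum  */
        }
        s2 = (s2 + 16u * s1 + wsum) % BLOCK_BASE; /* uses the old s1 */
        s1 = (s1 + sum) % BLOCK_BASE;
        buf += 16;
        len -= 16;
    }
    while (len--) {                     /* byte-wise tail */
        s1 = (s1 + *buf++) % BLOCK_BASE;
        s2 = (s2 + s1) % BLOCK_BASE;
    }
    return (s2 << 16) | s1;
}
```

The result is the same as for the byte-at-a-time definition; only the order
of the additions changes.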

Unfortunately, with 16 MB buffers the FSB/memory can't keep up:
-------- orig ------
     a: 0x01A71FA6, 100 * 16000000 bytes  t: 47600 ms
     a: 0x2DEB1BA3, 100 * 15999999 bytes  t: 46100 ms
     a: 0x12481AA4, 100 * 15999998 bytes  t: 47500 ms
     a: 0xDDF018A6, 100 * 15999996 bytes  t: 47600 ms
     a: 0xC43634BA, 100 * 15999992 bytes  t: 47500 ms
     a: 0xF7D70CB2, 100 * 15999984 bytes  t: 47500 ms
-------- altivec ------
     a: 0x01A71FA6, 100 * 16000000 bytes  t: 36000 ms
     a: 0x2DEB1BA3, 100 * 15999999 bytes  t: 36000 ms
     a: 0x12481AA4, 100 * 15999998 bytes  t: 36000 ms
     a: 0xDDF018A6, 100 * 15999996 bytes  t: 36000 ms
     a: 0xC43634BA, 100 * 15999992 bytes  t: 35900 ms
     a: 0xF7D70CB2, 100 * 15999984 bytes  t: 36000 ms
speedup: 1.3222

Still, we can squeeze some cycles out of it.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 02-ppc_altivec.patch
Type: text/x-patch
Size: 10627 bytes
Desc: not available
URL: <http://madler.net/pipermail/zlib-devel_madler.net/attachments/20110424/129e561b/attachment.bin>

