[Zlib-devel] [8/6][RFC V2.1 Patch] ia64 implementation
Jan Seiffert
kaffeemonster at googlemail.com
Thu Apr 7 17:07:49 EDT 2011
Thanks to Mike Frysinger I could torture some real IA64 HW (and myself too...).
Throw away the first post of this patch; it's broken in several ways.
(Note to self: Don't try to be a cool kid and save on parentheses.)
And even once fixed, it ran at only half the speed of the generic code.
Counting instructions is a bad move on IA64.
So here it is, new and improved, with 400% more unrolling:
on an IA64 (McKinley):
-------- orig ------
a: 0x0CB4B676, 10000 * 160000 bytes t: 1912 ms
a: 0x25BEB273, 10000 * 159999 bytes t: 1916 ms
a: 0x733CB174, 10000 * 159998 bytes t: 1912 ms
a: 0x1144AF76, 10000 * 159996 bytes t: 1916 ms
a: 0x3F4ECB8A, 10000 * 159992 bytes t: 1916 ms
a: 0x1902A382, 10000 * 159984 bytes t: 1912 ms
-------- vec ------
a: 0x0CB4B676, 10000 * 160000 bytes t: 760 ms
a: 0x25BEB273, 10000 * 159999 bytes t: 764 ms
a: 0x733CB174, 10000 * 159998 bytes t: 760 ms
a: 0x1144AF76, 10000 * 159996 bytes t: 760 ms
a: 0x3F4ECB8A, 10000 * 159992 bytes t: 808 ms
a: 0x1902A382, 10000 * 159984 bytes t: 760 ms
speedup: 2.515789
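For reference, the speedup figure is consistent with dividing an "orig" time by a "vec" time: 1912 ms / 760 ms gives about 2.515789. Which rows the benchmark actually divides is not stated, so the C sketch below is only an assumption about how such a number could be derived. `speedup_ratio` is a hypothetical helper, not something from the patch.

```c
#include <assert.h>

/* Hypothetical helper (not from the patch): ratio of the baseline
 * run time to the vectorized run time, both in milliseconds. */
double speedup_ratio(double orig_ms, double vec_ms)
{
    /* A zero or negative time would make the ratio meaningless. */
    assert(orig_ms > 0.0 && vec_ms > 0.0);
    return orig_ms / vec_ms;
}
```

Fed the first "orig" and "vec" rows, `speedup_ratio(1912.0, 760.0)` comes out at roughly 2.5158, in line with the reported value.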
Next stop: Blackfin, then the ARM iWMMXt version for XScale
(N.B.: does anyone have a link to the instruction reference handy?).
Once some time has passed I'll do a complete repost, since there are
small changes here and there.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 08-ia64.patch
Type: text/x-patch
Size: 24101 bytes
Desc: not available
URL: <http://madler.net/pipermail/zlib-devel_madler.net/attachments/20110407/1e3c2243/attachment.bin>