It's all fun and games until you have to figure out if the endianness bug is in your code or in QEMU's s390x emulation.
rurban
Haven't found any bug in QEMU's s390x, but lots in endian code.
AKSF_Ackermann
> When programming, it is still important to write code that runs correctly on systems with either byte order
What you should do instead is write all your code so it is little-endian only, as the only relevant big-endian architecture is s390x, and if someone wants to run your code on s390x, they can afford a support contract.
bear8642
> the only relevant big-endian architecture is s390x
The adjacent POWER architecture is also still relevant - but as you say, they too can afford a support contract.
EPWN3D
I mostly agree, but network byte ordering is still a thing.
j16sdiz
When it comes to low-level network protocols (e.g. writing a TCP stack), the "network byte order" is always big-endian.
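To make "network byte order" concrete, here is a minimal Python sketch that packs and unpacks the two 16-bit port fields at the start of a TCP header. The helper names and port numbers are made up for illustration; the only real API used is `struct` with its `!` (network, big-endian) prefix.

```python
import struct

def pack_ports(src_port: int, dst_port: int) -> bytes:
    """Encode two 16-bit ports in network byte order (big-endian)."""
    # "!" selects network byte order regardless of the host CPU.
    return struct.pack("!HH", src_port, dst_port)

def unpack_ports(header: bytes) -> tuple:
    """Decode the first four bytes of a TCP header into (src, dst) ports."""
    return struct.unpack("!HH", header[:4])

wire = pack_ports(443, 8080)
print(wire.hex())          # 01bb1f90: most significant byte of each port first
print(unpack_ports(wire))  # (443, 8080)
```

The same bytes come out on a little-endian or big-endian host, because the format string names the wire order instead of relying on the machine's own.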
nyrikki
The blog post linked in the OP explains this better IMHO [0]:
> If the data stream encodes values with byte order B, then the algorithm to decode the value on computer with byte order C should be about B, not about the relationship between B and C.
One cannot just ignore the big/little data interchange problem: MacOS [1], Java, TCP/IP, JPEG, etc.
The point (for me) is not that your code runs on an s390, it is that you abstract your personal local implementation details from the data interchange formats. And unfortunately almost all processors are little-endian, and many of the popular and unavoidable external formats are big-endian...
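A rough sketch of what "about B, not about the relationship between B and C" looks like in code, written in Python for brevity. The function names are hypothetical. Each decoder is written purely in terms of the stream's byte order and never asks what the host is.

```python
def read_u32_le(data: bytes) -> int:
    """Decode a 32-bit unsigned value from a little-endian byte stream."""
    # Byte 0 is the least significant; shifts express the stream's order only.
    return data[0] | (data[1] << 8) | (data[2] << 16) | (data[3] << 24)

def read_u32_be(data: bytes) -> int:
    """Decode a 32-bit unsigned value from a big-endian byte stream."""
    # Byte 0 is the most significant.
    return (data[0] << 24) | (data[1] << 16) | (data[2] << 8) | data[3]

print(hex(read_u32_le(b"\x78\x56\x34\x12")))  # 0x12345678
print(hex(read_u32_be(b"\x12\x34\x56\x78")))  # 0x12345678
```

There is no host-order check and no conditional byte swap anywhere, so there is nothing that behaves differently on a big-endian machine. The same shift-and-or pattern carries over directly to C or Go.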
There's still at least one relevant big-endian-only ARM chip out there, the TI Hercules. While in the past five or ten years we've gone from having very few options for lockstep microcontrollers (with the Hercules being a very compelling option) to being spoiled for choice, the Hercules is still a good fit for some applications, and is a pretty solid chip.
GandalfHN
Outsourcing endianness pain to your customers is an easy way to teach them about segfaults and silent data corruption. s390x is niche, endian bugs are not.
Network protocols and file formats still need a defined byte order, and the first time your code talks to hardware or reads old data, little-endian assumptions leak all over the place. Ignoring portability buys you a pile of vendor-specific hacks later, because your team will meet those 'irrelevant' platforms in appliances, embedded boxes, or somebody else's DB import path long before a sales rep waves a support contract at you.
jcalvinowens
Don't ignore endianness. But making little endian the default is the right thing to do: it is so much more ubiquitous in the modern world.
The vast majority of modern network protocols use little endian byte ordering. Most Linux filesystems use little endian for their on-disk binary representations.
There is absolutely no good reason for networking protocols to be defined to use big endian. It's an antiquated arbitrary idea: just do what makes sense.
electroly
> When programming, it is still important to write code that runs correctly on systems with either byte order
I contend it's almost never important and almost nobody writing user software should bother with this. Certainly, people who didn't already know they needed big-endian should not start caring now because they read an article online. There are countless rare machines that your code doesn't run on--what's so special about big endian? The world is little endian now. Big endian chips aren't coming back. You are spending your own time on an effort that will never pay off. If big endian is really needed, IBM will pay you to write the s390x port and they will provide the machine.
Retr0id
> There are countless rare machines that your code doesn't run on--what's so special about big endian?
One difference is that when your endian-oblivious code runs on a BE system, it can be subtly wrong in a way that's hard to diagnose, which is a whole lot worse than not working at all.
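A small simulation of that failure mode, in Python. Python cannot switch the host's byte order, so the `host_order` parameter below is a stand-in for "whatever the CPU does when you copy an integer's raw bytes to disk"; the function names are made up for the example.

```python
import struct

def save_native(value: int, host_order: str) -> bytes:
    """Stand-in for dumping a uint32's raw bytes: they follow the host's order.

    host_order is "<" for a little-endian host and ">" for a big-endian one.
    """
    return struct.pack(host_order + "I", value)

def load_native(blob: bytes, host_order: str) -> int:
    """Stand-in for reading raw bytes back into a uint32 on a given host."""
    return struct.unpack(host_order + "I", blob)[0]

LE, BE = "<", ">"

# Same host writes and reads: the bug is invisible on either machine.
assert load_native(save_native(1, LE), LE) == 1
assert load_native(save_native(1, BE), BE) == 1

# File written on a little-endian host, then read on a big-endian one:
# no crash, just a plausible-looking wrong number.
print(load_native(save_native(1, LE), BE))  # 16777216 (0x01000000)
```

Nothing fails loudly here. A test suite that only round-trips data on one machine passes on both architectures, and the corruption only shows up when data crosses between them.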
CJefferson
You are correct. Honestly, I couldn't disagree more with the article. At this point I can't imagine why it's important.
It's also increasingly hard to test, particularly when you have large, expensive test suites which run incredibly slowly on these simulated machines.
Is there any benefit in edge cases to using big-endian these days?
bluGill
What I really want is memory order emulation. x86 has strong memory order guarantees; ARM has much weaker guarantees. This means the multi-threaded queue I'm working on works all the time on my development x86 machine even if I forget to put in the correct memory-order semantics, but it might or might not work on ARM (which is what most of my users have). (I am in the habit of running all my stress tests 1000 times before I'm willing to send them out, but that doesn't mean the code is correct; it means it works on x86 and passed my review, which might miss something.)
newpavlov
For Rust we have Loom [0], but do not expect it to work on your whole application.
On Linux it's really as simple as installing QEMU binfmt and doing:
GOARCH=s390x go test
susam
I wrote a similar post [1] some 16 years ago. My solution back then was to install Debian for PowerPC on QEMU using qemu-system-ppc.
But Hans's post uses user-mode emulation with qemu-mips, which avoids having to set up a whole big-endian system in QEMU. It is a very interesting approach I was unaware of. I'm pretty sure qemu-mips was available back in 2010, but I'm not sure if the gcc-mips-linux-gnu cross-compiler was readily available back then. I suspect my PPC-based solution might have been the only convenient way to solve this problem at the time.
Thanks for sharing it here. It was nice to go down memory lane and also learn a new way to solve the same problem.
[0] https://commandcenter.blogspot.com/2012/04/byte-order-fallac... [1] https://github.com/apple/darwin-xnu/blob/main/EXTERNAL_HEADE...
I did that many years back, but with MIPS and MIPSel: https://youtu.be/BGzJp1ybpHo?si=eY_Br8BalYzKPJMG&t=1130
presented at Embedded Linux Conf
[0]: https://github.com/tokio-rs/loom
If you're using Go on GitHub (and doing stuff where this actually matters) adding this to your CI can be as simple as this: https://github.com/ncruces/wasm2go/blob/v0.3.0/.github/workf...
[1] https://susam.net/big-endian-on-little-endian.html