One of the most bizarre bugs I ever "fixed"

Started by Blizzard, November 25, 2016, 01:39:10 pm

Previous topic - Next topic

Blizzard

So, today I "fixed" a very bizarre bug.

We're using etcpak, an open source piece of code for conversion of images to the ETC1 format (all other implementations are much slower). It only had libpng so I integrated libjpeg in there as well since we kinda need it (it's especially useful since ETC1 is only RGB anyway). We made a container format called ETCX that uses 2 RGB images where the second (optional) image is the alpha channel and then we merge it all together in the shader. This is because we need texture compression for our newest game, but not a lot of devices support ETC2 yet.

I was batch converting images today and it would keep crashing on JPEG images. After I found the bug and fixed it (a race condition with data that libjpeg uses), it seemed to work fine. Well, except for those how_to_play_X.jpg images (X going from 0 to 9, we have 10 images). And it would keep crashing and crashing.

So I pull out the source again, compile, nothing. No crash. O_o I try the release configuration, no crash. O_o But etcpak.exe it crashes every single time when called from 2 layers of python (the main script calls a utility script as another process which then again calls etcpak for conversion before merging 2 ETC1 images into ETCX).

I start removing the libjpeg code, still keeps crashing. In the end I completely removed every trace of it, it still crashes. Then I removed the fopen() and fclose() calls and it suddenly stops. There is no threading problem here, the file is opened and closed in the main thread. And then I start suspecting the JPEG files themselves. There has to be some nasty corruption going on, but the images seem fine.
I open them in GIMP, just resave, nothing else and it works. It stops crashing. O_o

Long story short: "Corrupted" JPEG files would cause fopen() and fclose() would cause an exe to crash when it's done regardless of whether the exe had any JPEG code for handling or not. Merely accessing the file would cause the exe to be doomed to crash. And I was able to reproduce it consistently on another PC that pulled the JPEG files via SVN client. So it was definitely the files.

O_o
Check out Daygames and our games:

King of Booze 2      King of Booze: Never Ever
Drinking Game for Android      Never have I ever for Android
Drinking Game for iOS      Never have I ever for iOS


Quote from: winkioI do not speak to bricks, either as individuals or in wall form.

Quote from: Barney StinsonWhen I get sad, I stop being sad and be awesome instead. True story.

Ryex

That seems indicative of a problem with how fopen was handled. but the very concept of such a bug existing is hard to comprehend. fopen is implemented both at the kernel level and at the clib level and both implementations are heavily used, tested, and vetted. have been for years. What's more interesting is that the contents of a file have absolutely no baring on the operation of fopen. that knowledge would lead me to believe that is was file system corruption. except, you say this was reproducible after pulling the file from svn on another computer. it's a bit baffling. could it be that file system corruption that existed when the files we're originally commuted caused svn to set some weird state flags that when cheeked out reproduced that file-system corruption? like misaligned file size/start-end position? something like that would cause the kernel to say "hey your not supposed to be reading there!" and halt the process.
I no longer keep up with posts in the forum very well. If you have a question or comment, about my work, or in general I welcome PM's. if you make a post in one of my threads and I don't reply with in a day or two feel free to PM me and point it out to me.<br /><br />DropBox, the best free file syncing service there is.<br />

Blizzard

Check out Daygames and our games:

King of Booze 2      King of Booze: Never Ever
Drinking Game for Android      Never have I ever for Android
Drinking Game for iOS      Never have I ever for iOS


Quote from: winkioI do not speak to bricks, either as individuals or in wall form.

Quote from: Barney StinsonWhen I get sad, I stop being sad and be awesome instead. True story.