Tuesday, 29 May 2012

That's how to fix a bug?

A friend of mine ran into a snag with some code using the MySQL client libraries and asked me for help. valgrind reports a bad memory access deep under mysql_real_connect. The odd bit is that the code worked on older machines, and even moving the binary from an older machine to a new one worked. Other programs also worked, even though they use the same library and the same function.

So, I explained SEP fields, and LMGTFY, and he went off to investigate. I am impressed: he found what was different. The compiler arguments!

The difference was that mysql_config appears to include -rdynamic. Without this, the code works. That alone is a tad odd, and worthy of more investigation.

However he also found a bug report http://bugs.mysql.com/bug.php?id=39175

And that got me puzzled. Do have a look at it.

The bug report is from 2008, but this seems to be a recent problem. However, Oracle (or was it not them back then?) confirmed the report within 40 minutes. Yay!

Over a year later (yes, a year) they say that to use his contribution they need the original reporter to sign a contribution form. A month after that they suspend the bug report.

I thought "What contribution?", and then I realised the report does indeed include a "suggested fix" changing @LDFLAGS@ to @SAVE_LDFLAGS@ to reverse some previous change.

It is clear from the report (which says that between one version and another they changed from @SAVE_LDFLAGS@ to @LDFLAGS@, and that this was wrong) what the fix should be, but I do have to wonder... If the original reporter had left out the helpful "suggested fix" part and just made the report, would that then not have counted as a "contribution", so that the bug actually got fixed?

It sounds like sending a "suggested fix" is actually less helpful than not!

Way to go Oracle - classy bug report handling.

Now to work out how we work around this cleanly and why -rdynamic is breaking things!


  1. The obvious thing to check for with -rdynamic is unintended symbol aliasing - it's most likely that you have defined a symbol in your program that's also defined in and used by a shared object you're loading, and with -rdynamic on your code, the ELF linking rules are causing the shared object to pick up your definition of the symbol, not the shared object's definition.

    1. Ah, interesting - I could not figure out how it could break something.

    2. Is there a simple way to locate such symbol clashes? We could "fix" this by simply changing the name of something in our code.

    3. I've never found a simple way to detect it - I've always used the LD_VERBOSE and LD_DEBUG environment variables to get the dynamic linker to tell me what's going on. Look for differences caused by -rdynamic, and you'll find the problem symbol.

  2. We used to get that with HP-UX, which does that kind of thing by default with its shared libraries. Using the gcc visibility stuff cured 99% of it.

    Of course you probably can't recompile mysql with visibility enabled (unless it already has the requisite macros to export its public API), so it may not help.

    BTW, Digium do the same stuff with Asterisk. It's all about retaining control. You have to sign away copyright so they can take the code at any point and produce a closed source version, whilst using the GPL to stop anyone else doing the same (which is why they didn't use BSD, for example). Doesn't surprise me that Oracle would extend it to single line fixes to a makefile, given their current shenanigans against Google.
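
On the question in the comments about locating such symbol clashes: one rough approach, not something anyone in the thread actually used, is to compare the names the program exports with the names the library defines. The Python sketch below assumes GNU nm is on the PATH and simply runs `nm -D --defined-only` on both files. The name connect_db in the comments is made up for illustration, and the ignore list is only a guess at the linker-generated names that commonly turn up on both sides.

```python
import subprocess
import sys

# Linker-generated names that often appear in both files; this list is an
# assumption and may need extending for a given toolchain.
IGNORE = {"_init", "_fini", "_edata", "_end", "__bss_start"}


def parse_nm(output):
    """Return the set of symbol names found in `nm -D --defined-only` output."""
    names = set()
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank line, file header, or undefined symbol (no address)
        name = fields[-1]
        # Drop any version suffix, e.g. connect_db@@LIB_1.0 -> connect_db
        names.add(name.split("@")[0])
    return names


def exported_symbols(path):
    """Run GNU nm on an executable or shared object and parse the result."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", path],
        capture_output=True, text=True, check=True,
    )
    return parse_nm(result.stdout)


def clashes(program_syms, library_syms):
    """Names defined and exported by both the program and the library."""
    return sorted((program_syms & library_syms) - IGNORE)


if __name__ == "__main__":
    if len(sys.argv) != 3:
        sys.exit("usage: clashes.py PROGRAM LIBRARY.so")
    for name in clashes(exported_symbols(sys.argv[1]),
                        exported_symbols(sys.argv[2])):
        print(name)
```

Running it against the binary built with and without -rdynamic, and against each library it loads, should show which extra names only become visible with -rdynamic. Anything it prints is only a candidate, and it will not notice libraries that are pulled in later via dlopen unless you point it at those too.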