Fighting DocShell and DOMWindow leaks

In my post Leak hunting in browser-chrome mochitests I wrote about the measures we were considering to prevent regressing efforts to get rid of leaks in Firefox. Now that bug 683953 has landed we finally have a way to detect the leakage of whole DocShells and DOMWindows for the lifetime of the browser when running the browser-chrome mochitest suite.

How does it work?

While our browser-chrome mochitest suite runs we parse stdout to track starting and ending tests as well as the creation and removal of DocShells and DOMWindows. Just before the test suite shuts down we schedule a precise GC and wait until it’s completed. Any DOMWindows and DocShells still active are now counted as leaks and assigned to the tests that created them. Additionally we collect the URLs of DOMWindows to help debugging a bit.

How does this prevent new leaks?

We implemented a threshold of (currently) 130 leaks that must not be exceeded. If a test run leaks more than the limit we configured it goes orange and the patch should be backed out from the tree. These are the current numbers:

Linux (64): 116 (116) leaks
OS X (64): 79 (89) leaks
Windows (XP): 120 (118) leaks

Additionally, I filed bug 730797 to integrate these leaks statistics into our Talos infrastructure. So the leak count for each push will be recorded and compared to previous runs to make sure the numbers don’t regress. As the leak numbers differ quite heavily between OSes it makes sense to apply a custom threshold per OS, this will be implemented in bug 730800.

Why is there even a threshold?

First, there are DocShells and DOMWindows that are intentionally kept alive until the browser closes. Second, it’s nearly impossible to bring all these leaks down to “zero” at once. It’s a list of bugs that have to be addressed and we will slowly decrease the threshold to approach “zaroo”.

Thanks to Dão who has been doing great work in bug 658738 discovering all those leaks manually, which in the first place gave me the idea of automating it.