Paweł--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev
Why do we not just use /PDBALTPATH:%_PDB%.%_EXT% to force the PDB path to be relative (no absolute paths at all)?
I thought the issue was more about getting PDBs from builders to testers.
Note that if you are running into ZIP file limits (I think xusydoc@ suggested this was an issue?), you can precompress the PDBs using makecab to generate .PD_ files.
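A minimal sketch of that precompression step, assuming makecab is on PATH and is invoked in its plain `makecab <source> <destination>` form. The helper names (compressed_name, makecab_command) are made up for illustration and are not part of any Chromium script; the naming follows the .PD_ convention mentioned above (last character of the extension replaced by an underscore).

```python
import subprocess
import sys


def compressed_name(pdb_path):
    """Return the compressed file name: the last character becomes '_'."""
    if not pdb_path.lower().endswith(".pdb"):
        raise ValueError("expected a .pdb file: %r" % pdb_path)
    return pdb_path[:-1] + "_"


def makecab_command(pdb_path):
    """Build the argv for `makecab <source> <destination>`."""
    return ["makecab", pdb_path, compressed_name(pdb_path)]


def main(paths):
    # makecab only exists on Windows, so do nothing elsewhere.
    if sys.platform != "win32":
        return
    for path in paths:
        subprocess.check_call(makecab_command(path))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Whether this actually helps with the ZIP limits depends on which limit is being hit (archive size versus per-file size), so treat it as something to measure rather than a known fix.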
FYI, PDB support is currently re-enabled, with the following notes:
1. With fastbuild=1 (which bots use), only the linker generates debug info, not the compiler. This is the minimum needed to get stack trace symbolization to work. See https://codereview.chromium.org/12038100 for the implementation.
2. So far nobody has complained, so if possible I'd like to avoid "commit wars" with people reverting things without notice. I can fix outstanding issues if there are complaints.
3. This is not so much about debugging issues on the try server or on the bots, but about getting meaningful data from, say, chromium-build-logs.appspot.com. If you are fixing a test that was disabled because of a crash, one look at the stack trace is often very helpful in diagnosing the issue, and it can lead to a simple and straightforward fix. We also have stack traces for assertion failures, and it's important to keep them symbolized as well. Compared to that, taking a trybot for debugging is much more work, and doesn't always result in a repro.
4. Performance measurements:
4.1. On "Win Builder" package_build went from 3 minutes to 6 minutes. Total cycle time of the bot is 21 minutes.
4.2. On "Win Builder (dbg)" package_build went from 2 minutes to 3 minutes. Total cycle time is 9 minutes.
4.3. On "Win 7 Tests (1)" extract_build went from 0.5 minutes to 1 minute. Total cycle time is over 30 minutes.
4.4. On "Win 7 Tests (dbg)(1)" extract_build went from 1 minute to 2 minutes. Total cycle time is 30 minutes.
4.5. My conclusion is that this is a fair tradeoff. Build time differences of 1-2 minutes are nothing compared to the time spent on retrying flaky browser tests. Two retries are all it takes to break even (and we often retry more). Having symbolized stack traces will help with fixing flakiness. I guess some people may disagree, but please take into account that disabling important features (symbolized stack traces in this case) for build speed is not necessarily the best idea. It is a bit like using unsafe compiler flags that may produce a broken program just because it runs faster (the analogy is obviously far from perfect).
4.6. This can be switched off on buildbot using the package_pdb_files=False factory property. The cost seems higher on "Win Builder", so I'm fine with, for example, disabling PDBs for Release builds but keeping them for Debug as a middle-ground solution. I can make the necessary changes if needed.
5. I've fixed zip file issues, see https://code.google.com/p/chromium/issues/detail?id=168411 . This will be a benefit to our build infrastructure anyway.
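To put the numbers in 4.1-4.4 side by side, here is a small sketch that computes the added minutes per step and their share of the reported cycle time. It assumes the quoted cycle times already include the PDB cost, and it uses 30 minutes for "Win 7 Tests (1)", where the message only says "over 30", so that percentage is an upper bound. The table layout and the overhead helper are illustrative only.

```python
# bot name -> (step, minutes before, minutes after, total cycle minutes),
# all taken from items 4.1-4.4 above.
MEASUREMENTS = {
    "Win Builder": ("package_build", 3.0, 6.0, 21.0),
    "Win Builder (dbg)": ("package_build", 2.0, 3.0, 9.0),
    "Win 7 Tests (1)": ("extract_build", 0.5, 1.0, 30.0),  # cycle is "over 30"
    "Win 7 Tests (dbg)(1)": ("extract_build", 1.0, 2.0, 30.0),
}


def overhead(before, after, cycle):
    """Return (added minutes, added minutes as a percentage of the cycle)."""
    added = after - before
    return added, 100.0 * added / cycle


if __name__ == "__main__":
    for bot, (step, before, after, cycle) in MEASUREMENTS.items():
        added, pct = overhead(before, after, cycle)
        print("%-22s %-14s +%.1f min (%.1f%% of cycle)" % (bot, step, added, pct))
```

The builders carry most of the cost (roughly a seventh of the "Win Builder" cycle), while the testers see only a few percent, which matches the suggestion in 4.6 that Release builders are where a switch-off would be considered first.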
--
This might be the right conclusion, but it does not seem obvious to me. Symbolization will only help fix flakiness if people are actively looking at the traces.
A few notes here:
- Re-enabling PDB support for try jobs could be done as an optional flag, if someone wants it badly. In practice it wouldn't be very hard to implement. Generating PDBs at all, even if they are not archived, has a significant performance cost, so it needs to be off by default.
- For try jobs, archival is going to be done on isolateserver.appspot.com, as described in the test isolation design doc.
- The PDB's content will need to be independent of the time of day, phase of the moon, etc. Chris forgot to say so, but he wrote a tool to do it. It's just not wired into the Chromium build process yet.
- Then it'll be possible to archive the PDB on isolateserver alongside the executable. This is separate from a standard symbol server, but it is better: the 7-day caching is done properly and no maintenance or cleanup overhead is needed.
- My goal is to kill zip files, but it's not done yet so sorry for the maintenance crap window still left.
Note that "I don't have a workstation <OS>" is a kinda lame reason for debugging an issue on the Try Server. Get over it and get one. The only good reason is "I can't reproduce locally", and this is usually (but not always) a side effect of a race condition because your workstation is too fast. Then the answer is http://go/chrometryserver anyway: take a try slave off the network.
Note that I'd be fine with a fastbuild=0,1,2 where 2 means no symbols at all. This could be set dynamically depending on the testfilter.