-Werror is something Benjamin and I have wasted hours discussing, but that's not the point here.-fno-common
-Wall
-Wdeclaration-after-statement
-Wextra
-Wformat=2
-Winit-self
-Winline
-Wpacked
-Wp,-D_FORTIFY_SOURCE=2
-Wpointer-arith
-Wlarger-than-65500
-Wmissing-declarations
-Wmissing-format-attribute
-Wmissing-noreturn
-Wmissing-prototypes
-Wnested-externs
-Wold-style-definition
-Wredundant-decls
-Wsign-compare
-Wstrict-aliasing=2
-Wstrict-prototypes
-Wswitch-enum
-Wundef
-Wunreachable-code
-Wunsafe-loop-optimizations
-Wwrite-strings-Wbad-function-cast
-Werror-implicit-function-declaration
-Wfloat-equal
-Wmissing-field-initializers
-Wmissing-include-dirs
-Wswitch-default-fno-common is not a warning flag strictly speaking, but it's as good as one. -Wp,-D_FORTIFY_SOURCE=2 is not a warning flag either, but it enables some extra checks in glibc headers.man gcc is your friend.-Werror is evil. Warning flags are a very personal thing. I don't like cluttering code I maintain with them.99 files changed, 6317 insertions(+), 14094 deletions(-)
For this assignment I updated my home machine to latest Fedora 10, created a new user, and bootcharted gnome-session in a warm login. I got the chart at right (click to enlarge). Lets see how it looks:To transition to the next phase, the current phase should either complete or time out. A phase is complete when all apps associated with the phase signal completion. An app can signal completion in a variety of ways, the simplest of which being that the process terminates. A phase times out after ten seconds. Description for the phases as well as a more a longer version of this condensed overview is available in gnome-session/gnome-session/README.With that in mind, the two second idle looks a hell lot like a phase timeout. Inspecting
~/.xsession-errors and searching for gnome-session confirms that:gnome-session: Application 'libcanberra-login-sound' failed to register before timeoutOh oh!
/usr/share/gnome/autostart/libcanberra-login-sound.desktop is where this application lives, and it contains these lines:Exec=/usr/bin/canberra-gtk-play --id="desktop-login" --description="GNOME Login"That is, it runs
AutostartCondition=GNOME /desktop/gnome/sound/event_sounds
X-GNOME-Autostart-Phase=Desktop
canberra-gtk-play in the Desktop phase of gnome-session. So what's happening? gnome-session is waiting for canberra-gtk-play to finish playing login sound, how cool is that!? Not only gnome-session waits for it to finish playing, even when it does (after five seconds), gnome-session fails to notice that. So it waits another five seconds until the phase times out and it proceeds to the next phase. To verify this, note that in the graph, canberra-gtk-play starts at about 3.2 seconds in, and the CPU picks up activity again, at 13.2. if (manager->priv->phase == GSM_MANAGER_PHASE_INITIALIZATION) {
/* Applications from Initialization phase are considered
* registered when they exit normally. This is because
* they are expected to just do "something" and exit */
app_registered (app, manager);
}
So, only Initialization phase was handled that way. Not hard to fix. Bug filed, patch submitted. Then I also patched libcanberra to add a --background option to canberra-gtk-play. Reported and patch submitted. There is an alternate, much shorter, fix: move login sound to Application phase which is the last phase. Applications in the Application phase are not required to notify the session of their startup. That's a one line fix. The gnome-session fix is still a good thing to do. Anyway, with that fixed I got down from 18 to 13 seconds. Not bad!
Lets see what's making the login take that long. The last batch of processes to start are the panel applets (clock-applet, wcnk-applet, trashapplet, gdm-user-switch, and notification-area) that start around 8 seconds in. The panel on the other hand starts at 3. What is the panel doing for those 5 seconds?! Time to switch tools and take a look inside panel. Federico looked at it in February, so I reused his patch and added more annotations as needed. The image at right is the first timeline view of what's happening inside gnome-panel. There are three large gaps there. The first one is the linker working hard. The second one is gnome_program_init(). I have absolutely no idea why those two are taking so long. Those two are very unstable though and go up and down in different runs. I will look into them later.strace log, it becomes clear that fontconfig is opening three East Asian fonts. Why? It's not supposed to. Well, at least not if its cache is up to date. I then run fc-match and observe that it takes over a second. I try fc-cache and it errs about not being able to update cache for two East Asian font directories on my system. Humm, this is a first for me, although I confess I get reports about situations like this all the time... Reading more log, it appeared that opening cache files in ~/.fontconfig was failing as Permission denied! Weird stuff. Inspecting ~/.fontconfig revealed the reason: It had the following permissions: drw-rw-r--. Where did the x permissions go? This was a new user account, so the first process to call into fontconfig would have been gnome-settings-daemon. Oh my! In one of my patches from the last round, I had replaced a call to the glibc function daemon() with custom code that I lifted from preload, which I had lifted from some other daemon back in 2005. Anyway, seems like the daemonizing code there had a umask (0177); call in it, and no one's eyes ever caught the bug there. That mask is fine for files, but directories created under that are absolutely useless. And since Nov 25 we backported those changes into Fedora 10! Chaos! I had to go clean that up. Filed and fixed the bug in g-s-d first, and pushed fixed versions in F10 and rawhide. Then filed bug with fontconfig to fix it there too. Will look into that this week.umask 0133 before calling fc-cache. Oops. Removed those and pushed updates out. However, that does *not* justify it as fc-cache doesn't need to create any directory when updating system-wide caches. That's still a mystery. Got to reinstall the broken RPMs to see what's going on.
Oh, the timing. Fixing these, the huge gap in the panel shrank drastically, from 1.4 seconds to 0.13 seconds. 130ms is still a long time to spend in fontconfig. Seems like about 60% of that is just opening and reading/mmap'ing the config and cache files. That can be improved by reducing the number of files: packaging more font files in the same directory, and avoiding micro config files. Fortunately Nicolas is on the right track to fix these in Fedora. The other 40% is spent on the first FcFontSort() call. I have ideas about how to improve that, and will look into it in the coming days.gtk_window_set_default_icon_name() call the gap shrinks considerably, but then when you ctrl+alt+tab you won't see the cute gnome-panel icon. Donno, maybe we can set the icon later to avoid holding the entire session back for 90ms.fc-cache --compact is invoked... I'll think about it.
With this mess cleaned, login time is back down to a sane 5.5s. gnome-panel is not taking much CPU time anymore. Nautilus some. Other than those, two things stand out: python and sealert. sealert is an applet to "View & interpret SELinux denials". It's written in python. Worse, when starting, it calls over D-BUS to start a server instance of itself, in another python process, and the two wait for SELinux denials to happen so they can report it to you using notification-daemon bubbles... for you to click and dismiss. Srsly, what the heck is Aunt Tillie to do with them? Forget Aunt Tillie, what is even a tech-savvy user to do with them? We were at the Fedora 10 release party in Seneca a couple weeks ago and a user asked me what these bubbles are. I explained to him, and he said "I just close them because they happen all the time". Which is true. It's great if you are debugging why your HTTP server does not serve your files, but is it good for anything else? Certainly not enough to be enabled by default. Please, disable sealert by default and help save the planet. In the mean time, I also proposed adding an additional phase to gnome-session which would run after the Application phase and is for starting processes that do not render to the desktop immediately, because those can start at the end, to give a better perception of desktop having loaded completely. sealert technically falls into that category. In practice however, it pops up a bubble before the login is done. In many setups at least.kerneloops-applet. Maybe enable these in Rawhide so people report bugs, but disable them in stable release. I'm pretty sure that's the right thing to do. kerneloops-applet is quite light on the CPU though. Removing sealert, kerneloops-applet, and some other lightweight tasks that I don't need, I get down to 4.5 seconds. The biggest CPU consumer now is Nautilus, and it's taking three seconds to settle down. Federico looked into it before, and I will too at some point. But I first want to fix fontconfig. The ROI on that is huge: every application startup can become something between 50ms to 100ms faster if I my ideas work out.
To keep things in perspective, lets have a quick look at cold login to see what's there to optimize. The image at right shows a cold login of the same desktop I used above. Lots of I/O, and 14 seconds login time. The CPU goes way down for extend periods of time, when panel, Nautilus, pulseaudio, and python are all doing heavy I/O. No idea what they are reading, but looks like something that should be possible to fix.
This next one is with preload running. Down to 12 seconds. Not much really. Looks like it failed to preload anything that those four I/O-heavy processes use. No wonder, because preload can only preload stuff that are mmap()ed. That mostly translates to shared libraries. A more advanced prefetching solution, like Harold's readahead_collector that monitors all I/O will be able to solve this. Oh, Harold by the way blogged recently about using systemtap to measure boot-time I/O. He's actually looking at me to fix gconf I/O issues, and that's exactly what I'm going to do with his script when I get to that stage.Labels: gnome, gnome-panel, gnome-session, login, performance