McEs, A Hacker Life: Avoiding extra work

Thursday, May 03, 2007

Avoiding extra work

When working on justification support in Pango, I found a loop that could use some optimization.

The loop in question decides how much of a piece of text (called an item; for normal text it is typically the rest of the paragraph) we can fit into the current line. The implementation removed one character at a time from the end until it found a break spot where the width of the remaining chars fit the available width.

Think about it. If the entire item fits the available width, the loop is not reached at all. Otherwise, unless the paragraph fits in exactly two lines, we are doing a lot of extra work. Say the paragraph takes 10 lines. Then for the first line we have to remove 9 lines' worth of text before finding the right break point. For the next line we have to remove 8 lines' worth, and so on. It's quadratic in the number of lines, so to speak.
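To put a number on that, here is a rough tally under the simplifying assumption that every line holds the same amount of text. The symbol n (the number of lines in the paragraph) is notation introduced here for illustration, not something from the Pango code. Measured in lines' worth of text removed:

$$
\sum_{k=1}^{n-1} (n-k) = \frac{n(n-1)}{2}, \qquad n = 10 \;\Rightarrow\; 9 + 8 + \dots + 1 = 45 .
$$

The last line contributes nothing, since by then the whole remaining item fits and the loop is skipped.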

What I did instead was to start adding chars from the beginning of the item until no more fit.
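The two strategies can be compared with a small sketch. This is not Pango's code: the `Item` struct, the `can_break` flags, and both function names are hypothetical stand-ins, and widths are assumed to be non-negative integers. Each function returns how many characters go on the current line and reports how many times its loop body ran.

```c
#include <stddef.h>

/* Hypothetical stand-in for an item: per-character widths plus a flag
 * saying whether a line break is allowed after each character. */
typedef struct {
    const int *widths;      /* widths[i]: width of character i (assumed >= 0) */
    const char *can_break;  /* can_break[i] != 0: may break after character i */
    size_t n_chars;
} Item;

/* Old strategy: start with the whole item and drop characters from the end
 * until what remains fits and ends at a break spot. */
size_t fit_from_end(const Item *item, int available, size_t *steps)
{
    int width = 0;
    size_t len = item->n_chars;

    for (size_t i = 0; i < item->n_chars; i++)
        width += item->widths[i];

    *steps = 0;
    if (width <= available)
        return len;                     /* whole item fits; loop not reached */

    while (len > 0) {
        (*steps)++;
        len--;
        width -= item->widths[len];     /* drop the last remaining character */
        if (width <= available && (len == 0 || item->can_break[len - 1]))
            break;
    }
    return len;
}

/* New strategy: add characters from the start, remembering the last break
 * spot that still fit, and stop at the first character that overflows. */
size_t fit_from_start(const Item *item, int available, size_t *steps)
{
    int width = 0;
    size_t best = 0;                    /* longest fitting prefix ending at a break */

    *steps = 0;
    for (size_t i = 0; i < item->n_chars; i++) {
        (*steps)++;
        width += item->widths[i];
        if (width > available)
            break;                      /* nothing longer can fit */
        if (item->can_break[i] || i + 1 == item->n_chars)
            best = i + 1;
    }
    return best;
}
```

Both functions pick the same break point. The difference is the cost: the first walks over everything that does not fit on the line, while the second walks over roughly what does. For a 100-character item with unit widths, a break allowed after every fifth character, and an available width of 12, the first runs its loop body 90 times and the second 13 times.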

For a really long paragraph (pango/pango-view/test-long-paragraph.txt) the number of times the loop contents are run came down from ~1,000,000 to about 13,000. Not bad.

This was all in my quest to make gedit run faster when viewing a REALLY LONG paragraph. I was under the impression that some silly quadratic algorithms in Pango were responsible for the slow speed, but apparently I was wrong. In short, Gtk+ recreates the Pango layout on EVERY CURSOR MOVEMENT OR CURSOR BLINK. This is mostly a result of having a very extensible and powerful text object in Gtk+ that supports multiple marks, some of which may be cursors. I've already attached a patch to improve it for cursor blinking, but that's not very interesting. I want to be able to hold my finger on the left or right arrow and not see gedit freeze as a result. Way to go...

Labels: gtk+, pango, performance

¶ 8:25 PM

Comments:

Does this also affect text rendering that is not justified? I have also been bothered by gedit's sluggishness with regard to what I thought was line-wrapping. Checking out SQL dumps in gedit always seemed to bring it down onto its knees. Would be really sweet if those times are over now and I don't have to find alternatives to view SQL dumps with.

# posted by

Anonymous : May 03, 2007 9:50 PM

What I tried to describe is not limited to justified text, no. However, as I said, the bottleneck doesn't seem to be Pango per se, but the fact that Gtk+ voids Pango's caching by creating new Pango layouts all the time. So, in your use case, there is still work to do, and that is why I opened bug 435405.

# posted by

behdad : May 03, 2007 10:22 PM

Behdad, I always love reading about your performance and Pango work. It makes me happy :-).

I'm always promoting Gtk+ with my fellow devs, so it's always nice to see people improving it further.

# posted by

Anonymous : May 04, 2007 5:38 AM

you are a hero, as usual :)

That said, as you note, problems with long lines are not related to line wrapping (well, at least line wrapping isn't the tip of the iceberg)... in fact, rendering long lines is even worse when line wrapping is off.

I am pretty sure there are a couple of bugs already open about the fact that the text widget abuses Pango layouts, but another one will not hurt :)

/me adds himself to CC

# posted by

Anonymous : May 04, 2007 6:37 AM

What the Fxxx? Isn't it a typical bisect searching problem? Why do you have to search char by char? Did I miss something here?

# posted by

Anonymous : May 04, 2007 7:54 AM

You have to compute the cumulative width of the characters, so bisecting doesn't help. Also, not every position is suitable for a break.

# posted by

behdad : May 04, 2007 8:16 AM

Very interesting! Fixing this bug could lead to me using gedit instead of gvim. Thanks.

# posted by

Anonymous : May 04, 2007 12:55 PM

Is state->log_widths[] an array of the width of each character? If so, have you thought about storing the cumulative width instead of the individual width? I don't know if that makes sense but then you could do binary searches or (possibly) even faster searches based on heuristics.

# posted by

Unknown : May 04, 2007 1:28 PM

I don't get why binary search won't work with cumulative width. It still has to end somewhere in the middle, be it evenly spread or not. At least you reach the right neighborhood much faster. With binary search, you still need to test for breakability at every probe point.

# posted by

Anonymous : May 05, 2007 12:49 AM

What's the point of using binary search if you can't get faster than linear? I have to populate the array first (cumulative or not), and then find where to break. What you suggest neither makes the algorithm asymptotically faster nor gives any measured speedup.

# posted by

behdad : May 05, 2007 2:10 PM

I guess I thought that state->log_widths[] is populated with the width of each character. If it was instead populated with the cumulative width (i.e. state->log_widths[n] = state->log_widths[n-1] + char_width[n]) then you *might* be able to very quickly find a good starting point. For example:

start_index = (int)(N * (float)state->log_widths[N-1]/line_width);

if (state->log_widths[start_index] < line_width) {
    /* something */
} else {
    /* something else */
}

Or something.

Or maybe a straight binary search on cumulative width would be quite fast. Or maybe it's just a dumb thought...

# posted by

Unknown : May 05, 2007 8:41 PM

Of course I meant

start_index = (int)(N * (float)line_width/state->log_widths[N-1]);

# posted by

Unknown : May 05, 2007 8:43 PM

Hi,

I've downloaded Pango to see if I understood what was going on. This may not be *the* big bottleneck, but a cumulative width stored in "state" definitely lends efficiency.

For what you're doing there you don't really even care about individual character width (of course, I might have missed something).

Anyway, I guess if you don't care about it I'll submit a patch for this.

Pat.

# posted by

Unknown : May 07, 2007 4:53 AM
