Behdad Esfahbod's daily notes on GNOME, Pango, Fedora, Persian Computing, Bob Dylan, and Dan Bern!

My Photo
Location: Toronto, Ontario, Canada

Ask Google.

Contact info
Hacker Emblem Become a Friend of GNOME I Power Blogger
follow me on Twitter
July 2003
August 2003
October 2003
November 2003
December 2003
March 2004
April 2004
May 2004
July 2004
August 2004
September 2004
November 2004
March 2005
April 2005
May 2005
June 2005
July 2005
August 2005
September 2005
October 2005
November 2005
December 2005
January 2006
February 2006
March 2006
April 2006
May 2006
June 2006
July 2006
August 2006
September 2006
October 2006
November 2006
December 2006
January 2007
February 2007
March 2007
April 2007
May 2007
June 2007
July 2007
August 2007
September 2007
October 2007
November 2007
December 2007
January 2008
February 2008
March 2008
April 2008
May 2008
June 2008
July 2008
August 2008
October 2008
November 2008
December 2008
January 2009
March 2009
April 2009
May 2009
June 2009
July 2009
August 2009
November 2009
December 2009
March 2010
April 2010
May 2010
June 2010
July 2010
October 2010
November 2010
April 2011
May 2011
August 2011
September 2011
October 2011
November 2011
November 2012
June 2013
January 2014
Current Posts
McEs, A Hacker Life
Thursday, May 03, 2007
 Avoiding extra work

When working on justification support in Pango, I found a loop that could use some optimization.

The loop in question is to decide how much of a piece of text (called item. For normal text it typically is the rest of the paragraph) we can fit into the current line. The implementation was removing one character at a time from the end until we find a break spot and the width of remaining chars fits the available width.

Think about it. If the entire item fits the available width the loop is not reached at all, but otherwise, unless the paragraph fits in exactly two lines, we are doing a lot of extra work. Say the paragraph takes 10 lines. Then for the first line we have to remove 9 lines worth of text before finding the right break point. For next line we have to remove 8 lines... and so on. It's quadratic in the number of lines, so to speak.

What I did was to start inserting chars from the beginning of the item until no more fits.

For a really long paragraph (pango/pango-view/test-long-paragraph.txt) the number of times the loop contents are run came down from ~1,000,000 to about 13,000. Not bad.

This was all in my quest to make gedit run faster when viewing a REALLY LONG paragraph. I was under the impression that some silly quadratic algorithms in Pango are responsible for the slow speed, but apparently I'm wrong. In short, Gtk+ recreates the Pango layout on EVERY CURSOR MOVEMENT OR CURSOR BLINKING. This is mostly a result of having a very extensible and powerful text object in Gtk+ that supports multiple marks, some of them may be cursors. I've already attached a patch to improve it for cursor blinking, but that's not very interesting. I want to be able to put my finger on the left or right arrow and don't see a gedit that is frozen as a result. Way to go...

Labels: , ,

Does this also affect text rendering that is not justified? I have also been bothered by gedit's sluggishness with regards to what I thought was line-wrapping. Checking out SQL dumps in gedit always seemd to bring it down onto its kneews. Would be really sweet if those times are over now and I don't have to find alternatives to view SQL dumps with
What I tried to describe is not limited to justified text, no. However, as I said, the bottleneck doesn't seem to be Pango per se, but the fact that Gtk+ voids Pango's caching by creating new Pango layouts all the time. So, in your use case, there is still work to do, and that is why I opened bug 435405.
Behdad, I always love reading about your performance and Pango work. It makes me happy :-).

I'm always promoting Gtk+ with my fellow devs, so it's always nice to see people improving it further.
you are a hero, as usual :)

That said as you note problems with long lines are not related to line wrapping (well, at least line wrapping isn't the top of the iceberg)... in fact rendering long lines is even worse when line wrapping is off.

I am pretty sure there are a couple of bugs already open about the fact that the text widgets abuses pango layouts, but another one will not hurt :)

/me adds himself to CC
What the Fxxx? Isn't it a typical bisect searching problem? Why do you have to search char by char? Did I miss something here?
You have to compute the commulative width of the characters, so bisecting doesn't help. Also not any position is suitable for a break.
Very interesting!, fixing this bug could lead to me using gedit instead of gvim. Thanks.
Is state->log_widths[] an array of the width of each character? If so, have you thought about storing the cumulative width instead of the individual width? I don't know if that makes sense but then you could do binary searches or (possibly) even faster searches based on heuristics.
I don't get it why binary search won't work with commulative width. It still has to end somewhere in the middle, be it evenly spread or not. At least you reach the right neighborhood much faster. With binary search, you still need to test for breakability or not at every binary point.
What's the point of using binary search if you can't get faster than linear? I have to populate the array first (commulative or not), and then find where to break. What you suggest neither makes the algorithm asymptotically faster, not any measured speedup.
I guess I thought that state->log_widths[] is populated with the width of each character. If it was instead populated with the cumulative width (i.e. state->log_widths[n] = state->log_widths[n-1] + char_width[n]) then you *might* be able to very quickly find a good starting point. For example:

start_index = (int)(N * (float)state->log_widths[N-1]/line_width);

If state->log_widths[start_index] < line_width ) {
/* something */
} else {
/* something else */

Or something.

Or maybe a straight binary search on cumulative width would be quite fast. Or maybe it's just a dumb thought...
Of course I meant

start_index = (int)(N * (float)line_width/state->log_widths[N-1]);

I've downloaded pango to see if I understood what was going on. This may not be *the* big bottle neck, but a cumulative width stored in "state" definitely lends efficiency.

For what your doing there you don't really even care about individual character width (of course, I might have missed something).

Anyway, I guess if you don't care about it I'll submit a patch for this.

Post a Comment

Links to this post:

Create a Link

<< Archive
<< Home