Ticket #91 (fixed enhancement)
Pretty significant performance regressions in 1.4.X when searching
- Created: 2010-08-02 22:17:42
- Reported by: toofishes
- Assigned to: Franz
- Milestone: 1.4.6
- Component: database
- Priority: high
We're noticing significant performance differences in FluxBB 1.4.X vs. the 1.2.X series. I'm going to attach a few things that come to mind, and I'd be glad to chip in more, but here we go.
$ mysqladmin status
Uptime: 309733 Threads: 1 Questions: 31691959 Slow queries: 1110 Opens: 1146 Flush tables: 1 Open tables: 291 Queries per second avg: 102.320$ mysqld --version
mysqld Ver 5.1.47-log for unknown-linux-gnu on x86_64 (Source distribution)$ uname -a
Linux gudrun 2.6.34-ARCH #1 SMP PREEMPT Mon Jul 5 22:12:11 CEST 2010 x86_64 Intel(R) Xeon(TM) CPU 2.80GHz GenuineIntel GNU/LinuxAll FluxBB tables are InnoDB.
mysql> SELECT TABLE_SCHEMA, count(*) TABLES, concat(round(sum(table_rows)/1000000,2),'M') rows, concat(round(sum(data_length)/(1024*1024*1024),2),'G') DATA, concat(round(sum(index_length)/(1024*1024*1024),2),'G') idx, concat(round(sum(data_length+index_length)/(1024*1024*1024),2),'G') total_size, round(sum(index_length)/sum(data_length),2) idxfrac FROM information_schema.TABLES WHERE table_schema = 'fluxbb' GROUP BY TABLE_SCHEMA;
+--------------+--------+--------+-------+-------+------------+---------+
| TABLE_SCHEMA | TABLES | rows | DATA | idx | total_size | idxfrac |
+--------------+--------+--------+-------+-------+------------+---------+
| fluxbb | 17 | 23.14M | 1.34G | 0.91G | 2.26G | 0.68 |
+--------------+--------+--------+-------+-------+------------+---------+
1 row in set (0.84 sec)History
toofishes 2010-08-02 22:30:46
- Description changed. (Diff)
Apparently I can't attach files. How about this instead then:
http://dev.archlinux.org/~dan/mysql_queries-week.png
http://dev.archlinux.org/~dan/mysql_slo … s-week.png
The upgrade happened on the 29th during the big insert spike there. You can see from there on out, the slow queries come a lot more frequently and the overall query count gets higher (nearly 20/second).
This box does run other applications and databases as well, but nothing has changed with those in this timeframe.
Finally, a slow query log if you want to run mysqldumpslow or just look through it:
http://dev.archlinux.org/~dan/fluxbb-slow.log
Reines 2010-08-02 23:16:08
- Milestone set to 1.4.2.
- Owner set to Reines.
- Priority changed from normal to high.
Thanks for the info, I'll take a look into it.
Reines 2010-08-02 23:22:05
Just to save anyone else time, the queries in question are:
post.php, line 198:
SELECT u.id, u.email, u.notify_with_post, u.language FROM users AS u INNER JOIN subscriptions AS s ON u.id=s.user_id LEFT JOIN forum_perms AS fp ON (fp.forum_id=23 AND fp.group_id=u.group_id) LEFT JOIN online AS o ON u.id=o.user_id LEFT JOIN bans AS b ON u.username=b.username WHERE b.username IS NULL AND COALESCE(o.logged, u.last_visit)>1280633364 AND (fp.read_forum IS NULL OR fp.read_forum=1) AND s.topic_id=102056 AND u.id!=2672;and search.php, line 180:
SELECT m.post_id, p.topic_id, t.last_post AS sort_by FROM search_words AS w INNER JOIN search_matches AS m ON m.word_id = w.id INNER JOIN posts AS p ON p.id=m.post_id INNER JOIN topics AS t ON t.id=p.topic_id LEFT JOIN forum_perms AS fp ON (fp.forum_id=t.forum_id AND fp.group_id=3) WHERE w.word LIKE 'uuid' AND (fp.read_forum IS NULL OR fp.read_forum=1);Franz 2010-08-03 08:09:14
By the way, the higher query count might have to do with the fact that we split up a few queries into two in viewtopic.php and viewforum.php (and other places, too) to increase performance.
These pages are probably the most requested ones in a forum application.
toofishes 2010-08-24 01:32:23
The query at post.php, line 198 can be improved quite a bit by adding an index (it appears MySQL is too stupid to be able to use the covering index that already exists which is a shame):
mysql> explain SELECT u.id, u.email, u.notify_with_post, u.language FROM users AS u INNER JOIN subscriptions AS s ON u.id=s.user_id LEFT JOIN forum_perms AS fp ON (fp.forum_id=23 AND fp.group_id=u.group_id) LEFT JOIN online AS o ON u.id=o.user_id LEFT JOIN bans AS b ON u.username=b.username WHERE b.username IS NULL AND COALESCE(o.logged, u.last_visit)>1280633364 AND (fp.read_forum IS NULL OR fp.read_forum=1) AND s.topic_id=102056 AND u.id!=2672;
+----+-------------+-------+--------+--------------------------+-------------------+---------+-------------------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+--------------------------+-------------------+---------+-------------------------+-------+-------------+
| 1 | SIMPLE | u | range | PRIMARY | PRIMARY | 4 | NULL | 22281 | Using where |
| 1 | SIMPLE | fp | eq_ref | PRIMARY | PRIMARY | 8 | fluxbb.u.group_id,const | 1 | Using where |
| 1 | SIMPLE | o | ALL | online_user_id_ident_idx | NULL | NULL | NULL | 187 | Using where |
| 1 | SIMPLE | b | ref | bans_username_idx | bans_username_idx | 78 | fluxbb.u.username | 1 | Using where |
| 1 | SIMPLE | s | eq_ref | PRIMARY | PRIMARY | 8 | fluxbb.u.id,const | 1 | Using index |
+----+-------------+-------+--------+--------------------------+-------------------+---------+-------------------------+-------+-------------+
5 rows in set (0.03 sec)
mysql> ALTER TABLE online ADD INDEX online_user_id_idx (user_id);
mysql> explain SELECT u.id, u.email, u.notify_with_post, u.language FROM users AS u INNER JOIN subscriptions AS s ON u.id=s.user_id LEFT JOIN forum_perms AS fp ON (fp.forum_id=23 AND fp.group_id=u.group_id) LEFT JOIN online AS o ON u.id=o.user_id LEFT JOIN bans AS b ON u.username=b.username WHERE b.username IS NULL AND COALESCE(o.logged, u.last_visit)>1280633364 AND (fp.read_forum IS NULL OR fp.read_forum=1) AND s.topic_id=102056 AND u.id!=2672;
+----+-------------+-------+--------+---------------------------------------------+--------------------+---------+-------------------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------------------------------------+--------------------+---------+-------------------------+-------+-------------+
| 1 | SIMPLE | u | range | PRIMARY | PRIMARY | 4 | NULL | 21767 | Using where |
| 1 | SIMPLE | fp | eq_ref | PRIMARY | PRIMARY | 8 | fluxbb.u.group_id,const | 1 | Using where |
| 1 | SIMPLE | o | ref | online_user_id_ident_idx,online_user_id_idx | online_user_id_idx | 4 | fluxbb.u.id | 3 | Using where |
| 1 | SIMPLE | b | ref | bans_username_idx | bans_username_idx | 78 | fluxbb.u.username | 1 | Using where |
| 1 | SIMPLE | s | eq_ref | PRIMARY | PRIMARY | 8 | fluxbb.u.id,const | 1 | Using index |
+----+-------------+-------+--------+---------------------------------------------+--------------------+---------+-------------------------+-------+-------------+
5 rows in set (0.00 sec)Franz 2011-07-23 22:22:43
I am not an expert on these things. Would adding the index be enough to fix this, @toofishes? If so, on which columns?
toofishes 2011-07-28 13:35:47
It definitely helped in our case. I would definitely suggest adding such an index. The reason the normal covering index principles don't work here is due to the fact that indexes on memory-based tables are created as hash indexes by default, so there is no B-tree traversal.
In the more general "does this completely fix search" category, no, it does not. We have a very large forum (40K users, 120K topics, 940K posts), so the search tables are far and away the cause of nearly all lookup problems for a few reasons:
1. Shouldn't be doing search in a database anyway, but this seems to be being addressed in issue #421 (search engine modules).
2. The search tables are not scaling. We have 730K search words, but 26 million search matches.
3. Search words are heavily overloaded. Our top ten words have counts ranging from 80K - 120K matches *each*.
4. When searching, the full results are sent back to the PHP code (30K plus rows at times), only to be stored back in the database in the search cache. There should be two things fixed here- get the results straight into the DB with no round trip, and have a way of limiting the maximum number of search results if one wishes to do so.
toofishes 2011-07-28 14:16:27
It looks like that is due to someone making the mistake that the two-part index on user (user_id_ident_index, user_id and ident columns) would work. As I noted in my last comment, it does not (at least in MySQL; other databases will likely not create a HASH index by default) due to the index being of HASH type vs BTREE.
http://dev.mysql.com/doc/refman/5.5/en/ … -hash.html
The solution is either to add the other index as I suggested (which will work really fast and well for equality joins, which is all you do), or explicitly create the two-column index as a BTREE index. I'd suggest just adding the additional index.
toofishes 2011-07-28 14:19:52
It depends. You will need to look for all queries that involve the online table and see what they join on.
However, because the two column index is unique, it is likely there to enforce uniqueness meaning it shouldn't be removed even if unused by queries.
Franz 2011-07-28 14:29:52
There we are. This should fix the problem (or, at least, improve performance) in post.php.
Is there a similar fix we could apply to search.php? As long as we don't revamp the whole search system, bigger changes would be ok, too.
toofishes 2011-07-30 05:05:24
Re: adding that index back in- I would probably do that only for MySQL, unless other database engines are also doing funny things with the online table. Or is it just our database that has this set up using Engine=MEMORY? In which case it shouldn't be needed anywhere and this is just something in our setup that differs from the now default, and the index won't be needed for anyone else.
Other thoughts, including that other query. It looks like the shit performance comes down to a few things. MySQL has a terrible optimizer, and looking at the slow query log today, every time we get a slow hit on that query (the only slow one in there), it is because it is returning 50K+ rows. More importantly, running one of these through EXPLAIN shows that MySQL thinks only about 12 rows will be returned, which is obviously way off. I'm not sure there is much you can fix with this.
Something else of note is the non-uniqueness of some unique fields- namely id in search_words.
$ tail /var/lib/mysql/gudrun-slow.log
# Time: 110730 0:32:00
# User@Host: fluxbb[fluxbb] @ localhost []
# Query_time: 8.111481 Lock_time: 0.000206 Rows_sent: 74483 Rows_examined: 226858
SET timestamp=1312000320;
SELECT m.post_id, p.topic_id, t.last_post AS sort_by FROM search_words AS w INNER JOIN search_matches AS m ON m.word_id = w.id INNER JOIN posts AS p ON p.id=m.post_id INNER JOIN topics AS t ON t.id=p.topic_id LEFT JOIN forum_perms AS fp ON (fp.forum_id=t.forum_id AND fp.group_id=4) WHERE w.word LIKE 'system' AND (fp.read_forum IS NULL OR fp.read_forum=1);
$ mysqldumpslow /var/lib/mysql/gudrun-slow.log
Reading mysql slow query log from /var/lib/mysql/gudrun-slow.log
Count: 46 Time=10.43s (479s) Lock=0.00s (0s) Rows=63602.8 (2925728), fluxbb[fluxbb]@localhost
SELECT m.post_id, p.topic_id, t.last_post AS sort_by FROM search_words AS w INNER JOIN search_matches AS m ON m.word_id = w.id INNER JOIN posts AS p ON p.id=m.post_id INNER JOIN topics AS t ON t.id=p.topic_id LEFT JOIN forum_perms AS fp ON (fp.forum_id=t.forum_id AND fp.group_id=N) WHERE w.word LIKE 'S' AND (fp.read_forum IS NULL OR fp.read_forum=N)
mysql> EXPLAIN SELECT m.post_id, p.topic_id, t.last_post AS sort_by FROM search_words AS w INNER JOIN search_matches AS m ON m.word_id = w.id INNER JOIN posts AS p ON p.id=m.post_id INNER JOIN topics AS t ON t.id=p.topic_id LEFT JOIN forum_perms AS fp ON (fp.forum_id=t.forum_id AND fp.group_id=4) WHERE w.word LIKE 'system' AND (fp.read_forum IS NULL OR fp.read_forum=1);
+----+-------------+-------+--------+-------------------------------------------------------+----------------------------+---------+-------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------------------------------------------+----------------------------+---------+-------------------------+------+-------------+
| 1 | SIMPLE | w | range | PRIMARY,search_words_id_idx | PRIMARY | 62 | NULL | 1 | Using where |
| 1 | SIMPLE | m | ref | search_matches_post_id_idx,search_matches_word_id_idx | search_matches_word_id_idx | 4 | fluxbb.w.id | 12 | |
| 1 | SIMPLE | p | eq_ref | PRIMARY,posts_topic_id_idx | PRIMARY | 4 | fluxbb.m.post_id | 1 | |
| 1 | SIMPLE | t | eq_ref | PRIMARY | PRIMARY | 4 | fluxbb.p.topic_id | 1 | |
| 1 | SIMPLE | fp | eq_ref | PRIMARY | PRIMARY | 8 | const,fluxbb.t.forum_id | 1 | Using where |
+----+-------------+-------+--------+-------------------------------------------------------+----------------------------+---------+-------------------------+------+-------------+
5 rows in set (0.00 sec)Pierre 2011-07-31 14:11:22
I just had a look at the install script. The online table is created as HEAP unless you choose InnoDB in which case InnoDB is also used here.
We migrated to InnoDB later so we probably missed to change the enigne of that table.
What was the rationale behind this decision?
Franz 2011-07-31 20:34:25
Here's what Smartys told me on Twitter: https://twitter.com/NealPoole/status/97699418672873472
So that probably means we can change the engine again, can we?
Franz 2011-07-31 21:05:45
Ok, I've got two commits prepared for this.
One changes the engine of the online table to MyISAM for normal MySQL(i) or to InnoDB for those who use that.
I am not sure whether we would still need the extra index in that case - should it be removed again or still stay?
@toofishes and @pierre: Could you test these two variants, please?
Franz 2011-08-02 12:54:13
There we go. This should do the necessary. Please test this!
The remaining question is whether the index on online.user_id can be removed again (if I understood everything correctly, it should be removed).
Franz 2011-08-03 08:17:55
- Component changed from search to database.
- Status changed from open to fixed.
I believe this should be fixed now.
Pierre 2011-08-03 09:52:53
I have converted our online DB to innodb and removed the index on user_id. Let's see how it does.

