Ticket #510 (fixed bug)
New BBCode checking in search_idx.php
- Created: 2011-10-13 04:42:45
- Reported by: quy
- Assigned to: daris
- Milestone: 1.4.8
- Component: search
- Priority: normal
Add checking of new BBCode (topic, post, forum, user) to split_words function.
Add checking to strip_bbcode function???
daris 2011-12-01 11:18:03
First one is done
Franz 2011-12-01 18:27:56
- Owner set to daris.
Yes, please remove them in strip_bbcode(), too.
daris 2011-12-01 20:28:47
Is it really needed in strip_bbcode?
Shouldn't the code:
// Remove BBCode
$text = preg_replace('%\[/?(b|u|s|ins|del|em|i|h|colou?r|quote|code|img|url|email|list|topic|post|forum|user)(?:\=[^\]]*)?\]%', ' ', $text);
be in strip_bbcode() function instead of split_words() ?
Both functions are called on the $message
Franz 2011-12-01 20:47:14
Oh, that's what you mean.
Well, since both functions are only called from update_search_index(), feel free to refactor.
The issue is that the ID will be indexed. In this example, 1000 will be indexed in the search tables.
The ID should be stripped.
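To illustrate the concern, here is a minimal sketch. The helper name strip_id_bbcode(), the sample message, and the exact regular expression are assumptions made for illustration and are not taken from the codebase. It removes a [topic], [post], [forum] or [user] tag pair together with the numeric ID it wraps, so the ID never reaches the indexer:

```php
<?php
// Hypothetical helper for illustration only; the name, the pattern and the
// sample message below are assumptions, not code from this ticket.
function strip_id_bbcode($text)
{
	// Replace the whole tag pair, including the numeric ID, with a space
	return preg_replace('%\[(topic|post|forum|user)\][1-9]\d*\[/\1\]%', ' ', $text);
}

$message = 'See [topic]1000[/topic] for details';
echo strip_id_bbcode($message), "\n";
```

Removing only the tags themselves would leave 1000 behind as an indexable word, which is why the tag and its content are matched as one unit here.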
Franz 2011-12-01 21:33:32
That's a good implementation - that function is the right place to strip all the BBCodes anyway.
That means we're done, right?
daris 2011-12-01 21:36:47
Another point, as a code cleanup: move the last line of the strip_bbcode() function into the $patterns array?
Franz 2011-12-01 21:45:37
Very good point.
daris 2011-12-01 21:46:07
- Status changed from open to fixed.
Please confirm that this can be removed since the last patterns entry will take care of this.
'%\[(img|url|email)\]([^[]*+(?:(?!\[/\1\])\[[^[]*+)*)\[/\1\]%' => '$2', // Keep the url
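The cleanup discussed above could look roughly like the following sketch. The function name strip_bbcode_sketch(), the ID-stripping entry, and the set and order of the entries are assumptions; only the "keep the url" and "remove BBCode" expressions are based on the ones quoted in this ticket. The idea is that the general tag-removal expression becomes the last entry of the pattern table instead of a separate preg_replace() call:

```php
<?php
// Sketch only: the function name and the set and order of entries are
// assumptions; this is not the committed implementation.
function strip_bbcode_sketch($text)
{
	static $patterns;

	if (!isset($patterns))
	{
		$patterns = array(
			// Keep the url
			'%\[(img|url|email)\]([^[]*+(?:(?!\[/\1\])\[[^[]*+)*)\[/\1\]%' => '$2',
			// Drop the tag pair together with the ID (assumed entry)
			'%\[(topic|post|forum|user)\][1-9]\d*\[/\1\]%' => ' ',
			// Remove any remaining BBCode tags (formerly a separate call)
			'%\[/?(b|u|s|ins|del|em|i|h|colou?r|quote|code|img|url|email|list|topic|post|forum|user)(?:\=[^\]]*)?\]%' => ' ',
		);
	}

	// preg_replace() applies each pattern in turn to the result of the previous one
	return preg_replace(array_keys($patterns), array_values($patterns), $text);
}

echo strip_bbcode_sketch('[b]Hello[/b] [url]http://example.com/[/url] [user]2[/user]'), "\n";
```

Because the patterns are applied in array order, the general entry has to come last; otherwise it would remove the tags before the more specific entries had a chance to match them.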
daris 2011-12-02 07:58:35
yoorick 2012-01-12 08:01:36
> Well, since both functions are only called from update_search_index(), feel free to refactor.
Actually, split_words() is used in search.php to form the array of keywords. And originally it removed any bb-codes from it.
Was there a specific reason to remove them from user input? Or maybe that behaviour was excessive and the fact that bb-codes are not in 'search_words' table is quite enough?
Franz 2012-01-12 10:09:07
Yes, it's all about avoiding BBCode in search tables.
yoorick 2012-01-12 12:51:06
I meant that before the regexp was moved from split_words() to strip_bbcode(), if a user asked the search script to look for a string like this:
[quote] something [/quote]
it was treated as just
something
Now it is treated as
quote and something
for any bbcode that is more than PUN_SEARCH_MIN_WORD characters long.
I just think that the previous behaviour is more correct.
If it's not, sorry for wasting your time.
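The difference described above can be reproduced with a small sketch. keywords_sketch() is a simplified, hypothetical stand-in and not the real split_words(); MIN_WORD_SKETCH plays the role of PUN_SEARCH_MIN_WORD, and both its value of 3 and the "at least that many characters" comparison are assumptions:

```php
<?php
// Simplified stand-in for keyword splitting; NOT the real split_words().
// MIN_WORD_SKETCH mimics PUN_SEARCH_MIN_WORD; the value 3 is assumed.
define('MIN_WORD_SKETCH', 3);

function keywords_sketch($text, $strip_tags_first)
{
	// Old behaviour: BBCode tags are removed before the text is split
	if ($strip_tags_first)
		$text = preg_replace('%\[/?(b|u|s|ins|del|em|i|h|colou?r|quote|code|img|url|email|list|topic|post|forum|user)(?:\=[^\]]*)?\]%', ' ', $text);

	// Split on anything that is not a letter or a digit
	$words = preg_split('%[^\p{L}\p{N}]+%u', $text, -1, PREG_SPLIT_NO_EMPTY);

	// Drop words shorter than the minimum length
	$words = array_filter($words, function ($word) {
		return strlen($word) >= MIN_WORD_SKETCH;
	});

	// Remove duplicates and reindex
	return array_values(array_unique($words));
}

print_r(keywords_sketch('[quote] something [/quote]', true));  // tags stripped first
print_r(keywords_sketch('[quote] something [/quote]', false)); // tags left in place
```

With the tags stripped first, only "something" remains as a keyword; with the tags left in place, the tag name "quote" survives the split and becomes a keyword as well.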
Franz 2012-01-12 13:05:54
Don't worry. I'm glad you noticed it.
Would you mind opening a new ticket for this?