
#1 2010-03-09 13:39:05

Reines
Lead developer
From: Scotland
Registered: 2008-05-11
Posts: 3,165
Website

CJK search support

FluxBB v1.4 currently has problems indexing and searching for CJK (Chinese/Japanese/Korean) words properly due to the fact they don't use spaces to separate words.

I have written a patch to correct this, however it isn't 100% perfect because it assumes every character is a word (which I don't believe is always the case?).

I was hoping some people could give the patch a try and let me know how it performs (both for CJK languages and if it has any effect on "normal" languages).

Download patch: http://home.jamierf.co.uk/~jamie/misc/cjk.patch
If you don't know how to apply a patch (or don't want to), I've zipped up the changed files (only search.php and include/search_idx.php): http://home.jamierf.co.uk/~jamie/misc/cjk.zip
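
For anyone who doesn't want to open the patch, the "every character is a word" assumption can be sketched roughly like this. It is a minimal Python sketch and not the patch itself (FluxBB is PHP); the function names `is_cjk` and `split_words` and the Unicode ranges are illustrative assumptions.

```python
def is_cjk(ch):
    """Rough CJK check; the ranges below are an assumption, not an exhaustive list."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= cp <= 0xD7AF   # Hangul Syllables
    )


def split_words(text):
    """Split on whitespace, then emit every CJK character as its own word."""
    words = []
    for chunk in text.split():
        buf = ""
        for ch in chunk:
            if is_cjk(ch):
                if buf:                 # flush any non-CJK run collected so far
                    words.append(buf)
                    buf = ""
                words.append(ch)        # one CJK character = one "word"
            else:
                buf += ch
        if buf:
            words.append(buf)
    return words
```

Non-CJK text still splits on spaces only, so "normal" languages should be unaffected by a scheme like this.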


#2 2010-03-09 14:14:32

FSX
Developer
From: NL
Registered: 2008-05-09
Posts: 805
Website

Re: CJK search support

I can't speak or write the language, but I know that in Japanese, characters are not always words.


#3 2010-03-09 15:23:14

qie
Member
Registered: 2008-06-02
Posts: 376

Re: CJK search support

Let me test it.

Edit: the index rebuild has started; it is taking a long time.
Edit: the rebuild seems to have stopped right at the start.
It stops here:
stop

Last edited by qie (2010-03-09 15:35:30)



#4 2010-03-09 17:17:33

Meow
Member
From: Taipei, Taiwan
Registered: 2008-05-10
Posts: 672
Website

Re: CJK search support

I use Chinese, and I can read a little Japanese.

Reines wrote:

it assumes every character is a word (which I don't believe is always the case?).

If it means Chinese characters (漢字: Hanzi, Kanji, Hanja), the answer is yes.

Last edited by Meow (2010-03-09 17:22:42)



#5 2010-03-09 23:51:01

Reines
Lead developer
From: Scotland
Registered: 2008-05-11
Posts: 3,165
Website

Re: CJK search support

Okay, that one seems to have been getting stuck on some characters for some reason, so I've tried a new version using regex instead. I'm not sure how it performs, but that can be tested afterwards if it actually works.

Patch: http://home.jamierf.co.uk/~jamie/misc/cjk2.patch
Changed files: http://home.jamierf.co.uk/~jamie/misc/cjk2.zip


#6 2010-03-10 00:04:47

Smartys
Former Developer
Registered: 2008-04-27
Posts: 3,135
Website

Re: CJK search support

Reines wrote:

it assumes every character is a word (which I don't believe is always the case?).

Correct. I believe it's true for Chinese but not true for Japanese (and I couldn't say definitively for Korean).


#7 2010-03-10 02:58:55

qie
Member
Registered: 2008-06-02
Posts: 376

Re: CJK search support

Reines wrote:

Okay, that one seems to have been getting stuck on some characters for some reason, so I've tried a new version using regex instead. I'm not sure how it performs, but that can be tested afterwards if it actually works.

Patch: http://home.jamierf.co.uk/~jamie/misc/cjk2.patch
Changed files: http://home.jamierf.co.uk/~jamie/misc/cjk2.zip

After I rebuilt the index, my database size grew from 10M to 100M; the "search_matches" table alone is 100M.

I tested it and found a problem:

If I search for the word "球球", it returns every post containing "球球" or "球", which goes beyond what the word "球球" means.

If I search for "在饭店", it returns every post containing "在" or "饭" or "店" or "在饭" or "在店" or "在饭店" or "饭店", which also defeats the purpose of the search; you can hardly find what you want in the results. When I write "在饭店" I mean "I'm in a hotel", but "在" is a preposition ("in"), "饭" means "a meal", "店" means something like "shop", and "饭店" means "hotel", so you can see why the results contain so many unrelated posts.


I didn't find any problem with searching English characters.

Last edited by qie (2010-03-10 03:18:49)



#8 2010-03-10 08:45:50

Franz
Lead developer
From: Germany
Registered: 2008-05-13
Posts: 4,072
Website

Re: CJK search support

Question: Are there spaces between words in Chinese etc.?



#9 2010-03-10 08:56:26

qie
Member
Registered: 2008-06-02
Posts: 376

Re: CJK search support

Franz wrote:

Question: Are there spaces between words in Chinese etc.?

Most of the time, no: there are no spaces. Sometimes people add spaces, but most of the time the structure of a sentence is shown by punctuation.

For example: "我是一个非常爱好骑自行车的人,我每天都要骑车外出锻炼。"
"I am a fan of riding bicycles; I go out riding for exercise every day."
There are no spaces.

Last edited by qie (2010-03-10 08:59:13)



#10 2010-03-10 09:12:48

Reines
Lead developer
From: Scotland
Registered: 2008-05-11
Posts: 3,165
Website

Re: CJK search support

qie wrote:

If I search for "在饭店", it returns every post containing "在" or "饭" or "店" or "在饭" or "在店" or "在饭店" or "饭店", which also defeats the purpose of the search.

Be careful using the word "word".

Is "在饭店" a single word, or is "在" a word, "饭" a word and "店" another word, that put together make a sentence? If you searched for the phrase "I'm in a hotel" in English, you would get posts that contain the words "I'm" "in" "a" and "hotel" (actually you wouldn't because it only indexes words 3 chars or more, but for example...), but not necessarily next to each other or in the same order. Similarly if you search for "在饭店" you get posts that contain "在", "饭" and "店", but not necessarily next to each other or in the same order.

If you want to match an exact sentence (i.e. the words in the exact order), that is a different feature, and it isn't supported even for English at the moment.
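
To make the difference concrete, here is a small Python sketch of a per-character index. It is only an illustration under assumptions: the three sample posts, the function names, and the in-memory index are made up and are not FluxBB code. `search_any` mirrors the "or" behaviour qie reported, `search_all` mirrors the "all words, any order" behaviour described above, and `search_phrase` is the separate exact-order feature.

```python
from collections import defaultdict

# Hypothetical sample posts (post id -> message).
posts = {
    1: "我在饭店",      # contains the exact sequence 在饭店
    2: "饭在店里",      # same three characters, different order
    3: "今天去饭店",    # contains 饭店 but not 在
}

# Per-character inverted index: character -> set of post ids.
index = defaultdict(set)
for post_id, text in posts.items():
    for ch in text:
        index[ch].add(post_id)


def search_any(query):
    """Posts containing at least one of the query characters."""
    hits = set()
    for ch in query:
        hits |= index.get(ch, set())
    return sorted(hits)


def search_all(query):
    """Posts containing every query character, in any order or position."""
    sets = [index.get(ch, set()) for ch in query]
    return sorted(set.intersection(*sets)) if sets else []


def search_phrase(query):
    """Posts containing the characters adjacent and in order (exact match)."""
    return sorted(pid for pid, text in posts.items() if query in text)
```

With these sample posts, only `search_phrase("在饭店")` narrows the result down to the post qie would actually want.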


#11 2010-03-10 09:17:18

qie
Member
Registered: 2008-06-02
Posts: 376

Re: CJK search support

Just take "在饭店" as the example:

"在饭店" is a word that makes sense; "在" alone is a word that makes no sense, "饭" alone makes no sense, and "店" alone makes no sense.



#12 2010-03-10 09:17:48

Franz
Lead developer
From: Germany
Registered: 2008-05-13
Posts: 4,072
Website

Re: CJK search support

I guess the problem with these, though, is that a changed order sometimes gives them a completely different meaning (even with the same symbols)...

Last edited by Franz (2010-03-10 09:17:58)



#13 2010-03-10 09:21:12

Franz
Lead developer
From: Germany
Registered: 2008-05-13
Posts: 4,072
Website

Re: CJK search support

qie wrote:

Just take "在饭店" as the example:

"在饭店" is a word that makes sense; "在" alone is a word that makes no sense, "饭" alone makes no sense, and "店" alone makes no sense.

Well, then probably no one will have them in their posts. The question that remains is what these symbols (sorry) mean when they are arranged in another order, like so:

店在饭



#14 2010-03-10 09:27:08

qie
Member
Registered: 2008-06-02
Posts: 376

Re: CJK search support

Franz wrote:

I guess the problem with these, though, is that a changed order sometimes gives them a completely different meaning (even with the same symbols)...

Changing the order gives them a different meaning. "在饭店" has three characters: 在, 饭, 店. 在 means "in" and 饭店 means "hotel". But if you split "饭店", you get other meanings: 饭 is a meal, something like food, and 店 means something like a shop, a hotel, or a restaurant.

If I search for "在饭店", the results should not include matches for the single preposition 在, which is a word like "in" or "at".

Last edited by qie (2010-03-10 09:28:07)



#15 2010-03-10 09:30:19

qie
Member
Registered: 2008-06-02
Posts: 376

Re: CJK search support

Franz wrote:
qie wrote:

Just take "在饭店" as the example:

"在饭店" is a word that makes sense; "在" alone is a word that makes no sense, "饭" alone makes no sense, and "店" alone makes no sense.

Well, then probably no one will have them in their posts. The question that remains is what these symbols (sorry) mean when they are arranged in another order, like so:

店在饭


店在饭 is not a logical Chinese word. If forced to translate it, maybe "the restaurant's cooks are preparing lunch".

Last edited by qie (2010-03-10 09:31:44)



#16 2010-03-10 11:08:20

Reines
Lead developer
From: Scotland
Registered: 2008-05-11
Posts: 3,165
Website

Re: CJK search support

Wikipedia wrote:

Modern Chinese has often been erroneously classed as a "monosyllabic" language. While most of the morphemes are single syllable, modern Chinese today is much less a monosyllabic language in that nouns, adjectives and verbs are largely di-syllabic. The tendency to create disyllabic words in the modern Chinese languages, particularly in Mandarin, has been particularly pronounced when compared to Classical Chinese. Classical Chinese is a highly isolating language, with each morpheme generally corresponding to a single syllable and a single character; Modern Chinese though, has the tendency to form new words through disyllabic, trisyllabic and tetra-character agglutination.

Based on that, and what qie said, it sounds like assuming 1 character = 1 word is actually more of an issue than I was hoping. qie, didn't you say searching for Chinese words worked in phpBB3 though? They also assume 1 word = 1 character, so I'm not sure how it can work properly there but not with this patch.


#17 2010-03-10 11:50:41

qie
Member
Registered: 2008-06-02
Posts: 376

Re: CJK search support

Although phpBB3 is a widely used PHP forum, I haven't used it; I'm a newbie with phpBB. I just know that bbPress, MyLittleForum, UseBB and vBB can perform Chinese searches.

Also: MyBB can do this; I just tested it.

Last edited by qie (2010-03-10 12:04:29)



#18 2010-03-10 12:30:58

Reines
Lead developer
From: Scotland
Registered: 2008-05-11
Posts: 3,165
Website

Re: CJK search support

The problem with that is a lot of those that you listed do not perform search indexing, meaning while their searches are much simpler they are also much less efficient.

I guess we could possibly have the ability to switch between indexed search and the less efficient version, or possibly check if the search terms have CJK characters in them and if they do automatically switch to the less efficient version.
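
The automatic switch could look something like the following Python sketch. It is an assumption-laden illustration, not FluxBB code: the character ranges, the name `choose_search_mode`, and the two mode labels are all made up for the example.

```python
import re

# Assumed CJK ranges: kana, CJK Unified Ideographs, Hangul Syllables.
CJK_RE = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af]")


def choose_search_mode(keywords):
    """Return "like" for queries containing CJK characters, else "indexed"."""
    return "like" if CJK_RE.search(keywords) else "indexed"
```

A query such as "FluxBB 搜索" would fall back to the slower path as a whole, which keeps the dispatch simple at the cost of efficiency for mixed queries.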

I haven't checked all that you listed yet, but so far:

  • phpBB3 - Treats CJK characters as individual words and indexes them separately (like this patch does).

  • bbPress

  • MyLittleForum

  • UseBB - Has a min length of 3 chars and uses a very inefficient search algorithm (SELECT ... WHERE p.content LIKE '%keyword%').

  • vBulletin

  • MyBB - Also uses an inefficient search algorithm (SELECT ... WHERE t.message LIKE '%keyword%').
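
For reference, the LIKE '%keyword%' approach boils down to something like this. It is a self-contained Python sketch using an in-memory SQLite database; the table name, column names and sample rows are hypothetical and not taken from UseBB or MyBB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, message TEXT)")
conn.executemany(
    "INSERT INTO posts (id, message) VALUES (?, ?)",
    [(1, "我在饭店"), (2, "饭在店里"), (3, "hello world")],
)


def like_search(keyword):
    """Substring search over every row; no word index is involved."""
    # Escape LIKE wildcards so user input is matched literally.
    escaped = keyword.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT id FROM posts WHERE message LIKE ? ESCAPE '\\' ORDER BY id",
        ("%" + escaped + "%",),
    )
    return [row[0] for row in rows]
```

Because the pattern starts with a wildcard, the database generally has to scan every message, which is why this is simple and works for CJK text but scales poorly.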


#19 2010-03-10 12:35:34

qie
Member
Registered: 2008-06-02
Posts: 376

Re: CJK search support

OK, so I think we should just use phpBB3's method?

But after I rebuilt the search index, the database got very big: the search_matches table grew to 100Mb.



#20 2010-03-10 12:36:31

Reines
Lead developer
From: Scotland
Registered: 2008-05-11
Posts: 3,165
Website

Re: CJK search support

How many topics/posts do you have in your database?


#21 2010-03-10 12:48:34

qie
Member
Registered: 2008-06-02
Posts: 376

Re: CJK search support

About 30000/70000.

By the way, I rebuilt the index with the original search.php and search_idx.php and got a 9Mb search_matches table.

But with this index I still cannot search for something like "http", "http://" or "www.xxx.net", which some posts contain. Is it because the URL does not contain spaces?

Last edited by qie (2010-03-10 12:53:55)



#22 2010-03-10 12:57:15

Reines
Lead developer
From: Scotland
Registered: 2008-05-11
Posts: 3,165
Website

Re: CJK search support

It looks like maybe splitting the characters up like this isn't the best plan then. I knew it would obviously increase the search tables size in forums using CJK, but from 9Mb to 100Mb is rather dramatic.
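
A toy Python sketch of why the table grows, assuming one match row per distinct word per post (the helper names and the two sample posts, taken from qie's example sentence, are only for illustration and do not reproduce the 9Mb/100Mb figures):

```python
def whitespace_words(text):
    """Distinct words when splitting on whitespace only."""
    return set(text.split())


def per_character_words(text):
    """Distinct "words" when every non-space character is its own word."""
    return {ch for ch in text if not ch.isspace()}


posts = ["我是一个非常爱好骑自行车的人", "我每天都要骑车外出锻炼"]

# One row per (post, distinct word) pair under each scheme.
rows_whitespace = sum(len(whitespace_words(p)) for p in posts)
rows_per_char = sum(len(per_character_words(p)) for p in posts)
```

Every distinct character in a post becomes its own row, so the row count scales with post length rather than with the number of space-separated words.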

You're not meant to be able to search for http because it is in the stopword list (the default English one at least), plus bbcode is stripped out to prevent the search_words table getting filled with URLs (from links & images).
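
Roughly, the filtering works along these lines. This is a hedged Python sketch, not the actual FluxBB indexer: the stopword set, the bbcode regexes and the name `indexable_words` are assumptions, and this version drops the whole [url]/[img] element including any link text.

```python
import re

STOPWORDS = {"http", "www", "the", "and"}   # hypothetical stopword list

# Whole [url]...[/url] and [img]...[/img] elements, with optional "=value".
BBCODE_URL_RE = re.compile(
    r"\[(url|img)(=[^\]]*)?\].*?\[/\1\]", re.IGNORECASE | re.DOTALL
)
# Any remaining opening or closing tag such as [b] or [/quote].
BBCODE_TAG_RE = re.compile(r"\[/?[a-z*]+(=[^\]]*)?\]", re.IGNORECASE)


def indexable_words(message, min_len=3):
    """Strip bbcode, then keep words that are long enough and not stopwords."""
    text = BBCODE_URL_RE.sub(" ", message)
    text = BBCODE_TAG_RE.sub(" ", text)
    words = re.findall(r"\w+", text.lower())
    return [w for w in words if len(w) >= min_len and w not in STOPWORDS]
```

The default `min_len=3` follows the "3 chars or more" rule mentioned earlier in the thread.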


#23 2010-03-10 13:06:03

qie
Member
Registered: 2008-06-02
Posts: 376

Re: CJK search support

Here is a "search.php" file from a native Chinese-language forum, "Discuz":

<?php

/*
[Discuz!] (C)2001-2009 Comsenz Inc.
This is NOT a freeware, use is subject to license terms

$Id: search.php 20900 2009-10-29 02:49:38Z tiger $
*/

define('NOROBOT', TRUE);
define('CURSCRIPT', 'search');

require_once './include/common.inc.php';
require_once DISCUZ_ROOT.'./include/forum.func.php';
require_once DISCUZ_ROOT.'./forumdata/cache/cache_forums.php';
require_once DISCUZ_ROOT.'./forumdata/cache/cache_icons.php';

$discuz_action = 111;

$cachelife_time = 300;        // Life span for cache of searching in specified range of time
$cachelife_text = 3600;        // Life span for cache of text searching

$sdb = loadmultiserver('search');

$srchtype = empty($srchtype) ? '' : trim($srchtype);
$checkarray = array('posts' => '', 'trade' => '', 'qihoo' => '', 'threadsort' => '');

$searchid = isset($searchid) ? intval($searchid) : 0;

if($srchtype == 'trade' || $srchtype == 'threadsort' || $srchtype == 'qihoo') {
    $checkarray[$srchtype] = 'checked';
} elseif($srchtype == 'title' || $srchtype == 'fulltext') {
    $checkarray['posts'] = 'checked';
} else {
    $srchtype = '';
    $checkarray['posts'] = 'checked';
}

$keyword = isset($srchtxt) ? htmlspecialchars(trim($srchtxt)) : '';

$threadsorts = '';
if($srchtype == 'threadsort') {
    $query = $db->query("SELECT * FROM {$tablepre}threadtypes WHERE special='1' ORDER BY displayorder");
    while($type = $db->fetch_array($query)) {
        $threadsorts .= '<option value="'.$type['typeid'].'" '.($type['typeid'] == intval($sortid) ? 'selected=selected' : '').'>'.$type['name'].'</option>';
    }
}

$forumselect = forumselect('', '', '', TRUE);
if(!empty($srchfid) && !is_numeric($srchfid)) {
    $forumselect = str_replace('<option value="'.$srchfid.'">', '<option value="'.$srchfid.'" selected="selected">', $forumselect);
}

$disabled = array();
$disabled['title'] = !$allowsearch ? 'disabled' : '';
$disabled['fulltext'] = $allowsearch != 2 ? 'disabled' : '';

if(!submitcheck('searchsubmit', 1)) {

    include template('search');

} else {

    if($srchtype == 'qihoo') {

        require DISCUZ_ROOT.'./include/search_qihoo.inc.php';
        exit();

    } elseif(!$allowsearch) {

        showmessage('group_nopermission', NULL, 'NOPERM');

    } elseif($srchtype == 'trade') {

        require DISCUZ_ROOT.'./include/search_trade.inc.php';
        exit;

    } elseif($srchtype == 'threadsort' && $sortid) {

        require DISCUZ_ROOT.'./include/search_sort.inc.php';
        exit;

    }

    $orderby = in_array($orderby, array('dateline', 'replies', 'views')) ? $orderby : 'lastpost';
    $ascdesc = isset($ascdesc) && $ascdesc == 'asc' ? 'asc' : 'desc';

    if(!empty($searchid)) {

        require_once DISCUZ_ROOT.'./include/misc.func.php';

        $page = max(1, intval($page));
        $start_limit = ($page - 1) * $tpp;

        $index = $sdb->fetch_first("SELECT searchstring, keywords, threads, tids FROM {$tablepre}searchindex WHERE searchid='$searchid'");
        if(!$index) {
            showmessage('search_id_invalid');
        }

        $keyword = htmlspecialchars($index['keywords']);
        $keyword = $keyword != '' ? str_replace('+', ' ', $keyword) : '';

        $index['keywords'] = rawurlencode($index['keywords']);
        $index['searchtype'] = preg_replace("/^([a-z]+)\|.*/", "\\1", $index['searchstring']);

        $threadlist = array();
        $query = $sdb->query("SELECT * FROM {$tablepre}threads WHERE tid IN ($index[tids]) AND displayorder>='0' ORDER BY $orderby $ascdesc LIMIT $start_limit, $tpp");
        while($thread = $sdb->fetch_array($query)) {
            $threadlist[] = procthread($thread);
        }

        $multipage = multi($index['threads'], $tpp, $page, "search.php?searchid=$searchid&orderby=$orderby&ascdesc=$ascdesc&searchsubmit=yes");

        $url_forward = 'search.php?'.$_SERVER['QUERY_STRING'];

        if($prompts['newbietask'] && $newbietaskid && $newbietasks[$newbietaskid]['scriptname'] == 'search'){
            require_once DISCUZ_ROOT.'./include/task.func.php';
            task_newbie_complete();
        }

        include template('search');

    } else {

        !($exempt & 2) && checklowerlimit($creditspolicy['search'], -1);

        $srchuname = isset($srchuname) ? trim($srchuname) : '';

        if($allowsearch == 2 && $srchtype == 'fulltext') {
            periodscheck('searchbanperiods');
        } elseif($srchtype != 'title') {
            $srchtype = 'title';
        }

        $forumsarray = array();
        if(!empty($srchfid)) {
            foreach((is_array($srchfid) ? $srchfid : explode('_', $srchfid)) as $forum) {
                if($forum = intval(trim($forum))) {
                    $forumsarray[] = $forum;
                }
            }
        }

        $fids = $comma = '';
        foreach($_DCACHE['forums'] as $fid => $forum) {
            if($forum['type'] != 'group' && (!$forum['viewperm'] && $readaccess) || ($forum['viewperm'] && forumperm($forum['viewperm']))) {
                if(!$forumsarray || in_array($fid, $forumsarray)) {
                    $fids .= "$comma'$fid'";
                    $comma = ',';
                }
            }
        }

        if($threadplugins && $specialplugin) {
            $specialpluginstr = implode("','", $specialplugin);
            $special[] = 127;
        } else {
            $specialpluginstr = '';
        }
        $specials = $special ? implode(',', $special) : '';
        $srchfilter = in_array($srchfilter, array('all', 'digest', 'top')) ? $srchfilter : 'all';

        $searchstring = $srchtype.'|'.addslashes($srchtxt).'|'.intval($srchuid).'|'.$srchuname.'|'.addslashes($fids).'|'.intval($srchfrom).'|'.intval($before).'|'.$srchfilter.'|'.$specials.'|'.$specialpluginstr;
        $searchindex = array('id' => 0, 'dateline' => '0');

        $query = $sdb->query("SELECT searchid, dateline,
            ('$searchctrl'<>'0' AND ".(empty($discuz_uid) ? "useip='$onlineip'" : "uid='$discuz_uid'")." AND $timestamp-dateline<$searchctrl) AS flood,
            (searchstring='$searchstring' AND expiration>'$timestamp') AS indexvalid
            FROM {$tablepre}searchindex
            WHERE ('$searchctrl'<>'0' AND ".(empty($discuz_uid) ? "useip='$onlineip'" : "uid='$discuz_uid'")." AND $timestamp-dateline<$searchctrl) OR (searchstring='$searchstring' AND expiration>'$timestamp')
            ORDER BY flood");

        while($index = $sdb->fetch_array($query)) {
            if($index['indexvalid'] && $index['dateline'] > $searchindex['dateline']) {
                $searchindex = array('id' => $index['searchid'], 'dateline' => $index['dateline']);
                break;
            } elseif($adminid != '1' && $index['flood']) {
                showmessage('search_ctrl', 'search.php');
            }
        }

        if($searchindex['id']) {

            $searchid = $searchindex['id'];

        } else {

            if(!$srchtxt && !$srchuid && !$srchuname && !$srchfrom && !in_array($srchfilter, array('digest', 'top')) && !is_array($special)) {
                showmessage('search_invalid', 'search.php');
            } elseif(isset($srchfid) && $srchfid != 'all' && !(is_array($srchfid) && in_array('all', $srchfid)) && empty($forumsarray)) {
                showmessage('search_forum_invalid', 'search.php');
            } elseif(!$fids) {
                showmessage('group_nopermission', NULL, 'NOPERM');
            }

            if($adminid != '1' && $maxspm) {
                if(($sdb->result_first("SELECT COUNT(*) FROM {$tablepre}searchindex WHERE dateline>'$timestamp'-60")) >= $maxspm) {
                    showmessage('search_toomany', 'search.php');
                }
            }

            $digestltd = $srchfilter == 'digest' ? "t.digest>'0' AND" : '';
            $topltd = $srchfilter == 'top' ? "AND t.displayorder>'0'" : "AND t.displayorder>='0'";

            if(!empty($srchfrom) && empty($srchtxt) && empty($srchuid) && empty($srchuname)) {

                $searchfrom = $before ? '<=' : '>=';
                $searchfrom .= $timestamp - $srchfrom;
                $sqlsrch = "FROM {$tablepre}threads t WHERE $digestltd t.fid IN ($fids) $topltd AND t.lastpost$searchfrom";
                $expiration = $timestamp + $cachelife_time;
                $keywords = '';

            } else {

                $sqlsrch = $srchtype == 'fulltext' ?
                "FROM {$tablepre}posts p, {$tablepre}threads t WHERE $digestltd t.fid IN ($fids) $topltd AND p.tid=t.tid AND p.invisible='0'" :
                "FROM {$tablepre}threads t WHERE $digestltd t.fid IN ($fids) $topltd";

                if($srchuname) {
                    $srchuid = $comma = '';
                    $srchuname = str_replace('*', '%', addcslashes($srchuname, '%_'));
                    $query = $db->query("SELECT uid FROM {$tablepre}members WHERE username LIKE '".str_replace('_', '\_', $srchuname)."' LIMIT 50");
                    while($member = $db->fetch_array($query)) {
                        $srchuid .= "$comma'$member[uid]'";
                        $comma = ', ';
                    }
                    if(!$srchuid) {
                        $sqlsrch .= ' AND 0';
                    }
                } elseif($srchuid) {
                    $srchuid = "'$srchuid'";
                }

                if($srchtxt) {
                    if(preg_match("(AND|\+|&|\s)", $srchtxt) && !preg_match("(OR|\|)", $srchtxt)) {
                        $andor = ' AND ';
                        $sqltxtsrch = '1';
                        $srchtxt = preg_replace("/( AND |&| )/is", "+", $srchtxt);
                    } else {
                        $andor = ' OR ';
                        $sqltxtsrch = '0';
                        $srchtxt = preg_replace("/( OR |\|)/is", "+", $srchtxt);
                    }
                    $srchtxt = str_replace('*', '%', addcslashes($srchtxt, '%_'));
                    foreach(explode('+', $srchtxt) as $text) {
                        $text = trim($text);
                        if($text) {
                            $sqltxtsrch .= $andor;
                            $sqltxtsrch .= $srchtype == 'fulltext' ? "(p.message LIKE '%".str_replace('_', '\_', $text)."%' OR p.subject LIKE '%$text%')" : "t.subject LIKE '%$text%'";
                        }
                    }
                    $sqlsrch .= " AND ($sqltxtsrch)";
                }

                if($srchuid) {
                    $sqlsrch .= ' AND '.($srchtype == 'fulltext' ? 'p' : 't').".authorid IN ($srchuid)";
                }

                if(!empty($srchfrom)) {
                    $searchfrom = ($before ? '<=' : '>=').($timestamp - $srchfrom);
                    $sqlsrch .= " AND t.lastpost$searchfrom";
                }

                if(!empty($specials)) {
                    $sqlsrch .=  " AND special IN (".implodeids($special).")";
                }

                if(!empty($specialpluginstr)) {
                    $sqlsrch .=  " AND iconid IN (".implodeids($specialplugin).")";
                }

                $keywords = str_replace('%', '+', $srchtxt).(trim($srchuname) ? '+'.str_replace('%', '+', $srchuname) : '');
                $expiration = $timestamp + $cachelife_text;

            }

            $threads = $tids = 0;
            $maxsearchresults = $maxsearchresults ? intval($maxsearchresults) : 500;
            $query = $sdb->query("SELECT ".($srchtype == 'fulltext' ? 'DISTINCT' : '')." t.tid, t.closed, t.author $sqlsrch ORDER BY tid DESC LIMIT $maxsearchresults");
            while($thread = $sdb->fetch_array($query)) {
                if($thread['closed'] <= 1 && $thread['author']) {
                    $tids .= ','.$thread['tid'];
                    $threads++;
                }
            }
            $db->free_result($query);

            $db->query("INSERT INTO {$tablepre}searchindex (keywords, searchstring, useip, uid, dateline, expiration, threads, tids)
                    VALUES ('$keywords', '$searchstring', '$onlineip', '$discuz_uid', '$timestamp', '$expiration', '$threads', '$tids')");
            $searchid = $db->insert_id();

            !($exempt & 2) && updatecredits($discuz_uid, $creditspolicy['search'], -1);

        }

        showmessage('search_redirect', "search.php?searchid=$searchid&orderby=$orderby&ascdesc=$ascdesc&searchsubmit=yes");

    }

}

?>


#24 2010-03-10 13:09:13

qie
Member
Registered: 2008-06-02
Posts: 376

Re: CJK search support

Here is search.php from another native Chinese forum script, phpwind, for reference:

<?php
define('SCR','search');
require_once('global.php');
!$_G['allowsearch'] && Showmsg('search_group_right');
if ($groupid!=3 && $groupid!=4) {
    list($db_opensch,$db_schstart,$db_schend) = explode("\t",$db_opensch);
    if ($db_opensch && (($db_schstart > $db_schend) && ($_time['hours']>$db_schend) && ($_time['hours']<$db_schstart) || ($db_schstart < $db_schend) && (($db_schstart>-1 && $_time['hours']<$db_schstart) || ($db_schend>-1 && $_time['hours']>=$db_schend)))) {
        Showmsg('search_opensch');
    }
}
InitGP(array('sch_type','keyword','authorid','step','method','f_fid','sch_time','orderway','asc','pwuser','advanced'));
InitGP(array('sch_area','newatc','digest'),'GP',2);
$sch_area>0 && $_G['allowsearch']!=2 && Showmsg('search_tpost');
if (isset($authorid) && (int)$authorid<1) {
    $errorname = $authorid;
    Showmsg('user_not_exists');
}
if ($sch_time=='newatc') {
    $newatc = 1;
    $sch_time = 86400;
}

if ($keyword) {
    $keyword = str_replace('&nbsp;','',$keyword);
    $skeyword = $keyword;
    $keyword_A = explode(' ',$keyword);
    foreach ($keyword_A as $key=>$value) {
        $value = trim($value);
        if (empty($value)) {
            unset($keyword_A[$key]);
        } else {
            $keyword_A[$key] = $value;
        }
    }
    $keyword = $keyword_A ? implode('|',$keyword_A) : '';
    $keyword && strlen($keyword)<3  && Showmsg('search_word_limit');
    $metakeyword = strip_tags($keyword);
    $subject = "$metakeyword - ";
    $db_metakeyword = str_replace('|',',',$metakeyword);
}
require_once(R_P.'require/header.php');

$forumadd = $p_table = $f = $db_searchinfo = '';
$fidout = array('0');
($newatc || is_numeric($authorid) || $digest) && $step = 2;

require_once(D_P.'data/bbscache/forumcache.php');
$query = $db->query('SELECT fid,allowvisit,password '.($step!=2 ? ',name,f_type' : '')." FROM pw_forums WHERE type<>'category'");
while ($rt = $db->fetch_array($query)) {
    $allowvisit = (!$rt['allowvisit'] || $rt['allowvisit']!=str_replace(",$groupid,",'',$rt['allowvisit'])) ? true : false;
    if ($rt['f_type']=='hidden' && $allowvisit) {
        $forumadd .= "<option value=\"$rt[fid]\"> &nbsp;|- $rt[name]</option>";
    } elseif ($rt['password'] || !$allowvisit) {
        if ($step!=2) {
            $forumcache = preg_replace("/\<option value=\"$rt[fid]\"\>(.+?)\<\/option\>\\r?\\n/is",'',$forumcache);
        } else {
            $fidout[] = $rt['fid'];
        }
    }
}
$fidout = pwImplode($fidout);
$_G['schtime']!='all' && !is_numeric($_G['schtime']) && $_G['schtime'] = 7776000;
list($f,$db_searchinfo) = explode("\t",readover(D_P.'data/bbscache/info.txt'));
$disable = $_G['allowsearch']==1 ? 'disabled' : '';
if ($_G['allowsearch']==2) {
    $t_table = '';
    if ($db_plist && count($db_plist)>1) {

        $p_table = "<select name=\"ptable\">";
        foreach ($db_plist as $key=>$val) {
            $name = $val ? $val : ($key != 0 ? getLangInfo('other','posttable').$key : getLangInfo('other','posttable'));
            $p_table .= "<option value=\"$key\">".$name."</option>";
        }
        $p_table .= '</select>';
    }
    if ($db_tlist) {
        $t_table = '<select name="ttable">';
        foreach ($db_tlist as $key => $value) {
            $name = !empty($value['2']) ? $value['2'] : ($key == 0 ? 'tmsgs' : 'tmsgs'.$key);
            $t_table .= "<option value=\"$key\">$name</option>";
        }
        $t_table .= '</select>';
    }
}

${'time_'.$_G['schtime']} = 'selected';
$method == 'OR' ? $checked_or = 'checked' : $checked_and = 'checked';
$sch_area == 2 ? $checked_2 = 'checked' : ($sch_area == 1 ? $checked_1 = 'checked' : $checked_0 = 'checked');
$checked_disget = $digest==1 ? 'checked' : '';
if ($f_fid) {
    $forumcache = preg_replace("/\<option value=\"$f_fid\"\>(.+?)\<\/option\>(\\r?\\n)/is","<option value=\"".$f_fid."\" selected>\\1</option>\\2",$forumcache);
    $forumadd = preg_replace("/\<option value=\"$f_fid\"\>(.+?)\<\/option\>(\\r?\\n)/is","<option value=\"".$f_fid."\" selected>\\1</option>\\2",$forumadd);
}
$sch_time && ${'time_'.$sch_time} = 'selected';
${'order_'.$orderway} = 'selected';
$asc == 'ASC' ? $asc_ASC = 'checked' : $asc_DESC = 'checked';
if ($step == 2) {
    include(D_P.'data/bbscache/forum_cache.php');
    @set_time_limit(0);
    $keyword_A = array();
    $schedid = '';
    InitGP(array('sid','seekfid','page','ptable'));
    $f_fid = (int)$f_fid;
    !$seekfid && $seekfid = (empty($f_fid) || $f_fid=='all') ? 'all' : $f_fid;
    if ($seekfid != 'all') {
        $seekfid = (int)$seekfid;
    }
    $admincheck = $total = 0;
    $isGM = CkInArray($windid,$manager);
    if ($seekfid!='all') {
        if ($isGM) {
            $admincheck = 1;
        } else {
            $foruminfo = $db->get_one("SELECT forumadmin,fupadmin FROM pw_forums WHERE fid=".pwEscape($seekfid));
            $isBM = admincheck($foruminfo['forumadmin'],$foruminfo['fupadmin'],$windid);
            $pwSystem = pwRights($isBM,false,$seekfid);
            if ($pwSystem && ($pwSystem['tpccheck'] || $pwSystem['digestadmin'] || $pwSystem['lockadmin'] || $pwSystem['pushadmin'] || $pwSystem['coloradmin'] || $pwSystem['downadmin'] || $pwSystem['delatc'] || $pwSystem['moveatc'] || $pwSystem['copyatc'] || $pwSystem['topped'])) {
                $admincheck = 1;
            }
        }
    }
    $superRight = ($SYSTEM['superright'] && $SYSTEM['delatc']) ? true : false;/* super delete permission */
    $superEdit = ($SYSTEM['superright'] && $SYSTEM['deltpcs']) ? true : false;/* super edit permission */
    unset($f_fid);
    if($db_sphinx['isopen'] == 1 && $keyword){
        require_once R_P.'require/sphinxsearch.php';
    }else{
        require_once R_P.'require/normalsearch.php';
    }
}
require_once PrintEot('search');footer();

?>

Last edited by qie (2010-03-10 13:10:05)



#25 2010-03-10 13:19:29

Reines
Lead developer
From: Scotland
Registered: 2008-05-11
Posts: 3,165
Website

Re: CJK search support

That first one seems to use the rather inefficient version again (LIKE '%keyword%' queries). For the bottom one I'm not sure; it looks like the actual search function is contained within a file called "require/normalsearch.php".
