Senseless Search Failure

Ty Durden

Member
Dec 7, 2009
63
1
Trying to understand something, not seeing why the behavior is what it is --

On one of the larger threads, there are literally 100s of pages. I've found by actually looking through that there are literally dozens of pages which have the specific video "type" I'm looking for (for example -- IAT, JUC, BKSP, whatever).

A "search within thread", however, says there is like ONE SINGLE entry, a result I know for an absolute fact is totally wrong.

Why is this search algorithm failing to do what it's obviously there for?

Since the system doesn't index names or identifiers (e.g., MIDD273) so that you can search explicitly for a video of interest, a search feature that fails to properly support any related action seriously reduces the site's utility.
:brucelee:

I'm not complaining, it's a beggars-can't-be-choosers thing, I ack -- it's more a "Why does the site NOT work as one would hope it would?"
:guaah:
 

Rollyco

Team Tomoe
Oct 4, 2007
3,562
34
What thread, and what search term are you inputting?
 

guy

(;Θ_Θ)ゝ”
Feb 11, 2007
2,079
43
A "search within thread", however, says there is like ONE SINGLE entry, a result I know for an absolute fact is totally wrong.

Works fine for me. "Search this thread" for the term "thread" and I got two results. Now it should be three, including this post.

Remember, search terms are character-specific, so a search for "MIDD273" would not match any post that contains "MIDD-273" or "MIDD 273", and vice versa. And since the contents of the post are entirely up to the poster (what words they include, etc), there's no way for a person to predict which posts have the content they want -- let alone a search algorithm.
 

Ty Durden

Member
Dec 7, 2009
63
1
Item 1:
The "search" excludes terms of three characters or fewer. Since the identifiers of a good 3/5ths of JAV titles begin with a three-character sequence, you pretty much put 3/5ths of the obvious search terms out of bounds. This should be dropped to excluding two-character sequences (which is still annoying, but I grant that such a search would probably be ridiculous), and/or you ought to encourage some mechanism for adding the ids (ex: ATI120 & ATI) as keywords for posts which contain such referents (and this assumes the keyword search is not also limited to >3 chars, which may well be the case). If the result count goes over a certain rational number of entries (say, 1000), then fine, cut it off and warn the user, so they know they need to cut it down. Better still, allow them to do a boolean "AND/OR/NOT" search, something there's no evidence the search mechanism supports at all, despite the fact that it would be quite useful in finding a specific entry. All searches, IIRC, are handled as "or" searches, which is far, far less useful than if they were all "and" searches. I can reproduce an "or" search with multiple searches. I can't cut down on a thousand results from an "or" search by any given mechanism, since "search this thread" doesn't work, as shown below.
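
To make the "or" versus "and" point concrete, here is a minimal Python sketch using plain set operations. The sample posts, the build_index helper and the function names are all made up for illustration; this is not a claim about how the forum software works internally.

```python
from collections import defaultdict

# Hypothetical sample posts (post id -> text); not real forum data.
POSTS = {
    1: "choker pigtails uniform",
    2: "choker glasses",
    3: "pigtails swimsuit",
    4: "glasses uniform",
}

def build_index(posts):
    """Map each lowercase word to the set of post ids containing it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for word in text.lower().split():
            index[word].add(post_id)
    return index

def search_or(index, words):
    """Posts containing at least one of the words (set union)."""
    result = set()
    for word in words:
        result |= index.get(word.lower(), set())
    return result

def search_and(index, words):
    """Posts containing every one of the words (set intersection)."""
    sets = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*sets) if sets else set()

def search_not(index, include, exclude):
    """Posts containing `include` but not `exclude` (set difference)."""
    return index.get(include.lower(), set()) - index.get(exclude.lower(), set())

index = build_index(POSTS)
print(sorted(search_or(index, ["choker", "pigtails"])))   # [1, 2, 3]
print(sorted(search_and(index, ["choker", "pigtails"])))  # [1]
print(sorted(search_not(index, "choker", "pigtails")))    # [2]
```

The "or" result is just the two single-word searches stuck together, which is why I can reproduce it by hand; the "and" result is the part I cannot get without help from the search itself.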

Item 2:
Remember, search terms are character-specific, so a search for "MIDD273" would not match any post that contains "MIDD-273" or "MIDD 273", and vice versa.
I'm well aware of this. But in the instance in question it was a four-char search in a thread with multiple occurrences of the four-character code -- I reduced the search down to something as simple as MIDD, which OUGHT to find all three of the examples you provided above. I can't tell you the exact example that triggered this comment, but I suspect that if you go to a large thread you KNOW contains a number of MIDD, or MIAD, or some other occurrence, it won't find all of them.

Here's a working example of this defect -- this thread has an SSPD in this entry (see the next to last file referenced, which is an SSPD)

Now, pull up that thread, click on "search this thread" and enter "SSPD":
http://www.akiba-online.com/forum/search.php?do=process&searchthreadid=111388
The response?
1. Sorry - no matches. Please try some different terms.

QED -- either the "search this thread" function
a) does not work reliably,
b) does not work at all,
or
c) does not work in a manner reasonably expected.

I contend that, in all likelihood, the same error occurs with the whole search function and not merely the "search this thread" sub-function.

=================================

I do wish to re-iterate -- this is an acked "beggars can't be choosers" issue -- you're under no obligation to deal with it -- I'm not "commanding you to fix my problem" -- but given that there are thousands of items listed on the site, the tools provided for looking for specific items strike me as woefully inadequate, as well as ineffective, for the task.

I'm trying to make sure that you realize this is intended as constructive feedback and not "whining demands for satisfaction from people under no obligation to me in the first place"...

Ideally, it should be easy to find the entry for any specific item if you have the jav box code, assuming someone has posted it up. I would think that you would concur with this.

I've instead found it frustratingly difficult, partly due to the lack of a good search function that actually provides decent results. I think part of it is probably due to the forum software, but that's the kind of thing you guys can possibly "fix" eventually by kvetching in the right ears about it.

I'd also think that providing a standard set of searchable keyword formats (e.g., MIDD, MIDD256, MIDD-256) appropriate to each entry posted should be encouraged (NOT required) as the "ideal standard" for a top-flight poster. It would make the place a bit less chaotic, probably cut back on multiple posts for the same subject matter, and make the whole site much more useful and user-friendly in finding a specific file one wants.
:study:
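
As a rough illustration of the keyword idea, a poster (or a helper script) could derive the three formats from a single code. This is only a sketch: code_keywords is a hypothetical name, and it assumes a code is letters, an optional hyphen or space, then digits.

```python
import re

def code_keywords(code):
    """Return keyword variants for an identifier such as 'MIDD-256'.

    Assumes the identifier is letters, an optional hyphen or space,
    then digits. Returns an empty list for anything else.
    """
    match = re.fullmatch(r"\s*([A-Za-z]+)[-\s]?(\d+)\s*", code)
    if match is None:
        return []
    prefix, number = match.group(1).upper(), match.group(2)
    return [prefix, prefix + number, prefix + "-" + number]

print(code_keywords("midd 256"))  # ['MIDD', 'MIDD256', 'MIDD-256']
```

Whether the bare prefix is searchable would still depend on the minimum word length the search enforces.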
 

Rollyco

Team Tomoe
Oct 4, 2007
3,562
34
The tools to do what you want are there:

To get "AND" functionality, prefix required search terms with the + character. For example, try searching for choker pigtails, then try +choker +pigtails

"NOT" appears to work as well; try it and see. I input +choker NOT pigtails and got the expected result.

To find that SSPD entry in the travisparkers JAVPalace thread you posted, input SSPD* into the "Search this thread" field. In this case *SSPD* is not necessary, because parentheses are database "word delimiters" (like the period and comma) and are not indexed.

Three-character searches are possible, but carry a performance impact. I might enable them in the future if I have time to monitor the effects.
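
For anyone wondering why SSPD finds nothing while SSPD* works, here is a simplified Python sketch of word-based matching. It assumes punctuation splits words, words under a minimum length are not indexed, and a trailing * means "starts with". MIN_WORD_LEN, tokenize and matches are hypothetical names, the sample post is invented, and the real database differs in the details.

```python
import re

MIN_WORD_LEN = 4  # assumed cutoff: words shorter than this are not indexed

def tokenize(text):
    """Split text into lowercase words; any non-alphanumeric character separates words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if len(w) >= MIN_WORD_LEN]

def matches(term, text):
    """True if `term` equals a whole indexed word, or is a word prefix when it ends in '*'."""
    words = tokenize(text)
    term = term.lower()
    if term.endswith("*"):
        prefix = term[:-1]
        return any(w.startswith(prefix) for w in words)
    return term in words

post = "New upload (SSPD058) and JUC 101"  # invented sample text

print(tokenize(post))          # ['upload', 'sspd058']
print(matches("SSPD", post))   # False: 'sspd' is not a whole word in the post
print(matches("SSPD*", post))  # True: 'sspd058' starts with 'sspd'
print(matches("JUC", post))    # False: three-character words are not indexed
```

The parentheses never reach the index, so only the trailing wildcard is needed to reach the code inside them.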
 

Ty Durden

Member
Dec 7, 2009
63
1
a) Thank you very much.

b) "To find that SSPD entry in the travisparkers JAVPalace thread you posted, input SSPD* into the "Search this thread" field. In this case *SSPD* is not necessary, because parentheses are database "word delimiters" (like the period and comma) and are not indexed."

I take it then that searches are word-based, not character-based. This is distinctly counterintuitive, and, if I may suggest, possibly ought to be indicated somewhere in the search description text, along with the above kind of example. This explains a lot and should make finding some stuff much easier.

A final suggestion (or query, if it already exists somehow):
Is it possible to search a given set of results with another search -- that is, could I do a search for "JUC" and then do a search for "hikari" on that set of results? I suppose you could probably do this with an "and" search, of course.


Again -- thank you.
 

Ty Durden

Member
Dec 7, 2009
63
1
Hmmm. Are you certain that "+x +y" searches for threads with both of them?

Because it appears to me that it searches for threads with either. I tried a two-word search and it listed off a number of threads that did not have both words... just one of them. In several cases, the subsequent searches within the thread did not produce both words. In each case, there was always one of them there, but there was not necessarily the other.
:notagain:
 

Rollyco

Team Tomoe
Oct 4, 2007
3,562
34
Can you give me an example search query that isn't behaving properly?
 

Rollyco

Team Tomoe
Oct 4, 2007
3,562
34
That is a vBulletin bug. MySQL fulltext boolean searching works fine on our server, but vBulletin mangles certain terms before sending them to the database. Problems related to this bug that I've seen so far are:

1) "-" prefix does not work.
2) When sending more than one search term, wildcard characters may not work properly.
 

Ty Durden

Member
Dec 7, 2009
63
1
So I am correct, the "search" feature isn't working.
:joker: