Trouble configuring search on metadata

Hi,

Here's a use case for the search feature that we're having trouble with:

We store a large number of files, for example standards documents, whose titles follow this format: “NF Z65-130.pdf”.

Searching titles with the FULLTEXT operator gives unsatisfying results because dashes (-) are stripped from the query (see NXQLQueryBuilder.sanitizeFulltextInput) (1). Searching for “Z65-130” (even with the double dashes) is therefore eventually translated into a query for documents that contain either “Z65” or “130”, which returns a lot of results.

A more appropriate operator seems to be LIKE, except that it requires % as the wildcard character, which is not user-friendly: depending on the field, users would have to type either * or %.

Is there any approach that would allow queries to match user expectations when they simply type “Z65-130” or “*Z65-130*”?

Thanks for your help

(1) This seems to also occur when dc:title is removed from default-repository-config.xml, i.e. when NXQL queries fall back from FULLTEXT 'query' to LIKE '%query%'.


ANSWER



You're working on the assumption that fulltext search can match something like “Z65-130”. That is not always the case: it depends heavily on the database, the database fulltext configuration, etc.

If what you're searching for is not a regular word in a known language, then fulltext search is not a good match for you unless you spend a lot of time tweaking the database fulltext parser and making sure all the Nuxeo layers know about this specific parsing, which would require code changes.

Fulltext is primarily designed for human languages.




Hi Florent,

Thanks for your answer. I forgot to mention that we're using Postgres.

But yeah, I agree with you; that's why the LIKE operator seemed like a better approach, if it weren't for the use of "%". Do you know if there's a way to use * as the wildcard for LIKE instead (whether at the Nuxeo or the Postgres level)? I get that it might feel hacky, but from the end user's point of view, having two operators (FULLTEXT and LIKE) with different wildcard syntaxes can be confusing.
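
To make the idea concrete, here is a minimal sketch of the kind of translation I have in mind, done on the application side before the NXQL query is built. The class and method names (WildcardTranslator, toLikePattern) are hypothetical and not part of Nuxeo. It assumes that backslash is accepted as the LIKE escape character all the way down to the database (it is the default escape character for LIKE in PostgreSQL, but I have not checked that NXQL passes it through unchanged), and it wraps the input in % when the user typed no wildcard at all.

```java
// Hypothetical helper, not a Nuxeo API: turns user-friendly "*" wildcards
// into a LIKE pattern, escaping characters that LIKE treats specially.
final class WildcardTranslator {

    // Assumption: backslash is honoured as the LIKE escape character.
    private static final char ESCAPE = '\\';

    private WildcardTranslator() {
    }

    static String toLikePattern(String userInput) {
        String trimmed = userInput.trim();
        StringBuilder pattern = new StringBuilder();
        for (char c : trimmed.toCharArray()) {
            switch (c) {
                case '*':
                    // User-facing wildcard becomes the LIKE wildcard.
                    pattern.append('%');
                    break;
                case '%':
                case '_':
                case '\\':
                    // Literal characters that LIKE would otherwise interpret.
                    pattern.append(ESCAPE).append(c);
                    break;
                default:
                    pattern.append(c);
            }
        }
        // No explicit wildcard: fall back to a "contains" match.
        if (trimmed.indexOf('*') < 0) {
            return "%" + pattern + "%";
        }
        return pattern.toString();
    }

    public static void main(String[] args) {
        System.out.println(toLikePattern("Z65-130"));   // %Z65-130%
        System.out.println(toLikePattern("*Z65-130*")); // %Z65-130%
        System.out.println(toLikePattern("NF Z65*"));   // NF Z65%
    }
}
```

The resulting pattern would then be used in something like dc:title LIKE '%Z65-130%'. Single quotes in the user input would still need to be escaped (or the value passed as a query parameter) before it goes into the query string; the sketch does not handle that.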

10/02/2013