How To Search Using portal_catalog

portal_catalog is a tool that indexes all documents in the site, and make it easy to search for documents. ERP5 uses ZSQLCatalog, which is like Zope's ZCatalog, but using a relational database as a backend.

Basic usage

Filtering on fields

   1 result = context.portal_catalog.searchResults(reference='what')
   2 # or simply:
   3 result = context.portal_catalog(reference='what')

This will return all the objects for which reference is 'what'.

{X} The section below is not accurate since http://svn.erp5.org/?view=rev&revision=14934 but it's left here because it's for now the only documentation for keyword_search_keys.

Note that some keys in the catalog have a special behaviour, they are used to retrieve documents which keys contains the search parameter (but not exact match). This is the case for keys that are configured as keyword_search_keys in the catalog, by default it contains title and description. For example:

   1 result = context.portal_catalog(title='what')

will return documents with title 'what', but also documents with title 'whatever'. Technically, it does a "WHERE title LIKE '%what%'". ## XXX jerome: this concept should ideally be introduced later not to confuse the reader.

Specifying portal type

   1 result = context.portal_catalog(portal_type='Person')
   2 result = context.portal_catalog(portal_type=('Person', 'Organisation'))

Sorting results

Results can be sorted on any returned field, ascending or descending. Here goes self-explanatory example:

   1 result = context.portal_catalog(
   2   portal_type='Person',
   3   sort_on=(('title','ascending'),)
   4   )
   5 result = context.portal_catalog(
   6   portal_type=('Person', 'Organisation'),
   7   sort_on=(('title', 'ascending'),
   8            ('reference', 'descending'))
   9   )

It is possible to choose sorting type: alphabetical (default) or numerical. Use CHAR or SIGNED modification:

   1 result = context.portal_catalog(
   2   portal_type='Person',
   3   sort_on=(
   4     ('id','descending','SIGNED',), # numerical
   5     ('title','ascending','CHAR'), # alphabetical
   6     ('description','ascending',), # alphabetical, default
   7     )
   8   )

Limiting results

Results can be limited, eg:

   1 result = context.portal_catalog(portal_type='Person', limit=1)
   2 result = context.portal_catalog(portal_type=('Person', 'Organisation'), limit=44)

This way you can also increase the default limit - by default, portal_catalog returns only 1 000 results, but you can increase it by specifying a higher limit, or pass limit=None for no limit (but you should always to avoid using this for obvious performance reasons).

Security

The catalog will only return documents on which the current user have the View permission. If called from a Python script with proxy roles, those proxy roles are taken into account. If you want all documents without security, call unrestrictedSearchResults or unrestrictedCountResults.

Advanced usage

Filtering by category membership

If you want the catalog to return only documents which are members of a certain category, submit the uid of the category using auto-generated key "category_uid". For example, if you want to know all women from the Corleone family, do:

   1 female_uid = context.portal_categories.gender.female.getUid()
   2 result = context.portal_catalog(gender_uid=female_uid, portal_type='Person', title='Corleone')

(but, in this case, be very careful!)

The same can be done for relations (keep in mind that a relation between objects is also a category membership!). For example, these two lines are equivalent:

   1 result = context.getSubordinationRelatedValueList()
   2 result = context.portal_catalog(subordination_uid=context.getUid())

Except that with the latter you can use all the power of portal_catalog, and it produces more usable listbox.

For delivery related relations: source, destination, source_section and destination_section this trick will work only if the objects being looked up are of one of delivery portal types (you can find out which ones are these by running getPortalDeliveryTypeList). This is because they are stored in delivery table which is queried before related keys are used. If you need to query one of these relations for a non-delivery portal type, use the following workaround:

   1 result = context.portal_catalog(default_source_section_uid=context.getUid())

It will do the job.

   1 query = Query(**{'delivery.start_date':[DateTime(2010,3,1), DateTime(2010,4,1),],'range':'minmax'})
   2 result = context.portal_catalog(query=query, portal_type='Sale Order')

There is more than minmax. Check SearchKey.py source code.

Using this in a listbox

To use this mechanism in a listbox, you need to supply the argument to the list method by setting a listbox's default parameter in TALES expressions:

python: [('category_uid',  (here.methodOrScriptToGetUid(),))]

However, this works only if listbox is used in one context only (e.g. a module). This is because listbox params are cached, so if you want to use it in an object context you have to write a script to be used as a listbox action, something like:

# (a script accepts a **kw argument)
kw['category_uid'] = here.methodOrScriptToGetUid()
return context.portal_catalog(**kw)

Note: Same rules applies to searchFolder, which is a method you can call on folderish objects that will only return document contained on this module.

Specifics of some fields

Most fields are self explanatory, like 'portal_type' or 'id', but there is a number of fields that might need a bit of an explanation.

One of those is the "owner" field. Located recently in main catalog table. The field contains the reference of the person with 'Owner' local role on the object. But only if the person has the View permission to that object. In all other cases it contains an empty string.

Search Text

Some search keys (Keyword, FullText, DateTime, Default) detect when given value follows a certain syntax, and when it does it expands it into multiple queries, which can involve distinct columns.

This detection is triggered by the presence of (at least) one of the following in searched value:

When a Search Text syntax is detected, the expression is parsed and split into multiple Query & ComplexQuery instances. Technical detail: parser actualy generates an abstract syntax tree which is then merged and transformed into queries.

The syntax should be easily readable from parser source code, and reserved keywords from lexer source code. As always, it's easier to see sample queries to get the general idea:

Assumptions:

Search Text

SQL

Remark

'foo'

default_column='foo'

'foo bar'

default_column='foo' AND default_column='bar'

Implicit operator

'foo other:bar'

default_column='foo' AND other='bar'

Valid column name

'foo unknown:bar'

default_column='foo' AND default_column='unknown:bar'

Invalid column name

'foo OR bar'

default_column='foo' OR default_column='bar'

Explicit logical operator

'foo or bar'

default_column='foo or bar'

Logical operator case is important

'>=foo'

default_column>='foo'

Comparison operator

'>= foo'

default_column='>=' AND default_column='foo'

Don't put any space between a comparison operator and compared value

(foo OR bar) AND other:baz

(default_column='foo' OR default_column='bar') AND other='baz'

Parentheses

"foo bar"

default_column='foo bar'

String Note: single quotes won't work.

Note: regular logical operator precedence takes place: AND has a higher priority than OR, and an implicit AND operator has the same priority as an explicit AND operator.

HowToSearchPortalCatalog (last edited 2010-05-12 07:08:42 by Yusei)