Class cSearchIndex

CONTENIDO API - Search Index Object.

This object creates an index of an article.

Create the object, where $db is the global CONTENIDO database object:

$oIndex = new cSearchIndex($db);

Start indexing where $aContent is the complete content of an article specified by its content types.

$oIndex->start($idart, $aContent);

$aContent looks like: Array ( [CMS_HTMLHEAD] => Array ( [1] => Herzlich Willkommen... [2] => ...auf Ihrer Website! ) [CMS_HTML] => Array ( [1] => Die Inhalte auf dieser Website ...

The index for the keyword 'willkommen' would look like '&12=1(CMS_HTMLHEAD-1)', which means the keyword 'willkommen' occurs once in the article with articleId 12, in content type CMS_HTMLHEAD[1].
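
The index string format described above can be read back with a regular expression. The following is a minimal sketch, not CONTENIDO code: the function name parseIndexString and the returned array layout are hypothetical, and it assumes that segments for several articles are simply concatenated, each starting with '&'.

```php
<?php
/**
 * Hypothetical helper (not part of the CONTENIDO API): parse an index string
 * such as "&12=2(CMS_HTMLHEAD-1,CMS_HTML-1)" into an array keyed by article id.
 */
function parseIndexString(string $index): array
{
    $result = [];
    // One match per "&<idart>=<count>(<TYPE>-<typeId>,...)" segment.
    preg_match_all('/&(\d+)=(\d+)\(([^)]*)\)/', $index, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $result[(int) $m[1]] = [
            'count'  => (int) $m[2],
            // An empty parenthesis yields no places.
            'places' => $m[3] === '' ? [] : explode(',', $m[3]),
        ];
    }
    return $result;
}

// Usage: inspect the example string from the description above.
var_export(parseIndexString('&12=1(CMS_HTMLHEAD-1)'));
```

Keying the result by article id makes it straightforward to look up the occurrence count and the content types for a single article.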

TODO: The basic idea of the indexing process is to take the complete content of an article, generate normalized index terms from it, and store a specific index structure in the relation 'con_keywords'.

Taking the complete content is not very flexible. It would be better to be able to index either specific content types or any content.

The string separated by &, =, () and - is not easy to parse when computing the search result set.

It would be a better idea (and a lot of work) to extend the relation 'con_keywords' to store keywords by articleId (or content source identifier) and content type.

The functions removeSpecialChars, setStopwords, setContentTypes and setCmsOptions should be moved into a new helper class.

Keep in mind that the classes cSearch and cSearchResult use an instance of cSearchIndex.

cSearchBaseAbstract
Extended by cSearchIndex
Package: Core\Frontend\Search
Copyright: four for business AG <www.4fb.de>
License: http://www.contenido.org/license/LIZENZ.txt
Author: Willi Man
Located at classes/search/class.search.index.php
Methods summary
public
# __construct( cDb $db = NULL )

Constructor to create an instance of this class.

Set object properties.

Parameters

$db
cDb
$db [optional] CONTENIDO database object

Throws

cDbException
cInvalidArgumentException

Overrides

cSearchBaseAbstract::__construct()
public
# start( integer $idart, array $aContent, string $place = 'auto', array $cms_options = array(), array $aStopwords = array() )

Start indexing the article.

Parameters

$idart
integer
$idart Article Id
$aContent
array
$aContent The complete content of an article specified by its content types. It looks like: [ [CMS_HTMLHEAD] => [ [1] => Herzlich Willkommen... [2] => ...auf Ihrer Website! ] [CMS_HTML] => [ [1] => Die Inhalte auf dieser Website ... ] ]
$place
string
$place [optional] The field where to store the index information in db.
$cms_options
array
$cms_options [optional] One can specify explicitly cms types which should not be indexed.
$aStopwords
array
$aStopwords [optional] Array with words which should not be indexed.

Throws

cInvalidArgumentException
cDbException
public
# createKeywords( )

For each cms-type create index structure.

It looks like: Array ( [die] => CMS_HTML-1 [inhalte] => CMS_HTML-1 [auf] => CMS_HTML-1 CMS_HTMLHEAD-2 [dieser] => CMS_HTML-1 [website] => CMS_HTML-1 CMS_HTML-1 CMS_HTMLHEAD-2 )

Throws

cInvalidArgumentException
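
To illustrate the structure shown above, here is a rough sketch of how a keyword map could be built from $aContent. It is not the actual implementation: the function name buildKeywordStructure is hypothetical, and the normalization (strip tags, ASCII lowercase, split on anything that is not a letter or digit) is an assumption made for the example.

```php
<?php
/**
 * Hypothetical sketch (not CONTENIDO code): map each normalized term to a
 * space-separated list of "<TYPE>-<typeId>" places where it occurs.
 */
function buildKeywordStructure(array $aContent, array $stopwords = []): array
{
    $keywords = [];
    foreach ($aContent as $type => $entries) {
        foreach ($entries as $typeId => $text) {
            // Assumed normalization; strtolower() only lowercases ASCII letters.
            $text  = strtolower(strip_tags((string) $text));
            $terms = preg_split('/[^\p{L}\p{N}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
            foreach ($terms as $term) {
                if (in_array($term, $stopwords, true)) {
                    continue; // stopwords are not indexed
                }
                $place = $type . '-' . $typeId;
                $keywords[$term] = isset($keywords[$term])
                    ? $keywords[$term] . ' ' . $place
                    : $place;
            }
        }
    }
    return $keywords;
}

$structure = buildKeywordStructure([
    'CMS_HTMLHEAD' => [2 => '...auf Ihrer Website!'],
    'CMS_HTML'     => [1 => 'Die Inhalte auf dieser Website'],
]);
var_export($structure);
```

A term that occurs in several content entries collects one place per occurrence, which matches the repeated entries in the example structure above.
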
public
# saveKeywords( )

Generate index_string from index structure and save keywords. The index_string looks like "&12=2(CMS_HTMLHEAD-1,CMS_HTML-1)".

Throws

cInvalidArgumentException
cDbException
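
As an illustration of the index_string format, the sketch below turns the place list of one keyword into such a string. The function name buildIndexString is hypothetical, and two details are assumptions rather than documented behavior: the count is taken as the number of occurrences, and repeated places are listed only once inside the parentheses.

```php
<?php
/**
 * Hypothetical sketch (not CONTENIDO code): build "&<idart>=<count>(<places>)"
 * from a space-separated place list such as "CMS_HTMLHEAD-1 CMS_HTML-1".
 */
function buildIndexString(int $idart, string $places): string
{
    $occurrences = preg_split('/\s+/', trim($places), -1, PREG_SPLIT_NO_EMPTY);
    // Assumption: each place is listed once, the count reflects all occurrences.
    $unique = array_values(array_unique($occurrences));
    return '&' . $idart . '=' . count($occurrences) . '(' . implode(',', $unique) . ')';
}

echo buildIndexString(12, 'CMS_HTMLHEAD-1 CMS_HTML-1'), PHP_EOL;
```
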
public
# deleteKeywords( )

If keywords don't occur in the article anymore, update index_string and delete keyword if necessary.

Throws

cInvalidArgumentException
cDbException
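
The cleanup step can be pictured as two small operations: finding keywords that were stored for the article but no longer occur in it, and removing the article's segment from an index_string. The sketch below is only an illustration under those assumptions; both function names are hypothetical and no database access is shown.

```php
<?php
/**
 * Hypothetical sketch (not CONTENIDO code): keywords present in the stored
 * index ($keywordsOld) but missing from the freshly built one ($keywordsNew).
 * Both arrays are assumed to be keyed by keyword.
 */
function findObsoleteKeywords(array $keywordsOld, array $keywordsNew): array
{
    return array_keys(array_diff_key($keywordsOld, $keywordsNew));
}

/**
 * Hypothetical sketch: drop the "&<idart>=<count>(...)" segment of one article
 * from an index string. An empty result would mean the keyword is unused.
 */
function removeArticleFromIndex(string $index, int $idart): string
{
    return (string) preg_replace('/&' . $idart . '=\d+\([^)]*\)/', '', $index);
}

var_export(findObsoleteKeywords(
    ['willkommen' => '&12=1(CMS_HTMLHEAD-1)', 'website' => '&12=1(CMS_HTML-1)'],
    ['website' => 'CMS_HTML-1']
));
```
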
public
# getKeywords( )

Get the keywords of an article.

Throws

cInvalidArgumentException
cDbException
public mixed
# removeSpecialChars( string $key )

Remove special characters from index term.

Parameters

$key
string
$key Keyword

Returns

mixed
public string
# addSpecialUmlauts( string $key )

Parameters

$key
string
$key Keyword

Returns

string
public
# setStopwords( array $aStopwords )

Set the array of stopwords which should not be indexed.

Parameters

$aStopwords
array
$aStopwords
public
# setContentTypes( )

Set the cms types.

Throws

cInvalidArgumentException
cDbException
public
# setCmsOptions( mixed $cms_options )

Set the cms_options array of CMS types that should be treated specially.

Parameters

$cms_options
mixed
$cms_options
public boolean
# checkCmsType( string $idtype )

Check whether the requested content type is excluded from indexing. Returns true if the type should not be indexed and false if it should be indexed.

Parameters

$idtype
string
$idtype

Returns

boolean
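
The inverted return value is easy to misread, so here is a small sketch of the same convention. It is an illustration only: the function name isExcludedCmsType is hypothetical, and it assumes the excluded types are given as a plain list of type names such as 'CMS_IMG' that are compared case-insensitively.

```php
<?php
/**
 * Hypothetical sketch (not CONTENIDO code): true means "do not index this
 * type", false means "index it", mirroring the documented convention.
 */
function isExcludedCmsType(string $idtype, array $cmsOptions): bool
{
    // Assumption: type names are compared case-insensitively.
    return in_array(strtoupper($idtype), array_map('strtoupper', $cmsOptions), true);
}

$excluded = ['CMS_IMG', 'CMS_LINK'];
foreach (['CMS_HTML', 'CMS_IMG'] as $type) {
    echo $type, isExcludedCmsType($type, $excluded) ? ' is skipped' : ' is indexed', PHP_EOL;
}
```
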
public array
# getCmsType( )

Returns the property _cmsType.

Returns

array
public array
# getCmsTypeSuffix( )

Returns the property _cmsTypeSuffix.

Returns

array
Methods inherited from cSearchBaseAbstract
_debug()
Properties summary
protected array $_keycode array()
#

content of the cms-types of an article

protected array $_keywords array()
#

list of keywords of an article

protected array $_stopwords array()
#

words, which should not be indexed

protected array $_keywordsOld array()
#

keywords of an article stored in the DB

protected array $_keywordsDel array()
#

keywords to be deleted

protected string $_place
#

'auto' or 'self'

The field 'auto' in table con_keywords is used for automatic indexing. The value is a string like "&12=2(CMS_HTMLHEAD-1,CMS_HTML-1)", which means a keyword occurs 2 times in article with $idart 12 and can be found in CMS_HTMLHEAD[1] and CMS_HTML[1].

The field 'self' can be used in the article properties to index the article manually.

protected array $_cmsOptions array()
#

array of cms types

protected array $_cmsType array()
#

array of all available cms types

  • htmlhead - HTML Headline
  • html - HTML Text
  • head - Headline (no HTML)
  • text - Text (no HTML)
  • img - Upload id of the element
  • imgdescr - Image description
  • link - Link (URL)
  • linktarget - Linktarget (_self, _blank, _top ...)
  • linkdescr - Linkdescription
  • swf - Upload id of the element
  • etc.

protected array $_cmsTypeSuffix array()
#

suffix of all available cms types

protected integer $idart
#
Properties inherited from cSearchBaseAbstract
$cfg, $client, $lang, $oDB
CMS CONTENIDO 4.10.1 API documentation generated by ApiGen 2.8.0