Class cSearchIndex

CONTENIDO API - Search Index Object

This object creates the search index of an article.

Create the object with $oIndex = new cSearchIndex($db); where $db is the global CONTENIDO database object. Start indexing with $oIndex->start($idart, $aContent); where $aContent is the complete content of an article, keyed by content type. It looks like Array ( [CMS_HTMLHEAD] => Array ( [1] => Herzlich Willkommen... [2] => ...auf Ihrer Website! ) [CMS_HTML] => Array ( [1] => Die Inhalte auf dieser Website ...

The index for the keyword 'willkommen' would look like '&12=1(CMS_HTMLHEAD-1)', which means that the keyword 'willkommen' occurs once in the article with articleId 12, in content type CMS_HTMLHEAD[1].
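
The index string format described above can be read back with a small parser. The following is only a sketch: parseIndexString is a hypothetical helper and not part of cSearchIndex, and it assumes that entries for several articles are simply concatenated in one string.

```php
<?php
/**
 * Hypothetical helper (not part of cSearchIndex): splits an index string
 * such as "&12=2(CMS_HTMLHEAD-1,CMS_HTML-1)" into its parts.
 *
 * @param string $indexString
 * @return array keyed by article id: ['count' => int, 'places' => string[]]
 */
function parseIndexString($indexString)
{
    $result = [];
    // One match per "&<idart>=<count>(<type>-<n>,...)" group.
    preg_match_all('/&(\d+)=(\d+)\(([^)]*)\)/', $indexString, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $result[(int) $m[1]] = [
            'count'  => (int) $m[2],
            'places' => $m[3] === '' ? [] : explode(',', $m[3]),
        ];
    }
    return $result;
}

$parsed = parseIndexString('&12=1(CMS_HTMLHEAD-1)');
// $parsed[12]['count'] is 1, $parsed[12]['places'] is ['CMS_HTMLHEAD-1']
```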

TODO: The basic idea of the indexing process is to take the complete content of an article, generate normalized index terms from it, and store a specific index structure in the relation 'con_keywords'. Taking the complete content is not very flexible; it would be better to differentiate by specific content types or by any content. The string separated by &, =, () and - is not easy to parse when computing the search result set. It would be a better idea (and a lot of work) to extend the relation 'con_keywords' to store keywords by articleId (or content source identifier) and content type. The functions removeSpecialChars, setStopwords, setContentTypes and setCmsOptions should be moved into a new helper class. Keep in mind that the classes cSearch and cSearchResult use an instance of cSearchIndex.

cSearchBaseAbstract
Extended by cSearchIndex
Package: Core\Frontend\Search
Copyright: four for business AG <www.4fb.de>
License: http://www.contenido.org/license/LIZENZ.txt
Author: Willi Man
Located at classes/search/class.search.index.php
Methods summary
public
# __construct( cDb $db = NULL )

Constructor. Sets the object properties.

Parameters

$db
CONTENIDO Database object

Overrides

cSearchBaseAbstract::__construct()
public
# start( integer $idart, array $aContent, string $place = 'auto', array $cms_options = array(), array $aStopwords = array() )

Start indexing the article.

Parameters

$idart
Article Id
$aContent

The complete content of an article specified by its content types. It looks like Array ( [CMS_HTMLHEAD] => Array ( [1] => Herzlich Willkommen... [2] => ...auf Ihrer Website! ) [CMS_HTML] => Array ( [1] => Die Inhalte auf dieser Website ...

$place

The database field in which to store the index information ('auto' or 'self').

$cms_options

CMS types that should explicitly not be indexed.

$aStopwords
Array with words which should not be indexed.
public
# createKeywords( )

Creates the index structure for each CMS type. It looks like Array ( [die] => CMS_HTML-1 [inhalte] => CMS_HTML-1 [auf] => CMS_HTML-1 CMS_HTMLHEAD-2 [dieser] => CMS_HTML-1 [website] => CMS_HTML-1 CMS_HTML-1 CMS_HTMLHEAD-2 )
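
A structure of this shape could be built roughly as follows. This is a minimal sketch and not the class's actual implementation: buildKeywordMap is a hypothetical function, the tokenization is an assumption, and the normalization done by removeSpecialChars is replaced by a plain strtolower (ASCII only).

```php
<?php
/**
 * Hypothetical sketch: maps each keyword to a space-separated list of
 * "<CMS_TYPE>-<n>" places, one entry per occurrence.
 *
 * @param array $aContent   content keyed by CMS type, then by type id
 * @param array $aStopwords lowercase words to skip
 * @return array
 */
function buildKeywordMap(array $aContent, array $aStopwords = [])
{
    $map = [];
    foreach ($aContent as $type => $entries) {
        foreach ($entries as $typeId => $text) {
            // Assumed normalization: strip tags, lowercase, split on non-alphanumerics.
            $text  = strtolower(strip_tags($text));
            $words = preg_split('/[^\p{L}\p{N}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
            foreach ($words as $word) {
                if (in_array($word, $aStopwords, true)) {
                    continue; // stopwords are not indexed
                }
                $place = $type . '-' . $typeId;
                $map[$word] = isset($map[$word]) ? $map[$word] . ' ' . $place : $place;
            }
        }
    }
    return $map;
}

$keywords = buildKeywordMap([
    'CMS_HTMLHEAD' => [1 => 'Herzlich Willkommen', 2 => 'auf Ihrer Website'],
    'CMS_HTML'     => [1 => 'Die Inhalte auf dieser Website'],
]);
// $keywords['willkommen'] is 'CMS_HTMLHEAD-1'
```

The order of the places in each value follows the iteration order of $aContent, so it can differ from the example output shown above.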

public
# saveKeywords( )

Generates the index_string from the index structure and saves the keywords. The index_string looks like "&12=2(CMS_HTMLHEAD-1,CMS_HTML-1)".
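
How one keyword's places might be turned into such an index_string is sketched below. buildIndexEntry is a hypothetical function; counting every occurrence while listing each place only once is an assumption about the format, not confirmed behavior of saveKeywords.

```php
<?php
/**
 * Hypothetical sketch: builds "&<idart>=<count>(<place>,<place>,...)"
 * from a space-separated list of places for one keyword.
 *
 * @param int    $idart  article id
 * @param string $places e.g. "CMS_HTMLHEAD-1 CMS_HTML-1"
 * @return string
 */
function buildIndexEntry($idart, $places)
{
    $list = preg_split('/\s+/', trim($places), -1, PREG_SPLIT_NO_EMPTY);
    // Assumption: the count covers all occurrences, the list names each place once.
    $count  = count($list);
    $unique = array_unique($list);
    return '&' . (int) $idart . '=' . $count . '(' . implode(',', $unique) . ')';
}

$entry = buildIndexEntry(12, 'CMS_HTMLHEAD-1 CMS_HTML-1');
// $entry is "&12=2(CMS_HTMLHEAD-1,CMS_HTML-1)"
```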

public
# deleteKeywords( )

If keywords no longer occur in the article, updates the index_string and deletes the keyword if necessary.
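
The update step can be pictured as cutting one article's entry out of a keyword's index_string. The sketch below is illustrative only: removeArticleFromIndex is a hypothetical function, and it assumes that the entries of several articles are concatenated in one field.

```php
<?php
/**
 * Hypothetical sketch: removes the entry of one article from an index string.
 * An empty result would mean that no article uses the keyword anymore.
 *
 * @param string $indexString e.g. "&12=1(CMS_HTML-1)&15=2(CMS_HTML-1,CMS_HTML-2)"
 * @param int    $idart       article id whose entry is removed
 * @return string
 */
function removeArticleFromIndex($indexString, $idart)
{
    // The leading "&" keeps id 12 from matching inside "&112=...".
    $pattern = '/&' . (int) $idart . '=\d+\([^)]*\)/';
    return preg_replace($pattern, '', $indexString);
}

$remaining = removeArticleFromIndex('&12=1(CMS_HTML-1)&15=2(CMS_HTML-1,CMS_HTML-2)', 12);
// $remaining is "&15=2(CMS_HTML-1,CMS_HTML-2)"
```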

public
# getKeywords( )

Gets the keywords of an article.

public mixed
# removeSpecialChars( string $key )

Removes special characters from an index term.

Parameters

$key
Keyword

Returns

mixed
public string
# addSpecialUmlauts( string $key )

Parameters

$key
Keyword

Returns

string
public
# setStopwords( array $aStopwords )

Sets the array of stopwords that should not be indexed.

Parameters

$aStopwords
public
# setContentTypes( )

Sets the CMS types.

public
# setCmsOptions( mixed $cms_options )

Sets the cms_options array of CMS types that should be treated specially.

Parameters

$cms_options
public boolean
# checkCmsType( string $idtype )

Checks whether the requested content type should be indexed. Returns false if it should be indexed and true if it should not.

Parameters

$idtype

Returns

boolean
public array
# getCmsType( )

Returns

array
the _cmsType property
public array
# getCmsTypeSuffix( )

Returns

array
the _cmsTypeSuffix property
Methods inherited from cSearchBaseAbstract
_debug()
Properties summary
protected array $_keycode

The content of the CMS types of an article.

# array()
protected array $_keywords

The list of keywords of an article.

# array()
protected array $_stopwords

The words that should not be indexed.

# array()
protected array $_keywordsOld

The keywords of an article stored in the DB.

# array()
protected array $_keywordsDel

The keywords to be deleted.

# array()
protected string $_place

Either 'auto' or 'self'. The field 'auto' in table con_keywords is used for automatic indexing. Its value is a string like "&12=2(CMS_HTMLHEAD-1,CMS_HTML-1)", which means that a keyword occurs 2 times in the article with $idart 12 and can be found in CMS_HTMLHEAD[1] and CMS_HTML[1]. The field 'self' can be used in the article properties to index the article manually.

#
protected array $_cmsOptions

Array of CMS types.

# array()
protected array $_cmsType

Array of all available CMS types.

htmlhead - HTML Headline
html - HTML Text
head - Headline (no HTML)
text - Text (no HTML)
img - Upload id of the element
imgdescr - Image description
link - Link (URL)
linktarget - Linktarget (_self, _blank, _top ...)
linkdescr - Linkdescription
swf - Upload id of the element
etc.

# array()
protected array $_cmsTypeSuffix

The suffix of all available CMS types.

# array()
protected integer $idart
#
Properties inherited from cSearchBaseAbstract
$cfg, $client, $lang, $oDB
CMS CONTENIDO 4.9.7 API documentation generated by ApiGen