Skip to main content

ATG Search Indexing - overview of different steps in search indexing


Read more about the search indexing behind the scene steps @ http://tips4ufromsony.blogspot.in/2011/12/atg-search-indexing-behind-scene-steps.html

ATG Search prepares searchable content by indexing the products specified in the XML definition file (/atg/commerce/search/ProductCatalogOutputConfig).

Generally there are two types of indexing
1.  Full Indexing  --> all data taken for indexing
2.  Incremental Indexing --> only changed data will be taken for indexing

When full indexing is triggered, following happens:

   1. The out of box component BulkLoader will call IndexedItemsGroup.getGroupMembers() to load the products to the XHTL document. It prevents uncategorized products from getting indexed. The definition file format begins with a top-level item as a product and includes the properties of parent category and childskus. For each product, the set of Variant Producers configured in ProductCatalogOutputConfig is executed to check how many index items are to be created.

   2. XHTML documents are generated for each product, in order to submit to the engine for indexing. The XHTML is generated based on the definition file specified in ProductCatalogOutputConfig. An XHTML document that represents a Commerce product includes information about its parent category’s properties, as well as information about the properties of the child SKUs.

  3. The definition file, product-catalog-output-config is parsed to generate the text and meta properties, to be added to the index. The Text–properties indicates the properties which can be searched on. The Meta-properties indicate the properties which can be sent as constraints for faceted search. The text property will be specified in <text-properties> tag and meta property in <meta-properties> tag. The properties for which there is a custom property accessor specified, the property accessor is used to obtain the value to be indexed.

  4. After all the products have been added, the out of box PostIndexCustomization is executed to add any refineConfig and rankConfig information. This is used by the engine for generating facets and for manipulating the search results

  In case of failure in indexing, check the following logs
- JBoss server logs - <JBOSS_HOME>\server\atg\logs\server.log
- Dumping request logs Folder - <ATG_HOME>\logs\searchEngineActivity\*.xml ( request and response xmls). These logs will provide whether what was the request send to search engine in xml form and what was the response from engine for a query.
- Soap request logs - <ATG_HOME>\Search2007.1\SearchEngine\i686-win32-vc71\bin. This is used for checking the indexing failing.


Comments

  1. Hi, I've been dealing with some questions regarding Search and indexing, I got the process a lot clear now (thanks for that) but I got some questions:
    - if there are problems with facets, like not showing the right facets configured, does that mean there was a problem during the indexing on the PostIndexCustomization?
    - Why during the PostIndexCustomization the indexing can take too long, like 2 hours? And before it took like 30 min, what could be a starting point to find what is wrong?
    - Which one is better, full or incremental indexing? Can both coexist? What do you recommend?

    Thanks a lot!

    ReplyDelete
  2. - If there are issues with facets not showing on the site, first check the data in your database, then the refinement repository, then the refineconfig passed to search engine.

    - I haven't tries incremental indexing.But if you have small changes per day, you can go ahead with incremental, otherwise go with full indexing

    ReplyDelete

Post a Comment

Popular posts from this blog

ATG CA - BCC home screen : how to add a new link

          Activity source is the property which controls the links on the left nav on the BCC home screen. All activity sources are registered with the ActivityManager component at /atg/bizui/activity/ActivityManager . When rendering the BCC home page, the ActivityManager cycles through all the registered ActivitySource components and displays left navigation links for each of them on the BCC home page. For example if I want to add a new link "My New Link" , below screen shots exaplins how this can be done 1. Add  activityManager.properties to specify the activityresources. In this  activityManager, I specified one MyActivitySource. 2. Add  MyActivitySource.properties  to specify the name of the link and the other details . Here it refers to a bundle properties file.  3. Add  the bundle properties file  to specify the name of the link.  4. Now you could see the new link @ BCC home page . Each activity source has the following important functions:

ATG Search Indexing - behind the scene steps explained

Read more about the search indexing @  http://tips4ufromsony.blogspot.com/2011/11/atg-search-architectural-flow-search.html ATG search indexing involves index file creation, deploying and copying the index file to the search engine's box. The steps can be divided into Initial stage, Preparing Content, Indexing and Deploying. Please find below the detailed analysis of each step. 1. Initial stage:        a. Check whether the folder deployshare configured correctly @ LaunchingService.deployShare  ( \atg\search\routing\LaunchingService.deployShare ). Lets assume that it is configured to \Search2007.1\SearchEngine\i686-win32-vc71\buildedIndexFiles.        b. Lets assume that the index file folder ( \Search2007.1\SearchEngine\i686-win32-vc71\indexFiles)  has the following segments (folders) currently :                     66900009 @ index engine box        66900010 @ search engine box        c. Lets assume that the component SearchEngineService has the "

Eclipse plug-in to create Class and Sequence diagrams

ModelGoon is an Eclipse plug-in avaiable for UML diagram generation from Java code. It can be used to generate Package Dependencies Diagram, Class Diagram, Interaction Diagram and Sequence Diagram. You coud get it from http://marketplace.eclipse.org/content/modelgoon-uml4java Read more about it and see some vedios about how to create the class and sequence diagram @ http://www.modelgoon.org/?tag=eclipse-plugin Find some snapshots below which gives an idea about the diagram generation.

ATG Search and startRemoteLauncher

If the search engine application is running in a separate box than the ATG CA-BCC deployed server, the search engine is invoked through a remoteLauncher running in these boxes. Means, a remoteLauncher needs to run in the host where SearchEngine is installed remotely to start the search engine. You could find the startRemoteLauncher.sh @ ATG/ATG2007.1/Search2007.1/SearchAdmin/bin/startRemoteLauncher.sh Start remoteLauncher using the command   ./startremotelauncher.sh –p <RMI Launcher service Port > & The Launcher Service Port on the host machine can be found in BCC @ Search Administration>Projects > Search Project: gmri_search_en_CA > Environments > Host Machine: > Advanced Settings. If the startRemoteLanucher reports a BindingException , then you need to find the process that is using the launchServicePort. For that run the netstat command like :   netstat -an |grep 10880 Then inorder to identify the process that uses this port run the command like : r

Search Facets - how to create a new search facets in ATG Search

A Facet is a search refinement element that corresponds to a property of a commerce item type. ATG supports the search result refinement using the Faceted Search concept. Read more about facted search @  http://en.wikipedia.org/wiki/Faceted_search . Facet can either be ranges or specific values. Each facet is stored in the RefinementRepository as a separate refineElement repository item. Facets are divided into Global and Local facets. Global facets apply to all the categories and local facets only to the category in which they are created. For example Price/Brand can be considered as the facets that are common for all skus and New Release/Coming Soon can be considered as the facets that are specific to Physical Media products like Vidoe/DVD/Blue-ray/Books. We can use the ATG BCC - Merchandising UI to create facets. The Faceting Property depends on the meta-properties defined in the \atg\commerce\search\product-catalog-output-config.xml ( the definitionFile of the \atg\commerce\s