ATG Search Indexing - overview of different steps in search indexing


Read more about the behind-the-scenes steps of search indexing at http://tips4ufromsony.blogspot.in/2011/12/atg-search-indexing-behind-scene-steps.html

ATG Search prepares searchable content by indexing the products described in the XML definition file that is specified in the /atg/commerce/search/ProductCatalogOutputConfig component.

Generally there are two types of indexing:
1.  Full indexing: all data is taken for indexing.
2.  Incremental indexing: only changed data is taken for indexing.

When full indexing is triggered, the following happens:

   1. The out-of-the-box BulkLoader component calls IndexedItemsGroup.getGroupMembers() to load the products that will go into the XHTML documents. This prevents uncategorized products from being indexed. The definition file starts with the product as its top-level item and includes the properties of the parent category and the child SKUs. For each product, the set of Variant Producers configured in ProductCatalogOutputConfig is executed to determine how many index items are to be created.

   2. An XHTML document is generated for each product and submitted to the search engine for indexing. The XHTML is generated from the definition file specified in ProductCatalogOutputConfig. An XHTML document that represents a Commerce product includes the properties of its parent category as well as the properties of its child SKUs.

   3. The definition file, product-catalog-output-config, is parsed to generate the text and meta properties that are added to the index. Text properties are the properties that can be searched on. Meta properties are the properties that can be sent as constraints for faceted search. Text properties are listed in the <text-properties> tag and meta properties in the <meta-properties> tag. If a custom property accessor is specified for a property, that accessor is used to obtain the value to be indexed.

   4. After all the products have been added, the out-of-the-box PostIndexCustomization is executed to add any refineConfig and rankConfig information. The engine uses this information to generate facets and to manipulate the search results.
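
To make step 3 more concrete, here is a rough sketch of what such a definition file could look like. It is not the actual file from this setup. Only the product / parent category / child SKU nesting and the <text-properties> and <meta-properties> tags come from the description above. The property names (displayName, description, brand, listPrice), the type attribute, the property-accessor attribute and the accessor path are assumptions made for illustration, so please check them against the documentation of your ATG version.

```xml
<!-- Illustrative sketch only. Property names and the accessor path are hypothetical. -->
<item item-descriptor-name="product" is-document="true">

  <!-- Meta properties: values that can be sent as constraints for faceted search -->
  <meta-properties>
    <property name="brand" type="string"/>
  </meta-properties>

  <!-- Text properties: values that can be searched on -->
  <text-properties>
    <property name="displayName"/>
    <property name="description"/>
  </text-properties>

  <!-- Properties of the parent category are included in the product document -->
  <item property-name="parentCategory">
    <meta-properties>
      <property name="displayName" type="string"/>
    </meta-properties>
  </item>

  <!-- Properties of the child SKUs are included as well -->
  <item property-name="childSKUs">
    <text-properties>
      <property name="displayName"/>
    </text-properties>
    <meta-properties>
      <!-- Assumed syntax: a custom accessor component supplies the indexed value -->
      <property name="listPrice"
                property-accessor="/example/search/HypotheticalPriceAccessor"/>
    </meta-properties>
  </item>

</item>
```

In a layout like this, every product becomes one document, and the nested item elements are what pull the category and SKU data into that same document.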

If indexing fails, check the following logs:
- JBoss server logs - <JBOSS_HOME>\server\atg\logs\server.log
- Request dump logs folder - <ATG_HOME>\logs\searchEngineActivity\*.xml (request and response XMLs). These logs show the request that was sent to the search engine in XML form and the response that the engine returned for a query.
- SOAP request logs - <ATG_HOME>\Search2007.1\SearchEngine\i686-win32-vc71\bin. Use these to check why indexing is failing.


Comments

  1. Hi, I've been dealing with some questions regarding Search and indexing. The process is a lot clearer to me now (thanks for that), but I still have some questions:
    - If there are problems with facets, like the configured facets not showing correctly, does that mean there was a problem during indexing in the PostIndexCustomization?
    - Why can indexing take so long during the PostIndexCustomization, like 2 hours, when it used to take about 30 minutes? What would be a starting point for finding out what is wrong?
    - Which one is better, full or incremental indexing? Can both coexist? What do you recommend?

    Thanks a lot!

  2. - If there are issues with facets not showing on the site, first check the data in your database, then the refinement repository, then the refineConfig passed to the search engine.

    - I haven't tried incremental indexing. But if you have only small changes per day, you can go ahead with incremental indexing; otherwise go with full indexing.
