
ATG Search Indexing - overview of different steps in search indexing


Read more about the behind-the-scenes steps of search indexing at http://tips4ufromsony.blogspot.in/2011/12/atg-search-indexing-behind-scene-steps.html

ATG Search prepares searchable content by indexing the products specified in the XML definition file (/atg/commerce/search/ProductCatalogOutputConfig).

Generally, there are two types of indexing:
1.  Full Indexing  --> all data is taken for indexing
2.  Incremental Indexing --> only changed data is taken for indexing
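
The difference between the two modes can be illustrated with a minimal sketch. It does not use any ATG API: the `Item` class, the method names, and the timestamp comparison are all assumptions made for illustration, and the way ATG actually detects changed items is not covered here.

```java
import java.util.ArrayList;
import java.util.List;

class IndexSelectionSketch {

    // Hypothetical catalog item (not an ATG class) with a last-modified timestamp.
    static final class Item {
        final String id;
        final long lastModifiedMillis;

        Item(String id, long lastModifiedMillis) {
            this.id = id;
            this.lastModifiedMillis = lastModifiedMillis;
        }
    }

    // Full indexing: every item in the catalog is selected.
    static List<Item> selectForFullIndex(List<Item> catalog) {
        return new ArrayList<>(catalog);
    }

    // Incremental indexing: only items modified after the last index run are selected.
    // Using a timestamp is an assumption of this sketch.
    static List<Item> selectForIncrementalIndex(List<Item> catalog, long lastIndexedMillis) {
        List<Item> changed = new ArrayList<>();
        for (Item item : catalog) {
            if (item.lastModifiedMillis > lastIndexedMillis) {
                changed.add(item);
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        List<Item> catalog = new ArrayList<>();
        catalog.add(new Item("prod1", 100L));
        catalog.add(new Item("prod2", 300L));

        System.out.println("full: " + selectForFullIndex(catalog).size() + " item(s)");
        System.out.println("incremental: "
                + selectForIncrementalIndex(catalog, 200L).size() + " item(s)");
    }
}
```

With few changes per run, the incremental selection stays small, which is the main reason to prefer it over a full run.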

When full indexing is triggered, the following happens:

   1. The out-of-the-box component BulkLoader calls IndexedItemsGroup.getGroupMembers() to load the products into the XHTML document. This prevents uncategorized products from being indexed. The definition file format begins with a product as the top-level item and includes the properties of its parent category and child SKUs. For each product, the set of Variant Producers configured in ProductCatalogOutputConfig is executed to determine how many index items are to be created.

   2. An XHTML document is generated for each product and submitted to the engine for indexing. The XHTML is generated based on the definition file specified in ProductCatalogOutputConfig. An XHTML document that represents a Commerce product includes information about its parent category’s properties, as well as information about the properties of the child SKUs.

  3. The definition file, product-catalog-output-config, is parsed to generate the text and meta properties to be added to the index. Text properties are the properties that can be searched on. Meta properties are the properties that can be sent as constraints for faceted search. Text properties are specified in the <text-properties> tag and meta properties in the <meta-properties> tag. For properties that have a custom property accessor specified, the property accessor is used to obtain the value to be indexed.

  4. After all the products have been added, the out-of-the-box PostIndexCustomization is executed to add any refineConfig and rankConfig information. The engine uses this information to generate facets and to manipulate the search results.
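
Steps 1 to 3 above can be sketched roughly as follows. This is a simplified model and not ATG code: the `Product` class, the method names, and the XHTML layout (meta properties as `<meta>` elements, text properties as `<div>` elements) are assumptions made for illustration and do not reproduce the exact documents ATG generates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class XhtmlIndexSketch {

    // Hypothetical product model; parentCategory is null for uncategorized products.
    static final class Product {
        final String id;
        final String parentCategory;
        final Map<String, String> properties;

        Product(String id, String parentCategory, Map<String, String> properties) {
            this.id = id;
            this.parentCategory = parentCategory;
            this.properties = properties;
        }
    }

    // Step 1: keep only products that belong to a category.
    static List<Product> categorizedOnly(List<Product> products) {
        List<Product> kept = new ArrayList<>();
        for (Product product : products) {
            if (product.parentCategory != null) {
                kept.add(product);
            }
        }
        return kept;
    }

    // Escape the characters that are special in XML text and attribute values.
    static String escape(String value) {
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;");
    }

    // Steps 2 and 3: build one XHTML document per product. Meta properties
    // (facet constraints) go into the head, text properties (searchable) into the body.
    static String toXhtml(Product product, List<String> textProps, List<String> metaProps) {
        StringBuilder sb = new StringBuilder("<html><head>");
        for (String name : metaProps) {
            String value = product.properties.get(name);
            if (value != null) {
                sb.append("<meta name=\"").append(escape(name))
                  .append("\" content=\"").append(escape(value)).append("\"/>");
            }
        }
        sb.append("</head><body>");
        for (String name : textProps) {
            String value = product.properties.get(name);
            if (value != null) {
                sb.append("<div class=\"").append(escape(name)).append("\">")
                  .append(escape(value)).append("</div>");
            }
        }
        return sb.append("</body></html>").toString();
    }
}
```

In this sketch the lists of text and meta property names play the role of the <text-properties> and <meta-properties> sections of the definition file.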

  If indexing fails, check the following logs:
- JBoss server logs - <JBOSS_HOME>\server\atg\logs\server.log
- Request dump logs folder - <ATG_HOME>\logs\searchEngineActivity\*.xml (request and response XMLs). These logs show the request that was sent to the search engine in XML form and the engine's response to a query.
- SOAP request logs - <ATG_HOME>\Search2007.1\SearchEngine\i686-win32-vc71\bin. Use these to investigate indexing failures.
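
As a starting point for going through server.log, a small helper can pull out the error lines. This is only a sketch: the class and method names are hypothetical, and it assumes that error entries contain the level token " ERROR " surrounded by spaces and that the file is UTF-8 encoded, which may not match your logging configuration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LogScanSketch {

    // Returns every line of the log file that contains the " ERROR " level token.
    static List<String> errorLines(Path logFile) throws IOException {
        try (Stream<String> lines = Files.lines(logFile)) {
            return lines.filter(line -> line.contains(" ERROR "))
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LogScanSketch <path-to-server.log>");
            return;
        }
        for (String line : errorLines(Paths.get(args[0]))) {
            System.out.println(line);
        }
    }
}
```

Pass the path of the server.log file listed above as the single argument.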


Comments

  1. Hi, I've been dealing with some questions regarding Search and indexing. The process is a lot clearer to me now (thanks for that), but I have some questions:
    - If there are problems with facets, like the configured facets not showing correctly, does that mean there was a problem during indexing in the PostIndexCustomization?
    - Why can indexing take so long during the PostIndexCustomization, like 2 hours, when it used to take about 30 minutes? What would be a starting point to find what is wrong?
    - Which one is better, full or incremental indexing? Can both coexist? What do you recommend?

    Thanks a lot!

  2. - If there are issues with facets not showing on the site, first check the data in your database, then the refinement repository, then the refineConfig passed to the search engine.

    - I haven't tried incremental indexing. But if you have only small changes per day, you can go ahead with incremental indexing; otherwise go with full indexing.

