
ATG Search: how to generate XHTML files from an STG file


ATG Search indexing produces idx and stg files. When I analysed the stg files with text editors such as TextPad or UltraEdit, I found <html> and </html> tags, and the content inside these tags seemed to be the same as the content of the temporary XHTML files that are generated for each indexed item during search indexing. So I decided to take the content between the <html> and </html> tags and save it as an XHTML file, and this works for almost all indexed items. As you might know, the <head> tag of these XHTML files contains all the meta properties (refine properties), and the <body> tag contains the text properties (searchable properties) of each indexed item.

Please note that the above steps are not an ATG-recommended method for generating the XHTML files. I came across this simple method myself, and I am not 100% sure that it will give all the XHTML files of a search index. But I found it very useful for debugging ATG Search related issues.
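
The extraction idea described above can be sketched in a few lines. This is only a minimal illustration, not ATG code: the class name HtmlSpanExtractor and the method extract are hypothetical, and it assumes that each document sits on a single line and that the opening tag is written exactly as <html> with no attributes. It uses a reluctant regular expression, so more than one document on the same line would be returned separately.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of ATG): collects every <html>...</html>
// span found in one line of text taken from an stg file.
class HtmlSpanExtractor {
    // Reluctant quantifier so that two documents on one line stay separate.
    // Assumption: the opening tag is exactly "<html>" and the span does not
    // cross a line break.
    private static final Pattern HTML_SPAN = Pattern.compile("<html>.*?</html>");

    static List<String> extract(String line) {
        List<String> spans = new ArrayList<>();
        Matcher matcher = HTML_SPAN.matcher(line);
        while (matcher.find()) {
            spans.add(matcher.group());
        }
        return spans;
    }

    public static void main(String[] args) {
        // Made-up sample line, only for demonstration.
        String sample = "junk<html><body>one</body></html>more<html><body>two</body></html>";
        for (String span : extract(sample)) {
            System.out.println(span);
        }
    }
}
```

A line with an opening <html> tag but no closing tag yields no match here, so such lines would need separate handling if you want to save them for inspection.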

Please find below the Java code written to generate the XHTML files from an stg file. The main method expects the name of the stg file as a program argument. It creates a folder named “XHTML_Files” in the current directory and saves the XHTML files inside this folder.

[Screenshot: XHTML files generated using this Java code]

[Screenshot: a sample XHTML file generated using this Java code]

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class ATGSearchStgXHTMLGenerator {
    public static void main(String[] args) {
        // The stg file name is the only expected program argument.
        if (args.length == 0) {
            System.out.println("Give the stg file name as input");
            return;
        }
        // Output folder, created in the current directory if it is missing.
        File outDir = new File("XHTML_Files");
        if (!outDir.exists() && !outDir.mkdir()) {
            System.out.println("Could not create folder " + outDir.getPath());
            return;
        }
        int lineNumber = 0;
        int xhtmlFileCount = 1;
        // try-with-resources closes the reader even when an exception is thrown.
        try (BufferedReader stgfileBr = new BufferedReader(new FileReader(args[0]))) {
            String readLine;
            while ((readLine = stgfileBr.readLine()) != null) {
                lineNumber++;
                int start = readLine.indexOf("<html>");
                if (start < 0) {
                    continue; // no XHTML document starts on this line
                }
                int end = readLine.indexOf("</html>", start);
                String outToWrite;
                String outFileName;
                if (end >= 0) {
                    // Complete document: keep everything up to and including </html>.
                    outToWrite = readLine.substring(start, end + "</html>".length());
                    outFileName = "XHTMLFile_";
                } else {
                    // No end tag on this line: save the rest of the line under an error name.
                    System.out.println("In the STG file at lineNumber:" + lineNumber
                            + " ERROR: <html> tag found, but no </html> end tag");
                    outToWrite = readLine.substring(start);
                    outFileName = "XHTMLFile_Error_";
                }
                File outFile = new File(outDir, outFileName + (xhtmlFileCount++) + ".xhtml");
                try (FileWriter outFileWriter = new FileWriter(outFile)) {
                    outFileWriter.write(outToWrite);
                }
            }
            System.out.println("The STG file is processed fully till lineNumber " + lineNumber);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
