ATG search indexing produces idx and stg files. When I analysed the stg files in a text editor such as TextPad or UltraEdit, I found <html> and </html> tags, and the content between these tags appears to be the same as the temporary XHTML files that are generated during search indexing for each indexed item. So I decided to take the content between the <html> and </html> tags and save it as an XHTML file, and this works for almost all indexed items. As you might know, the <head> tag of these XHTML files contains all the meta properties (refine properties) and the <body> tag holds the text properties (searchable properties) for each indexed item.
Please note that the above steps are not an ATG-recommended method for generating the XHTML files. I came across this simple method myself, and I am not 100% sure that it yields every XHTML file of a search index. But I have found it very useful for debugging ATG search related issues.
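The core step described above, taking the text between <html> and </html> on a single line of the stg file, can be sketched as a small helper. This is only a minimal illustration: the class name HtmlSpanExtractor and the method extract are hypothetical names chosen for this example, and it assumes that each indexed item's markup sits entirely on one line, which is what I observed but is not documented ATG behaviour.

```java
// Hypothetical helper for illustration; not part of ATG.
final class HtmlSpanExtractor {
    private static final String OPEN_TAG = "<html>";
    private static final String CLOSE_TAG = "</html>";

    private HtmlSpanExtractor() {
        // static helper only, no instances
    }

    /**
     * Returns the first "<html>...</html>" span found in the given line,
     * including both tags, or null when the line has no complete span.
     * Assumption: the whole span is on one line of the stg file.
     */
    static String extract(String line) {
        int start = line.indexOf(OPEN_TAG);
        if (start < 0) {
            return null; // no opening tag on this line
        }
        // Search for the closing tag only after the opening tag.
        int end = line.indexOf(CLOSE_TAG, start);
        if (end < 0) {
            return null; // opening tag without a closing tag
        }
        return line.substring(start, end + CLOSE_TAG.length());
    }
}
```

Calling HtmlSpanExtractor.extract on each line read from the stg file and writing every non-null result to its own .xhtml file gives one file per indexed item. Lines that return null either hold no item or hold an incomplete one.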
Below is the Java code written to generate the XHTML files from an stg file. The main method expects the name of the stg file as a program argument. It creates a folder named “XHTML_Files” in the current directory and saves the XHTML files inside this folder.
Screenshot of the XHTML files generated using this Java code:
Screenshot of a sample XHTML file generated using this Java code:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class ATGSearchStgXHTMLGenerator {

    private static final String OPEN_TAG = "<html>";
    private static final String CLOSE_TAG = "</html>";

    public static void main(String[] args) {
        // The stg file name is the only required program argument.
        if (args.length == 0) {
            System.out.println("Give the stg file name as input");
            return;
        }
        // Output folder, created in the current directory if it is missing.
        File outDir = new File("XHTML_Files");
        if (!outDir.exists() && !outDir.mkdir()) {
            System.out.println("Could not create folder: " + outDir.getAbsolutePath());
            return;
        }
        int lineNumber = 0;
        int xhtmlFileCount = 1;
        // try-with-resources closes the reader even if an exception is thrown.
        try (BufferedReader stgReader = new BufferedReader(new FileReader(args[0]))) {
            String line;
            while ((line = stgReader.readLine()) != null) {
                lineNumber++;
                int start = line.indexOf(OPEN_TAG);
                if (start < 0) {
                    continue; // no indexed item on this line
                }
                // Look for the end tag only after the start tag.
                int end = line.indexOf(CLOSE_TAG, start);
                String outToWrite;
                String prefix;
                if (end >= 0) {
                    // Complete item: keep everything from <html> through </html>.
                    outToWrite = line.substring(start, end + CLOSE_TAG.length());
                    prefix = "XHTMLFile_";
                } else {
                    // Incomplete item: save what is there under an "Error" file name.
                    System.out.println("In the STG file at lineNumber:" + lineNumber
                            + " ERROR: <html> tag found, but no </html> end tag on the same line");
                    outToWrite = line.substring(start);
                    prefix = "XHTMLFile_Error_";
                }
                File outFile = new File(outDir, prefix + (xhtmlFileCount++) + ".xhtml");
                try (FileWriter writer = new FileWriter(outFile)) {
                    writer.write(outToWrite);
                }
            }
            System.out.println("The STG file is processed fully till lineNumber " + lineNumber);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}