Thursday, December 27, 2007

Development: Optimize Prime Number Search Using Sieve of Eratosthenes

For some reason I have been playing one of those online games where you have a series of hacking or programming related challenges to pass. The most recent challenge I did involved a series of silly computations on large numbers, which required retrieving all of the prime numbers preceding a given number. Not too difficult; however, performance was a key component of the exercise, since your program had to give an answer in under 3 seconds. And to add to my frustrations, I decided to do the whole thing in Java because of Java’s utilities for network communication.

First, I figured I would take the input number, loop through all numbers up to it, and add any prime number to a List. What resulted was something like below:

import java.util.ArrayList;
import java.util.List;


public class BadPrimer {

/*
* Generic prime number check
*/
public boolean isPrime(int number)
{
if (number < 2)
return false;

for (int x = 2; x < number; x++)
{
if ((number % x) == 0)
{
return false;
}
}

return true;
}

/**
* Find all prime numbers before number
* @param number
* @return
*/
public List findAllPrimes(int number)
{
List l = new ArrayList();

for (int x = 2; x <= number; x++)
{
if (isPrime(x))
{
l.add(x);
}
}

return l;
}

public static void main(String[] args) {
BadPrimer p = new BadPrimer();

List l = p.findAllPrimes(8000000);
for (java.util.Iterator i = l.iterator(); i.hasNext();)
{
Integer number = (Integer)i.next();
System.out.println(number);
}
}

}


Needless to say, that was not quite up to the task. It worked great on small numbers, but for numbers in the area of 8,000,000 or so, it was just horrendously bad. So, back to the drawing board. My next thought was, why not generate the list of primes beforehand? But generating the list took just as long. There are things I could have done to speed things up, but the bottom line was that this approach just wasn’t efficient enough.
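As an illustration of one such speed-up (a sketch only, not code that was used for the challenge): a number that has a divisor other than 1 and itself always has one that is no larger than its square root, so the brute-force check can stop there and can skip even divisors entirely. The class name TrialDivisionSketch is made up for this example.

```java
public class TrialDivisionSketch {

    // Returns true if n is prime, testing only odd divisors up to sqrt(n).
    public static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        if (n < 4) {
            return true; // 2 and 3 are prime
        }
        if (n % 2 == 0) {
            return false;
        }
        // d is a long so that d * d cannot overflow for n near Integer.MAX_VALUE
        for (long d = 3; d * d <= n; d += 2) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int count = 0;
        for (int n = 2; n <= 100; n++) {
            if (isPrime(n)) {
                count++;
            }
        }
        System.out.println(count); // prints 25, the number of primes up to 100
    }
}
```

This cuts the work per number from roughly n divisions to roughly half the square root of n, but every number is still tested on its own, which is why a sieve ends up being the better fit for generating a whole list.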

So I had to hit the books a little and find some of those really old school math algorithms that weird Greeks came up with thousands of years ago to do large calculations without the aid of computers. What I came up with was the Sieve of Eratosthenes, a way of taking a list of numbers and very quickly eliminating the ones that aren’t prime. Pre-generating the list of prime numbers with that algorithm, my next attempt looked like this:

import java.util.Arrays;

public class BadPrimer {
public int [] primeArray;

/**
* using the sieve of Eratosthenes algorithm, quite cool actually
*/
public void buildPrimeSet()
{
int MAX_SIZE = 10000000;

//create a boolean array and set all elements to true
boolean [] numArray = new boolean[MAX_SIZE];
Arrays.fill(numArray, true);

//we already know 0 and 1 are not prime numbers, so ignore them
numArray[0] = false;
numArray[1] = false;

//x will be our driving prime loop, y will be the elimination loop
for (int x = 2; x < MAX_SIZE; x++)
{
//if x is still true, it is a prime, and we need to keep it
if (numArray[x])
{
//advance our inner loop, starting at twice the current value of x, and mark off each multiple of x
//If you use y++ as the counter advance, the generation takes too long
for (int y = (x * 2); y < MAX_SIZE; y += x)
{
//if y is already false, dont bother setting
if (numArray[y])
{
numArray[y] = false;
}
}
}
}

int totalCount = 0;

//find the total number of primes
//this could be done in the above loop, but for logic
//illustration, I kept it here
for (int x = 2; x < MAX_SIZE; x++)
{
if (numArray[x])
{
totalCount++;
}
}

//create our array based on the number of primes
//and populate the array with the prime numbers
//Note: there are better ways of doing this, such as adding
//the prime numbers in the above loop when they are found, but
//I did it this way for logic reason, not efficiency
primeArray = new int[totalCount];
int pos = 0; //a position counter
for (int x = 2; x < MAX_SIZE; x++)
{
if (numArray[x])
{
primeArray[pos] = x;
pos++;
}
}
}

/**
* Find all prime numbers before number
* @param number
* @return
*/
public int findAllPrimes(int number)
{
//using x as our array position
//go through the list until the value in our
//array is greater than the number used. We now have the cut off position
//to mark all prime numbers lower than or equal to our current number
//(the bounds check stops the scan if number is beyond the largest stored prime)
int x = 0;
while (x < primeArray.length && primeArray[x] <= number)
{
x++;
}

return x;
}

public static void main(String[] args) {
BadPrimer p = new BadPrimer();

p.buildPrimeSet();
int primeCutOff = p.findAllPrimes(31337);

for (int x = 0; x < primeCutOff; x++)
{
System.out.println(p.primeArray[x]);
}
}

}


I spared you a lot of the iterations I went through. The final chunk above came after learning quite a few things. First, working with primitives is much faster than working with objects. I abandoned the idea of using a List of integers and instead went with an array of ints. This made a large improvement in performance since no object conversions were needed; I found out the hard way that Java boxes every int you add to a List into an Integer object. The second biggest thing was eliminating unnecessary iterations in my loops. At first, I was going through the inner loop in buildPrimeSet() using y++ instead of y += x. This wasted a lot of iterations, since the only numbers that need to be eliminated are the multiples of x. So if x was 50, it didn’t make sense to test 51 – 99, since none of them are multiples of x; visiting them defeats the purpose of the Sieve of Eratosthenes, which eliminates only the multiples of the current prime.
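The same reasoning about skipped iterations can be pushed a little further. The following is a sketch of two common refinements, not the code used for the challenge: the inner loop can start at x * x, because every smaller multiple of x has a smaller prime factor and has already been marked, and the outer loop can stop once x * x reaches the limit. The class name SieveSketch and its method names are made up for this example.

```java
import java.util.Arrays;

public class SieveSketch {

    // Returns an array where isPrime[i] is true exactly when i is prime, for 0 <= i < limit.
    public static boolean[] sieve(int limit) {
        boolean[] isPrime = new boolean[limit];
        Arrays.fill(isPrime, true);
        if (limit > 0) {
            isPrime[0] = false;
        }
        if (limit > 1) {
            isPrime[1] = false;
        }
        // long counters keep x * x from overflowing for large limits
        for (long x = 2; x * x < limit; x++) {
            if (isPrime[(int) x]) {
                // multiples below x * x were already eliminated by smaller primes
                for (long y = x * x; y < limit; y += x) {
                    isPrime[(int) y] = false;
                }
            }
        }
        return isPrime;
    }

    // Counts the primes strictly below limit.
    public static int countPrimes(int limit) {
        int total = 0;
        for (boolean flag : sieve(limit)) {
            if (flag) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countPrimes(100)); // prints 25
    }
}
```

The boolean array still costs one byte or so per candidate, so for much larger limits a java.util.BitSet would be the usual next step.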

So, the end result: what took hours to run before now runs in a matter of milliseconds. Valuable lesson... I might need to brush up on some of my algorithm development skills. Regardless, this took care of the first part of the calculation, finding all the prime numbers for the requested number. The remaining parts will remain a mystery so I don’t give away the answer to the challenge.
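One more possible refinement to the lookup step, offered as a sketch rather than as part of the original solution: because the array of primes is sorted, the cut off position that findAllPrimes() finds by scanning could instead be found with a binary search. The class and method names below are made up for this example.

```java
import java.util.Arrays;

public class PrimeCutoffSketch {

    // Returns how many entries of the sorted, duplicate-free array are <= number.
    public static int countAtMost(int[] sortedPrimes, int number) {
        int idx = Arrays.binarySearch(sortedPrimes, number);
        // found: idx is the position of number, so idx + 1 entries are <= number
        // not found: binarySearch returns -(insertionPoint) - 1
        return idx >= 0 ? idx + 1 : -idx - 1;
    }

    public static void main(String[] args) {
        int[] primes = {2, 3, 5, 7, 11, 13};
        System.out.println(countAtMost(primes, 7));   // prints 4
        System.out.println(countAtMost(primes, 8));   // prints 4
        System.out.println(countAtMost(primes, 100)); // prints 6
    }
}
```

Unlike an unguarded linear scan, this also behaves sensibly when the number is larger than the largest stored prime: it simply returns the length of the array.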

Monday, December 10, 2007

BIRT: Dynamic Images

There was a question on the BIRT newsgroup about how to put a dynamic image into a report file. This is typically done in multi-company sites, where a different header or logo will appear based on some environmental parameter. In the following example I will show how to do this with a report parameter, although the same thing can be done based on a URL.

The following example will show one of two logos, depending on the value of a report parameter. If the value is equal to 0, we will show the Eclipse logo at

http://www.eclipse.org/eclipse.org-common/themes/Phoenix/images/eclipse_home_header.jpg

If the value is 1, we will show the BIRT Exchange logo:

http://www.birt-exchange.com/themes/birt/images/homep_r1_c1.gif

This image will reside in the Master Page of the report. So let’s take a look at how to do this.

  1. First, create a new report called DynamicImage.rptDesign

Figure 1. Create new Report

  1. In the new report design, open up the Master Page tab.
  2. Drag over a grid, and make it 2 column, 1 row.
  3. In the 1st column, drag over an Image component.
  4. For the URL, put in "http://www.eclipse.org/eclipse.org-common/themes/Phoenix/images/eclipse_home_header.jpg". Be sure to keep the quotes. This is more of a placeholder than anything else.

Figure 2. Master Page with Image

  1. Create a new report parameter called imageSelector. It needs to be an Integer type. In the following screen shot, I am using a List Box and a List of Values for this purpose.

Figure 3. Create new Parameter

  1. Select the Image.
  2. Open the Script tab in the Report Designer
  3. Change the Event to onRender.
  4. Use the following BIRT Script
if (params["imageSelector"] == 0)
{
this.setURL("http://www.eclipse.org/eclipse.org-common/themes/Phoenix/images/eclipse_home_header.jpg");
}
else
{
this.setURL("http://www.birt-exchange.com/themes/birt/images/homep_r1_c1.gif");
}

  1. Save and run.

When you set the value to Eclipse, it will show the Eclipse logo. When set to the BIRT Exchange, it will show that logo.

Figure 4. The Final Report

Saturday, December 08, 2007

BIRT: Passing Serialized Objects as Parameters

Recently I had a question about being able to serialize Java Objects and use them as the data source. I have a few possible solutions, but I wanted to look at one of the options a little deeper.

In this scenario, the Java Objects are actually generated outside of the report application, and need to be passed to the report as a parameter. The easiest way was to create a Java class that implements the java.io.Serializable interface. In addition to making the class Serializable, I also want to URL encode the serialized object. What this means is that I need to decode and deserialize the object inside of the report.

The following is a walkthrough of the classes that I built, and an external Java Event Handler for a Report Design that will handle the object.

To demonstrate this, I used the following Java class.

package com.digiassn.blogspot;

import java.io.Serializable;

public class EmployeeParameter implements Serializable {
static final long serialVersionUID = 11111112;

private int employeeNumber;
private String date;
public int getEmployeeNumber() {
return employeeNumber;
}
public void setEmployeeNumber(int employeeNumber) {
this.employeeNumber = employeeNumber;
}
public String getDate() {
return date;
}
public void setDate(String date) {
this.date = date;
}


}



Not much to this class. The serialVersionUID was defined to avoid issues with the URLEncode and URLDecode process; if it is not explicitly defined, the URLDecode of the object fails. This is also the reason I used String as my date type instead of java.util.Date. For some reason there were errors with Date and its use of the serialVersionUID (yet String and a few other classes didn’t have any issues).

I wanted to test to see if the whole serialize and URLEncode process would work, so I created the following Unit test.

package com.digiassn.blogspot.tests;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

import junit.framework.TestCase;

import com.digiassn.blogspot.EmployeeParameter;

public class TestEmployeeParameter extends TestCase {
private EmployeeParameter param;

protected void setUp() throws Exception {
super.setUp();

param = new EmployeeParameter();

param.setEmployeeNumber(1);
param.setDate("01-01-2005");
}

protected void tearDown() throws Exception {
super.tearDown();
}

public void testGetEmployeeNumber() {
assertEquals(1, param.getEmployeeNumber());
}

public void testGetDate() {
assertEquals("01-01-2005", param.getDate());
}

public void testSerialize()
{
ByteArrayOutputStream bos = null;
try {
bos = new ByteArrayOutputStream();
ObjectOutputStream obj_out = new ObjectOutputStream (bos);
obj_out.writeObject ( param );
} catch (IOException e) {
e.printStackTrace();
fail("Error with serialization\n\n");
}

String encoded = bos.toString();
try {
encoded = URLEncoder.encode(encoded, "UTF-8");
} catch (UnsupportedEncodingException e1) {
e1.printStackTrace();

fail("Unsupported formatting");
}
System.out.print("The serialized output is: " + encoded);

try {
String toDecode = URLDecoder.decode(encoded, "UTF-8");
ByteArrayInputStream bis = new ByteArrayInputStream(toDecode.getBytes());
ObjectInputStream obj_in = new ObjectInputStream (bis);

Object obj = obj_in.readObject();

//fail outright if the wrong type comes back, rather than silently passing
assertTrue(obj instanceof EmployeeParameter);
assertEquals(1, ((EmployeeParameter)obj).getEmployeeNumber());
assertEquals("01-01-2005", ((EmployeeParameter)obj).getDate());
} catch (IOException e) {
e.printStackTrace();
fail("Error with deserialization");
} catch (ClassNotFoundException e) {
e.printStackTrace();
fail("Error with deserialization");
}

}

}
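One caveat about the round trip above: bos.toString() and getBytes() convert between raw bytes and text using the platform default charset, and that conversion is not guaranteed to preserve arbitrary binary data. This may be related to the errors seen with java.util.Date, although that is only a guess. The following is a sketch of an alternative that avoids the issue by Base64 encoding the bytes into URL-safe text. It assumes Java 8 or later for java.util.Base64, and the class Base64ParamSketch and its nested Param class are hypothetical stand-ins, not part of the classes above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;

public class Base64ParamSketch {

    // Hypothetical stand-in for EmployeeParameter so the sketch is self-contained
    public static class Param implements Serializable {
        private static final long serialVersionUID = 1L;
        public int employeeNumber;
        public String date;
    }

    // Serializes the object and returns URL-safe Base64 text (A-Z, a-z, 0-9, '-' and '_').
    public static String encode(Serializable obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // the bytes never pass through a charset conversion
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bos.toByteArray());
    }

    // Reverses encode(): Base64 text back to bytes, then bytes back to an object.
    public static Object decode(String text) {
        byte[] raw = Base64.getUrlDecoder().decode(text);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Param p = new Param();
        p.employeeNumber = 1;
        p.date = "01-01-2005";

        String text = encode(p);
        Param back = (Param) decode(text);
        System.out.println(back.employeeNumber + " " + back.date); // prints 1 01-01-2005
    }
}
```

Because the encoded text contains only letters, digits, '-' and '_', it needs no further URL encoding when used as a parameter value. As with any use of Java serialization, deserializing text that comes from an untrusted source is risky.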


So, my next task is to create the Event Handler. In BIRT, event handlers need to extend the adapter for their event handler type. Since this handler is for a Scripted Data Set, I need to extend the org.eclipse.birt.report.engine.api.script.eventadapter.ScriptedDataSetEventAdapter type.

package com.digiassn.blogspot.handlers;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

import org.eclipse.birt.report.engine.api.script.IReportContext;
import org.eclipse.birt.report.engine.api.script.IUpdatableDataSetRow;
import org.eclipse.birt.report.engine.api.script.ScriptException;
import org.eclipse.birt.report.engine.api.script.eventadapter.ScriptedDataSetEventAdapter;
import org.eclipse.birt.report.engine.api.script.instance.IDataSetInstance;

import com.digiassn.blogspot.EmployeeParameter;

public class EmployeeEventHandler extends ScriptedDataSetEventAdapter {
private EmployeeParameter param;
int count = 0;

@Override
public boolean fetch(IDataSetInstance dataSet, IUpdatableDataSetRow row) {

if (count < 1)
{
try {
if (param == null)
{
row.setColumnValue("employeeNumber", -1);
row.setColumnValue("setDate", "Error, param is null");
}
else
{
row.setColumnValue("employeeNumber", param.getEmployeeNumber());
row.setColumnValue("setDate", param.getDate());
}
count++;

return true;
} catch (ScriptException e) {
e.printStackTrace();
}
}


return false;
}

@Override
public void beforeOpen(IDataSetInstance dataSet,
IReportContext reportContext) {

String myParam = null;

try {
Object rawValue = reportContext.getParameterValue("employeeParam");
if (rawValue == null)
{
//no parameter supplied; leave param null so fetch() returns the error row
return;
}
myParam = URLDecoder.decode((String) rawValue, "UTF-8");
} catch (UnsupportedEncodingException e1) {
e1.printStackTrace();
return;
}
System.out.println("Got parameters");
ByteArrayInputStream bis = new ByteArrayInputStream(myParam.getBytes());

try {
ObjectInputStream obj_in = new ObjectInputStream(bis);

param = (EmployeeParameter)obj_in.readObject();
} catch (IOException e) {
e.printStackTrace();
} catch (ClassNotFoundException e) {
e.printStackTrace();
}
}
}




Now that I have the event handler made, I need to create the report. To create the report I use the following steps:
1: Create a new report.
2: Create a Scripted Data Source.
3: Create a data set.
4: Add in the following two fields into the data set
-employeeNumber
-setDate
5: Add a report parameter called employeeParam.
6: Drag the data set over to the report design window.
7: Select the new data set in the Data Explorer.
8: In the Property Editor, select the Event Handler tab.
9: Under the Event Handler, browse and select the Java Object Event handler.

Now, when I run the report, my report will use the external object. I need to pass in the serialized version of the object. In my case, I used the following test string.

%C2%AC%C3%AD%05sr5com.digiassn.blogspot.EmployeeParameter%C2%A9%C5%A0%C3%88%02%02I%0EemployeeNumberL%04datet%12Ljava%2Flang%2FString%3Bxp%01t%0A01-01-2005



I would recommend that you use the Unit test and copy the serialized object from the standard output device.

And that’s it. Now, I deployed this to an Apache Tomcat server using the BIRT Web Viewer to view my report. I had to copy the .class files for EmployeeParameter and the event handler into the shared class folder under Tomcat. I also needed to copy all the JAR files in the Web Viewer into the shared lib folder to get around some silly bug in the Web Viewer (note: this was not a problem prior to version 2.2.1).

I still ran into one outstanding issue with this approach. If I tried to use the serialized object in a URL parameter in a browser, I would get a SaxParser error. This wasn’t a problem if I passed the parameter using the dialog in the web viewer, or if I assigned the parameter using the Report Engine API, so it must have something to do with the way the Viewer app handles its SOAP requests.

Monday, November 26, 2007

BIRT: Dynamic Adding Tables and Columns

I had a request from a co-worker the other day wanting to know how columns could be dynamically added to tables in BIRT. The following example is an Initialize event handler used in a BIRT report to dynamically create a table with a few columns and a single row. Once the table is added to the report, the script then finds the table in the report and adds a single column to it.


//get a reference to the ElementFactory
elementFactory = reportContext.getReportRunnable().designHandle.getElementFactory();

//create a new table with 3 columns
dynamicTable = elementFactory.newTableItem("myNewTable", 3);
dynamicTable.setWidth("100%");

//set reference to the first detail row
myNewRow = dynamicTable.getDetail().get(0);

//get references to the first 3 cells
firstCell = myNewRow.getCells().get(0);
secondCell = myNewRow.getCells().get(1);
thirdCell = myNewRow.getCells().get(2);

//create the cells and add
label = elementFactory.newLabel("firstCellLabel");
label.setText("First Cell");
firstCell.getContent().add(label);

label = elementFactory.newLabel("secondCellLabel");
label.setText("Second Cell");
secondCell.getContent().add(label);

label = elementFactory.newLabel("thirdCellLabel");
label.setText("Third Cell");
thirdCell.getContent().add(label);

//although it is not in the autocomplete, getBody is a method of the ReportDesignHandle class
reportContext.getReportRunnable().designHandle.getBody().add(dynamicTable);

//now, to demonstrate, get a reference to the table from the report design, this is automatically cast
//to a TableHandle type
dynamicTable = reportContext.getReportRunnable().designHandle.findElement("myNewTable");

//now insert the column to the right of the indicated column position. Column number is 1 based, not 0 based
dynamicTable.insertColumn(3, 1);

//get the first detail row and the 4th column. This is 0 based
myNewRow = dynamicTable.getDetail().get(0);
fourthCell = myNewRow.getCells().get(3);

//create a new label
label = elementFactory.newLabel("fourthCell");
label.setText("dynamic cell");
fourthCell.getContent().add(label);

Saturday, October 20, 2007

BIRT: Parameterized Page Break Interval

A useful trick to know with BIRT is how to control the pagination for report output. It would be nice if you could present the user with a parameter to control the number of rows displayed per page and let them decide. While there is the page_break_interval property in a table, it is not easily exposed. Thanks to help from Jason Weathersby of BirtWorld, I have a simple answer on how to create a parameter and assign the page_break_interval the value I need.

What I did was create a user parameter called NewParameter. I know, clever name, but this was just a test scenario. I have a table called countTable that will output the results of a dataset. Then, I used the following script in the table’s onPrepare event handler (although this could also be done in the report’s Initialize event handler):

tableContext = reportContext.getReportRunnable().designHandle.getDesignHandle().findElement("countTable");

tableContext.setPageBreakInterval(params["NewParameter"]);


That’s it. Now, when the report is displayed, a parameter will prompt the user to enter the number of rows to break each page on. This is useful when used with the Progressive Viewing feature to allow the user to really control pagination when embedding BIRT into an application. The completed report is below in XML.

<?xml version="1.0" encoding="UTF-8"?>
<report xmlns="http://www.eclipse.org/birt/2005/design" version="3.2.14" id="1">
<property name="author">John Ward</property>
<property name="createdBy">Eclipse BIRT Designer Version 2.2.0.v20070620 Build &lt;2.2.0.v20070626-1003></property>
<property name="units">in</property>
<text-property name="title">Page Break Interval</text-property>
<property name="comments">Copyright (c) 2007 &lt;&lt;Your Company Name here>></property>
<html-property name="description">A test report that will generate a data set with 6000 numbers, then assign the tables page break interval using a parameter.</html-property>
<text-property name="displayName">Blank Report</text-property>
<property name="iconFile">/templates/blank_report.gif</property>
<parameters>
<scalar-parameter name="NewParameter" id="6">
<property name="valueType">static</property>
<property name="dataType">string</property>
<text-property name="promptText">Page Break Interval</text-property>
<property name="controlType">text-box</property>
<property name="defaultValue">5</property>
<structure name="format">
<property name="category">Unformatted</property>
</structure>
</scalar-parameter>
</parameters>
<data-sources>
<script-data-source name="Data Source" id="23"/>
</data-sources>
<data-sets>
<script-data-set name="Data Set" id="24">
<list-property name="resultSetHints">
<structure>
<property name="position">0</property>
<property name="name">count</property>
<property name="dataType">any</property>
</structure>
</list-property>
<list-property name="columnHints">
<structure>
<property name="columnName">count</property>
</structure>
</list-property>
<structure name="cachedMetaData">
<list-property name="resultSet">
<structure>
<property name="position">1</property>
<property name="name">count</property>
<property name="dataType">any</property>
</structure>
</list-property>
</structure>
<property name="dataSource">Data Source</property>
<method name="open"><![CDATA[x = 0;]]></method>
<method name="fetch"><![CDATA[if (x < 6000)
{
x++;
row["count"] = x;

for (y = 0; y < 6000; y++);

return true;
}

return false;]]></method>
</script-data-set>
</data-sets>
<page-setup>
<simple-master-page name="Simple MasterPage" id="2">
<page-footer>
<text id="3">
<property name="contentType">html</property>
<text-property name="content"><![CDATA[<value-of>new Date()</value-of>]]></text-property>
</text>
</page-footer>
</simple-master-page>
</page-setup>
<body>
<table name="countTable" id="27">
<property name="width">100%</property>
<property name="dataSet">Data Set</property>
<list-property name="boundDataColumns">
<structure>
<property name="name">count</property>
<expression name="expression">dataSetRow["count"]</expression>
<property name="dataType">any</property>
</structure>
</list-property>
<method name="onPrepare"><![CDATA[tableContext = reportContext.getReportRunnable().designHandle.getDesignHandle().findElement("countTable");

tableContext.setPageBreakInterval(params["NewParameter"]);]]></method>
<column id="36"/>
<header>
<row id="28">
<cell id="29">
<label id="30">
<text-property name="text">count</text-property>
</label>
</cell>
</row>
</header>
<detail>
<row id="31">
<cell id="32">
<data id="33">
<property name="resultSetColumn">count</property>
</data>
</cell>
</row>
</detail>
<footer>
<row id="34">
<cell id="35"/>
</row>
</footer>
</table>
</body>
</report>

Sunday, October 07, 2007

BIRT: Progressive Viewing during Render

A coworker and I recently engaged in a discussion about the requirements for a client of ours. Although this ultimately wasn’t the solution we ended up going with, one of the possible options for this client was to do what is called Progressive Viewing in BIRT. Progressive Viewing is the ability to view a particular page of a report document while the Report Document itself is still being generated. If you have ever used the commercial versions of Actuate IServer or ERD Pro, this is familiar. Page 1 gets displayed while pages 2 – 10 are still being rendered; then, when pages 2 – 10 are rendered, the next range of pages gets rendered, and the user can view whatever is ready. Since I wanted to know how to do this for an unrelated project, I decided to research how this is done for future reference.

Surprisingly, this was much easier to implement than I would have thought, with BIRT defining an interface (IPageHandler) that acts as a callback when pages are ready. The way it works is that for each page event, a callback method is called with three arguments: the page number; a checkpoint flag of either true or false, where true means a checkpoint has been reached and all pages before this page are ready for viewing; and a reference to the Report Document being generated, so you do not need to create a separate Report Document instance. In the onPage method, you define what you want to do with it. In the case of an interactive viewer, you would simply notify the parent process that the next batch of pages is ready for the user to navigate to. In the following example, I am rendering out the “checkpoint” pages when they are ready, and informing the user via the standard console that the next page range is ready. The following bit is a JUnit test I used for testing the rendering of a custom emitter I was using for a client, slightly modified to do Progressive Viewing. In addition, it also demonstrates doing a Run then Render task in BIRT to allow for pagination.



package test;

import static org.junit.Assert.fail;

import java.util.logging.Level;

import org.eclipse.birt.core.exception.BirtException;
import org.eclipse.birt.core.framework.Platform;
import org.eclipse.birt.report.engine.api.EngineConfig;
import org.eclipse.birt.report.engine.api.EngineException;
import org.eclipse.birt.report.engine.api.IPageHandler;
import org.eclipse.birt.report.engine.api.IRenderTask;
import org.eclipse.birt.report.engine.api.IReportDocument;
import org.eclipse.birt.report.engine.api.IReportDocumentInfo;
import org.eclipse.birt.report.engine.api.IReportEngine;
import org.eclipse.birt.report.engine.api.IReportEngineFactory;
import org.eclipse.birt.report.engine.api.IReportRunnable;
import org.eclipse.birt.report.engine.api.IRunTask;
import org.eclipse.birt.report.engine.api.RenderOption;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class TestXMLEmitter {
private String BIRT_HOME = "C:/BIRT_RUNTIME_2_2/birt-runtime-2_2_0/ReportEngine/";
private String REPORT_DESIGN_FILE = "C:/Workbench/pageBreakInterval.rptdesign";
private String OUTPUT_FILE_LOCATION_RENDER = "C:/temp/myReportOutputRender.xml";
private String OUTPUT_FILE_LOCATION_RUN_RENDER = "C:/temp/myReportOutputRunRender.xml";
private IReportEngine engine;

@Before
public void setUp() throws Exception {
EngineConfig config = new EngineConfig();
config.setBIRTHome(BIRT_HOME);
config.setLogConfig("C:/TEMP/", Level.ALL);

try {
Platform.startup( config );
IReportEngineFactory factory = (IReportEngineFactory) Platform.createFactoryObject( IReportEngineFactory.EXTENSION_REPORT_ENGINE_FACTORY);
engine = factory.createReportEngine(config);
engine.changeLogLevel(Level.ALL);
} catch (RuntimeException e) {
e.printStackTrace();
}
}

@After
public void tearDown()
{
engine.shutdown();
Platform.shutdown();
}

/*@Test
//Removed, for our purposes, we do not need to do a run and render task
public void testReportRun() throws Exception {
try {
//once the engine is started, create the report task
IReportRunnable design = engine.openReportDesign(REPORT_DESIGN_FILE);

IRunAndRenderTask task = engine.createRunAndRenderTask(design);

//RenderOption renderOption = new RenderOption();
RenderOption renderOption = new RenderOption();

renderOption.setOutputFormat("MyXMLEmitter");
renderOption.setOutputFileName(this.OUTPUT_FILE_LOCATION_RUN_RENDER);

renderOption.setOutputStream(System.out);
task.setRenderOption(renderOption);

task.setParameterValue("NewParameter", 5);

task.run();

task.close();

} catch (RuntimeException e) {
engine.getLogger().log(Level.SEVERE, e.getMessage(), e);
fail(e.getMessage());
}
} */

@Test
public void testReportRender() throws Exception {
try {
//create the report task and open the report design file
IReportRunnable design = engine.openReportDesign(REPORT_DESIGN_FILE);
IRunTask runTask = engine.createRunTask(design);

//For our test report, we have a parameter called NewParameter, that
//is actually setting the pageBreakInterval property. We need to set this
//before running our report
runTask.setParameterValue("NewParameter", 5);

//Define the callback handler for new page events
runTask.setPageHandler(new MyPageHandler());

//run and close the report run task
runTask.run("C:/temp/tempDoc.rptDocument");
runTask.close();

//Inform the user running is complete
System.out.println("Done running");
} catch (RuntimeException e) {
engine.getLogger().log(Level.SEVERE, e.getMessage(), e);
fail(e.getMessage());
}
}

/**
* Class MyPageHandler
*
* This class will handle new page events for the Report Run Task
*/
public class MyPageHandler implements IPageHandler {

//Define local variables for the callback class
private String OUTPUT_FILE_LOCATION_RUN_RENDER = "C:/temp/myReportOutputRunRender";
private int lastCheckpoint = 0;

/**
* void onPage
*
* @param pageNumber - the page number this event is being called for
* @param readyForViewing - whether this event is a checkpoint event
* @param reportDocument - reference to the report document being generated
*/
public void onPage(int pageNumber, boolean readyForViewing, IReportDocumentInfo reportDocument) {
//we only want to do something if this is a checkpoint event
if (readyForViewing)
{
//Just let the user know that the next page ranges are ready, then set the last check point to the
//current page
System.out.println("Pages " + lastCheckpoint + " through " + pageNumber + " are ready for viewing");
lastCheckpoint = pageNumber;

try {
//open the report document then create the render task from it
IReportDocument iReportDocument = reportDocument.openReportDocument();
IRenderTask task = engine.createRenderTask(iReportDocument);

//create the render options for this page
RenderOption renderOption = new RenderOption();

//set the output file name and the emitter output format in the render options
//(uses this handler's own base name, which has no extension, so each page gets its own .xml file)
renderOption.setOutputFileName(OUTPUT_FILE_LOCATION_RUN_RENDER + pageNumber + ".xml");
renderOption.setOutputFormat("MyXMLEmitter");
task.setRenderOption(renderOption);

//Render Page and close the render task
task.setPageNumber(pageNumber);
task.render();
task.close();

System.out.println("Page " + pageNumber + " is ready for viewing");
} catch (EngineException e) {
e.printStackTrace();
} catch (BirtException e) {
e.printStackTrace();
}
}
}
}
}

Wednesday, September 19, 2007

Repair: USB Device Refuses to Initialize

The past few days have been PC hell for me. Some strange behavior has cropped up on my system that has earned it the right to be re-formatted. Since I am traveling, however, I don’t have access to my backup drives or the install CDs for my new system. So in the meantime, I needed to work around the issues I was having.

A few months ago I wrote about hacking my Motorola Razr to use as a broadband modem while I travel. This had been working great, up until about 3 days ago. My system has been behaving oddly for the past few days, with things like the “Safely remove this Device” icon disappearing. To resolve that, I had to go into Device Manager, right-click on one of my USB disks (in my case it was my iPod), go to Policies, and manually click on the Safely Remove Hardware link in the description text. Since then, that issue has not reappeared.

However, the issue that has been driving me nuts is the fact that my Razr stopped working. Every time I connected it to my machine, I would get the New Hardware Found wizard and an error stating “An error occured during the installation of this device Fatal Error during installation…”. I searched and searched for a few days and found nothing but garbage advice. Advice like totally removing all USB devices under Device Manager did not resolve the problem (nor do I think it is the correct solution). I cleared every USB device entry in the registry to reinitialize the known USB device catalog (not something I recommend, and if you do it anyway, back up, back up, back up). This also didn’t fix the issue. I reinstalled the drivers and updated the drivers, yet nothing would get this device to work. I could plug it into another machine and it would work fine. I could even plug it in while running VMWare and set VMWare to use it, with VMWare recognizing the device and it working correctly. So that narrowed it down to something software related.

So how did I fix the problem? I went back to good old-fashioned detective work, on blind faith and a prayer. I searched for any files modified within the past day, hoping that some sort of log was lying around. What I came across was two log files, setupapi.log and setupact.log (information about these files can be found here). When I opened setupapi.log, I came across the following entry:

#-019 Searching for hardware ID(s): usb\vid_22b8&pid_2a62&rev_0001,usb\vid_22b8&pid_2a62
#-018 Searching for compatible ID(s): usb\class_02&subclass_02&prot_01,usb\class_02&subclass_02,usb\class_02
#-198 Command line processed: C:\WINDOWS\system32\services.exe
#I022 Found "USB\VID_22B8&PID_2A62" in C:\WINDOWS\inf\oem44.inf; Device: "Motorola USB Modem"; Driver: "Motorola USB Modem"; Provider: "Motorola"; Mfg: "Motorola"; Section name: "USB1XCDMA".
#I023 Actual install section: [USB1XCDMA.NT]. Rank: 0x00000001. Effective driver date: 06/18/2007.
#-166 Device install function: DIF_SELECTBESTCOMPATDRV.
#I063 Selected driver installs from section [USB1XCDMA] in "c:\windows\inf\oem44.inf".
#I320 Class GUID of device remains: {4D36E96D-E325-11CE-BFC1-08002BE10318}.
#I060 Set selected driver.
#I058 Selected best compatible driver.
#-166 Device install function: DIF_INSTALLDEVICEFILES.
#I124 Doing copy-only install of "USB\VID_22B8&PID_2A62\5&B4DCBF&0&2".
#-166 Device install function: DIF_REGISTER_COINSTALLERS.
#I056 Coinstallers registered.
#-166 Device install function: DIF_INSTALLINTERFACES.
#-011 Installing section [USB1XCDMA.NT.Interfaces] from "c:\windows\inf\oem44.inf".
#I054 Interfaces installed.
#-166 Device install function: DIF_INSTALLDEVICE.
#E151 Coinstaller 3 of 3 failed. Error 1603: Fatal error during installation.

Huh… what is the CoInstaller? I checked the next file hoping to find another clue:

WdfCoInstaller: [09/18/2007 16:38.06.531] Update process returned error code :error(1603) Fatal error during installation.
. Possible causes are running free version of coinstaller on checked version of OS or vice versa. Look at the Kmdf documentation as to what steps need to be followed to install the correct versionof the coinstaller

What is this? Another clue. So there is definitely a problem with this WdfCoInstaller. Another search on my system turned up the final clue, a log file called Wdf01005Inst.log. I took a look inside of that:

0.047: 2007/09/18 16:38:06.312 (local)
0.047: c:\6cdccf09bac8c0c517fbc0e694\update\update.exe (version 6.3.4.0)
0.047: Hotfix started with following command line: /quiet
0.047: In Function GetBuildType, line 1170, RegQueryValueEx failed with error 0x2
0.047: C:\WINDOWS\system32\DRIVERS\wdf01000.sys is Present
0.047: FileVersion of C:\WINDOWS\system32\DRIVERS\wdf01000.sys is Greater Than 1.5.5600.0
0.047: Condition Check for Line 1 of PreRequisite returned FALSE
0.047: ReadStringFromInf: UpdSpGetLineText failed: 0xe0000102
0.047: Wdf01005 Setup encountered an error: Setup cannot continue because one or more prerequisites required to install Wdf01005 failed. For More details check the Log File c:\windows\Wdf01005Inst.log
0.062: ReadStringFromInf: UpdSpGetLineText failed: 0xe0000102
0.062: Setup cannot continue because one or more prerequisites required to install Wdf01005 failed. For More details check the Log File c:\windows\Wdf01005Inst.log
0.062: Update.exe extended error code = 0xf0f4
0.062: Update.exe return code was masked to 0x643 for MSI custom action compliance.


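That settled it: I had tracked down the problem. The log shows the prerequisite check returning FALSE right after the installer found that the existing wdf01000.sys had a file version greater than 1.5.5600.0. I made a backup copy of C:\Windows\system32\Drivers\wdf01000.sys and deleted the original. When I re-ran the install, it worked fine. Thanks to the logs, I was able to track down the issue without resorting to the fallback street-level tech resolution of reformatting and reinstalling the OS.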
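The log trawling above can be partly automated. Below is a minimal sketch, not part of my original fix, that filters a setupapi.log-style file down to its error entries. It assumes that error entries start with "#E" followed by a numeric code, as the "#E151" line quoted earlier does; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class SetupLogScanner {
    // Assumption: an error entry looks like "#E151 some message",
    // modeled on the single error line quoted in this post.
    private static final Pattern ERROR_ENTRY = Pattern.compile("^#E\\d+\\s.*");

    /** Returns only the lines that look like error entries, trimmed. */
    static List<String> errorEntries(List<String> logLines) {
        List<String> errors = new ArrayList<>();
        for (String line : logLines) {
            String trimmed = line.trim();
            if (ERROR_ENTRY.matcher(trimmed).matches()) {
                errors.add(trimmed);
            }
        }
        return errors;
    }
}
```

Feed it the lines of the log (for example from java.nio.file.Files.readAllLines) and only the entries worth reading first remain. The informational "#I" and "#-" lines are still useful for context once an error has been located.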

Although I fixed my issues, I am still going to reformat my system. I haven’t installed or changed anything that would cause this previously working setup to break, especially something that relies on kernel-mode drivers. With no relevant changes to my system and yet changing behavior, I am led to believe something else is afoot. So better safe than sorry. Time to track down the root cause, unfortunately, is not a luxury I have at the moment.

Thursday, August 23, 2007

BIRT: Writing an Emitter

Another article I have had on the back burner for the past few months is for writing emitters using BIRT. Emitters are one of the many extension points in BIRT. Emitters are the output rendering mechanism in BIRT. When you run a BIRT report and get the HTML output from the Report Engine, that HTML is created using the BIRT HTML Emitter. Ditto for the PDF output. There are even Office emitters out there.

The way that I typically look at emitters is that they are a mechanism for getting BIRT report output to some destination. That destination can be a file, a stream, an IPC listener, whatever. This gives BIRT the ability to serve as more than just a report engine, possibly as middleware as well. Why would you do this? Despite the added bloat, it allows you to take advantage of the BIRT internal mechanisms for sorting, aggregating, formatting, and filtering data. Of course, more often than not, you will just use an emitter to output file formats.

So how do emitters work? An emitter is an Eclipse plug-in, and when writing one, you need to set it up as an Eclipse Plug-in Project. Since it is an Eclipse plug-in, it requires the proper entries in the plugin.xml and MANIFEST.MF files. This can be a bit tedious, and in my experience required a bit of trial and error.

Since it is an Eclipse plug-in, there are two classes that need to be created: the Activator, which is usually created automatically when you create a new plug-in project, and the actual emitter. The Activator extends the org.eclipse.core.runtime.Plugin class, and its code is generated upon project creation. You only need to be sure that the plug-in configuration points to the correct Activator.

The emitter class itself extends the org.eclipse.birt.report.engine.emitter.ContentEmitterAdapter class. This is where all the magic happens. The emitter class simply overrides the methods it needs, based on the requirements of the emitter. In the following example, I wrote an emitter that generates a very generic XML file using JAXB.

package com.        .birt.emitter;

import java.io.OutputStream;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;
import javax.xml.bind.PropertyException;

import org.eclipse.birt.report.engine.content.IBandContent;
import org.eclipse.birt.report.engine.content.IReportContent;
import org.eclipse.birt.report.engine.content.IRowContent;
import org.eclipse.birt.report.engine.content.ITextContent;
import org.eclipse.birt.report.engine.emitter.ContentEmitterAdapter;
import org.eclipse.birt.report.engine.emitter.IEmitterServices;

import com. .birt.emitter.xml.ObjectFactory;
import com. .birt.emitter.xml.Root;
import com. .birt.emitter.xml.Root.Office;
import com. .birt.emitter.xml.Root.Office.Employees;

public class XMLEmitter extends ContentEmitterAdapter {
private ObjectFactory xmlObjectFactory;
private Root xml;
private Office currentOffice;
private OutputStream reportOutputStream;

@Override
public void initialize(IEmitterServices service) {
super.initialize(service);

//initialize my object factory for the XML file and create a root element in the XML file
this.xmlObjectFactory = new ObjectFactory();
this.xml = xmlObjectFactory.createRoot();

//get the OutputStream that the render options specify for the report output
reportOutputStream = service.getRenderOption().getOutputStream();
}

@Override
public void startRow(IRowContent row) {
super.startRow(row);

//When we encounter a new row, and it is a HEADER for a group, we need
//to create a new office element
if (row.getBand().getBandType() == IBandContent.BAND_GROUP_HEADER)
{
this.currentOffice = xmlObjectFactory.createRootOffice();
}
}

@Override
public void endRow(IRowContent row) {
super.endRow(row);

//Once we encounter the end row and this is a HEADER row, we need to add
//this to our XML structure under the Office sections
if (row.getBand().getBandType() == IBandContent.BAND_GROUP_HEADER)
{
xml.getOffice().add(currentOffice);
}
}

@Override
public void end(IReportContent report) {
super.end(report);

//At the end of our report generation, create a new JAXB Marshaller object, and write the
//formatted XML to the report output stream
try {
JAXBContext jaxContext = JAXBContext.newInstance("com. .birt.emitter.xml", Root.class.getClassLoader()) ;
Marshaller xmlOutputWriter = jaxContext.createMarshaller();
xmlOutputWriter.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);

xmlOutputWriter.marshal(xml, reportOutputStream);
} catch (PropertyException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (JAXBException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}

@Override
public void startText(ITextContent text) {
super.startText(text);

//If this is a new text element (data in a row), then we need
//to go ahead and add this as our current office name if this is a header.
//If this is not a header, then we know it is employee information, and we
//need to add this to our list of office employees
IRowContent row = (IRowContent)text.getParent().getParent();
if (row.getBand().getBandType() == IBandContent.BAND_GROUP_HEADER)
{
currentOffice.setName(text.getText());
}
else
{
Employees currentEmployee = xmlObjectFactory.createRootOfficeEmployees();

currentEmployee.setName(text.getText());
currentOffice.getEmployees().add(currentEmployee);
}
}
}


You will notice it uses a very SAX-like style of processing, where each element gets a start and an end method. Each of these elements corresponds to a type of report design element. In the above example, we are only looking for new rows and new text elements. This emitter makes the following assumptions:

-That the report design file is a single table

-That the table is grouped by an Office ID

-That the data element in the Detail row contains only the employee’s name.

So in other words, this particular emitter is not a general-purpose emitter; it is designed with a specific report design file in mind.
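To make the start/end callback style concrete without needing BIRT on the classpath, here is a small self-contained sketch. The RowListener interface and OfficeCollector class are hypothetical stand-ins that I made up for illustration; they are not BIRT types. The collector groups employee names under the most recent group header, in the same spirit as the emitter above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CallbackSketch {
    /** Hypothetical stand-in for the start/end callbacks an emitter receives. */
    interface RowListener {
        void startRow(boolean groupHeader);
        void text(String value);
        void endRow(boolean groupHeader);
    }

    /** Collects office name -> employee names from the callback sequence. */
    static class OfficeCollector implements RowListener {
        final Map<String, List<String>> offices = new LinkedHashMap<>();
        private boolean inHeader;
        private List<String> currentEmployees;

        public void startRow(boolean groupHeader) {
            inHeader = groupHeader;
            if (groupHeader) {
                // a group header row opens a new office
                currentEmployees = new ArrayList<>();
            }
        }

        public void text(String value) {
            if (inHeader) {
                // text inside a header row is the office name
                offices.put(value, currentEmployees);
            } else if (currentEmployees != null) {
                // text inside any other row is an employee of the current office
                currentEmployees.add(value);
            }
        }

        public void endRow(boolean groupHeader) {
            inHeader = false;
        }
    }
}
```

One deliberate difference: the sketch ignores text that arrives before the first group header, whereas the emitter above would dereference a null currentOffice in that situation. That is fine for the single-table, grouped report it was written for, but it is the first thing to harden if the design ever changes.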

Setting up the emitter is another task in itself. I used the Eclipse 3.3 plug-in configuration editor to set mine up; however, you can edit the files by hand. The first thing I did was configure the general settings, such as name and ID, and make sure the Activator was correct.

Figure 1. Emitter configuration

Next, I configured the dependencies: in this case, the BIRT Report Engine and the Eclipse Core.

Figure 2. Dependencies

Next, I specified the packages to export during the build that are needed at runtime. I exported the 3 packages in my emitter: the package with the Activator, the package with the emitter itself, and the package with the JAXB-generated classes. In the classpath, I specified the JARs that JAXB needs to work properly.

Figure 3. Runtime Exported Classes and Jars

Next, I configured my extension. I created a new emitter extension and specified the class to use and the format. The format string is important: it is the string I will use when I tell BIRT which output format to use. I specified a generic MIME type and an ID, for which I used my package name. I also specified no pagination. The pagination setting matters if you are building an emitter that supports multiple pages, such as the PDF emitter: it influences the behavior of the document generator inside BIRT and adds more legwork to the emitter.

Figure 4. Extension configuration

That’s pretty much it. Now, when I want to test this, I right-click on my project and choose Export, then Deployable Plug-ins and Fragments. I usually export to my BIRT Runtime folder for testing, and write a few unit tests to exercise the emitter. Below is an example of the output I get from my emitter.



<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Root xmlns="http://www.example.org/OfficeLayout">
<Office>
<name>1</name>
<Employees>
<name>Murphy, Diane</name>
</Employees>
<Employees>
<name>Patterson, Mary</name>
</Employees>
<Employees>
<name>Firrelli, Jeff</name>
</Employees>
<Employees>
<name>Bow, Anthony</name>
</Employees>
<Employees>
<name>Jennings, Leslie</name>
</Employees>
<Employees>
<name>Thompson, Leslie</name>
</Employees>
</Office>
<Office>
<name>2</name>
<Employees>
<name>Firrelli, Julie</name>
</Employees>
<Employees>
<name>Patterson, Steve</name>
</Employees>
</Office>
<Office>
<name>3</name>
<Employees>
<name>Tseng, Foon Yue</name>
</Employees>
<Employees>
<name>Vanauf, George</name>
</Employees>
</Office>
<Office>
<name>4</name>
<Employees>
<name>Bondur, Gerard</name>
</Employees>
<Employees>
<name>Bondur, Loui</name>
</Employees>
<Employees>
<name>Hernandez, Gerard</name>
</Employees>
<Employees>
<name>Castillo, Pamela</name>
</Employees>
<Employees>
<name>Gerard, Martin</name>
</Employees>
</Office>
<Office>
<name>5</name>
<Employees>
<name>Nishi, Mami</name>
</Employees>
<Employees>
<name>Kato, Yoshimi</name>
</Employees>
</Office>
<Office>
<name>6</name>
<Employees>
<name>Patterson, William</name>
</Employees>
<Employees>
<name>Fixter, Andy</name>
</Employees>
<Employees>
<name>Marsh, Peter</name>
</Employees>
<Employees>
<name>King, Tom</name>
</Employees>
</Office>
<Office>
<name>7</name>
<Employees>
<name>Bott, Larry</name>
</Employees>
<Employees>
<name>Jones, Barry</name>
</Employees>
</Office>
</Root>

IT Field: Following the Money...

I read the following article over at Richard Bejtlich’s TaoSecurity blog. I haven’t commented on his articles in a while, so I figured he was due for some feedback, only this time I agree with his sentiments 100 percent.

One of the comments that really struck at the heart of the issue in my eyes was from a reader who said he had struck up a conversation with a guy at the mall. The guy went on to state that he was getting into network security to get one of those “6 figure salaries” that he hears so much about.

This is not a problem that is unique to network security; it applies to the IT industry as a whole. It was really prevalent during the dot-com era of the late ’90s and early 2000s. I remember reading magazine articles about developers being lured away from jobs the way that NBA superstars are. Now we have the same type of talk about network security. I thought it was ridiculous then, and I still do.

What is the result of sensationalist talk like this? Unmotivated, hastily educated, and unqualified individuals filling sensitive positions, people who are merely money chasers and are only willing to put in their 9-5. In other words, a large surplus of unqualified workers filling job slots. What’s in store for the network security field? If what has happened to the development community is any indication, the “menial” and entry-level positions of Network Administrator will get outsourced to save costs, blocking promising and talented administrators from the field. Those who get the jobs out of trade school will be unqualified, creating poor network infrastructures and larger holes, just as menial coding jobs get outsourced or offshored and bugs and security holes get created in software. Thus the cycle will continue.

What businesses fail to understand is that it’s not the money that makes personnel good, but their understanding of and dedication to the job. When people ask me how I got involved in development work, I tell them I got involved with it when I was young, fell in love with the work, and would be doing it for free. My degree was a result of my dedication to the work, not of my desire to earn money. Everything else just fell into place. I don’t chase the money (although getting paid is nice), and I’d still do this for fun even if I wasn’t getting paid. That’s the kind of dedication network security folks are facing: hackers who love to hack, and programmers who love to program.

So let’s compare and contrast. On one side, the dedicated hackers who do it for the love of hacking, and who just so happen to sometimes get paid by organized crime for their skills. On the other, the 6 figure 9-5’ers who have a 2 year degree from a trade school, or a business degree with an IT emphasis. I agree with Richard’s statements, and I weep for the future of the network security field. So when does the flood of clueless articles in business magazines about the failure of network security begin?

BIRT: Using the Design Engine API and Open Libraries

Recently I gave a presentation on BIRT at the Actuate International Users Conference. One of the things I discussed was embedding the BIRT Design Engine API into an application. This is an often overlooked aspect of BIRT, since most discussions center around report creation using the Eclipse editor and the BIRT Report Engine. I figured it would be cool to do something with the design engine as well. This is useful if your users would like to create their own simple, custom reports and you would like to give them that functionality. There are already products out there that are built on this concept.

The BIRT Design engine is actually a fairly simple API to use. It is part of the org.eclipse.birt.report.model.api package. The steps for creating a report using the API are illustrated below.

Figure 1. Creating a BIRT Report using the Design Engine API

In the above sequence, the user is presented with a list of data sets that are available in a report Library, the user selects a data set to build their own custom report off of, and a new report is created. I just recycled this diagram from my presentation since I am lazy, but the steps are illustrated in the third section, Create New Report.

It’s fairly simple: the program instantiates a Report Design Engine object and creates a new design session, the session creates a new report design, data sets are added to the new report design, and a table is built off of the data sets. Then the report design file is saved.

Below is sample code for using the BIRT Report Design engine. The below example will create a simple report in a temporary folder, add a simple report footer, add a grid component, and inside of the grid, add a label that says Hello World. Nothing too fancy with this one.

import java.io.IOException;

import org.eclipse.birt.core.exception.BirtException;
import org.eclipse.birt.core.framework.Platform;
import org.eclipse.birt.report.model.api.CellHandle;
import org.eclipse.birt.report.model.api.DesignConfig;
import org.eclipse.birt.report.model.api.DesignElementHandle;
import org.eclipse.birt.report.model.api.ElementFactory;
import org.eclipse.birt.report.model.api.GridHandle;
import org.eclipse.birt.report.model.api.IDesignEngine;
import org.eclipse.birt.report.model.api.IDesignEngineFactory;
import org.eclipse.birt.report.model.api.LabelHandle;
import org.eclipse.birt.report.model.api.ReportDesignHandle;
import org.eclipse.birt.report.model.api.RowHandle;
import org.eclipse.birt.report.model.api.SessionHandle;
import org.eclipse.birt.report.model.api.SimpleMasterPageHandle;
import org.eclipse.birt.report.model.api.activity.SemanticException;
import org.eclipse.birt.report.model.api.command.ContentException;
import org.eclipse.birt.report.model.api.command.NameException;

import com.ibm.icu.util.ULocale;

public class DesignTest {

/**
* @param args
*/
public static void main(String[] args) {
try {
//create the report design engine configuration pointing to the BIRT runtime
DesignConfig dconfig = new DesignConfig();
dconfig.setBIRTHome("C:/BIRT_RUNTIME_2_2/birt-runtime-2_2_0/ReportEngine");
IDesignEngine engine = null;

//try to start up the eclipse platform to load any plugins and create
//a new design engine
Platform.startup( dconfig );
IDesignEngineFactory factory = (IDesignEngineFactory) Platform.createFactoryObject( IDesignEngineFactory.EXTENSION_DESIGN_ENGINE_FACTORY );
engine = factory.createDesignEngine( dconfig );

//create a new session
SessionHandle session = engine.newSessionHandle( ULocale.ENGLISH ) ;

// create a design or a template. Then create a report element factory
ReportDesignHandle design = session.createDesign();
ElementFactory efactory = design.getElementFactory();

//set my initial properties
design.setDisplayName("my Test Report");
design.setDescription("test");
design.setIconFile("/templates/blank_report.gif");
design.setFileName("c:/TEMP/sample.rptdesign");
design.setDefaultUnits("in");
design.setProperty("comments", "what not and what have you");

SimpleMasterPageHandle element = efactory.newSimpleMasterPage( "Page Master" );
DesignElementHandle footerText = efactory.newTextItem("test");
footerText.setProperty("contentType", "html");
footerText.setStringProperty("content", "MyTest");

//Add in a simple page footer to our master page
element.getPageFooter().add(footerText);

//add the master page (which carries the footer) to the report design
try {
design.getMasterPages( ).add( element );
} catch (ContentException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (NameException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}

//create a new grid element, and set the width to 100 percent of the page design
GridHandle grid = efactory.newGridItem( null, 1, 1);
grid.setWidth( "100%" );

//Add the grid to the report body
design.getBody( ).add( grid );

//get the first (and only) row of the 1x1 grid
RowHandle row = (RowHandle) grid.getRows( ).get( 0 );

// Create a label and add it to the first cell.
LabelHandle label = efactory.newLabel( "Hello, world!" );
label.setText("Hello, World!");
CellHandle cell = (CellHandle) row.getCells( ).get( 0 );
cell.getContent( ).add( label );

//save the report design
design.saveAs( "c:/TEMP/sample.rptdesign" );
design.close( );
System.out.println("Finished");
} catch (ContentException e) {
e.printStackTrace();
} catch (NameException e) {
e.printStackTrace();
} catch (SemanticException e) {
e.printStackTrace();
} catch (BirtException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}

}
}

That is a simple example. So what happens when we want to put some real data in? Well, that’s when things get fun, because the data has to be bound to the output elements. So while adding elements is easy, data binding is a bit tricky. If you remember from the old days of using the BIRT report designer, you had to bind data sets to tables and lists in order for data to show up. The designer does this for us automatically nowadays; however, if you are writing a program that uses the Design Engine API, you will need to do this step for your users. The binding is done by adding a ComputedColumn to the table or list’s ColumnBindings property. Then you can add your data element to the cell for that column.

Below is the code that implements the functionality in Figure 1. Not only does it demonstrate how to create a report design in BIRT, it also demonstrates how to open a report library, which also has to be done through the BIRT Design Engine API. It opens a report library and retrieves the data set whose name matches the string passed into the method. (Note: for my example, I used a contains comparison instead of an equals comparison. This isn’t necessary; I only did it because I was working around another, related issue and copied and pasted the code I had used.) You can only add a data set to a report once, hence the hasDataSetAlready variable. If you try to add the same data set multiple times, you will get an error. And since the data set handle does not allow cloning, creating a copy would have taken too much effort for demonstrating this simple concept.

public boolean createReport(String reportName, List dataSetNames) {
try {
DesignConfig dconfig = new DesignConfig();
DataSetHandle dataSetHandleToUse = null;
DataSourceHandle dataSourceHandle = null;
dconfig.setBIRTHome("C:/BIRT_RUNTIME_2_2/birt-runtime-2_2_0/ReportEngine");
IDesignEngine dengine = null;

//create the design engine; unlike the previous example, Platform.startup() is not called here
IDesignEngineFactory factory = (IDesignEngineFactory) Platform.createFactoryObject( IDesignEngineFactory.EXTENSION_DESIGN_ENGINE_FACTORY );
dengine = factory.createDesignEngine( dconfig );

//create a new session, open the library, and retrieve the first data source since it is uniform in our library
SessionHandle session = dengine.newSessionHandle( ULocale.ENGLISH ) ;
LibraryHandle design = session.openLibrary("C:/eclipse/GWTBirt/BIRTGwt/src/reports/DataSets.rptlibrary");
dataSourceHandle = (DataSourceHandle) design.getDataSources().get(0);

//create a new report
ReportDesignHandle reportDesign = session.createDesign();
reportDesign.getDataSources().add(dataSourceHandle);


//find the correct data set based on dataSetName
int dataSetCount = 0;
for (Iterator dataSetIterator = dataSetNames.iterator(); dataSetIterator.hasNext();)
{
dataSetCount++;
String dataSetName = (String) dataSetIterator.next();

for (Iterator i = design.getDataSets().iterator(); i.hasNext(); )
{
DataSetHandle dataSetHandle = (DataSetHandle) i.next();

if (dataSetHandle.getName().contains(dataSetName))
{

dataSetHandleToUse = dataSetHandle;
dataSetHandleToUse.setName(dataSetHandle.getName());
}
}

//Add the current data set to the report design
boolean hasDataSetAlready = false;
for (Iterator i = reportDesign.getDataSets().iterator(); i.hasNext();)
{
DataSetHandle dataSetInReport = (DataSetHandle) i.next();

if (dataSetInReport.getName().equalsIgnoreCase(dataSetHandleToUse.getName()))
{
hasDataSetAlready = true;
}
}
if (hasDataSetAlready == false)
reportDesign.getDataSets().add(dataSetHandleToUse);

//get the columns from the selected dataset
List columnList = new ArrayList();
for (Iterator i = dataSetHandleToUse.getCachedMetaDataHandle().getResultSet().iterator(); i.hasNext(); )
{
ResultSetColumnHandle colInfo = (ResultSetColumnHandle)i.next();

columnList.add(colInfo.getColumnName());
}

//create new table, set the data set
TableHandle reportTable = reportDesign.getElementFactory().newTableItem("testTable" + dataSetHandleToUse.getName(), columnList.size());
reportTable.setWidth("100%");
reportTable.setDataSet(dataSetHandleToUse);

//create a new detail row and add to the report
RowHandle detailRow = (RowHandle) reportTable.getDetail().get(0);
int x = 0; //used to mark current column position

//go through column list and create a new column binding, otherwise data will not be populated into the report
//Then add a new column to our row
for (Iterator i = columnList.iterator(); i.hasNext();)
{
String columnName = (String) i.next();

ComputedColumn computedColumn = StructureFactory.createComputedColumn();
computedColumn.setName(columnName);
computedColumn.setExpression("dataSetRow[\"" + columnName +"\"]");
PropertyHandle computedSet = reportTable.getColumnBindings( );
computedSet.addItem(computedColumn);

//add new data item and cell
DataItemHandle data = reportDesign.getElementFactory().newDataItem(columnName);
data.setResultSetColumn(columnName);
CellHandle cell = (CellHandle)detailRow.getCells().get(x);
cell.getContent().add(data);
x++; //advance position
}

//add the table to my report
reportDesign.getBody().add(reportTable);
}
//set my initial properties for the new report
reportDesign.setDisplayName(reportName);
reportDesign.setDescription(reportName);
reportDesign.setIconFile("/templates/blank_report.gif");
reportDesign.setFileName("C:/eclipse/GWTBirt/BIRTGwt/src/reports/" + reportName + ".rptdesign");
reportDesign.setDefaultUnits("in");
reportDesign.setProperty("comments", reportName);
reportDesign.setProperty(IReportRunnable.TITLE, reportName);

//save report design
reportDesign.saveAs("C:/eclipse/GWTBirt/BIRTGwt/src/reports/" + reportName + ".rptdesign");

return true;
} catch (ContentException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (NameException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (DesignFileException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (SemanticException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}

return false;
}
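Two details of the method above are easy to get wrong, so here is a small standalone sketch of them in plain Java, with no BIRT dependency. The class and method names are hypothetical, and the data set names in the usage note are invented. The first helper builds the dataSetRow["..."] binding expression and escapes backslashes and quotes in the column name, which the code above does not do. The other two contrast the contains lookup (where the loop has no break, so the last match wins) with an exact match.

```java
import java.util.List;

class BindingSketch {
    /** Builds a dataSetRow["..."] expression, escaping \ and " in the name. */
    static String bindingExpression(String columnName) {
        String escaped = columnName.replace("\\", "\\\\").replace("\"", "\\\"");
        return "dataSetRow[\"" + escaped + "\"]";
    }

    /** Mirrors the lookup loop above: no break, so the last containing name wins. */
    static String lastContaining(List<String> names, String wanted) {
        String found = null;
        for (String name : names) {
            if (name.contains(wanted)) {
                found = name;
            }
        }
        return found;
    }

    /** Exact-match alternative: returns the first equal name, or null. */
    static String firstEqual(List<String> names, String wanted) {
        for (String name : names) {
            if (name.equals(wanted)) {
                return name;
            }
        }
        return null;
    }
}
```

With a library holding data sets named Orders and OrdersDetail, asking lastContaining for Orders returns OrdersDetail, while firstEqual returns Orders. That is why the equals comparison is the safer default once the workaround is no longer needed.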

Tuesday, August 21, 2007

ETL: Kettle

For several months now I have been meaning to take a more in-depth look at Kettle, an open source ETL tool that is part of the Pentaho project. I was first turned on to Kettle back in January while attending a product introduction for a commercial partner of the company I work for. It caught my attention mainly because they recommended it at the time, but also because I remembered the horrible times I had dealing with large numbers of data loads in different formats. So, as a test of Kettle’s ETL capabilities, I decided to do a simple run of loading data from BIRT’s Classic Cars Derby database and exporting that data to a Microsoft Access database.

Since Kettle is a Java-based tool, I need JDBC drivers for each of my databases. I decided to try this as a two-step process just to make things interesting. First, I will export the data from Classic Cars to a series of text files. Then I will import those text files into Access. Nothing is stopping me from going directly into the Access database; I just wanted to simulate the process that I encounter more often than not, which is database-to-text-to-database.

Kettle 2.5.0 comes distributed as a single zip file, and installation is as simple as extracting the zip to a given location. Not wanting to be too terribly original, I just unzipped the file into C:\Kettle. Kettle is broken up into 4 programs. I only needed 1, Spoon, which designs Transformations and Jobs. I’m only concerned with Transformations for my experiment. So I launched Spoon.


Figure 1. Kettle Start Up Screen

When starting the program, I tried to set up a repository. The repository is a database of transformations and jobs. I thought this would be useful; however, I only had Derby handy. The Derby driver that came with Kettle did not want to work with my database, and copying my own Derby drivers into the C:\Kettle\libext\JDBC folder did not help. Long story short, the repository didn’t quite work, so I skipped it and went with No Repository to continue on.

Now, since I use the Classic Cars database for other things that are not part of the embedded distribution of BIRT, I have extracted the sample database JAR file to a folder at C:\ClassicCarsDatabase. So the first step is to research what the JDBC URL will be. In my case, the JDBC URL is jdbc:derby:C:/eclipse/GWTBirt/BIRTGwt/src/CCDataBase, and the driver I will use is org.apache.derby.jdbc.EmbeddedDriver. I already use both with Hibernate, so I know they will work.

So, now that I am in Spoon, I go up to File, New, and Transformation. Under the Main Tree pane, I double-click on Database Connections. Now, since I had issues with Kettle’s Derby driver, I have to select Generic database, which will allow me to specify a JDBC URL and driver to use. So under the Generic tab, I use the JDBC information I mentioned above.


Figure 2. Database Connection

Now that I have specified a database connection, I can drag and drop the Classic Cars database connection over to the Transformations tab. This will automatically create a new Table Input step for me. It will also bring up the edit dialog for the Table Input step. From the dialog I click on Get SQL Statement, which brings up a graphical browser of the database schemas and tables. Having had issues with schema namespacing with Hibernate using this driver and JDBC URL in the past, I actually browse to the full Schemas.Schema Name.tables path. Now, since I am simulating a data export of the database, I need to create a straight full select statement for each table. So, starting with my Employees table, I browse to the Schemas.CLASSICMODELS.EMPLOYEES entry in the browser and double-click. When asked if I want to include the field names, I say yes. I click on the Preview button to make sure it works. Once done, I hit OK. I repeat the step for all tables under the ClassicModels schema.


Figure 3. The Table Input Editor

Once I have all of my input tables created, I now need to specify my output text files. So, under the Core Objects pane on the left-hand side, I select the Output drop-down and drag over the Text File Output object. Once I drag it over, I need to double-click on the object to bring up the editor. But before I do, I need to link the Table Input that will feed it. I want to create a text file based on Employees. So, to create the hop, I hold down the left Shift key and drag the Employees Table Input object onto the Text File Output object. This creates an arrow that points from the Employees Table Input to the Text File Output. Now I double-click on the Text File Output object. For the filename I specify Employees, since this will represent the Employees table. I then go over to the Fields tab and click on the Get Fields button. This takes care of retrieving all the fields to be output. I hit OK, then repeat the same steps for all the tables. Once done, I click on the execute button to run the transformation. Of course, the transformations would be useless if they could only be run from within Spoon. Fortunately, that’s what Pan is for. Read the documentation for more information on using Pan. (Note: due to the binary field in the ProductLines table, I did not export that field to the text files.)


Figure 4. The Stats Screen After Running the Transformation

Going into Access is the exact opposite. However, I did run into a few issues. I needed either a JDBC driver or an ODBC connection. I went with the latter since I couldn’t find a free JDBC driver for Access. Once I had that, I was all set. With the transformations in place, I can easily script the entire process using Pan or Kitchen. The scenario I had in mind was a large data transfer of employee certificates to sell insurance. Since this list would come in keyed by SSN, I could have used Kettle to read in the text file, run a transform to replace the SSNs with employee IDs, and load the result into a database. I think I will keep this tool in mind for the future.
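To make the SSN scenario concrete, here is a minimal sketch of the replacement step written in plain Java rather than as a Kettle transformation. Everything in it is an assumption for illustration: the class SsnToEmployeeId, the method replaceSsn, the comma-delimited layout with the SSN in the first field, and the obviously fake sample values.

```java
import java.util.HashMap;
import java.util.Map;

public class SsnToEmployeeId {

    /*
     * Hypothetical helper: swaps the SSN in the first field of a
     * comma-delimited line for the matching employee ID.
     */
    static String replaceSsn(String line, Map<String, String> idBySsn) {
        // A limit of -1 keeps trailing empty fields intact
        String[] fields = line.split(",", -1);
        String employeeId = idBySsn.get(fields[0].trim());
        if (employeeId == null) {
            // Deliberately does not echo the SSN back in the message
            throw new IllegalArgumentException("No employee ID found for the SSN on this line");
        }
        fields[0] = employeeId;
        return String.join(",", fields);
    }

    public static void main(String[] args) {
        // Made-up lookup data; a real run would load this from the employee table
        Map<String, String> idBySsn = new HashMap<>();
        idBySsn.put("000-00-0001", "1002");

        System.out.println(replaceSsn("000-00-0001,Life,2007-12-01", idBySsn));
    }
}
```

In Kettle the same idea would presumably be a lookup step between the text file input and the table output, which is the part that makes the tool attractive for this kind of job.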