Friday, December 07, 2012

BIRT and Google BigQuery



BIRT offers many ways data can be retrieved, including extending JDBC drivers, scripted data sources and implementing the Open Data Access (ODA) framework extensions.  Each of these methods has pros and cons associated with them. Generally the most re-usable method is to implement the ODA extension which allows adding a personalized GUI to the BIRT designer.  Many examples of this approach are available on Birt-Exchange.  The quickest method is to implement a scripted data source, which involves writing some Java or JavaScript to retrieve values.  This post illustrates retrieving a Google BigQuery using the scripted data source approach.   


For this report we will use the births by year query described here.   


The query is defined as "SELECT year, SUM(record_weight) as births FROM publicdata:samples.natality GROUP BY year"

The first thing we will need to do is to create a Java class that will call BigQuery.  The class for this example is shown below.  More details on the BigQuery Java API are available here.



package birt.big.query.sample;
import java.io.File;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.util.List;

import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.http.HttpTransport;
import com.google.api.client.http.javanet.NetHttpTransport;
import com.google.api.client.json.JsonFactory;
import com.google.api.client.json.jackson.JacksonFactory;
import com.google.api.services.bigquery.Bigquery;
import com.google.api.services.bigquery.model.QueryRequest;
import com.google.api.services.bigquery.model.QueryResponse;
import com.google.api.services.bigquery.model.TableCell;
import com.google.api.services.bigquery.model.TableRow;


public class GetBigQueryData {

        private static final String SCOPE = "https://www.googleapis.com/auth/bigquery";
        private static final HttpTransport TRANSPORT = new NetHttpTransport();
        private static final JsonFactory JSON_FACTORY = new JacksonFactory();
        private Bigquery bigquery;
        private static String QUERY = "SELECT year, SUM(record_weight) as births FROM publicdata:samples.natality GROUP BY year";

        public void setupConnection() throws GeneralSecurityException, IOException{

               GoogleCredential credential = new GoogleCredential.Builder().setTransport(TRANSPORT)
                              .setJsonFactory(JSON_FACTORY)
                               .setServiceAccountId("yournumber@developer.gserviceaccount.com")
                              .setServiceAccountScopes(SCOPE)
                              .setServiceAccountPrivateKeyFromP12File(
                                             new File("locationofyourkey.p12")).build();

               bigquery = new Bigquery.Builder(TRANSPORT, JSON_FACTORY, credential)
               .setApplicationName("BigQuery-Service-Accounts/0.1")
               .setHttpRequestInitializer(credential).build();
        }
        public List executeQuery( String query) throws IOException{


               QueryRequest queryInfo = new QueryRequest().setQuery(query);

               Bigquery.Jobs.Query queryRequest = bigquery.jobs().query("yourprojectid", queryInfo);
               QueryResponse queryResponse = queryRequest.execute();
               return queryResponse.getRows();
        }

        public static void main(String[] args) throws IOException, GeneralSecurityException {
               GetBigQueryData bq = new GetBigQueryData();
               bq.setupConnection();
               List rows = bq.executeQuery(QUERY);

               if( rows != null ){
                       for (TableRow row : rows) {
                             for (TableCell field: row.getF()) {
                              System.out.printf("%s--", field.getV());
                             }
                       }
               }

        }
}
 
This is a simple class based on the Google BigQuery examples.  Note that this project was configured to use a service account.  If you plan to use the above code make sure to do the same for your project and enter the correct service account id and private key.  You will also need to specify your project id when creating the query request.  This example has two methods that will be called from BIRT setupConnection and executeQuery.  The setupConnection method just builds our GoogleCredentials object and the executeQuery method executes the passed in query string and returns the query response. 

To make this class available to BIRT you can either add it to BIRT as a jar or create a Java project in the same workspace as the BIRT Report project.  For this post, a Java project was created (Named BirtGoogleBigQueryExample).  Using the BIRT Classpath preference, the Java project and the Google jars where added to the report project (Named BirtGoogleBigQueryReport).



Select Configure Project Specific Settings and click on the BIRT report project (BirtGoogleBigQueryReport in this example).
 
Next click on the Add Projects button and add the Java project (BirtGoogleBigQueryExample in this example).  Finally select the Add JARs button and add the Google BigQuery API jars to the Report Project classpath.

You can now create a Scripted Data Source in BIRT that calls this class.  First create a new report and in the data explorer view, right click on Data Sources and select New Data Source.


Select Scripted Data Source and Click Finish.  Next right click on the Data Sets node and select New Data Set.

Select the scripted data source you created and click on the next button.  Enter two columns, one for year and one for births.  Set their type to integer.
For this example only two events need to be implemented.  Both are on the scripted data set.  An open and a fetch event must be implemented.  To do this select the scripted data set in the data explorer view and click on the script tab in the report editor.  Enter the following script for the open event.

importPackage(Packages.birt.big.query.sample);
importPackage(Packages.com.google.api.services.bigquery.model);
importPackage(Packages.java.util);

var bq = new GetBigQueryData();
bq.setupConnection();
var mrows = bq.executeQuery("SELECT year, SUM(record_weight) as births FROM publicdata:samples.natality GROUP BY year");
iter = mrows.iterator();

This script imports the required packages and calls the setupConnection method from the class we created earlier.  The query is then executed by calling our executeQuery method.  An iterator is then created to use in the fetch method.  Select the fetch event and enter the following script.

if( iter.hasNext()){
 tr = iter.next();
 myrow = tr.getF();

 row["Year"] = parseFloat(myrow.get(0).getV());
 row["Births"] = parseInt(myrow.get(1).getV());

 return true;
}else{
 return false;
}

With BIRT scripted data sets the fetch method is continuously called until a false is returned.  In this example each of the BigQuery rows is iterated and the column values are set.  You can now use the data from the query in your report.  


When deploying the report make sure to have the BigQuery jars and the custom class in the classpath of your runtime.  In most cases you will just copy the jars to the WEB-INF/lib of the viewer.
 
The example Java Project and Report Project are available at Birt-Exchange.


 

Tuesday, October 16, 2012

Big Data and BIRT in San Francisco and New York

Actuate is sponsoring a customer day in San Francisco on November 8 and in New York on November 14.  This one day event is free to all and will focus on Big Data and Data Visualization.  Shaku Atre will be doing a keynote on Big Data entitled “Big Data in Motion and Humongous Data at Rest”.   Stephen Few will be doing a keynote on Data Visualization entitled “Telling Compelling Stories with Numbers” in San Francisco and Geoff McGhee will be presenting “Telling Stories with Data” in New York.  There will also be breakout sessions focused on Dashboards and Scorecards, Tips and Tricks, and Big Data Visualizations.   I will be doing a talk in the Tip and Tricks breakout entitled “BIRT Essentials: Tips and Tricks Every Developer Should Know”.  In this talk I will cover some of the most effective but least understood features of BIRT including how to improve performance, scripting, accessing Big Data sources, debugging reports, and making your reports more dynamic.


If you are interested in attending, check out the registrationpage for more details.

Friday, September 28, 2012

BIRT 4.2.1 Released


The BIRT team is pleased to announce the release of BIRT 4.2.1.  This point release fixes several issues including a data set parameter issue when using Oracle or MS SQL Server.  Additionally a bugzilla entry that involved Safari and Chrome viewing issues when the viewer is deployed to Weblogic has been fixed.  An RCP Designer startup issue has also been resolved and a bug with Joint Data Sets has been corrected.

When using a BIRT dataset within a report, BIRT automatically sets up a binding mapping between the dataset and the report item that will use the dataset.  The bindings are available in the Binding tab of the properties editor.  There are many reasons for using a binding mapping, but one main reason is to allow report item aggregations and computed columns.  These binding columns can not only source data from the data set but can also get data from external sources such as global JavaScript variables or external Java classes.  While this feature offers many benefits, one drawback has always been that if the data set was modified, bindings may become invalid.  In a previous release the team added a refresh button that would add missing columns to the binding map, but this approach did not clear invalid columns.  With this release a clear button has been added that allows a user to clear all the bindings that exist for a data bound report item.  So by clicking the clear button and then the refresh button all bindings will be recalculated.
To see a list of the bugzilla entries fixed in this release, take a look at this bugzilla query.

Monday, September 10, 2012

BIRT, Cassandra and Hector

While BIRT offers many ways to connect to Cassandra, including using the Cassandra JDBC driver, this post focuses on using a Scripted data source to call the Hector Client Java client.  A BIRT scripted data source allows external Java classed to be called to retrieve data for a BIRT report and can be written in Java or JavaScript.  The examples below will use JavaScript.    For this post we used the DataStax community edition which is available here, and created a keyspace with the name users and a column family named User.  The User column family contains three string columns for first name, last name and age.  The script used to load the sample data is available in the example download.

Set Designer Classpath


The first thing that you will need to do is set the classpath for the designer to access the following set of jars. 
  • hector-core-version.jar
  • hector-object-mapper-version.jar
  • slf4j-api-version.jar
  • libthrift-version.jar
  • apache-cassandra-thrift-version.jar
  • guava-rversion.jar
  • commons-lang-version.jar


All of these jars, with the exception of the two Hector jars are available in the /install-directory/DataStax Community/apache-cassandra/lib directory.  To get the Hector jars you can download and build the hector source or just download them from a maven repository.

The Hector-object-mapper jar file can be downloaded from here.
The Hector core jar file can be downloaded from here.

One way to setup the classpath is to create a libs directory in your Report Project and then copy all of the jars above to this folder.
Next Select Window->Preferences.  Select the Report Design->Classpath preference and click on the Configure project specific settings link.
 
Select the BIRT Project that you will be using Hector with and click on ok.

Select the enable project specific settings checkbox and add the jars in the lib folder you created earlier.


Creating a Scripted Data Source using Hector

 
You can now create a report that calls the Hector APIs directly.  To do this first create a new report.  Select the data explorer view and right click on the data sources node and click on New Data Source.  Select the Scripted Data Source option and click on finish.
 

Next right click on the Data Sets node and choose the New Data Set option.  Make sure to select the Scripted Data Source that you just created as the data source for this data set. 

 

Click on the Next button and enter each column name and data type for the data set.


 
Click on the Finish button.  You can now enter script for the data set.  To do this first make sure the data set is selected in data explorer view and click on the script tab at the bottom of the report canvas.


In the script editor you will have many events that could be scripted, but in this example all we need is an open script and a fetch script.  First select open from the script drop down list and enter a script similar to the following.

importPackage(Packages.java.util);

importPackage(Packages.me.prettyprint.cassandra.serializers);

importPackage(Packages.me.prettyprint.cassandra.service);

importPackage(Packages.me.prettyprint.hector.api);

importPackage(Packages.me.prettyprint.hector.api.beans);

importPackage(Packages.me.prettyprint.hector.api.factory);

importPackage(Packages.me.prettyprint.hector.api.query);

 

var cluster = HFactory.getOrCreateCluster("Test Cluster",new CassandraHostConfigurator("localhost:9160"));

var keyspace = HFactory.createKeyspace("users", cluster);

var rangeSlicesQuery = HFactory.createRangeSlicesQuery(keyspace, StringSerializer.get(), StringSerializer.get(), StringSerializer.get())

.setColumnFamily("User").setRange(null, null, false, 10).setRowCount(100);            

var result = rangeSlicesQuery.execute();

myrows = result.get();          

rowsIterator = myrows.iterator();

Hector also supports using CQL so you could also use the following open script

importPackage(Packages.java.util);

importPackage(Packages.me.prettyprint.cassandra.serializers);

importPackage(Packages.me.prettyprint.cassandra.service);

importPackage(Packages.me.prettyprint.hector.api);

importPackage(Packages.me.prettyprint.hector.api.beans);

importPackage(Packages.me.prettyprint.hector.api.factory);

importPackage(Packages.me.prettyprint.hector.api.query);

importPackage(Packages.me.prettyprint.cassandra.model);

 

var cluster = HFactory.getOrCreateCluster("Test Cluster",new CassandraHostConfigurator("localhost:9160"));

var keyspace = HFactory.createKeyspace("users", cluster);

            

var cqlQuery = new CqlQuery(keyspace, StringSerializer.get(), StringSerializer.get(), StringSerializer.get());

cqlQuery.setQuery("select * from User");

var resultCQL = cqlQuery.execute();    

rowsIterator = resultCQL.get().iterator();

Next add a fetch script like the following.

if (rowsIterator.hasNext()) {

     var myrow = rowsIterator.next();

     var cols = myrow.getColumnSlice().getColumns();

     for( ii=0; ii < cols.size(); ii++ ){

       row[cols.get(ii).getName()] = cols.get(ii).getValue();

     }

        return true;

}else{

       return false;

}



In the above fetch the script assumes you have named your scripted data set columns the same as the columns in Cassandra.  You should now be able to preview the data set.  Double click on the data set in the data explorer view and select preview.

 
You can now use the data set within your report. 


Deploying a Report that Uses the Hector API


 
If you are using the BIRT Viewer and deploy a report that calls the Hector API, verify that all the jars discussed in the beginning of this Post (Set Designer Classpath) are placed in WEB-INF/lib directory of the Viewer.  If you are running BIRT reports using the BIRT APIs verify that the above jars are also in the classpath.

More information on CQL and Hector is available here.  The example in this post is available on Birt-Exchange.


Tuesday, July 31, 2012

Using a Global JS Function across BIRT Reports

BIRT provides a scripting model that allows report customizations by implementing event handlers.  These events can be written in Java or JavaScript.  This model is described on the Eclipse Birt Site.


We have written many posts on using scripting to modify BIRT report components.  Below are just a few:

In this post we will describe one way that you can share a server side JavaScript function across reports.  Suppose that you have a JavaScript function to reverse a string like:

//external js function
function reverseMyString( MyString )
{
 var rString = "";
 for (i = 0; i < MyString.length; i++)
 {
  rString = MyString.substring(i, i+1) + rString;
 }

 return rString;
}



This function can be put in a .js file and then placed in the BIRT resource folder.  If you do not have a resource folder configured for your BIRT project, it can be set from the designer window preferences dialog.


The js file can be added to the report by selecting the general properties for the report and clicking on the add file button under Javascript Files.

The global function can now be called in the expression builder or in script.
The evaluate function within the reportContext object can also be used to evaluate your script at runtime.
var testString = "ZYXWVU";
this.text = reportContext.evaluate("reverseMyString('"+testString +"')");
Using this same method, a handle to the function can also be retrieved.
var testString = "ZYXWVU";
//Evaluate Function Name
var myfunc = reportContext.evaluate("reverseMyString");
this.text = myfunc( testString );
When writing Chart script it is important to understand that the Chart Engine’s Script Context is not the same as the reports.  The Chart Engine also does the bulk of its generation and rendering during the report engine’s render phase.  Chart script events can get access to the reportContext object by using the following script.
//get reportContext
var rC = icsc.getExternalContext().getScriptable();
Once the reportContext object is obtained you can make all of the standard calls available to it.  These include getting a report parameters values, getting or setting a global variable, retrieving a localized message or calling the evaluate function.  For example, to reverse the chart title the following script could be used.
function beforeGeneration( chart, icsc )
{
var currChartTitle = chart.getTitle().getLabel().getCaption().getValue()+"";
//get reportContext
var rC = icsc.getExternalContext().getScriptable();
var myFunc = rC.evaluate( "reverseMyString" );
chart.getTitle().getLabel().getCaption().setValue(myFunc( currChartTitle ));
}
Or you could create a global js function in your js file that takes the chart as a parameter and reverses its title like:
//external js function
function reverseMyTitle( chart )
{
    var MyString = chart.getTitle().getLabel().getCaption().getValue() + "";
    var rString = "";
    for (i = 0; i < MyString.length; i++)
    {
        rString = MyString.substring(i, i+1) + rString;
    }
    chart.getTitle().getLabel().getCaption().setValue( rString);
} 
You could then call this function in chart script as shown below.
function beforeGeneration( chart, icsc )
{
//get reportContext
var rC = icsc.getExternalContext().getScriptable();
var myFunc = rC.evaluate( "reverseMyTitle" );
myFunc( chart );
}




Wednesday, June 27, 2012

BIRT 4.2 Released


BIRT 4.2 is now available and with this release many improvements and new features are available. BIRT 4.2 now provides a new Excel data source that supports multi-sheet data sets, derived measures are now available on cubes, better filter support with aggregates that allows cumulative data to include or exclude filtered rows, and support for an OSGi or POJO runtime.  In addition BIRT now supports a Donut chart type and the build process has been modified to add Maven support for the BIRT engines.

To read more about these and other new features for BIRT, see the BIRT 4.2 New and Notable.

Wednesday, June 20, 2012

BIRT Area Chart Modifications


BIRT supplies a very robust and extensible chart engine, that can be used standalone or in conjunction with the report engine.  Currently the chart engine supports fourteen different main chart types and many sub-types.  Charts can be emitted in PNG, JPG, BMP, SVG within reports and can be also emitted to SWT, PDF and Swing outside of the report engine.    Virtually every area of the chart engine is also extensible, from adding new chart types to new output formats.  These are done with Eclipse extension points.  In addition the chart engine supports client side interactivity and server side event scripting.  Both of which have been discussed on the site before.
BIRT Chart Scripting Overview


One of the most often used chart types is the Area chart. 

Simple Area Chart


While this type of chart is an effective visualization, we often get questions on how to extend the Area chart to the extents of the plot.  In this post we will put together an example that illustrates one way of extending the covered area.

As discussed extensively in the post referenced above, the chart can be modified using script event handlers.  These handlers can be written in Java or JavaScript.  With certain chart types the render engine renders to the center of a data point intersection.  The Area and Line Charts are examples of the types of charts that exhibit this behavior.  To extend the area chart, a beforeDrawSeries event can be implemented to change the x location of the first and last data point to cover more area.  This approach will work for both 2D and 2D with depth types of charts.  The beforeDrawSeries event is fired for each runtime series and once for the category series.  So in the script you must first check that the locations you are going to modify are for the right series.  If you are not using optional grouping this check is as simple as just getting the series identifier.  This identifier is set as the series title in the third tab of the chart wizard.
Series Identifier
So in the beforeDrawSeries event handler you can check the series identifier by calling the following code.
if( series.getSeriesIdentifier() == "Series 1" ){}

The chart renderer stores all the data point information in a data point hints array.  This array can be retrieved in the beforeDrawSeries event handler by calling:

var dpharray = seriesRenderer.getSeriesRenderingHints().getDataPoints();


Each data point element in the array stores information like the category value, orthogonal value, and x/y location values that the renderer will use to draw the chart.  To get the x location of the area chart we first call the getLocation method and then the getX method.  This needs to be done for the first and last data points in the array.  You can set the x value using the setX method on the location object.  You can also get the width of a data point by calling the getSize method.  This method returns the width show in red in the following diagram.
The getSize method

Using the above methods we can subtract half the width from the first data point x value and add half the width to the x value of the last data point value.  The complete script is shown below.

function beforeDrawSeries( series, seriesRenderer, context )
{      
      if( series.getSeriesIdentifier() == "Series 1" ){
      var dpharray = seriesRenderer.getSeriesRenderingHints().getDataPoints();
                  var xval = dpharray[0].getLocation().getX();
                  var wid = dpharray[0].getSize();
                  dpharray[0].getLocation().setX(xval-(wid/2));
                  var xval = dpharray[dpharray.length-1].getLocation().getX();
                  dpharray[dpharray.length-1].getLocation().setX(xval+(wid/2));  

      }

}

A before and after example 2D with depth chart is shown below.
Before Example


After Example

This example is available on Birt-Exchange.