Friday, December 07, 2012

BIRT and Google BigQuery



BIRT offers many ways data can be retrieved, including extending JDBC drivers, scripted data sources and implementing the Open Data Access (ODA) framework extensions.  Each of these methods has pros and cons associated with them. Generally the most re-usable method is to implement the ODA extension which allows adding a personalized GUI to the BIRT designer.  Many examples of this approach are available on Birt-Exchange.  The quickest method is to implement a scripted data source, which involves writing some Java or JavaScript to retrieve values.  This post illustrates retrieving a Google BigQuery using the scripted data source approach.   


For this report we will use the births by year query described here.   


The query is defined as "SELECT year, SUM(record_weight) as births FROM publicdata:samples.natality GROUP BY year"

The first thing we will need to do is to create a Java class that will call BigQuery.  The class for this example is shown below.  More details on the BigQuery Java API are available here.



package birt.big.query.sample;
import java.io.File;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.util.List;

import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.http.HttpTransport;
import com.google.api.client.http.javanet.NetHttpTransport;
import com.google.api.client.json.JsonFactory;
import com.google.api.client.json.jackson.JacksonFactory;
import com.google.api.services.bigquery.Bigquery;
import com.google.api.services.bigquery.model.QueryRequest;
import com.google.api.services.bigquery.model.QueryResponse;
import com.google.api.services.bigquery.model.TableCell;
import com.google.api.services.bigquery.model.TableRow;


public class GetBigQueryData {

        private static final String SCOPE = "https://www.googleapis.com/auth/bigquery";
        private static final HttpTransport TRANSPORT = new NetHttpTransport();
        private static final JsonFactory JSON_FACTORY = new JacksonFactory();
        private Bigquery bigquery;
        private static String QUERY = "SELECT year, SUM(record_weight) as births FROM publicdata:samples.natality GROUP BY year";

        public void setupConnection() throws GeneralSecurityException, IOException{

               GoogleCredential credential = new GoogleCredential.Builder().setTransport(TRANSPORT)
                              .setJsonFactory(JSON_FACTORY)
                               .setServiceAccountId("yournumber@developer.gserviceaccount.com")
                              .setServiceAccountScopes(SCOPE)
                              .setServiceAccountPrivateKeyFromP12File(
                                             new File("locationofyourkey.p12")).build();

               bigquery = new Bigquery.Builder(TRANSPORT, JSON_FACTORY, credential)
               .setApplicationName("BigQuery-Service-Accounts/0.1")
               .setHttpRequestInitializer(credential).build();
        }
        public List executeQuery( String query) throws IOException{


               QueryRequest queryInfo = new QueryRequest().setQuery(query);

               Bigquery.Jobs.Query queryRequest = bigquery.jobs().query("yourprojectid", queryInfo);
               QueryResponse queryResponse = queryRequest.execute();
               return queryResponse.getRows();
        }

        public static void main(String[] args) throws IOException, GeneralSecurityException {
               GetBigQueryData bq = new GetBigQueryData();
               bq.setupConnection();
               List rows = bq.executeQuery(QUERY);

               if( rows != null ){
                       for (TableRow row : rows) {
                             for (TableCell field: row.getF()) {
                              System.out.printf("%s--", field.getV());
                             }
                       }
               }

        }
}
 
This is a simple class based on the Google BigQuery examples.  Note that this project was configured to use a service account.  If you plan to use the above code make sure to do the same for your project and enter the correct service account id and private key.  You will also need to specify your project id when creating the query request.  This example has two methods that will be called from BIRT setupConnection and executeQuery.  The setupConnection method just builds our GoogleCredentials object and the executeQuery method executes the passed in query string and returns the query response. 

To make this class available to BIRT you can either add it to BIRT as a jar or create a Java project in the same workspace as the BIRT Report project.  For this post, a Java project was created (Named BirtGoogleBigQueryExample).  Using the BIRT Classpath preference, the Java project and the Google jars where added to the report project (Named BirtGoogleBigQueryReport).



Select Configure Project Specific Settings and click on the BIRT report project (BirtGoogleBigQueryReport in this example).
 
Next click on the Add Projects button and add the Java project (BirtGoogleBigQueryExample in this example).  Finally select the Add JARs button and add the Google BigQuery API jars to the Report Project classpath.

You can now create a Scripted Data Source in BIRT that calls this class.  First create a new report and in the data explorer view, right click on Data Sources and select New Data Source.


Select Scripted Data Source and Click Finish.  Next right click on the Data Sets node and select New Data Set.

Select the scripted data source you created and click on the next button.  Enter two columns, one for year and one for births.  Set their type to integer.
For this example only two events need to be implemented.  Both are on the scripted data set.  An open and a fetch event must be implemented.  To do this select the scripted data set in the data explorer view and click on the script tab in the report editor.  Enter the following script for the open event.

importPackage(Packages.birt.big.query.sample);
importPackage(Packages.com.google.api.services.bigquery.model);
importPackage(Packages.java.util);

var bq = new GetBigQueryData();
bq.setupConnection();
var mrows = bq.executeQuery("SELECT year, SUM(record_weight) as births FROM publicdata:samples.natality GROUP BY year");
iter = mrows.iterator();

This script imports the required packages and calls the setupConnection method from the class we created earlier.  The query is then executed by calling our executeQuery method.  An iterator is then created to use in the fetch method.  Select the fetch event and enter the following script.

if( iter.hasNext()){
 tr = iter.next();
 myrow = tr.getF();

 row["Year"] = parseFloat(myrow.get(0).getV());
 row["Births"] = parseInt(myrow.get(1).getV());

 return true;
}else{
 return false;
}

With BIRT scripted data sets the fetch method is continuously called until a false is returned.  In this example each of the BigQuery rows is iterated and the column values are set.  You can now use the data from the query in your report.  


When deploying the report make sure to have the BigQuery jars and the custom class in the classpath of your runtime.  In most cases you will just copy the jars to the WEB-INF/lib of the viewer.
 
The example Java Project and Report Project are available at Birt-Exchange.


 

1 comment:

Jason Weathersby said...

You may also want to look at the Starschema JDBC driver for BigQuery Located here:
http://code.google.com/p/starschema-bigquery-jdbc/