Thursday, December 18, 2008

California Immigration

I've uploaded the US Census demographic data to Good Data and I can't stop wondering. For example, would you believe that a good one fourth of California's population are immigrants? More precisely, 26.2% of Californians were born outside of the US.




Do you want to start wondering too? Let me know (zd at gooddata.com) and I'll invite you to this analytic project.

Wednesday, November 5, 2008

My Good Data Web Expo Slides

See the slides that I presented at the WebExpo conference.

Good Data REST API

The Good Data BI platform is accessible through a stateless REST API. This HTTP-based API can easily be used from any 3rd party application as well as from a plain browser. The API provides the full power of our platform (we actually use it as the backend for our web frontend).

In fact, the Good Data application consists of a handful of service types. Instances of these services can be dynamically added or removed (via simple HTTP load-balancing) on an as-needed basis. Add the Amazon EC2 cloud, which lets us add or remove machines and pay only for the CPU ticks that we really use, and the net result is great flexibility, scalability and cost efficiency.
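The idea of adding and removing stateless service instances behind a load balancer can be sketched in a few lines of Python. This is only an illustration, not Good Data's actual implementation: the `ServicePool` class, the round-robin policy and the instance addresses are all assumptions made up for the example.

```python
class ServicePool:
    """Hypothetical pool of interchangeable, stateless service instances."""

    def __init__(self):
        self._instances = []  # addresses of running instances
        self._next = 0        # index of the instance to use next

    def add(self, address):
        """Register a new instance, e.g. after starting another machine."""
        self._instances.append(address)

    def remove(self, address):
        """Drop an instance, e.g. before shutting its machine down."""
        self._instances.remove(address)
        self._next = 0  # restart the rotation; simple but safe

    def pick(self):
        """Return the next instance in round-robin order."""
        if not self._instances:
            raise RuntimeError("no service instances available")
        address = self._instances[self._next % len(self._instances)]
        self._next = (self._next + 1) % len(self._instances)
        return address


pool = ServicePool()
pool.add("10.0.0.1:8080")  # made-up addresses
pool.add("10.0.0.2:8080")
first_three = [pool.pick() for _ in range(3)]
pool.remove("10.0.0.1:8080")
after_removal = pool.pick()
```

Because the services keep no session state, any instance can serve any request, which is what makes this kind of rotation safe.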

The demo video below points out the fundamental architectural differences between our approach and some other on-demand BI vendors who simply deployed an existing BI package (e.g. Pentaho or MS Analytics) on the web (which unfortunately does not prevent their marketing from using the multi-tenant, SaaS mumbo jumbo).

This video might help you to better understand the Good Data architecture. I apologize for the lack of audio. Hopefully the simple step-by-step description below helps:

1. The /gdc suffix in the GDC BI platform URL shows the list of the REST API services that the platform provides.

2. Then we navigate to the metadata services that manage metadata for a selected BI project (the FoodMartDemo in our case).

3. We first show the FULL-TEXT SEARCH service. We specify the search term ("sales") directly in the service's URL. The list of matching results is shown.

4. We select one of the reports from the search result to inspect the report's definition. We can spit out the definition in many formats (e.g. JSON, YAML, ATOM, or XML). We use YAML as the default.

5. Then we demonstrate the metadata QUERY service. We list all reports in the FoodMartDemo project. We again inspect one of the reports: Salary by Year and State.

6. Then we demonstrate the USING service, which shows us all dependencies of the report (the metadata objects that the selected report references). For example, the report depends on its definition (reportDefinition) object. We copy and paste the link of the report definition to the browser URL bar to inspect the report definition object structure. It contains all attributes and metrics that the report displays (all inner objects have their URLs too, so we could continue investigating them).

7. Then we navigate to the XTAB service. The XTAB can execute and cross-tabulate (or pivot if you like) the report's definition. We supply the report definition URL and it spits out the representation of the report result (you can see the machine representation of the report's data). Notice the asynchronous processing here.

8. Then we go back to the original report Salary by Year and State. The report contains a reference to its result.

9. We will copy and paste the result's URL to the EXPORTER service that returns (again asynchronously) the report result's data in MS Excel format.

If you have the Good Data platform demo account, you can try this script yourself at http://demo.gooddata.com/gdc (hint - you'll need to take a look at the LOGIN service).
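Steps 7 and 9 rely on asynchronous processing: the client asks for a result and has to come back for it later. A minimal sketch of such a polling loop follows. It makes assumptions the walkthrough does not confirm: that a pending result is signalled with HTTP status 202 and a finished one with 200. The `poll_until_ready` function, the `fetch` callable, the `FakeService` class and the URL are hypothetical stand-ins so that the example runs without a network or a demo account.

```python
import time


def poll_until_ready(fetch, url, interval=0.0, max_attempts=10):
    """Call fetch(url) until it stops answering 'still working'.

    fetch is any callable returning (status_code, body). Status 202 is
    assumed to mean 'not ready yet'; this is an assumption of the sketch.
    """
    for _ in range(max_attempts):
        status, body = fetch(url)
        if status == 200:
            return body
        if status != 202:
            raise RuntimeError(f"unexpected status {status} for {url}")
        time.sleep(interval)  # wait before asking again
    raise TimeoutError(f"{url} not ready after {max_attempts} attempts")


class FakeService:
    """Stand-in for a remote service: ready on the third request."""

    def __init__(self):
        self.calls = 0

    def __call__(self, url):
        self.calls += 1
        if self.calls < 3:
            return 202, None
        return 200, "report data"


service = FakeService()
# The URL is a placeholder, not a documented Good Data endpoint.
result = poll_until_ready(service, "/hypothetical/result/url")
```

With a real HTTP client, `fetch` would wrap a GET request and `interval` would be set to a second or so instead of zero.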

New Good Data Website!

Check out the new Good Data website.

Thursday, October 9, 2008

Deadly iPhone screen


Now I know why one would want to set up multiple wake-up alarms for a single night. My "beloved" iPhone is the only device that we currently have at home capable of doing this. So it's Marimba, diapers, milk, burp. Twice every three hours. :-(

Ray has entered the blogosphere!

Ray Light, the guy who sits next to me at Good Data, has started blogging. I know that he will be damn good at it. You can start following him at http://www.collaborativeanalytics.com/.

Monday, September 29, 2008

Cloud Transparency

Radovan wrote a comment on Werner's article about cloud transparency. I agree that operators need monitoring and developers want to optimize their apps. However, I don't understand how these needs relate to (location) transparency. I think that this is rather about the functionality (e.g. OOTB performance and the right management tools) that the cloud offers.

I see a little schizophrenia in the AWS messaging. "We (AWS) want developers to play with all nuts and bolts, optimize, monitor, and trace at the network packet level. And when the code jumps into our queuing, simple db, payment or whatever other high-level service then forget all transparency, close your eyes and cross your fingers." :-)

I think that different developers have different needs in terms of the right transparency level. IMO AWS is heavily used by web developers. I believe that particularly web developers will lean towards higher transparency, packaged high-level services and easy deployment.

Radovan, what do you think about the Google AppEngine?