2012-12-31

CsvCruncher – CSV manipulation through SQL (HSQLDB)

Hi,

I've crafted a tool which transforms a CSV file into another CSV file using a SQL statement (as I did not found such).

Can be useful for various quick'n'dirty integration during test automation, esp. for data-oriented tasks like perf-tests.

DOWNLOAD

Many tools spit out CSV, or are able to as one of output options. Also, in hudson, you can very simply log any values you get into bash like echo " $val2, $val2" >> data.csv, for each build or part of a build. So it can be kind of integration tool.

Then you can do quite complex queries – from a flat table, you can actually do subselects and then left joins, which gives you very powerful tool to process the data into something what is ready for plotting as-is – that means, data filtered, cleaned, aggregated, converted, aligned, sorted, etc.

That might be my POV, since I like SQL and it's my favorite language not only for querying but also data-oriented procedural programming. But nonetheless, I already shortened my perf test task by ~ 40 minutes of my work for each release. Instead of manual shannanigans in OpenOffice, I run a single command, and voila ;-)

HSQL's syntax: http://hsqldb.org/…s-chapt.html (I was very surprised by HSQL's features, it supports much more of SQL than e.g. MySQL.)

Enjoy :)


Usage:

crunch [-in] <inCSV> [-out] <outCSV> [-sql] <SQL>"

Example:

crunch input.csv output.csv "SELECT AVG(duration) AS durAvg FROM (SELECT * FROM indata ORDER BY duration LIMIT 2 OFFSET 6)"`

Example:

## jobName, buildNumber, config, ar, arFile, deployDur, warmupDur, scale
'eap-5.1.0-perf-deployers', 355,'production','testdata/war/hellothere.war','hellothere.war',10282,14804,1000
'eap-5.1.0-perf-deployers', 355,'production','testdata/ear/EarWithWar-Counter.ear','EarWithWar-Counter.ear',11005,18904,1000
'eap-5.1.0-perf-deployers', 355,'production','testdata-own/war/war-big-1.0.war','war-big-1.0.war',1966,14800,100
...
   SELECT jobName, buildNumber, config, ar, arFile, deployDur, warmupDur, scale,"
+ " CAST(warmupDur AS DOUBLE) / CAST(deployDur AS DOUBLE) AS warmupSlower FROM indata ORDER BY deployDur
'eap-5.1.0-perf-deployers',355,'production','testdata/war/hellothere.war','hellothere.war',10282,14804,1000,1.4397977047267068E0
'eap-5.1.0-perf-deployers',355,'production','testdata/ear/EarWithWar-Counter.ear','EarWithWar-Counter.ear',11005,18904,1000,1.7177646524307133E0
'eap-5.1.0-perf-deployers',355,'production','testdata-own/war/war-big-1.0.war','war-big-1.0.war',1966,14800,100,7.527975584944048E0

(crunch script yet to be done. Currently you need to call java -jar CsvCruncher.jar <args>.)



0