Often times I find myself having to handle a particular portion of a SQL statement via a script. I've written a lot of specialized regular expressions over time to handle these tasks, but the one that I've had to write the most is a basic 'tokenizer' which understands the quoting semantics of a MySQL statement. That is, it understands `BackTick`, 'Single Quote' and "Double Quoted" strings.
#PHP Heredoc syntax
$regex = <<< END_OF_REGEX
/
[^ \"'`(),]*\([^)]*\) #match functions like concat(x,"y",`a`.`z`) or sum(`xyz`);
|\([^)]*?\) #match grouped items
|"[^"]*?" #match double quoted items
|'[^']*?' #match single quoted items
|`[A-Za-z0-9_ .'\"()+\\-&^%\$+?%\\/\\\\!`]+` #match backtick mysql names
|[^ ,]+ #match keywords, operators and aliases
|,
/xi
END_OF_REGEX;
When used with the preg_match_all() function, an array is produced which represents the tokenized string. Functions calls, quoted strings and grouped expressions are returned together, and may need additional processing, depending on your needs.
For example:
-- select sum(a * b) sweety,'abc' as a1,
-- "abc",concat(`def`.`abc`,'hello', x.y, z) as `blah`,
-- null + 1
-- from `abc`.`def` as def1
-- join xyz.zzz z1
-- on (z1.a = def1.a)
Array
(
[0] => select
[1] => sum(a * b)
[2] => sweety
[3] => ,
[4] => 'abc'
[5] => as
[6] => a1
[7] => ,
[8] => "abc"
[9] => ,
[10] => concat(`def`.`abc`,'hello', x.y, z)
[11] => as
[12] => `blah`
[13] => ,
[14] => null
[15] => +
[16] => 1
[17] => from
[18] => `abc`.`def`
[19] => as
[20] => def1
[21] => join
[22] => xyz.zzz
[23] => z1
[24] => on
[25] => (z1.a = def1.a)
)
I'm using this to process SQL statements and automatically create the Flexviews calls for the query. The following output is generated by my program for the above SQL statement (only the SELECT clause is currently represented):
call flexviews.add_expr(@mvid, 'SUM', 'a * b', 'sweety');
call flexviews.add_expr(@mvid, 'GROUP', 'abc', '1');
call flexviews.add_expr(@mvid, 'GROUP', "abc", '_abc_');
call flexviews.add_expr(@mvid, 'GROUP', 'concat(`def`.`abc`,\'hello\', x.y, z)', 'blah');
call flexviews.add_expr(@mvid, 'GROUP', 'null + 1', 'null_+_1');
|