RevK®'s ramblings: INSERT INTO table SET field=value,...

2016-03-05

INSERT INTO table SET field=value,...

Doing a few searches I guess this is a mysql specific variation. I thought for a moment I was being stupid.

One of the pitfalls of learning any new programming language or system is that your knowledge can be incomplete and you do not realise it. I have had this on occasion with various systems, usually because of ongoing improvements to the system itself (like linux libraries) and simply not knowing some new and simpler way to do things now exists. This is one of the reasons code gets old - looking at code I wrote a year ago I cringe as I have learned new ways to do things over time.

Just occasionally my original learning of the system meant I missed some detail, because it was not needed in the first few cases of using the system, and then over time I never go back and re-read the text books (web pages) to realise I have missed something.

This syntax for SQL INSERT is an example I have only just realised exists and I was beginning to think I was going mad, and cursing why I did not know about it. Another example was when I discovered the "ON DUPLICATE" feature of INSERT statements.

I did a few searches, and one rather odd comment was "What a confusing syntax! If I didn't know how to use web search to lookup the documentation, I'd have thought someone mixed up UPDATE and INSERT!", so I thought I would explain good reasons why this syntax is so much better for some uses.

The main example is when using this in a program. (For a start, I have formatting functions that know how to escape values for SQL to avoid injection attacks, obviously).

1. The fact that INSERT and UPDATE were different syntax meant you have to have two separate bits of code for creating and entry and for updating the entry. This was annoying as often you want almost all of the same fields. Now you can make code that creates the query for either in one go :-

(a) INSERT INTO or UPDATE depending if existing or new

(b) table name

(d) lots of field=value pairs (and comma)

(e) either a comma or WHERE depending if new or existing

(f) keyfield=value;

Nice and simple.

2. The other issue with the INSERT INTO table (fields...) VALUES (values...) syntax is maintenance of code, and even worse if some fields are not always needed... You end up with something like

(a) INSERT INTO table (

(b) List of field names

(d) List of format controls like %s, %d, etc

(e) );

(f) List of variables to be printed in the formatting where the %s, %d, etc are located

Points (b), (d), and (f) have to all match up exactly, so adding a new field you have to count the position in the list and add field name, and in list of format controls, and in list of variables. If one of the field does not always want to be included you have to have a conditional entry in all three lists. It is tedious in code and messy and easy to get wrong.

Using the SET syntax you have

(a) INSERT INTO table SET

(b) Add each entry separately using field=, format control, variable

This means that if any field is optional, you have one condition test around that field, not three. It means if a new field is needed, you add one formatted print command. If a field is to be removed, you remove one formatted print command.

If you don't have libraries for formatting SQL as I do (uses %#s as a quoted escaped string, for example), you can have an add_string(field,value) function that adds ,field='escapedvalue' to the query in one go.

It just makes for easier to understand queries and neater code.

Now I have to resist the urge to grep all my code and change the syntax of existing usage - as that will definitely introduce new bugs. I just have to use this on new code and when maintaining old code, one step at a time...

9 comments:

Matthew WilkesSaturday, 5 March 2016 at 11:15:00 GMT
It's rather unusual to hear someone talk about writing SQL code for simple CRUD. You'd be much better off using an ORM or API above the cursor level. Not only would your code be more portable to different backends but you'd benefit from reduced maintainance work when upgrading db versions.
ReplyDelete
Replies
Tony HoyleSaturday, 5 March 2016 at 15:20:00 GMT
Putting %s in insert or update statements is bad practice IMO. I'd always use bind variables because you never know what's going to end up there..

Not sure you save much anyway. Once you've written the code to generate the update/insert - which normally happens once - you're passing it arrays of fields and things anyway. Unless you're hand coding the queries, which has its place of course.. OTOH in that case you save nothing as you're writing it each time anyway.
ReplyDelete
Replies
PhilMonday, 7 March 2016 at 12:01:00 GMT
Insert and Update are such common SQL commands that usually follow the same syntax each time (unlike a select that may involve a complicated join). It's a wonder that a library doesn't exist to build the SQL needed.
ReplyDelete
Replies