PostgreSQL Case Sensitivity Part 1 (The DDL)

Any time you discuss PostgreSQL, the issue of case sensitivity quickly boils to the top of the conversation. Having used PostgreSQL in several high volume production systems for the last 3 years I will attempt to seperate some of the myth and legend out of the conversation and provide some best practices methods that have worked for me.

The first thing to understand, is that there are actually two completely different case sensitivity issues with PostgreSQL. Case sensitivity of the data definition language (DDL) itself, which I will be covering in this first article, and case sensitivity of the data itself, which I will cover in a second article.

Most developers have adapted “CamelCase” as their preferred convention for using case in table and field names. So a table containing customer phone numbers might be “CustomerPhone”. Instead of “customer_phone” or some other variation. And the first name of the customer might be “FirstName” or “CustomerFirstName”. It makes the fields easy to read and there is no need for special characters such as the underscore, which those of us that have evolved from hunt and peck really appreciate!

So what’s the deal? Why is this an issue with PostgreSQL?

PostgreSQL treats all DDL as case sensitive. To assist with this, it forces all SQL code to lowercase before submitting it to the back-end,  If you use CamelCase when creating tables and fields in PostgreSQL, via PGAdmin, then the resulting DDL will have double quotes around the fields.  For those of you familiar with MS SQL Server the double quote serves the same purpose as the square brackets ( [  ] ).

CREATE TABLE "CustomerCase"
(
 "CustSysId" serial NOT NULL,
 "FirstName" character varying(20),
 "LastName" character varying(20),
 CONSTRAINT pkcustomercase PRIMARY KEY ("CustSysId" )
)
WITH (
 OIDS=FALSE
);

So why not just get rid of the double quotes? If you do, PostgreSQL will just ignore the case and create the table and fields in all lower case.

What happens if you create the table with double quotes around the fields so we get CamelCase field names? Then all of your resulting SQL code must also use double quotes.

Select "FirstName" from "CustomerCase"

That works but any of the following will generate an error:

select FirstName from CustomerCase
select FirstName from "CustomerCase"
select firstname from "CustomerCase"

Anything not wrapped with double quotes is being converted to lowercase prior to being submitted to the database. And since the DDL is case sensitive, firstname is not the same field as FirstName. In fact I can demostrate that by creating the following table:

CREATE TABLE "CustomerCase4"
(
 "CustSysId" serial NOT NULL,
 "firstname" character varying(20),
 "FirstName" character varying(20),
 "LastName" character varying(20),
 CONSTRAINT pkcustomercase4 PRIMARY KEY ("CustSysId" )
)
WITH (
 OIDS=FALSE
);

I have two first name columns, as the case is different PostgreSQL treats them as different names. PLEASE PLEASE NEVER DO THIS!!!!

An import fact to note is that the double quotes are not key, the case sensitivity is the key. To select both first name columns above you might think that you need to wrap both with double quotes since they were defined that way, but in reality the double quotes just force the field to be accepted as is instead of converting it to lower case. So both of the below statements will work.

select “FirstName”,firstname from “CustomerCase4″
select “FirstName”,”firstname” from “CustomerCase4″

Notice there are no double quotes around firstname in the first statement. And now for a bonus question …… Will the following statement work and if so what will it do?

select FirstName from "CustomerCase4"

It will not generate a SQL error; remember anything not wrapped in quotes will be converted to lower case, which means the field being selected is actually “firstname” not “FirstName”. The statement that actual gets executed on the server is

select firstname from "CustomerCase4"

If that didn’t make your head hurt just a little bit, than you're a better man than me!

So How Do I Deal With Case Sensitivity in the DDL?

By not dealing with it! In otherwords, I long ago quit trying to fight PostgreSQL on the topic and just create all my DDL in all lower case. At first it seemed like the fields were harder to read, but after a little time, I found that my brain sees CustomerFirstName and customerfirstname the same way. Would I preferer to be able to have CamelCase in my DDL? Of course! But not enough to pay for SQL Server licensing, or be forced to wrap double quotes around every single field and table name in every SQL statement I write.

So How is WinDev Development Effected by this?

First any SQL code you do using HExecuteSQLQuery will need to be the same as any code you execute via PGAdmin, meaning that if you had used double quotes and CamelCase in your DDL then your SQL statements would need to also be done using double quotes. So for the same reasons that I stated above, I create my DDL in PostgreSQL as all lower case.

I was hoping that I could change my Analysis to CamelCase and all my standard WX code would be in CamelCase and everything would work. However it appears that when the actual code is passed to the back-end everything gets wrapped with double quotes. And the following code

customer.FirstName = "test2"
customer.lastname = "test2"
HAdd(customer)

will generated the below error.

ERROR: column "FirstName" of relation "customer" does not exist at character 34

After discovering this I had hope it would be considered a bug in the Native PostgreSQL driver that pcSoft could address, but after doing some research on the issue, it appears that every external driver to PostgreSQL has this same issue and it appears to be in the driver itself that the double quotes are being applied.

Summary

So for my development I have adopted the following best practice. I create all my DDL in PostgreSQL in lower case. I import the resulting tables into the Analysis and leave everything as lowercase.

There is one trick that you can do to use the table names as CamelCase. In the table definition you have the ability to name both the table and the external database name of the table. So you can change the name of your table to CamelCase and just leave the external name in lower case.

And then your WX code will look like the following:

Customer.firstname = "test2"
Customer.lastname = "test2"
HAdd(Customer)

The one thing that pcSoft could do to make our life easier is expose an external name for all the items of a table; that way we could use the same trick for field names as we can for table names. I have submitted this as a suggestion to pcSoft, feel free to also express your interest in this feature to them!!!

There are of course alternatives to my method. You could use CamelCase in your DDL, and as long as your Analysis has the exact same mixed case all your standard Hxxxx statements in WX would work fine. And if you don’t do a lot of native SQL via HExecuteSQLQuery  or in PGAdmin then that might be an option for you. Just remember that anytime you do use native SQL you will need to double quote all fields and table names.

Another option that many may chose is to use lower case with an under score (i.e. customer_first_name). Everything will still need to be lowercase but the fields may be easier to read. For me the slightly easier to read field names is not worth having to using the underscore in the field names, but that is a personal preference and there certainly is nothing wrong with any of these options as long as you understand the consequences and it works for you.

Be sure to tune into part two of this article where I will get into case sensitivity of the actual data in PostgreSQL.