Databases


Finding duplicate values in a table is one of those things I can never remember how to do when I need to do it. I don’t know why; it’s really quite simple:

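-- Each email address that occurs more than once, with its number of occurrences.
-- Only grouped or aggregated columns are selected, so the query is valid in strict SQL modes too.
SELECT emailaddress,
       COUNT(emailaddress) AS occurrences
FROM customers
GROUP BY emailaddress
HAVING COUNT(emailaddress) > 1;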
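
To try the idea without touching a real customer table, here is a minimal Python sketch that runs a GROUP BY / HAVING query against an in-memory SQLite database. The sample rows and the function name find_duplicate_emails are made up for illustration; only the table and column names come from the query above.

```python
import sqlite3

# Hypothetical sample data in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (emailaddress TEXT, firstname TEXT, lastname TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("ann@example.com", "Ann", "Lee"),
        ("bob@example.com", "Bob", "Ray"),
        ("ann@example.com", "Anne", "Lee"),
    ],
)


def find_duplicate_emails(connection):
    """Return (emailaddress, occurrences) for each address stored more than once."""
    return connection.execute(
        "SELECT emailaddress, COUNT(*) AS occurrences "
        "FROM customers "
        "GROUP BY emailaddress "
        "HAVING COUNT(*) > 1 "
        "ORDER BY emailaddress"
    ).fetchall()


duplicates = find_duplicate_emails(conn)
print(duplicates)
```

Only the grouped column and the count are selected; to see the names attached to each duplicate address, one option is to join this result back to the customers table.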

Chocs To Go

The goal for students on the intermediate PHP and MySQL course that I’m running for IWA/HWG is to create a working shopping cart. The course starts in only three days, and I decided I’d better walk the walk and prove I could do it. So I have quickly mocked up a working version of the cart they are expected to build and put it online here. I think it will be helpful to students to see what they are aiming at, and it will make it easier for them to work out the logic if they can click through a real example. The design may not win any prizes, but the basics are all there.

All course materials now written and double-checked, quiz questions written and checked, SQL scripts generated and tested … just need to put week one’s materials up now!

Why doesn’t SQL Server come with a built-in method for generating SQL dump files so you can easily port your data to another database? Microsoft in its own little walled garden I suppose; while the facilities in Enterprise Manager for exporting to Excel, Access, ODBC, CSV etc. can be very useful, sometimes you just need a bunch of SQL statements <whisper>so you can export your data to Oracle</whisper>.

SQLDumper to the rescue; a handy little free utility that does just that, with a neat user interface that lets you select which database/tables you want to dump. Windows only, but hey, it’s a SQL Server tool after all! Downside: you have to download and install the .NET framework if you don’t already have it, but the setup program handles that itself; just something to be aware of if you have a slow connection.

Since 1999 I’ve been teaching online for the Open University. It may not pay brilliantly, but it’s regular income and I really enjoy doing it; OU students often overcome major barriers to achieve a qualification they missed out on earlier in life, for whatever reason, so they can be very rewarding to teach. And the fact that it’s online means I can conveniently fit it around other work. Currently I teach three courses in the Certificate in Web Application Development.

I can now announce that I’m also going to start teaching an online course with eClasses.org, the online learning branch of IWA/HWG. The course uses PHP and MySQL to build a shopping cart, and is aimed at intermediate PHP developers (you don’t have to know anything about MySQL, but it certainly helps if you do). I’m really looking forward to this; you can see the full course description here, and registration is now open.

It’s always a real headache manipulating dates between PHP and MySQL, because PHP’s date functions assume Unix timestamps, while MySQL has its own internal date format. They both have excellent built-in functions for handling dates, but they are fundamentally incompatible. I’ve torn my hair out before now trying to manipulate dates between the two, because pretty well every application I write needs to do date arithmetic.

Well, duh, I don’t know why I didn’t think of this before, but MySQL has a UNIX_TIMESTAMP function which sorts it all out for you by converting a MySQL date to a timestamp. Simon Willison explains it, so I don’t need to. So easy when you know how!
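
The conversion itself is easy to picture. The Python sketch below does by hand what a function like UNIX_TIMESTAMP does: it turns a 'YYYY-MM-DD HH:MM:SS' string into seconds since the Unix epoch, and back again. The function names are hypothetical, and the sketch assumes the stored value is in UTC; MySQL itself interprets the value in the session time zone, so results can differ by the local offset.

```python
from datetime import datetime, timezone

# The textual layout MySQL uses for DATETIME values.
MYSQL_DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def mysql_datetime_to_timestamp(value):
    """Convert a 'YYYY-MM-DD HH:MM:SS' string to a Unix timestamp (assumes UTC)."""
    parsed = datetime.strptime(value, MYSQL_DATETIME_FORMAT)
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def timestamp_to_mysql_datetime(timestamp):
    """Convert a Unix timestamp back to a 'YYYY-MM-DD HH:MM:SS' string (UTC)."""
    parsed = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return parsed.strftime(MYSQL_DATETIME_FORMAT)


# One day after the epoch is 86400 seconds.
one_day = mysql_datetime_to_timestamp("1970-01-02 00:00:00")
print(one_day)
print(timestamp_to_mysql_datetime(one_day))
```

Once both sides agree on a plain integer timestamp, date arithmetic can be done in either place and converted back only for display or storage.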

About the only reason I ever boot into Windows on the spare PC nowadays is because I need to use Enterprise Manager or Query Analyser to manage a SQL Server database. Today it was inconvenient to reboot, because I was in the middle of doing a lot of printing. So I googled “Linux SQL Server client” and came across sqsh. Installation and set up sounded a wee bit complicated (required compiling from source), but doable, so I put it on … the back burner, with a reminder to myself to do it sometime over the next week or so.

Then later on I happened to be sitting at the Mepis PC, so I pulled up Synaptic and typed sqsh into the search field. Hey presto! Two clicks and it was installed. “Still,” I thought, “I’m bound to have to fiddle about configuring it.”

Undeterred, I typed
sqsh -Umyusername -Pmypassword -Smyservername
at the shell prompt, and a sqsh prompt appeared. If Linux doesn’t give you an error message then nothing has gone wrong. So:
use mydatabasename
go

Still no error!
SELECT * FROM mylittletable
go
and there were the results! Fantastic — if possibly not all that secure when used remotely :-) Highly recommended if you need occasional access to a SQL Server or Sybase database from Linux.

I have been looking for a while for a tool to create Entity Relationship Diagrams (ERDs) on Linux. On Windows I used Dezign for Databases, an excellent tool which I would still be using if it was available for Linux. Linux tools that do as much as it does are few and far between. DBDesigner4 is open-source but wouldn’t run at all on my system and doesn’t seem to have any visible means of support. After much searching I ended up evaluating a commercial (i.e. non-free) product, Data Architect, from theKompany. After a few false starts trying to install the demo from an RPM, I managed to get it to work by the simple expedient of downloading and unzipping the tar archive. I had to edit the shell script to replace ksh with bash, but then it worked.

However, it requires ODBC to talk to databases (not an essential feature if you are just doing design and then generating the necessary SQL scripts, but definitely nice to have). Being a Microsoft invention, ODBC is not that easy to get working on Linux. It took me all morning, but I succeeded in the end. So here’s how to do it on Xandros Home Premium:

1) Use Xandros Networks to search for and install unixODBC, the package that provides ODBC support (or use apt-get, or whatever package manager you use, or download direct from unixodbc.org).
2) As far as I could tell this did not install the GUI front-end that allows you to manage ODBC data sources in a similar way to the ODBC manager on Windows. This is not essential because you can simply edit the relevant configuration files if you know what you are doing. Not knowing this, I visited unixodbc.org and downloaded unixODBC-gui-qt-2.2.11-1.i386.rpm, which I then installed using Xandros Networks.
3) Great, ODBC support is now available and the GUI works! However, I don’t have any drivers. I want to use it with MySQL, so it’s off to MySQL to download the necessary driver. Again I downloaded an RPM and installed using Xandros Networks. I quickly found that this didn’t work with Xandros. So a quick Google sent me off here for a Debian package. Goodness knows why MySQL don’t provide a Debian version.
4) Now I again ran the GUI front end installed in step 2, and hey presto, the driver was there. The bad news: when I tried to set up and connect to a data source I got a message saying that the driver couldn’t find /var/mysqld/mysqld.sock. Not surprising, since this file did not exist. More googling and a poke around my file system established that the file was actually at /opt/lampp/var/mysql/mysql.sock.
5) So I needed to edit the odbc.ini file to tell it where to find the socket:

[MySQL-test]
Description = MySQL database test
Driver = MySQL
Server = localhost
Database = test
Port = 3306
Socket = /opt/lampp/var/mysql/mysql.sock
Option =
Stmt =

Progress! Now I can connect, but when I try to authenticate with a username and password I get: “Client does not support authentication protocol requested by server; consider upgrading MySQL client.” Aha, I’ve encountered this before; it’s to do with the password hashing in newer versions of MySQL being incompatible with clients that expect the older format. So, simple solution:
6) Set up a user for ODBC purposes, with the appropriate privileges, and issue the command:

SET PASSWORD FOR odbcuser@localhost = OLD_PASSWORD('mypassword');

7) Try again, and yes, I can browse the databases and tables now! Nice added bonus: ODBC is also available in OpenOffice, so I can access my databases from there too. Let’s just hope that Data Architect does the job …

The task: provide a neat hierarchical tree as a web interface to a project management system where any project might have a variable number of sub-projects. A tree structure in other words. So I ended up with this:

ID  Project Name         Parent ID
1   The Mega Project     (none)
2   First Sub-Project    1
3   Second Sub-Project   1
4   Another Sub-Project  2

Nice and neat… the only problem is that on the face of it, relational databases aren’t very good at handling trees, and after a bit of experimentation I was ready to write a whole lot of procedural code in ASP to take a recordset and build a hierarchy from it. But of course I googled first. The initial responses included Joe Celko’s (in)famous nested-set tree structure which was far too complicated for my feeble mind to grasp and didn’t look as if it would provide a simple interface for users to add new projects.

And then I came across Philip Greenspun’s interpretation of the classic problem of an employee table in which some employees are managed by others who in turn are managed by others, all the way up to the big boss. He used the Oracle-specific CONNECT BY PRIOR which I’d come across elsewhere, but explained it in a way I could actually understand. Just the ticket! So with a single SQL statement I now had my neat hierarchical recordset.
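
CONNECT BY PRIOR is Oracle-only, so the sketch below swaps in a different technique that gives a similar result: a recursive common table expression, run here against an in-memory SQLite database from Python. The table name projects, its column names and the zero-padded path column used for ordering are all assumptions made for illustration; the rows are the four projects from the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER)"
)
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?)",
    [
        (1, "The Mega Project", None),
        (2, "First Sub-Project", 1),
        (3, "Second Sub-Project", 1),
        (4, "Another Sub-Project", 2),
    ],
)

# Start from the root rows (no parent), then repeatedly join children onto the
# rows found so far. The zero-padded path keeps each child directly beneath
# its parent when the final result is sorted.
TREE_QUERY = """
WITH RECURSIVE tree(id, name, depth, path) AS (
    SELECT id, name, 0, printf('%08d', id)
    FROM projects
    WHERE parent_id IS NULL
    UNION ALL
    SELECT p.id, p.name, tree.depth + 1, tree.path || '/' || printf('%08d', p.id)
    FROM projects AS p
    JOIN tree ON p.parent_id = tree.id
)
SELECT depth, name FROM tree ORDER BY path
"""

rows = conn.execute(TREE_QUERY).fetchall()
for depth, name in rows:
    print("    " * depth + name)
```

Each row comes back with its depth in the tree, which is the piece of information needed to indent or nest the output.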

Next: how to present it. Serendipitously, I happened to be looking at Matt Kruse’s Javascript Toolbox for another reason and noticed a DHTML Tree. I like his stuff, so I had a look. Magic! A simple bulleted list is transformed into an expanding and collapsing tree with a few lines of CSS and a sprinkling of Javascript. The nice thing is that in non-CSS and non-Javascript browsers it degrades to a simple list so it’s accessible too. My brain steamed lightly as I wrestled with the logic to get the opening and closing list tags in the right places as I read through the recordset, but a couple of hours’ work had given me an ideal solution and in another half hour I had a simple form for adding new projects at any level in the structure. So humble thanks to Philip and Matt for being far cleverer than I will ever be!
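
The fiddly part is deciding when to open and close list tags while walking the recordset. A minimal Python sketch of one way to do it follows; it is not the original ASP code. It assumes the rows arrive as (depth, name) pairs with every parent immediately before its children and depth never jumping by more than one level at a time; render_tree is a hypothetical name.

```python
from html import escape


def render_tree(rows):
    """Turn (depth, name) rows in parent-before-child order into nested <ul> markup."""
    parts = []
    previous_depth = -1
    for depth, name in rows:
        if depth > previous_depth:
            # One level deeper: open a new list inside the item still open.
            parts.append("<ul>")
        else:
            # Same level or shallower: close the previous item, then close
            # one list (and its parent item) for each level we climb back up.
            parts.append("</li>")
            for _ in range(previous_depth - depth):
                parts.append("</ul></li>")
        parts.append("<li>" + escape(name))
        previous_depth = depth
    # Close every item and list still open at the end of the recordset.
    for _ in range(previous_depth + 1):
        parts.append("</li></ul>")
    return "".join(parts)


sample_rows = [
    (0, "The Mega Project"),
    (1, "First Sub-Project"),
    (2, "Another Sub-Project"),
    (1, "Second Sub-Project"),
]
markup = render_tree(sample_rows)
print(markup)
```

The result is an ordinary nested bulleted list, which is the kind of markup a CSS and Javascript tree script can then decorate.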

I’ve been developing websites with PHP for about 5 years now and I blush to confess that I have only just discovered I can use bind parameters with SQL, instead of painstakingly building up my SQL statements by concatenating strings, while remembering to deal appropriately with all potentially dangerous characters to avoid the risk of crashes or hacks.

Bind parameters are an unreservedly Good Thing. Not only do they make your SQL more readable, maintainable, and portable, but they also protect against the dreaded SQL injection which is the scourge of badly coded PHP websites developed by people who don’t know any better. And as if that wasn’t enough they can improve performance too.

I am no fan of reinventing the wheel, so since I first started using PHP I have used the open-source ADODB database abstraction library for database access. I like it because it’s fast and lightweight compared to its better-known competitor PEAR::DB. Tutorials are thin on the ground though, and I confess I only discovered its support for bind parameters recently, by reading between the lines of the documentation while looking up something else. Since then I’ve been gradually refactoring existing code — whenever I need to edit some code for some other reason I have a look for any embedded SQL and add some data-binding goodness to it. This is simple to do with ADODB. For example:

Before:

$SQL = "SELECT column1, column2, column3 FROM Tablename WHERE column1 = '$myvalue'";
$Result = $DB->GetRow($SQL);

having first carefully cleaned the contents of $myvalue of course!

After:

$SQL = "SELECT column1, column2, column3 FROM Tablename WHERE column1 = ?";
$Result = $DB->GetRow($SQL, array($myvalue));

Note that I didn’t need to quote the string in the SQL statement, and the database library deals with any escaping necessary, in the appropriate style for the database being used (no MySQL-specific code here!). In addition, where the database driver supports it, the SQL statement can be pre-compiled; if you are going to execute the same statement a number of times with different values each time (e.g. when doing multiple inserts), this can really boost performance.
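
The same pattern exists in most database libraries. As an illustration outside PHP and ADODB, here is a minimal sketch using Python's built-in sqlite3 module, which also uses ? placeholders. The table and column names are borrowed from the example above; the sample values and the fetch_row helper are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tablename (column1 TEXT, column2 TEXT, column3 TEXT)")

# One statement, many sets of values: each tuple is bound in turn, and values
# containing quotes need no manual escaping.
conn.executemany(
    "INSERT INTO Tablename (column1, column2, column3) VALUES (?, ?, ?)",
    [
        ("plain", "a", "b"),
        ("O'Brien", "c", "d"),
    ],
)


def fetch_row(connection, myvalue):
    """Fetch the first row whose column1 equals myvalue, using a bound parameter."""
    return connection.execute(
        "SELECT column1, column2, column3 FROM Tablename WHERE column1 = ?",
        (myvalue,),
    ).fetchone()


row = fetch_row(conn, "O'Brien")
# A classic injection attempt is treated as an ordinary string and matches nothing.
injected = fetch_row(conn, "nothing' OR '1'='1")
print(row)
print(injected)
```

Because the value travels separately from the SQL text, the injection attempt is simply compared against column1 as a literal string and finds no match.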

So what’s not to like? I don’t know why this technique isn’t better known among PHP developers. For example here is a lengthy thread discussing SQL injection, involving some apparently quite knowledgeable developers, and yet bind parameters are never mentioned; everyone pins their hopes on the stupidly named PHP function mysql_real_escape_string. Perhaps this will change now that support for binding is built into the PDO database access layer in PHP 5. Even if you are still using PHP 4.x you owe it to yourself to try a database abstraction library instead of endlessly hard-coding for one specific database engine.

Of course I should add that none of this means you can dispense with proper data validation before writing it to the database: the golden rule still remains “Never trust user input”!