I have a user table with userId and username columns, and both are unique.
Between userId and username, which would be better to use as a foreign key, and why?
My boss wants to use the string. Is that OK?
Is string or int preferred for foreign keys?
In the OP’s case, there is both a surrogate key (int userId) and a natural key (varchar username). Either column can be used as the Primary Key for the table, and either way, you will still be able to enforce uniqueness of the other key.
Here are some considerations when choosing one way or the other:
The case for using Surrogate Keys (e.g. UserId INT AUTO_INCREMENT)
If you use a surrogate (e.g. UserId INT AUTO_INCREMENT) as the Primary Key, then all tables referencing table MyUsers should use UserId as the Foreign Key.
You can still, however, enforce uniqueness of the username column through an additional unique index, e.g.:
CREATE TABLE `MyUsers` (
    `userId` int NOT NULL AUTO_INCREMENT,
    `username` varchar(100) NOT NULL,
    ... other columns
    PRIMARY KEY (`userId`),
    UNIQUE KEY UQ_UserName (`username`)
);
As per @Dagon, using a narrow primary key (like an int) has performance and storage benefits over using a wider (and variable-length) value like varchar. This benefit also carries over to other tables which reference MyUsers, as the foreign key to userId will be narrower (fewer bytes to fetch).
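The surrogate-key layout described above can be sketched as follows. This is a minimal illustration, not the OP's actual schema: it uses SQLite through Python's sqlite3 module as a stand-in for MySQL (INTEGER PRIMARY KEY plays the role of INT AUTO_INCREMENT), and the Orders table and sample rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the Orders table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE MyUsers (
        userId   INTEGER PRIMARY KEY,      -- narrow surrogate key
        username VARCHAR(100) NOT NULL,
        UNIQUE (username)                  -- natural key stays unique
    );
    CREATE TABLE Orders (
        orderId INTEGER PRIMARY KEY,
        userId  INTEGER NOT NULL REFERENCES MyUsers (userId)
    );
""")

conn.execute("INSERT INTO MyUsers (username) VALUES ('alice')")
alice_id = conn.execute(
    "SELECT userId FROM MyUsers WHERE username = 'alice'"
).fetchone()[0]
conn.execute("INSERT INTO Orders (userId) VALUES (?)", (alice_id,))


def try_insert(sql, params=()):
    """Return True if the statement succeeds, False on a constraint violation."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# A second 'alice' violates the unique index on username.
duplicate_ok = try_insert("INSERT INTO MyUsers (username) VALUES ('alice')")
# An order for a user that does not exist violates the foreign key.
orphan_ok = try_insert("INSERT INTO Orders (userId) VALUES (?)", (9999,))
```

Both constraints hold at once: the child table carries only the integer, while the unique index still rejects duplicate usernames.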
Another benefit of the surrogate integer key is that the username can be changed easily without affecting tables referencing MyUsers. If username was used as a natural key, and other tables are coupled to MyUsers via username, it becomes very inconvenient to change a username (since the Foreign Key relationship would otherwise be violated). If updating usernames is required on tables using username as the foreign key, a technique like ON UPDATE CASCADE is needed to retain data integrity.
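A minimal sketch of ON UPDATE CASCADE with username as the key, again assuming SQLite via Python's sqlite3 as a stand-in for MySQL. The Posts table and sample data are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the Posts table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
    CREATE TABLE MyUsers (
        username VARCHAR(100) PRIMARY KEY  -- natural key as the primary key
    );
    CREATE TABLE Posts (
        postId   INTEGER PRIMARY KEY,
        username VARCHAR(100) NOT NULL,
        FOREIGN KEY (username) REFERENCES MyUsers (username)
            ON UPDATE CASCADE              -- propagate renames to child rows
    );
    INSERT INTO MyUsers (username) VALUES ('alice');
    INSERT INTO Posts (username) VALUES ('alice'), ('alice');
""")

# Renaming the parent row rewrites every referencing row as well.
conn.execute("UPDATE MyUsers SET username = 'alice2' WHERE username = 'alice'")
post_usernames = [
    row[0] for row in conn.execute("SELECT username FROM Posts ORDER BY postId")
]
```

The rename stays consistent, but the database has to rewrite every child row that carries the old username.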
The case for using Natural Keys (i.e. username)
One downside of using Surrogate Keys is that other tables which reference MyUsers via a surrogate key will need to be JOINed back to the MyUsers table if the Username column is required. One of the potential benefits of Natural Keys is that if a query requires only the Username column from a table referencing MyUsers, it need not join back to MyUsers to retrieve the user name, which saves some I/O overhead.
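The difference in query shape can be sketched like this. It assumes SQLite via Python's sqlite3 as a stand-in for MySQL, and the PostsById and PostsByName tables are hypothetical names for the two designs.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; both Posts tables are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE MyUsers (
        userId   INTEGER PRIMARY KEY,
        username VARCHAR(100) NOT NULL UNIQUE
    );
    -- Surrogate design: the child row stores only the integer userId.
    CREATE TABLE PostsById (
        postId INTEGER PRIMARY KEY,
        userId INTEGER NOT NULL REFERENCES MyUsers (userId)
    );
    -- Natural design: the child row stores the username itself.
    CREATE TABLE PostsByName (
        postId   INTEGER PRIMARY KEY,
        username VARCHAR(100) NOT NULL REFERENCES MyUsers (username)
    );
    INSERT INTO MyUsers (userId, username) VALUES (1, 'alice');
    INSERT INTO PostsById (postId, userId) VALUES (10, 1);
    INSERT INTO PostsByName (postId, username) VALUES (10, 'alice');
""")

# Surrogate key: the username is only reachable through a JOIN.
with_join = conn.execute("""
    SELECT p.postId, u.username
    FROM PostsById AS p
    JOIN MyUsers AS u ON u.userId = p.userId
""").fetchall()

# Natural key: the username is already in the child table.
without_join = conn.execute(
    "SELECT postId, username FROM PostsByName"
).fetchall()
```

Both queries return the same rows; the natural-key version simply never touches MyUsers.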
An int will index faster. That may or may not be an issue; it is hard to say based on what you have provided.
An int is 4 bytes, whereas a string can be as many bytes as you like. Because of that, an int will generally perform better. Unless, of course, you stick with usernames that are less than 4 characters long 🙂
Besides, you should never use a column as a PK/FK if the data within the column itself can change. Users tend to change their usernames, and even if that functionality doesn’t exist in your app right now, maybe it will in a few years. When that day comes, you might have 1000 tables that reference the user table, and then you’ll have to update all 1000 tables within a transaction, and that’s just bad.
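By contrast, when the referencing tables hold an integer key, a rename is a single-row update. A rough sketch, assuming SQLite via Python's sqlite3 as a stand-in for MySQL, with a hypothetical Posts table:

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the Posts table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE MyUsers (
        userId   INTEGER PRIMARY KEY,
        username VARCHAR(100) NOT NULL UNIQUE
    );
    CREATE TABLE Posts (
        postId INTEGER PRIMARY KEY,
        userId INTEGER NOT NULL REFERENCES MyUsers (userId)
    );
    INSERT INTO MyUsers (userId, username) VALUES (1, 'alice');
    INSERT INTO Posts (userId) VALUES (1), (1), (1);
""")

# The rename changes one row in MyUsers; no referencing table is touched.
cur = conn.execute("UPDATE MyUsers SET username = 'alice2' WHERE userId = 1")
rows_touched = cur.rowcount

# Every post still resolves to the renamed user through the unchanged userId.
posts_still_linked = conn.execute("""
    SELECT COUNT(*)
    FROM Posts AS p
    JOIN MyUsers AS u ON u.userId = p.userId
    WHERE u.username = 'alice2'
""").fetchone()[0]
```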
It depends on the foreign key. If your company has control over it, then I recommend using an int if there is an ID field for it. However, sometimes a table has no ID field because another key makes sense as the unique key, so where an ID field does exist, it might just be a surrogate key.
Rule of thumb: Your foreign key data type should match your primary key data type.
Here’s an exception: what about foreign keys that don’t belong to your company? What about foreign keys to databases and APIs that you have no control over? Those IDs should always be strings IMO.
To convince you, I ask these questions:
Are you doing math on it? Are you incrementing it? Do you have control over it? APIs are notorious for change, even data types CAN be changed in someone else’s database… so how much will it mess you up when an int ID becomes a hex?
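One way to picture this: keep your own integer key for internal relationships and store the third party's ID as text. The sketch below assumes SQLite via Python's sqlite3 as a stand-in for MySQL; the ExternalAccounts table and the sample IDs are made up, and the switch from numeric to hex-style IDs is a hypothetical change by the provider.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table name and IDs are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ExternalAccounts (
        accountId  INTEGER PRIMARY KEY,         -- our own key, under our control
        externalId VARCHAR(64) NOT NULL UNIQUE  -- third-party ID, kept as text
    );
""")

# The provider starts with numeric IDs, then (hypothetically) moves to hex.
for external_id in (12345, "0x3039", "a1b2c3"):
    conn.execute(
        "INSERT INTO ExternalAccounts (externalId) VALUES (?)",
        (str(external_id),),  # always stored as a string, never as a number
    )

stored = [
    row[0]
    for row in conn.execute(
        "SELECT externalId FROM ExternalAccounts ORDER BY accountId"
    )
]
```

Because the column is text from the start, the change in ID format needs no schema migration.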