You’ll have to bear with me here for possibly getting some of the terminology slightly wrong as I wasn’t even aware that this fell into the whole ‘multi-tenant’ ‘software as a service’ category, but here it does.
I’ve developed a membership system (in PHP) for a client. We’re now looking at offering it as a completely hosted solution for our other clients, providing a subdomain (or even their own domain).
The options I seem to have on the table, as far as data storage goes are:
Option 1 – Store everything in 1 big database, and have a ‘client_id’ field on the tables that need it (there would be around 30 tables that it would apply to), and have a ‘clients’ table storing their main settings, details, etc and the domain to map to them. This then just sets a globally accessible variable containing their individual client id – I’d obviously have to modify every single query to check for the client_id column.
Option 2 – Have a master table with the ‘shared reference’ tables, and the ‘clients’ table. Then have ‘blocks’ of other databases, which each contain, say 10 clients. The client would get their own database tables, prefixed with their client ID. This adds a little bit of security to protect against seeing other client data if something went really wrong.
Option 3 – Exactly the same as option 2, except you have 1 database for each and every client, completely isolating them from other clients, and theoretically providing a bit more protection that if 1 client’s tables were hacked or otherwise damaged, it wouldn’t affect anyone else. The biggest downside is that when deploying a new client, an entire database, user and password need setting up, etc. Could this possibly also cause a fair amount of overhead, or would it be pretty much the same as if you had everyone in one database?
A few points as well – some of these clients will have 5000+ ‘customers’ along with all the details for those customers – this is why option 1 may be a bit of an issue – if I’ve got 100 clients, that could equal over half a million rows in 1 table.
Am I correct in thinking Option 3 would be the best way to go in a situation where security of customer data (and payment information) is key. From recommendations I’ve had, a few people have said to go with option 1 because ‘its easier’ however I really don’t see it that way. I see it as a potential bottleneck down the line, as surely I can move clients around much easier if they have their own database.
(FYI The system is PHP based with MySQL)
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
Option 3 is the most scalable. While at first it may seem more complicated, it can be completely automated and will save you headaches on the future. You can also scale more efficiently by having client databases on multiple servers for increased performance.
Method 2
I agree with Ozzy – I did this for an online database product. We had one master database that basically had a glorified user table. Each customer had their own database. What was great about this is that I could move one customers database from server A to server B easily [mysql] and could do so with command line tools in a pinch. Also doing maintenance on large tables, dropping/adding indexes can really screw up your application, especially if say, adding an index locks the table [mysql]. It affects everyone. With, presumably, smaller databases you are more immune to this and have more options when you need to roll out schema level changes. I just like the flexibility.
Method 3
When many years ago I designed a platform for building SaaS applications in PHP, I opted for the third option: multi tenant code and single tenant databases.
In my experience, that is the most scalable option, but it also needs a set of scripts to propagate changes when updating code, DB schemes, enabling applications to a tenant, etc.
So a lot of my effort went in building a component based, extensible engine to fully automate all those tasks and minimize system administration stuff. I strongly advise to build such an architecture if you want to adopt the third option.
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0