Three ways of building and altering a users table

Synopsis

John Doe just started a new job as a CTO in a Uber-like startup. He will have to scale, disrupt and innovate. But for now, he needs a users table.

John could have met the day-to-day needs in an infinite number of ways. Let’s examine 3 scenarii amongst others (John chooses PostgreSQL).

1st Scenario — Bigmouth Buffalo’s Path

Day 1

John creates a users table with 6 columns: id (uuid, primary key), email (varchar), password (varchar), role (enum: customer, driver, admin), created_at and updated_at (timestamps).

Day 2

John alters the users table and adds a phone column (varchar, nullable). Phone is nullable: the users created on day one do not have a phone number, and John will maybe never know their phone numbers.

Day 3

John alters again the users table and adds country (varchar, nullable), gender (enum, nullable), and company (varchar, nullable).

Day 4

Biographies are only for drivers. There are at least two possibilities:

  1. Create a nullable bio text column (NULL for non-drivers)
  2. Because it’s only for drivers, create a drivers_biographies table with FK on users and a bio text column.

Each possibility has its pros and cons, John chooses to add a new table because he dislikes having too much NULL and he is certain he never will add biography to other account types. Maybe he is wrong.

Day 5

Clients can now ask for custom fields on their users. So it’s some kind of dynamic variable properties. John could add nullable new column on demand on its users table (client ask for users’ pets names, John adds a pet_name column). But it does not sound scalable to him, maybe he is wrong.

So, he decided to create two new tables: properties and users_properties. The properties table has 2 columns:

The users_properties has 3 columns:

Now every time a client ask for new properties on its users, John’s application adds a new entry in properties table. Then, for each new user with custom properties, some lines are added in users_properties table for its own properties (EAV-like pattern).

Post-mortem — Mary Foobar analysis

There seems to be some problems with this approach:

Quick and dirty implementation of the Bigmouth Buffalo’s Path.

Maybe John could have built a better architecture.

2nd Scenario —Bluefish’s path

Day 1

John is foresighted. As of Buffalo Mouth’s Path, he creates a users table with id, email, password and role. But he also creates immediately a users_properties table (Entity-Attribute-Value model), similar to Wordpress approach. users_properties is a table with user_id (foreign key on users table), key (varchar, example: name, company, phone, etc.), and value (text).

Day 2, 3, 4, 5

John is relaxed, he does not have to alter its table schemas. Its program just adds new users in database with their new properties. He created an admin panel to quickly add new properties.

Post mortem — Mary Foobar Analysis

There is one benefit over the previous design. It’s simple to explain and understand, each property is located in one unique table. EAV is a well known pattern, even if John had not heard about it, he would have invented it himself. There are only two tables for describing users metadata, and there will not be more. But there are also caveats:

SELECTname.meta_value AS name,FROMusers_properties age,users_properties name,users_properties genderWHERE age.key = ’age’AND age.value = ’50'AND age.user_id = name.user_idAND gender.user_id = age.user_idAND gender.key = ’gender’ and gender.value = ’f’

Unreadable random table found on Google Images

3rd Scenario — Rainbow Trout’s path

Day 1

As many else, John took an interest years ago in “NoSQL”. He tried and abandoned MongoDB to return to its first love, PostgreSQL. He remembers feeling schema-less data can have benefits in some cases, especially in variable metadata. So this time, John creates a users table with id, email, password, role, metadata (JSONB) created_at and updated_at. Its metadata column is schema-less, he could store objects like:

{"phone": "+33612345678", "company": "My company", "gender": "f"}

Day 2, 3, 4, 5

Table schema does not change.

Post mortem — Mary Foobar Analysis

There are some benefits with this design over previous paths:

There are some warnings too:

{ "bookings": [{ "id": "aaa-bb-cc", "date": "2017–01–12", "duration": "10 days"},{ "id": "xxx-yy-zz", "date": "2017–02–10", "duration": "1 day"},]}

Final Thoughts

In my tinkerer career, I created users table following these three paths (not precisely, but similar) and some other: we could think about dynamically adding column and tables, changing for another DBMS, etc. For the last few months, I’ve been following the third path. I’m sure there is no « right » way, and this path has many hidden caveats too (I should read more) but I’m OK with it for my day-to-day work. I migrated a legacy system with 60+ half-useless tables to about 10 tables by removing metadata tables. Not sure it’s better anyway, and I’m not saying less table is better, but in this specific case, I think code and database are easier to read, create, update and delete. I know I will discover a new path in a few month and be ashamed of what I have done.

Feel free to comment with advices, feedback and criticism. I would be really happy to learn more.

Sorry for long post, here is a potato