This article walks through a system for performing authentication and authorization securely. To start, let's understand the difference between Authentication and Authorization.


In this article, we will see:


Why do Authentication and Authorization matter?

Let's say we are in a meeting and you are leading the conversation. To ask the right person for an update or status, you first need to identify (i.e., authenticate) that person. Likewise, before sharing sensitive data with someone, you need to authenticate them correctly. That is where authentication comes in.

Now say that, in the same meeting, a few decisions need to be made. Only the people who have the right to make those decisions should be the ones making the call; we can't just allow everyone to do everything. Some people are not qualified to make certain decisions, and some will surely try to misuse the opportunity. That is where authorization comes in: it gives certain people permissions for certain activities.

How does Authentication work?

To authenticate a person, we can assign a unique phrase to each person. If the person gives their name and the correct phrase, we can say that we have identified them. This is the usual username-and-password approach: when the right credentials are given, the system considers the identity valid and grants access. This is known as single-factor authentication (SFA), or 1FA.
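The username-and-password check described above can be sketched as follows. This is a minimal illustration, not a production implementation: the in-memory `users` dictionary, the function names, and the PBKDF2 iteration count are all assumptions made for the example. It stores a salted hash rather than the password itself and compares in constant time.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value, tune for your hardware


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash of the password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def make_record(password: str) -> tuple:
    """Create the (salt, digest) pair that would be stored for a user."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)


# Hypothetical in-memory user store: username -> (salt, digest)
users = {"alice": make_record("correct horse battery staple")}


def authenticate(username: str, password: str) -> bool:
    """Return True only if the user exists and the password matches."""
    record = users.get(username)
    if record is None:
        return False
    salt, expected = record
    candidate = hash_password(password, salt)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, expected)
```

With this sketch, `authenticate("alice", "correct horse battery staple")` succeeds, while a wrong password or an unknown username is rejected.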

SFA is considered fairly insecure. Why? Because users are notoriously bad at keeping their login information secure. Multi-factor authentication (MFA) is a more secure alternative that requires users to prove their identity in more than one way. Some such ways are:

Once authenticated, the person will keep performing actions on the application, and the application is expected to recognize them throughout their journey. It would be too much to ask the user for their password every time they move to a different page or perform some activity. So we need a way to keep the user authenticated after they have entered their credentials once. This is called Session Management.

2 ways to keep the user authenticated:

The main difference between these two approaches is that token-based authentication is stateless, because the token need not be stored on the server side. With session-based authentication, the session has to be stored on the server side as well, which makes it stateful. This brings complications when the system is scaled or the number of users grows.
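To make the stateful side of this comparison concrete, here is a minimal sketch of a server-side session store. The in-memory `sessions` dictionary, the 30-minute idle timeout, and the function names are assumptions for illustration; a real deployment would keep this state in a shared store so that every instance can see it, which is exactly the scaling complication mentioned above.

```python
import secrets
import time

SESSION_TTL_SECONDS = 30 * 60  # assumed idle timeout

# Server-side state: session id -> (username, last activity timestamp)
sessions = {}


def create_session(username, now=None):
    """Issue a random, unguessable session id and remember it on the server."""
    session_id = secrets.token_urlsafe(32)
    sessions[session_id] = (username, time.time() if now is None else now)
    return session_id


def validate_session(session_id, now=None):
    """Return the username for a live session, or None if unknown or expired."""
    now = time.time() if now is None else now
    record = sessions.get(session_id)
    if record is None:
        return None
    username, last_seen = record
    if now - last_seen > SESSION_TTL_SECONDS:
        del sessions[session_id]  # idle for too long: forget the session
        return None
    sessions[session_id] = (username, now)  # refresh last activity
    return username


def expire_session(session_id):
    """Explicitly end a session, for example on logout."""
    sessions.pop(session_id, None)
```

Every request has to consult `sessions`, so the server must hold (and share) that state; a token-based design avoids the lookup by making the token itself verifiable.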

For token-based authentication, we mostly use JWTs (JSON Web Tokens).

How does Authorization work?

Once the user is authenticated, we still need to ensure they are only allowed to access resources that they have permission to access. Unauthorized access to sensitive data can be a disaster. Following the principle of least privilege, companies usually set up access policies so that, by default, you have access only to what you absolutely need, and any additional access is granted on top of that. Common ways to segment access are:

ACL is frequently used at a more granular level than either ABAC or RBAC, for example to grant individual users access to a certain file. ABAC and RBAC are generally instituted as company-wide policies.
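The relationship between a broad role-based policy and a granular ACL can be sketched like this. The roles, permission strings, users, and file path are all hypothetical; the point is only that access is denied by default and that the ACL narrows what the role-based policy allows.

```python
# Hypothetical RBAC tables: role -> permissions, user -> roles
ROLE_PERMISSIONS = {
    "admin": {"user:read", "user:write", "report:read"},
    "analyst": {"report:read"},
}
USER_ROLES = {"alice": {"admin"}, "bob": {"analyst"}}

# Hypothetical ACL: resource -> users explicitly granted access
FILE_ACL = {"/reports/q3.pdf": {"bob"}}


def has_permission(user: str, permission: str) -> bool:
    """RBAC check: does any of the user's roles carry this permission?"""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


def can_open_file(user: str, path: str) -> bool:
    """Least privilege: require the role permission AND an explicit ACL entry."""
    return has_permission(user, "report:read") and user in FILE_ACL.get(path, set())
```

Here `bob` can open `/reports/q3.pdf` because he holds the role permission and appears in the file's ACL, while `alice` cannot open it despite being an admin, because nothing grants her that specific file.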

Authentication System Design

Requirements

Let's start by defining the functional requirements of the system:

A few Non-functional requirements that we are not going to consider for the scope of this article are:

Capacity Estimation

Traffic Estimation

Let's start with traffic estimation. We assume an average traffic of 100,000 requests per month, which translates to about 0.04 requests per second. We need to respond to each request within 500ms 90% of the time, i.e., we require a p90 latency of 500ms.

assumed_traffic_per_month = 100_000  # requests
assumed_traffic_per_day = assumed_traffic_per_month / 30  # ~= 3333.33; rounded up to 3350
estimated_time_per_request = 500  # ms; p90 latency target
traffic_per_second = assumed_traffic_per_month / (30 * 24 * 60 * 60)  # ~= 0.04

Service Level Objective (SLO): 500ms (the maximum acceptable latency, irrespective of the load on the system). Based on our calculations, one instance takes approximately 35ms on average to serve a request, assuming the request involves no heavy processing.

Let's derive two more metrics from the ones above.

Thus,

SLO = 500 #ms
approx_response_time_for_one_request = 35 #ms
capacity = SLO / approx_response_time_for_one_request
         = 500 / 35
         ~= 14

load_on_one_instance = 0.04
instances_available = 1
demand = traffic_per_second / instances_available
       = 0.04

With demand and capacity available, let's calculate the total number of instances required.

total_units_required = demand / capacity
                     = 0.04 / 14
                     ~= 0.003
                     ~= 1 (rounded up to a whole instance)

Thus, we can easily handle 100k requests per month, at 0.04 requests per second, with 1 instance. Each instance has a capacity of about 14 requests without compromising the SLO.
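The same estimate can be reproduced as a small function, which makes it easy to re-run with different assumptions. The function and parameter names are made up for this sketch; it uses the same definitions of capacity (SLO divided by per-request time) and demand (requests per second) as the calculation above, and rounds up to a whole instance.

```python
import math


def instances_required(requests_per_month, slo_ms, service_time_ms, days_per_month=30):
    """Estimate the instance count from monthly traffic, SLO, and per-request time."""
    traffic_per_second = requests_per_month / (days_per_month * 24 * 60 * 60)
    capacity = slo_ms / service_time_ms  # capacity as defined above
    demand = traffic_per_second          # demand as defined above
    # Always provision at least one instance, and never a fraction of one.
    return max(1, math.ceil(demand / capacity))
```

For the numbers in this article, `instances_required(100_000, slo_ms=500, service_time_ms=35)` evaluates to 1.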

Storage Estimation

Ideally, we need to store the details of each user for authentication and authorization. Assume 5KB per user.

monthly_new_users = 500
monthly_additional_storage = 500 * 5KB
                           = 2500KB
                           ~= 2.5MB

So every month, assuming we onboard 500 new users, we will require about 2.5MB more storage. We may also want to keep authentication logs. Each authentication request is expected to take 2KB to store.

auth_request_size = 2kb #assumption
monthly_storage = monthly_visitors * auth_request_size
                = 100,000 * 2KB
                ~= 200MB

Thus, each month we would require an additional 200MB for authentication logs, assuming a monthly traffic of 100k requests.
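Unit slips are easy to make in estimates like these, so it can help to let a short script do the conversion. The helper names below are invented for this sketch, and it assumes decimal units (1KB = 1000 bytes).

```python
KB = 1000  # bytes; decimal units are an assumption of this sketch


def monthly_storage_bytes(item_count, item_size_kb):
    """Bytes needed per month for item_count records of item_size_kb each."""
    return item_count * item_size_kb * KB


def format_bytes(n_bytes):
    """Render a byte count in the largest unit that keeps the value below 1000."""
    value = n_bytes
    for unit in ("B", "KB", "MB", "GB"):
        if value < 1000 or unit == "GB":
            return f"{value:g} {unit}"
        value /= 1000


user_storage = monthly_storage_bytes(500, 5)      # 500 new users at 5KB each
log_storage = monthly_storage_bytes(100_000, 2)   # 100k auth requests at 2KB each
print(format_bytes(user_storage), format_bytes(log_storage))
```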

Database Design

Now that the capacity estimation is done, let's create the database schemas required to support the functional requirements.

Let's quickly go over the tables. We are using 6 tables.

  1. Users - To store all the user information.
  2. Credentials - To store the access/refresh credentials once the user has been authenticated.
  3. Passwords - To store each user's password in hashed form.
  4. PasswordRequests - To store the password change requests that come in for a particular user.
  5. Sessions - To store when the user had an active session and when their last activity was.
  6. ActivityApproval - To store approval requests for an activity performed by a particular user, to be verified by an admin.
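One possible shape for these six tables is sketched below using SQLite. Only the table purposes come from the list above; every column name, type, and constraint is an assumption made for illustration, and the real schema may differ.

```python
import sqlite3

# All columns below are hypothetical; only the six table roles come from the article.
SCHEMA = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    email TEXT NOT NULL UNIQUE,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE credentials (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    refresh_token_hash TEXT NOT NULL,
    expires_at TEXT NOT NULL
);
CREATE TABLE passwords (
    user_id INTEGER PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
    password_hash BLOB NOT NULL,
    salt BLOB NOT NULL
);
CREATE TABLE password_requests (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    requested_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    status TEXT NOT NULL DEFAULT 'pending'
);
CREATE TABLE sessions (
    id TEXT PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    started_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    last_activity_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE activity_approvals (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    activity TEXT NOT NULL,
    approved_by INTEGER REFERENCES users(id),
    status TEXT NOT NULL DEFAULT 'pending'
);
"""


def create_database(path=":memory:"):
    """Open a SQLite database and create the six tables."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
    conn.executescript(SCHEMA)
    return conn
```

Keeping password hashes in their own table, separate from general user details, means routine queries against `users` never touch credential material.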

High-Level Design for the Authentication System

System Endpoints

| Endpoint | Description |
| --- | --- |
| /login | Authenticate user credentials. |
| /logout | End user session and revoke authentication tokens. |
| /register | Create a new user. |
| /update/:userId | Update user information. |
| /delete/:userId | Delete a user account. |
| /grant/:userId/:permission | Grant specific permissions to a user. |
| /revoke/:userId/:permission | Revoke permissions from a user. |
| /check/:userId/:resource | Check user’s access to a specific resource. |
| /create/:userId | Create a new user session. |
| /expire/:sessionId | Expire a user session. |
| /validate/:sessionId | Validate an active user session. |
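As a sketch of how the `:param` placeholders in these endpoint paths could be resolved, the snippet below compiles each path into a regular expression and extracts the parameters from an incoming request path. The matching approach and function names are assumptions for illustration; a web framework would normally provide this routing, and HTTP methods are left out because they are not specified here.

```python
import re

ROUTES = [
    "/login",
    "/logout",
    "/register",
    "/update/:userId",
    "/delete/:userId",
    "/grant/:userId/:permission",
    "/revoke/:userId/:permission",
    "/check/:userId/:resource",
    "/create/:userId",
    "/expire/:sessionId",
    "/validate/:sessionId",
]


def compile_route(pattern):
    """Turn '/grant/:userId/:permission' into a regex with named groups."""
    return re.compile(re.sub(r":(\w+)", r"(?P<\1>[^/]+)", pattern))


COMPILED_ROUTES = [(pattern, compile_route(pattern)) for pattern in ROUTES]


def match_route(path):
    """Return (route pattern, extracted params) for a path, or None if nothing matches."""
    for pattern, regex in COMPILED_ROUTES:
        match = regex.fullmatch(path)
        if match:
            return pattern, match.groupdict()
    return None
```

For example, `match_route("/grant/42/read")` returns the `/grant/:userId/:permission` pattern together with `{"userId": "42", "permission": "read"}`, and an unknown path returns `None`.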

Requirements Fulfilment

Now, with everything in place, let's see how we can fulfil each of the requirements.

Registration

Login

Session Management

Password Recovery

Access Control

Audit Trail

Performance

Conclusion

In this article, we started by understanding the difference between Authentication and Authorization. Next, we designed an Authentication and Authorization system that is safe and secure, delivers performance, follows industry standards, and meets all the desired requirements. Going forward, I might update certain parts of the article to keep it relevant and to cover more information and insights on building such a system.