In this article, we are going to talk about a system for performing authentication and authorization securely. To start, let's understand the difference between authentication and authorization.
In this article, we will see:
Authentication and Authorization (source: Jeffrey Marvin Forones|Geek Culture. Modified)
Let's say we are in a meeting and you are leading the conversation. To ask the right person for an update or status, you need to identify (i.e., authenticate) that person. Likewise, before sharing sensitive data with someone, you need to be sure who they are. That is where authentication comes in.
Now say that, in the same meeting, a few decisions need to be made. Only the people who have the right to make those decisions should make the call; we can't allow everyone to do everything. Some people are not equipped to make certain decisions, and some will try to abuse the opportunity. That brings us to authorization, which gives certain people permission to perform certain activities.
Token Based Authentication; Access Token and Refresh Token [SPA=SinglePageApplication, RT=RefreshToken, AT=AccessToken, RS=RefreshServer, AS=AuthorizationServer] (Source : Okta)
To authenticate a person, we can assign each person a unique secret phrase. If the person gives both their name and the phrase correctly, we can say we have identified them. This is the usual username and password approach. When the right credentials are given, the system considers the identity valid and grants access. This is known as single-factor authentication (SFA, or 1FA).
SFA is considered fairly insecure. Why? Because users are notoriously bad at keeping their login information secure. Multi-factor authentication (MFA) is a more secure alternative that requires users to prove their identity in more than one way. Some such ways are:
Once authenticated, the person will keep performing actions freely on the application, and the application is expected to recognize them throughout their journey. It would be too much to ask the user for their password every time they move to a different page or perform an activity. So we need a way to keep the user authenticated after they have entered their credentials once. This is called session management.
2 ways to keep the user authenticated:
The main difference between these two approaches is that token-based authentication is stateless, because the token need not be stored on the server side. With session-based authentication, the session has to be stored on the server side as well, which makes it stateful. That brings complications when the system is scaled or the number of users grows.
For token-based authentication, we mostly use JWTs (JSON Web Tokens).
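To make the stateless property concrete, here is a minimal sketch of issuing and verifying an HS256-signed JWT using only the Python standard library. The function names `issue_token` and `verify_token` and the claim values are assumptions for illustration; a real system would normally rely on a maintained JWT library. Note that the server needs only the shared secret to verify a token, not a stored copy of it.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature for the given claims."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature and expiry are valid, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = _sign(f"{header_b64}.{payload_b64}", secret)
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
    except ValueError:  # wrong part count, bad base64, bad JSON, non-ASCII
        return None
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    if not isinstance(claims, dict):
        return None
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if exp is not None and current >= exp:
        return None  # token has expired
    return claims
```

A short-lived access token could be issued with `issue_token({"sub": "user-1", "exp": time.time() + 900}, secret)` and checked on each request with `verify_token(token, secret)`.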
Role Based Authorization Control (RBAC) (source: Ajay Shekhawat | Dribble)
Once the user is authenticated, we still need to ensure they are only allowed to access resources that they have permission to access. Unauthorized access to sensitive data can be a disaster. Following the principle of least privilege, companies usually set up access policies such that by default you have access only to what is absolutely required for you, and any additional access is granted on top of that. Common ways to segment access are:
A screen for Access Control Lists (source: Povio)
ACL is frequently used at a more granular level than either ABAC or RBAC, for example to grant individual users access to a certain file. ABAC and RBAC are generally instituted as company-wide policies.
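The difference in granularity between RBAC and ACL can be sketched as follows. All role names, permission strings, users, and file paths here are hypothetical and exist only for illustration: RBAC answers "does any of this user's roles carry the permission?", while an ACL answers "is this user listed on this specific resource?".

```python
# Hypothetical role -> permission mapping (RBAC, policy-wide).
ROLE_PERMISSIONS = {
    "viewer": {"report:read"},
    "editor": {"report:read", "report:write"},
    "admin": {"report:read", "report:write", "user:manage"},
}

# Hypothetical user -> roles assignment.
USER_ROLES = {"alice": {"admin"}, "bob": {"viewer"}}

# Hypothetical per-resource ACL: resource -> user -> allowed actions.
RESOURCE_ACL = {"/files/budget.xlsx": {"bob": {"read"}}}


def rbac_allows(user: str, permission: str) -> bool:
    """True if any role assigned to the user grants the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


def acl_allows(user: str, resource: str, action: str) -> bool:
    """True if the user is explicitly granted the action on this resource."""
    return action in RESOURCE_ACL.get(resource, {}).get(user, set())
```

Both checks deny by default: an unknown user, role, or resource yields `False`, which is in line with the principle of least privilege.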
Authentication and Authorization System Design (source: InterviewPen. Modified)
Let's start by defining the functional requirements of the system:
A few Non-functional requirements that we are not going to consider for the scope of this article are:
Traffic Estimation
Let's start with traffic estimation. We assume an average traffic of 100,000 requests per month, which translates to about 0.04 requests per second. We need to respond to each request within 500ms 90% of the time, i.e., we require a p90 latency of 500ms.
assumed_traffic_per_month = 100000 #requests
assumed_traffic_per_day = assumed_traffic_per_month / 30
~= 3350 (assuming on higher end; 3333.33 to be precise)
estimated_time_per_request = 500 #ms; P90 of 500ms
traffic_per_second = (assumed_traffic_per_month) / (30*24*60*60)
= 0.04
Service Level Objective (SLO): 500ms (the maximum acceptable latency, irrespective of the load on the system). Based on our calculations, one instance takes approximately 35ms on average to serve a request, assuming the request involves no heavy processing.
Let's derive two more metrics from the ones above.
Thus,
SLO = 500ms
approx_response_time_for_one_request = 35 #ms
capacity = SLO / approx_response_time_for_one_request
= 500 / 35
~= 14
load_on_one_instance = 0.04
instances_available = 1
demand = traffic_per_second / instances_available
= 0.04
With the demand and capacity available, let's calculate the total number of instances required.
total_units_required = demand / capacity
= 0.04 / 14
~= 0.003
~= 1 (rounded up to a whole instance)
Thus, we can easily handle 100k requests per month, at 0.04 requests per second, with 1 instance, since each instance can serve roughly 14 requests back to back within the 500ms SLO.
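The estimate can be reproduced as a short script. This is a minimal sketch that follows the article's own definitions (capacity as the SLO divided by the per-request time, demand as traffic per second per available instance); the function name `estimate_instances` and its return shape are hypothetical.

```python
import math

SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 30-day month, as assumed in the article


def estimate_instances(
    requests_per_month: int,
    slo_ms: float,
    service_time_ms: float,
    instances_available: int = 1,
) -> dict:
    """Return the derived traffic, capacity, demand, and instance count."""
    traffic_per_second = requests_per_month / SECONDS_PER_MONTH
    # Capacity: how many sequential requests fit inside the SLO window.
    capacity = slo_ms / service_time_ms
    # Demand: per-second load carried by each available instance.
    demand = traffic_per_second / instances_available
    # A fraction of an instance still needs one whole instance.
    instances_required = max(1, math.ceil(demand / capacity))
    return {
        "traffic_per_second": traffic_per_second,
        "capacity": capacity,
        "demand": demand,
        "instances_required": instances_required,
    }


if __name__ == "__main__":
    for name, value in estimate_instances(100_000, 500, 35).items():
        print(f"{name}: {value:.4f}" if isinstance(value, float) else f"{name}: {value}")
```

Keeping the inputs as parameters makes it easy to rerun the estimate when the assumed traffic or the measured per-request time changes.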
Storage Estimation
We need to store the user details for each user for authentication and authorization. Assuming 5KB per user:
monthly_new_users = 500
monthly_additional_storage = 500 * 5KB
= 2500KB
~= 2.5MB
So every month, assuming we onboard 500 new users, we will require about 2.5MB more storage. We may also want to keep authentication logs. Each authentication request is expected to take 2KB to store.
auth_request_size = 2kb #assumption
monthly_storage = monthly_visitors * auth_request_size
= 100,000 * 2KB
~= 200MB
Thus, each month we would require an additional 200MB for authentication logs, assuming a monthly traffic of 100k requests.
Now that we have the capacity estimation done, let's create the database schemas required to support the functional requirements.
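The storage arithmetic can be checked with a few lines of code. This sketch uses the article's assumed sizes (5KB per user record, 2KB per logged authentication request) and decimal units (1MB = 1000KB); the function name `monthly_storage_mb` is hypothetical.

```python
KB_PER_MB = 1000  # decimal units, matching the rounding used in the estimate


def monthly_storage_mb(item_count: int, size_kb_per_item: float) -> float:
    """Storage added per month, in MB, for item_count items of a given size."""
    return item_count * size_kb_per_item / KB_PER_MB


# Assumed inputs from the estimate.
user_storage_mb = monthly_storage_mb(500, 5)      # new user records
log_storage_mb = monthly_storage_mb(100_000, 2)   # authentication logs

if __name__ == "__main__":
    print(f"user records: {user_storage_mb} MB/month")
    print(f"auth logs:    {log_storage_mb} MB/month")
```

Under these assumptions the authentication logs, not the user records, dominate storage growth.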
Authentication and Authorization Database Schema
Let's quickly go over the tables. We are using 6 tables.
Authentication and Authorization HLD
System Endpoints
Endpoint | Description |
---|---|
/login | Authenticate user credentials. |
/logout | End user session and revoke authentication tokens. |
/register | Create a new user. |
/update/:userId | Update user information. |
/delete/:userId | Delete a user account. |
/grant/:userId/:permission | Grant specific permissions to a user. |
/revoke/:userId/:permission | Revoke permissions from a user. |
/check/:userId/:resource | Check user’s access to a specific resource. |
/create/:userId | Create a new user session. |
/expire/:sessionId | Expire a user session. |
/validate/:sessionId | Validate an active user session. |
Requirements Fulfilment
Now, with everything in place, let's see how we can fulfil all the requirements.
Registration
Login
Session Management
Password Recovery
Access Control
Audit Trail
Performance
Authentication vs Authorization (source: OutSystems)
In this article, we started by understanding the difference between authentication and authorization. Next, we designed an authentication and authorization system that is secure, performs well, follows industry standards, and meets all the desired requirements. Going forward, I might update parts of the article to keep it relevant and to cover more information and insights on building such a system.
This article was originally published by Arunava on HackerNoon.