You may be surprised, but the way you manage data in your app plays a crucial role in delivering a positive user experience. At the end of the day, it doesn’t matter how well your app’s interface is designed and how clean your code is unless your application is capable of quickly retrieving, processing,and delivering information. Moreover, all of this data should be protected so that intruders can’t get their hands on it. Luckily, this can be achieved with a wisely chosen database management system.
A database is a place where you store and organize all the data you’ve collected through your app, while a database management system (DBMS) is software for conveniently managing this database.
There are more than 300 solutions on the market. Going through such a crazy number of tools to choose the best option is truly overwhelming. But the nice thing is that you don’t have to. We’ve done it for you and will share our findings. So if you’re about to start a new project and need to pick the right database management system, this article is for you.
Types of databases
Let’s find out what types of databases exist. Different databases offer different structures, data models, and data stores. Below, we discuss types of databases you can use in your product.
A relational database is a set of tables that have predefined relations between them. To maintain and query the database, it uses Structured Query Language (SQL), that is a common user application that provides an easy programming interface for database interaction.
Relational databases consist of rows called tuples and columns called attributes. Tuples of a relation have the same attributes. A relational database is ideal for storing structured data. Structured data is highly organized and has a strict set of length-delimited values, automatically generated or prompted by a user. Zip codes, credit card numbers, dates, and ID numbers are typical examples of structured data.
Relational databases are a mature technology: they’re well-documented, boast great support, and work well with most modern frameworks and libraries. The brightest examples are PostgreSQL and MySQL. Both have proven stable and secure. MySQL is supported by Oracle and has libraries such as MeekroDB to ease development.
Another great advantage of relational databases is security. They support access permissions, which define who is allowed to read and edit a database. A database administrator can grant a particular user privileges to access, select, insert, or delete data. This gives no chance for third parties to steal information from a database.
Using an RDBMS protects against data loss and corruption thanks to a relational database’s ACID compliance: atomicity, consistency, isolation, and durability. To better understand what this means, let’s assume that two buyers are trying to buy a red dress of the same size simultaneously. ACID compliance ensures that these transactions won’t overlap each other.
Atomicity means that each transaction (a sequence of one or more SQL operations) is treated as a unit. It can either fail completely or succeed completely, and if one of the operations fails, the whole transaction fails. When a user purchases an item, money is withdrawn from the user’s account and deposited to the merchant’s account. Atomicity ensures that if the deposit transaction fails, the withdrawal operation won’t take place.
Consistency means that only valid data that follows all rules can be written in the database. If input data is incorrect, the database returns to its state before the transaction. This ensures that illegal transactions can’t corrupt the database.
Isolation means that unfinished transactions remain isolated. It ensures that all transactions are processed securely and independently.
Durability means that the data is saved by the system even if the transaction fails. Thanks to durability, data won’t be lost even if the system crashes.
[This is what ACID stands for]
ACID compliance is beneficial for apps handling financial, healthcare, and personal data since it automatically provides safety and privacy to users. Thanks to all these advantages, relational databases are a perfect fit for financial and healthcare projects.
But relational databases have disadvantages as well:
1. Relational databases lack flexibility; they don’t work efficiently with semi-structured or unstructured data, so they aren’t a good fit for large loads and IoT analytics.
2. When a relational database becomes complex, it becomes harder to share information from one large system to another. At big institutions, relational databases often grow independently in separate divisions.
3. Relational databases are run only on one server, which means that if you want your DBMS to cope with a larger amount of data, you need to invest in costly physical equipment.
These drawbacks forced developers to search for alternatives to relational databases. As a result, NoSQL and NewSQL databases emerged.
NoSQL databases, also called non-relational or distributed databases, serve as an alternative to relational databases. In contrast to relational databases, NoSQL databases can store and process unstructured data, giving developers more flexibility and scalability.
Unstructured or semi-structured data doesn’t have a predefined data model but still has some kind of structure. There are lots of examples of unstructured and semi-structured data: emails, data from social media, photos, MP3 files, and so on.
Data in non-relational databases can be changed on the fly without affecting existing data. NoSQL databases can be run across several servers, so scaling them is cheaper and easier than scaling SQL databases. And since NoSQL databases don’t rely on a single server, they’re more fault-tolerant. This means that in case one component fails, the database can continue operating.
But NoSQL databases are less mature than SQL databases, and the NoSQL community isn’t as well defined. Also, NoSQL databases often sacrifice ACID compliance for availability and flexibility.
NoSQL databases can be divided into four types:
This is the simplest type of NoSQL database, which can store only key-value pairs and offers basic functionality for retrieving the value associated with a specific key. A key-value store is a great option if you want to quickly find information with a key. Amazon DynamoDB and Redis are the brightest examples of key-value stores.
Their simple structure makes these databases extremely scalable. There’s no connection between values and they don’t require construction schemes, so the number of values is limited only by computing power. That’s why key-value stores are used by hosting providers like ScaleGrid, Compose, and Redis Labs. They’re also a good option for storing blog comments, product reviews, user profiles, and settings. Developers use key-value stores to cache data, since cached files of different users aren’t connected. Key-value stores are optimized for horizontal scaling, which means that you need to add more machines to store more data. This is less costly than scaling relational databases, but may lead to high utility costs for cooling and electricity.
But the simplicity of key-value stores can also become a disadvantage. With a key-value store, it’s hard or even impossible to perform the majority of operations available in other types of databases. While searching by keys is really fast, it can take much longer to search by values.
In most cases, key-value stores are used in combination with a database of another type. In the eCuris and KPMG apps, we used the Redis key-value store in combination with the PostgreSQL relational database management system.
Document-oriented databases store all information related to a given object in a single BSON, JSON, or XML file. Documents of the same type can be grouped into so-called collections or lists. These databases allow developers not to worry about data types and strong relations.
A document-oriented database usually has a tree or forest structure. A tree structure means that a root node has one or several leaf nodes. A forest structure consists of several trees. These data structures help document stores perform fast search. While it makes it difficult to manage complicated systems with numerous connections between elements, it lets developers create document collections by topic or type.
For instance, if you’re creating a music streaming app, you can use a document-oriented database to create a collection of songs by Rihanna so users can easily and quickly find her tracks in your app.
To be flexible, document-oriented databases neglect ACID guarantees.
Thanks to their structure and flexibility, document-oriented databases are commonly used for content management, rapid prototyping, and data analysis.
A columnar database is optimized for fast retrieval of columns of data. Column-oriented storage inverts the relational approach, so column-oriented databases store each column as a logical array of values. Databases of this type provide high scalability and can be easily duplicated.
[And example of wide-column store]
A column store deals well with both structured and unstructured data, making database exploration as simple as possible. Columnar databases process analytical operations really fast but show bad results when handling transactions. Apache Cassandra and Scylla are among the most popular column stores.
In a graph store, each entity, which is called a node, is an isolated document with free-form data. Nodes are connected by edges that specify their relationships.
This approach facilitates data visualization and graph analytics. Usually, graph databases are used to determine the relationships between data points. Most graph databases provide features such as to find a node with the most connections and to find all connected nodes.
[An example of graph store]
NewSQL – combining the best of SQL and NoSQL databases
Particular attention should be given to NewSQL, a class of relational databases that combines features of both SQL and NoSQL databases.
NewSQL databases are geared toward solving common problems of SQL databases related to traditional online transaction processing. From NoSQL, NewSQL inherited optimization for online transaction processing, scalability, flexibility, and a serverless architecture. Like relatonal databases, NewSQL databases are ACID-compliant and consistent. They have the ability to scale, often on demand, without affecting application logic or violating the transaction model.
NewSQL was introduced only in 2011, and it still isn’t that popular. It has only partial access to the rich SQL tooling. Flexibility and a serverless architecture combined with high security and availability without requiring redundant system increase the chances for NewSQL databases to become a next-gen solution for cloud technologies.
ClustrixDB, CockroachDB, NuoDB, MemSQL, and VoltDB are the most popular NewSQL databases.
Databases Yalantis uses
At Yalantis, we use four of these top-notch systems: Redis, PostgreSQL, MongoDB, and MySQL. Let’s take a closer look at them.
[Databases Yalantis uses]
MySQL is an open-source relational database created back in 1995 and now managed by Oracle. MySQL has a huge user base and great support, and it works well with most libraries and frameworks. It’s free, but it offers additional functionality for a fixed price.
Developers can install and use MySQL without spending long hours setting it up. Most tasks can be done in the command line. MySQL is a well-structured database with regular updates.
MySQL works perfectly with structured data at the basic level. But if you’re considering scaling your product in the future, you may need additional support, which costs a pretty penny. Also, it takes a lot of time to create incremental backups or change the data architecture in MySQL, while its rivals can do this automatically.
Uber, Facebook, Tesla, YouTube, Netflix, Spotify, Airbnb, and many other companies use MySQL for their services.
This is an object-relational database, which means that it’s similar to relational databases, only all data is represented in the form of objects instead of columns and rows.
PostgreSQL is an ideal solution for large systems, since it’s scalable and designed to handle terabytes of data. A hierarchy of roles to maintain user permissions means advanced security.
Unlike MySQL, PostgreSQL is completely free to use. Its open-source nature means that all documentation and support are provided by enthusiastic volunteers. It also means that in case you have problems with PostgreSQL, you’ll need to search for an expert who can solve your problem.
We migrated World Cleanup, an app for managing the World Cleanup Day event, from CouchDB to PostgreSQL. Migrating to PostgreSQL let us not only perform in and out operations simultaneously but also easily handle high loads.
Redis is an open-source key-value store that’s often used as a caching layer to work with another database. The main reason why developers opt for this tool is its speed, which far outstrips other database management systems. It’s easy to set up, configure, and use.
But Redis lacks built-in encryption and stores only five data types: lists, sets, sorted sets, hashes, and strings. The main purpose of Redis is to store data sets without a complex structure. That’s why this tool is usually paired with another database management system. Since Redis is a great solution for caching, we use it for this purpose in most of our projects, including in the KPMG, Husto, Mikitsune, and eCuris apps.
MongoDB is the #1 NoSQL database we use in our projects. All data is stored in BSON (Binary JSON) documents. Thanks to this, data can be easily transferred between web applications and servers in human-readable format.
MongoDB has onboard replication, providing high scalability and availability. Auto-sharding means that you can easily distribute data to servers connected with your app. In general, MongoDB is a great solution for dealing with massive unstruсtured data sets. It can underpin most big data systems, not only as a real-time, operational data store but in offline capacities as well.
But there are also several pitfalls of MongoDB. It stores key names for each value pair, increasing memory use. Also, there are no foreign key constraints to enforce consistency. And you can perform nesting for no more than 100 levels.
In combination with Redis, we used this database in Boothapp, a social e-commerce platform for the Middle Eastern market.
Mixing and matching databases
You can use several databases in one project. But combining two databases is not always a good idea. Developers should make this decision only after carefully analyzing a project’s needs and defining the product’s tech stack.
Redis is often used in combination with another databases. We used Redis in combination with PostgreSQL for eCuris, a medical platform to connect patients and medical professionals. We chose Redis for cache and token storage since it works faster than most modern databases. For the same reason, we used Redis together with PostgreSQL when developing an app for KPMG. We commonly use this pair in our projects, since we can quickly and easily make references from Redis to PostgreSQL.
Using MongoDB and PostgreSQL is a bad idea, since these databases are equal in terms of resource use and data storage. For instance, say you have a social network like Instagram and need to store information about posts, likes, followers, and user profiles. You store data about likes and posts in MongoDB, while user profiles and followers are stored in PostgreSQL. First, you need to take data about profiles from PostgreSQL, then data about posts from MongoDB, which is a time-consuming and inefficient solution.
There are several aspects you should pay attention to when choosing your database management tool:
Data types. This is probably the first thing you should consider when deciding on a database for your project. SQL database management systems are perfectly suited for storing and processing structured data, while NoSQL databases are the best solution for working with unstructured or semi-structured data. If you will be managing both structured and unstructured data, you can opt for mixing SQL and NoSQL databases.
Scalability. As your web product grows,its database should grow as well. Your choice of database may be affected by the type of scaling you prefer, whether horizontal or vertical. Non-relational databases with their key-value stores are optimized for horizontal scaling, while relational databases are optimized for vertical scaling.
Integration. Make sure that your database management system can be integrated with other tools and services within your project. In most cases, poor integration with other solutions can stall development. For instance, ArangoDB has excellent performance, but libraries for this DBMS are still very young and lack support. Using ArangoDB in combination with other tools may be risky, so the community suggests avoiding ArangoDB for complex projects.
Security. As it stores all user data, a database should be well-protected. ACID-compliant relational databases are more secure than non-relational databases, which trade consistency and security for performance and scalability. So you need to take additional steps to secure your NoSQL database.
As you can see, the choice of a database depends on many factors, including the types of data you’re going to collect and process, integrations with other tools, and the scaling approach you follow. It’s not just a question of SQL or NoSQL, as many think. And even though proper data management may not be the first thing you consider when optimizing the user experience, it definitely should be. We can help you find the best possible database solution for your web or mobile project.