Getting started with NoSQL: about NoSQL - Alibaba Cloud Developer Community

It has been almost a year since I applied for the NoSQL column, and I still have not posted a single article. I noticed this today, so I had better write one to start things off. Many people use NoSQL now, so it is probably no longer unfamiliar, and Chinese-language material on it is everywhere. Yet NoSQL-related topics on Zhihu have very few answers. Perhaps people pay more attention to applying the practical technologies and overlook the essence of the concept.

What is NoSQL?

Baidu Encyclopedia says that NoSQL refers to non-relational databases, and it lists the English name as NoSQL = Not Only SQL.

Wikipedia says: A NoSQL (originally referring to "non SQL" or "non relational") database provides a mechanism for storage and retrieval of data which is modeled in means other than the tabular relations used in relational databases.

In other words, a NoSQL database stores and retrieves data using models other than the tabular relations of a relational database.

The NoSQL Ultimate Guide (nosql-database.org) says: NoSQL DEFINITION: Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable.

That is, NoSQL denotes next-generation databases that mostly address a few key points: being non-relational, distributed, open source, and horizontally scalable.

The original intention has been modern web-scale databases. The movement began early 2009 and is growing rapidly. Often more characteristics apply such as: schema-free, easy replication support, simple API, eventually consistent / BASE (not ACID), a huge amount of data and more. So the misleading term "nosql" (the community now translates it mostly with "not only sql") should be seen as an alias to something like the definition above.

Put plainly, the original goal was databases for the modern web scale. The movement began in early 2009 and is growing rapidly. Further characteristics usually apply as well: schema-free (no predefined schema), easy replication support, a simple API, eventual consistency / BASE (rather than ACID), and support for huge amounts of data. The misleading term "NoSQL" (which the community now mostly reads as "not only SQL") should therefore be seen as an alias for something like the definition above.

Past Lives

NoSQL has become popular only in recent years and has grown rapidly. But when did it first appear?

Such databases have existed since the late 1960s, but did not obtain the "NoSQL" moniker until a surge of popularity in the early twenty-first century.

That is, databases of this kind have existed since the late 1960s, but they only acquired the "NoSQL" label during a surge of popularity in the early twenty-first century.

However, earlier application scenarios were better suited to relational databases, so most people neither needed NoSQL databases nor knew about them.

The word NoSQL first appeared in 1998 as the name of a lightweight, open-source relational database developed by Carlo Strozzi that does not provide an SQL interface. (Strozzi believes that because the NoSQL movement departs from the traditional relational database model, it should have a brand new name, such as "NoREL" or something similar.) In 2009, Johan Oskarsson of Last.fm initiated a discussion about distributed open-source databases, and Eric Evans of Rackspace proposed the NoSQL concept again. At this point NoSQL mainly referred to non-relational, distributed database designs that do not provide ACID guarantees. The "no:sql(east)" seminar held in Atlanta in 2009 was a milestone, and its slogan was "select fun, profit from real_world where relational=false;". The most common explanation of NoSQL is therefore "non-relational", which emphasizes the advantages of key-value stores and document databases rather than simply opposing relational databases.

Reason for birth

With the rise of Web 2.0 websites, traditional relational databases struggled to cope, especially with very large, highly concurrent, purely dynamic SNS-type Web 2.0 sites, and they exposed many problems that were hard to overcome. Non-relational databases, by contrast, developed rapidly thanks to their own characteristics. NoSQL databases were created to meet the challenges posed by large-scale data sets containing many types of data, especially the challenges of big data applications.

Four categories of NoSQL databases

Key-Value storage database

This type of database mainly uses a hash table in which a specific key holds a pointer to specific data. For IT systems, the key-value model has the advantages of simplicity and easy deployment. However, if a DBA needs to query or update only part of a value, the key-value model becomes inefficient. [3] Examples: Tokyo Cabinet/Tyrant, Redis, Voldemort, Oracle BDB.
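The following is a minimal sketch of the key-value model described above, not the API of any product named in the examples. It assumes a plain in-memory Python dict as the hash table; the names `kv_store`, `kv_put`, and `kv_get` are hypothetical. It illustrates why a partial update is costly: the store treats each value as opaque, so changing one field means reading and rewriting the whole value.

```python
import json
from typing import Optional

# A toy key-value store: a hash table mapping each key to an opaque value.
# The store never looks inside a value, so it cannot update part of one.
kv_store = {}

def kv_put(key: str, value: bytes) -> None:
    """Store the whole value under the key, replacing any previous value."""
    kv_store[key] = value

def kv_get(key: str) -> Optional[bytes]:
    """Return the whole value for the key, or None if the key is absent."""
    return kv_store.get(key)

# The application serializes an entire user record under one key.
kv_put("user:42", json.dumps({"name": "Ann", "city": "Hangzhou"}).encode())

# Changing a single field requires a full read-modify-write cycle:
# fetch the whole value, decode it, change it, and write it all back.
record = json.loads(kv_get("user:42"))
record["city"] = "Beijing"
kv_put("user:42", json.dumps(record).encode())

updated = json.loads(kv_get("user:42"))
```

Lookups by key stay simple and fast, which is the model's main appeal; the cost only shows up when the application wants something smaller than a whole value.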

Column storage database

These databases are usually used to handle massive amounts of data held in distributed storage. Keys still exist, but each key points to multiple columns, and the columns are grouped into column families. Examples: Cassandra and HBase. (Riak is sometimes listed here too, although it is more commonly classified as a key-value store.)
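As a rough illustration of "a key that points to multiple columns grouped by column family", here is a small in-memory sketch. It assumes nested Python dicts (row key, then family, then column); `cf_table`, `cf_put`, and `cf_get_family` are hypothetical names and do not correspond to the Cassandra or HBase client APIs.

```python
# Layout: row key -> column family -> column name -> value
cf_table = {}

def cf_put(row_key, family, column, value):
    """Write one column inside a column family of a row."""
    cf_table.setdefault(row_key, {}).setdefault(family, {})[column] = value

def cf_get_family(row_key, family):
    """Read all columns of one family for one row (empty dict if absent)."""
    return dict(cf_table.get(row_key, {}).get(family, {}))

cf_put("user:42", "profile", "name", "Ann")
cf_put("user:42", "profile", "city", "Hangzhou")
cf_put("user:42", "activity", "last_login", "2016-01-01")
cf_put("user:43", "profile", "name", "Bob")  # rows need not share columns

profile = cf_get_family("user:42", "profile")
```

Reading one family touches only the columns in that family, which hints at why grouping related columns together helps when rows are very wide.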

Document Database

The document database was inspired by the Lotus Notes office software, and it is similar to the key-value store described first. Its data model consists of versioned documents: semi-structured documents stored in a specific format such as JSON. A document database can be seen as an upgraded key-value database that allows key-value pairs to be nested, and it can query more efficiently than a key-value database. Examples: CouchDB, MongoDB. China also has a document database, SequoiaDB, which is already open source.
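The sketch below illustrates the idea of nested key-value pairs that the store itself can query. It is an assumption-laden toy, not the CouchDB or MongoDB API: documents live in a Python dict, and `doc_insert`, `doc_find`, and the dotted-path syntax are hypothetical.

```python
import json

documents = {}       # document id -> parsed JSON document
_MISSING = object()  # sentinel meaning "field not present"

def doc_insert(doc_id, doc):
    """Store a document; the JSON round trip mimics a JSON storage format."""
    documents[doc_id] = json.loads(json.dumps(doc))

def doc_find(path, expected):
    """Return ids of documents whose field at a dotted path equals expected."""
    matches = []
    for doc_id, doc in documents.items():
        node = doc
        for part in path.split("."):
            if isinstance(node, dict) and part in node:
                node = node[part]
            else:
                node = _MISSING
                break
        if node is not _MISSING and node == expected:
            matches.append(doc_id)
    return matches

doc_insert("u1", {"name": "Ann", "address": {"city": "Hangzhou", "zip": "310000"}})
doc_insert("u2", {"name": "Bob", "address": {"city": "Beijing"}, "tags": ["admin"]})
doc_insert("u3", {"name": "Eve"})  # documents need not share a schema

in_beijing = doc_find("address.city", "Beijing")
```

Unlike the opaque values of a plain key-value store, the store here can look inside a document, which is what makes querying by a nested field possible. A real document database would typically add indexes instead of scanning every document.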

Graph Database

Unlike SQL databases with their rows, columns, and rigid structure, graph databases use a flexible graph model and can be extended across multiple servers. Because NoSQL databases have no standard query language like SQL, queries have to be designed around the data model. Many NoSQL databases offer REST-style data interfaces or query APIs. [2] Examples: Neo4j, InfoGrid, InfiniteGraph.
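To make the graph model concrete, here is a small sketch assuming an adjacency list with labeled edges and a breadth-first traversal. The names `graph_edges`, `add_edge`, and `reachable` are hypothetical, and this is not how Neo4j or the other listed products expose queries.

```python
from collections import deque

graph_edges = {}  # node -> list of (relationship label, neighbor node)

def add_edge(src, relation, dst):
    """Add a directed, labeled edge and make sure both nodes exist."""
    graph_edges.setdefault(src, []).append((relation, dst))
    graph_edges.setdefault(dst, [])

def reachable(start, relation, max_hops):
    """Breadth-first search along edges with one label.

    Returns a dict mapping each reached node to its hop count from start,
    excluding the start node itself.
    """
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for label, neighbor in graph_edges.get(node, []):
            if label == relation and neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    del hops[start]
    return hops

add_edge("ann", "FOLLOWS", "bob")
add_edge("bob", "FOLLOWS", "carol")
add_edge("carol", "FOLLOWS", "dave")
add_edge("ann", "BLOCKS", "eve")

within_two = reachable("ann", "FOLLOWS", 2)
```

A "friends of friends" question like this follows edges directly, where a relational schema would usually need one self-join per hop.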

We can therefore conclude that NoSQL databases are suitable in the following situations: 1. The data model is relatively simple. 2. A more flexible IT system is required. 3. Database performance requirements are relatively high. 4. High data consistency is not required. 5. Complex values are easy to map to a given key.

Comparative analysis of four categories

Common features

  · Simple data model. Unlike distributed databases, most NoSQL systems adopt a simpler data model in which each record has a unique key. The system supports atomicity only at the level of a single record and does not support foreign keys or cross-record relationships. Restricting each operation to a single record greatly improves scalability, because a data operation can be carried out on a single machine without the overhead of distributed transactions.

  · Separation of metadata and application data. A NoSQL data management system maintains two types of data: metadata and application data. Metadata is used for system management, for example the mapping of data partitions to nodes and replicas in the cluster. Application data is the business data that users store in the system. The system separates the two because they have different consistency requirements. For the system to run properly, metadata must be consistent and up to date, whereas the consistency requirements of application data vary with the application scenario. To achieve scalability, NoSQL systems therefore manage the two types of data with different strategies. Some NoSQL systems have no metadata and use other methods to map data to nodes.

  · Weak consistency. NoSQL systems replicate application data and must keep the replicas consistent. With this design, synchronizing the replicas on every update is expensive. To reduce the synchronization overhead, weak consistency models such as eventual consistency and timeline consistency are widely used.
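The weak-consistency trade-off in the list above can be sketched as follows. This is a simplified single-process model, assuming one replica accepts writes and the others receive updates later; the classes `Replica` and `Cluster`, the version counter, and the `sync` step are hypothetical and stand in for real network replication.

```python
class Replica:
    def __init__(self):
        self.data = {}  # key -> (version, value)

    def apply(self, key, version, value):
        """Keep only the write with the highest version for each key."""
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


class Cluster:
    def __init__(self, replica_count):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = []  # replication messages not yet delivered
        self.version = 0

    def write(self, key, value):
        """Acknowledge after updating one replica; queue the others."""
        self.version += 1
        self.replicas[0].apply(key, self.version, value)
        for replica in self.replicas[1:]:
            self.pending.append((replica, key, self.version, value))

    def sync(self):
        """Deliver all queued updates, as background replication would."""
        for replica, key, version, value in self.pending:
            replica.apply(key, version, value)
        self.pending.clear()


cluster = Cluster(3)
cluster.write("x", "v1")
cluster.sync()
cluster.write("x", "v2")

stale = cluster.replicas[2].read("x")  # replication has not caught up yet
fresh = cluster.replicas[0].read("x")
cluster.sync()
converged = [replica.read("x") for replica in cluster.replicas]
```

The write returns without waiting for every replica, which is where the saving comes from; the price is that a read in the window before `sync` may return the older value. Once updates stop and replication catches up, all replicas agree.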

With these techniques, NoSQL copes well with the challenges of massive data. Compared with relational databases, NoSQL has the following advantages:

  · Avoiding unnecessary complexity. Relational databases provide a wide range of features and strong consistency, but many of these features are useful only in certain applications, and most are rarely used. NoSQL systems provide fewer features in exchange for better performance.

  · High throughput. Some NoSQL data systems have a much higher throughput than traditional relational data management systems. For example, Google uses MapReduce to process 20PB of data stored in Bigtable every day.

  · High horizontal scalability on clusters of low-end hardware. NoSQL data systems can be scaled horizontally, and unlike scaling a relational database cluster, this does not cost much. A design based on low-end hardware saves users of NoSQL data systems a great deal of hardware expense.

  · Avoid expensive object-relational mapping. Many NoSQL systems can store data objects, which avoids the cost of converting relational models in databases and object models in programs.
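Horizontal scaling, mentioned in the list above, generally relies on spreading keys over many machines. The sketch below shows one simple way to do this, hash partitioning with a modulo over the node list. It is an illustrative assumption rather than the scheme of any particular system; the node names and the functions `node_for`, `shard_put`, and `shard_get` are hypothetical.

```python
import hashlib

nodes = ["node-a", "node-b", "node-c"]
shards = {name: {} for name in nodes}  # each dict stands in for one machine

def node_for(key, node_names):
    """Pick a node from a stable hash of the key (same key, same node)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return node_names[int(digest, 16) % len(node_names)]

def shard_put(key, value):
    shards[node_for(key, nodes)][key] = value

def shard_get(key):
    return shards[node_for(key, nodes)].get(key)

# Each single-record operation touches exactly one node.
for i in range(100):
    shard_put(f"user:{i}", i)

total_records = sum(len(shard) for shard in shards.values())
```

Because every record has one key and operations address one record at a time, no operation here needs to coordinate between nodes. A known weakness of plain modulo placement is that changing the number of nodes moves most keys, which is why schemes such as consistent hashing are often used instead.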

Main disadvantages

Although NoSQL provides high scalability and flexibility, it also has its own disadvantages:

  · Data models and query languages have not been mathematically verified. SQL, a query language based on relational algebra and relational calculus, rests on a solid mathematical foundation: even if a structured query is itself complex, it retrieves all the data that meets its conditions. NoSQL systems do not use SQL, and some of the models they use lack a complete mathematical basis. This is one of the main reasons the NoSQL landscape is relatively chaotic.

  · ACID is not supported. This brings both advantages and disadvantages to NoSQL; after all, transactions are still needed in many situations. ACID lets a system guarantee that online transactions execute correctly even if they are interrupted.

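ACID is an acronym for the four basic properties that database transactions need in order to execute correctly: Atomicity, Consistency, Isolation, and Durability. A database that supports transactions must have all four properties. Otherwise, the correctness of the data cannot be guaranteed during transaction processing, and the transaction is very likely to fall short of what the parties involved require.

  · Simple functions. Most NoSQL systems provide only simple functions, which increases the burden on the application layer. For example, if ACID has to be implemented at the application layer, writing that code will be extremely painful for the programmers.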

  · There is no unified query model. NoSQL systems generally provide different query models, which increases the burden on developers to some extent.
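To give a feel for the burden the list above describes when ACID falls to the application layer, here is a small sketch. It assumes a store that offers only atomic single-record writes, modeled by a Python dict; `accounts`, `StoreError`, `write_record`, and `transfer` are hypothetical names, and the `fail` flag merely simulates an interruption.

```python
accounts = {"alice": 100, "bob": 50}  # stand-in for a single-record store

class StoreError(Exception):
    """Raised when a simulated write is interrupted."""

def write_record(key, value, fail=False):
    """Atomic write of ONE record; `fail` simulates an interruption."""
    if fail:
        raise StoreError(f"write to {key!r} was interrupted")
    accounts[key] = value

def transfer(src, dst, amount, fail_second_write=False):
    """Move `amount` between two records without store-level transactions.

    The application must remember the old state and undo the first write
    by hand if the second one fails.
    """
    old_src = accounts[src]
    old_dst = accounts[dst]
    write_record(src, old_src - amount)
    try:
        write_record(dst, old_dst + amount, fail=fail_second_write)
    except StoreError:
        write_record(src, old_src)  # compensating write (manual rollback)
        return False
    return True

ok = transfer("alice", "bob", 30)
after_success = dict(accounts)

failed = transfer("alice", "bob", 30, fail_second_write=True)
after_failure = dict(accounts)
```

Even this toy covers only atomicity, and only if the compensating write itself succeeds. Concurrent transfers could still interleave and see the intermediate state (no isolation), and nothing here addresses durability, which is why hand-written transaction logic is so painful.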

Concluding Remarks

NoSQL may have started out as a gimmick, but with the rise of Web 2.0 the demand for non-relational databases grew quickly, and such databases sprang up one after another. At that point, what name should stand for this group, positioned opposite to or above relational databases? NoSQL took the stage.

Reference:

Baidu Encyclopedia entry: NoSQL

Wikipedia: NoSQL

Big data management system: NoSQL database past and present

NoSQL Ultimate Guide ( nosql-database.org )
