SQL stands for Structured Query Language and is used to communicate within and manipulate relational databases. It is good at handling structured data that includes relations between entities and variables of those entities.
Another Brief History Lesson
SQL slowly became the universal standard for relational database management systems(RDBMS) and remains an important player in the database world, but it wasn’t actually available to the public until 1979. A company called Relational Software released its commercial version of SQL called Oracle V2. If you haven’t already guessed, Relational Software would later become Oracle. After SQL became commercialized, products like MySQL were born out the desire of companies to not spend time managing their own database or writing their own SQL from scratch. It remains one of the most popular SQL database management systems today and set a standard for all database management systems that exist today.
To SQL or To NoSQL, That is THE Question
Ah, the question. THE question that everyone and anyone has heard, debated, and discussed if they know even a sliver of information about databases. First off, I will share my favorite blog on comparing SQL and NoSQL, written by HarperDB CTO, Kyle. He does an excellent job explaining the difference and where HarperDB fits in (we are kind of an oddball ? in a good way). We also have an excellent article written by my colleague, Margo, on the different types of database architectures.
So when I was learning about SQL and NoSQL my first question was, if SQL is great, why do we need NoSQL? Well, like almost anything in the world, SQL has limitations. Relational databases typically scale better vertically than they do horizontally. What do I mean when I say scaling horizontally and vertically? Well that could in itself be an entire blog, so I will share this great resource summarizing scalability and what it means to scale in both directions. Keep in mind that scalability “is the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth.”
SQL is powerful and can handle a variety of data. It can handle the data loads that we used to see back in the 1970’s, 80’s, and 90’s. As the world became more advanced and new technology emerged, we started to see more and more AND more data. Data sets grew, and they are still growing today especially with the rise of IoT and edge computing. Relational databases couldn’t keep up and it doesn’t help that SQL is, ehehmmm, structured and requires a schema for your data. This is where NoSQL comes in.
Large, usually unstructured data can be anything from tweets, emails, images, and documents, to video and audio files. As for SQL and RDBMS, relational literally means data is related rather than random, and these types of databases are typically tabular so at their core, they are not designed for unstructured data. (However, some relational databases today have built in functionality to handle this problem). Not to mention, the scale! As Kyle stated in his blog, relational databases can scale vertically, but this has limits and usually means migrating to a larger server. This process can be expensive, tedious, and frankly terrifying because so many things can go wrong and in 2020, data is gold, so losing it is kind of a big deal. NoSQL means, ehehmmm, no structure or schemaless, meaning you can just throw your data in, no structure necessary. You may also hear the term dynamic schema, which means the data structure changes dynamically as new data is ingested.
Think about the structure of NoSQL being a bucket you just dump data into and that data can still be sorted and interacted with, while SQL is a book shelf that you carefully plan out how you want to categorize your data.
The other main differences include SQL being table based while NoSQL is key value or document based, both of these relating back to the structured versus unstructured concepts. Think about the structure of NoSQL being a bucket you just dump data into and that data can still be sorted and interacted with, while SQL is a book shelf that you carefully plan out how you want to categorize your data. It’s faster and easier to dump data into the bucket and this bucket can hold a lot more data without the walls of the bookcase, but it’s harder to quickly find what you need. The book case is great because it keeps everything in a neat and tidy system, but what if piece of data could go on two different shelves or NEEDs to be on both shelves?
What is your favorite thing about SQL databases? What about NoSQL? Here is another great source, where they break SQL and NoSQL down like two different towns.
SQL is Here to Stay: NewSQL
Many databases claim to be NewSQL as it seems to be a hot new buzzword to some folks. It is a valid database category in my opinion, but there is confusion around how to group databases under NewSQL. As one of our co-founders at HarperDB wrote in his recent blog about NewSQL:
When we look at what defines a “NewSQL” database we see a few key things:
- “…a class of relational database management systems that seek to provide the scalability of NoSQL systems… “
- “….while maintaining the ACID guarantees…”
- “Many enterprise systems … are too large for conventional relational databases, but … are not practical for NoSQL systems.”
- “The only options previously available for these organizations were to either purchase more powerful computers”
- “…or to develop custom middleware…”
- “Both approaches feature high infrastructure costs and/or development costs. NewSQL systems attempt to reconcile the conflicts.”
When we first started HarperDB, we discussed having the capabilities to perform SQL and NoSQL and we thought people would instantly get how AWESOME that was…but they didn’t. They would respond with, So? This is because NewSQL was still a newer and not well defined term, people believed that databases like MongoDB had both because they provided something SQL-ish. It wasn’t really SQL, but to most folks it meant they could do both. (Tools like MongoDB actually incorporate complex integrations to try to achieve SQL functionality, but HarperDB provides that out-of-the-box in one simple tool). So NewSQL is the true blending of both NoSQL and SQL. Data has become so complex, with massive amounts of it coming in from millions of sources. NewSQL is the answer to our increasingly data focused world.
NewSQL is only going to keep growing and evolving and it’s an excellent choice for many use cases. For one, it’s great for startups and newer companies that haven’t dug their heels into a legacy system with complex architectures that have cost them a lot of money. Individual developers working on side projects might choose NewSQL as well. Another example is use cases where traditional ACID compliant SQLf is needed (as opposed to BASE compliance more commonly related to NoSQL databases), that also have a growing need for high performance and larger scaling capabilities at lower costs than traditional RDBMS systems stacked on top of servers. NoSQL database users might also see a need to implement ACID compliant SQL while maintaining the performance and scale that they have come to love with NoSQL.Even coding schools are looking at using NewSQL products so that they can easily expose students to SQL and NoSQL with one tool.
Wrap it Up
In conclusion, as our world evolves and we become more and more technologically advanced, our data evolves right along with us. SQL is never going away, it was the first type of universally acknowledged language for databases after all. NoSQL and SQL databases are vast, mainly including their structure, scalability, and performance. Each have important roles to play in 2020 and will continue to be critical to running a business or project moving forward. NewSQL is a modern solution to a problem that didn’t really exist in the 1980’s, but now, one could argue that needing the “best of both worlds” is THE biggest problem for many enterprises and startups alike. What do you think about NewSQL? Let me know on the original post!