I'm glad that the article concluded "No" to it's own headline. Calling SQL "syntactic sugar" is an insult to sugar. The "helpful diagram explaining how the scoping rules work" alone should make people blanch. The language is a syntactic disaster that we've been saddled with out of habit and inertia.
Call a masochist but I love me good old SQL. I find most problems with systems I have worked with are the result of managements putting people who never paid attention in database classes in charge. Or maybe nobody really cares about doing a good job. In any case, my professors would have given all the legacy SQL databases that I have come across F-.
The worse part is nobody dares changing anything because it would require application code changes. Don’t blame SQL. Blame incompetent people doing SQL badly.
I didn’t “get” SQL for the first 6-8 years of professional full-stack engineering. I was chasing, along with many in my cohort, the ORM abstraction dragon we thought we needed to work on j2ee, django, or sqlalchemy teams quickly.
It’s perplexing to me now why SQL was perplexing. Even knowing about relational theory and enough DBA to pass interviews, it somehow always seemed like we should have been able to treat databases like our OOP’s.
To think of the hours I wasted. Maybe I heard in a meeting “it’s the join table” and another colleague said “just use the association table,” and I was trying the n^2 debugging approach to solve it with model classes.
The best advice I got on this made sense later: “I wouldn’t even try to manage that [dataset] myself. Just figure out how to tell a database to do it—that’s its only job.”
Understanding relational algebra helps you write better SQL the way understanding lambda calculus helps you write better JavaScript. The language is warty and but a shadow of the pristine purity underneath, but it gets the job done and the theory guides your reasoning on how to use the language.
In particular, I find the closure property of SQL very helpful and always think in terms of queries gettings tables and producing tables (derived most of the time). https://en.wikipedia.org/wiki/Closure_(mathematics)
I really "got" SQL about 15 years after I started using it. The explanation is very simple: until that moment, SQL was a minor component for me in a big pie. Then one day I got into the position to troubleshoot some apps with >1 TB database behind them and I figured that SQL is more than simple CRUD operations. For most full stack developers that I know, SQL is still that minor component in the back of their mind and they are usually right.
Agreed, though disagree about the database class thing. I got a C in my class. I was a DBA at the time and definitely knew my shit. That was kind of the issue. He was teaching by the book with no regard to the database engine used. I was using SQL Server and MySQL daily.
There was a fundamental disconnect. I approached a class project differently not only because I was a DBA, but I am neurodivergent. Because my design didn’t align with his expectations, I got a C.
I had a 4.0 GPA prior to that, but that made me realize I wasn’t a fit for college.
I also got a C in my cs relational databases class. I also got more out of that class than just about any other in CS. I'm still kind of mad about it though. We had a group project assignment and our group had far-and-away the best project, of which I did 90% of the work. My bad grade came down to just one thing: the class required us to use Oracle and I refused on moral grounds.
I failed COBOL in college in a similar situation: I was professionally writing COBOL, my project was too divergent from what the teacher expected. I took that exam 3 times to barely pass. Good times, good times.
i took compilers and prof wanted us to write lexer code by hand for a specific language. instead i wrote a generic lexer generator and fed in the language definition to generate the lexer code, got a bad grade for not following instructions. it seems kind of funny now but at the time i was rather cross.
In my experience few CS students seem to take a relational databases course. It was optional in my program. Unfortunately, that means many of those databases were created by folks with no formal grounding in databases. Some folks pick it up on their own, but likely not as thoroughly as those who've taken a good course on the topic.
Everything you spoke to is around people doing relational algebra badly. That has little to do with SQL. The same mistakes would lead to the same problems if the database used, say, QUEL instead.
Yes, people who don't understand SQL databases can easily create "F-" situations. The problem is that people who are great at SQL databases can't do any better than a C+ because that's as good as SQL gets.
Strongly disagree, complete newbies can get up and running with SQL very quickly and it's expressive enough that experienced people can do things that no ORM would ever hope of being able to do. One criticism I have though is that whoever thought "SELECT" should come before "FROM" should be fired.
I sort of feel the same…but on the other hand if you consider ”delete from” exists also, it’s not completely unsensible to consider you first tell what operation you’re about to perform to the data. Would be nice to start with the source entity name for sure. Dunno what ”select 1” would look like, I guess the from foo would be optional.
DELETE FROM is even worse. Accidentally/mindlessly press cmd+enter before you wrote the WHERE? Poof, data gone. Make it FROM … DELETE!
I also wish we needed to explicitly take locks. In PostgreSQL at least (I think other dialects/engines too), figuring out what operation will take what lock and potentially mean downtime is left as an exercise to the reader. Force me to write WITH EXCLUSIVE TABLE LOCK when I add a non nullable column with a default!
If you're going to run commands that modify data directly on the cli, do it in a transaction so you can roll back. Also, start with `--` to make it a comment. Once you have the correct query and someone's checked your work, go back to the beginning of the line and remove the `--` so you can run it. It's also a good idea to write a select * or count first to check that things make sense, and then start your transaction, go up and modify your select into a delete/update, check affected rows looks good again, maybe do another select or two to check that data looks like you expect, commit.
Well, to be fair ”from foo delete” would do the same I suppose :-D Unless there’d be an explicit end to the statement to designate you really want to delete everything. Which might not be a bad idea. Or make ”where …” mandatory and bring in ”delete_all” or ”delete everything from foo” as a syntactic guardrail. This is equally implementable, whichever the order of ”delete” and ”from” would be.
the Google SQL variant has where required for deletes. on my first encounter of it I was just like "huh, neat", you can always use 1=1 as the predicate if needed.
And yet I agree with both. SQL is at the same time easy to get started, incredibly powerful and yet also a complete disaster. Just like JavaScript but with standards you have to pay to read and implementers take their sweet time to implement. ORMs (with the possible exception of JOOQ, whose open source version is unfortunately also very limited) are hopelessly limited and have been a drag on the development of the field ever since.
i think “disaster” is hyperbole. my brain thinks in concert with sql so it feels natural, though i will concede that the parsing/syntax order differences was an odd choice.
programmers and language users all have a mental translation layers that they use to go from conceptual->concrete using the syntax and idioms of their language of choice.
javascript doesn’t make sense in a lot of ways. calling it a disaster is hyperbole. go doesn’t make sense in a lot of ways. it’s not a disaster. unfamiliarity or disagreement with choices does not necessarily mean it’s a disaster
the only language with which i’m familiar that is a complete, unmitigated disaster is the english language, but i suppose that’s outside of the scope of this conversation
> the only language with which i’m familiar that is a complete, unmitigated disaster is the english language
Out of curiosity, are you a native speaker of english or did you learn it as a second language?
I’m a native speaker and feel sorry for anyone who has to learn all its irregularities, etc. But the few times I’ve mentioned this to anyone who had to learn as a second language, they’d correct me and tell it was easy. Which baffled me.
Obviously as a native speaker, I’m in no place to judge it.
SQL syntax is only on the surface level similar to natural language. Besides, if we actually considered similarity to English to be an important criteria for a programming language, we would all still be using Cobol.
native english speaker. learned german (fluent) and italian (conversational). english is an insane amalgamation which may be why it’s become near lingua franca
Yeah I maintain several BI reports and the codebase is about 20 years old, so multiple teams have taken creative liberties with the naming conventions for tables. Without intellisense I'd probably have gone insane by now.
I like it starting with the desired output columns, often when I work with a db quite a bit and have common naming conventions that’s all I need to know.
Even writing queries I think in terms of outputs then build the query from there.
I am blanking about the advantages of starting with from, people clearly share your view, but I have no idea why.
Maybe it comes from my C++ background where return type comes first, then function name, then inputs.
Ability to tab complete columns in an interactive settings.
If I’m in psql I can tab complete columns in the where clause but not in the select because I haven’t actually given any information about what I’m selecting from yet.
Imagine if methods on your C++ classes had to be called like method(object) instead of object.method(). While it's a meaningless syntactic difference, the latter makes autocomplete in an IDE possible.
I don't think visibility is important, since you want autocomplete to work both in public and private methods. It would be a crappy IDE experience if you couldn't autocomplete on private methods within other private methods.
And sure, multimethods are too complex for naive autocompletion. But it's useful for people to have tooling that can say "given an input tell me what outputs are possible" because that's how we write code.
Reminds me of reading music actually… many first time noob’s, mostly out of frustration, realize how sub optimal staff notation is and seek to write something better. In the end it’s ‘Good enough’ and accept it’s flaws because no rewrite has proven to be a 10x better solution.
I agree that SQL syntax is terrible, but for some potentially masochistic reason I still like writing it. It could certainly be done better though and I don’t understand how there hasn’t been more innovation in this space.
> I don’t understand how there hasn’t been more innovation in this space
I think it's simply that most businesses and investors don't register SQL as having any real problems, and especially now with a resurgent interest in SQL the idea of attempting anything novel feels too risky.
Having been forced to write queries as JSON in elastic I definitely prefer writing SQL. I would actively avoid utilizing any query language where I have to count brackets.
Handcrafting JSON is undoubtedly always a pain, but the idea with XTQL is rather that it can be easily generated from any regular programming language.
> I would actively avoid utilizing any query language where I have to count brackets
That's really an editor/tooling problem, solvable in many ways, but I guess a Python-like/Parinfer approach would be your preference? (where whitespace/indentation is significant)
True enough, tooling is key. I'm not a developer but I think SQL is a decent language that's easy to pick up. But I also like xpath, regex, and jq syntax too, so maybe I'm the weird one.
There's PRQL which is really nice. But it has pretty big downsides: it only supports querying data, not modifying it; and since it's not database-specific it doesn't support all the features of whatever database you're using. E.g. it can't access complex data types like structs/arrays in DuckDB.
PG’s blub essay isn’t especially charitable. There’s a good reason to want to stick with what you know and are comfortable with—-climbing endless learning curves means not getting things done and it isn’t simple to know ex ante what is worth the time
Which is fine when everything is handcrafted and human-scale, but as soon as you start going down the path of machine generated SQL (both the generation and analysis thereof) the tax is non-trivial.
It's not bad enough that it's worth actually trying to overturn, and the inertia from everything speaking SQL means there's a lot less pushback if you also do that.
SQL actually killed alternatives back in the 80s (and 90s, Postgres used to use a dialect of Ingres's QUEL, the "SQL" bit of PostgreSQL specifically denotes the switch over from QUEL to SQL).
Agreed, I wish that SQL was more concise and organized more left-to-right, like shell pipelines or the F# pipe operator. But what are the alternatives? The only one I know is PRQL, and that doesn't seem to be getting popular (correct me if I'm wrong).
It is organized, especially if you write it correctly (split lines and indent). Minified, it is a nightmare, but with a proper layout it is a pleasure to read it. I just did a code review for a 20,000 LoC piece and it was quite a breeze after I improved the layout a bit.
"Disaster" may be an exaggeration, but I think the syntax could be better, proper layout or not. In particular, I find that complex SQL queries often need to be read in a "spiral" order due to SQL's attempt to resemble natural language. I find PRQL's consistent left-to-right/top-to-bottom reading order a clear improvement.
I can understand that the statement order can be awkward, such as choosing what to select before the source.
But help me with "more concise". SQL has no boilerplate, its keywords are brief, you express the exact logic of the result set you're seeking. There are tons of SQL overlays or replacements embedded in programming languages or BI/reporting tools, and they are universally, more difficult to work with than straight SQL.
see their comments on WITHIN GROUP and redundant declaration of identifiers. As for boiler plate, SQL's turf war over plain English words as reserved names creates this interesting situation of boiler plate identifiers in SQL to disambiguate from reserved keywords.
This is nothing more than one guy's opinion dressed up to look like an academic study. He actually admits a few times that SQL features are concise. SQL definitely isn't perfect, and some vendors' implementations are better than others, but things like JOINs are pretty perfect. He considers specifying an INNER JOIN as syntactic sugar (over using WHERE equality), which seems backwards, as opposed to a more consistent practice of always using the JOIN keyword regardless of the type of join.
That one was more of a nitpick, so maybe I shouldn't have included that. What I had in mind was that I would like a bit more symbols instead of words, for example value IN [1, 10] instead of value BETWEEN 10 AND 10.
Some operations on groups are also too verbose, like getting top 5 items in each group.
Definitely disagree on this one. ORMs are a leaky abstraction, they hide the wall you’re going to bump into anyway, and they bring their own gotchas on top
I worked with two frameworks so far - SQLalchemy core and a much bigger "orchestration" one (dagster / airflow equivalent). SQLalchemy got me type checking, column name checking, and DB migrations versioning with Alembic.
The bigger framework got me more static analysis and dependency graphs on top of that. Those features saved me lots of time and headaches (type checking, dependency graphs, etc). Mind you, the frameworks I worked with still allowed for raw SQL for those edge cases that still pop up.