It's an interesting article, as a tour of some Python internals.
But it comes off as a rant without a real suggestion.
Python 3 was an internal interpreter cleanup. That's actually part of its lack of widespread popularity. The core developers didn't add a ton of unique functionality (and most of the new stuff was backported to 2.7 anyway), but they fixed some annoying problems in the CPython codebase and broke some things in the name of cleanup/unification of concepts. They made Python easier to maintain and easier to build new features atop. They sharpened the axe.
Yes, they didn't clean up Armin's pet thing -- slots. And they created new problems for library developers in their bytes vs unicode changes. But they cleaned up a whole lot of other stuff.
I personally think that, with the maturity of pypy and the stability of both the 2.7 and 3.4 lines of development, the Python ecosystem has never been more exciting. The advances in pypy make Python attractive for CPU-bound work, and the inclusion of asyncio in the stdlib will make it more and more attractive for I/O-bound work over time. Python has long been a winner for mixed workloads, and the ecosystem around Python, especially pydata utilities like numpy and pandas, keeps getting better. Stop complaining -- let's just build an awesome future atop this marvelous language.
>But it comes off as a rant without a real suggestion.
Did you actually read TFA? It makes some very real suggestions towards the end, and pinpoints clear issues and how they could be changed all the way through.
>But they cleaned up a whole lot of other stuff.
Which is irrelevant to the current discussion. How is "they cleaned X" a response to "they should clean Y"?
> But it comes off as a rant without a real suggestion.
How exactly is that a rant? It's an exploration of a design decision / mistake that was carried through Python for 25 years and has left an impact. There is no complaint and there is no suggestion for Python.
As I said in the opening paragraph, it's something of interest to people who care about language design.
I think I misspoke. I didn't really mean "rant", I meant "nit-pick". It's akin to the frequent complaints I hear about `len()` being a function rather than a method on `list` and `str` types.
This internal detail you focused on solves real problems; it's also a wart internally. OK, we get it. It's also not a hugely important wart, IMO. I don't think it's holding Python back.
But, like I said, I think the article was an interesting walkthrough of the internals. I enjoyed it overall :)
I think this is a bad comparison, because len() vs. foo.len() is just syntax, and isolated syntax at that. It's valuable to get syntax right, but this particular decision doesn't really impact anything outside of itself.
In contrast, the article suggests that the implementation of method dispatch slots (not __slots__!) actually constrains the optimizations you can make.
I get that you're trying to suggest that they're both small, but I think that this analogy only serves to confuse things.
It's a rant because the solution was ill-defined. If the title of the article is 'The Python I Would Like To See', you should expect more detail on how to implement the solution.
I hope it is valid to describe a situation and voice a wish for a situation that would look different without having to provide a concrete step towards that.
The removal of the slot system would be a backwards incompatible change with very little direct benefit for developers. It could be interesting as a general goal for a hypothetical Python 4 but I do not believe it makes any sense to discuss this as anything more than a hypothetical case for the next few years.
> Instead of having slots and dictionaries as a vtable thing, let's experiment with just dictionaries. Objective-C as a language is entirely based on messages and it has made huge advances in making their calls fast. Their calls are from what I can see much faster than Python's calls in the best case. Strings are interned anyways in Python, making comparisons very fast. I bet you it's not slower and even if it was a tiny bit slower, it's a much simpler system that would be easier to optimize.
Breaking the print statement is probably a large reason for lack of adoption. It used to be much more convenient. They've lost sight of 'practicality beats purity'. What used to be print a,b,c is now print(a + " " + b + " " + c), or using the string formatting method. What used to be
print linewithnonewline,
with the comma indicating the lack of newline, now has to be done through print(line, end=""). Which is more practical and pythonic?
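For reference, here is a minimal sketch of how the Python 3 `print()` function covers both cases; the variable names and values are placeholders, and the output is captured into a string only so it can be inspected. Multiple arguments are still joined with a space, so no manual concatenation is needed, and `end=""` plays roughly the role of the old trailing comma.

```python
import io
from contextlib import redirect_stdout

a, b, c = "x", "y", "z"  # placeholder values for illustration

buf = io.StringIO()
with redirect_stdout(buf):
    print(a, b, c)                # Python 2 spelling: print a, b, c
    print("no newline", end="")   # Python 2 spelling: print "no newline",
    print(" <- same line")

output = buf.getvalue()
print(output, end="")
```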
I follow Ronacher's work, open source and posts and I agree with his arguments.
This post reminds me of Spolsky's one (http://www.joelonsoftware.com/articles/LeakyAbstractions.htm...), though. You get to a point where you know what's under the hood and why it's not working the way you are trying to use it. However, in the majority of cases you are just fine and don't need to know what's happening under the hood, and if that majority is big enough, maybe that's just fine.
In modern usage, this question also serves as a
metaphor for wasting time debating topics of no
practical value, or questions whose answers hold
no intellectual consequence.
Python will (or won't) be used for a particular application regardless of whether some contrived test takes 0.158 usec or takes 0.256 usec per iteration.
This sort of misunderstanding comes up so often that there are a plethora of cliches for it. Here's another one: "Missing the forest for the trees."
For anyone else who found this as confusing as I did ("wtf, how can proxy actually be 42?"), what's going on here is that it's calling `proxy.__repr__()` when it attempts to display `proxy`, which in turn calls `(42).__repr__()`. (Similarly, `proxy + 1` calls `(42).__add__(1)`.)
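A minimal sketch of that delegation, assuming Python 3, where implicit special-method lookups skip `__getattr__` and so the forwarding has to be spelled out. The `Proxy` class and its `_target` attribute are hypothetical names for illustration, not the article's exact code.

```python
class Proxy:
    """Hypothetical proxy that forwards to a wrapped object."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails,
        # e.g. for proxy.bit_length.
        return getattr(self._target, name)

    def __repr__(self):
        # Displaying the proxy shows the wrapped object's repr.
        return repr(self._target)

    def __add__(self, other):
        # proxy + 1 is forwarded to the wrapped object's addition.
        return self._target + other


proxy = Proxy(42)
print(repr(proxy))   # looks like a plain 42
print(proxy + 1)     # addition is carried out by the wrapped int
```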
I'm learning Python and I must say it is a wonderful language. Being able to concatenate strings by simply saying "a + b" is a great productivity boost (coming from C++). Python libraries are powerful. I can read an Excel spreadsheet with one line of code. I can create plots in PDF format with a half-dozen lines. Amazing!
However, I am disappointed with the difficulty of turning a program into a Windows EXE. I wrote a small program (couple of thousand lines), tried Py2exe which failed to handle the Numpy (or Matplotlib, I forget which) imbroglio.
PyInstaller works, except that the EXE is 85MB, and takes one minute to start up. Not practical for customer distribution. I can't expect my customers to install the Python runtime. In contrast, my 500KLOC C++ program, with all its third-party libraries, is 19MB. Yes, I know, Python needs everything including the kitchen sink. Still, 85MB is not practical.
> Being able to concatenate strings by simply saying "a + b" is a great productivity boost (coming from C++)
C++ has had this operator overloaded for strings for decades.
With regard to the size of compiled executables, I can't really say much except "that's not what it's made for". If you need to ship compiled executables to people, Python is an extraordinarily inappropriate choice; stick with C++.
Might be, but in Python it is much easier to use and better integrated into the language basics.
Operator overloading belongs to the 'pro' skill set of C++ developers and, done wrong, can lead to many problems.
In Python it is much easier to override the behavior of such standard methods, and the language core can be learned in about a quarter of the time you need in C++ just to learn the basics (C core, C++ object basics, operator overloading, standard library basics like the string class, ...). In Python you get the power with much less learning effort and without the technical troubles.
I just did a test using std:: and old-fashioned char[]. The fancy std:: is fifteen times slower than the strcat(). In a loop with intensive string manipulation, this could cause the program to get back to you in 15 seconds instead of 1 second. You don't mind waiting?
Well of course you're getting a speed difference, look at how much extra work you're doing in the std::string code! The char[64]s are allocated once but you reallocate the std::strings every time. By pulling the declaration of s1, s2, and s3 to the top, I'm getting equivalent speed (g++ 4.4.7 with -g, 0.02s each time, clock() isn't any more accurate). And for essentially no runtime cost, you're getting memory safety and exception safety.
C++ is all about low-cost abstractions. Any speed difference between the C++ way and the C way should be very small, and the C++ way is safer. Use std::string (and std::vector and all the rest) unless you have a very good reason not to, and then you should probably just write your own custom implementation. Reverting to the C way is just asking for issues.
You might like D - it's worth a look at the least. Likewise Go and Nimrod, which also aim for better programmer productivity combined with compilation to efficient native code.
I agree that OldStyleClasses might be simpler (while less featureful), but I think I'd care more for the footprint of instances, rather than class objects themselves
I switched to Python 3 this year and I haven't looked back - there are some niggles, but overall it's a great improvement.
A great feature that's not really talked about is the __prepare__ function in metaclasses: you can supply a custom type that stores all class members. You could whip up multiple-dispatch using this (it lets you handle classes with duplicate property names) in conjunction with signature annotations, which I think is pretty neat.
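A rough sketch of the idea, assuming Python 3: `__prepare__` returns a custom namespace that remembers every value bound to a repeated name, which is the raw material a multiple-dispatch layer could build on. `MultiDict`, `CollectingMeta`, `Shape` and the `__overloads__` attribute are all made-up names, and the dispatch step itself is left out.

```python
class MultiDict(dict):
    """Class-body namespace that records every binding of a name."""

    def __init__(self):
        super().__init__()
        self.overloads = {}

    def __setitem__(self, key, value):
        # Remember each assignment, then behave like a normal dict.
        self.overloads.setdefault(key, []).append(value)
        super().__setitem__(key, value)


class CollectingMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body is executed with this object as its namespace.
        return MultiDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        # Keep only names that were defined more than once.
        cls.__overloads__ = {
            key: values
            for key, values in namespace.overloads.items()
            if len(values) > 1
        }
        return cls


class Shape(metaclass=CollectingMeta):
    def area(self, r: int):
        return "int version"

    def area(self, r: str):
        return "str version"


# Both definitions survive, together with their signature annotations.
overload_types = [f.__annotations__["r"] for f in Shape.__overloads__["area"]]
print(overload_types)
```

A real dispatcher would then pick one of the collected functions based on the argument types at call time; here the last definition still wins for ordinary calls.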
This has been one of the reasons why I'm not currently doing anything for Google App Engine, they still lack Python 3 support. It feels awkward to code in Python 2 after Python 3.
This post is surprisingly confused, it is phrased as a complaint about the language, then immediately degrades into CPython implementation specifics that have little bearing on the usability of the language itself.
Ronacher should also know better than to post microbenchmarks like the one provided here, especially without corresponding (C) profiler output. At the C level, slots allow the implementation constant-time access to the most common code paths for an object, and especially when you have C code calling other C code via the type system (IMHO the primary use for Python, and still its strongest use case), "interpreter overhead" is reduced to a few extra memory indirection operations.
In the alternative world, sure, perhaps some microbenchmark may run faster, but not systemically; e.g. "reduce(operator.add, range(1000))" would require more hash table lookups than I can count.
Python is all about providing a lightweight way to compose bits of fast code (the kernel, network stack, NumPy, MySQL, whatever). Unfortunately somewhere along the way solutions like Django got popular, which are almost the antithesis to this old viewpoint. Ronacher seems to be advocating that we should punish the CPython implementation's traditional strong areas in favour of.. actually, he didn't even describe any vision that was impacted by his complaints. He just seems to want the CPython implementation to suffer.
Perhaps his complaint about new-style __getattribute__ would be better suited as a bug report, it seems the only substantial observation made about the language itself in this post.
> This post is surprisingly confused, it is phrased as a complaint about the language, then immediately degrades into CPython implementation-specifics that have little bearing on the actual usability of the language itself.
You might think that, but you are very wrong and I should probably make that point in another blog post. These CPython specifics are enshrined in the language. While PyPy does not have the actual structs, it needs to implement the same user-exposed API as if slots were used.
PyPy cannot just say "a + b" means "a.__add__(b)", it needs to implement the exact same dispatch logic that CPython has because people's code depends on it.
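A small sketch of two ways in which `a + b` differs from `a.__add__(b)` in observable behaviour, assuming Python 3. The `Meters` class is a made-up example; the point is only that the operator looks the method up on the type and falls back to the reflected method, and any implementation has to reproduce both.

```python
class Meters:
    """Hypothetical value type used to observe operator dispatch."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        return NotImplemented

    def __radd__(self, other):
        if other == 0:
            return self
        return NotImplemented


a = Meters(2)
# An instance attribute named __add__ is visible to explicit lookup...
a.__add__ = lambda other: "instance attribute"
explicit = a.__add__(Meters(1))

# ...but the + operator looks on type(a) and finds Meters.__add__.
implicit = (a + Meters(1)).value

# int.__add__ returns NotImplemented here, so Meters.__radd__ runs.
reflected = (0 + a).value

print(explicit, implicit, reflected)
```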
//EDIT:
> Ronacher seems to be advocating that we should punish the CPython implementation's traditional strong areas in favour of.. actually, he didn't even describe any vision that was impacted by his complaints.
Maybe I did not make my point very clear, but the whole last paragraph advocates trying a version of Python that abolishes the internal slot system.
The first might be "Let's replace all use of tp_* with hash lookups", which I suppose would be closed quickly, amidst a storm of muffled giggles. But at least by then you'd know why the current implementation works the way it does.
The next might be "__getattribute__ mechanism has surprising behaviour in common use-case" (which itself would imply having a use case where any of this mattered). This one might actually result in some productive discussion regarding a fix.
Your second reply is basically a doc bug, one of "__getattribute__ optimization is not specified in the language reference" or "__getattribute__ optimization causes non-conformance with the language specification"; either way, you would get the attention of the few people who can actually help you, rather than attention from the many more who can't.
You're assuming I am not aware why the system works the way it does currently which is incorrect. I outlined what a Python could look like, not what I want the CPython guys to implement. I am perfectly aware that this is not going to happen.
I think if you had titled the post differently you would have ended up with a lot less argumentation about how what you propose isn't a good idea for python.
As it is, the title at least primes the reader to think you are asking for what you go on to describe.
>This post is surprisingly confused, it is phrased as a complaint about the language, then immediately degrades into CPython implementation-specifics that have little bearing on the actual usability of the language itself.
Ronacher says:
"Python is definitely a language that is not perfect. However I think what frustrates me about the language are largely problems that have to do with tiny details in the interpreter and less the language itself. These interpreter details however are becoming part of the language and this is why they are important."
He is right, Python has no real standard with clearly defined semantics. Actually, the standard is CPython, but CPython is full of crap that other implementations are then forced to implement, thus making that crap part of the language. You should avoid Python at any cost, into the trash it goes.
Your logic does not follow. "There is a problem affecting how implementors have to implement a language, so therefore the language has no use and everybody should just go home" is silly and hyperbolic. From an end user's perspective, Python is still a very useful language. Sure, it has some idiosyncrasies, but that doesn't make it "crap".
His complaint is that implementation details of the interpreter have leaked into the specification of the language, such as it exists. The result is that alternate implementations have to mimic those quirks of the interpreter in order to achieve compatibility.
Having an abstract specification of the language, rather than "whatever CPython does", would give implementors more freedom, and thus allow the use of Python in more places and circumstances.
I think most of the things I'd want to see first would come in the standard library and in unicode handling.
I'd probably argue that most uses of metaclasses are a reason to step back and make the code simpler. There are a few cases where they come in very usefully, but at least in open source code, they create a barrier to understanding and increase complexity.
Python starts to get ugly when you start writing code that does "automagical things", and that's somewhat tolerated because it's not a language for building automagical things.
Or, to say it another way, if you have to think about how the interpreter works, your python has jumped off the idiomatic wagon a while ago.
That being said __slots__ as a performance enhancement is crazy useful - and I'd like to see more things in that area.
Though for most people, the thing that would lead to cleaner code would be a more powerful and elegant standard library.
It sounds as if you did not understand that the author did not mean the __slots__ system, which lets you avoid the __dict__ on objects, but the internal slot system that is intended to speed up the dispatch of special methods.
> That being said __slots__ as a performance enhancement is crazy useful - and I'd like to see more things in that area. Though for most people, the thing that would lead to cleaner code would be a more powerful and elegant standard library.
__slots__, as PyPy shows, is completely unnecessary.
The article mentions, a bit unclearly, that there are two types of slots: the slots used to implement method dispatch in the interpreter and __slots__ [1-2]. Armin's article is about method dispatch.
__slots__ are useful to reduce memory usage when you're instantiating lots (tens of thousands or millions) of Python objects. Since objects defined in Python code support the addition of arbitrary attributes, their members are internally represented as a dictionary which isn't necessary when your tens of thousands of objects have a limited set of attributes.
Defining a sequence __slots__ tells the interpreter that your Python class will only need memory for the members defined in that sequence and restricts your ability to arbitrarily add new members (see [1] for details).
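A minimal sketch of that difference, with made-up class names: the slotted class has no per-instance `__dict__` and rejects attributes that are not listed. Actual memory savings depend on the interpreter and version, so none are claimed here.

```python
class PointDict:
    """Ordinary class: every instance carries its own __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointSlots:
    """Slotted class: storage is reserved only for the listed names."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = PointSlots(1, 2)
try:
    p.z = 3  # not listed in __slots__
except AttributeError:
    rejected = True
else:
    rejected = False

slotted_has_dict = hasattr(p, "__dict__")
plain_has_dict = hasattr(PointDict(1, 2), "__dict__")
print(rejected, slotted_has_dict, plain_has_dict)
```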
>>"In recent years there is a clear trend of making Python more complex as a language. I would like to see the inverse of that trend. I would like to see an internal interpreter design could be based on interpreters that work independent of each other, with local base types and more, similar to how JavaScript works."
Sounds like OP should investigate Lua (and LuaJIT in particular).
He might like Lua more than Python.
(Given that OP created Flask, I'd love to see a Flask equivalent developed in Lua)
Good job not reading the article and skipping to the end, I guess?
>> This is in fact how many other dynamic languages work. For instance this is how lua implementations operate, how javascript engines work etc. The clear advantage is that you can have two interpreters. What a novel concept.
(Just a joke, people. Just a joke.) Anyway, I wonder why Armin's not more involved with core Python things, his perspective should be valuable. As someone doing some Python stuff on the side his articles are always worth reading carefully.
The Python community has always been the best part of the language, but the important subcommunities like NumPy, Twisted and PyPy have always seemed like they are a bit outside looking in. I don't know why. Perhaps the language would have evolved differently if these projects were a bit more involved in the actual language development.
Go and Javascript are "winning"? That's news to me. They might be buzzier right now, but their compiler/interpreters only appeared in 2009 and Python has been around since 1991. Is it really such a surprise that the hype period is over for Python as a language spec?
Now people are just getting a lot of stuff done with this productive language and nobody needs to be convinced any longer how good a language it is. Most people already know.
Now, we're onto the good work of building a huge ecosystem of open source modules atop it. Call me when Go and JavaScript have the equivalent of the PyData stack and rock-solid drivers for every database technology on the planet. Then maybe I'll look at them for anything other than niche use cases. (Obviously JavaScript is still the only game in town inside the browser and Go seems like a pretty interesting alternative to C for UNIX system utilities.)
>Go and Javascript are "winning"? That's news to me. They might be buzzier right now, but their compiler/interpreters only appeared in 2009 and Python has been around since 1991. Is it really such a surprise that the hype period is over for Python as a language spec?
It's not about "hype period". It's about moving on with the times to avoid becoming the next COBOL or TCL. Being even older and more problematic than Python hasn't stopped a language like C++ from easing its users' problems and making them happy with C++11. Javascript, almost as old as Python (1995 vs 1991), got a 20-100x speed boost with the 2006-era generation of JITs.
A ho-hum remake, like Python 3, that breaks compatibility without much important to offer, doesn't cut it. If you're gonna break compatibility solve real problems people have. A large speed boost (entirely possible as V8, LuaJIT and co have shown even for a most dynamic language) is a great thing to entice upgrades. A good async/multicore story also. The removal of GIL. Etc. Those are things people have been nagging Python devs about, not improved Unicode or print as a function.
>Now people are just getting a lot of stuff done with this productive language and nobody needs to be convinced any longer how good a language it is. Most people already know.
Languages that nearly vanished from the job market and current use trends, like Smalltalk, TCL and Perl also said the same thing...
>A ho-hum remake, like Python 3, that breaks compatibility without much important to offer, doesn't cut it. If you're gonna break compatibility solve real problems people have.
Exactly this. Python 3 was an unfortunate triumph of purity over practicality. We'd be in a much better place if half of the effort expended in the Python 3 transition was instead focused on the issues you listed, and I'd also include browser support by compilation to JS.
You have strong points. I agree with you about Python 3 "not cutting it" in terms of offering some fundamentally new stuff. But I also think there is something admirable in that.
You mention "a good async story" -- that has already been addressed with asyncio. A better multi-processing API has also been addressed in the concurrent.futures module. It's true that "multi-core" is still unsolved in Python, but most practitioners work around this problem by just using process-oriented parallelism instead.
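As a small illustration of the asyncio style, here is a sketch that runs two I/O-like waits concurrently. It assumes a modern Python (3.7+, with `async`/`await` and `asyncio.run`) rather than the 3.4-era API, and `fetch` is a made-up stand-in for a real network call.

```python
import asyncio


async def fetch(name, delay):
    # Stand-in for a network request: yields control while "waiting".
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # Both coroutines wait concurrently; gather() keeps argument order.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


results = asyncio.run(main())
print(results)
```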
Nonetheless, the frequent complaints about multi-core haven't fallen on deaf ears; they will probably come next! It's pretty much the primary focus of the pypy project and recent development work has begun on a Software Transactional Memory implementation that would remove the GIL.
GvR at PyCon 2014 said that he thinks pypy is the future of core Python development -- and that at some point in the future, pypy might become the reference implementation rather than CPython.
So, to conclude, I agree with your points, but progress is already advancing on many of the fronts you are demanding from the community.
Well... I think javascript is a terrible broken language, but you can't deny that it's been chosen as the flagship language by Microsoft, or that at least 3 major players (Microsoft, Google, Apple) are investing serious time and money in the runtimes for it; or that those runtimes are actively deployed on the future of computing: mobile devices.
As a full time python developer I'm seriously concerned about how viable it is as a platform going into the future; this 'everything is fine' attitude is the problem.
WAKE UP.
Python is not doing fine.
It's growing as the target for scientific computing (which is great~), but I'd argue that's masking a downturn in the use of python for serious software engineering tasks, where the people who traditionally used it (webdevs, backend system tool makers, like disqus) are turning to tools with better performance and distribution tooling (like Go, node-webkit, etc) and less drama (py3).
I'm currently using Nimrod as a replacement for Python on a Bitcoin project. The code so far looks very similar to Python, but it has the performance of C with native code generation not dependent on a VM. The community has been amazingly helpful.
Everyone investing in JS is focused on it as an intermediate language. You support JS so things like asm.js port cleanly. Then you build a JIT which says 'oh hey, run this as native code'.
I would argue this means quite the converse - JS will die while being replaced by language ports which compile to JS. Python included.
It's not at all clear to me that asm.js is the endgame. Where are the big projects that are using that strategy? It certainly seems like a possible endgame, but it also looks like many big companies are writing a lot of JS.
I agree Python is, in my subjective opinion, the best language spec and community in programming. It's simply the most fun, productive environment.
But people are leaving; important people and newcomers and loyal stalwarts. I'm on the go right now so can't access links, but there have been a handful of posts here in the past 6mo or so of prominent members leaving, or sharing their disenchantment. Many other "I'm moving to go" posts. And the velocity of npm/go is undeniable (look at the stats). Remember, npm is largely for the server, where js shouldn't have an advantage.
I love Python. I want it to win. Maybe "lose" is too strong of a word, but it's certainly slipping.
Things fall apart; the centre cannot hold;
Mere anarchy is loosed upon the world,
The blood-dimmed tide is loosed, and everywhere
The ceremony of innocence is drowned;
The best lack all conviction, while the worst
Are full of passionate intensity.
I see a lot of Python developers switching to Go, and it's hard to deny that Python 3's adoption has been... underwhelming.
I think the future of programming languages clearly belongs to statically typed languages, and Python will not be part of this future unless it evolves in that direction.
I was very, very critical of Python programmers switching to Go for a while.
I recently got fed up with python and have been doing C, Go and Lua instead. I've started to rediscover programming and I'm enjoying myself much more. Go is very addicting.
It all started when I tried to get one of my projects that requires a lot of concurrent IO working in both Python 2.7 and 3.x. Gevent, of course, only works in Python 2, so I tried to switch to the asyncio family. This was a disaster.
I'm sad to see it go, but I've decided to mostly drop Python. I'll probably still use it for Flask apps (I seriously love Flask). But anything that requires heavy lifting will probably be done in Go or C.
> We have PyPy now; why not use the tremendous advances there in the next Python?
Because PyPy, as cool as the project is, is terrible to develop on unless you are a PyPy person. I would consider myself a reasonable programmer but PyPy makes me go crazy. Slow iteration times, super complicated code.
I would really like to see the suggested, simpler system tried. It is sometimes amazing how simplification can yield gains against common wisdom. And even if the new, simpler system brought no gains or small losses, it could be worthwhile to change, since the current system has its hurdles, which I have also stumbled over and which are not clearly communicated, since it is such a deep implementation detail of CPython. It was also mentioned that a simpler system could be even more powerful.
Thanks for the very informative post. I've been a full-time Python developer for only the past 7 or 8 months now. I had not in that time learned about the slot system or the CPython implementation.
I moved from Javascript to Python 2.7 and haven't looked back since. I now strictly use Python in all of my projects, and Javascript only because the browser doesn't support Python.
I wish that Python would introduce a few things from the Javascript world: easy async, and a real-time Meteor-like framework.