That is very strange. I cannot reproduce your results in C++ using your code.
On my machine (Threadripper 2950x, 64 bit Windows 10, GCC 10.1.0 from MSYS2 MinGW64) my algorithm performs best and your branchless version ends up slower than your branchy one:
Maybe AMD or Windows doesn't like your branchless code or perhaps GCC is generating inferior code for AMD/Windows, as at least on Intel/Linux your Lomuto branchless code can beat the branchy code. But pdqsort (which uses branchless Hoare partitioning) consistently beats both.
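For readers following along, the branchless Lomuto variant under discussion can be sketched roughly like this in C++ (a simplified illustration, not the exact code from either benchmark; the function name and int-array signature are mine):

```cpp
#include <cassert>
#include <cstddef>

// Simplified sketch of a branchless Lomuto partition: the two-element
// exchange is performed unconditionally, and the store index advances
// by the boolean result of the comparison, removing the
// hard-to-predict branch that plain Lomuto has on random data.
std::size_t partition_branchless(int* a, std::size_t n, int pivot) {
    std::size_t store = 0;
    for (std::size_t i = 0; i < n; ++i) {
        int x = a[i];             // element under consideration
        bool smaller = x < pivot; // computed before the exchange
        a[i] = a[store];          // unconditional exchange
        a[store] = x;
        store += smaller;         // branch replaced by an add
    }
    return store;                 // a[0..store) < pivot <= a[store..n)
}
```

Whether this beats the branchy version then comes down to how the compiler and the branch predictor treat it on a given CPU, which is consistent with the AMD/Intel discrepancy above.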
(Author here.) That is incorrect. The difficult case, and the real benchmark, is unpredictable data. The low-entropy cases will be about as fast with both partitioning schemes.
This is only a small part of the benchmarks I've run, because I wanted to drive one point home within a limited space. I've run many tests on various data types and shapes, and Lomuto does better than Hoare on most. (E.g., its improvement on double is even larger than on integral types.)
I'm confused: what exactly is incorrect? I said that if we don't have more realistic benchmarks, I wouldn't just assume it's faster. Now you're saying you did in fact do more realistic benchmarks, and you found the lower-entropy ones to exhibit the same performance (which is not faster). Those all seem consistent with each other?
I also said that sorting is often more complicated than just comparing integers by their values, which you didn't really address except to mention doubles, which missed the point I was making (I was trying to say sorting often needs e.g. satellite information).
In any case though, if you've done other benchmarks, it'd be nice if you could post them on GitHub or something. Maybe I'm missing something.
Abstract: Over the years, a few programming paradigms have been successful enough to enter the casual vocabulary of software engineers: procedural, imperative, object-oriented, functional, generic, declarative. There's a B-list, too, that includes paradigms such as logic, constraint-oriented, and symbolic.
The point is, there aren't very many of them altogether. Easy to imagine, then, the immensely humbling pressure one must feel when stumbling upon a way to think about writing code that is at the same time explosively productive and firmly removed from any of the paradigms considered canon.
This talk shares early experience with Design by Introspection, a proposed programming paradigm that has enough demonstrable results to be worth sharing. The tenets of Design by Introspection are:
* The rule of optionality: Component primitives are almost entirely opt-in. A given component is required to implement only a modicum of primitives, and all others are optional. The component is free to implement any subset of the optional primitives.
* The rule of introspection: A component user employs introspection on the component to implement its own functionality using the primitives offered by the component.
* The rule of elastic composition: A component obtained by composing several other components offers capabilities in proportion to the capabilities offered by its individual components.
These rules, and how to use them to build powerful software, are the topic of this talk.
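For readers without D at hand, the rule of optionality and the rule of introspection can be approximated in C++17 with a detection trait and `if constexpr` (a hypothetical sketch; `Wrapper` and `has_reserve` are illustrative names, and D's `static if`/`__traits` make the same pattern far more direct):

```cpp
#include <cstddef>
#include <list>
#include <type_traits>
#include <vector>

// Detection trait: does the component offer an optional reserve(n)
// primitive? (The rule of optionality: only push_back is required.)
template <class C, class = void>
struct has_reserve : std::false_type {};
template <class C>
struct has_reserve<C, std::void_t<decltype(
    std::declval<C&>().reserve(std::size_t{}))>> : std::true_type {};

// The rule of introspection: the wrapper inspects its component and
// forwards the optional primitive only when it exists; otherwise the
// call degrades to a no-op instead of a compile error.
template <class Container>
struct Wrapper {
    Container c;
    void reserve(std::size_t n) {
        if constexpr (has_reserve<Container>::value) c.reserve(n);
    }
    void push(typename Container::value_type v) { c.push_back(v); }
};
```

With this, `Wrapper<std::vector<int>>` forwards `reserve` while `Wrapper<std::list<int>>` still compiles and works, which is the elastic-composition behavior the abstract describes.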
Excellent talk, it really makes one envious of the compile time facilities offered by D.
Do you see any other languages out there being amenable to at least ideologically similar techniques, or do you think this design paradigm will remain unique to D for the immediate future? In particular, I think some of the modern JIT-compiled languages, which often feature extensive reflection and runtime introspection capabilities, could theoretically be used in a similar way, as long as the compilation overhead incurred on the first run remains reasonably small. What are your thoughts on this?
My background was high-assurance systems and security plus regular software/system development. Mainly do R&D now while evangelizing the field's developments. The academic and industrial sides of the field keep making great progress in transformation (see Semantic Designs), static checking (the Astree Analyzer), compilation (CompCert), better optimization, and so on. The main thing that hurts such work, though, is when the language and/or its standard gets too complex or vague. C++ was horrid in this regard.
So, as you try to formalize it, I encourage you to try to remove as much ambiguity and processing complexity as possible. Maybe also work with academics who specialize in formal specs or methods and who caught many defects in prior language or system specs. The result is that researchers interested in building assurance and QA tools for D will have a much easier time. Just look at how long it took to certify a C compiler vs. ML and LISP compilers, despite the latter languages being much more powerful.
The reason it's important, whether an immediate concern or not, is that much of the best stuff comes from cash- and time-strapped, but smart, academics trying to make a name or push the state of the art. The easier your language is to work with, the more of them might choose it. And you've done a really good job on a C++ replacement that I thought would've had more adoption. So I'd like to see some of those brains get put on your work, too. :)
Looks like the target audience for C++ mindshare is being taken away by Golang and Rust. When I think of "D" I think of nothing but "oh, yeah, they use it at Facebook". Who else uses it? How do you see Golang and Rust as competition?
D would definitely be helped by a large corporate sponsor. I think the foundation will make the language a more serious alternative for such adoption.
Competition is good for all involved. Rust and C++ are the competitors closer to the same turf. I think we have a solid value proposition and several ways to enhance it and differentiate ourselves. Time will tell.
We built all of our machine learning backend in D at AdRoll. Some examples:
- learning of large-scale classifiers and regressors using custom optimizers
- real-time pricing of billions of ads a day using these models on ad exchanges: <0.5ms latency to parse complex bid requests and compute sparse and SIMD dense dot products
- a real-time event processing system that hits DynamoDB with ~4.5K JSON queries per second on a single node
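To make the pricing workload concrete, the hot loop in that kind of model scoring is typically a sparse-against-dense dot product; a minimal C++ sketch of the idea (hypothetical, not AdRoll's actual code, which is in D):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Score one bid request: a sparse feature vector (parallel arrays of
// feature indices and values) dotted against a dense model weight
// vector. The gather from 'weights' is the memory-bound part.
double sparse_dense_dot(const std::vector<std::size_t>& idx,
                        const std::vector<double>& val,
                        const std::vector<double>& weights) {
    assert(idx.size() == val.size());
    double acc = 0.0;
    for (std::size_t k = 0; k < idx.size(); ++k)
        acc += val[k] * weights[idx[k]];  // gather + multiply-add
    return acc;
}
```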
We literally have D systems deployed on hundreds of EC2 instances as we speak, responsible for mission-critical tasks of a >$100M run-rate company.
D is ready for prime time and works at scale.
Is there a lot of C++-to-Go traffic? I have not seen much along those lines. Go doesn't remotely look like a language that a C++ programmer would be interested in. Given its limitations, Go is an OK enough language, but that's about it.
Pardon my ignorance, but do you have some kind of roadmap for D? What features do you prioritize developing or enhancing?
Will D focus on concurrency (Erlang's domain) first, RAD web apps/services (PHP, Java, and Ruby's domain), or something else?
I have quite a few things in mind for the immediate future. Organizationally, I want to get the D Language Foundation rolling as an organizational mothership of the language. Technically, I want to focus on: (a) completely defining the language - fuzzy corners such as the meaning of "shared" are a liability; (b) offer a solid experience to users who don't want a garbage collector; and (c) design more libraries using the fledgling Design by Introspection technique.
(EDIT: sorry, submitted too soon.) With regard to targeted users and uses, D is an ample language that could be used for a large range (heh) of applications. I want to make sure the core language and libraries offer solid support on which various applications, frameworks, and libraries can build.
>> D is an ample language that could be used for a large range (heh) of applications. I want to make sure the core language and libraries offer solid support on which various applications, frameworks, and libraries can build.
The advice given in marketing new products is often the opposite: find a niche, offer much better value than competitors for that niche, become the leader of that niche, and use that base to expand.
I wonder if this kind of advice offers a better route to success for a programming language? What has been the case historically?
Ruby was around for 10 years before Rails released version 1.0. Rails itself didn't get popular until version 2, 4 years later.
It takes a long time for a language to mature to the point to where someone loves it so much they're willing to bet their career on it, as DHH did.
The best thing for a language designer to do IMO is not to try to decide what the killer app is, but rather to make it as nice as possible so that more and more people want to use it to solve problems.
Rails isn't the best web development stack because of any kind of monolithic effort on the part of the Ruby core team. It was built on top of the work of dozens of programmers who did things like write HTTP client libraries, HTML parsing tools, templating languages, all the things that a framework relies completely on.
My advice: You can only focus on so much at once. Let the community build the tools. You build the language. You'll never be able to do both, there just isn't enough time.
There's so much other stuff for you to do. Figure out governance, setting a tone for the community, reaching out to heavyweights so as to raise your profile, working out what your relationship to corporate sponsors is, how you're going to ensure stability across versions, package management. The countless little things that make one language more of a joy to work with than another.
I'm pretty sure there's a market for D on consoles and mobile gaming. This is the first thing I get asked every time I suggest using D at work (we mainly do casual games while also porting AAA titles with a few internal engines).
So far I've only been successful with a few shell scripts calling into the console SDK's tools, doing mass reflection on 75k shader object files, among other things. Yet I do half of my home projects in D and absolutely love it.
I think getting D to interoperate with more existing code in all of C#, Java, and Objective-C, as well as on both mobile devices and game consoles, will greatly help its adoption via the "oh, it's just another library we add to our existing codebase" factor, which has tremendously helped Clojure take off.
This means D could make it easy to extend both JVM and CLR applications with lightning-fast code while getting free access to their respective ecosystems. I for one would love to use D instead of C# to write Unity code.
Clojure isn't really a fair comparison. Clojure hasn't got low-level interop with all those things, just JVM, and that interop is bought at the cost of closely tying Clojure to JVM. I don't think D can be changed in all the basic ways that would be required to give it the same level of integration with JVM, and that would also basically mean abandoning the entire standard library. JVM's difficulties integrating with things other than JVM aren't something I imagine D can really solve.
I meant things such as extern(C#) and extern(Java) to work the same way extern(C) and extern(C++) do now.
I don't see the point of compiling D to JVM or CLR bytecode, given all the effort it would involve; rather, have the CLR or JVM load native code written in D, with each able to call into the other easily.
Say I can't reach the performance I want in C# and I want to write that part of the system in D and load it in Unity and then call into native D code from there. Without having to mess with the CLR implementation details. That would definitely allow people to try it out and integrate it gradually until it has completely taken over.
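A minimal sketch of that pattern, assuming a plain C ABI as the bridge (which both the CLR's P/Invoke and the JVM's native-interface mechanisms can consume; the function and library names here are illustrative, and the native side is shown in C++ as a stand-in for D's `extern(C)`):

```cpp
#include <cassert>

// Exported with C linkage so the symbol is consumable from managed
// code without C++/D name mangling. In D the equivalent would be an
// extern(C) function compiled into a shared library.
extern "C" double fast_dot(const double* a, const double* b, int n) {
    double acc = 0.0;
    for (int i = 0; i < n; ++i)
        acc += a[i] * b[i];  // hot numeric kernel kept native
    return acc;
}
```

On the C# side this would be declared roughly as `[DllImport("fastlib")] static extern double fast_dot(double[] a, double[] b, int n);` (library and function names hypothetical), which is the "load native code and call into it" route described above.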
1. I got std.experimental.allocator accepted for inclusion. On its basis it's not difficult to build the structures and algorithms necessary for garbage collection.
However, there were two concurrent developments. One is that Martin Nowak and others did great work on improving D's existing garbage collector. The other trend is that it became increasingly clear that a category of users will always be wary of a GC, be it for the right or wrong reasons. So the better place to hit is to offer a great experience in D without requiring any garbage collection at all.
2. Mobile is an important area for a language like D. We don't have many experts on the team to work on that, but we're looking. One thing I can do soon is to encourage ARM support, which right now works only experimentally and in a science-project kind of way.
3. I want to make it possible to use a well-defined subset of the standard library without a GC. (The subset part is for legacy compatibility.)
I want to work on something where I can make a difference, and I think this is it. Paul Graham wrote (paraphrased from http://www.paulgraham.com/procrastination.html) that one should at best work on things likely to be part of one's obituary. At this point I feel there'd be little difference if my obituary read "worked for 6 years at Facebook" vs. 7, 8, etc. But if I helped people get code written better, that's a lure I wasn't willing to pass up.
Probably not; D is not a language particularly suited to being a backend target. To be fair, neither is JavaScript; it got into that position by accident.
That said, just as a related aside, there's Daniel Murphy's magicport program that translates real projects from C++ to D semi-automatically (Daniel used it to translate the D compiler itself; we recently committed to bootstrapping).
I think the lack of demand for it has been an indicator that ranges with additions are just enough. We're currently looking at adding optional primitives to ranges in a Design by Introspection manner to support bulk streaming of data.
I would definitely love to see range primitives support bulk streaming!
How about adding a byte buffer concept alongside ranges? I'm thinking of an API similar to Netty's ByteBuf but using D's compile-time capabilities instead. These would be valid ranges as well but offer direct byte access on top.
Also, while on the subject of status updates, what's the status of std.reflection? Any chance UDAs could be included in there? :)
It would be great if the reflection info could be output to a separate file for those cases where it's only needed during program initialization or used by external tools.
Aren't ranges and streams conceptually different? Ranges are for data traversals and streams are for data transfers.
I want more than empty, front and popFront to manipulate byte buffers with high performance.
I'm thinking about reading data structures, performing bit conversions, and extracting primitive values from untyped byte arrays, going as far as reusing the array's memory for the values being read, which isn't possible with ranges.
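As an illustration of the gap being described, here is roughly what such a byte-buffer reader adds over `empty`/`front`/`popFront`, sketched in C++ (`ByteReader` is a hypothetical name; a D version would presumably be a template over the element type and remain a valid range):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical byte-buffer reader: extracts typed primitives directly
// from an untyped byte array, advancing through it like a range.
struct ByteReader {
    const std::uint8_t* p;
    std::size_t len;

    // Read one primitive value and advance past its bytes.
    // memcpy (rather than a pointer cast) avoids alignment and
    // strict-aliasing problems and compiles down to a plain load.
    template <class T>
    T read() {
        assert(len >= sizeof(T));
        T v;
        std::memcpy(&v, p, sizeof(T));
        p += sizeof(T);
        len -= sizeof(T);
        return v;
    }

    bool empty() const { return len == 0; }  // range-style primitive
};
```

This only covers the "extract primitives" part; reusing the array's memory for the values being read, as suggested above, would additionally require alignment guarantees that a sketch like this doesn't provide.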