No evidence for nudging =/= nudging doesn't exist.
I'm fairly sure anyone who has done A/B testing at scale has plenty of evidence that nudging works. Perhaps not up to the standard of science, but there are literally people who manipulate choice architecture for a living and I'm fairly convinced a lot of that stuff actually works.
"... evidence that nudging works. Perhaps not up to the standard of science..." That's pretty close to saying it doesn't work. The point of this meta-study was precisely to show that the evidence claimed to support nudging was probably attributable to random variation + unnatural selection, where the unnatural selection was publication choice: either the researchers who got negative (null) results chose not to bother writing it up and submitting it, or papers that reported negative were rejected by publishers.
There are lots of people who do X for a living, but where X doesn't work: palm readers, fortune tellers, horoscope writers, and so on. I'm not even sure that funds managers reliably obtain results much above random.
I think what’s not clear is what’s in those papers, what exactly they have to say about nudging, and what definition they’re using. It strains credulity to think that changing defaults in software doesn’t change behavior, if only because most users aren’t technically savvy enough to change their settings.
On the other hand, the dream of nudge theory is something like a study done in the UK suggesting that adding the line “most of your fellow citizens pay their taxes” increases the likelihood that people pay their taxes. Here I’d be more willing to believe the benefits are unclear and, more importantly, difficult to replicate across time and culture.
It seems that trying to do a meta-analysis on all of nudge theory (or large categories of it) would indeed show no impact. It’s not like you’re testing one thing: you’re comparing well-designed programs with ones that aren’t.
To say things a different way, I don't think this study will change anything for people actually doing choice architecture in applied settings. They have results that speak for themselves.
This is exactly how a midwife explained to me why she uses magic crystals. She told me that there's science, and there's results, and that she's seen the crystals work.
Obviously they don't work by magical vibration, but are you sure they don't work at all? If the midwife feels and acts more confident from having that tool or the mother feels more relaxed because she thinks they will make the process easier, then the crystals do, in fact, work. They just don't work through the mechanism those individuals think they do.
I mean, yeah, if she has solid RCT data on thousands to millions of childbirths and has found a statistically significant impact from using the magic crystals, I would support their use. A/B testing and scientific research rest on the same basis.
The issue is that in fact the midwife will not have such data. The comparison being made is that A/B testing, if run competently, is pretty close to scientific research, in particular for research related to nudging.
I wonder how many engineers crack open a statistics book to find the correct test versus just plotting box plots and saying "see looks pretty different"
"I don't think this study will change anything for people actually doing choice architecture in applied settings." Probably true, but then evidence that horoscopes etc. don't work, doesn't prevent people from drawing horoscopes, or other people from relying on their horoscope to plan out their day.
"They have results that speak for themselves." Let me put my point differently. Suppose that nudges don't have any effect at all (null hypothesis). More concretely--and just to take a random number--suppose that 50% of the time when a nudge is used, the nudgees happen to behave in the direction that the nudge was intended to move them, and 50% of the time they don't move, or they move in the opposite direction. And suppose there are a number of nudgers, maybe 100. Then some nudgers will get better than random results, while others will get no result, or negative results. The former nudgers will have results that appear to speak for themselves, even if the nudges actually have no effect whatsoever.
This is the same as asking if a fair coin is tossed ten times, what is the probability that you'll get at least 7 heads. The probability of such a number of heads in a single run is ~17%. So 17% of those nudgers could be getting apparently significant results, even if their results are actually random.
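That ~17% figure can be checked directly against the binomial distribution; a quick sketch in Python, using the same numbers as the example above:

```python
from math import comb

# P(at least 7 heads in 10 tosses of a fair coin)
# = sum over k = 7..10 of C(10, k) / 2^10
p = sum(comb(10, k) for k in range(7, 11)) / 2**10
print(p)  # 0.171875, i.e. roughly 17%
```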
I think gp and you probably see eye to eye, but gp has a problem with your phrasing.
If the effect does not live up to scientific rigour, that (more or less) implies that the effect is roughly indistinguishable from randomness.
If folks have results that speak for themselves, then the effect more than likely is scientifically rigorously testable. It may already have been - by those very results.
Seriously, what about that kind of publication bias: A/B tests don’t get published.
If you run a useful system where it would be meaningful and interesting to know whether a social science theory actually applied, you might run an A/B test to see if it works. If it works, it is adopted—but it is almost never published. And that is for two reasons: 1. no incentive to publish and 2. major incentive not to publish. #2 is recent (post Facebook experiment) and it is specifically because a large portion of the educated public accepts invisible A/B testing but recoils with moral indignation at the use of A/B testing results in published science. Too bad: Facebook keeps testing social science theories, but no longer publishes the results.
The standards of selecting a result of an A/B test are less stringent than those of publication for the advancement of knowledge. For publication, the goal is to determine whether a model is accurate. For A/B testing, the goal is to select the best design/intervention. The difference is that for scientific testing "inconclusive" means that there isn't enough evidence to consider it a solved problem and it should have more research, while in A/B testing "inconclusive" means that any effect is small so you should pick an option and move on.
As an example, suppose I flip a coin 1000 times and get heads 525 times. The 95% confidence interval for the probability of heads is [0.494, 0.556], so from a scientific standpoint I cannot conclude that the coin is biased. If, however, I am performing an A/B test, I would conclude that I'll bet on heads, because it is at worst equivalent to tails.
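The interval in that example follows from the standard normal approximation to the binomial; a quick sketch with the same numbers:

```python
from math import sqrt

heads, n = 525, 1000
p_hat = heads / n
# 95% CI via the normal approximation:
# p_hat +/- 1.96 * sqrt(p_hat * (1 - p_hat) / n)
margin = 1.96 * sqrt(p_hat * (1 - p_hat) / n)
lo, hi = p_hat - margin, p_hat + margin
print(f"[{lo:.3f}, {hi:.3f}]")  # [0.494, 0.556]: the interval contains 0.5
```

Because the interval contains 0.5, the scientific conclusion is "inconclusive"; the A/B conclusion is "heads is at worst a tie, so pick heads."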
I think you are missing the point. With academic publication bias, sometimes an unbiased coin gets heads 600 times by chance. Those studies get published. But, if you ran the test again, you might only get 525. That study won’t get published.
And, contrary to your assumption, there is nothing to prevent A/B tests from being published to high academic standards: a low p-value and tons of n. In an academic context that’s just fine; it’s a small but significant effect.
A/B tests are simply controlled experiments—which are the gold standard of scientific evidence generation in psychology. My point is that the main generators of this evidence are only permitted to use it to inform commerce, not public knowledge. That is a loss for science and public policy, in my opinion.
They note that there is no evidence for nudging being generally effective. So any individual nudge could still be effective (except in finance, where they found that none are).
From what I’ve seen there is even more incentive to focus on positive A/B tests. It’s the way you get credit for your work at a company. A negative test is counted as barely anything. So your incentive is to run tons of tests, then cherry pick only the positive ones and announce them widely. Another strategy is to track multiple metrics for each test and not adjust for that when computing p values. But then at the end you only report the one metric that was positive.
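That uncorrected multiple-metrics problem is easy to quantify. Assuming k independent metrics with no real effect, each tested at alpha = 0.05, the chance of at least one spurious "win" grows quickly (a simple sketch, not a model of any particular company's setup):

```python
alpha = 0.05
# Chance that at least one of k independent null metrics comes up
# "significant" when no correction for multiple comparisons is applied
false_positive_rate = {k: 1 - (1 - alpha) ** k for k in (1, 5, 10, 20)}
for k, p_any in false_positive_rate.items():
    print(f"{k:2d} metrics -> {p_any:.0%} chance of a spurious win")
```

A Bonferroni-style correction (testing each metric at alpha/k) is one standard way to keep the family-wise error rate near 5%, which is exactly the adjustment that gets skipped in the scenario above.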
> Unreal and unity are monsters that allows beginners to have quick access to a high end rendering system, but then it doesn't leave enough room to add new cool things, like cool gameplay, a good multiplayer system, etc, when you want to ship on console who have limited memory.
I'm not following you here. These engines are exactly what enable you to add new cool things, and not focus on implementing lower level details.
Not to mention the huge library of games from B to AAA that use Unreal/Unity with things like cool gameplay, multiplayer, and multi-platform support (PC and consoles)... this comment made no sense.
The comment hits closer to the bullseye than you realize. When you buy into an engine like UE and especially Unity you look at the successes on consoles but in reality you're in for a world of hurt if you want to achieve the same level of perf on the older consoles. It's like the difference between a stock WRX and a rally team prepped WRX - same car sure but the perf is vastly different and if you want to compete on that level you have a lot more work ahead of you.
The format of the World Championship in chess has historically been a point of contention, and the current format is no exception.
The Candidates tournament has some seemingly arbitrary qualifications that players must meet, and you could argue that the format doesn't necessarily produce the strongest player to challenge the world champion.
The World Championship match itself is problematic because it gives the defending champion a fairly huge advantage, in that they retain the title if they can draw out the match. More recently the match goes to rapid-chess tiebreaker rounds instead, so in practice the Championship is decided by those tiebreakers, which doesn't really seem appropriate either.
Given the prep time players have and the engines available, players go into these matches extremely well prepared and draws over the board are quite a typical outcome unless someone makes a mistake.
I believe Magnus wants the Championship to become a knockout tournament to reduce the advantage that so much prep time can give. There is a big difference between prepping for a field of 12 players versus prepping for a single opponent.
> The World Championship match itself is problematic because it gives the defending champion a fairly huge advantage, in that they retain the title if they can draw out the match.
This has not been the case for quite a while. All of Carlsen's matches (Anand, Anand, Karjakin, Caruana, Nepomniachtchi) had tiebreakers in the format to guarantee a clear winner. The Karjakin and Caruana matches were decided this way.
Interesting to get a more raw insight into Mark's thinking. In some ways it's insightful and prescient, but it also feels like there's desperation and a kind of throwing spaghetti at the wall to see what sticks. I suppose Facebook has/had the resources for plenty of spaghetti throwing, though.
Another observation is that this would have been the moment for Facebook to lean into short video content a la TikTok. But it seems like video content was just an afterthought for Zuck. Hindsight is 20/20, I suppose, but it's interesting that they almost got there. Vine already existed at this point, and I guess Zuck did not view it as a threat. Perhaps that's one downside of the "defensibility" mindset that seems to pervade this writing and most of the ideas. I get the sense that this is Zuck responding to competitors, not really crafting a unique vision for Facebook as its own entity.
Gentile here, but even I tense up when I see discussions about the Talmud show up in "Gentile spaces". Some reasons:
1) Conspiracies. There's a millennia-long history of conspiracy theories involving the Talmud, and talk about the Talmud often sees conspiracy theorists surface, which sidetracks any rational discussion.
2) Religion. To this day, there is a significant group of religious Jews who take the Talmud extremely seriously, as Scripture or at least as Scripture-adjacent. This annoys some people and tends to cause aggressive conversations.
3) Politics. Anything that can be connected to politics turns into a political argument. The Talmud seems to serve as easy fodder.
4) Misunderstandings. The Talmud is a complicated set of documents written for a very specific in-group, and it isn't easy to understand without a lot of help. There are various misunderstandings of it that someone often decides to share as though they were facts. Trying to correct or add nuance to these misunderstandings then quickly devolves into a discussion of conspiracies, religion, or politics.
There are a lot of prejudgments based on very little knowledge. I remember the comment section of a YouTube video about the eruv in New York, where the comments were either "interesting, do you know [innocent question]" with replies like "not a Jew, but [patently wrong information]", or were just pointing at the othered minority and their silly ways. I'm not Orthodox anymore, but my otherness from the Orthodox community is different from that of someone who didn't grow up in it and judges it or its teachings or trappings nonetheless.
In this case, the comments are pretty neutral to positive, but I was rather worried there would be a lot of "Jews study this silly ancient text full of nonsense" going about.
Probably due to a pattern of Very Confident Statements from people who know a little bit about the subject. There may be things you are expert in that are similar.
On occasion throughout history, people have misquoted, misunderstood, or misrepresented things about Jews and their writings for a variety of reasons, sometimes with rather negative consequences. Innocuous discussions can turn into Happy Tree Friends real quick.
It's pretty racist. I mean, yeah, no doubt if you read and understand the whole thing like a regular scholar, every word, all that racism disappears. But a casual reader will tend to get the wrong idea.
Upon searching for a few words here, this list appears to have been copy-pasted between a bunch of antisemitic sites. It looks like ‘Hadarine’ isn’t even a real book — the only results are for very similar lists on the aforementioned antisemitic sites. There are also problems in translation: e.g. ‘filth’ is a highly misleading translation of niddah. Besides, this cherry-picked list leaves out any relevant context, which is supremely important in understanding or translating the Talmud. I wouldn’t give this post any credence as to what Judaism is actually about.
> So several sages together came to these cherry picked conclusions.
To clarify my post: I meant that out of the many thousands of opinions recorded in the Talmud, that list selectively quoted five or six phrases which sound especially bad when taken out of context, while ignoring absolutely everything else. That's what I meant by 'cherry picked'.
(Your other responses, especially the last few lines, are fairly common misunderstandings of Judaism and Jewish history, and I have neither the time nor the inclination to try and debate them online just right now.)
As a hobbyist I found Unreal to be far better than Unity. The technical debt that Unity has continued to accrue starts impacting you even as a casual user. It's a bit of a mess to be honest.
Unreal on the other hand has a very solid architecture and goes out of its way to be reliable and maintainable. Plus you have the full source available in your project to search through if you want to dig into the code. The code is generally well written enough that you can use the source code over documentation if you prefer. Blueprints are also great if you are prototyping things.
Also if you are interested in doing networked multiplayer, there is no comparison. Unreal's networking is fantastic, while Unity doesn't really have networking built in.
I can't take any hobbyist game engine that doesn't implement multiplayer out of the box seriously. It's so stupid. QuakeWorld had it in 1996, and because game devs are too afraid of networking, you get a bunch of people saying "Oh, it's too hard," or "Yeah, but your networking model depends on what game you're shipping."
No you dunces, you will always need to serialize something. You will always need some semblance of a tick rate, or at least game events. Every simple feature that engine developers don't write increases the risk for users, who will inevitably not implement it themselves.
Unreal engine has native support for networking. You add some configuration to your objects to determine how they should be replicated over the net and flip a switch in the settings, and now you have a multiplayer game.
An unsolved problem from three years ago is still an unsolved problem. And you don't actually rebut any of his claims; you just give a poor summary of MEV.