Processing a micropayment takes more resources than serving a simple Web page, even if the payment fails. You would have to put in similar filters, or you would get DoSed using the micropayment service.
The incentive to send http requests is that data comes back. That's why the storm of scrapers hurts website owners. They gather the data and give nothing back.
What would be the incentive to send failing payment requests?
To break the site. But you're right that a lot fewer people will probably want to break it than to scrape it, and that stuff like CAPTCHAs is usually more about the "scraping" case. So basically a mistake on my part.
Additionally, I have read up a bit more on the Lightning Network now, and it seems it isn't possible to send invalid payments in the first place.
The sender does not have a direct communication channel with the receiver. They send the payment to a hop they are connected to (one they have a channel with), and it gets routed on to the receiver. The first hop would already drop an invalid payment. If the sender spammed that hop with more invalid payments, all that would happen is that they'd lose their connection to the Lightning Network as their channel partners disconnected from them. The receiver would not receive a single network packet in the whole process.
TLS 1.3 forces PFS, which means that if you want to decrypt a 1.3 stream, you have to actually do a man in the middle attack, not just get a copy of a key. PFS was optional before.
It supports ECH, which lets you hide which service the client is trying to reach on a multitenant host or CDN. Given that Cloudflare supports ECH, and that it's possible to hide the fact that you're using ECH, that makes it possible to have connections that could actually be using any of a huge number of possible sites without passive spying equipment being able to tell which ones.
It removes a bunch of weak old primitives and options, and should generally be harder to misconfigure in a dangerous way.
That sounds like an excellent plan. It's nice to find an article containing information that lets one make that decision, eh? The free market in action.
You wanna prevent gift card fraud? Stop selling gift cards.
Gift cards are a huge fraud vehicle by their nature. Home Depot is just noticing because this is fraud against them, rather than the more usual money laundering for scams. Retailers turn a bit of a blind eye, since they make so much money from gift cards that never get used or end up with leftover balances. But really gift cards are an attractive nuisance, and add no value for the (non-sucker) consumer.
And the cameras won't be very effective after the first few arrests anyway. "Don't let the LPR catch your car" just becomes part of the tradecraft for these organized operations. Whereas sporadic, opportunistic, individualized ripoffs won't create much of a signature in the LPR stream.
> Database timeout (the database is owned by a separate oncall rotation that has alerts when this happens)
So people writing software are supposed to guess how your organization assigns responsibilities internally? And you're sure that the database timeout always happens because there's something wrong with the database, and never because something is wrong on your end?
No; I’m not understanding your point about guessing. Could you restate?
As for queries that time out, that should definitely be a metric, but not pollute the error loglevel, especially if it’s something that happens at some noisy rate all the time.
I think OP is making two separate but related points, a general point and a specific point. Both involve guessing something that the error-handling code, on the spot, might not know.
1. When I personally see database timeouts at work it's rarely the database's fault, 99 times out of 100 it's the caller's fault for their crappy query; they should have looked at the query plan before deploying it. How is the error-handling code supposed to know? I log timeouts (that still fail after retry) as errors so someone looks at it and we get a stack trace leading me to the bad query. The database itself tracks timeout metrics but the log is much more immediately useful: it takes me straight to the scene of the crime. I think this is OP's primary point: in some cases, investigation is required to determine whether it's your service's fault or not, and the error-handling code doesn't have the information to know that.
2. As with exceptions vs. return values in code, the low-level code often doesn't know how the higher-level caller will classify a particular error. A low-level error may or may not be a high-level error; the low-level code can't know that, but the low-level code is the one doing the logging. The low-level logging might even be a third party library. This is particularly tricky when code reuse enters the picture: the same error might be "page the on-call immediately" level for one consumer, but "ignore, this is expected" for another consumer.
I think the more general point (that you should avoid logging errors for things that aren't your service's fault) stands. It's just tricky in some cases.
> the database is owned by a separate oncall rotation
Not OP, but this part hits the same for me.
In the case where your client app is killing the DB through too many calls (e.g. your cache is not working), you should be able to detect it and react, without waiting for the DB team to come to you after they've investigated the whole thing.
But you can't know in advance whether the DB connection errors are your fault or not, so logging them to cover the worst-case scenario (you're the cause) is sensible.
I feel you're thinking about system-wide downtime with everything timing out consistently, which would be detected by the generic database server vitals and basic logs.
But what if the timeouts are sparse, only 10 or 20% above usual from the DB's point of view, yet they affect half of your registration service's requests? You need it logged application-side so the aggregation layer has any chance of catching it.
On writing to ERROR or not: the thresholds should be whatever your dev and oncall teams decide. Nobody outside of them will care; I feel it's like arguing about which drawer the socks should go in.
I was in an org where any single error below CRITICAL was ignored by the oncall team, and everything below that only triggered alerts on aggregation or under special conditions. Pragmatically, we ended up slicing it as ERROR = goes to the APM; anything below = no aggregation, just available when a human wants to look at it for whatever reason. I'd expect most orgs to come up with that kind of split, where the levels are hooked to processes rather than to some base meaning.
> No; I’m not understanding your point about guessing. Could you restate?
In the general case, the person writing the software has no way of knowing that "the database is owned by a separate oncall rotation". That's about your organization chart.
Admittedly, they'd be justified in assuming that somebody is paying attention to the database. On the other hand, they really can't be sure that the database is going to report anything useful to anybody at all, or whether it's going to report the salient details. The database may not even know that the request was ever made. Maybe the requests are timing out because they never got there. And definitely maybe the requests are timing out because you're sending too many of them.
I mean, no, it doesn't make sense to log a million identical messages, but that's rate limiting. It's still an error if you can't access your database, and for all you know it's an error that your admin will have to fix.
As for metrics, I tend to see those as downstream of logs. You compute the metric by counting the log messages. And a metric can't say "this particular query failed". The ideal "database timeout" message should give the exact operation that timed out.
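To make that concrete, here's a minimal Haskell sketch (hypothetical names; System.Timeout standing in for a real driver's deadline) of what I mean: the ERROR line names the exact statement that timed out, and a "db timeout" metric can be derived later just by counting those lines, while the counter alone could never point you at the query.

    import System.Timeout (timeout)

    -- Run a query action with a deadline; on timeout, log an ERROR line that
    -- carries the exact statement. A timeout metric can be derived by counting
    -- these messages, but the message keeps the detail a counter throws away.
    runWithDeadline :: Int -> String -> IO a -> IO (Maybe a)
    runWithDeadline micros sql action = do
      result <- timeout micros action
      case result of
        Nothing -> putStrLn ("ERROR: db timeout after " <> show micros <> "us: " <> sql)
        Just _  -> pure ()
      pure result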
This sort of thing is pretty meaningless unless the code is all written by people who know how to get performance out of their languages (and they're allowed to do so). Did you use the right representation of the data? Did you use the right library? Did you use the right optimization options? Did you choose the fast compiler or the slow one? Did you know there was a faster or slower one? If you're using fancy stuff, did you use it right?
I did the same sort of thing with the Sieve of Eratosthenes once, on a smaller scale. My Haskell and Python implementations varied by almost a factor of 4 (although you could argue that I changed the algorithm too much on the fastest Python one). OK, yes, all the Haskell ones were faster than the fastest Python one, and the C one was another 4 times faster than the fastest Haskell one... but they were still all over the place.
It's true this is a microbenchmark and not super informative about "Big Problems" (because nothing is). But it absolutely shows up code generation and interpretation performance in an interesting way.
Note in particular the huge delta between rust 1.92 and nightly. I'm gonna guess that's down to the autovectorizer having a hole that the implementation slipped through, and they fixed it.
The delta there is because the Rust 1.92 version uses the straightforward iterative code and the 1.94-nightly version explicitly uses std::simd vectorization. Compare the source code:
Take their Haskell implementation. If I compile it unoptimized (with a slight change to hardwire the iteration count at one billion instead of reading it from a file), I get bored waiting for it to finish after several minutes of clock time. If I compile it with "-O3", it runs in 4.75 seconds (it's an old machine).
Suppose I remove the strictness annotations (3 exclamation points, in places that aren't obvious to a naive programmer coming from almost any other language). If I then compile it unoptimized, it gets up to over 30GB of resident memory before I get bored (it's an old machine with a lot of memory). It would probably die with an out of memory error if I tried to run it to completion. However, if I compile that same modified code optimized, the compiler infers the strictness and the program runs in exactly the same time as it does with the annotations there. BUT it's far from obvious to the casual observer when the compiler can make those inferences and when it can't.
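For anyone who hasn't written Haskell, here's a hedged sketch of that style of loop (not the article's actual source), so you can see what the annotations look like. The bangs on the accumulator arguments are the strictness annotations in question: delete them and build without optimization and the thunks pile up; build with -O2 and GHC usually infers the strictness anyway.

    {-# LANGUAGE BangPatterns #-}

    -- Strict-accumulator loop in the Leibniz style (illustrative sketch only).
    -- The !s force i and acc to be evaluated on every iteration instead of
    -- building up a chain of suspended additions.
    leibniz :: Int -> Double
    leibniz n = go 0 0.0
      where
        go :: Int -> Double -> Double
        go !i !acc
          | i == n    = 4 * acc
          | otherwise = go (i + 1) (acc + sign / fromIntegral (2 * i + 1))
          where
            sign = if even i then 1 else -1

    main :: IO ()
    main = print (leibniz 1000000000)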
I had ChatGPT rewrite the Haskell code to use unboxed numeric types. It ran in 1.5 seconds (the C version takes 1.27). The rewrite mostly consists of sprinkling "#" liberally throughout the code, but also requires using a few specialized functions. I had ChatGPT do it because I have never used unboxed types, and you could argue that they're not common idiom. However, anybody who actually wrote that kind of numerical code in Haskell on a regular basis would use unboxed types as a matter of course.
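For the curious, a hedged sketch of what that style looks like (this is not ChatGPT's actual output, just the general idea): raw Int# and Double# primitives from GHC.Exts, with the specialized operators (+#, *#, +##, /##, int2Double#) standing in for the ordinary overloaded ones.

    {-# LANGUAGE MagicHash #-}

    import GHC.Exts

    -- Unboxed version of the same loop (illustrative sketch only): everything
    -- stays in raw Int# / Double# values, no boxed allocation per iteration.
    leibniz :: Int -> Double
    leibniz (I# n) = D# (go 0# 0.0##)
      where
        go :: Int# -> Double# -> Double#
        go i acc
          | isTrue# (i ==# n) = 4.0## *## acc
          | isTrue# (andI# i 1# ==# 0#) =
              go (i +# 1#) (acc +## (1.0## /## int2Double# (2# *# i +# 1#)))
          | otherwise =
              go (i +# 1#) (acc -## (1.0## /## int2Double# (2# *# i +# 1#)))

    main :: IO ()
    main = print (leibniz 1000000000)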
You're acting like this is a gotcha, but the answer is obviously "all of them" and that indeed, this tells you interesting things about the behavior of your compiler. There are lots of variant scores in the linked article that reflect different ways of expressing the problem.
But also, it tells you something about the limitations of your language too. For example, the biggest single reason that C/C++ (and languages like Fortran/Zig/D and sometimes C# and Rust whose code generation is isomorphic to them) sit at the top of the list is that they autovectorize. SIMD code isn't a common idiom either, but the compiler figures it out anyway.
And apparently Haskell isn't capable of doing enough static analysis to fall back to an unboxed implementation (though given the simplicity of this code, that should be possible I think). That's not a "flaw" and it doesn't mean Haskell "loses", but it absolutely is an important thing to note. And charts like this show us where those holes lie.
> You're acting like this is a gotcha, but the answer is obviously "all of them" and that indeed, this tells you interesting things about the behavior of your compiler.
They tell me interesting things if I know enough about the language to know the difference. It tells me things if I'm getting into the weeds with Haskell specifically. That doesn't make the big comparison chart useful in any way.
I still don't know anything that lets me compare anything with any other language unless I actually know that language nearly as well. And I definitely don't get much out of a long list of languages, most of which I know not at all or at most at a "hello world" level, with only a couple of the entries tagged with even minimal information about compilers or their configurations at all. Especially when, on top of that, I don't know how much the person writing the test code knew.
At most I get "this language does a pretty good/poor job on this type of task when given code that may or may not be what a 'native expert' would write."
And that's not news. Nobody (with any sophistication) would write that code for real in Python, or probably in Haskell either, because most seasoned programmers know that if you want speed on a task like that, you write it in a more traditional compiled procedural language. It's also not a kind of code that most people write to begin with. If you want an arctangent (which is really what it's doing), you use the library function, and the underlying implementation of that is either handcrafted C, or, more likely, a single, crafted CPU instruction with some call handling code wrapped around it.
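(For the record, the series in the benchmark is the Leibniz series for atan 1, so the "just use the library" version of the whole program is a one-liner; in Haskell, for instance:)

    -- 1 - 1/3 + 1/5 - ... converges to atan 1 = pi/4, so the library version
    -- of the whole benchmark is a single libm-backed call.
    main :: IO ()
    main = print (4 * atan 1)   -- 3.141592653589793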
So what is the overall chart giving me that I can use?
> So what is the overall chart giving me that I can use?
"If you write in C or an analog, your math will autovectorize nicely"
"If you use a runtime with a managed heap, you're likely to take a penalty even on math stuff that doesn't look heap limited"
"rust 1.92 is, surprisingly, well behind clang on autovectorizable code"
I mean, I think that stuff is interesting.
> If you want an arctangent (which is really what it's doing), you use the library function
If you just want to call library functions, you're 100% at the mercy of whatever platform you picked[1]. So sure, don't look at benchmarks, they can only prove you wrong.
[1] Picked without, I'll note, having looked carefully at cross-platform benchmarks before having made the choice! Because you inexplicably don't think they tell you anything useful.
> Note in particular the huge delta between rust 1.92 and nightly. I'm gonna guess that's down to the autovectorizer having a hole that the implementation slipped through, and they fixed it.
The benchmark also includes startup time, file I/O, and console printing. There could have been a one-time startup cost somewhere that got removed.
The benchmark is not really testing the Leibniz loop performance for the very fast languages, it's testing startup, I/O, console printing, etc.
Counterproposal for actually useful, effective Internet regulation that would actually do something about the negative effects of "social media":
1. Ban paid advertising, of all kinds, everywhere, without exception. It's an incentive to optimize for engagement, and it's the root of basically all evil on the Internet. People are just gonna have to pay for things. I do not give a fuck if it destroys Google or whoever.
2. Ban collecting any information not intrinsically necessary to deliver a specific Internet service that the user has actually asked for. That includes the user's name. Require zero-knowledge attribute-based authentication of anything that specifically needs to be proven. Require accepting cryptographically anonymous payments. Even if specific information is necessary, you are just going to have to shut down until you have the infrastructure to collect it without getting anything else. The ability to get this information is another incentive to optimize for engagement.
3. Ban sharing even the actually necessary information, except as necessary to cooperate to provide some service that, again, the user has actually asked for. Considerable work involved in defining what "sharing" means, but something a hell of a lot tighter than the GDPR. And notice that I didn't say "sharing without opt-in". Unless you actually need it to provide a service to the user, you can't do sharing even with permission.
4. Just in case there are some incentives left, ban selection/recommendation algorithms that optimize for engagement or for anything that smells like engagement. You can have an exception for a user getting their own recommendation system from a third party that shares no control with the actual providers or carriers of the content.
5. Ban terms of service that prohibit scraping or third-party clients for centralized services. Consider requiring that everything over a certain size have a stable API that can do anything the regular UI can. This makes it harder to force people to keep using manipulative stuff.
6. Ban carrier NAT. Require IPv6 to be turned on wherever IPv4 is turned on, with every retail subscriber given at least thousands of stable addresses. Ban "no servers" contracts. Ban "safety filtering" by ISPs unless customers can disable it trivially. Ban traffic prioritization by ISPs. This may allow the Internet to (slowly and uncertainly) heal back toward being a truly decentralized system.
7. Actually enforce your laws against fraud, unfair business practices, etc.
... or you can just fuck around and make kids' lives miserable, I guess. They don't vote.
You seem to think this is somehow specific to government. It is not. And, no, the market does not eventually destroy the organizations where it happens.