Hacker News | deathanatos's comments

> Exactly! Therefore, use IEC 60027 prefixes like kibi-, because they are the ones that reflect the binary nature of computers. Only use SI if you genuinely respect SI definitions.

You have to sort of remember that these didn't exist at the time that "kilobyte" came around. The binary prefixes are — relatively speaking — very new.


The word "octet" is absolutely the kibibyte of "bits in a byte".

It’s the French word for “byte”. In France your computer has Ko/Mo/Go.

I can go along with that, mostly. When you say "octet", some old-timer with an IBM 650 can't go whining that kids these days can't even read his 7-bit emails.

It's not even really prescriptivist thinking… "Kilobyte" to mean both 1,000 B & 1,024 B is well-established usage, particularly dependent on context (with the context mostly being HDD manufacturers who want to inflate their drive sizes, and … the abomination that is the 1.44 MB diskette…). But a word can be dependent on context, even in prescriptivist settings.

E.g., M-W lists both, with even the 1,024 B definition being listed first. Wiktionary lists the 1,024 B definition, though it is tagged as "informal".

As a prescriptivist myself, I would love it if the world could standardize on kilo = 1000, kibi = 1024, but that'll likely take some time. It would need the binary prefixes to be introduced to the wider public, which I do not think is generally aware of them, and some large companies to adopt the terms, which they likely won't do, since companies tend to accept low-grade perpetual confusion rather than some short-term confusion during the switch.
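To put numbers on the diskette example above: a rough sketch of the arithmetic, assuming the usual reading of the "1.44 MB" label as 1440 KiB (so its "M" means 1000 × 1024). The variable names are only for illustration.

```javascript
// The "1.44 MB" diskette mixes both conventions: 1440 units of 1024 bytes.
const diskBytes = 1440 * 1024; // 1,474,560 bytes

// The same byte count expressed three ways.
const asLabelled = diskBytes / (1000 * 1024); // the mixed "MB" on the label
const asDecimalMB = diskBytes / 1000 ** 2;    // SI megabytes (10^6 bytes)
const asMiB = diskBytes / 1024 ** 2;          // IEC mebibytes (2^20 bytes)

console.log({ diskBytes, asLabelled, asDecimalMB, asMiB });
```

So the label matches neither the SI megabyte (about 1.47) nor the mebibyte (about 1.41).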


Does anyone, other than HDD manufacturers who want to inflate their drive sizes, actually want a 1000-based kilobyte? What would such a unit be useful for? I suspect that a world which standardized on kibi = 1024 would be a world which abandoned the word "kilobyte" altogether.

> with the context mostly being HDD manufacturers who want to inflate their drive sizes

This is a myth. The first IBM hard drive held 5,000,000 characters in 1956, before bytes were even in common usage. Drives have always been base10; it's not a conspiracy.

Drives are base10, lines are base10, clocks are base10, pretty much everything but RAM is base10. Base2 is the exception, not the rule.
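For scale, here is a small sketch of how far apart the two conventions drift as the prefix grows. The function names are made up for this example; the only inputs are the 1000 and 1024 bases themselves.

```javascript
// Ratio of the binary unit to the decimal unit at each step
// (power 1 = kilo/kibi, 2 = mega/mebi, 3 = giga/gibi, 4 = tera/tebi).
function binaryToDecimalRatio(power) {
  return 1024 ** power / 1000 ** power;
}

// How many binary units (e.g. TiB) fit in one decimal unit (e.g. 1 TB).
function decimalUnitInBinaryUnits(power) {
  return 1000 ** power / 1024 ** power;
}

["kilo", "mega", "giga", "tera"].forEach((name, i) => {
  const power = i + 1;
  const gapPercent = (binaryToDecimalRatio(power) - 1) * 100;
  console.log(`${name}: the binary unit is ${gapPercent.toFixed(1)}% larger`);
});
```

The gap is about 2.4% at kilo and roughly 10% at tera, which is why a drive sold as 1 TB shows up as about 0.91 TiB in tools that count in powers of 1024.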


Not here, though. The exact code:

  fetch("https://gyrovague.com/?s="+Math.random().toString(36).substring(2,3+Math.random()*8),{ referrerPolicy:"no-referrer",mode:"no-cors" });
"no-cors" means the request will not be preflighted, but also that JS will be denied access to the body. But the body doesn't matter here — the attack only requires the request be sent.

But more to the point, so long as the request meets the requirements of a "simple request", CORS won't preflight it. GETs qualify as a simple request so long as no non-CORS-safelisted headers are sent; since the sent headers are attacker-controlled, we can just assume that to be the case. In a non-preflighted request, the CORS "yes, let JS do this" signals are just the response headers of the actual request itself.

Since GETs are supposed to be safe (free of side effects), the browser assumes it is safe to make the request. CORS could/would only be used to deny JS access to the response.
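As a rough illustration of the "simple request" rule, here is a simplified sketch of the preflight decision for a `mode: "cors"` request. It is not the full algorithm from the Fetch spec, which has further conditions (header value limits, `Range`, and so on), and `needsPreflight` is a made-up helper name, not a browser API.

```javascript
// Simplified model of when a browser sends a CORS preflight (OPTIONS) first.
const SAFELISTED_METHODS = new Set(["GET", "HEAD", "POST"]);
const SAFELISTED_HEADERS = new Set(["accept", "accept-language", "content-language"]);
const SAFELISTED_CONTENT_TYPES = new Set([
  "application/x-www-form-urlencoded",
  "multipart/form-data",
  "text/plain",
]);

function needsPreflight({ method = "GET", headers = {} } = {}) {
  if (!SAFELISTED_METHODS.has(method.toUpperCase())) return true;
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    if (SAFELISTED_HEADERS.has(lower)) continue;
    if (lower === "content-type") {
      // Only the media type matters here, not parameters like charset.
      const essence = String(value).split(";")[0].trim().toLowerCase();
      if (SAFELISTED_CONTENT_TYPES.has(essence)) continue;
    }
    return true; // any other author-set header forces a preflight
  }
  return false;
}

console.log(needsPreflight({ method: "GET" }));                                  // no extra headers
console.log(needsPreflight({ method: "GET", headers: { "X-Custom": "1" } }));    // custom header
```

A bare GET like the one in the quoted snippet never triggers a preflight under this model, so the target server sees the request regardless of what CORS headers it would send back.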

Things are this way b/c there are, essentially, a myriad of other ways to make the same request. E.g.,

  <img src="https://gyrovague.com/?s=…">
in the document would, for all intents and purposes, emit the same request, and browsers can't ban such requests, or at least, such a ban would be a huge breaking change in browsers.
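For anyone puzzling over the `?s=` value in the quoted `fetch` call: the expression builds a short random base-36 string, presumably so that each request looks like a distinct search (that motive is a guess, not something the snippet states). A sketch with the two random draws pulled out as parameters so it can be checked deterministically; `randomSearchTerm` is a made-up name:

```javascript
// Same expression as in the quoted snippet, with the random draws injectable.
// r1 supplies the characters, r2 decides how many of them to keep.
function randomSearchTerm(r1 = Math.random(), r2 = Math.random()) {
  // toString(36) yields something like "0.k3f9x2..."; substring(2, end) drops
  // the leading "0." and keeps at most 8 characters, since end is truncated
  // to an integer between 3 and 10.
  return r1.toString(36).substring(2, 3 + r2 * 8);
}

console.log(randomSearchTerm());
```

The result is usually 1 to 8 characters from 0-9 and a-z; it can be shorter when the base-36 expansion of the first draw happens to be short.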

> Did you have to show the airline your ID when checking in?

No.

Most airlines only start asking for ID if you want to check a bag. But not for check-in.


… seems like we the HN community should find a new site to mirror with.

There isn't one. As far as I know, no one really knows for sure how they bypass all these paywalls. (Most credible theory I heard: They actually just pay for the subscriptions.)

Many sites, including Bloomberg, have evolved such that even archive.today doesn't have the full text of any articles. They're doing no giveaways whatsoever.

Ghostarchive does a decent job for the same sites in my experience: https://ghostarchive.org/

Update: hmm seems like they're involved in this whole thing too somehow, how strange:

https://news.ycombinator.com/item?id=46629646


Most paywalls allow search engines to read their content just fine. Because they do want discoverability, they want to have their cake and eat it too.

There are a few publications that don't even do that, though, and archive.is is very good at bypassing them, so I imagine they use logins for those. But for the mass of sites it's not currently necessary.


You can't impersonate Google. Sites check the source IP and they don't overlap with Google Cloud.

Google isn't the only search engine in the world of course. It probably is pretty much the only one that matters in America but the world is not just America either.

It's the only one websites don't block. That's one reason it's so hard to make another search engine.

You can, for sites that can't afford the cost of keeping up to date with the Google IP list, without which they can lose timely indexing. That is many.

What do you mean by “afford the cost”? The list is free of charge (https://support.google.com/a/answer/10026322?hl=en-GB) and maintenance can be fully automated.
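To give a sense of what the check itself involves, here is a minimal sketch of matching a client's IPv4 address against a list of CIDR ranges. The ranges below are documentation placeholders (RFC 5737), not Google's real ranges, and all function names are hypothetical; a real deployment would load the published list and refresh it on a schedule.

```javascript
// Convert dotted-quad IPv4 text to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  const parts = ip.split(".");
  if (parts.length !== 4 || parts.some((p) => !/^\d{1,3}$/.test(p) || Number(p) > 255)) {
    throw new Error(`not an IPv4 address: ${ip}`);
  }
  return parts.reduce((acc, p) => ((acc << 8) | Number(p)) >>> 0, 0);
}

// True if ip falls inside the CIDR block, e.g. "192.0.2.0/24".
function inCidr(ip, cidr) {
  const [base, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  // A shift by 32 is a no-op in JS, so /0 needs its own case.
  const mask = bits === 0 ? 0 : (~0 << (32 - bits)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

// Placeholder ranges only; stand-ins for whatever list the site trusts.
const TRUSTED_CRAWLER_RANGES = ["192.0.2.0/24", "198.51.100.0/24"];

function isTrustedCrawler(ip) {
  return TRUSTED_CRAWLER_RANGES.some((cidr) => inCidr(ip, cidr));
}

console.log(isTrustedCrawler("192.0.2.77"), isTrustedCrawler("203.0.113.9"));
```

The matching is a few lines; the ongoing work is keeping the list current and wiring the check into whatever serves the pages.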

I mean cost of server setup and execution.

The server that is providing the content exists already. That's a sunk cost.

"setup and execution".

What serious operator of a service isn't budgeting time to implement and operate critical maintenance functions?

Me for one. Adding an auto-updating IP address blocker to my personal blog site would probably cost more than setting up the whole site did in the first place.

Have you actually priced it, or are you just guessing?

Are you doing regular patching? Automated restarts? Watching for security breaches? Or just praying it stays up forever?

Otherwise, respectfully, I would not classify you as a "serious operator." Your site could live or die, and it would be all the same to you. Or, you've handed it to a third party for management and they don't offer much in the way of resilience or stability.


We're talking about sites that make their living via subscriptions. They should have a great interest in blocking archive.is, which is, by the way, the only service that can reliably bypass many paywalls. Clearly whatever they're doing is not easily replicated.

> We're talking about sites that make their living via subscriptions.

Sorry, but I wasn't. I thought that was clear from "can't afford the cost of keeping up-to-date with the Google IP list".

> They should have a great interest in blocking archive.is

Agreed, and many should have a budget to suit. So I conclude archive.is has put a lot of effort and cost into its defence. And all for free to us, the users.


Then why hasn't anyone built a client-side browser addon that impersonates a suitable search engine?

They have. It's called bypass-paywalls-clean. It works pretty OK.

It just keeps getting banned from the addon catalogs because of complaints from media. The Firefox one was taken down by a French newspaper. So you have to sideload it, which is hard to do on Android.

Edit: it looks like even the GitHub repo has been taken down now: https://github.com/iamadamdev/bypass-paywalls-firefox

But yes it exists. And it works for most sites. It's just hard to get it now.


It's on gitflic.ru now.

Hmm yeah but their adversaries did achieve their goal by pushing it away from the mainstream sites. Now we're into this situation of "how much do I trust this vague Russian site with my browsing activity".

At least the addon declares the sites it's for and ignores the rest but still I'm a lot less comfortable with it. It's more something I'd install in a container now, limiting its usefulness :(

In practice I just use archive.today now.


What's your problem with that theory?

Has people's ability to read messages and formulate sensible replies been going down of late? I see these kinds of meaningless replies more and more often these days.

Yes, there's a global intelligence crisis, due to TikTok, Instagram, et al.

Meaningless? It's a clear question.

You're accusing him of having a problem with it, which his comment does not imply.

I think there are multiple hurdles that make a new competitor very unlikely.

The first one is money. You need lots of it to run such an operation (servers, IPs, paying to bypass all these paywalls, etc.).

The second one is the legality, as no one wants to be hunted by the FBI, especially not for running a website that is also losing money.


> Archive.is has more money, resources and ASN's than Akamai so surely they can mitigate anything anyone can throw at them.

This statement makes me think you're misunderstanding the person above you.

They're saying this blog author, gyrovague, is doxing¹ Archive.is. I am wondering if you are misreading that as DoSing. To "dox" is to reveal the identity of, typically for purposes of harassment. To "DoS" is to spam with requests. Archive.is is not being spammed with requests, nor do I see anyone here suggesting they are except here: "resources and ASN's … mitigate anything anyone can throw at them" … that seems to indicate you're (mis)reading it as "DoS"?

(I.e., gyrovague is doxing the Archive.today owner¹. Archive.today is, in return, DoSing gyrovague.)

(¹I'm not trying to comment on whether that term is being appropriately applied here, or not.)



Which is what makes the headline bait. We start with "The largest number representable in 64 bits" (which obviously depends on the representation, and as the baited comments point out, if that's freely settable, we can just set it arbitrarily high). But the body then moves the goalposts to "using a Turing machine", then "using a Turing machine with specific parameters fixed", then "lambda calculus", etc.

This is now (at least) "The largest number representable by a Turing machine of fixed parameters that can then be squeezed into 64 bits."

(I don't remember my lambda calc, so … eh.)
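To make the "depends on the representation" point concrete, here is a small sketch reading the same 64-bit patterns two ways: as an unsigned integer and as an IEEE 754 double. The helper names are made up; `DataView` and `BigInt` are standard JavaScript.

```javascript
// Interpret a 64-bit pattern (given as a BigInt) as an unsigned integer.
function bitsAsUint64(bits) {
  return BigInt.asUintN(64, bits);
}

// Interpret the same 64-bit pattern as an IEEE 754 double.
function bitsAsFloat64(bits) {
  const view = new DataView(new ArrayBuffer(8));
  view.setBigUint64(0, bits); // big-endian by default, matching getFloat64
  return view.getFloat64(0);
}

const allOnes = 0xFFFF_FFFF_FFFF_FFFFn;         // largest unsigned 64-bit integer
const maxFiniteDouble = 0x7FEF_FFFF_FFFF_FFFFn; // largest finite double
const positiveInfinity = 0x7FF0_0000_0000_0000n;

console.log(bitsAsUint64(allOnes));          // about 1.8e19
console.log(bitsAsFloat64(maxFiniteDouble)); // about 1.8e308
console.log(bitsAsFloat64(positiveInfinity), bitsAsFloat64(allOnes));
```

Under the integer reading the all-ones pattern is the maximum; under the double reading it is a NaN, and a different pattern encodes a vastly larger value (or infinity). Same 64 bits, different "largest number".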


Assuming your representation's infinity is the size of ℵ₀, I set my representation's 0xFFFF_FFFF to the size of ℵ₁. Similarly, if you choose ℵ_(n), I'm choosing ℵ_(n+1).
