Hacker News | Hakkin's comments

A scrub only reads allocated space, so in your 10TB example it would only read whatever portion of that 10TB is actually occupied by data. It's also usually recommended to keep usage below 80% of the total pool size to avoid performance issues, so the worst case in your scenario would be more like ~53%, assuming you follow the 80% rule.


Is the 80% rule real or just passed down across decades like other “x% free” rules? Those waste enormous amounts of resources on modern systems and I kind of doubt ZFS actually needs a dozen terabytes or more of free space in order to not shit the bed. Just like Linux doesn’t actually need >100 GB of free memory to work properly.


> Is the 80% rule real or just passed down across decades like other “x% free” rules?

As I understand it, the primary reason for the 80% rule was that you're getting close to another limit, which IIRC was around 90%, where the space allocator would switch from finding a nearby large-enough space to finding the best-fitting space. This second mode tanks performance and can lead to much more fragmentation. And since there's no defrag tool, you're stuck with that fragmentation.

This has also changed: the switch now happens at 96% rather than 90%[1]. The code has also been improved[2] to better keep track of free space.

However, performance can start to degrade before you reach this algorithm switch[3], as you're more likely to generate fragmentation the less free space you have.

That said, it was also generic advice, ignorant of your specific workload. If you have a lot of cold data with low churn and fairly uniform file sizes, you're probably less affected than if you have high churn with lots of files of varied sizes.

[1]: https://openzfs.github.io/openzfs-docs/Performance%20and%20T...

[2]: https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSZpoolFra...

[3]: https://www.bsdcan.org/2016/schedule/attachments/366_ZFS%20A...
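
If you want a rough sense of how close a pool is to that regime, the capacity and fragmentation properties are easy to watch. A minimal monitoring sketch, assuming the standard `zpool` CLI is installed; the 80% threshold is purely illustrative:

  # Warn when a pool creeps toward the allocator's best-fit regime.
  # Assumes the standard `zpool` CLI; the 80% threshold is illustrative.
  import subprocess

  def pool_stats():
      out = subprocess.run(
          ["zpool", "list", "-H", "-o", "name,capacity,fragmentation"],
          capture_output=True, text=True, check=True,
      ).stdout
      for line in out.strip().splitlines():
          name, cap, frag = line.split("\t")
          yield name, cap.rstrip("%"), frag.rstrip("%")

  for name, cap, frag in pool_stats():
      if cap.isdigit() and int(cap) >= 80:
          print(f"{name}: {cap}% full, {frag}% fragmented -- consider freeing space")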


In practice you see noticeable degradation of performance for streaming reads of large files written after 85% or so. Files you could previously read at 500+MB/sec can drop to 50MB/sec. It's fragmentation, and it's fairly scale-invariant, in my experience.


Speaking strictly about ZFS internal operations, the free space requirement is closer to 5% on current ZFS versions. That allows for CoW and block reallocations in real-world pools. Heavy churn and very large files will increase that margin.


Still, spending 53% of the useful life of an HDD just on scrubbing is excessive.

You don't lose tracks in 3 months. If you don't read the tracks for a year and the HDD is operated at high temperatures, then the controller might struggle to read them.

The very act of scrubbing generates heat, so we should use it sparingly.


Is this true? KADOKAWA had a massive hack last year that leaked a large amount of sensitive user data and as far as I know has faced no legal repercussions. Obviously they took a decent financial and reputational hit, but that was just an effect of the hack itself, not any government intervention.


I definitely noticed the performance boost on my Pixel 8. For some reason it really doesn't like wireguard-go; it struggled to pull even 100Mbps, maybe something unoptimized for Google's custom hardware. With the new GotaTun version I can pull 500Mbps+, though unfortunately it also seems to have introduced a bug that randomly prevents the phone from entering a deep sleep state, so occasionally my battery will start draining at 10x the normal speed while it's enabled until I reboot.


I'm surprised by this comment. I have wireguard on 24/7 on my shitty Samsung A5 and it lasts forever. By comparison the Pixel 8 is a beast. Sounds like an Android bug more than wireguard.


Pixel 6 here. Vanilla WireGuard app. It sucks the life out of my phone and nearly halves the already-short battery life (thanks, Google, for your crappy OEM producers!)


Thank samsung for their shitty modems in the pixels.

However, there's going to be a large discrepancy in battery usage for all devices based on whether the VPN is on wifi or cellular, and additionally, when on cellular, how close they are to the tower. I live near the cell edge and VPNs roast my batts on cellular no matter the make; in the city it's almost not noticeable to have the VPN on. Better to use wifi when far from towers; cellular is more efficient if the signal is strong.


They must have fixed it in recent versions. My Pixel 9 Pro battery seems the same with Proton VPN (WireGuard) on or off.


What app are you using?


It's just called WireGuard, by the "WireGuard Development Team" off google play.


Pretty sure that's the C implementation, not the Go one.


AFAIK the C implementation is a kernel module that's not shipped in stock Android releases. The WireGuard Android app uses that module when available, but otherwise uses wireguard-go.


Good knowledge here, I was unaware of this feature of the app. Would there be any case of the app defaulting to the WireGuard kernel module if it's not included in any OEM Android release? I would assume that means most users are actually running wireguard-go.


I hope so.


Android kernels 4.19 and higher should all have support included for WireGuard unless the OEM specifically disables it [0]. The Pixel 8 ships with the Android 14 6.1 kernel, so it most definitely should have WireGuard kernel support. You can check this in the WireGuard app, BTW: if you go to settings, it will show the backend that's in use.

[0] https://android-review.googlesource.com/c/kernel/common/+/14...


Kernel support should have no bearing as the apps are purely userspace apps. You can use the kernel mode if you root the phone, but that's not a typical scenario.


Well, the issue isn't kernel vs user space, but you are correct that you still need a custom ROM and/or root unfortunately. I had assumed Android had also allowed netlink sockets for WireGuard but alas they did not. So the app can't communicate with the kernel module, bummer.


Same behavior on a Raspberry Pi 5. Might just be lack of ARM optimizations.


It's very likely that VPNs like this are not CPU-bound, even on somewhat wimpy CPUs. I'd wager even some microcontrollers could sling 500 megabits/sec around without trouble.


You’re in for a surprise then once you actually go look at the performance.


A Raspberry Pi 4 can manage something like 70Mbps of raw AES en/decryption flow: https://github.com/lelegard/aesbench/blob/main/RESULTS.txt

That CPU is pretty much a toy compared to (say) a brand-new M5 or EPYC chip, but it similarly eclipses almost any MCU you can buy.

Even with fast AES acceleration on the CPU/MCU — which I think some Cortex MCUs have — you're really going to struggle to get much over 100Mbit/s of encrypted traffic handling, and that's before the I/O handling interrupts take over the whole chip to shuttle packets on and off the wire.

Modern crypto is cheap for what you get, but it’s still a lot of extra math in the mix when you’re trying to pump bytes in and out of a constrained device.


You're looking at the wrong thing: WireGuard doesn't use AES, it uses ChaCha20. AES is really, really painful to implement securely in software only, and the result performs poorly. But ChaCha only uses addition, rotation, and XOR on 32-bit numbers, which makes it pretty performant even on fairly computationally limited devices.

For reference, I have an implementation of ChaCha20 running on the RP2350 at 100Mbit/s on a single core at 150MHz (910/64 = ~14.22 cycles per byte). That's a lot for a cheap microcontroller costing around 1.5 bucks total. And that's not even taking into account using the other core the RP2350 has, or overclocking (it runs fine at 300MHz too, at double the speed).
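
For anyone curious what "addition, rotation, and XOR on 32-bit numbers" looks like, here's the ChaCha quarter-round, the primitive the cipher repeats over its 16-word state. Plain Python for readability, not an optimized or constant-time implementation:

  # The ChaCha quarter-round: nothing but 32-bit add, XOR and rotate (ARX).
  MASK32 = 0xFFFFFFFF

  def rotl32(x, n):
      return ((x << n) | (x >> (32 - n))) & MASK32

  def quarter_round(a, b, c, d):
      a = (a + b) & MASK32; d = rotl32(d ^ a, 16)
      c = (c + d) & MASK32; b = rotl32(b ^ c, 12)
      a = (a + b) & MASK32; d = rotl32(d ^ a, 8)
      c = (c + d) & MASK32; b = rotl32(b ^ c, 7)
      return a, b, c, d

ChaCha20 runs 20 rounds of this over the state, which is why it maps so well onto small CPUs without AES instructions.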


You're totally right; I got myself spun around thinking AES instead of ChaCha because the product I work on (ZeroTier) started with the latter initially and moved to AES later. I honestly just plain forgot that WireGuard hadn't followed the same path.

An embarrassing slip, TBH. I’m gonna blame pre-holiday brain fog.


That's off by an order of magnitude. The table lists 0.4 or so Gbit/s but I think that's per core.


Yeah no, this is very much not true, even more so for a Go-based implementation and energy consumption optimized ARM devices.


MTU strikes again. 1320.


Why 1320 and not larger?


For most any 5G network you should be safe to 1420 - 80 = 1340 bytes if using IPv6 transport or 1420 - 60 = 1360 bytes if using IPv4 transport.

For testing I recommend starting from 1280 as a "does this even work" baseline and then tweaking from there. I.e. 1280 either as the "outside" MTU if you only care about IPv4 or as the "inside" MTU if you want IPv6 to work through the tunnel. This leverages that IPv6 demands a 1280 byte MTU to work.
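
The 60 and 80 byte figures are just the per-packet overhead stacked on top of the outer IP layer; a quick back-of-the-envelope helper (the constants are the standard header sizes, the function itself is only illustrative):

  # WireGuard data-packet overhead on top of the outer MTU.
  UDP_HEADER = 8
  WG_HEADER = 16       # type (4) + receiver index (4) + nonce counter (8)
  POLY1305_TAG = 16

  def inner_mtu(outer_mtu, outer_is_ipv6=True):
      ip_header = 40 if outer_is_ipv6 else 20
      return outer_mtu - ip_header - UDP_HEADER - WG_HEADER - POLY1305_TAG

  print(inner_mtu(1420, outer_is_ipv6=True))   # 1340
  print(inner_mtu(1420, outer_is_ipv6=False))  # 1360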


Hah! I just ran into this recently and can confirm. The coax to my DOCSIS ISP was damaged during a storm, which was causing upstream channels to barely work at all. (Amusingly, downstream had no trouble.) While waiting for the cable person to come around later in the week, I hooked my home gateway device up to an old phone instead of the modem. I figured there would be consequences, but surprisingly, everything went pretty smoothly... But my Wireguard-encapsulated connections all hung during the TLS handshake! What gives?

The answer is MTU. The MTUs on my network devices were all set to 1500, and my Wireguard devices to 1420, as is customary. However, I found that 1340 (1420 - 80) was the maximum I could use safely.

Wait, though... Why in the heck did that only impact Wireguard? My guess is that TCP connections were discovering the correct MSS value automatically. Realistically that does make sense, but something bothers me:

1. How come my Wireguard packets seemed to get lost entirely? Shouldn't they get fragmented on one end and re-assembled on the other? UDP packets are IP packets, surely they should fragment just fine?

2. Even if they don't, if the Linux TCP stack is determining the appropriate MSS for a given connection then why doesn't that seem to work here? Shouldn't the underlying TCP connection be able to discover the safe MSS relatively easily?

I spelunked through Linux code for a while looking for answers but came up empty. Wonder if anyone here knows.

My best guess is that:

1. A stateless firewall/NAT somewhere didn't like the fragmented UDP packets because it couldn't determine the source/dest ports and just dropped them entirely

2. Maybe MSS discovery relies on ICMP packets that were not able to make it through? (edit: Yeah, on second thought, this makes sense: if the Wireguard UDP packets are not making it to their destination, then the underlying encapsulated packets won't make it out either, which means there won't be any ICMP response when the TCP stack sends a packet with Don't Fragment set.)

But I couldn't find anything to strongly support that.
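
If anyone wants to reproduce this, the simplest way I know to find the real path MTU is to binary-search ping sizes with the don't-fragment bit set. A rough sketch, assuming Linux iputils ping and a placeholder host, and that the true MTU lies within the search range:

  # Binary-search the largest ICMP payload that passes with DF set.
  # Path MTU = payload + 28 (20-byte IPv4 header + 8-byte ICMP header).
  import subprocess

  def ping_df(host, payload):
      r = subprocess.run(
          ["ping", "-c", "1", "-W", "1", "-M", "do", "-s", str(payload), host],
          capture_output=True,
      )
      return r.returncode == 0

  def path_mtu(host, lo=1200, hi=1500):
      # assumes lo always passes and the real MTU is <= hi
      while lo < hi:
          mid = (lo + hi + 1) // 2
          if ping_df(host, mid - 28):
              lo = mid
          else:
              hi = mid - 1
      return lo

  print(path_mtu("example.com"))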


Basically the only parts of the Internet which actually work reliably, around the globe, are the bits needed so that web pages basically kinda work. If you break literally everything else, your service is crap, and some customers might notice, but many won't, and some won't have a choice, so sucks to be them. But if you break the Web, now everybody notices that you broke stuff and they're angry.

This is why DoH (DNS over HTTPS) is a thing. It obviously makes no actual sense to use the web protocol to move DNS packets, but it works, and most things don't work for everybody, so this is what we have. Smashing Path MTU discovery doesn't break the web.

Breaking literally everything so long as the web pages work even means you can't upgrade parts of the web unless you get creative. TLS 1.3, the modern security protocol used for most of your web pages today, would not work for most people if it admitted that it's TLS 1.3. If you send packets marked as TLS version 1.3, people's "intelligent", "best in class security" protective garbage (in the industry we call these "middle boxes") thinks it is being attacked by some unknown and unimaginable dastardly foe and kills the data. So TLS 1.3 really, I am not making this up, always pretends it is a TLS 1.2 re-connection, and despite the fact that no such connection ever existed, these same "best in class security" technologies have no idea what's happening and wave it through. It's very, very stupid that they do that, but it was needed to make the web work, which matters, whereas actual security... eh, the suckers already bought the device, who cares.

This situation is deeply sad, but one piece of good news is that while "This Iranian woman can't even talk confidentially to her own mother without using code words because the people in charge there intercept her communications" won't attract as much sympathy as you'd like from some bearded white guy who has never left Ohio, the fact that those people broke his network protocol to do that interception infuriates him, and he's well up for ensuring they can't do that to the next version.


Your ultimate conclusion is correct, to my understanding. I know WireGuard sought to be ultra-minimal, but I do wish they had included DPLPMTUD as something required to be supported (though not mandated to be used, e.g. if the user wants to hard-set the MTU as they would currently), because it's one of those cases where "do it yourself separately, the UNIX way™" and "have the tunneled things do it if they need it" are both significantly more complex and fragile.


On that note, from the TCP layer it should just look like an ICMP blackhole, which makes me wonder if enabling `net.ipv4.tcp_mtu_probing` would magically make TCP connections under Wireguard work even with the MTU set wrong. I'd try it, but unfortunately with a similar configuration I am unable to reproduce the fragmentation behavior I was getting before, which makes me wonder if it was my UniFi Security Gateway that actually didn't like the fragmented packets.
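
If someone wants to try it, the knob is just a sysctl (0 = off, 1 = probe only after an ICMP black hole is detected, 2 = always probe); a one-liner to flip it for testing, needs root:

  # Enable packetization-layer PMTU discovery on black-hole detection.
  from pathlib import Path
  Path("/proc/sys/net/ipv4/tcp_mtu_probing").write_text("1\n")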


Oh, this is the reason the Mullvad app on my Pixel 6a was suddenly able to connect in less than a second where before it would take 5-10 seconds, nice!


I think the problem is on the Android side. I noticed the same behavior with ZeroTier and even with MizuDroid, all totally unrelated.


Do you have wireguard keepalives on?


That only helps so much; some things still won't work if the browser thinks you're talking over an unencrypted connection, like HTTP/2. Technically HTTP/2 allows unencrypted connections (h2c), but as far as I know no browser implements it (including Tor Browser), and server support is also somewhat limited, so Tor Browser is limited to HTTP/1.[01] on Onion sites unless they have a TLS certificate.


Note that you don't actually need the generated column either, SQLite supports indexes on expressions, so you can do, for example,

  CREATE INDEX subjectidx ON messages(json_extract(headers, '$.Subject'))
and it will use this index anywhere you reference that expression.

I find it useful to create indexes like this, then create VIEWs using these expressions instead of ALTER'ing the main table with generated columns.
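
A minimal end-to-end sketch of that pattern with Python's sqlite3 (the messages table here is made up, and it needs an SQLite build with the JSON functions, which is the default in recent versions). The WHERE clause on the view matches the indexed expression, so the planner can use the index:

  # Expression index + VIEW instead of a generated column.
  import sqlite3

  db = sqlite3.connect(":memory:")
  db.executescript("""
      CREATE TABLE messages (id INTEGER PRIMARY KEY, headers TEXT);
      CREATE INDEX subjectidx ON messages(json_extract(headers, '$.Subject'));
      CREATE VIEW messages_v AS
          SELECT id, json_extract(headers, '$.Subject') AS subject FROM messages;
  """)

  plan = db.execute(
      "EXPLAIN QUERY PLAN SELECT * FROM messages_v WHERE subject = ?", ("hello",)
  ).fetchall()
  print(plan)  # expect a SEARCH ... USING INDEX subjectidx row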


And since views and indexes don't change the data, you can use tools like https://github.com/fsaintjacques/recordlite to automate schema management.


This is cool! I quite like this.


What a great timely tip. Was just looking for good direction on how to do this. Thanks!


It sets a cookie with a JWT verifying you completed the proof-of-work, along with metadata about the origin of the request; the cookie is valid for a week. This is as far as Anubis goes: once you have this cookie you can do whatever you want on the site. For now it seems to be enough to stop a decent portion of web crawlers.

You can do more underneath Anubis using the JWT as a sort of session token, though, like rate limiting on a per-proof-of-work basis: if a client using token X makes more than Y requests in a period of time, invalidate the token and force them to generate a new one. This would force them to either crawl slowly or use many times more resources to crawl your content.
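
That second layer is not something Anubis does for you; a hand-rolled sketch of the idea, with in-memory counters and made-up thresholds:

  # Per-token rate limiting on top of a proof-of-work cookie (illustrative only).
  import time
  from collections import defaultdict

  MAX_REQUESTS = 300        # Y requests...
  WINDOW_SECONDS = 60       # ...per period
  revoked = set()
  hits = defaultdict(list)  # token -> recent request timestamps

  def allow(token):
      if token in revoked:
          return False  # client must solve a fresh proof-of-work
      now = time.monotonic()
      recent = [t for t in hits[token] if now - t < WINDOW_SECONDS]
      recent.append(now)
      hits[token] = recent
      if len(recent) > MAX_REQUESTS:
          revoked.add(token)
          return False
      return True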


Much better than infinite Cloudflare captcha loops.


I've never had that, even with something like Tor Browser. You must be doing something extra suspicious, like using a user-agent spoofer.


Firefox with Enhanced Tracking Protection turned on is enough to trigger it.


You need to whitelist challenges.cloudflare.com for third-party cookies.

If you don't do this, the third-party cookie blocking that strict Enhanced Tracking Protection enables will completely destroy your ability to access websites hosted behind CloudFlare, because it is impossible for CloudFlare to know that you have solved the CAPTCHA.

This is what causes the infinite CAPTCHA loops. It doesn't matter how many of them you solve; Firefox won't let CloudFlare make a note that you have solved it, so when it reloads the page, it looks like you just tried to load the page again without solving it.

https://i.imgur.com/gMaq0Rx.png


You're telling me cloudflare has to store something on my computer to let them know I passed a captcha?

This sounds like "we only save hashed minutiae of your biometrics"


> You're telling me cloudflare has to store something on my computer to let them know I passed a captcha?

Yes?

HTTP is stateless. It always has been and it always will be. If you want to pass state between page visits (like "I am logged in to account ..." or "My shopping cart contains ..." or "I solved a CAPTCHA at ..."), you need to be given, and return to the server on subsequent requests, cookies that encapsulate that information, or a reference to an identifier that the server can associate with that information.

This is nothing new. Like gruez said in a sibling comment; this is what session cookies do. Almost every website you ever visit will be giving you some form of session cookie.


Then don't visit the site. Cloudflare is in the loop because the owner of the site wanted to buy not build a solution to the problems that Cloudflare solves. This is well within their rights and a perfectly understandable reason for Cloudflare to be there. Just as you are perfectly within your rights to object and avoid the site.

What is not within your rights is to require the site owner to build their own solution to your specs to solve those problems or to require the site owner to just live with those problems because you want to view the content.


That would be a much stronger line of argument if cloudflare wasn't used by everyone and their consultant, including on a bunch of sites I very much don't have an option of not using.


Cloudflare doing a really good job meeting customer needs doesn't impact my argument at all.


When a solution is widely adopted or adopted by essential services it becomes reasonable to place constraints on it. This has happened repeatedly throughout history, often in the form of government regulations.

It usually becomes reasonable to object to the status quo long before the legislature is compelled to move to fix things.


Why? This isn't a contrarian complaint but the problems that Cloudflare solves for an essential service require verifying certain things about the client which places a burden on the client. The problems exist in many cases because the service is essential which makes it a higher profile target. Expecting the client to bear some of that burden for interacting with the service in order to protect that service is not in my mind problematic.

I do think that it's reasonable for the service to provide alternative methods of interacting with it when possible. Phone lines, Mail, Email could all be potential escape hatches. But if a site is on the internet it is going to need protecting eventually.


That's a fair point, but it doesn't follow that the current status quo is necessarily reasonable. You had earlier suggested that the fact that it broadly meets the needs of service operators somehow invalidates objections to it which clearly isn't the case.

I don't know that "3rd party session cookies" or "JS" are reasonable objections, but I definitely have privacy concerns. And I have encountered situations where I wasn't presented with a captcha but was instead unconditionally blocked. That's frustrating but legally acceptable if it's a small time operator. But when it's a contracted tech giant I think it's deserving of scrutiny. Their practices have an outsized footprint.

> service to provide alternative methods of interacting with it when possible

One of the most obvious alternative methods is logging in with an existing account, but on many websites I've found the login portal barricaded behind a screening measure which entirely defeats that.

> if a site is on the internet it is going to need protecting eventually

Ah yes, it needs "protection" from "bots" to ensure that your page visit is "secure". Preventing DoS is understandable, but many operators simply don't want their content scraped for reasons entirely unrelated to service uptime. Yet they try to mislead the visitor regarding the reason for the inconvenience.

Or worse, the government operations that don't care but are blindly implementing a compliance checklist. They sometimes stick captchas in the most nonsensical places.


>You're telling me cloudflare has to store something on my computer to let them know I passed a captcha?

You realize this is the same as session cookies, which are used on nearly every site, even those where you're not logging in?

>This sounds like "we only save hashed minutiae of your biometrics"

A randomly generated identifier is nowhere close to "hashed minutiae of your biometrics".


the idea that cloudflare doesn't know who i am without a cookie is insulting.


The infinite loop or the challenge appearing? I've never had problems with passing the challenge, even with ETP + RFP + ublock origin + VPN enabled.


Cloudflare is too stupid to realize that carrier-grade NAT is very common in Germany. So there's that: sharing an IP with literally 20,000 people around me shouldn't make me suspicious when it's them who trigger that behavior.

Your assumption is that anyone at Cloudflare cares. But guess what: it's a self-fulfilling prophecy of a bot being blocked, because not a single process in the UX/UI allows any real user to complain about it, and therefore all blocked humans must also be bots.

Just pointing out the flaw of bot blocking in general, because you seem to be absolutely unaware of it. The reported success rate of bot blocking is always 100%, never less, because anything less would imply realizing that your tech does nothing, really.

Statistically, the ones really using bots can bypass it easily.


>Cloudflare is too stupid to realize that carrier grade NATs exist a lot in Germany. So there's that, sharing an IP with literally 20000 people around me doesn't make me suspicious when it's them that trigger that behavior.

Tor and VPNs arguably have the same issue. I use both and haven't experienced "infinite loops" with either. The same can't be said of google, reddit, or many other sites using other security providers. Those either have outright bans, or show captchas that require far more effort to solve than clicking a checkbox.


If you want to try fighting it, you need to find someone with a CF enterprise plan and bot management enabled, then get blocked and get them to report that as wrong. Yes, it sucks, and I'm not saying it's a reasonable process. Just in case you want to try fixing the situation for yourself.


Honestly it's a fair assumption on bot filtering software that no more than like 8 people will share an IPv4. This is going to make IP reputation solutions hard. Argh.


Proper response here is "fuck cloudflare", instead of blaming the user.


It's well within your rights to go out of your way to be suspicious (eg. obfuscating your user-agent). At the same time sites are within their rights to refuse service to you, just like banks can refuse service to you if you show up wearing a balaclava.


You're assuming too much. I'm not obfuscating/masking anything. I'm just using Firefox with some (to the user/me) useless web APIs disabled to reduce the attack surface of the browser and CF is not doing feature testing. It's not just websites that need to protect themselves.

Eg. Anubis here works fine for me, completely out-classing the CF interstitial page with its simplicity.


Apparently user-agent switchers don't work for fetch() requests, which means that Anubis can't work for people who use them. I know of someone who set up a version of Brave from 2022 with a user agent saying it's Chrome 150 and then complained about it not working for them.


If a disk is encrypted, you don't have to worry about the contents if you eventually have to RMA or dispose of the disk. For this use case, it makes no difference how the encryption key is input.


I'd guess the most common scenario is someone giving away the entire computer, not fiddling with components. Or theft of the full machine.

This feels like one of those half-security measures that makes it feel like you're safe, but it's mostly marketing, making you believe *this* device can be both safe and easy to use.


It's pretty fast to destroy all the keys in a TPM. Should take a minute if you know the right place to go. Meanwhile securely deleting a normal drive requires overwriting every sector with random data, which could take hours. So it also helps if you're giving away the whole machine.


Smashing the disk with a hammer takes seconds. Or dropping it in a shredder.

Used disks are not worth hours of time to overwrite with random data. Just physically destroy them.


Or a furnace. That usually leaves no trace of the physical disk having ever existed, which may be important for <insert state actor/military here>.


That’s wasteful, inconvenient, and not even necessarily unrecoverable.


Encrypted data are noise now, maybe, but may be decryptable in the future with advances in computing.

So all this depends on what you worry about.


Most of this concern is around certain public key cryptography algorithms which depend on math problems being extremely hard to solve but could in theory be mathematically solved (decrypted without the key) with a good enough quantum computer.

Disk encryption (AES etc) is symmetric and still only brute-force would work which can be made infeasible with a long enough key.
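
The "infeasible with a long enough key" part is easy to put numbers on; even granting an absurdly generous guessing rate (the figure below is invented), the timescales don't move:

  # Back-of-the-envelope: average time to brute-force a symmetric key.
  GUESSES_PER_SECOND = 1e18   # deliberately generous, made-up figure
  SECONDS_PER_YEAR = 3.15e7

  for bits in (128, 256):
      years = 2 ** (bits - 1) / GUESSES_PER_SECOND / SECONDS_PER_YEAR
      print(f"{bits}-bit key: ~{years:.1e} years on average")
  # 128-bit: ~5.4e+12 years; 256-bit: ~1.8e+51 years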


Brute-forcing symmetric encryption is a somewhat silly concept anyways, because each decryption is equally valid.


> Brute-forcing symmetric encryption is a somewhat silly concept anyways, because each decryption is equally valid.

Each decryption is equally valid as long as the key has the same size as the data. What happens, in practice, is that the key is much smaller than the data. Take a look at your filesystem, it should have hundreds or thousands of bytes of fixed information (known plaintext), or an equivalent amount of verifiable information (the filesystem structure has to make sense, and the checksums must match). That is: for a large enough filesystem (where "large enough" is probably on the order of a small floppy disk), decrypting with the wrong key will result in unrecoverable garbage which does not make sense as a filesystem.

To give an illustration: suppose all filesystems have to start with the four bytes "ABCD", and the key has 256 bits (a very common key size). If you choose a key randomly to decrypt a given cyphertext, there's only one chance in 2^32 that the decryption starts with ABCD, and if it doesn't, you know it's the wrong key. Now suppose the next four bytes have to be "EFGH", that means only one in 2^64 keys can decrypt to something which appears to be valid. It's easy to see that, once you add enough fixed bytes (or even bits), only one key, the correct one, will decrypt to something which appears to be valid.
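
To put numbers on it: the expected count of wrong keys that still match N bytes of known plaintext is roughly 2^(key bits - 8N), so once you know about as many plaintext bytes as the key is long, the expected number of impostor keys drops to one or below. A rough sketch, ignoring cipher details:

  # Expected wrong 256-bit keys that still "decrypt" N known bytes correctly.
  KEY_BITS = 256
  for known_bytes in (4, 8, 16, 32, 40):
      exponent = KEY_BITS - 8 * known_bytes
      print(f"{known_bytes:>2} known bytes: ~2^{exponent} false matches")
  # 32 known bytes (the key size) leaves ~1 impostor on average; beyond that, ~0.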


That's only true for information theoretically secure algorithms like one-time pad. It's not true for algorithms that are more practical to use like AES.


It's fairly rare during general browsing, but Mullvad makes no attempt to hide that their servers are VPN IPs, so any site that explicitly wants to block VPN users will very likely block Mullvad. Mostly an issue if you're trying to use a VPN to view geo-blocked media.


I have an AOC Q27G3XMN and while I do get reduced motion blur from this, I also experience very bad color banding/shifting. Messing with some of the values in the script config makes it slightly better, and changing the overdrive setting on the monitor seems to affect it as well, but there is still pretty strong banding no matter what strength it's on. I tested on my phone (Pixel 8) and it works very well there without any banding or color weirdness, so I guess it's just something about this particular monitor that doesn't work well with this method.

