Hacker News | nee1r's comments

hmmm wonder if cost reduction is actually linear vs. coming in discrete jumps in ability (i.e. we might just nail fusion or get big boosts in efficiency)

Love MOFs! Did research about MOFs <=> language modeling a couple years ago and I'm excited to see them getting more coverage https://arxiv.org/abs/2311.07617


We have a custom barebones solution that uses a hashring to route the files!
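For readers who haven't run into hash rings: below is a minimal sketch of how a consistent-hash ring can route a file key to a storage node. It is not the implementation described in the comment above. The node names, the `HashRing` class, the `vnodes` count, and the choice of SHA-256 are all assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Illustrative consistent-hash ring (hypothetical, not the authors' code)."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets `vnodes` points on the ring to even out the load.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key):
        # Use the first 8 bytes of SHA-256 as a 64-bit ring position.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def node_for(self, file_key):
        # Walk clockwise to the first ring point after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_right(self._points, self._hash(file_key))
        return self._ring[idx % len(self._ring)][1]


# Hypothetical node names.
nodes = ["storage-01", "storage-02", "storage-03"]
ring = HashRing(nodes)
print(ring.node_for("shard-000123.tar"))
```

The property that makes this attractive for file routing is that adding a node only moves the keys that now belong to the new node; everything else stays where it was.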


We use the same nginx Rust server to do file writes; it's done via web requests


We did want more pictures!! Recently bought a Sony A7III to capture more fun moments like this.

We're working on pretraining computer action models from the ground up, hence the pretraining data cluster. We're a public benefit corp because we think it's important for AGI to be built in the public's interest + are planning on automating a lot of the work done on computers!


"The best camera is the one you have with you." Looking forward to the next buildout post!


Thanks for helping!!!


Around 2-5 hours/month, mostly power cycling the servers and replacing hard drives


You should be able to power cycle the servers from their management interfaces.

(But I have the luxury of everything being bought new from HP, so the interfaces are similar.)


We reached out to almost every colocation space in SF and some in Fremont to get quotes. There wasn't a difference between the quoted price and what we ended up paying, though we did negotiate terms + one-time costs.


Please consider posting the quotes, even if you have to redact colo names.


Thanks!! :)


Definitely much less redundancy. This was a tradeoff we made for pretraining data and cost.


Did you do any kind of redundancy at least (eg: putting every 10 disks in RAID 5 or RAID Z1)? Or I suppose your training application doesn't mind if you shed a few terabytes of data every so often?
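For context on what single-parity schemes like RAID 5 or RAID Z1 buy you, here is a rough sketch of the underlying XOR idea: one parity block lets you rebuild any single lost block in a group. Real implementations stripe data and handle unequal sizes, rotating parity, and so on. The function names and the toy byte strings are purely illustrative assumptions.

```python
def xor_parity(blocks):
    """XOR equal-length byte blocks together into one parity block."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def rebuild_missing(surviving_blocks, parity):
    """Recover the single missing block from the survivors plus parity."""
    return xor_parity(list(surviving_blocks) + [parity])


# Toy stand-ins for the contents of three data disks.
disks = [b"aaaa", b"bbbb", b"cccc"]
parity = xor_parity(disks)

# Pretend the second disk died and rebuild it from the other two plus parity.
recovered = rebuild_missing([disks[0], disks[2]], parity)
print(recovered == disks[1])
```

The cost is one extra disk per group, plus the rebuild reads every surviving disk in that group.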


atm we don't and we're a bit unsure whether it's a free lunch wrt adding complexity. there's a really nice property of having isolated hard drives where you can take any individual one and `sudo mount` it and you have a nice chunk of training data, and that's something anyone can feel comfortable touching without any onboarding to some software stack


I wonder if snapraid would work for this. Especially if your data is mostly written once and then just read, it could be an easy way to add redundancy while keeping isolated individual drives.

