Love MOFs! Did research about MOFs <=> language modeling a couple years ago and I'm excited to see them getting more coverage https://arxiv.org/abs/2311.07617
We did want more pictures!! Recently bought a Sony A7III to capture more fun moments like this.
We're working on pretraining computer action models from the ground up, hence the pretraining data cluster. We're a public benefit corp because we think it's important for AGI to be built in the public's interest, and we're planning on automating a lot of the work done on computers!
We reached out to almost every colocation space in SF/some in Fremont to get quotes. There wasn't a difference between the quote price and what we ended up paying, though we did negotiate terms + one-time costs.
Did you do any kind of redundancy at least (e.g. putting every 10 disks in RAID 5 or RAID-Z1)? Or I suppose your training application doesn't mind if you shed a few terabytes of data every so often?
atm we don't, and we're a bit unsure whether it's a free lunch given the added complexity. there's a really nice property of having isolated hard drives: you can take any individual one, `sudo mount` it, and you have a nice chunk of training data. that's something anyone can feel comfortable touching without any onboarding to some software stack
I wonder if snapraid would work for this. Especially if your data is mostly written once and then just read, it could be an easy way to add redundancy while keeping isolated individual drives.
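To make the idea concrete, here is a minimal Python sketch of single-parity redundancy over otherwise independent drives. It is not snapraid's actual implementation or on-disk format; the `xor_blocks` helper and the toy `drives` contents are hypothetical, and real drives would be processed block by block rather than held in memory. The point is only that the data drives stay untouched and individually readable, while one extra parity block is enough to rebuild any single lost drive.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical contents of three independent data drives (equal-sized blocks).
drives = [b"shard-A!", b"shard-B!", b"shard-C!"]

# The parity block lives on a separate drive; the data drives are not modified.
parity = xor_blocks(drives)

# Simulate losing drive 1, then rebuild it from the survivors plus parity.
survivors = [d for i, d in enumerate(drives) if i != 1]
recovered = xor_blocks(survivors + [parity])

assert recovered == drives[1]
```

With a single parity block, only one lost drive per parity group can be rebuilt; tolerating more simultaneous failures requires additional parity.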