Hacker News | lijok's comments

I think you severely overestimate your understanding of how these systems work. We’ve been beating the dead horse of “next character approximation” for the last 5 years in these comments. Global maxima would have been reached long ago if that’s all there was to it.

Play around with some frontier models, you’ll be pleasantly surprised.


Did I miss a fundamental shift in how LLMs work?

Until they change that fundamental piece, they are literally that: programs that use math to determine the most likely next token.
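For anyone who wants "use math to determine the most likely next token" spelled out, here is a minimal sketch of just the final selection step. The three-word vocabulary and the scores are made up for illustration, and real systems usually sample from the distribution rather than always taking the top entry, so treat this as a toy and not as how any particular model is implemented.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def most_likely_next_token(vocab, logits):
    """Greedy choice: return the token with the highest probability."""
    probs = softmax(logits)
    best = max(range(len(vocab)), key=lambda i: probs[i])
    return vocab[best], probs[best]


# Hypothetical vocabulary and scores, invented for this example.
vocab = ["cat", "dog", "pan"]
logits = [1.0, 3.0, 0.5]

token, prob = most_likely_next_token(vocab, logits)
print(token, round(prob, 3))
```

Everything the two sides are arguing about lives upstream of this step, in how the scores get produced.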


> that's a whole different level of ignorance, that's much more dangerous.

Why? Is it more dangerous to not know how to fry an egg in a teflon pan, or on a stone over a wood fire? Is it acceptable to know the former but not the latter? Do I need to understand materials science so I can understand how to make something nonstick so I’m not dependent on teflon vendors?


It's relative, not absolute. It's definitely more dangerous to not know how to make your own food than to know something about it - you _need_ food, so lacking that skill is more dangerous than having it.

That was my point, really - that you probably don't need to know "materials science" to declare yourself competent enough in cooking to make your own food. Even if you have only ever cooked eggs in teflon pans, you will likely be able to improvise if the need arises. But once you become so ignorant that you don't even know what food is unless you see it on a plate in a restaurant, already prepared - then you're in a much poorer position to survive, should your access to restaurants be suddenly restricted. But perhaps more importantly - you lose the ability to evaluate food by anything other than appearance & taste, and have to rely completely on others to understand what food might be good or bad for you(*).

(*) even now, you can't really "do your own research", that's not how the world works. We stand on the shoulders of giants - the reason we have so much is because we trust/take for granted a lot of knowledge that our ancestors built up for us. But knowing/proving everything in detail down to the basic axioms/atoms/etc. is one thing; nobody does that. It's a completely different thing to have your "thoughts" and "conclusions" already delivered to you in final form by something (be it Fox News, ChatGPT, New York Times or anything really) and just take them for granted, without having a framework that lets you do some minimal "understanding" and "critical thinking" of your own.


You do need to understand that nonstick coating is unhealthy and not magic. You do need to understand that, if you don't want to add oil to the mix, your options for keeping food from sticking while pan frying are a film of water or an ice cube. Then how sticky it will be, and what the end product will look like, really depends on what you are cooking. That's why there are people who can't fry an egg, people who cook, chefs, and Michelin chefs. Because nuance matters; it's just that the domain where each person wants to apply it is different. I don't care about nuance in hockey picks, but probably some people do. But some domains should concern everyone.

FUCK NO. Who in their right mind would let an LLM connect to prod?

Maybe at a greenfield startup. Where I work this idea wouldn't be entertained for a millisecond.

Many places have "dev", "test", "prod"... but IMHO you need "sandpit" as well.

From an ops point of view as orgs get big enough, dev wraps around to being prod-like... in the sense that it has the property that there's going to be a lot of annoyed people whose time you're wasting if you break things.

You can take the approach of having more guard rails and controls to stop people breaking things but personally I prefer the "sandpit" approach, where you have accounts / environments where anything goes. Like, if anyone is allowed to complain it's broken, it's not sandpit anymore. That makes them an ok place to let agents loose for "whole system" work.

I see tools like this as a sort of alternative / workaround.


Sandpit should be a personal (often local, if possible) dev environment. The reason people get mad about dev being broken for long periods of time is that they cannot use dev to test their changes if your code (that they depend on) is broken in dev for long periods of time.

Agreed on all points. Local loops are faster and safer wherever possible.

But particularly for devops / systems focused work, you lose too much "test fidelity" if you're not integrating against real services / cloud.


There’s no sandpit, only prod and dev, and you’re not allowed to break prod. Your developers work in partitions of prod. Dev is used for DR and other infra testing.

Well that’s just - dumb

Wanna elaborate?

Account vending machines where every dev can spin up their own account are a thing, and they are still under the control of some type of guardrails.

Hey, I get it. I don't want LLMs on prod at all. I made this to let agents connect to production cloned sandboxes, not production itself. I hope this addresses your concerns, but I understand either way. Lmk if you have any other questions.

What’s a production cloned sandbox? Take my comment as feedback that the landing page is anaemic

For example, if you had an on-prem footprint with thousands of VMs, a production cloned sandbox would be a clone of a VM to let AI safely make changes, install packages, etc.

Yeah, working on the landing page. Feel free to ask any other questions!


why does it have to connect to prod in order to be useful?

I think you would be very surprised at a) how useful it would be and b) how lax prod can be depending on the company culture and stakes.

Classic prisoner's dilemma

how is there a prisoner's dilemma pattern of payoffs here?

Microsoft defected, now the correct move for the customer is to defect. Really, based on the pattern they learned, they can defect a thousand times before they lose enough customers over it that they have to publicly apologize and reverse a feature, so customers stepping in mid-game think that there's a tit for tat happening, but they don't realize they're starting off in the hole. We all should have left Microsoft a very long time ago.

Does Kessler syndrome also mean ICBMs become nonviable?


No.

It's not a wall. The risk from going through a dangerous orbit is much much less than the risk from staying there.


That depends on how you define risk. If it means the probability of a collision, then you'd be correct. But if a collision does happen, the consequences will be worse than being in the same orbit. Based on an oversimplified model, debris in orbit is likely to have low relative velocities with respect to an intact satellite in the same orbit, since a large delta-v would change the orbit. (It's not as simple as this, but it's good enough in practice.)

This is actually what ASAT weapons take advantage of. They usually don't even reach orbital velocity, just like ballistic missiles (of course, there are exceptions like the Golden Dome monstrosity). The kill vehicle just maneuvers itself into the path of the satellite and lets the satellite plough into it at hypervelocity.


I remember a short story about Canada preventing total global annihilation in WWIII, by deliberately triggering Kessler syndrome. My google-fu is failing me though.


I would love to read it:)


> to the surprise of absolutely no-one with even the most basic grasp of how economies function

So roughly 98% of the population was surprised?


Like Python?


CUE


Why not Python?


Typing is bolted on rather than a native concept, for one.


Why is that a problem?


Because types are important and having them be a native part of the language creates opportunities for error checking, editor completions, and LLM bounding.
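To make "bolted on" concrete, here is a small sketch of what that looks like in Python: the annotation on a dataclass field is not checked when the program runs, so a wrong value goes through unless you run an external checker or write the check yourself. The `Port` and `CheckedPort` classes are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Port:
    number: int  # annotation only; nothing enforces it at runtime


@dataclass
class CheckedPort:
    number: int

    def __post_init__(self):
        # Hand-written runtime check. bool is a subclass of int,
        # so it is rejected explicitly.
        if isinstance(self.number, bool) or not isinstance(self.number, int):
            raise TypeError(
                f"number must be int, got {type(self.number).__name__}"
            )


# The string is accepted even though the field is annotated as int.
unchecked = Port(number="8080")

try:
    CheckedPort(number="8080")
    rejected = False
except TypeError:
    rejected = True

print(type(unchecked.number).__name__, rejected)
```

A static checker would flag the first call before the program runs, but that is a separate tool, not something the interpreter does for you.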


Invisible scoping and Turing complete

Python is better than bash in ops, been using more Go in this space

Config is another beast and separate languages


I’m not sold that config is a complex enough domain to necessitate another language. What problems is CUE solving when compared to python and why are those problems substantial enough to make it worth learning a new language?


That's exactly the thing -- complexity. CUE bounds complexity, like JSON, YAML, and TOML. But it offers more composability than any of them.


Given that we now have TOML, JSON, INI, CSV, YAML, etc., it seems we are converging on either JSON, YAML or TOML. There is too much inertia behind those three and not much behind CUE right now.


CUE works with all of those languages, so it doesn't matter what the tools or others are using. I can always apply CUE at any point to output their required format as needed.

Keep your legacy config and mess if you want, you're the one missing out

Also, I don't see TOML in the wild enough, and the others have been around long enough, that I must chuckle at these claims about "inertia" and can't take them seriously.


I’m not claiming inertia makes TOML ‘best’, just that it’s clearly not blocked by inertia either. Cargo standardized on TOML years ago, and GitLab Runner has relied on it for a long time. If a format can win in major ecosystems, “people won’t adopt anything new” isn’t the whole story.


I guess by "compatible" you mean the data plane.

There are choices that speak the S3 data plane API (GetObject, ListBucket, etc).

There are no alternatives that support most of the AWS S3 functionality, such as replication and event notifications.


None? I've seen a few projects that purport to be a drop-in replacement for S3.


Can GitHub change their API response rate? Can they increase it? If they do, they’ll break my code ‘cause it expects to receive responses after at least 1200ms. Any faster than that and I get race conditions. I selected the 1200ms number by measuring response rates.

No, you would call me a moron and tell me to go pound sand.

Weird systems were never supported to begin with.


Not surprising. Take any conference and look at the schedule of some CEO or other “socialite” attending said conference. They’re not in the building, they’re running around town attending meetings. At JPMHC everyone is a “socialite”

