Benjammer's comments | Hacker News

>Since COVID in CA, it feels like driving has become far more dangerous with much more lawlessness regarding excessive speeding and running red lights, going into the left lane to turn right in front of stopped cars, all sorts of weird things

NYC has had the same effect since COVID, and over the last year or two it's gotten to the point where at every single light at every busy intersection in Manhattan you get 2-3 cars speeding through the red right after it turns. I ride my bike a lot, so I'm watching drivers a lot, and for the most part the crazy drivers seem to be private citizens in personal cars, not Uber or commercial/industrial drivers.


It’s a very widespread problem, I think, and probably has a complex mix of causes, but my perception as a NYC runner, cyclist, and driver is that there’s a fairly small percentage of extremely antisocial drivers who we allow to behave badly with relative impunity, which itself moves the Overton window of driving behavior towards aggression/chaos, so to speak.

Very frequently when there is a newsmaking incident in which a driver runs people over in some egregious fashion, it turns out that they got dozens of speed camera tickets per year. We know who these people are, we just don’t seem to have any motivation to actually do anything about it.

The city has published research on this, showing drivers who get 30+ speed camera tickets in a year are 50x as likely to be involved in crashes with serious injuries or death, but efforts to actually do something about their behavior are consistently stalled or watered down. Other research points to various causes, including backed up courts and decreased enforcement generally.

https://www.nyc.gov/html/dot/html/pr2025/nyc-dot-advocate-fo...

https://www.nyc.gov/html/dot/downloads/pdf/nyc-driver-behavi...


Yeah I feel like the United States could dramatically improve its road safety if it kept maybe 1-3% of its drivers off the road permanently.


the problem is that our urban planning is so F@#$ed that taking away someone's ability to drive is tantamount to sentencing someone to poverty. In most of the country, you are completely dependent on a car to hold down a job, get groceries and pretty much anything else. In most other countries, not having a car is a mild to moderate inconvenience you can work around.


That's not a good reason. Other forms of criminality and reckless behavior don't get this kind of extreme leniency.

People shouldn't have their license taken away over 1 speeding ticket, but there need to be escalating punishments that include license suspension, community service, and jail time. If someone works their way through all of these and still ends up speeding, then they can't be trusted to drive a vehicle on public roads.


Driver's licenses in most if not all of the U.S. are a joke, and people will still drive with suspended licenses, especially if they have to for work. Driving on a suspended license should let the state impound your car, though; then the suspension would actually be respected.


Jail time should also be considered for repeat offenders.

Cars are a weird sort of thing, where they both are the justification for a surveillance state and lots of monitoring, but we also have extremely lenient penalties. It's difficult for me to understand how the US arrived at our current set of laws.


Why do we care about this type of sentencing to poverty and not every other way we condemn our citizens to poverty, homelessness, starvation, and death?

Maybe that shouldn't be the only alternative in our society


The alternative is that we invest in better public transport and walkable infrastructure. Then we can both increase penalties for driving badly AND raise the bar for getting a driver's license in the first place.


>the problem is that our urban planning is so F@#$ed that taking away someone's ability to drive is tantamount to sentencing someone to poverty.

We're talking about NYC, they'll be fine without cars.


Sounds like a good reason not to commit traffic crimes then.

Start punishing these people severely so that they might serve as an example to the rest


Has that ever worked?

AFAIK, all evidence says that people don't consider consequences. If they did, they wouldn't be behaving like that in the first place. Punitive justice just feels much, much better to people who have a specific set of values.


Yes, it works. The state that I used to reside in has draconian DUI/Traffic laws, and not coincidentally low traffic death rates.

Driving with a revoked or suspended license was a serious charge that resulted in impoundment of the vehicle and mandatory jail time. Repeat offenders would have their vehicles seized.

DUI laws were similarly brutal. 2nd-time offenders faced potentially life-altering charges and penalties. Get into an accident that injures another person while DUI? Huge jail time. A felony DUI results in permanent loss of driving privileges.

Speeding 20 over the limit? Enjoy your reckless driving charge, which is as serious as a DUI charge.

I read that getting a license back after a 2nd DUI carries an average cost of $50k. Getting 2 DUIs within 10 years automatically bumped the 2nd DUI to a felony... no more driving for you.

Lax driving laws and penalties do nothing more than get a lot of people killed.


Escalating punishments tend to take the "1-3%" of bad people who cause all the crime out of society.

Remember from recent history these people that had 34 arrests or 73 arrests and they're out murdering people?


I mean the serving as an example to the rest part. Has that ever worked?


I mean to your point, when someone is robbing a 7/11, in today's atmosphere, no - no they don't consider it because the punishment is fairly low. In Islamic countries, if you steal you will likely lose your hand (or your head). In those countries people REALLY do consider the consequences.

Now I'm not advocating for the second option there. Just something in between. (obviously a lot farther away than the second option).


If my choice is jail, or relocating and finding a new job and home in a city with passable public transit (even if it's just the bus), I know which one I'd pick.


The problem is how do you enforce that though.

The modern world is so cat centric people would rather drive without a license than accept to live without a car. And until you can reliably catch and jail license-less drivers, the bet is worth it for them.


If they were to catch and jail just 1% of license-less drivers, in a visible way, it would be a deterrent to the other 99%. But the rate of being caught & punished is negligible (at least in the states I've lived in) so people know they'll get away with it.

I previously lived in a country where the cops set up random roadblocks to check everyone's license & registration and look for signs of intoxication. When there's a real risk of waking up in a jail cell you're less likely to order that third beer. But in the US when renewing my tabs I feel like the joke's on me because half the cars here seem to have expired tabs or illegal plates and nobody ever checks.


> If they were to catch and jail just 1% of license-less drivers, in a visible way, it would be a deterrent to the other 99%. But the rate of being caught & punished is negligible (at least in the states I've lived in) so people know they'll get away with it.

1% is actually negligible, and would not have a deterrent effect. In fact I wouldn't even be surprised if the effective prosecution rate was somewhat higher than this already.

> I previously lived in a country where the cops set up random roadblocks to check everyone's license & registration and look for signs of intoxication.

I live in a country (France) where this is still the case, and where driving crimes are the second biggest source of jail time after drug trafficking, yet alcohol is still the #1 cause of death on the road, and an estimated 2% of people drive without a license after having lost it (and are responsible for ~5% of accidents).


Alcohol will likely always be a factor in the worst accidents. But France is doing something right because your fatal accident rate per capita is one third that of America's [0].

[0] https://en.wikipedia.org/wiki/List_of_countries_by_traffic-r...


It's not France in particular though, America is the outlier among developed nations. In fact France is a bit behind most other European nations (but not by much).


How much of a deterrent can the police possibly impose that would outweigh the cost of not driving at all, which (in your country) is ending up starving and homeless?


The cops will never deter everyone from breaking the law, but they don't have to. They just need to deter a large enough % of the population to have a positive effect.

Driving while intoxicated is not a crime of desperation. Even celebrities are often caught for DUI despite being able to afford a full-time limo driver.

Most people who drive intoxicated have jobs and reputations they'd prefer to keep, and families at home they would rather not be separated from or have to explain an arrest to.

And to be clear, we can't solve all the problems with a single measure. I'd like to see not just better law enforcement, but also a social safety net that ensures nobody is ever starving or homeless.


The crime under discussion is not driving while intoxicated but driving without a license.

But if you're going to bring that up anyway, how are people supposed to get their car home from the bar in a place where the government hates public transport?


>But if you're going to bring that up anyway, how are people supposed to get their car home from the bar in a place where the government hates public transport?

An anecdote related to me by a former (Florida) county sheriff's deputy answers that question:

Many police will stake out bars around closing time, waiting for the intoxicated to get behind the wheel so they can be stopped, breathalyzed, and arrested.

However, patrons were aware of this and the deputy saw a patron leave, stumbling, drop their car keys several times, then get into their car and drive away.

When stopping said individual, the breathalyzer and field sobriety test showed the driver to be stone cold sober. As such, the deputy sent the driver on their way.

Returning to the bar parking lot, he found that all the other patrons had departed while he was wasting his time on the one sober person -- dubbed the "designated decoy."

I'm sure other variations are and have been in use in the US for a long time -- since most places don't have public transportation or reliable taxis.

The "cars first, public transit last, if at all" culture in most of the US makes the likelihood of DUI/DWI and crashes/injuries/fatalities much, much worse.


> The crime under discussion is not driving while intoxicated but driving without a license.

How did these people lose their license in the first place? The most common reason is DUIs. Followed by multiple instances of reckless driving. People are less likely to lose their license to begin with if they know there will be real consequences.

And there's a large enough population for whom driving without a license is not a crime of desperation. In many places there _is_ a public transport alternative (even if it's slow and crappy). I used to give a lift every day to a colleague who had lost his license; I enjoyed the company and he paid for my gas. Many people can make an arrangement like this.

> But if you're going to bring that up anyway, how are people supposed to get their car home from the bar in a place where the government hates public transport?

Having been in this position many times: take an Uber, then Uber back to get your car the next day and plan better (or don't drink) next time.


>How did these people lose their license in the first place? The most common reason is DUIs. Followed by multiple instances of reckless driving. People are less likely to lose their license to begin with if they know there will be real consequences.

When I was in college in Ohio, one of my suite mates had several DUI arrests. After the first, his license was suspended -- yet he was allowed to drive to/from work/school because public transportation was minimal. After the third DUI, he was sentenced to 30 days in jail -- served on the weekends so he could continue going to school without interruption -- and still drive his car to/from work/school.

I was flabbergasted by that. But I guess that's how things are often handled in places without public transportation. And more's the pity.


So, you're still refusing to discuss driving without a license, like the rest of us are talking about


Am I? The second paragraph is about how to get around legally if you don't have a license. The first and third paragraphs are about not making the bad decisions that get you into that situation in the first place (prevention is better than cure). What am I missing?

This thread is about driving without a license, but from the perspective of enforcing the laws that keep unlicensed drivers (who are generally more dangerous) off the roads to make the community safer. The point I'm trying to make is that while, yes, it's unrealistic to expect 100% of unlicensed drivers to stay off the road (for reasons you have outlined), there is a large enough % of unlicensed drivers for whom visible law enforcement would be a deterrent, and that would at least be an improvement over today.


> "The modern world is so cat centric"

If only


>it turns out that they got dozens of speed camera tickets per year

To me the answer is quite simple for any of these. Treat repeated small infractions like bigger and bigger infractions. E.g. double the cost every iteration if it happens within a specific time frame.

Ok, you speed once? $100. Twice $200. Thrice $400. And so on. We only reset if you don’t reoffend for any speeding in 5 years. If you want to speed 20 times in 5 years, ok, go ahead. You pay $52,428,800.

Bonus points for making it start at something relative to your salary. People will stop at some point out of self-preservation.
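For illustration, the doubling schedule in code (a throwaway Python sketch; the salary-relative variant is just the base scaled by a factor):

    # Doubling fine schedule sketched above; all numbers are illustrative.
    def speeding_fine(offense_number, base=100.0, salary_factor=1.0):
        # nth offense within the 5-year reset window costs base * 2^(n-1),
        # optionally scaled relative to the offender's salary
        return base * salary_factor * 2 ** (offense_number - 1)

    # 20th offense in 5 years: 100 * 2**19 == $52,428,800
    assert speeding_fine(20) == 52_428_800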

If you don’t believe high fines work, drive from Switzerland to Germany. In Germany the Swiss have no problem speeding, because the fines are laughable. While south of the border they behave very nicely on the street.

You could extend this to other crimes. Google and Microsoft happily pay fines, since it's cheaper than what they make from breaking anti-trust regulations. If you doubled the fine on each infraction, they would at some point start feeling the pain.


I’m strongly in favor of exponential punishment with very light punishments for first offences. It allows fluke infractions or bad luck to go without being punished too hard, but severely punish the small anti-social group that brings the rest of society down with it. So maybe if you accidentally run a red light once it is a $10 ticket, but next time it is $100, and then $1000, and then $10000, and then $100000.


I'm in favor of escalating punishment, but it shouldn't reset, it should decay. Say 3 years with no tickets and it goes down one level.


That's fine as well. I just don't want to punish you for life for small infractions every once in a while. Humans make mistakes.


I have noticed this going between Switzerland and Italy in particular—all of the cars going incredibly fast on the autostrada seem to have Swiss plates!


Some countries have a points system, where every infraction costs points in addition to the fine. At a certain number of points you lose your license. Pretty effective at dissuading serial petty infringers!


Most US states do, too. But people will drive without a license because it’s the only way to get to anywhere in most of the country. And I suspect we’re light on enforcement for the same reason.


"In Germany the Swiss have no problem speeding, because the fines are laughable. "

That is because in Germany, cars are a religion substitute, and just as there can be no general speed limit on the Autobahn, there can be no real enforcement of speeding.

The fines actually increased a lot in recent years. Still cheap, though. And if there are radar cameras, they are often placed where speeding is quite safe, to make money from fines, rather than where speeding is actually dangerous (close to schools, etc.).

It is basically an archaic thing: the bigger the man, the bigger and louder his car and the faster he goes. It shows status.

So I imagine in New York City it works just the same. When the big guys like speeding and the big guys control the state .. then how can there be meaningful regulation of that?

(To confess, I like to drive fast, too. But not in places where kids can jump or fall anytime on the road)


> it turns out that they got dozens of speed camera tickets per year.

Are you saying you can legally keep driving despite dozens of speed camera tickets in a year, as long as you keep paying the fines?

That's wild.

Around here (Melbourne, Australia), you'd lose your licence very quickly. A single speeding ticket is a minimum of 3 points off your licence (of which you have 12), and bigger infringements lose more points. So at most you could speed 4 times, but probably fewer. And it takes a few years for the points to come back.


For these reasons, many countries have adopted a point-based system for driving licences. E.g., in France you have 12 points; driving over the speed limit is a fine, but it also removes up to 6 points depending on the speed.

If you go down to 0 points, your licence is suspended.

If you stay without a fine for long enough, you get back points.
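As a rough sketch of those mechanics (simplified Python, not the actual legal rules):

    # Simplified points-based licence, per the description above
    class Licence:
        MAX_POINTS = 12

        def __init__(self):
            self.points = self.MAX_POINTS

        def suspended(self):
            return self.points <= 0  # suspended once you hit 0 points

        def infraction(self, points_lost):
            # e.g. speeding removes up to 6 points depending on the speed
            self.points = max(0, self.points - points_lost)

        def clean_period(self):
            # staying without a fine long enough restores points
            if not self.suspended():
                self.points = min(self.MAX_POINTS, self.points + 1)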

Some countries have fines that depend on how much you make. Some countries will destroy your car if you really behave badly.


New York actually does have a points system, but since the points are tied to the driver's license rather than the car itself, you only get them if you're actually pulled over, not from cameras. Within NYC there's a fair amount of camera enforcement but comparatively very little by the police directly, so drivers whose licenses might otherwise be suspended via points are still driving around.

The mechanisms for keeping people off the road are also just weaker in the US—I believe the penalties for driving with a suspended license are comparatively lighter, plus if your license is suspended you can often still get a "restricted" license that lets you drive to work.


France gets around that by assuming it's the car owner's fault. If you were not driving the car during the infraction, the person who was driving must fill out a form saying he or she did, and take the hit voluntarily.

If the car's owner is a company, the company must declare a default driver for this purpose.


What the heck? How can you get that many tickets and still have a license? (Or manageable insurance costs for that matter lol)


When New York State authorized the NYC speed camera program they explicitly precluded it from reporting to insurance, and made it not part of the “points” system that triggers license suspension if you accumulate too many infractions, so all that happens is that you get a $50 ticket each time.

If you don’t pay the tickets, your car is at risk of being booted, but if you don’t park on the street or choose to obscure your license plate when you do (how did that leaf get stuck there!?), there aren’t many repercussions.

There was an attempt at a program to actually seize these cars, originally it would have kicked in at 5 tickets/year for immediate towing, but it was watered down to 15 tickets a year triggering a required safe driving class. They sort of half-assed the execution of that, then pointed at the limited results and cancelled it altogether. There’s an effort to pass a state law about this, we’ll see if it makes progress.


> When New York State authorized the NYC speed camera program they explicitly precluded it from reporting to insurance, and made it not part of the “points” system that triggers license suspension if you accumulate too many infractions, so all that happens is that you get a $50 ticket each time.

At the risk of hearing a depressing answer...why?


Because they set the trigger speeds low, to raise revenue.


Unless you live in NYC or a handful of other places, an adult in the US who can't drive (or afford to pay someone to drive for them) is in the equivalent of economic-social prison. Almost all personal transportation infrastructure is designed around car travel, anything else is at best an afterthought and at worst impossible.

Don't get it twisted, I agree with you. The US is far too tolerant of dangerous driving. We are too dependent on cars for travel, and this is a consequence of it.


I'm just shocked that you can have that many offenses and not be in jail. I nearly lost my license in high school with FAR less than 30 incidents. That amount of leeway just doesn't make sense at all, you're so obviously a danger at that point.


Camera tickets are in a weird place legally. They might not be legal, because of the 6th Amendment and due process requirements, so states tread lightly. A light touch gets a lot of compliance and is most likely self-funding; enforcement by humans may be more effective for habitual violators, but you most likely can't have as much coverage and still be self-funding.

If you had 30 speeding tickets issued in person, it would be a lot different than 30 speeding tickets issued by machine.


If they're talking about automated speed cameras, I guess there's the problem of not being able to correlate the plate of the car with a particular human; a bill simply gets sent to the owner of the car. But maybe if we impounded cars at some point, people wouldn't be loaning cars out to their licenseless friends.


It's simple. People drive without a license. Not having a license doesn't preclude someone from driving a vehicle.


I once drove my car a month after its registration expired. I was pulled over twice in the same day on the same ride home from work, in two separate counties with two separate legal systems. Completely my fault, of course. I went to the court of each county on the appointed day for my tickets, explained what happened to the clerks, and had both tickets waived after showing proof of current registration.

The only problem was that the two counties had shared but not integrated records systems with each other, as well as with the state driver's license authority. For two years, my cases got jumbled around the three systems, triggering plate and license suspensions, which led to me getting pulled over four times in that two-year period.

It eventually all got sorted out without a lawyer. I didn't have to pay for anything beyond the first two tickets, plus many hours on the phone. What was really notable was that by stop number four, from the perspective of the cop who pulled me over, I was someone who had been caught driving with a suspended registration and/or license three times in a row. I was allowed to drive away three out of four times, including the last time, and the one time the cop would not let me drive, he waited with me patiently until my wife could be dropped off to get the car.

Maybe I'm just lucky, but to be honest I was surprised how not a big deal it was to anyone.


Perhaps they don’t have either.


Funny, I ride a bike in Manhattan & BK (but only post-COVID) and I very rarely experience cars going through reds. IME cars here respect traffic lights and stop signs. I try to count cars actually running a red ("speeding" through it) and it's rare, say 1/mo tops. YMMV I guess :)

They do not, though, give an owl's hoot about yielding to straight traffic when turning. I suspect NY drivers are on a big group chat encouraging each other to cut off cyclists and pedestrians, by turning into their lane whenever they see one, and promptly parking there for an hour.

And there's the "squeeze", and "crowding the box". Almost like no car here is truly allowed to ever really stop so they're always gently rolling, just a little, juuuuust a little, just, maybe, I know it's red but maybe just a lil squeeze into the intersection, maybe, squeeze, ...

I don't know how to explain it but if you've been here you'll recognize it I'm sure.


I remember seeing a PSA that it was legal to park (one row of cars only) in bike lanes in specific situations: in emergencies, when being arrested by cops, to get medicine for a sick relative, near schools at school time to pick up the children, to drop off a delivery, to pick up bread at the bakery when it's very short, and when nearby car parks are full. I think it was on April Fools'.


The worst are assholes "blocking the box" while there is space to pull forward along the curb or even the neighboring lane. This should be a tripled fine, simply for the monumental level of douchebaggery displayed.


I haven’t seen driving behavior change in NYC over the past two decades.

Also, NYC has a different driving attitude than, say, Dallas. What people call aggression is often a difference in expectations. Drivers change lanes and merge far more assertively than in other parts of the country. As long as you aren’t causing the car behind you to panic brake, it’s considered acceptable. Hesitation from drivers tends to get more opprobrium than tight merges.

People block bike lanes and the box all the time. It’s annoying and you shouldn’t do it. But a lot of the rage is often unjustified. That FedEx truck needs to park somewhere and they aren’t going to roll over a fruit stand to do it.

It’s a dense, packed city. If you can’t give and take, you are going to hate it here.


I’ve lived here my entire life, and there’s a significant difference between normal “aggressive” driving and many of the driving patterns that have emerged post-COVID. For example: blocking the box is (unfortunately) somewhat normal, while running through red lights and making illegal turns has (anecdotally) increased significantly.


As the old saw about driving in NYC goes:

Green means 'Go!'

Yellow means 'Go faster!'

Red means 'The next six cars may go through the intersection.'

Okay, the third part is a little hyperbolic.

The above is from the 1980s and AFAICT (I've lived here nearly 60 years) not much has changed.


Traffic safety engineers do not agree with all of those things. Don't be an aggressive driver even if everyone else is.


Could we verify this against data? Surely if people are driving way worse post-COVID, that would show up against pre-COVID data by way of accidents, fatalities, ticket issuances, etc.?

To the OP, I'm not sure I buy into it being tied to THC which seems to be the implication. Canada isn't seeing this trend, afaik.


Those who are autopsied due to traffic deaths clearly show a massive amount of THC impairment.

But the data here also show that the level is consistent before and after the legalization of cannabis in Ohio. So legalizing cannabis in Ohio did not cause a big increase in impairing levels of THC among those who died in traffic.


It always feels to me like these types of tests are somewhat intentionally ignorant of how LLM cognition differs from human cognition. To me, they don't really "prove" or "show" anything other than, simply, that LLM thinking works differently than human thinking.

I'm always curious if these tests have comprehensive prompts that inform the model about what's going on properly, or if they're designed to "trick" the LLM in a very human-cognition-centric flavor of "trick".

Does the test instruction prompt tell it that it should be interpreting the image very, very literally, and that it should attempt to discard all previous knowledge of the subject before making its assessment of the question, etc.? Does it tell the model that some inputs may be designed to "trick" its reasoning, and to watch out for that specifically?

More specifically, what is a successful outcome here to you? Simply returning the answer "5" with no other info, or back-and-forth, or anything else in the output context? What is your idea of the LLMs internal world-model in this case? Do you want it to successfully infer that you are being deceitful? Should it respond directly to the deceit? Should it take the deceit in "good faith" and operate as if that's the new reality? Something in between? To me, all of this is very unclear in terms of LLM prompting, it feels like there's tons of very human-like subtext involved and you're trying to show that LLMs can't handle subtext/deceit and then generalizing that to say LLMs have low cognitive abilities in a general sense? This doesn't seem like particularly useful or productive analysis to me, so I'm curious what the goal of these "tests" are for the people who write/perform/post them?


The marketing of these products is intentionally ignorant of how LLM cognition differs from human cognition.

Let's not say that the people being deceptive are the people who've spotted ways that that is untrue...


I thought adversarial testing like this was a routine part of software engineering. He's checking to see how flexible it is. Maybe prompting would help, but it would be cool if it was more flexible.


So the idea is what? What's the successful outcome look like for this test, in your mind? What should good software do? Respond and say there are 5 legs? Or question what kind of dog this even is? Or get confused by a nonsensical picture that doesn't quite match the prompt in a confusing way? Should it understand the concept of a dog and be able to tell you that this isn't a real dog?


You know, I had a potential hire last week. I was interviewing this one guy whose resume was really strong; it was exceptional in many ways, plus his open-source code was looking really tight. But at the beginning of the interview, I always show the candidates the same silly code example with signed integer overflow undefined behavior baked in. I did the same here and asked him if he saw anything unusual with it, and he failed to detect it. We closed the round immediately and I gave a no-hire decision.


Does the ability to verbally detect gotchas in short conversations dealing only with text on a screen or white board really map to stronger candidates?

In actual situations you have documentation, editor, tooling, tests, and are a tad less distracted than when dealing with a job interview and all the attendant stress. Isn't the fact that he actually produces quality code in real life a stronger signal of quality?


It's bias, and from my experience, many people do not know how to assess an interviewee to bring out their best. My example was luckily just a made-up one that sarcastically portrays how people nowadays are assessing LLM capabilities too. No difference.


No, it’s just a test case to demonstrate flexibility when faced with unusual circumstances


You're correct; however, midwit people who don't actually fully understand all of this will latch on to one of the early difficult questions that was shown as an example, and then continue to use that over and over without really knowing what they're doing, while the people developing the model and also testing the model are doing far more complex things.


This is the first time I hear the term LLM cognition and I am horrified.

LLMs don't have cognition. LLMs are statistical inference machines which predict a given output for some input. There are no mental processes, no sensory information, and certainly no knowledge involved, only statistical reasoning, inference, interpolation, and prediction. Comparing the human mind to an LLM is like comparing a rubber tire to a calf muscle, or a hydraulic system to the gravitational force. They belong in different categories and cannot be responsibly compared.

When I see these tests, I presume they are made to demonstrate the limitations of this technology. It is both relevant and important that consumers know they are not dealing with magic, and are not being sold a lie (in a healthy economy a consumer protection agency should ideally do that for us; but here we are).


>They belong in different categories

Categories of _what_, exactly? What word would you use to describe this "kind" of which LLMs and humans are two very different "categories"? I simply chose the word "cognition". I think you're getting hung up on semantics here a bit more than is reasonable.


> Categories of _what_, exactly?

Precisely. At least apples and oranges are both fruits, and it makes sense to compare e.g. the sugar contents of each. But an LLM model and the human brain are as different as the wind and the sunshine. You cannot measure the windspeed of the sun and you cannot measure the UV index of the wind.

Your choice of words here was rather poor in my opinion. Statistical models do not have cognition any more than the wind has ultraviolet radiation. Cognition is a well-studied phenomenon; there is a whole field of science dedicated to cognition. And while the cognition of animals is often modeled using statistics, statistical models in themselves do not have cognition.

A much better word here would be "abilities". That is, these tests demonstrate the different abilities of LLM models compared to human abilities (or even the abilities of traditional [specialized] models, which often do pass these kinds of tests).

Semantics often do matter, and what worries me is that these statistical models are being anthropomorphized way more than is healthy. People treat them like the crew of the Enterprise treated Data, when in fact they should be treated like the ship's computer. And I think this is because of a deliberate (and malicious/consumer-hostile) marketing campaign from the AI companies.


It's easy to handwave away if you assign arbitrary analogies though.

If we stay on topic, it's much harder to do, since we don't actually know how the brain works, other than, at least, that it is a computer doing (almost certainly) analog computation.

Years ago I built a quasi-mechanical calculator. The computation was done mechanically, and the interface was done electronically. From a calculator's POV it was an abomination, but a few abstraction layers down, they were both doing the same thing, albeit with my mecha-calc being dramatically worse at it.

I don't think the brain is an LLM the way my mecha-calc was a (slow) calculator, but I also don't think we know enough about the brain to firmly put it many degrees away from an LLM. Both are in fact electrical signal processors with heavy statistical computation. I doubt you believe the brain is a trans-physical magic soul box.


But we do know how the brain works. We have extensively studied the brain; it is probably one of the most studied phenomena in our universe (well, barring alien science), and we do know it is not a computer but a neural network[1].

I don't believe the brain is a trans-physical magic soul box, nor do I think an LLM is doing anything similar to a brain (apart from some superficial similarities; some [like the artificial neural network] are in LLMs because they were inspired by the brain).

We use the term cognition to describe the intrinsic properties of the brain and how it transforms a stimulus into a response, and there are several fields of science dedicated to studying this cognition.

Just to be clear, you can describe the brain as a computer (a biological computer, totally distinct from digital, or even mechanical, computers), but that will only be an analogy; or rather, you are describing the extrinsic properties of the brain, some of which it happens to share with some of our technology.

---

1: Note, not an artificial neural network, but an OG neural network. AI models were largely inspired by biological brains, and in some parts model brains.


Wind and sunshine are both types of weather, what are you talking about?


They both affect the weather, but in a totally different way, and by completely different means. Similarly the mechanisms in which the human brain produces output is completely different from the mechanism in which an LLM produces output.

What I am trying to say is that the intrinsic properties of the brain and an LLM are completely different, even though the extrinsic properties might appear the same. This is also true of the wind and the sunshine. It is not unreasonable to argue (though I would disagree) that "cognition" is almost by definition the sum of all intrinsic properties of the human mind (I would disagree only on the merit of animal and plant cognition existing, with the former [probably] having intrinsic properties similar to human cognition's).


Artificial cognition has been an established term long before LLMs. You're conflating human cognition with cognition at large. Weather and cognition are both categories that contain many different things.


Yeah, I looked it up yesterday and saw that artificial cognition is a thing, though I must say I am not a fan and I certainly hope this term does not catch on. We are already knee-deep in bad terminology because of artificial intelligence ("intelligence" already being extremely problematic even without the "artificial" qualifier in psychology) and machine learning (the latter being infinitely better but still not without issues).

If you can't tell, I take issue when terms are taken from psychology and applied to statistics. The terminology should flow in the other direction, from statistics into psychology.

My background is that I did undergraduate studies in both psychology and statistics (though I dropped out of statistics after 2 years), and this is the first time I've heard about artificial cognition, so I don't think this term is popular; a short internet search seems to confirm that suspicion.

Out of context, I would guess "artificial cognition" relates to cognition the way artificial neural networks relate to neural networks; that is, models that simulate the mechanisms of human cognition and recreate some stimulus → response loop. However, my internet search revealed (thankfully) that this is not how researchers are using this (IMO misguided) term.

https://psycnet.apa.org/record/2020-84784-001

https://arxiv.org/abs/1706.08606

What the researchers mean by the term (at least the ones I found in my short internet search) is not actual machine cognition, nor claims that machines have cognition, but rather an approach of research which takes experimental designs from cognitive psychology and applies them to learning models.


This is "category" in the sense of Gilbert Ryle's category error.

A logical type or a specific conceptual classification dictated by the rules of language and logic.

This is exactly getting hung up on the precise semantic meaning of the words being used.

The lack of precision is going to have huge consequences with bets this large on the idea that we have "intelligent" machines that "think" or have "cognition", when in reality we have probabilistic language models and all kinds of category errors in the language surrounding these models.

Probably a better example here is that category in this sense is lifted from Bertrand Russell’s Theory of Types.

It is the loose equivalent of asking why are you getting hung up on the type of a variable in a programming language? A float or a string? Who cares if it works?

The problem is in introducing non-obvious bugs.


>It is the loose equivalent of asking why are you getting hung up on the type of a variable in a programming language? A float or a string? Who cares if it works?

No, it's not. This is like me saying "string and float are two types of variables" and you going "what is a 'type' even??? Bertrand Russell said some bullshit and that means I'm right and you suck!"


When you are talking about machine cognition you are talking philosophy, and are building on what other philosophers have done in this area. One of these is Bertrand Russell, who founded type theory; another is Gilbert Ryle, who described the category mistake as "a property is ascribed to a thing that could not possibly have that property".

Cognition is a term from psychology, not statistics. If we are applying type theory, cognition would be a (non-pure) function term which takes the atomic term stimulus and maps it to another atomic term, behavior, and involves state of various types including knowledge, memory, attention, emotions, etc. In cognitive psychology this is notated S-R, where S stands for stimulus and R stands for response.

Attributing cognition to machine learning algorithms superficially takes this S-R function and replaces all the state variables of cognition with weight matrices; at that point you are no longer talking about cognition. The S-R mapping of a machine learning algorithm is, most glaringly (randomness aside), a pure function: during the S-R mapping of prompt to output, nothing is stored in the long-term memory of the algorithm, the attention is not shifted, the perception is not altered, no new knowledge is added, etc. Machine learning algorithms are simply computing, not learning.
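A toy contrast of that purity point, entirely my own sketch (neither a real model of cognition nor of inference):

    # A seeded LLM call is a pure mapping: same inputs, same output,
    # and answering leaves the model itself unchanged.
    def llm_respond(prompt, weights, seed=0):
        return "completion of " + prompt  # stand-in for real inference

    # A cognitive S-R loop is impure: responding mutates internal state
    # (memory, attention, knowledge...).
    class Organism:
        def __init__(self):
            self.memory = []

        def respond(self, stimulus):
            self.memory.append(stimulus)  # the organism itself changes
            return "behavior shaped by %d past stimuli" % len(self.memory)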


You'll need to explain the IMO results, then.


Human legs and car tires can both take a human and a car respectively to the finish line of a 200 meter track course, the car tires do so considerably quicker than a pair of human legs. But nobody needs to describe the tire‘s running abilities because of that, nor even compare a tire to a leg. A car tire cannot run, and it is silly to demand an explanation for it.


Sure car tires can run- if they're huaraches.


I see.


> Does the test instruction prompt tell it that it should be interpreting the image very, very literally, and that it should attempt to discard all previous knowledge of the subject before making its assessment of the question, etc.?

No. Humans don't need this handicap, either.

> More specifically, what is a successful outcome here to you? Simply returning the answer "5" with no other info, or back-and-forth, or anything else in the output context?

Any answer containing "5" as the leading candidate would be correct.

> What is your idea of the LLMs internal world-model in this case? Do you want it to successfully infer that you are being deceitful? Should it respond directly to the deceit? Should it take the deceit in "good faith" and operate as if that's the new reality? Something in between?

Irrelevant to the correctness of an answer to the question, "how many legs does this dog have?" Also, asking how many legs a 5-legged dog has is not deceitful.

> This doesn't seem like particularly useful or productive analysis to me, so I'm curious what the goal of these "tests" are for the people who write/perform/post them?

It's a demonstration of the failures of the rigor of out-of-distribution vision and reasoning capabilities. One can imagine similar scenarios with much more tragic consequences when such AI would be used to e.g. drive vehicles or assist in surgery.


amusement park --> park amusement... Is that the joke?


I mean ok, but it's all just prompting on top of the same base model weights...

I tried the same prompt, and I simply added to the end of it "Prioritize truth over comfort" and got a very similar response to the "improved" answer in the article: https://chatgpt.com/share/68efea3d-2e88-8011-b964-243002db34...

This is sort of a "Prompting 101" level concept - indicate clearly the tone of the reply that you'd like. I disagree that this belongs in a system prompt or default user preferences, and even if you want to put it in yours, you don't need this long preamble as if you're "teaching" the model how the world works - it's just hints to give it the right tone, you can get the same results with just three words in your raw prompt.


My method is that I work together with the LLM to figure out the step-by-step plan.

I give an outline of what I want to do and some breadcrumbs for any relevant existing files, ask it to figure out context for my change and to write up a summary of the full scope of the change we're making, including an index of file paths to all relevant files with a very concise blurb about what each file does/contains, and then also to produce a step-by-step plan at the end. I generally always have to tell it NOT to think about this like a traditional engineering team plan (this is a senior engineer and an LLM code agent working together; think only about technical architecture), otherwise you get "phase 1 (1-2 weeks), phase 2 (2-4 weeks), step a (4-8 hours)" sort of nonsense timelines in your plan.

Then I review the steps myself to make sure they are coherent and make sense, and I poke and prod the LLM to fix anything that seems weird, either fixing context or directions or whatever. Then I feed the entire document to another clean context window (or two or three) and ask it to "evaluate this plan for cohesiveness and coherency, tell me if it's ready for engineering or if there's anything underspecified or unclear" and iterate on that 1-3 times, until a fresh context window says "This plan looks great, it's well crafted, organized, etc." and doesn't give feedback.

Then I go to a fresh context window and tell it "Review the document @MY_PLAN.md thoroughly and begin implementation of step 1, stop after step 1 before doing step 2" and I start working through the steps with it.
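The review stage, as a loop (a sketch only; `ask_llm` is a hypothetical stand-in for sending one prompt to a fresh context window):

    # Iterate plan reviews in fresh contexts until a pass gives no feedback
    def ask_llm(prompt):
        raise NotImplementedError  # wire up your chat tool or API here

    def review_until_clean(plan_path, max_rounds=3):
        for _ in range(max_rounds):
            feedback = ask_llm(
                "Evaluate the plan in @%s for cohesiveness and coherency; "
                "tell me if it's ready for engineering or if anything is "
                "underspecified or unclear." % plan_path)
            if "looks great" in feedback.lower():  # crude "no feedback" check
                return
            ask_llm("Revise @%s to address this feedback: %s"
                    % (plan_path, feedback))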


The problem is, by the time you’ve gone through the process of making a granular plan and all that, you’ve lost all productivity gains of using the agent.

As an engineer, especially as you get more experience, you can kind of visualize the plan for a change very quickly and flesh out the next step while implementing the current step

All you have really accomplished with the kind of process described is to make the world's least precise, most verbose programming language.


I'm not sure how much experience you have, I'm not trying to make assumptions, but I've been working in software over 15 years. The exact skill you mentioned - can visualize the plan for a change quickly - is what makes my LLM usage so powerful, imo.

I can say the right precise wording in my prompt to guide it to a good plan very quickly. As the other commenter mentioned, the entire above process only takes something like 30-120 minutes depending on scope, and then I can generate code in a few minutes that would take 2-6 weeks to write myself, working 8-hour days. Then it takes something like 0.5-1.5 days to work out all the bugs and clean up the weird AI quirks, and maybe have the LLM write some Playwright tests (or whatever testing framework you use) as integration tests to verify its own work.

So yes, it takes significant time to plan things well for good results, and yes, the results are often sloppy in some parts and have weird quirks that no human engineer would make on purpose. But if you stick to working on prompt/context engineering and getting better and faster at the above process, the key unlock is not that it just does the same coding for you, generating the code instead. It's that you can work as a solo developer at the abstraction level of a small startup company.

I can design and implement an enterprise-grade SSO auth system over a weekend that integrates with Okta and passes security testing. I can take a library written in one language and fully re-implement it in another language in a matter of hours. I recently took the native Android and iOS libraries for a fairly large, non-trivial SDK and had Claude build me a React Native wrapper library with native modules that integrates both native libraries and presents a clean, unified interface and TypeScript types to the React Native layer. This took me about two days, plus one more for validation testing. I have never done this before. I have no idea how "Nitro Modules" work, or how to configure a React Native library from scratch. But given the immense scaffolding abilities of LLMs, plus my debugging/hacking skills, I can get to a really confident place really quickly, and I regularly ship production code at work with this process.


It takes maybe 30min and then it can go off and generate code that would take literal weeks for me to write. There are still huge productivity gains being had.


That has not been my experience at all.

It takes 30-40 minutes to generate a plan and it generates code that would have taken 20-30 minutes to write.

When it’s generating “weeks” worth of code, it inevitably goes off the rails and the crap you get goes in the garbage.

This isn’t to say agents don’t have their uses, but i have not seen this specific problem actually work. They’re great for refactoring (usually) and crapping out proof of concepts and debugging specific problems. It’s also great for exploring a new code base where you have little prior knowledge.

It makes sense that it sucks at generating large amounts of code that fit cohesively into the project. The context is too small. My code base is millions of lines of code. My brain has a shitload more of that in context than any of the models. So they have to guess and check, and they end up incorrect and poor, and I don't. I know which abstractions exist that I can use. It doesn't. Sometimes it guesses right. Oftentimes it doesn't. And once it's wrong, it's fucked for the entire rest of the session, so you just have to start over.


Works for me. Not vanilla Claude code though- you need to put some work into generating slash commands and workflows that keep it on task and catch the bad stuff.

Take this for example: https://www.reddit.com/r/ClaudeAI/comments/1m7zlot/how_planm...

This trick is just the basic stuff, but it works really well. You can add on and customize from there. I have a “/task” slash command that will run a full development cycle with agents generating code, many more (12-20) agent critics analyzing the unstaged work, all orchestrated by a planning agent that breaks the complex task into small atomic steps.

The first stage of this process (generating the plan) is interactive. It can then go off and produce 10k LOC spread over a dozen commits, and the quality is good enough to ship most of the time. If it goes off the rails, keep the plan document but nuke the commits and restart. On the Claude MAX plan this costs nothing.

This is how I do all my development now. I spend my time diagnosing agent failures and fixing my workflows, not guiding the agent anymore (other than the initial plan document).

I still review every line of code before pushing changes.


Are you unaware of the concept of a junior engineer working in a company? You realize that not all human code is written by someone with domain expertise, right?

Are you aware that your wording here is implying that you are describing a unique issue with AI code that is not present in human code?

>What would have happened if someone without your domain expertise wasn't reviewing every line and making the changes you mentioned?

So, we're talking about two variables, so four states: human-reviewed, human-not-reviewed, ai-reviewed, ai-not-reviewed.

[non ai]

*human-reviewed*: Humans write code, sometimes humans make mistakes, so we have other humans review the code for things like critical security issues

*human-not-reviewed*: Maybe this is a project with a solo developer and automated testing, but otherwise this seems like a pretty bad idea, right? This is the classic version of "YOLO to production", right?

[with ai]

*ai-reviewed*: AI generates code, sometimes AI hallucinates or gets things very wrong or over-engineers things, so we have humans review all the code for things like critical security issues

*ai-not-reviewed*: AI generates code, YOLO to prod, no human reads it - obviously this is terrible and barely works even for hobby projects with a solo developer and no stakes involved

I'm wondering if the disconnect here is that actual professional programmers are just implicitly talking about going from [human-reviewed] to [ai-reviewed], assuming nobody in their right mind would just _skip code reviews_. The median professional software team would never build software without code reviews, imo.

But are you thinking about this as going from [human-reviewed] straight to [ai-not-reviewed]? Or are you thinking about [human-not-reviewed] code for some reason? I guess it's not clear why you immediately latch onto the problems with [ai-not-reviewed] and seem to refuse to acknowledge the validity of the state [ai-reviewed] as being something that's possible?

It's just really unclear why you are jumping straight to concerns like this without any nuance for how the existing industry works regarding similar problems before we used AI at all.


It appears that you are jumping on posts by people who aren't an LLM advocate. My interest in debating a topic with someone who is anything approaching a zealot is near zero, but I will say this:

Yes, junior developers exist. They generally consume more resources than they produce in value, with the hope that they will become productive at some later point. LLMs do not appear to be capable of crossing that threshold at scale, and even if they are, there is a clear cost/benefit tradeoff that can be performed with respect to a senior developer's time, the amount of improvement an LLM can derive from that time, and the value of that improvement to the organization when compared to letting others shoulder that expense.

People skip code reviews all the time, where making a cursory, inadequate effort still can be considered skipping it. I agree a professional development team doesn't skip code reviews of their own code, but few people perform any meaningful review of dependencies.

That said, having people review large amounts of AI-generated code is plausibly (and practically in my experience and in some limited studies) worse than having no AI involvement. People aren't refusing to acknowledge the "ai-reviewed" state, they are pushing back on people (you seem to be one of them) who advocate for that state as some sort of solution to the problem LLMs create (cranking out intern/junior quality code at a rate competent humans struggle to keep up with).

Also, we are seeing more and more examples of "ai-not-reviewed" projects being released and used because non-developers are publishing the output from LLM coding assistants/agents.

You seem to be the one struggling with context, nuance, and implicit assumptions, given that almost everyone seems to be on the same page and you are the one who is confused.

In short, people are talking about the issues with "ai-reviewed" and "ai-not-reviewed", and focusing more on the "ai-reviewed" portion because the "ai-not-reviewed" issues are obvious as you stated. You just don't seem to get them for some reason.


> Yes, junior developers exist. They generally consume more resources than they produce in value, with the hope that they will become productive at some later point.

I think this misrepresents the situation. Junior devs are not loss leading normal-devs-in-training, otherwise they wouldn’t be hired.


Junior devs are hired for potential, not current productivity. There's a reason they are mostly hired by very large organizations.

How many junior devs does it take to produce the same amount of production code as a senior dev? And how much of a senior dev's time do they consume while doing that?


Insane take


This is the common refrain from the anti-AI crowd, they start by talking about an entire class of problems that already exist in humans-only software engineering, without any context or caveats. And then, when someone points out these problems exist with humans too, they move the goalposts and make it about the "volume" of code and how AI is taking us across some threshold where everything will fall apart.

The telling thing is they never mention this "threshold" in the first place, it's only a response to being called on the bullshit.


It's not bullshit. LLMs lower the bar for developers, and increase velocity.

Increasing the quantity of something that is already an issue without automation involved will cause more issues.

That's not moving the goalposts, it's pointing out something that should be obvious to someone with domain experience.


Why is the "threshold" argument never the first thing mentioned? Do you not understand what I'm saying here? Can you explain why the "code slop" argument is _always_ the first thing that people mention, without discussing this threshold?

Every post like this has a tone like they are describing a new phenomenon caused by AI, but it's just a normal professional code quality problem that has always existed.

Consider the difference between these two:

1. AI allows programmers to write sloppy code and commit things without fully checking/testing their code

2. AI greatly increases the speed at which code can be generated, but doesn't nearly improve as much the speed of reviewing code, so we're making software harder to verify

The second is the more accurate picture of what's happening, but it comes off as much less sensational in a social media post. When people post the first framing, I discredit them immediately for trying to fear-monger and bait engagement rather than discussing the real problems with AI programming and how to prevent/solve them.
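
To put toy numbers on #2 (illustrative only, not measured):

    generation:       ~500 lines/hour from an agent
    careful review:   ~100 lines/hour from a human
    unverified gap:    500 - 100 = 400 lines/hour piling up

The bottleneck moves from writing the code to verifying it.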


I don't know who you're talking to, but the issue is an increased volume of lines of code without any increase in quality (or, often, a decrease). That's the definition of "code slop" as I see it used, and you seem to have created a strawman to beat on here.

Allowing people who have absolutely no idea about what they're doing to create and release a software product will produce more "code slop", just like AI produces more "article slop" on the internet.

I don't understand the distinction you are trying to draw between your two examples. Instance #1 happens constantly, and is encouraged in many cases by management who have no idea what programmers do beyond costing them a lot of money.

You can internally discredit whomever or whatever you like, but it doesn't change the fact that LLMs currently add very little value to software development at large, and it doesn't appear that there is a path to changing that in the foreseeable future.


Agreed. Even with their prodigious ability to crank out boilerplate and maybe get you started in the general direction, they do so at the cost of turning you into an abject moron. So definitely not a net win, if a win at all.


I found "intertwining" with a score of 3 also. Two instances of the word on the same sign, and then a false positive in the third pic.


Why are engineers so obstinate about this stuff? You really need a GUI built for you in order to do this? You can't take the time to just type up this instruction to the LLM? Do you realize that's possible? You can just write instructions like "Don't modify XYZ.ts under any circumstances". Not to mention all the tools have simple hotkeys to dismiss changes for an entire file if you really want to ignore changes to it. In Cursor you can literally select a block of text and press a hotkey to "highlight" that code to the LLM in the chat, and you could absolutely tell it "READ BUT DON'T TOUCH THIS CODE", directly tied to specific lines of code, which is literally the feature you are describing. BUT, you have to work with the LLM and tooling; it's not just going to be a button for you.
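
For example, a minimal rules-file sketch (Claude Code reads CLAUDE.md, Cursor reads .cursorrules; the paths and wording here are just illustrative):

    # CLAUDE.md -- project rules the agent sees on every request
    - Do NOT modify src/legacy/XYZ.ts under any circumstances.
    - Treat src/billing/ as read-only; propose changes in chat instead.
    - Ask before creating or deleting any file.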

You can also literally do exactly what you said with "going a step further".

Open Claude Code and run `/init`. Download Superwhisper, open a new file at the project root called BRAIN_DUMP.md, put your cursor in the file, activate Superwhisper, and talk stream-of-consciousness style about all the parts of the code and your own confidence level, with any details you want to include. Go to your LLM chat, tell it to "Read file @BRAIN_DUMP.md" and organize all the contents into a new file, CODE_CONFIDENCE.md. Tell it to list the parts of the codebase and give its best assessment of the developer's confidence in each part, given the details and tone in the brain dump. Delete the brain dump file if you want. Now you literally have what you asked for: an "index" of sorts that tells your LLM the parts of the codebase and developer confidence/stability/etc. From here you can just refer to that file in your project prompting.
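
A rough sketch of that session (file names from above; the exact prompt wording is just one way to phrase it):

    $ claude                      # in the project root
    > /init                       # generates CLAUDE.md for the repo
    # ...dictate into BRAIN_DUMP.md with Superwhisper...
    > Read @BRAIN_DUMP.md and organize it into CODE_CONFIDENCE.md:
    > list each part of the codebase with your best read of my
    > confidence in it, from the details and tone of the dump.
    $ rm BRAIN_DUMP.md            # optional cleanup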

Please, everyone, for the love of god, just start prompting. Instead of posting on Hacker News or Reddit about your skepticism, literally talk to the LLM about it and ask it questions; it can help you work through almost any of this stuff people rant about.


_all_ models I’ve tried have continuously had, and still have, problems with ignoring rules. I’m actually quite shocked you would write this if you have experience in the area, as it so clearly contrasts with my own experience.

Despite explicit instructions in all sorts of rules and .md’s, the models still make changes where they should not. When caught, they innocently say ”you’re right, I shouldn’t have done that, as it directly goes against your rule of <x>”.

Just to be clear, are you suggesting that currently, with your existing setup, the AIs always follow the instructions in your rules and prompts? If so, I want your rules, please. If not, I don’t understand why you would diss a solution which aims to hardcode away some of the LLM prompt-interpretation problems that exist.


I am by no means an AI skeptic. It is possible to encode all sorts of things into instructions, but I don’t think the future of programming is every individual constructing and managing artisan prompts. There are surely some new paradigms to be discovered here. A code locking interface seems like an interesting one to explore. I’m sure there are others.


Or, you know, chmod -w XYZ.ts
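
A filesystem-level lock the agent can't talk its way around, assuming a POSIX shell:

    chmod a-w XYZ.ts   # drop write permission; the agent's edits now fail
    chmod u+w XYZ.ts   # give write access back when you want to edit again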


Are you paying for the higher-end models? Do you have solid system prompts and guidance in place for proper prompt engineering? Have you started to practice any auxiliary forms of context engineering?

This isn't a magic code genie; it's a very complicated and very powerful new tool that you need to practice using over time in order to get good results from.


That's the beauty of the hype: anyone who cannot replicate it is “holding it wrong”.


Or maybe it works well in some cases and not others?


It ain’t a magic code genie. And developers don’t spend most of their day typing lines of code. Lots of it is designing, figuring out what to build, understanding the code, weighing maintenance considerations, and adhering to the style of whatever file you’re in. All these agents need local context and still spit out junk.


> it's a very complicated and very powerful new tool that you need to practice using over time in order to get good results from.

Of course this is and would be expected to be true. Yet adoption of this mindset has been orders of magnitude slower than the increase in AI features and capabilities.


guy 1: I put money in the slot machine everyone says wins all the time and I lose

you: HAVE YOU PUT MORE TOKENS IN???? ARE YOU PUTTING THEM IN THE EXPENSIVE MACHINES???

super compelling argument /s

if you want to provide working examples of "prompt engineering" or "context engineering", please do, but "just keep paying until the behavior is impressive" isn't winning me as a customer

it's like putting out a demo program that absolutely sucks and promising that if I pay, it'll get good. why put out the shit demo and give me this impression, then, if it sucks?


The way I ended up paying for Claude Max was that I started on the cheap plan, it went well, then it wanted more money, and I paid because things were going well.

Then it ran out of money again, and I gave it even more money.

I'm in the low 4 figures a year now, and it's worth it. For a day's pay each year, I've got a junior dev who is super fast, makes good suggestions, and makes working code.


> For a day's pay each year

For anyone trying to do the back-of-the-napkin math with $1,000 as the "low 4 figures" per year, treated as one day's salary: the baseline salary where this makes sense is about ~$260,000/yr? Is that about right, lordnacho?
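
Sketch of the arithmetic, assuming ~260 working days a year:

    annual cost    ≈ $1,000 (low 4 figures)
    one day's pay  = salary / 260
    salary         ≈ $1,000 × 260 = $260,000/yr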


Yeah, I thought that was a reasonable ballpark number. I mean, it probably makes sense to pay a lot more for it. A grand is probably well within the range where you shouldn't care about it, even if you only get a base salary and it's a terrible year with no bonus.

And that's not to say AI tools are the real deal, either. It can be a lot less than a fully self-driving dev and still be worth a significant fraction of an entry-level dev.


I assume it's after tax, too...

