I was wondering how they'd casually veer into social media and leverage their intelligence in a way that connects with the user. Like everyone else ITT, it seems like an incredibly sticky idea that leaves me feeling highly unsettled about individuals building any sense of deep emotions around ChatGPT.
My question is: given that Apple is one of the most valuable companies on the planet, they can (and surely have) hired some of the best designers in the world. Articles like this one and many others are saying what we all think, and every time a new beta comes out, it's strange to see some of the decisions that were made. The first beta made the lock screen _very hard_ to read if you had notifications. How was that missed? Or: keep Liquid Glass, but don't make the text bright blue so that it's hard to see. Or: trigger the frosted effect depending on what the background is.

I sincerely do find designers to be in a hard position (especially having worked directly with so many of them in the past), but a lot of these seem like novice mistakes. Maybe it's not even the design, it's the QA? I'm not even sure here. I'm by no means a designer, but I have to believe they've been testing this internally as much as we are, and for a long time now... I'd like to believe they aren't just changing UI elements on the fly based on what X / Twitter feels is good or bad.
Two theories. One is that Apple had to put something together quickly as a headliner because Apple Intelligence was clearly going to be a dud, so this is basically a hacked-together panic project.
Or someone high up has a Vision™, and they're so set on that Vision™ they're not listening to what underlings and users are saying.
Consider a parallel reality in which Apple did the next round of updates as a maintenance release and added some minor new features and UI tweaks. Would that have been a more positive outcome for the company?
My guess is there would have been some grumbling about not having anything new to offer, but also relief that bugs were being fixed. It would have been a bit of a non-event.
This seems more like a seismic negative event, with a lot of criticism from all quarters. (And some stanning, but less than usual.)
> My guess is there would have been some grumbling about not having anything new to offer, but also relief that bugs were being fixed. It would have been a bit of a non-event.
Depending on what Google has to say about Pixel & Gemini in August, I think it would have been much more than grumbling. Apple is in a damned-if-they-do, damned-if-they-don't situation. Under the surface of Liquid Glass, there really isn't anything new coming unless they have some hardware-limited features planned for the iPhone 17 launch.
It's clear this "redesign" was, as you said, a panic project to cover for not delivering on AI for a second year and having nothing to show at WWDC. Just coming out with "we fixed some bugs" would cause a PR shitstorm. Even more so if Google gets any further ahead integrating Gemini into Pixel with personal context, like what Apple wanted to achieve with Siri/AI, plus their own redesign (Material 3 Expressive, which is actually looking really nice IMO).
> This seems more like a seismic negative event, with a lot of criticism from all quarters.
Except from normal users/non-enthusiasts. My kid and her friends all installed the dev beta and are absolutely enamored with Liquid Glass and think it's the coolest thing ever. Mind you, these are generations of folks who weren't around for Vista/7 Aero, etc., and are now obsessed with that era from a fashion and design POV. "Frutiger Aero aesthetic" and all that. These are also people who would never switch platforms no matter what Apple does, because of iMessage and social status/social pressure, so Apple is in no danger of losing any marketshare over this unless Google/Android somehow becomes "cool" again and can generate enough social pressure amongst the youth.
My wife is emphatically not a tech enthusiast. She hates what she's seen on the screenshots and demos so far, and is dreading the moment when it's out and she'll have to update.
> It's clear this "redesign" was as you said, a panic project to cover for not delivering on AI, again for a second year and having nothing to show for WWDC.
Hm... So is their current system universally regarded as absolute shit, or what? Or does everyone[1] think it's pretty great now, but will switch to "it's shit!" immediately as of the WWDC?
Like, WTF is wrong with "We have a great system, it's still just as great, and even better now that we've worked mostly on stability and bugfixes."?
Are corporations nowadays all freaking Cinderella, or what?
___
[1]: Well, everyone who would consider buying into the Apple ecosystem.
Has to be. It has that Musky smell of banning yellow safety paint i.e. too stupid to be a team effort.
Legibility issues with translucency are such a basic thing, and I expect Apple designers have gone deep on the topic, e.g. mathematical models using human colour perception to determine hard limits for different type weights. I don't think the heavy frosting in past versions was an accident.
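One public example of that kind of model is the WCAG 2.x relative-luminance and contrast-ratio formula, sketched below in Python. (This is an illustration of the general idea; whatever internal models Apple may use are not public.)

```python
# WCAG 2.x contrast ratio between a foreground and background colour.

def _channel(c: int) -> float:
    """Convert one sRGB channel (0-255) to linear light."""
    c = c / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def luminance(rgb) -> float:
    """Relative luminance of an sRGB colour, 0.0 (black) to 1.0 (white)."""
    r, g, b = (_channel(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg) -> float:
    """Contrast ratio from 1:1 (identical) up to 21:1 (black on white)."""
    l1, l2 = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

# WCAG AA asks for at least 4.5:1 for body text; black on white is 21:1.
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # 21.0
```

A translucent material effectively makes the background colour unpredictable, which is why a ratio like this has to be checked against the worst-case backdrop, not just one wallpaper.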
But form over function is the core of why Apple is such a hugely successful company with just a few products. Focus on emotions rather than technical aspects; design over usability; fewer choices for users (just compare how much you can tweak in Android vs iOS). Removal of buttons, the 3.5mm jack, the SIM card, removable batteries, and so on and on, just in the phone area.
You may not like it (I certainly don't), but it's extremely well-received behavior. Humans are mostly emotional beings; just look at politics if you think otherwise.
I understand betas very well, but something as critical as that seems more fitting for an alpha. Liquid glass notifications on top of a bright wallpaper, bleeding together so you couldn't read or see anything shouldn't be in a beta.
The initial beta design had so many obvious issues that it's wild that it made it as far as it did. Hell, the readability of many UI elements was obviously terrible in the initial reveal, where you'd expect everything to be shown in the best possible light.
Obviously Apple can improve things for the final release (and it seems like they're taking some steps in that direction). But these issues should have been identified long before the beta was released, and the fact that they weren't does not inspire confidence.
The first beta often ships with core features missing or broken. It exists to get as many new features in front of third party developers as soon as possible, because Apple has very little time to accept feedback before they are locked in for shipping.
At the same time, there seems to be precious little time between when Apple decides a feature is going to ship in the next release, and when WWDC happens.
Even if there was common knowledge inside the company that a new UI was coming, it may not have been merged into mainline until closer to WWDC. At that point, individual teams would need to alter their code to build and be usable on top of the new UI as part of continuing their own development - but they were likely still focused on the death march for their own WWDC-launched features.
So are we not supposed to criticize a beta at all? How are they to know what to fix unless someone actually looks at it and makes clear what's wrong? Obviously they missed a pretty critical readability issue here.
You apparently have. Beta releases are supposed to be "we believe this to be ready to ship, but need to sort out bugs." What you describe has traditionally been alpha or even pre-alpha releases.
For those of us who have moved the vast majority of our Google searches to ChatGPT / only use Google periodically for one-off questions, is there still a reason to switch to Kagi?
What kind of search does Kagi excel at compared to Perplexity? I've been using Perplexity as a google replacement for about a year now, so I haven't tried Kagi, but seeing several people mention they use both has piqued my interest.
To me, personally, it's about the use case: searching for a page on the internet (Kagi) or researching a particular question or topic (Perplexity).
If I know what info I want (say, that particular blog post that mentioned topic XYZ, or the web page for a car dealership, or docs for something where the site search is worse than a web search), using Kagi is quicker and easier.
Edit to add: I just noticed I always use Kagi to search YouTube instead of YT's search directly (!yt <whatever>). I do the same for Wikipedia, Yahoo Finance, GoodReads, the Roger Ebert movie review site, and probably a few other sites I can't recall right now. I also have some sites boosted and others blocked, but I haven't tweaked that in a long time now...
If I'm interested in a topic but don't know exactly what or where, or want a longer explanation aggregated over multiple sources, then I use Perplexity. I usually fire off my question, let it work in the background, and come back a bit later.
That's just my use case, I don't presume that everyone else behaves the same. Also I just recently got access to Kagi's assistant on my plan, which may cannibalize my Perplexity use (we'll see).
For me ChatGPT is great when I don’t really know what I don’t know. I still end up having to do a google search after to verify that the AI result isn’t insane. So for me ChatGPT often is just adding an extra step.
The biggest complaint I (and several others) have is that we continuously hit the limit via the UI after even just a few intensive queries. Of course, we can use the console API, but then we lose ability to have things like Projects, etc.
Do you foresee these limitations increasing anytime soon?
Quick Edit: Just wanted to also say thank you for all your hard work, Claude has been phenomenal.
I'm sure many of us would gladly pay more to get 3-5x the limit.
And I'm also sure that you're working on it, but some kind of auto-summarization of facts to reduce the context in order to avoid penalizing long threads would be sweet.
I don't know if your internal users are dogfooding the product that has user limits, so you may not have had this feedback - it makes me irritable/stressed to know that I'm running up close to the limit without having gotten to the bottom of a bug. I don't think stress response in your users is a desirable thing :).
It takes time to grow capacity to meet growing revenue/usage. As parent is saying, if you are in a growth market at time T with capacity X, you would rather have more people using it even if that means they can each use less.
The problem with the API is that, as the documentation says, it could cost $100/hr.
I would pay $50/mo or something to be able to have reasonable use of Claude Code in a limited (but not as limited) way as through the web UI, but all of these coding tools seem to work only with the API and are therefore either too expensive or too limited.
> The problem with the API is that it, as it says in the documentation, could cost $100/hr.
I've used https://github.com/cline/cline to get a similar workflow to their Claude Code demo, and yes it's amazing how quickly the token counts add up. Claude seems to have capacity issues so I'm guessing they decided to charge a premium for what they can serve up.
+1 on the too expensive or too limited sentiment. I subscribed to Claude for quite a while but got frustrated the few times I would use it heavily I'd get stuck due to the rate limits.
I could stomach a $20-$50 subscription for something like 3.7 that I could use a lot when coding, and not worry about hitting limits (or I suspect being pushed on to a quantized/smaller model when used too much).
Claude Code does caching well, FWIW. Looking at my costs after a few coding sessions (totaling $6 or so), the vast majority is cache read, which is great to see. Without caching it'd be wildly more expensive.
Like $5+ was cache read ($0.05 vs $3 per million tokens), so it would have cost $300+.
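The arithmetic can be sketched in a few lines, assuming the quoted $0.05 vs $3 figures are per million tokens (real Anthropic pricing differs; check their docs before relying on these numbers):

```python
# Hypothetical per-million-token rates, taken from the figures quoted above.
INPUT_RATE = 3.00        # $/MTok for regular input tokens
CACHE_READ_RATE = 0.05   # $/MTok for cache reads

def session_cost(cached_mtok: float, fresh_mtok: float) -> float:
    """Cost of a session given millions of cached vs fresh input tokens."""
    return cached_mtok * CACHE_READ_RATE + fresh_mtok * INPUT_RATE

# ~100 MTok read from cache, vs the same tokens sent fresh every time:
with_cache = session_cost(100, 0)      # about $5
without_cache = session_cost(0, 100)   # about $300
```

Since agentic coding tools resend the whole conversation on every turn, almost all input tokens are cache hits, which is where the ~60x difference comes from at these rates.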
I paid for it for a while, but I kept running out of usage limits right in the middle of work every day. I'd end up pasting the context into ChatGPT to continue. It was so frustrating, especially because I really liked it and used it a lot.
It became such an anti-pattern that I stopped paying. Now, when people ask me which one to use, I always say I like Claude more than others, but I don’t recommend using it in a professional setting.
The gateway is integrated directly into our chat (https://glama.ai/chat). So you can use most of the things that you are used to having with Claude. And if anything is missing, just let me know and I will prioritize it. If you check our Discord, I have a decent track record of being receptive to feedback and quickly turning around features.
Long term, Glama's focus is predominantly on MCPs, but chat, gateway and LLM routing is integral to the greater vision.
I would love feedback if you're going to give it a try: frank@glama.ai
The issue isn't API limits, but web UI limits. We can always get around the web interface's limits by using the claude API directly but then you need to have some other interface...
The API still has limits. Even if you are on the highest tier, you will quickly run into those limits when using coding assistants.
The value proposition of Glama is that it combines UI and API.
While everyone focuses on either one or the other, I've been splitting my time equally working on both.
Glama UI would not win against Anthropic if we were to compare them by the number of features. However, the components that I developed were created with craft and love.
You have access to:
* Model switching between OpenAI, Anthropic, etc.
* Side-by-side conversations
* Full-text search of all your conversations
* Integration of LaTeX, Mermaid, rich-text editing
Ok, but that's not the issue the parent was mentioning. I've never hit API limits but, like the original comment mentioned, I too constantly hit the web interface limits particularly when discussing relatively large modules.
Your chat idea is a little similar to Abacus AI. I wish you had a similarly affordable monthly plan for chat only, but your UI seems much better. I may give it a try!
Who is glama.ai though? Could not find company info on the site, the Frank name writing the blog posts seems to be an alias for Popeye the sailor. Am I missing something there? How can a user vet the company?
As another commenter in this thread said, we are just a 'frontend wrapper' around other people's services. Therefore, it is not particularly difficult to add models that are already supported by other providers.
The benefit of using our wrapper is that you get a single API key and one bill for all your AI usage, and you don't need to hack together your own logic for routing requests between different providers, failovers, keeping track of costs, worrying about what happens if a provider goes down, etc.
The market at the moment is hugely fragmented, with many providers unstable, constantly shifting prices, etc. The benefit of a router is that you don't need to worry about those things.
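For what it's worth, the failover-and-billing logic a router like this takes off your hands can be sketched in a few lines of Python. (This is a hypothetical illustration of the pattern, not Glama's actual implementation; the provider names and `complete_fn` callables are made up.)

```python
# Minimal provider-failover router: try providers in order, skip any that
# error out, and keep a single cost ledger across all of them.

class Router:
    def __init__(self, providers):
        # providers: list of (name, complete_fn, dollars_per_mtok) tuples,
        # where complete_fn(prompt) returns (text, tokens_used).
        self.providers = providers
        self.costs = {}  # one ledger instead of one bill per provider

    def complete(self, prompt: str) -> str:
        for name, complete_fn, rate in self.providers:
            try:
                text, tokens = complete_fn(prompt)
            except Exception:
                continue  # provider down or erroring: fail over to the next
            self.costs[name] = self.costs.get(name, 0.0) + tokens / 1e6 * rate
            return text
        raise RuntimeError("all providers failed")
```

A production gateway layers retries, latency-based ordering, and per-model price tables on top, but the core shape is this fall-through loop.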
Scaling infrastructure to handle billions of tokens is no joke.
I believe they are approaching 1 trillion tokens per week.
Glama is way smaller. We only recently crossed 10bn tokens per day.
However, I have invested a lot more into UX/UI of that chat itself, i.e. while OpenRouter is entirely focused on API gateway (which is working for them), I am going for a hybrid approach.
The market is big enough for both projects to co-exist.
This is also my problem. I've only used the UI with the $20 subscription; can I use the same subscription for the CLI? I'm afraid it's like AWS API billing, where there's no limit to how much I can use and then I get a surprise bill.
It is API billing like AWS - you pay for what you use. Every time you exit a session we print the cost, and in the middle of a session you can do /cost to see your cost so far that session!
What I really want (as a current Pro subscriber) is a subscription tier ("Ultimate" at ~$120/month ?) that gives me priority access to the usual chat interface, but _also_ a bunch of API credits that would ensure Claude and I can code together for most of the average working month (reasonable estimate would be 4 hours a day, 15 days a month).
i.e I'd like my chat and API usage to be all included under a flat-rate subscription.
Currently, Pro doesn't give me any API credits to use with coding assistants (Claude Code included?), which is completely disjointed. And I need to be a business to use the API still?
Honestly, Claude is so good, just please take my money and make it easy to do the above !
I don’t think you need to be a business to use the API? At least I’m fairly certain I’m using it in a personal capacity. You are never going to hit $120/month even with full-time usage (no guarantees of course, but I get to like $40/month).
$1500 is 100 million output tokens, or 500 million input tokens for Claude 3.7.
The entire LOTR trilogy is ~0.55 million tokens (1,200 pages, published).
If you are sending and receiving the text equivalent of several hundred copies of the LOTR trilogy every week, I don't think you are actually using AI for anything useful, or you are providing far too much context.
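A quick back-of-envelope check of those numbers, using the $3/MTok input and $15/MTok output rates that the $1500 figures above imply:

```python
# Working out what $1500/month buys at the quoted Claude 3.7 rates.
budget = 1500.0
output_rate = 15.0 / 1_000_000  # $ per output token
input_rate = 3.0 / 1_000_000    # $ per input token

output_tokens = budget / output_rate  # 100 million output tokens
input_tokens = budget / input_rate    # 500 million input tokens

lotr_tokens = 0.55e6  # rough token count of the LOTR trilogy
trilogies_as_input = input_tokens / lotr_tokens  # roughly 900 copies
```

So the $1500 figure really does correspond to sending on the order of 900 LOTR trilogies as input, which supports the point that a flat subscription at that scale would be hard to justify.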
You can do this yourself. Anyone can buy API credits. I literally just did this with my personal credit card using my gmail based account earlier today.
1. Subscribe to Claude Pro for $20 month
2. Separately, Buy $100 worth of API credits.
Now you have a Claude "ultimate" subscription where the credits roll over as an added bonus.
As someone who only uses the APIs, and not the subscription services for AI, I can tell you that $100 is A LOT of usage. Quite frankly, I've never used anywhere close to $20 in a month, which is why I don't subscribe. I mostly just use text, though, so if you do a lot of image generation, that can add up quickly.
I don't think you can generate images with Claude. I just asked it for a pink elephant: "I can't generate images directly, but I can create an SVG representation of a pink elephant for you." And it did it :)
But I still hit limits. I use Claudemind with JetBrains stuff and there is a max of input tokens (I believe). I am 'tier 2', but it doesn't look like I can go past this without an enterprise agreement.
Can't wait to try this. What's amazing to me is that when this was revealed just one short month ago, the AI landscape looked very different than it does today with more AI companies jumping into the fray with very compelling models. I wonder how the AI shift has affected this release internally, future releases and their mindset moving forward... How does the efficiency change, the scope of their models, etc.
If they were the same, I would have expected explicit references to o3 in the system card and how o3-mini is distilled or built from o3 - https://cdn.openai.com/o3-mini-system-card.pdf - but there are no references.
Excited at the pace all the same. Excited to dig in. The model naming all around is so confusing. Very difficult to tell what breakthrough innovations occurred.
Yeah - the naming is confusing. We're seeing o3-mini. o3 yields marginally better performance given exponentially more compute. Unlike OpenAI, customers will not have an option to throw an endless amount of money at specific tasks/prompts.
I really don't think this is true. OpenAI has no moat because they have nothing unique; they're using mostly other people's (like Transformers) architectures and other companies hardware.
Their value prop (moat) is that they've burnt more money than everybody else. That moat is trivially circumvented by lighting a larger pile of money, and less trivially by lighting the pile more efficiently.
OpenAI isn't the only company. The tech companies being massively outspent by Microsoft on H100 purchases are the ones with a moat. Google and Amazon, with their custom AI chips, are going to have better performance per cost than others, and that will be a moat. If you want the same performance per cost, you need to spend years of effort making your own chips (= moat).
That's a shame on Google, Apple, Samsung, etc. Voice and other activation methods should be open to any app that claims to be an assistant. It's an ugly form of gatekeeping.
When you want to use AI in business, you need some guarantee that the integration will not break because the AI company goes down or because of some breaking change in a year. There is a reason MSFT is in business. Similarly, you will not buy from Google because of their habit of killing products, and you will not buy some unknown product just because it is 5% cheaper. OpenAI has a strong brand at the moment, and this is their thing, until companies go to MSFT or AMZ to use their services with the ability to choose any model.
Capex was the theoretical moat, same as TSMC and similar businesses. DeepSeek poked a hole in this theory. OpenAI will need to deliver massive improvements to justify a 1 billion dollar training cost relative to 5 million dollars.
I don't know if you are, but a lot of people are still comparing one DeepSeek training run to the entire costs of OpenAI.
The DeepSeek paper states that the $5 million number doesn't include development costs, only the final training run. And it doesn't include the estimated $1.4 billion cost of the infrastructure/chips DeepSeek owns.
Most of OpenAI's billion dollar costs is in inference, not training. It takes a lot of compute to serve so many users.
Dario said recently that Claude was in the tens of millions (and that it was a year earlier, so some cost decline is expected), do we have some reason to think OpenAI was so vastly different?
Anthropic’s CEO was predicting billion-dollar training runs for 2025. Current training runs were likely in the tens or hundreds of millions of dollars USD.
Inference capex costs are not a defensive moat as I can rent gpus and sell inference with linear scaling costs. A hypothetical 10 billion dollar training run on proprietary data was a massive moat.
It is still curious though as far as what is actually being automated?
I find huge value in these models as an augmentation of my intelligence and as a kind of cybernetic partner.
I can't think of anything that can actually be automated though in terms of white collar jobs.
The white-collar model test case I have in mind is a bank analyst under a bank operations manager. I have done both jobs in the past, but something is really lacking in the idea of the operations manager replacing the analyst with a reasoning model, even though DeepSeek right now annihilates the reasoning of every bank analyst I ever worked with.
If you can't even arbitrage away the average bank analyst, there might be these really non-intuitive no-AI-arbitrage conditions in white-collar work.
I don’t want to pretend I know how bank analysts work, but at the very least I would assume that 4 bank analysts with reasoning models would outperform 5 bank analysts without.
AICrete | Richmond, CA | Hybrid / Remote (North America) | Full-time | Frontend / Backend / Fullstack Engineer
At AICrete, we are committed to revolutionizing the global concrete and construction industry. Leveraging AI, machine learning, computer vision, and sophisticated automation, AICreteOS is our innovative solution designed to enhance sustainability, profitability, efficiency, and productivity in real-time. Proudly standing as the first company in the world to introduce AI to the concrete materials industry, we are at the forefront of technological advancement, setting new standards and pioneering changes that drive real impact.
We're a small team of 10-15 and we're looking for great engineers to help us build, grow, and maintain products used by some of the largest concrete customers across the US (and soon abroad), as well as engineers interested in working at the intersection of sustainability, green tech, and AI.
There's a really great epidemiologist who has been publishing data-driven articles on Substack on the subject of COVID, from wastewater samples to control groups across the world. Her insights are often super valuable; she only speaks to what she knows, and the rest goes off data she comes across. I'd really recommend her thoughts on anything COVID-related.
As someone who has moved to the Middle East (from the Bay Area), a large portion of this society heavily relies on Facebook, particularly for FB groups and FB Marketplace. Aside from that, this entire society basically functions on Instagram for any news / personal connections. But as it relates to Facebook, groups are a great way to connect with people and ask questions, and since we don't really have a good alternative to Craigslist here, FB Marketplace fills that void.
That said, I'm sure people here use Facebook for other reasons, but anecdotally, this is what I'm seeing. Instagram and Facebook aside, the most used app I've found here is WhatsApp, so the society here is deeply integrated into the Facebook web of services
Can someone enlighten me on why manufacturers who have experience with software, such as Microsoft, don’t create their own OS for their mobile phones? I recognize that it’s easier and faster to iterate to just use Android with some sprinkles on top, but even if it meant spending 4-5 years developing it, the potential market share is absolutely massive. I can see the first year or two would not be great since app developers would need to build their apps, but after that initial hurdle, then I’d imagine it wouldn’t be as bad. After that, it comes down to sales, marketing and mindshare adoption.
To me regarding LG, I was never a fan of their phones, but less competition is always bad in my book.
Edit: Yes, I remember the Windows Phone and its failures; I was thinking more of starting a newer OS these days rather than several years ago.
And there was also webOS, which I know was used in at least one tablet (maybe some phones too) and which is currently owned by LG.
Just think about how manufacturers are struggling to find success with their own smartphones. Imagine trying to do that with the added burden of developing a custom operating system.
And before that, Bada. Samsung has a bad history of promising things to users of Bada and Tizen that never shipped. This is one of the reasons I would never buy a Samsung phone again.
Microsoft, of course, famously did create their own phone OS. It wasn’t bad either, but among other problems they were a little late to market and there were no apps for it. A real chicken and egg problem. Not worth the engineering effort to build an app for a tiny marketshare and no one wants a smartphone that can’t even hail an Uber.
Having developed for Windows Phone 8, 8.1, and 10, I can say with certainty that it was bad in a plethora of ways, and Microsoft's constant over-promising and under-delivering made it worse.
Compared to competing OSes at the time? I don’t remember any being particularly fun. (But also I didn’t spend much time in the Windows Phone ecosystem)
But I think you’re right that MS lost focus on developers on mobile. Ironic given "developers, developers, developers" was the literal mantra.
Just to expand on Windows Mobile: for the problems it had (probably the biggest was trying to enter a duopoly), I have yet to hear of anyone who had one who didn't like it. I never used one, but those who did seem to have liked it better than iOS and Android.
Microsoft tried; they weren't able to capture that market share. Much of that probably comes down to it being tough to catch up to the 3rd party app ecosystem Apple and Android each have.
I believe Huawei is being forced to develop their own OS due to the threat of US sanctions. However, I think it'd be a losing battle to try to steal enough market share from iOS/Android. They'd simply copy (or one-up) your competitive advantage, leaving you little room to compete.
I mean, "fake it till you make it" is a good strategy (although you probably shouldn't lie). It is a sound engineering decision to rewrite Android piece by piece, like replacing Linux with LiteOS.