If your application is pricing sensitive, check out DeepInfra.com - they have a variety of models in the pennies-per-mil range. Not quite as fast as Mercury, Groq or SambaNova though.
(I have no affiliation with this company aside from being a happy customer the last few years)
DeepInfra is amazing in terms of price - really, they have the Qwen3 embedding model for $0.002 per million tokens. That's an order of magnitude cheaper than most alternatives with better benchmark scores. But the P99 latency is slow and the variance is huge, which is problematic for latency-sensitive workloads. If they can fix that, using them will be a no-brainer. DeepInfra does tend to have the lowest prices of any API provider.
Sounds great to me! Just be sure the AI is capable of taking on various personas because not everyone you meet is a believer in nonviolent communication.
A tip, Miami guy to Miami guy - the best devs avoid PHP like the plague. The remaining PHPers are desperate workaday sorts who will not be bringing the latest hotness to your project. Hire an Elixir or Haskell expert and you're gonna get a much more well-traveled coder.
Kudos on your bold undertaking! I've been a side-lined QNX admirer for some time, though not a potential user in most cases. A good next step would be a series of blog posts where the author takes on common types of enthusiast projects and unpacks how QNX's strengths can be applied in those scenarios.
Do you find you really need that level of “resolution” with memories?
On our [1] chatbots we use one long memories text field per chatbot <-> user relationship.
Each bot response cycle suggests a new memory to add as part of its prompt (along with the message, etc.).
Then we feed that new memory, together with the existing memories text, to a separate “memory archivist” LLM prompt cycle that re-summarizes the whole thing, yielding a replacement for the stored memories with the new memory folded in.
Maybe overly simplistic, but easy to manage and pretty inexpensive. The archiving part is async and fast. The LLM seems pretty good at sussing out what's important and what isn't.
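For anyone curious, the loop described above can be sketched roughly like this. This is a minimal sketch, not the parent's actual code: `call_llm` is a stand-in for whatever completion API you use (here it's a dummy that just concatenates, so the example runs), and the prompt wording and function names are invented.

```python
# Sketch of the one-text-field memory pattern: one memories string per
# bot<->user relationship, rewritten by an "archivist" LLM each cycle.
# ARCHIVIST_PROMPT and call_llm are illustrative stand-ins.

ARCHIVIST_PROMPT = (
    "You maintain a running summary of facts about a user.\n"
    "Existing memories:\n{existing}\n\n"
    "New memory to incorporate:\n{new}\n\n"
    "Rewrite the memories as one concise text, keeping what matters "
    "and dropping what doesn't."
)

def call_llm(prompt: str) -> str:
    # Stand-in: a real system would call your model provider here.
    # This dummy just appends the new memory so the sketch is runnable.
    existing = prompt.split("Existing memories:\n")[1].split("\n\nNew memory")[0]
    new = prompt.split("New memory to incorporate:\n")[1].split("\n\nRewrite")[0]
    return (existing + "\n" + new).strip()

def archive_memory(stored_memories: str, new_memory: str) -> str:
    """The async 'archivist' step: fold one suggested memory into the
    single stored memories text, returning the replacement text."""
    prompt = ARCHIVIST_PROMPT.format(existing=stored_memories, new=new_memory)
    return call_llm(prompt)

memories = "Likes hiking. Allergic to peanuts."
memories = archive_memory(memories, "Mentioned a sister named Ana.")
print(memories)
```

The nice property is that storage stays O(1) per relationship - you only ever persist the one re-summarized string, and the archivist decides what survives each rewrite.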
I have already tried what you're doing, and it didn't perform well enough for me. I've been developing this project for two years now. Its memory isn't going to fit in a single prompt.
I imagine that your AI chatbots aren't as cheap or performant as they could be, given your potentially enormous prompts. Technical details aside, just like when talking to real people, it feels nice when they recall minor details you mentioned a long time ago.
If it's your personal assistant and has been helping you for months, it will pretty quickly start forgetting details and retain only a vague view of you and your preferences. So instead of being your personal assistant, it practically clusters your personality and gives you generic help with no reliance on real historical data.
I have read a lot of reports that the job market is pretty bad.
But, bad or good, I think all you can really do is keep trying! Don't let a few rejections stymie your long-term goals. Your family needs you to keep putting one foot in front of the other and applying to more and more places until you find that perfect role.
Maybe use this downtime to build yourself up: open source some stuff that defines you as a subject matter expert, or blog about some of your experiences, etc.
Wouldn't hurt to share your resume here too - lotta industry people lurking. :)
Sad day! This guy was a hilarious and talented writer. If anyone is looking for a book to pick up this weekend, I'd recommend checking out some of his work, especially if you like hard drinking Jewish nihilist detectives.
I loved Kinky. I first encountered him in a Washington Post interview in the early 1980s, in which he remarked "I'm searching for a lifestyle which does not require my presence." That's been my lodestar ever since. R.I.P. Kinkster.