>but where do you think LLMs pick up these patterns?
>LLMs are able to capably mimic the common patterns that how to write books have suggested for the last 100 years as ways to make your writing more "impactful" and attention-grabbing. So are humans. They learned it from watching us.
Don't forget that LLMs (at least the "instruct" versions) undergo substantial post-training to align them with the authors' objectives, so they are not a 100% pure reflection of the distribution seen on the internet. For example, it's common for LLMs to respond with "You're absolutely right!" to every second message, which isn't what humans usually do. It's a result of some kind of RLHF: human labelers liked to hear that they're right, so they preferred answers containing such phrases, and those responses became amplified. People recognize LLM-generated writing because LLMs' pattern distribution is different from the actual pattern distribution found in articles written by humans.
ASML, which is based in the Netherlands, produces the chip-making machines that TSMC and everyone else use to produce said chips. I think they've got some expertise too :)
For logins, we're already used to the fact that they're expected to be in Latin characters. Having them in the native alphabet is more trouble than it's worth (one system supports it, another breaks, etc.; it's easier to remember one login, in Latin, that works across systems). I'd be irritated, though, if I couldn't use my native alphabet for the first name/last name in the user profile.
>Please provide your name exactly as it is in your government documents.
>This is extremely important. Failure to comply will lead to termination of your service with no refund, criminal prosecution, our CEO calling you in tears and a hitman being informed about your last known location
Heh, I had this exact thing when getting certified at Microsoft (remotely). They required me to enter my name exactly as it appears on my government ID (which doesn't contain a single Latin character), but their registration site... simply blocked any characters outside the Latin alphabet. I had to obtain an international travel passport just to get the "official" transliteration of my name.
I've gotten a visa to a country that doesn't use Latin characters. My name got transliterated. At the bottom of the visa there's a machine-readable field that uses ASCII characters, and my name lost a character (an OU became just a U).
It's also fun when the official transliteration rules suddenly change: a visa/passport issued in one year has a different name in Latin than a passport issued in another year. I was once two separate people :)
Orcaman is a very straightforward implementation (just sharded RW locks and backing maps), but it limits the number of shards to a fixed 32. I wonder what the benchmarks would look like if the shard count were increased to 64, 128, etc.
It might still make a difference due to reduced contention: with more shards, the chance of two or more goroutines hitting the same shard is lower. In my mind, the only downside to having more shards is the upfront allocation cost, so it might slow down only the smallest benchmark.
Were those 16 million sessions used only for alignment, chat format, reasoning, etc.? Or could a base model be trained on them too? If a single session is at least 32k tokens, that's already 0.5 trillion tokens to train on. Interesting.
It's something we debated in our team: if there's an API that returns data based on filters, what's the better behavior if no filters are provided - return everything or return nothing?
The consensus was that returning everything is rarely what's desired, for two reasons: first, if the system grows, allowing API users to return everything at once can be a problem both for our server (lots of data in RAM when fetching from the DB => OOM, and additional stress on the DB) and for the user (the same problem on their side). Second, it's easy to forget to specify filters, especially in cases like "let's delete something based on some filters."
So the standard practice now is to return nothing if no filters are provided, and we pay attention to it during code reviews. If the user really does want all the data, you can add pagination to your API. With pagination, it's very unlikely for the user to accidentally fetch everything, because they must explicitly work with pagination tokens, etc.
Another option, if you don't want pagination, is to have a separate method named accordingly, like ListAllObjects, without any filters.
Returning an empty result in that case may cause a more subtle failure. I would think returning an error would be a bit better as it would clearly communicate that the caller called the API endpoint incorrectly. If it’s HTTP a 400 Bad Request status code would seem appropriate.
>allowing API users to return everything at once can be a problem both for our server (lots of data in RAM when fetching from the DB => OOM, and additional stress on the DB)
You can limit stress on RAM by streaming the data. You should ideally stream rows for any large dataset. Otherwise, like you say, you are loading the entire thing into RAM.
Buffering up the entire data set before encoding it to JSON and sending it is one of the biggest sources of latency in API based software. Streaming can get latencies down to tens of microseconds!
I like your thought process around the ‘empty’ case. While the opposite of a filter is no filter, to your point, that is probably not really the desire when it comes to data retrieval. We might have to revisit that ourselves.
How about returning an error? It's the generic "client sent something wrong" bucket. Missing a required filter param is unambiguously a client mistake according to your own docs/contract → client error → 4xx family → 400 is the safest/default member of that family.
I run Qwen3-32b locally without any tools (just llama.cpp) and it can do basic arithmetic on smaller numbers (like 134566), but I didn't check it on much larger ones. I'm not at the PC right now, but trying it via OpenRouter on much larger numbers overflows the context and it stops without giving a result :)
Not a security researcher, but I once found an open Redis port without auth on a large portal. Redis was used to cache all views, so one could technically modify any post and add malicious links, etc. I found the portal admin's email, emailed them directly, and got a response within an hour: "Thanks, I closed the port." I didn't need a bounty or anything, so sometimes it may be easier and safer to just skip all those management layers and communicate with an actual fellow engineer directly
Dunno, in my Go+HTMX project, it was pretty trivial to add SSE streaming. When you open a new chat tab, we load existing data from the DB and then HTMX initiates SSE streaming with a single tag. When the server receives an SSE request from HTMX, it registers a goroutine and a new Go channel for this tab. The goroutine blocks and waits for new events on the channel. When something triggers a new message, a dispatcher saves the event to the DB, then iterates over the registered Go channels and sends the event to each. On a new event in the tab's channel, the tab's goroutine unblocks and passes the event from the channel to the SSE stream. HTMX handles inserting the new data into the DOM. When a tab closes, the goroutine receives the notification via the request's context (another Go primitive), deregisters the channel and exits. If the server restarts, HTMX automatically reopens the SSE stream. It took probably one evening to implement.
8b models are great at converting unstructured data to a structured format. Say, you want to transcribe all your customer calls and get a list of issues they discussed most often. Currently with the larger models it takes me hours.
A chatbot which tells you various fun facts is not the only use case for LLMs. They're language models first and foremost, so they're good at language processing tasks (where they don't "hallucinate" as much).
Their ability to memorize various facts (with some "hallucinations") is an interesting side effect which is now abused to make them into "AI agents" and whatnot, but they're just general-purpose language processing machines at their core.