
Ironically, this is EXACTLY what the journald receiver for OpenTelemetry does, which, as they noted, is written in Go.

Specifically because, by design, you're only supposed to use that OR the C bindings, since they want the ability to change the internal format when necessary.


Why not just use pouchdb? It's pretty battle-tested, and it syncs with couchdb if you want a path to a more robust backend.

edit: https://pouchdb.com/


Scale, really. GoatDB easily handles hundreds of thousands of items being edited in realtime by multiple users.


Hundreds of thousands of items and multiple users could be done on a $5 Pi Zero 2 W (1 GHz quad-core A53) with the C++ standard library and a mutex.

People were working at this scale 30 years ago on 486 web servers.


I swear we've been going backwards for the past 15 years


Doing concurrent editing AND supporting offline operation?


What do you mean by "offline operation"? Which part is non-trivial?


Your server/network goes down, but you still want to maintain availability and let your users view and manipulate their data. So now users make edits while offline, and when they come back online you discover they made edits to the same rows in the DB. Now what do you do?

The problem really is about concurrency control. A DB creates a single source of truth, so it's either available or it isn't. But with GoatDB we have multiple sources of truth which are equally valid, and a way to merge their states after the fact.

Think about what Git does for code: if GitHub somehow lost all their data, every dev in the world still has a valid copy and can safely restore GitHub's state. GoatDB does the same but for your app's data rather than source code.
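
To make that concrete, here's a toy sketch of how the conflict gets detected (illustrative only, not GoatDB's actual API): each replica remembers the base state it branched from, so a three-way comparison finds the fields that actually need merging.

    // Toy three-way conflict detection; not GoatDB's actual API.
    type Row = Record<string, unknown>;

    function conflictingFields(base: Row, server: Row, local: Row): string[] {
      const keys = new Set([...Object.keys(server), ...Object.keys(local)]);
      return [...keys].filter(
        (k) =>
          server[k] !== base[k] && // the server changed it
          local[k] !== base[k] &&  // this replica changed it too
          server[k] !== local[k]   // and they disagree
      );
    }

    conflictingFields({ qty: 0 }, { qty: 1 }, { qty: 2 }); // ["qty"]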


> So now users make edits while offline, and when they come back online you discover they made edits to the same rows in the DB. Now what do you do?

Store changes in a queue as commands and apply them in between reads if that's what you want. This is really simple stuff. A few hundred thousand items and a few users is not a large scale or a concurrency problem.
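
Something like this, roughly (all names made up, just to show the shape of it):

    // Sketch of the command-queue approach; all names are made up.
    type Command = { rowId: string; field: string; value: unknown };
    type Table = Map<string, Record<string, unknown>>;

    const queue: Command[] = [];

    // While offline, edits are simply appended as commands.
    function recordEdit(cmd: Command): void {
      queue.push(cmd);
    }

    // On reconnect, drain the queue in order before serving reads.
    function applyQueued(table: Table): void {
      for (const { rowId, field, value } of queue) {
        const row = table.get(rowId) ?? {};
        row[field] = value; // last writer (by queue order) wins
        table.set(rowId, row);
      }
      queue.length = 0;
    }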


Yup. Go ahead and try it, then you'll discover that:

- The queue introduces delays, so this doesn't play nice with a modern collaborative editing experience (think Google Docs, Slack, etc.)

- Let's say change A set a field to 1, and change B set the same field to 2. GoatDB allows you to easily get either 1, 2, or 3 (the sum), or apply a custom resolution rule (sketched below)

Your only choices to solve this before GoatDB were Operational Transformation, raw CRDTs, or differential synchronization. GoatDB combines CRDTs with commit graphs so it can do things other approaches can't, at unmatched speed.
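
To make the 1/2/3 example concrete, a resolution rule boils down to something like this (a sketch, not GoatDB's actual API):

    // Sketch of pluggable per-field conflict resolution; not GoatDB's API.
    type Resolver = (base: number, a: number, b: number) => number;

    const keepA: Resolver = (_base, a) => a;          // -> 1
    const keepB: Resolver = (_base, _a, b) => b;      // -> 2
    const sumDeltas: Resolver = (base, a, b) =>
      base + (a - base) + (b - base);                 // -> 3 when base is 0

    // base = 0, change A wrote 1, change B wrote 2:
    sumDeltas(0, 1, 2); // 3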


> Go ahead and try it,

I tried it and much more a long time ago.

> The queue introduces delays, so this doesn't play nice with a modern collaborative editing experience

Things that can be done millions of times per second per core don't "introduce delays" that a handful of people are going to see.

> unmatched speed

Are you seriously trying to say that the database you created in a scripting language that uses linear scanning of arrays is 'unmatched' compared to high-performance C++? You may have other features, but you have no benchmarks, and the scenario you were bragging about is trivial.


> Things that can be done millions of times per second per core don't "introduce delays" that a handful of people are going to see.

Oh but they can't. If you tried it, then you surely know that both OT and CRDTs need to consider the entire change history at some key points in order to derive the current value. Diff sync doesn't suffer from the same issue; however, the way it keeps track of client shadows introduces writes on the read path, making it horribly expensive to run at scale.

> Are you seriously trying to say that the database you created in a scripting language that uses linear scanning of arrays is 'unmatched' compared to high-performance C++?

It's not about the language, but about the underlying algorithm. Yes, JS is slower, and surely a linear scan is slower than typical DB queries. But what GoatDB does, which is quite unique today, is resume query execution from the point where a query last ran, so you get super efficient incremental updates, which are very useful when running on the client side (clients tend to issue the same queries over and over again).
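
Roughly, the idea looks like this (an illustrative sketch over an append-only change log, not GoatDB's actual internals):

    // Illustrative sketch of a resumable query; not GoatDB's internals.
    interface Item { id: string; done: boolean; }

    class IncrementalQuery {
      private cursor = 0;                     // last log position processed
      private hits = new Map<string, Item>(); // cached result set

      constructor(private predicate: (item: Item) => boolean) {}

      // Re-running scans only the entries appended since the last run,
      // instead of rescanning the whole log.
      run(log: Item[]): Item[] {
        for (; this.cursor < log.length; this.cursor++) {
          const item = log[this.cursor];
          if (this.predicate(item)) this.hits.set(item.id, item);
          else this.hits.delete(item.id);
        }
        return [...this.hits.values()];
      }
    }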


I'm not sure what the point of all this is. Linear scanning of arrays does not scale; this is basic computer science. JavaScript is going to run at 1/10th the speed of a native language at best. You don't have any benchmarks and are bragging about stuff that was typical 30 years ago. You realize that people have done shared document editing for decades and that every video game keeps a synced state, right?

The most important thing here is benchmarks. If you want to claim you have "unmatched" speed, you need benchmarks.


so can couch/pouch? (pouch is a façade over leveldb on the backend and client-side storage in your browser)

have you done benchmarks to compare the two?

i know from personal experience leveldb is quite performant (it's what chrome uses internally), and the node bindings are very top notch.


GoatDB is web scale. PouchDB isn't web scale.


But how many goats does pouchdb have? I'm betting 0.


you can fit a lot of goats into a pouch, depending on the size of the pouch


"A pouch is most useful when it is empty" - Confuseus


[flagged]


You can do whatever you want, but if you reach out to other people because you want them to use it, you'd better be able to convince them why.


> they scratch non-stick pans, which also are horrible for your health.

cast iron, stainless steel.


Transmit is a file transfer client (like FTP). It needs access to your entire drive because you might want to copy something to/from anywhere on your drive.


Google Docs was originally OT-based as well. I'm not sure about the current state of it.


If you’re debugging something simple or non-distributed, this product isn’t for you.

If you’re working on anything distributed, log aggregation becomes a must. But, also, if you’re working on anything distributed and you’re looking at logs, you’re desperate. Distributed traces are so much higher quality.


When I formed these opinions I was working on Materialize, which is basically the polar opposite of "simple and non-distributed". However, it was still quite common that I knew exactly which process was doing something weird and unexpected.


Maybe it’s the difference between tracking a bug (abnormal operation) vs understanding behavior of a complex system (normal operation)?


Yup, and the reason no one markets something like "tail the logs for server X" is that, if you're talking in the context of an individual server, you're too small for anyone to care about.


I've got logs from hundreds of servers that I use standard tools to look at, and that's a small system. Centralising logs has been a thing for decades.


Which is fine, I'm just saying you're not the target market for the big observability vendors.

The current generation of observability tools is built for distributed systems that are basically too complex to reason about directly, so you need other ways of monitoring and debugging them. When you have tens of thousands of ephemeral containers running hundreds of services, you can't just look at some logs for a server to understand what's going on (ignoring the fact that servers aren't even a primitive in this system).

Tens of GBs of logs a day just doesn't move the needle on pricing. They want the customers that are going to generate seven figures in revenue, and those customers aren't talking about aggregating logs from a few hundred servers.


Sorry, I did plenty of "distributed" tracing back in the day and this is just not the case. I can't help but feel like you're rationalizing after the fact, as if you need this for diagnosing anything "distributed" or "complicated".

Distributed anything is actually easier in most cases because you will always have input and output. Sure, if you're debugging a complicated and coordinated "dance" between two concurrent threads/processes, then yeah, fully agreed, but then you're deep in uncharted territory and you need all the help you can get.


Arg this article is wrong.

TrueType was FREELY licensed. You could use TrueType fonts in Windows 3.1. (I know, I was there.)

What Apple didn't license was their advanced type tech (QuickDraw GX?), which allowed for further refinements to glyph positioning.


Thank you for this clarification, I was scratching my head at this, wondering if I had misremembered my entire childhood desktop publishing experience. IIRC the relative openness of TrueType encouraged/forced Adobe to open up their previously proprietary Type 1 fonts.


As a postscript (see what I did there?), Apple did eventually donate a bunch of QuickDraw GX tech/innovations to OpenType.


nah, that’s not true at all. have a look at ‘rich-text’[1] which allows for transforms on metadata in a separate stream from the main content. it’s the same basic algo used for OT on plain text.

(i was the cto at a startup which used this to create a multi-user text editor with rich text support in 2015ish)

1: https://github.com/ottypes/rich-text
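
to give a feel for it, here's the shape of those ops using the quill-delta package (it implements the same flat op/attribute model the rich-text type is built on). formatting is just attribute metadata on flat ops, no tree:

    // illustration using the quill-delta package, which implements the
    // same flat op/attribute model the rich-text OT type is built on
    import Delta from 'quill-delta';

    // "hello" rendered bold: formatting rides along as attributes,
    // not as a nested tree node
    const doc = new Delta().insert('hello', { bold: true });

    // two concurrent edits: one italicizes the first 3 chars,
    // the other inserts text at the start
    const a = new Delta().retain(3, { italic: true });
    const b = new Delta().insert('oh, ');

    // transform rewrites a so it applies after b -- the core OT step
    const merged = doc.compose(b).compose(b.transform(a, true));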


What's not true at all? That more powerful rich text editors need to rely on a tree structure?


yes. that is not true. you can build a rich text editor with simple ot. no tree necessary.


I agree. Yes, you can. Quill is the example here.

Actually, back in 2015 when we started prototyping CKEditor 5, we started with this approach as well. Our goal from the beginning was to combine real-time editing capabilities with an engine capable of storing and rendering complex rich-text structures (nested tables, complex nested lists, other rich widgets, etc.). We quickly realized that a linear structure was going to be a huge bottleneck. In the end, if you want to represent trees, storing them as a linear structure is counterproductive.

So, we went for a tree model. That made many things in the engine an order of magnitude harder (OT being one). But I chose to encapsulate this complexity in the model rather than let it leak into particular plugins.

In fact, from what I remember, https://github.com/quilljs/quill/issues/117 (e.g. https://github.com/quilljs/quill/issues/117#issuecomment-644...) is a good example of issues that we avoided.

I also talked to companies that built their platforms on top of Quill. One of them ended up gluing together countless Quill instances to power their editor and overcome the limitations of the linear data model, but is now looking for a way to rebuild their editor from scratch due to the issues (performance, complexity, stability).

So, yes. You can implement a rich-text editor based on a linear model. But it has its immediate limitations that you need to take into consideration.


this already happened to some extent. the “blessed” way to develop github is in vscode (from ms), using codespaces (from ms), running on azure (from ms). vim/emacs users can use the terminal (although the codespace and port forwarding, at first, had to be done via vscode exclusively) but your entire toolstack needs to be installed each time you launch a new one.

collaborating with anyone at ms already meant you were using teams to some extent.

(former github developer)


Simple answer: GitHub was doing most of this work because SHA-1 is a non-allowed hash type for FIPS compliance, which mattered since Microsoft had landed the US DoD JEDI contract.

The JEDI contract was cancelled in 2021, so the work on that workstream never continued.

source: former github developer


Clarification: SHA-1 is under review but still allowed. The next revision of FIPS 180-4 will certainly start the clock on retiring it, but that's a years-long process.


More specifically, the current deadline is end of 2030: “Modules that still use SHA-1 after 2030 will not be permitted for purchase by the federal government.” (https://www.nist.gov/news-events/news/2022/12/nist-retires-s...)

