Hacker News | jvz's comments

I'm evaluating Flow for CDC. Do you support logical decoding messages from `pg_logical_emit_message`? This would allow us to add audit metadata[^1].

[^1]: https://www.infoq.com/articles/wonders-of-postgres-logical-d...


No. But this is neat, and at a glance it looks straightforward to add. Happy to discuss further!


Goodhart's Law in action. Any "product quality" metric will be gamed to the point of uselessness as soon as it gains traction. In other words, services like this one can only work as long as nobody knows about them.

The only stable solution is for each person to curate and maintain their own set of sources, so that there are no high-value metrics for marketers to target. Exactly the opposite of what this service is trying to do.


Looks interesting; thanks for writing this!

Does it support nested structs? Sometimes that's more natural than a flat struct.


Yes - you can use nested dataclasses. But you will lose some of the nice features.


I was intrigued, but after visiting your website I'm ambivalent. The only way to truly combat shilling is to fully embrace a model where each person sees reviews only from their own personal network of sources; this drastically reduces the "eyeball multiplier" that makes mass media so attractive to marketers.

But seeing things like "Top 3 Recommended" on your website makes me believe you're falling into the same trap everyone else is, at least partially. I don't care how sophisticated your detection of paid content is; you simply can't win a direct war against these pathogens (marketers). The only way to win is not to play, i.e. don't provide a platform for mass reach in the first place. Mass media is really a type of monoculture, with all the same weaknesses.

Unrelated: how do you plan to solve the problem of identifying and classifying essentially every product in existence? Resolving duplicates, slight variants, etc. is a very hard problem, as is categorization.


Our application does only show reviews from your personal network. Well, close. The app has a Q&A format (think Instagram meets Stack Overflow), and you see both questions and answers from your network. You can also see the user that asked a question that someone in your network has answered, and see users who answered the question someone in your network asked. In this way, you can be exposed to some new content. But reviews can't really spread virally as things stand now.

Re: product identification...actually people tend to shop from the same places (think of the product coverage of the top 5000 retailers). We don't have those top 5000 yet, but we're working on it. We get most of our product data from affiliate networks, who offer up that data to drive sales back to their site. For whatever we can't import and internalize, we have a "search anything" feature which allows the user to use a web view to navigate to a product page and import the product.

You're right, resolving duplicates and variants is a very tricky problem that can become incredibly complicated. Right now we use some very basic heuristics like normalized name and brand, ASIN, GTIN, EAN, UPC, etc. But actually, we don't need things to be perfect. So long as a user can get to a product that is more or less what they want to recommend, they're happy. Also, when you are searching for a product to recommend (this is not "browsing," it's when you are trying to answer a recommendation with a specific product) we boost products that have already been recommended. This way we can get users who are looking for a particular product to recommend the same instance of that product. We also focus our efforts on cleaning up the data for products that have already been recommended, which is a much smaller subset than our total catalog.
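A rough sketch of the kind of heuristics described above, not the actual implementation: it prefers a standard identifier when one is present, falls back to a normalized brand and name, and boosts products that have already been recommended when ranking search results. The field names (`gtin`, `ean`, `upc`, `asin`, `brand`, `name`, `score`, `id`) and the `boost` weight are assumptions made up for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def product_key(product):
    """Prefer a standard identifier; fall back to normalized brand + name."""
    for field in ("gtin", "ean", "upc", "asin"):  # hypothetical field names
        value = product.get(field)
        if value:
            return (field, value.strip())
    return ("name", normalize(product["brand"]) + "|" + normalize(product["name"]))


def group_duplicates(products):
    """Bucket catalog entries that share the same key."""
    groups = defaultdict(list)
    for product in products:
        groups[product_key(product)].append(product)
    return dict(groups)


def rank(candidates, recommendation_counts, boost=1.0):
    """Sort search candidates, boosting already-recommended products."""
    def score(product):
        return product["score"] + boost * recommendation_counts.get(product["id"], 0)
    return sorted(candidates, key=score, reverse=True)
```

One limitation of this sketch: a record with a GTIN and a record of the same product without one land in different buckets, which is tolerable if, as noted above, the catalog does not need to be perfect.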

Our data is stored in Neo4j; we find its structure to be well suited to a product catalog and taxonomy, and it allows us to derive relationships between products. The process of improving our catalog is an ongoing task, and one that will likely never end.


No metric can escape gaming when you apply it to rational actors (Campbell's Law / Goodhart's Law). Blind devotion to metrics is just as bad as no metrics at all.


Have you evaluated compression algorithms that support custom dictionaries, like zstd? You could generate a compression dictionary for each domain, or just for those above a certain size.
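To illustrate the idea, here is a minimal sketch using the preset-dictionary support in Python's standard `zlib` module as a stand-in for zstd (zstd's trained dictionaries work on the same principle but are a different API). The sample dictionary and page contents are made up; in practice the per-domain dictionary would be built from real samples.

```python
import zlib


def compress_with_dict(data: bytes, dictionary: bytes) -> bytes:
    """Compress data, letting the encoder reference the shared dictionary."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dict(blob: bytes, dictionary: bytes) -> bytes:
    """Decompress data that was compressed with the same dictionary."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


# Hypothetical boilerplate shared by every page on one domain.
domain_dict = (
    b'<html><head><title>Example Domain</title></head><body>'
    b'<div class="nav">Home | About | Contact</div>'
)
page = domain_dict + b"<p>Hello</p></body></html>"

with_dict = compress_with_dict(page, domain_dict)
without_dict = zlib.compress(page, 9)
print(len(with_dict), len(without_dict))
```

The same dictionary has to be available at decompression time, so it needs to be stored and versioned alongside the compressed data.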


Why not increase the minimum font size instead?


Reviews on sites like Amazon or Yelp are a monoculture: everyone sees the same set of reviews and the same ratings. This creates leverage that makes it worthwhile for shills to spend large amounts of effort "infecting" those sites with bad information, since once they find effective ways of doing so it affects everyone.

So the solution must be a review system that works like Twitter, where each user has a unique "view" composed of sources they've selected, directly or indirectly. This diversity would make infection much more difficult and less rewarding for shills and other attackers.

I'm still trying to figure out the best design for such a system. The requirements and usage patterns would be quite different from Twitter's, and I'm not aware of any existing attempts that I can learn from.


Yes, tie the review more personally to the reviewer. I guess this is more like the influencer culture we see on Instagram and YouTube.


> I'm still trying to figure out the best design for such a system.

Easy answer: distributed consensus, à la Bitcoin.


Fake reviews can be seen as an instance of Goodhart's law, where the metric is the rating or score of the business. Initially those ratings may have high correlation with something real, let's say the "quality" of the business. But the more people rely on those scores and the reviews underlying them, the more incentive businesses have to game the system, which destroys the original correlation between ratings and quality.

A big part of the problem with review systems is the one-to-many nature of nearly all of them: when a person posts a review, that review and its score can be seen by everyone. This leverage makes it very efficient for businesses to game the system, as a small amount of fake information can "infect" the purchasing decisions of a large number of users.

So, one alternative might be a many-to-many review system where you only see reviews and ratings from your network of friends/follows (and maybe friends-of-friends, to increase coverage). So essentially Twitter, but with tools and UI that focus on reviews and ratings. That way, fake reviews could only affect a limited number of people, making the cost/benefit calculus much less attractive for would-be astroturfers and shills.
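A minimal sketch of that idea, assuming a simple follow graph stored as a dict and reviews stored as plain dicts (the names `follows`, `author`, `product`, and `rating` are all hypothetical): each user's rating for a product is computed only from reviewers within two hops of them.

```python
from collections import deque


def network(follows, user, max_hops=2):
    """Return everyone reachable from `user` within `max_hops` follow edges."""
    hops = {user: 0}
    queue = deque([user])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for followed in follows.get(current, ()):
            if followed not in hops:
                hops[followed] = hops[current] + 1
                queue.append(followed)
    del hops[user]  # the user is not their own source
    return set(hops)


def visible_reviews(reviews, follows, user, max_hops=2):
    """Keep only reviews written by someone in the user's network."""
    sources = network(follows, user, max_hops)
    return [review for review in reviews if review["author"] in sources]


def personal_rating(reviews, follows, user, product, max_hops=2):
    """Average rating of a product as seen from one user's network."""
    scores = [
        review["rating"]
        for review in visible_reviews(reviews, follows, user, max_hops)
        if review["product"] == product
    ]
    return sum(scores) / len(scores) if scores else None
```

Under this model a fake review only moves the rating for users whose network includes its author, and everyone else's view is unchanged.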


Are you familiar with Overtone[1]? I'm no musician, but as a developer I'd think building your own musical abstractions via code would be better than any visual IDE.

[1] http://overtone.github.io/


I am, yes. Would you make the same claim about, e.g., image editing? If not, why not?


Images are inherently visual, and they seem to have more irreducible complexity than music. Music has a lot of structure and repetition that seems like it could benefit from abstraction (as Chris Ford demonstrates with some Bach in this entertaining talk[1]).

On the other hand, there are vector graphics editors that let you work in terms of shapes and paths instead of pixels. Would you say you are trying to build the musical equivalent of that?

[1] https://www.infoq.com/presentations/music-functional-languag...


Yes, I would say that vector-based editors are a good analogy.


Others have made it. Probably why you have access to Python from Blender.

https://www.youtube.com/watch?v=xJZyXqJ6nog

https://docs.blender.org/manual/en/dev/editors/python_consol...


I'm aware. However, the scripting is more of an extra feature than the main interface. The OP was saying that code > UI, but I disagree. It's useful, but not superior.


OP was saying that code > UI "for developers". Something like one of Bret Victor's UIs[0] integrated in GIMP would be quite nice for a dev who can draw, or a painter who can code.

[0] https://vimeo.com/36579366 (circa 3 min. mark)


Are you talking about photographs?


Or illustrations.

