
The VLDB paper mentioned is https://www.vldb.org/pvldb/vol16/p1601-budiu.pdf.

Abstract:

"Incremental view maintenance has been for a long time a central problem in database theory. Many solutions have been proposed for restricted classes of database languages, such as the relational algebra, or Datalog. These techniques do not naturally generalize to richer languages. In this paper we give a general solution to this problem in 3 steps: (1) we describe a simple but expressive language called DBSP for describing computations over data streams; (2) we give a general algorithm for solving the incremental view maintenance problem for arbitrary DBSP programs, and (3) we show how to model many rich database query languages (including the full relational queries, grouping and aggregation, monotonic and non-monotonic recursion, and streaming aggregation) using DBSP. As a consequence, we obtain efficient incremental view maintenance techniques for all these rich languages."

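For a flavor of the idea, here's a toy sketch of the incremental approach the paper generalizes (my own illustration, not the paper's DBSP formalism): keep a view up to date by consuming weighted deltas (Z-set style, +1 per inserted row, -1 per deleted row) instead of recomputing it from the full table.

    from collections import defaultdict

    def apply_delta(view, delta, key):
        """Update a counts-per-key view in place from a batch of (row, weight) changes."""
        for row, weight in delta:
            k = key(row)
            view[k] += weight
            if view[k] == 0:
                del view[k]  # drop groups whose count returns to zero
        return view

    # Hypothetical usage: maintain SELECT cust, COUNT(*) ... GROUP BY cust incrementally.
    view = defaultdict(int)
    apply_delta(view, [({"cust": "a"}, +1), ({"cust": "b"}, +1)], key=lambda r: r["cust"])
    apply_delta(view, [({"cust": "a"}, -1)], key=lambda r: r["cust"])
    print(dict(view))  # {'b': 1}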

I asked one of the core devs the same question at a recent event, and he (1) said that some people in finance have done related things and (2) suggested using the Ray Slack to connect with developers and power users who might have helpful advice.

I agree this is a very interesting area to consider Ray for. There are lots of projects/products that provide core components that could be used, but there's no widely used library. It feels like one is overdue.


I also like Racket for a good few of these reasons, but the tooling has a lot of sharp edges and limitations in practice. It would be unfair to expect a full JetBrains/Microsoft IDE experience, but it surprised me that a Lisp descendant would come with such an underpowered default REPL. I also find, whatever the technical merits of Scribble compared to docstrings and Markdown, that the documentation for most Racket packages is poorly written compared to that of even minor packages in more common languages.


Thanks. This focuses on the most common daily need, with really clean design and display of information, and live updates as a flourish. Definitely sparks joy.


I can be pretty cynical about corporate politics, but that kind of consultant-IC interview is almost always safe. It's wildly against the consultant's interests for serious conflict to break out, and it's easy to tone down inflammatory comments. Also, if the IC's ideas are likely to rile up management, the consultant is going to drop them. There are also potential upsides: if senior management is decent, they will be glad that their employees have good suggestions, and a good fraction of consultants will happily credit the ICs once management has bought in. Of course some will take all the credit themselves, but that's still a low-downside proposition.


The Modern Data Stack / MLOps product space was succinctly described by one actually-technical CEO as "vending into ignorance"; the author corroborates this with a commendably candid take:

>Imagine it’s 2021, peak MDS, and you meet the CDO of a large bank. “Oh cool,” she says, “you’re the CEO of a tech company. What does your product do?” What do you say?

>“We build a tool that leverages the power of the cloud to apply standard SQL and software engineering best practices to the historically mundane (but critical!) job of data transformation.”

>“We’re the standard for data transformation in the modern data stack.”

>I will tell you that, empirically, option #2 is more effective.

This tallies with what I've seen from a lot of enterprise CxOs and their teams as technology hype moved from big data and blockchain to data science/machine learning.

There is so much to write about this, but I'll just recommend "Life Cycle of a Silver Bullet" (http://freyr.websages.com/Life_Cycle_of_a_Silver_Bullet.pdf), which deserves more attention than it's had on HN.


Is MLOps more or less of a thing than prompt engineer?


MLOps is deploying, monitoring and (re)training ML models. It sits in the DevOps and data engineering space.

Prompt engineering is making generative AI do what you want by crafting the right context. I would put it somewhere in the software and data engineering space, given that the work is mostly integrating it into applications. MLOps comes into play if you have your own trained or tuned model.
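As a toy illustration of "crafting the right context" (no particular vendor's API; the prompt wording and retrieval inputs are made up), much of the engineering is in how the prompt is assembled before any model is called:

    def build_prompt(question, context_snippets):
        # Combine instructions, retrieved context and the user's question
        # into a single prompt string.
        context = "\n".join(f"- {s}" for s in context_snippets)
        return (
            "You are a support assistant. Answer using only the context below.\n"
            f"Context:\n{context}\n\n"
            f"Question: {question}\n"
            "If the context is insufficient, say so."
        )

    print(build_prompt("How do I reset my password?",
                       ["Passwords are reset from the account settings page."]))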


More of a thing, but it's mostly DevOps.


Indeed, and tools like dep-tree provide a combination of 1) making module structure visible, 2) making rules about that structure concrete, and 3) automatically checking for rule violations.

These all help to lower the cognitive barrier to learning and maintaining the code base effectively. For developers new to the code base they help with learning, and for those more experienced they help with ongoing design and maintenance.

Most long-lived code bases I've seen have adopted or built such tooling at some point, often customized to the code base. For example, in one large code base (c. 250 devs) we built tooling that simulated and helped optimize the changes needed to implement a major refactor of the overall module structure.
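To make 2) and 3) concrete, here's a toy checker in the same spirit (not dep-tree's actual config or output; the layer names and rules are hypothetical): declare which top-level packages may import which, then scan the tree and report violations.

    import ast
    import pathlib

    # Hypothetical layering rules: each package may import only the packages listed.
    ALLOWED = {
        "ui": {"services", "models"},
        "services": {"models"},
        "models": set(),
    }

    def check(root):
        root = pathlib.Path(root)
        for path in root.rglob("*.py"):
            layer = path.relative_to(root).parts[0]
            if layer not in ALLOWED:
                continue
            tree = ast.parse(path.read_text(), filename=str(path))
            for node in ast.walk(tree):
                if isinstance(node, ast.Import):
                    names = [alias.name for alias in node.names]
                elif isinstance(node, ast.ImportFrom) and node.module:
                    names = [node.module]
                else:
                    continue
                for name in names:
                    top = name.split(".")[0]
                    if top in ALLOWED and top != layer and top not in ALLOWED[layer]:
                        print(f"{path}: {layer} -> {top} violates the declared structure")

    check("src")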


> If I say "Dyson" and you're in the UK, you think of

Mathematical Physics :)


There is a more general algorithm called Delta Debugging (https://en.m.wikipedia.org/wiki/Delta_debugging), introduced in Andreas Zeller's 1999 paper "Yesterday, my program worked. Today, it does not. Why?" (https://doi.org/10.1145/318774.318946). It's recommended reading if you're interested in the topic.
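For anyone who wants the gist without reading the paper, here's a compact sketch of ddmin (simplified from Zeller's presentation; still_fails is whatever test reproduces your failure): repeatedly try smaller chunks and their complements, refining the granularity when neither shrinks the input.

    def ddmin(failing_input, still_fails, n=2):
        """Shrink a failing input (a list) while preserving the failure."""
        while len(failing_input) >= 2:
            chunk = len(failing_input) // n
            subsets = [failing_input[i:i + chunk]
                       for i in range(0, len(failing_input), chunk)]
            reduced = False
            for i, subset in enumerate(subsets):
                complement = [x for j, s in enumerate(subsets) if j != i for x in s]
                if still_fails(subset):            # failure isolated in this chunk
                    failing_input, n, reduced = subset, 2, True
                    break
                if still_fails(complement):        # this chunk is irrelevant
                    failing_input, n, reduced = complement, max(n - 1, 2), True
                    break
            if not reduced:
                if n >= len(failing_input):        # finest granularity reached
                    break
                n = min(len(failing_input), n * 2)
        return failing_input

    # Toy usage: the "failure" is that the input contains 7.
    print(ddmin(list(range(10)), lambda xs: 7 in xs))  # [7]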


Coincidentally, today I noticed a surprisingly high number of file accesses from Tenable's Nessus software, caused by it reading a megabyte-sized config file one character at a time without buffering, with each read going through Win32's ReadFile.

It seems that negligence is not in short supply.
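To illustrate why that pattern hurts (in Python rather than the Win32 calls Nessus makes; the file name and size here are made up): unbuffered one-byte reads pay a system call per character, while a buffered reader touches the same data in a handful of large reads.

    import io
    import os
    import time

    path = "config.bin"                       # hypothetical ~1 MB config file
    with open(path, "wb") as f:
        f.write(os.urandom(1_000_000))

    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:  # unbuffered: one syscall per read(1)
        while f.read(1):
            pass
    print("unbuffered 1-byte reads:", time.perf_counter() - start)

    start = time.perf_counter()
    with open(path, "rb") as f:               # default buffering: a few large reads
        while f.read(io.DEFAULT_BUFFER_SIZE):
            pass
    print("buffered reads:         ", time.perf_counter() - start)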


Classic MS community engagement post right here.

