henridf's comments

A timelapse of a building being moved in 2012 in Zurich: https://vimeo.com/42984680

It's taken from close up so you see how a concrete platform has been poured under the building, then the whole thing pushed on rails.


The banner seen at the start of the video can be translated as "Starting May 22nd, this building is going crazy." It's a play on words that also means the building is being moved starting May 22nd.


Prometheus is a (rare) recent example of a significant, thriving open-source project that is community-based. Not quite as broad in scope as something like Elasticsearch though.


Depending on what "community-driven" means, I'd like to add Python, Blender, Jupyter, Django ...

I'd say "thriving community-based open source projects" are rare in the sense that most open source projects don't thrive (especially if you count every open-licensed repository on GitHub), but there are tons of examples.


But it started out as a SoundCloud project...


For those wishing to see SQLite used to publish data sets:

Datasette (https://github.com/simonw/datasette) is a new tool to publish data on the web. It uses SQLite under the hood.
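As a minimal sketch of the workflow (table name and data invented for illustration), you build an ordinary SQLite file with the standard library and then point Datasette at it:

```python
import sqlite3

# Build a tiny dataset as a plain SQLite database. Datasette can then
# serve it read-only with: datasette cities.db
conn = sqlite3.connect(":memory:")  # use "cities.db" to produce a real file
conn.execute("CREATE TABLE cities (name TEXT, population INTEGER)")
conn.executemany("INSERT INTO cities VALUES (?, ?)",
                 [("Zurich", 434000), ("Geneva", 202000)])  # rough figures
conn.commit()
rows = conn.execute(
    "SELECT name FROM cities ORDER BY population DESC").fetchall()
```

Since the published artifact is just a .db file, it can be versioned, diffed with SQLite tooling, and queried by any SQLite client, not only Datasette.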


OP here, my bad indeed. I was fiddling around trying to figure out the right way to shorten the title and carelessly ended up with this. No intention to mislead, but I agree this changes the meaning in a bad way.

Most importantly I don't see a way to revert; hopefully the moderators will see this and do it instead.


We've just updated the title from “Massive crater under Greenland’s ice points to climate-altering impact of humans”.


I've sent an e-mail to the moderators asking them to fix the title. (The moderators can be reached at hn@ycombinator.com.)


I'm not sure which tools the author has tried, but the Prometheus monitoring system supports both histograms and quantiles.

There's a good discussion of the respective merits of each at https://prometheus.io/docs/practices/histograms/#quantiles


Histograms require you to configure the buckets into which your samples are allocated, and to allocate the buckets appropriately you need to know what your expected values are: that is, to measure latency, you need to know your latency. While this can work (I think most of us have, or can obtain, a clear idea of what our typical latencies are, and can configure buckets around that), it is inelegant. I feel like I would rather have X=percentile, Y=latency, but such a bucketing gives you X=latency, Y=request count. Still useful, but only as informative as you are good at choosing buckets. (There is the histogram_quantile function, but I am unclear whether its assumption of a linear distribution within buckets really makes much sense, since most things have long-tail distributions, and thus I would think that once you get past the main "hump" of typical latencies, most samples would cluster towards the lower end of any particular bucket.)
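To make the interpolation concern concrete, here is a small self-contained sketch (not Prometheus code; bucket bounds and counts are made up) of a histogram_quantile-style estimate over cumulative bucket counts:

```python
def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative bucket counts, assuming a
    linear distribution of samples within each bucket (the same assumption
    Prometheus's histogram_quantile makes)."""
    # buckets: (upper_bound, cumulative_count) pairs, sorted by bound
    rank = q * buckets[-1][1]
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            # Linear interpolation within the bucket containing the rank.
            frac = (rank - prev_count) / (count - prev_count)
            return prev_bound + frac * (bound - prev_bound)
        prev_bound, prev_count = bound, count
    return buckets[-1][0]

# 100 made-up observations, clustered in the 0.1-0.5s bucket with a long tail.
buckets = [(0.1, 20), (0.5, 90), (1.0, 98), (5.0, 100)]
p50 = histogram_quantile(0.50, buckets)  # ~0.27s
p99 = histogram_quantile(0.99, buckets)  # 3.0s
```

Note that the p99 estimate lands mid-bucket at 3.0s even if the two tail samples actually sat just above 1.0s, which is exactly the long-tail distortion described above.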

I am not clear on how Summaries actually work; they appear to report the count and sum of the thing they're monitoring. That is, if one were to use them for latencies (and the docs do indeed suggest this), a summary would report values like "3" and "2000ms", indicating that 3 requests took 2000ms in total; how is one supposed to derive a latency histogram/profile from that?
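For what it's worth, from count and sum alone the only latency figure you can recover is an average over a window, which is why no histogram or profile can be derived from them. A sketch with hypothetical scrape values:

```python
# Hypothetical scrapes of a summary's _count and _sum series, 60s apart.
count_t0, sum_t0 = 1000, 450.0   # total requests, total seconds
count_t1, sum_t1 = 1180, 540.0

# The delta in sum divided by the delta in count is the mean latency over
# the window (what rate(..._sum[1m]) / rate(..._count[1m]) yields in
# PromQL) -- an average, not a percentile.
avg_latency = (sum_t1 - sum_t0) / (count_t1 - count_t0)  # 0.5 s/request
```

(Prometheus summaries can also be configured to export client-side quantiles directly, but those bring their own cross-instance aggregation problems.)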

Prometheus's fatal flaw here, IMO, is that it requires sampling of metrics. That is, things like CPU, which are essentially a continuous function that you're sampling over time. But its collection method/format doesn't seem to really work that well for when you have an event-based metric, such as request latency, which only happens at discrete points. (If no requests are being served, what is the latency? It makes no sense to ask, unlike CPU usage or RAM usage.)

To me, ideally, you want to collect up all the samples in a central location and then compute percentiles. Anything else seems to run afoul of the very "doing percentiles on the agents, then 'averaging' percentiles at the monitoring system" critique pointed out in the video posted in this sibling comment: https://news.ycombinator.com/item?id=18194507
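That critique can be illustrated with a small sketch (all numbers invented): averaging per-agent p99s can wildly overstate the true p99 of the merged samples.

```python
import math
import statistics

def percentile(samples, p):
    """Nearest-rank percentile: the value at rank ceil(p/100 * n)."""
    s = sorted(samples)
    k = math.ceil(p / 100 * len(s)) - 1
    return s[k]

# Two agents with different latency profiles (seconds); data is made up.
agent_a = [0.1] * 98 + [5.0] * 2   # a couple of slow outliers
agent_b = [0.1] * 100              # uniformly fast

avg_of_p99s = statistics.mean([percentile(agent_a, 99),
                               percentile(agent_b, 99)])   # (5.0 + 0.1) / 2
true_p99 = percentile(agent_a + agent_b, 99)               # 0.1
```

With the raw samples pooled centrally, the outliers fall beyond the 99th rank of the combined population, so the true p99 is 0.1s, while the average of per-agent p99s is 2.55s. There is no general way to combine precomputed percentiles correctly.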


Your points are largely valid, but Prometheus is a monitoring solution, not a scientific or financial tool. Certain tradeoffs are made since the monitoring aspect comes first and being scientifically correct comes second. Hence pull vs. push, for instance.


Is M3QL still being used? And if so, any thoughts on how M3QL and PromQL will respectively be used over time?


M3QL is still the primary query language internally; however, the query service is still being rebuilt in open-source M3, so it's not available just yet.

As to why we've kept it: there's such a large number of functions we've added over time that don't really have an alternative in, say, PromQL, so we aim to offer both PromQL and M3QL in open source land and let end users use either one. I don't think either is better than the other; they're just different flavors: function expansion vs. pipe-based, etc.

Here’s the list of functions, you can kind of get a sense of what I’m talking about just by looking at the list: http://m3db.github.io/m3/query_engine/architecture/functions...


We are working on making WarpScript (http://www.warp10.io) able to fetch data from M3, so you can benefit from 850+ functions for analyzing your time series data.


We ran into a number of issues with Helm when deploying: failures that forced us to roll back, with the rollbacks then failing too, requiring manual changes to unblock.

I think that for third-party packages and related templating (which seems like the original use-case) it works well, but I would be wary of using it for high-res deploys of our own stuff.


Minikube is working great for us. Makes it easy to run something that's pretty close to a production stack on a dev machine.

Typically with locally-built (dev) images rather than those from the registry that CI writes to, but other than that the k8s manifests are the same.


Sysdig has had a bunch of nice posts over the last 2-3 years:

https://sysdig.com/blog/tag/technical/


And details on the wrk (load gen) setup too, please.


The pipelining benchmark is identical to that of Japronto (another, very similar thing posted here on HN a few days ago). Japronto's repo on GitHub holds the wrk pipelining script used.

I haven't had the time to add configurations for every server tested (esp. Apache and NGINX), but the main point here is to showcase the perf difference between Node.js and Node.js with µWS.


How did you not have the time? Apologies, I might be missing something, but was this an emergency work assignment?

If not, then you should have taken the time to provide the information for a fair comparison with the other stacks.

As it is, you're just asking the community to take your word for it.


We don't need to take his word for it. It's open source, so we can run the tests ourselves.

I think it's completely understandable that he threw in the others, probably with default configs, without caring much about them, since they weren't the point of the writeup.


Does this pass all the HTTP tests in the Node.js repo? If not, the perf diff is irrelevant.


It has a mostly-compatible API, but strict conformance doesn't seem to be the goal here. If your application does not make use of obscure features provided by core http (and if it does, it could probably be refactored to do without them anyway), then it's a free boost in performance.


Is the req object a readable stream? is the res object a writable stream? How do you handle backpressure with this mostly-compatible API?

