If you check their latest updates, it says: “Update - This incident is impacting R2, Durable Objects, Cache Reserve, Key Transparency Auditor, Stream, Logpush and Images.”
Regarding time and timezones, it's best not to prematurely optimise. Ideally, store both. It's much more valuable to track the timezone from your source than to have to worry about it later. ISO 8601 is the universal standard.
A case where you see this issue play out is the horrible Strava-to-device sync. You track in a different timezone and they store and visualise activities against some weird client-side profile setting, which causes morning runs to render at 11pm. They only added timezone support later on. They totally mess this up.
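A minimal sketch of what "store both" can look like; the values are only illustrative, and the second line simply writes the same instant with its source offset preserved:

```typescript
// The same instant, once as a bare unix epoch (no source zone) and once as
// an ISO 8601 string that keeps the offset it was recorded in.
const epoch = 1638097466;                      // instant only
const recorded = '2021-11-28T12:04:26+01:00';  // same instant, source offset kept

console.log(new Date(epoch * 1000).toISOString()); // 2021-11-28T11:04:26.000Z
console.log(new Date(recorded).getTime() / 1000);  // 1638097466
```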
First of all, I don't see how using unix epoch timestamps can be called "premature optimization"; it's a pretty widely used and standardized way of saving timestamps.
Secondly, if you're in "SQLite Strict Land" and you don't have access to abstractions like "tell the database this is a date", then the best way of storing timestamps is unix epoch. I would be extremely surprised if databases didn't already do this behind the scenes when they abstract away things like dates for the user.
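For example, a STRICT SQLite table with an INTEGER epoch column; this is just a sketch assuming the better-sqlite3 driver and an SQLite build with STRICT table support, and the table/column names are made up:

```typescript
import Database from 'better-sqlite3';

const db = new Database(':memory:');
db.exec(`CREATE TABLE events (id INTEGER PRIMARY KEY, created_at INTEGER NOT NULL) STRICT`);

// Store seconds since the epoch; STRICT keeps the column a plain integer.
db.prepare('INSERT INTO events (created_at) VALUES (?)').run(Math.floor(Date.now() / 1000));

const row = db.prepare('SELECT created_at FROM events').get() as { created_at: number };
console.log(new Date(row.created_at * 1000).toISOString()); // convert only when reading
```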
Thirdly, what is the value of tracking "source timezone"? This is solving a non-existent problem: if you are getting the timestamp in unix epoch from the source, and you're storing it in unix epoch, the "source timezone" is already known: it's UTC just like all unix timestamps are.
Timezones are fundamentally a data presentation concern, and I strongly believe they should not be part of the source data.
> You track in a different timezone and they store and visualise activities against some weird client-side profile setting, which causes morning runs to render at 11pm. They only added timezone support later on. They totally mess this up.
This is exactly the type of issue that happens because you involve timezones in your source data. If all applications and databases only concern themselves with unix timestamps, and the conversion to a specific timezone only happens in the application layer upon display, this type of issue simply does not happen, because the time "1638097466" (as I'm writing this) is the exact same time everywhere on the globe.
(Of course, similar issues can happen due to user error if two applications have different time zone settings and the user mistakenly enters a timestamp in the wrong TZ, but that's definitely not solved by making time zones a part of the data itself.)
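A sketch of that display-layer conversion, using the epoch value quoted above; the locales and zones here are arbitrary:

```typescript
// One stored instant, rendered per client only at display time.
const instant = new Date(1638097466 * 1000);

console.log(instant.toLocaleString('en-US', { timeZone: 'America/New_York' }));
console.log(instant.toLocaleString('de-DE', { timeZone: 'Europe/Berlin' }));
console.log(instant.toLocaleString('en-IN', { timeZone: 'Asia/Kolkata' })); // UTC+5:30
```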
I agree that this is the preferred way of dealing with it. Unfortunately, it's not always possible. Some cases that come to mind:
- Importing event/action data that contains date/time values with a timezone but insufficient information on the place where the event occurred. Converting to UTC and throwing away the timezone means you're losing information. You can no longer recover the local time of the event. You can no longer answer questions like, did this happen in broad daylight or in the dark?
- Importing local date/time data without a timezone or place (I've seen lots of financial records like this). In this case, you simply don't know where on the timeline the event actually took place. The best you can do is store the local date/time info as is. You can't even sort it with date/time data from other sources. It's bad data but throwing it away or incorrectly converting it to UTC may be worse.
- Calendar/reminder/alarm clock kind of apps. You don't want to set your alarm to 7 am, travel to another timezone and have the alarm blare at you at 4 am. Sometimes you really do want local time.
- There are other cases where local times are not strictly necessary but so much more convenient. Think of shop opening hours in a mapping app, for instance (a sketch follows this list). You don't want to update all opening hours every time something timezone-related happens, such as the beginning or end of daylight saving time.
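A minimal sketch of the distinction behind these cases: an absolute instant versus a wall-clock value that only means something together with a place or zone. The type and field names are made up, not from any library:

```typescript
// An absolute point on the timeline, e.g. when a payment settled.
type Instant = { epochSeconds: number };

// A local "wall clock" value that only makes sense with a place/zone attached,
// e.g. the opening time printed on a shop door.
type WallClock = { localTime: string; zone: string };

const paymentSettled: Instant = { epochSeconds: 1638097466 };
const shopOpens: WallClock = { localTime: '09:00', zone: 'Europe/Amsterdam' };
```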
You are correct, there are many other reasons for saving dates and times in a database apart from recording "events", and for those it does make sense to use "relative" descriptions of time or date. I'd argue, though, that if your data collection of events is imperfect, for example if you have no idea which system an event came from or whether you can trust the syntax of the timestamps, those are primary problems that should be fixed and not "worked around" by changing how your timestamps are saved.
For example, if you don't know where a data point originated, that's already a pretty big issue regardless of whether the time is correct or not. If you have financial data with ambiguous timestamps, that's not just a data quality problem but potentially a compliance problem, since banks are heavily regulated. I think it's unlikely to be acceptable for a bank to be unable to answer the question "when did this event take place", so the fundamental issue should be fixed, not tolerated.
You're not always in a position to fix these issues, nor is it always worth the effort. It may be a complete side-issue that no one cares about. The information may exist somewhere but not in the dataset you're working with.
Having done quite a bit of data integration work in my life, I can tell you that fixing things can range from the unavoidable to the quixotic.
Another use case is the definition of a business day.
A business day is almost always in local time. Example: the business is open from 9AM until 6PM. This is affected by daylight saving time, so it moves back and forth in UTC.
Working with financial systems, payments are business-day aligned, and if you try to run the system on UTC time you might be surprised at what sort of issues surface. I learned these things the hard way. :)
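A sketch of that alignment problem: deciding whether a UTC instant falls inside local 9AM-6PM business hours has to be done against the business's zone, not against UTC. The helper name is made up, and it assumes a runtime with full Intl timezone support:

```typescript
// True if the instant falls inside 09:00-18:00 local time for the given IANA
// zone. The same instant can be inside business hours in one zone and outside
// them in another, and the UTC boundaries of a "business day" shift with DST.
function isWithinBusinessHours(epochSeconds: number, zone: string): boolean {
  const hour = Number(
    new Intl.DateTimeFormat('en-GB', {
      timeZone: zone,
      hour: 'numeric',
      hourCycle: 'h23',
    }).format(new Date(epochSeconds * 1000))
  );
  return hour >= 9 && hour < 18;
}

console.log(isWithinBusinessHours(1638097466, 'Europe/Berlin'));      // 12:04 local -> true
console.log(isWithinBusinessHours(1638097466, 'America/Los_Angeles')); // 03:04 local -> false
```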
That only applies if what you're storing identifies a point in time. In business code we often have to work with date-times in a specific timezone. For example, if you schedule a meeting on 2022-07-01 09:00 in Berlin time, throwing away the timezone would be a mistake that results in incorrect behaviour when the DST rules change.
> That only applies if what you're storing identifies a point in time.
A meeting is definitely "a point in time", and your example illustrates my point: the display of "2022-07-01 09:00 UTC+2" should be up to the client (which has a timezone setting), but the storage of this meeting's time should, in my opinion, not be "2022-07-01 09:00 UTC+2" but the timezone-neutral "1656658800".
Since there is technically no such thing as "Berlin time", it should be up to the client to decide whether the date you selected in the future (July 1, 2022) is UTC+1 (winter) or UTC+2 (summer) and then store the time in a neutral way to ensure it's then displayed correctly in other clients that have another timezone set (for example the crazy Indian UTC+5:30).
Meetings are a good example of when this matters, because it's important that meetings happen at the same time for everyone, which is always the case when the source data is unix epoch. Of course it also works with correct time zone conversions, but to me that adds unnecessary complexity. Other examples given, like the opening hours of a store, are probably better stored in a "relative" way (09:00, Europe/Berlin), since the UTC offset behind that zone will vary between UTC+1 and UTC+2.
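For what it's worth, a sketch (using luxon, not anything from this thread) of why "09:00 in Berlin" resolves to different instants depending on DST, and where the 1656658800 above comes from:

```typescript
import { DateTime } from 'luxon';

const summer = DateTime.fromISO('2022-07-01T09:00', { zone: 'Europe/Berlin' });
const winter = DateTime.fromISO('2022-01-03T09:00', { zone: 'Europe/Berlin' });

console.log(summer.toUTC().toISO()); // 07:00 UTC, Berlin is on UTC+2 in July
console.log(winter.toUTC().toISO()); // 08:00 UTC, Berlin is on UTC+1 in January

// Storing only the resolved epoch freezes today's DST rules into the data;
// storing the (local time, zone) pair keeps the intent.
console.log(summer.toSeconds()); // 1656658800
```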
The approach should be digital knobs that work relative to their setting and roll through endlessly (magnetically), so the software can react but the user stays in control.
The worst change in recent years is kitchen hobs all having these horrible touch controls or magnetic detachable knobs.
They're horribly slow to use, have a monster learning curve, come with locking systems that all behave differently, and fail when wet. It's so over-designed and over-engineered. It's so bad. Just provide a knob, for **'s sake, and when you turn it right, it turns on.
If I set the volume to 0, the software pauses a recording, so I can no longer skip unwanted parts without messing with fast-forward while driving. These smart knobs invite designers to mess with basic functionality that has worked for decades (play, pause, rewind, fast forward).
Occasionally, due to a bug, on cold winter mornings when I set the heating to max, I won't get warm air even after I've driven for a while and the engine is already hot. I then have to move the knob to cold and hot again for the software to register and finally give me some warmth.
I'd like chickenhead knobs that are connected to the functionality as directly as possible, please!
Me too. If a pot boils over, you can't shut off the hob because the controls become unusable (the surface shows 'hot' or there's liquid on it), which often leads to even more boiling over; it feeds on itself.
Proof of work is no different in this regard: more capital, more mining power, more control.
Miners' interests don't always align with the interests of the network's users (see gas fees).
Proof of work isn't more decentralized either (a few mining pool delegators control bitcoin); eth2 proof of stake is more secure because of the pseudorandom validator selection.
It is very different: The difference is that PoW is censorship resistant. Anybody can be a miner and existing miners cannot censor new miners. Performing new work is external to the network state. In PoS, existing stakers can prevent new stakers from registering. Very important distinction.
This is patently false, endgame PoW centralizes mining around 3rd world coal/cheapest possible (stolen?) electricity. The overwhelming majority of the world has been priced out of BTC mining, not that they could get ahold of an ASIC anyways.
Sure. And analogous to PoS is some kind of galactic requirement that you pay some kind of space bond in order to go to space. Mess around and your space bond is slashed. I like the bottlerocket model better.
I don't agree: in the past Metabase was good, but now it's just getting worse.
I was a big fan of Metabase, to the point that I deployed it at 4 companies and rolled it out for many people to use in their daily workflow. All was nice.
Somewhere around 0.31 something changed in the internal dev/release flow. I get the feeling there is some kind of internal push to add unnecessary, untested features which all look nice on paper and work on a 5-row, 5-column user table, until you actually use those features and the functionality fails miserably...
I had weird issues with it, which usually occurred after upgrades, on both small and larger, decently indexed DBs (pg and mysql), things like:
- Sets of reports which all worked fine until they hit 1001 rows, and suddenly the UI hit a React race condition.
- X-Rays which silently fail and break the UI, while on the DB server you see the queries complete.
- New versions which would hammer databases by running DISTINCT over every column individually, every hour, instead of handling statistics a bit more intelligently. Disabled that, of course.
- Versions deployed in other countries, where the UI client-server latency caused the whole UI to break (probably the limits of setTimeout reached? :-)).
If you check the GitHub issues, it's just ignored problems; they focus on releasing new functionality instead of fixing the issues. It doesn't really motivate you to report any more problems.
They had one good thing: they would let you work in SQL and write efficient queries, but clearly they decided it was time to remove the last good feature and now only want graphical query builders.
If you use Metabase: deploy it against a replica so it doesn't whack your prod DB, and take backups before upgrading, because there is no revert.
I've spent enough time on Metabase problems, won't touch it anymore.
We've walked a fine line between fixing bugs, adding the features that our users have asked for and building out the BI interface we think the world should be using in 5 years.
We don't always get it right, and I'm sorry if we broke things for you on the performance side.
We will be spending a few of our development cycles after 0.33 to try to clear out our bug queue.
One specific factual point I'd like to reply to, however:
SQL isn't going anywhere. If anything, in our next release we're making it easier to write plain SQL, use SQL templates, or use our graphical interface on top of SQL query results (preview at https://github.com/metabase/metabase/releases/tag/v0.33.0-pr...)
I’ve been using Metabase off and on in prod since 2016. I think you guys are really onto something and I love seeing Clojure in a popular OSS project. I know you guys just got some money; here’s my pitch to make it count: Metabase would be a lot more sticky if you provided a robust home for annotation and discussion. You’re better than most at letting admins describe data, but often users want to talk about the data they are discovering. Doing so in the context of the data is so much more effective than talking about it in generalities in Slack. I think Yellowfin has some features around discussion but there is always room for improvement. Maybe one could: Comment on a data point, write an explanation for why a date range has weird data, or tag a colleague to ask about an outlier in their dept. Celebrate wins with emojis? All I know is that there is nothing stickier or more engaging than discussion history. Keep up the good fight!
While I have you here...
<nag>
bump for the Athena driver. Also I can’t use Metabot without some privacy settings. Can’t anything be done? </nag>
Fully agree, pipes are awesome; the only downside is the potential duplicate serialization/deserialization overhead.
Streams in most decent languages closely adhere to this idea.
I especially like how Node does it; in my opinion it's one of the best things in Node. You can simply create CLI programs that have backpressure, the same as when you work with binary/file streams, while also supporting object streams.
Node streams are excellent, but unfortunately they don't get as much fanfare as Promises/async+await. A number of times I have been asked "how come my Node script runs out of memory", and the cause was the dev using await and storing the entirety of what is essentially streaming data in memory between processing steps.
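A minimal sketch of the difference (the file names are made up): piping chunks through a Transform keeps memory bounded via backpressure, whereas await-ing the whole payload into a variable does not.

```typescript
import { createReadStream, createWriteStream } from 'node:fs';
import { Transform } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Uppercases each chunk as it flows through; nothing is buffered beyond the
// stream's high-water mark, so a multi-GB input stays cheap to process.
const upperCase = new Transform({
  transform(chunk, _encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  },
});

async function main() {
  // pipeline() wires up backpressure and error propagation between the stages.
  await pipeline(
    createReadStream('big-input.log'),
    upperCase,
    createWriteStream('big-output.log'),
  );
}

main().catch(console.error);
```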