More

agentS · on Jan 20, 2018

I am sorry that you are sick of the HTTPS everywhere movement.

But forcing your sites to use HTTPS will also prevent your users from unwittingly participating in DDOS attacks on other sites (e.g. https://en.wikipedia.org/wiki/Great_Cannon). Consider it herd immunity.

Also, to respond to some of your other anti-HTTPS comments:

regarding overhead: people are also working hard to minimize the amount of overhead inherent to TLS. For instance, TLS 1.3 will establish an encrypted connection in a single roundtrip, and is capable of resuming encrypted connections in zero roundtrips with application opt-in (see https://blog.cloudflare.com/tls-1-3-overview-and-q-and-a/). The encryption itself has fairly ubiquitous support in hardware, making most of them ridiculously fast.

regarding CAs: with HTTP you are implicitly relying on the honesty of people in the network path. With HTTPS you are implicitly relying on the honesty of the intersection of a) people in the network path, b) people who control a CA. This is strictly fewer people than with HTTP. People are also working hard to solidify our faith in the set (b), by requiring Certificate Transparency for all new certificates, thereby ensuring that misbehaving CAs can be detected, and drastically raising the cost of mounting a CA-based attack.

You say "What if I for what ever reason don't want to use HTTPs", I you'll have to layout some of those reasons explicitly. You'll probably find that people are working on all of them.

In general, the default expectation on the web should be encrypted and authenticated (i.e. only both endpoints can read/write the data). Once we live in that future, asking for the ability to allow plaintext network traffic will seem a lot like asking modern programming languages to explicitly allow buffer overflows. The language designer would be justified in saying "No", and ignoring you. The considerate language designer might ask "why would you want that", and try to address your real need. But they would still never actually give you what you ask for. This may be "taking away choice" in the same sense that mandating airbags is "taking away choice", but people shrug and accept it because the baseline has moved.

anfilt · on Jan 20, 2018

The biggest problem is Java Script. Netscape made huge mistake when they added that to their browser. If we wanted to download code on the fly it should have been a separate format from HTML. HTML is supposed to be a document. Sadly, that boat sailed decades ago. I really wish the standards committee would quit adding basically hacks to what was supposed to be a bunch documents. Then create a separate format more conducive to that goal.

I suppose you could that creating a secure web context is kinda of way of doing that. Separating documents from more interactive content. However, freaking HTML is a horrible way to structure such a system.

Also the CA is single point of failure. A MITM only affects the users along that routing path. CA failure can affect the entire web...

I agree default is fine, but the user should be able to change those defaults. I could always compile a version for Firefox or Chromium that does. However, that's kinda ridiculous that I don't see any thing for instance in about:config Firefox.

--Speaking of programing languages-- Honestly, I don't see any language succeeding that only allows code with run time checks running. It's why rust has the unsafe keyword. To override those checks. The reason there are lots of hardware devices where the data from said device will vary in size, and there is no way the compiler will know how all these devices work. The unsafe keyword allows Rust to be a system programming language. Heck even C requires you to revert to ASM at times for working with hardware. The generated code may not meet some strict requirements set by the hardware or you need to setup the environment so C code can run.

Honestly, the biggest problem I see with programming languages is that too many of them try to be general purpose. Domain specific languages are great. If the language is designed well for the problem space it can lead to well written concise and understandable code. If your problem domain is working with hardware you need raw pointers, memory access and accurate timing. If you problem domain is altering images you probably want easy to use vector operations, handling of regions ect...

agentS · on Jan 17, 2017

> Which leads to the question, why is google doing this? They, you, could easily promote AMP pages while not masking the real URL!

Perhaps to allow the content to be served from a CDN (over HTTPS), without requiring the site to CNAME over their domain to Google.

If webmasters are willing to CNAME over their domain to a caching proxy, then a less intrusive design is possible[0], such as the one recently announced by Cloudflare[1].

[0]: https://github.com/ampproject/amphtml/blob/master/spec/amp-c... [1]: https://blog.cloudflare.com/accelerated-mobile/

jshen · on Jan 17, 2017

Most major sites use a CDN already. It's really to keep users within google.

lllr_finger · on Jan 17, 2017

The AMP CDN doesn't even really help with caching - when I implemented AMP at launch for some large sites, it was strongly implied that we couldn't expect any reduction in calls to our servers, and we didn't notice any reduction when AMP went live. From my perspective it appeared to be a walled garden counter-measure to FB Instant Articles and Apple News.

agentS · on April 14, 2016

Monomorphization is not a reasonable option, in my opinion.

It forces the compiler to accept some truly awful running times for pathological cases. Atleast quadratic, probably exponential.

For languages that have reflection or pointer maps for GC or debug information for types, it can force large blowups in space as well. Go has all three of these.

The implementation would likely require runtime code-generation (or accept warts like Rust's "object safety").

Indeed, all of Ian's proposed implementations are polymorphic and seem to avoid each of these issues at first glance. The only advantage of a monomorphic implementation is performance, and considering the downsides, this'd be premature optimization forced by a language spec.

If its actually performance critical, I imagine it'd be easy to write a program that monomorphized a particular instantiation of the generic types. Indeed, the compiler would be free to do that itself, if it felt it would be worth it. Small, guaranteed non-pathological scenarios for instance.

Where if you guarantee monomorphization in a language spec, the compiler and all users are forced to accept the downsides in all instances, in exchange for often meaningless performance gains (example: any program that does computation then IO).

pcwalton · on April 14, 2016

It's really not bad in practice. I've measured the amount of compilation time that generic instantiations take up and it's always been pretty low. Something like 20% (it's been a while, so take with a grain of salt), and that's with a naive implementation that doesn't try to optimize polymorphic code or perform ahead of time mergefunc. 20% is well within the project's demonstrated tolerance for compiler performance regressions from version to version. And you can do better with relatively simple optimizations. Generic compilation has been well-studied for decades; there are no unsolved problems here.

I would heavily advise against trying to do better than monomorphization with intensional type analysis (i.e. doing size/alignment calculations at runtime). We tried that and it was a nightmare. It didn't even save on compilation time in practice because of high constant factor overhead, IIRC.

Monomorphization is one of those things, like typechecking in ML, where the worst-case asymptotic time bounds look terrible on paper, but in practice it works out fine.

People point to C++ compilation times as a negative counterexample, but most of the compilation time here is in the parsing and typechecking, which a strongly typed generics implementation will dodge.

the_mitsuhiko · on April 14, 2016

If generics were not a studied and well implemented concept I would agree. But we live in a world where this is just not the case. I would take a slightly slower compiler with generics support any day over the mess that go devolves into because of the lack of it.

barrkel · on April 14, 2016

Bear in mind that .NET will use a shared instantiation when the generic arguments are normal reference types; a hidden parameter is required for static methods under this scheme, to pass the concrete type info. Monomorphization is only required when the vector of generic arguments has a distinct permutation of reference types vs value types.

This gives you the best of both worlds: memory efficient generics for the majority of instantiations, and compute efficient generics for the types most likely to benefit (like primitives).

agentS · on March 27, 2016

> But you cannot write a non-blocking Go function in the same way.

The caller can make function "non-blocking" by wrapping the call in a goroutine themselves. (There's some subtle differences, but they are mostly irrelevant here). For this reason, I'd say there is (almost) no reason to introduce asynchrony in your API in the way you suggest. The rest of your post built on this example seems shaky to me, since it seems built on an example API that doesn't need to exist.

I'd say that "you don't have to care whether your code is async or not" is a overstating the case. I would append the qualifier "unless you're introducing concurrency". Considering that almost no low-level APIs are asynchronous, this usually happens rarely (or happens in low-level code like the HTTP server). Examples that have come up for me: making N parallel RPCs, writing a TCP server. In those situations, you care about async vs not.

In event-loop based systems, it seems like async is in my face all the time, even when doing things that are entirely sequential.

Lukasa · on March 27, 2016

> The caller can make function "non-blocking" by wrapping the call in a goroutine themselves.

Sure, but if they want the return value then either they need to construct the Future-y wrapper I just described or they need to assemble it together in a collection of other function calls wrapped inside a function that itself is either Future-y or uses a long-lived channel to communicate results.

It is not novel to build up a non-blocking system from purely blocking method invocations. We've been doing that for years: it's called threading. Doing things this way has many advantages when written with appropriate diligence, and I'm not pretending otherwise. However, if you actually care about communicating between these arbitrary threads of execution than you either need Futures or queues (both of which are essentially just channels in Go), and at this point you've got the exact same problems as you get in NodeJS or any other asynchronous programming environment.

> The rest of your post built on this example seems shaky to me, since it seems built on an example API that doesn't need to exist.

I don't think that's fair: as I mentioned above, the fact that you as library author would not write the Future-y extension doesn't mean that the Future-y extension isn't built: you just force your caller to build it. That's fine, it's a perfectly good architectural decision (probably you should't be making those decisions for your user), but it doesn't remove the problem.

> I'd say that "you don't have to care whether your code is async or not" is a overstating the case. I would append the qualifier "unless you're introducing concurrency".

Sure. The thing that matters here is that Node is always introducing concurrency, because Node is concurrent. This is why all Node programs have to care about concurrency: they are all concurrent because their system is concurrent.

This is desperately inconvenient for many one-off programs, which is why I personally don't use Node for anything like that: I'd much rather use Python or Rust or Go. But that was never my argument. My argument was about OP's assertion that "with Go it doesn't matter if an operation is blocking or non blocking, that fact can totally be abstracted from the client code. Writing callbacks is tedious, promises are tedious, co routines with yield need plumbing and async isn't in the spec."

The first sentence is dangerously misleading (while technically true, any system that does that is usable only in that one context), and the second one misses the point, which is that those things get effectively built anyway in any moderate-scale concurrent system in Go.

But my biggest point is this: Go isn't magic in regard to concurrency, and there is a weird amount of magical thinking around Go. Go is a very good language with a lot to like, and I like it quite a lot. But when boiled down to it, Go's concurrency model is threads with a couple of really useful primitives. And that's great, and it works really well. But it's not new or novel.

The sentence "with Go it doesn't matter if an operation is blocking or non blocking, that fact can totally be abstracted from the client code" is equally true if you replace "Go" with "C", or "Python", or "Java", or any language with a threaded concurrency model. There's no magic here. It's the same building blocks everyone else is using.

vishbar · on March 27, 2016

Go isn't just a threaded concurrency model, it uses an M:N greenthreads pattern. Also, when you say that Go I/O operations are blocking, it is true that they'll logically block a goroutine. However, under the hood, it uses the same libuv-style async IO (or IOCP on Windows) that Node does. An operating system thread doesn't get blocked; the goroutine is "shelved" and woken up again when the I/O is complete. It accomplishes the same kind of thing as Nodejs does, it just abstracts the async nature of the IO away from the programmer. I have to say I like it: procedural execution is easier to reason about.

Lukasa · on March 27, 2016

Honestly, I think the distinction between M:N threading and straight OS threading is pretty minor. It grants some advantages to the language runtime: it can control the stack size, for example. But in terms of how it affects the development style and what kinds of bugs it encourages/discourages I don't think it dramatically differs from the OS threading model.

sagichmal · on March 27, 2016

It is categorically different. OS threads are orders of magnitude more expensive, which makes them a nonstarter for most problems that are a good fit for lightweight conceptual concurrency.

Lukasa · on March 27, 2016

It is not categorically different.

As I said above, green threading has advantages over OS threading, but they behave exactly the same in terms of design patterns and potential bugs.

This is what I was getting at when I said "not that different": compared to the difference between event-loop concurrency and threaded concurrency, M:N green threading is basically just a subcategory of threading.

sagichmal · on March 27, 2016

No. Green threads enable entirely new classes of design patterns that are categorically infeasible with OS threads.

Rapzid · on March 27, 2016

I believe that depending on the system call the thread handling the call could block, but it's not the same thread developer Goroutines are running on. Yeah though, same as nodejs.

jerf · on March 27, 2016

The core problem here is that the Node community invented/popularized a connotation of "blocking" and "non-blocking" that is excessively event-loop-specific. The important difference in their connotation is that code that blocks blocks the whole OS process. The conventional meaning of the term referred just blocking the running thread.

In normal Go, nothing is blocking in the Node sense. (Oh, if you put your mind to it you can manage it, but I've never once encountered this as an practical problem, either in Go or the equivalents you can do in Erlang if you put your mind to it.)

This has profound changes on how you write code.

It's true, Go is not magic. It's just another threaded language in most ways, with the "real" magic in the community best practices around sharing by communicating instead of communicating by sharing. In theory, you could write an equivalent set of C libraries and get most of the same things, but you'd have a lot of library to write. (This is why things like porting C goroutines to C have a hard time getting traction. It can be done, but it's actually the easy part. Also, you'd still be in C, which is its own discussion. But you can get the concurrency.)

The real issue here isn't that Go is necessarily exceptionally strong at concurrency, the real issue is that Node is exceptionally weak. It introduces this new concept of "blocking" that only exists in the first place because it is weak, and then makes you worry about it continuously, to the point that many people seem to internalize the concept as what concurrency is, when it isn't. It's really just something Node laid on you. So when you step out of Node, and you see a community that isn't visibly as worried about "blocking" as the Node community, someone trained by Node thinks they are seeing a community that "isn't good at concurrency". My gosh! Look how cavalier they are about "blocking"! Look how they tell people not to worry about it, and how casual they are about having users wrapping library code in goroutines and explicitly telling library writers not to do the concurrency themselves. But what you're seeing is what happens when you simply no longer have the problems Node and "event-based code" brings to the table. Go is not magic in the general case, but, honestly, when someone coming from the Node world picks up Go, I can see why they might go through a period where they sort of think it is. There are really differences in code style, and how easy it is to write correct code.

You have to make sure you're not letting the limitations of one connotation of "blocking" spill over into the other, or you will have problems. (True in both directions.)

To speak to someone else's point, "futures" in Go don't "suck", they basically don't exist. If you're writing in a recognizably "futures" fashion, you are not writing idiomatic or even particularly good Go. You don't need futures, because (what are today called) futures are basically an embedding of a concurrency-aware language into a non-concurrency-aware language, and you don't need them when the language you're working in is already concurrency-aware. That's why you don't see futures in Haskell or Erlang either. (I have to qualify with "what are today called" because the term has drifted; for instance, Haskell does have explicit support for an older academic definition of the term with MVars, but modern software engineers are not using the term that way.)

kasey_junk · on March 27, 2016

I've never in my life programmed in Node. When I say futures I'm talking about the logical concurrency primitive written about by Friedman/Wise in the 70s.

Lots & lots of idiomatic go code exists in that form (anytime you wrap a select that times out in a function you have a future).

Channels, what we are really discussing here, have 2 problems: the first is in abstraction, they don't provide basic primitives that other similar structures provide, like timeouts & cancellation. The second is in implementation. As futures you have to worry about all the edge cases around nil & closed channels. As queues they are highly contended.

Lukasa · on March 27, 2016

I agree with this entirely. As I said many times before, I like Go and I like its approach to concurrency.

All I'm trying to do is to make sure that people who make bold claims about abstracting away blocking code aren't misleading others: when calling other code you should always be aware of how it interacts with the flow control of your program.

philipov · on March 27, 2016

In your opinion, how does Python 3.5's async/await syntax compare to Go for writing concurrent programs? I work primarily with Python3 these days and have no Go experience.

agentS · on July 4, 2015

For servers that primarily speak RPC or HTTP, do you foresee Rust going thread-per-request or something more callback-y?

eddyb · on July 4, 2015

Neither. Existing successful solutions use tight event loops (the "Reactor pattern": https://stackoverflow.com/questions/3436808/how-does-nginx-h...).

There is a Rust library called "mio" which provides a lot of the plumbing for such systems: https://github.com/carllerche/mio.

The path forward is likely going to involve adding a way to build cheap state machines (call them generators or async/await) with a clean syntax and giving mio hundreds of thousands of reusable instances.

agentS · on July 4, 2015

> ... Reactor pattern ...

I don't understand, and the link seems unclear. Perhaps a more direct question: I get a request X, and I need to consult a backend service to answer the request. Do I write synchronous code calling that backend? Or do I have some callback mechanism?

> ... generators or async/await

Ah. This perhaps answers my question. Both of these are essentially compiler-written callbacks.

If this is going to be like C#, then I presume there will be a thread-pool where user code will execute. It seems like a non-ideal story for concurrency. Users will have to take inordinate care not to call any blocking code; otherwise they will prevent one of the threads in the pool from doing useful work.

pcwalton · on July 4, 2015

> It seems like a non-ideal story for concurrency. Users will have to take inordinate care not to call any blocking code; otherwise they will prevent one of the threads in the pool from doing useful work.

The downsides of going M:N are worse. The cgo-like FFI performance problems, for example, are killer for Rust's use case.

pcwalton · on July 4, 2015

Most applications right now should do thread-per-request. Thread spawning is very optimized in both Rust and the Linux kernel, and you can adjust stack sizes if you need to. If you're hitting limits caused by this, you can use mio.

Narishma · on July 4, 2015

What about systems other than Linux?

pcwalton · on July 4, 2015

Mac OS X is rarely used for servers, so I'm not particularly concerned about it. On Windows you can use user-mode scheduling—I would like to see a library for this—which is effectively 1:1.

agentS · on March 12, 2015

Without SSL, there is a trivial way to upgrade a temporary MITM capability to a permanent MITM capability using ServiceWorker.

agentS · on Feb 10, 2015

A binary protocol's parsing is usually something like read 2 bytes from the wire, decode N = uint16(b[0] << 8) | uint16(b[1]), then read N bytes from the wire. A text-based protocol's parsing almost always involves a streaming parser, which is tricky to get correct, and always more inefficient.

Besides, I think this is a moot point, because chances are that less than 100 people's HTTP2 implementations will serve 99.9999% of traffic. It's not like you or I spend much of our time deep in nginx's code debugging some HTTP parsing; I think its just as unlikely we'll be doing that for HTTP2 parsing.

Also, HTTP2 will always (pretty much) be wrapped in TLS. So its not like you're going to be looking at a plain-text dump of that. You'll be using a tool and that tool author will implement a way to convert the binary framing to human-readable text.

Another way to put it is that the vast majority of HTTP streams are not examined by humans and only examined by computers. Choosing a text-based protocol just seems a way to adversely impact the performance of every single user's web-browsing.

Another another way to put it is that there is a reason that Thrift, Protocol Buffers, and other common RPC mechanisms do not use a text-based protocol. Nor do IP, TCP, or UDP, for that matter. And there's a reason that Memcached was updated with a binary protocol even though it had a perfectly serviceable text-based protocol.

drawkbox · on Feb 10, 2015

Agreed on all points. Binary protocols are no doubt better, faster, more efficient and more precise. I use reliable UDP all the time in game server/clients. Multiplayer games have to be efficient, TCP is even too slow for real-time gaming.

Binary protocols work wonderfully... when you control both endpoints, the client and the server.

When you don't control both endpoints is where interoperability breaks down. Efficiency and exactness can be enemies of interoperability at times, we currently use very forgiving systems instead of throwing them out and assert crash dump upon communication error. Network data is a form of communication.

Maybe you are right, since it is binary, only a few hundred implementations might be made and those will be made by better engineers since it is more complex. Maybe HTTP is really a lower level protocol like TCP/UDP etc now. Maybe since Google controls Chrome and the browser lead and has enough engineers to ensure all leading implementations and OSs/server libraries/webservers are correct then it may work out.

As engineers we want things to be exact, but there are always game bugs not found in testing and hidden new problems that we aren't weighing against the known current ones. Getting a something new is nice because all the old problems are gone, but there will be new problems!

It will be an all new experiment we try going away from text/MIME based to something more lower level, complex and exact over simple and interoperability focused. Let's see if the customers find any bugs in the release.

MichaelGG · on Feb 10, 2015

>Binary protocols work wonderfully... when you control both endpoints, the client and the server.

IP is all binary and I don't think it's a case of one party controlling all endpoints.

agentS · on Aug 7, 2014

If your GC is triggered after the allocated memory increases by X% (which is fairly common), then this technique is effective, since it lowers allocation rate.

Also, Go doesn't scan arrays that are marked as containing no pointers, so representing an index as a massive array of values has proven quite effective for me.

agentS · on July 27, 2014

Out of curiosity, why do shells need to use fork?

Note: I am not familiar with the implementation techniques behind shells.

ridiculous_fish · on July 27, 2014

The essential function of a shell is to start processes. In Unix and Linux, the usual way to start a new process is to clone yourself (fork), and then have the clone replace itself with a new executable image (exec).

It's kind of roundabout, but the brilliance of this approach lies in what happens between those two calls. There exists process metadata that survives the call to exec, such as where stdout goes, or whether the process is in the foreground. So shells call fork, the clone sets up the metadata for the target process, and then calls exec to start it.

But when a multithreaded program forks, the clone is very limited in what it can do (before exec). In particular, the clone must not acquire a lock that may have been held at the time of fork (which usually rules out heap allocations!). Now say something goes wrong: the clone needs to print an error message, without locking anything. But lots of functions acquire locks internally. How do you know what's safe to call?

fish solves this by providing its own known-safe implementations of printf() and friends, and being careful to only call those after fork. Go solves this by disallowing any user-code between fork and exec. Instead it provides a single posix_spawn-like entry point called ForkExec, and does some black magic (like raw syscalls - see https://code.google.com/p/go/source/browse/src/pkg/syscall/e... ) in between the underlying fork and exec calls.

My hunch is that a shell written in Go will eventually bump up against the limitations of ForkExec. Happily Go has a strong FFI, so you can hopefully implement this stuff in C, if it comes to that!

xiaq · on July 27, 2014

One of the most frequent things you do in shells is invoking external programs. Unlike in Windows where a CreateProcess does the invocation, in Unix systems this is (traditionally) achieved by a fork/exec combo. See http://en.wikipedia.org/wiki/Fork-exec

agentS · on May 23, 2014

Out of curiosity, what form of "spying" do you think is enabled by Javascript that would not be possible without Javascript?

Edit: And since I replied to a small part of your comment, I should say that I disagree completely with your "few people truly care about their craft" statement. At least, I think that writing code that handles a lack of Javascript is only valuable if you have enough users to justify it. i.e. if you spend 20% of your time working on features for 0.1% of users, then you are doing a disservice to the rest of your users. Even more so if you have to compromise the experience for everyone else such that degrading is an option.

In some cases, you go out of your way to accommodate small fractions of your audience. ARIA and catering to those with disabilities is a good example. But turning off JS is a choice; one I respect, but feel no obligation to cater to. I think pages should show a noscript warning, but other than that, its a matter of engineering tradeoffs.

cpeterso · on May 23, 2014

Some analytics companies track mouse movements to watch how people interact with web pages. They can also use JavaScript to fingerprint browsers beyond what is available with cookies.