I also found the advice for technical writing at the end of this MIT lecture by Professor Patrick Winston to be very valuable: https://youtu.be/bQI0OmJPby4?t=2703
It is probably ethically no worse than any killing simulation (e.g. first-person shooters). Creatures in such games arguably do not have a rich and lasting experience of processing information the way humans and many animals do, so the ethical stakes are small. My moral intuition on this matter is that the real ethical questions begin with systems that perceive, learn, and can experience pain/reward: http://petrl.org
You may have missed the joke (such things are hard to judge through the medium of text). The parent was implying that the complexity of Dwarf Fortress will grow to the point where the dwarfs themselves become sentient.
With the same skepticism you could also ask yourself to what extent the hallucinations were actually due to subconscious recall of all kinds of memes surrounding hallucinogens, like "out-of-body experience", "consciousness", "parallel worlds", "breakthrough", "feeling like dying", as well as imagery from all kinds of drug-related films and literature, etc.
Humans are, however, very bad at retrospectively telling which memes they were exposed to in the past and which they weren't. That is basically the reason why highly specialized people in academia are often bad teachers unless they have practiced teaching a lot. The inferential distance between expert and novice (i.e. the difference in knowledge) simply becomes so large and complex that the expert forgets where the actual gap lies.
I do agree, but at ~14 years of age, in a small village, during the days when only the privileged even had 28k dial-up access, I had rather limited opportunity to be exposed to such things!
It would be enough for a friend to have bragged about hallucinating certain things. Words can convey very complex imagery like that, and these experiences are encoded everywhere (in scripture, folk wisdom, idioms, songs, jokes, tales), in a web of distributed representations of the noosphere that we share via language. It is pretty much impossible to escape, and it seems conceivable that it strongly biases our experiences under the influence of hallucinogens, because the mind is basically always interpreting its inputs via the most suitable explanation (which will likely stem from stories and memories). Of course, we are only talking about a certain extent to which these experiences are determined by shared concept spaces; there is no indication that this is the only sensible hypothesis. The similarity of these experiences across different people may just as well be explained by our shared cognitive architecture (e.g. modules for recognizing agency or venomous bugs), as other comments in this thread have suggested.
There are brief sections about this in the Deep Learning book by Goodfellow, Bengio and Courville (2016):
> 18.2
Because the negative phase involves drawing samples from the model’s distribution, we can think of it as finding points that the model believes in strongly. Because the negative phase acts to reduce the probability of those points, they are generally considered to represent the model’s incorrect beliefs about the world. They are frequently referred to in the literature as “hallucinations” or “fantasy particles.” In fact, the negative phase has been proposed as a possible explanation for dreaming in humans and other animals (Crick and Mitchison, 1983), the idea being that the brain maintains a probabilistic model of the world and follows the gradient of log p̃ while experiencing real events while awake and follows the negative gradient of log p̃ to minimize log Z while sleeping and experiencing events sampled from the current model. This view explains much of the language used to describe algorithms with a positive and negative phase, but it has not been proven to be correct with neuroscientific experiments. In machine learning models, it is usually necessary to use the positive and negative phase simultaneously, rather than in separate time periods of wakefulness and REM sleep. As we will see in Sec. 19.5, other machine learning algorithms draw samples from the model distribution for other purposes and such algorithms could also provide an account for the function of dream sleep.
> 19.5.1 Wake-Sleep
One of the main difficulties with training a model to infer h from v is that we do not have a supervised training set with which to train the model. Given a v, we do not know the appropriate h. The mapping from v to h depends on the choice of model family, and evolves throughout the learning process as θ changes. The wake-sleep algorithm (Hinton et al., 1995b; Frey et al., 1996) resolves this problem by drawing samples of both h and v from the model distribution. For example, in a directed model, this can be done cheaply by performing ancestral sampling beginning at h and ending at v. The inference network can then be trained to perform the reverse mapping: predicting which h caused the present v. The main drawback to this approach is that we will only be able to train the inference network on values of v that have high probability under the model. Early in learning, the model distribution will not resemble the data distribution, so the inference network will not have an opportunity to learn on samples that resemble data.
Another possible explanation for biological dreaming is that it is providing samples from p(h,v) which can be used to train an inference network to predict h given v. In some senses, this explanation is more satisfying than the partition function explanation. Monte Carlo algorithms generally do not perform well if they are run using only the positive phase of the gradient for several steps then with only the negative phase of the gradient for several steps. Human beings and animals are usually awake for several consecutive hours then asleep for several consecutive hours. It is not readily apparent how this schedule could support Monte Carlo training of an undirected model. Learning algorithms based on maximizing L can be run with prolonged periods of improving q and prolonged periods of improving θ, however. If the role of biological dreaming is to train networks for predicting q, then this explains how animals are able to remain awake for several hours (the longer they are awake, the greater the gap between L and log p(v), but L will remain a lower bound) and to remain asleep for several hours (the generative model itself is not modified during sleep) without damaging their internal models. Of course, these ideas are purely speculative, and there is no hard evidence to suggest that dreaming accomplishes either of these goals. Dreaming may also serve reinforcement learning rather than probabilistic modeling, by sampling synthetic experiences from the animal’s transition model, on which to train the animal’s policy. Or sleep may serve some other purpose not yet anticipated by the machine learning community.
Hmm, interesting. So it seems the Boltzmann machine explanation of dreams that I described becomes less convincing when you consider that running the waking and dreaming phases of the training algorithm in separate long "chunks", instead of simultaneously, does not seem to work well in practice.
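To make the two phases concrete, here is a minimal toy sketch of my own (not code from the book): a fully visible Boltzmann machine over 3 binary units with p(x) ∝ exp(xᵀWx/2). The log-likelihood gradient splits into a positive ("waking") phase, expectations under the data, minus a negative ("dreaming") phase, expectations under the model. With only 2³ states the negative phase can be computed exactly by enumeration instead of MCMC sampling; all names and the toy data set are my own choices.

```python
import itertools
import numpy as np

# All 2^3 binary states of the visible units.
states = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)

def model_expectation(W):
    # E_model[x x^T] for p(x) proportional to exp(x^T W x / 2),
    # computed exactly by enumerating all states.
    logits = 0.5 * np.einsum('si,ij,sj->s', states, W, states)
    p = np.exp(logits)
    p /= p.sum()                          # normalize by the partition function Z
    return np.einsum('s,si,sj->ij', p, states, states)

# Toy data: every state once, plus extra weight on 111 and 000,
# which induces positive pairwise correlations.
data = np.vstack([states, [[1, 1, 1], [0, 0, 0]]])

W = np.zeros((3, 3))
for _ in range(3000):
    positive = data.T @ data / len(data)   # waking phase: data statistics
    negative = model_expectation(W)        # dreaming phase: model statistics
    W += 0.2 * (positive - negative)       # ascend the log-likelihood
```

After training, the model's second moments match the data's, i.e. the positive and negative phases cancel; the point above is that real sampling-based training interleaves the two phases per step rather than running each for hours.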
Doesn't the caveat at the end of your citation from Wikipedia prevent endless recursion? Both players assume that the other player will not change their decision.
I was surprised that Ctrl-F "Bayes" turned up nothing. It turns out that traditional deductive logic is simply a special case of applying Bayes' rule to find the state of some binary variables given the others, where the probabilities are only 0 or 1. See for example the examples in Barber's "Bayesian Reasoning and Machine Learning" (p. 39).
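As a toy illustration of that claim (my own example, not one of Barber's): with a hard 0/1 likelihood encoding the rule "A implies B", Bayes' rule reproduces modus tollens exactly. The uniform prior on A just expresses ignorance.

```python
# Rule "A implies B" encoded as a likelihood: P(B=1 | A=1) = 1.
# When A is false, B is unconstrained (probability 1/2 either way).
p_B1_given_A = {1: 1.0, 0: 0.5}
p_A1 = 0.5                               # ignorance prior on A

# Observe B = 0. Bayes' rule:
# P(A=1 | B=0) = P(B=0 | A=1) P(A=1) / P(B=0)
numerator = (1 - p_B1_given_A[1]) * p_A1
evidence = numerator + (1 - p_B1_given_A[0]) * (1 - p_A1)
posterior_A1 = numerator / evidence
print(posterior_A1)  # 0.0 -- the deductive conclusion "not A", with certainty
```

Because the likelihood P(B=0 | A=1) is exactly zero, the posterior collapses to certainty regardless of the prior, which is precisely the deductive step.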
Deductive logic is sort of like the machine code for reasoning. It encompasses everything, but it is hard to see in daily usage, just like machine code. Bayes' rule is just one part of it.
If you look at the Mizar mathematical library you will find a formalization of Bayes' rule using plain FOL.
In fact, one of my Logic II (an intermediate logic class) exercises was formalizing a version of Bayes' rule in classical first-order logic.
The idea is basically that junk DNA contains a memory in the shape of a distributed representation of the past of the organism and its environment, akin to how neural networks encode information. It basically provides a basis for fast adaptability by introducing noise into gene expression and morphogenesis, gaining versatility and robustness for exploring alternatives (very similar to dropout in neural networks).
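For readers unfamiliar with the analogy, here is a small sketch of dropout itself (my own code, standard "inverted dropout"): multiplicative Bernoulli noise on activations during training, with rescaling so the expected activation is unchanged at inference time.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p_drop=0.5, train=True):
    if not train:
        return activations             # inference: no noise, no rescaling
    # Zero each unit with probability p_drop, rescale survivors by
    # 1/(1 - p_drop) so the expected value is preserved.
    mask = rng.random(activations.shape) >= p_drop
    return activations * mask / (1.0 - p_drop)

h = np.ones(10_000)
print(dropout(h).mean())   # close to 1.0: survivors are rescaled
```

The relevant property for the analogy is that the random zeroing forces redundant, distributed representations rather than reliance on any single unit.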
You're looking for purpose, but as the article states at the outset, don't ask 'what is this for'; ask 'how has this sequence evolved?'.
90% of our genome is unconserved, meaning that it is not under selection. Most of this consists of dead viruses and mobile elements. Such DNA was present for its own purposes while it was active but is long since dead. A tiny proportion of this junk is later co-opted by the host organism.
The null hypothesis is that junk DNA is junk. It survives in the genomes of species with small effective population size because its selection coefficient is too small for it to be purged.
The alternative hypothesis you gave would need evidence to support it, otherwise it's another 'just so' story. Ask yourself: if this junk is beneficial for adaptation as you hypothesize, why don't bacteria have any? More broadly, why is the amount of junk DNA inversely proportional to the effective population size (as the null hypothesis predicts)?
But the null hypothesis remains that it is junk. The hypothesis is that it is not junk, borrowing evidence from neural networks, in which (1) things that look like noise are actually distributed representations and (2) noise improves robustness and facilitates exploration. This evidence is possibly transferable because both processes, neural network training and evolution, can be formalized as high-dimensional optimization problems (one informed by gradient information, while the other randomly mutates and exploits ensemble effects of recombination).