D𝜋 · 2y · 30

Welcome aboard this IT ship, to boldly go where no one has gone before!

Indeed, I just wrote 'when it spikes' and, further on, 'the low threshold', and no more. I work in complete isolation, and some things are so obvious inside my brain that I do not realise they are not obvious to others.

It is part of the 'when' aspect of learning, but it uses an internal state of the neuron instead of external information from the quantilisers.

If there is little reaction to a sample in a neuron (spiking happens slowly, or not at all), it is meaningless and you should ignore it. If it comes too fast, it is already 'in' the system and there is no point in adding to it. You are right to say the first rule is more important than the second.

Originally, there was only one threshold instead of three. When learning, the update would only take place if the threshold was reached after a minimum of two cycles (or three, but then it converges unbearably slowly), and only for the connections that had been active at least twice. I 'compacted' it for use within a single cycle (to make it look simpler), which made the minimum 50% of the threshold; I then adjusted (might as well) that value by scanning around it, and added the upper threshold, more to limit the number of updates than to improve the accuracy (although it contributes a small bit). The best result is with 30% and 120%, whatever the size or the other parameters.
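In code, that gate comes down to something like the sketch below. The names and the learning step are illustrative, not the actual implementation; only the 30%/120% bounds and the 'active connections only' rule come from what I wrote above.

```python
import numpy as np

LOW, HIGH = 0.30, 1.20   # the bounds that gave the best result, regardless of size

def maybe_update(weights, active, potential, spike_threshold, target_sign, lr=0.01):
    """Adjust the weights of the active connections only when the neuron's
    accumulated potential falls between the low and high thresholds."""
    if potential < LOW * spike_threshold:
        return weights            # too little reaction: meaningless, ignore the sample
    if potential > HIGH * spike_threshold:
        return weights            # too fast: already 'in' the system, nothing to add
    updated = weights.copy()
    updated[active] += lr * target_sign   # only the connections that were active
    return updated
```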

Before writing this, I quickly checked on PI-F-MNIST. It is still ongoing, but it seems to hold true even on that dataset (BTW: use quantUpP = 3.4 and quantUpN = 40.8 to get to 90.2% with 792 neurons and 90.5% with 7920).

As it seems you are interested, feel free to contact me through private message. There is plenty more in my bag than can fit in a post or comment. I can provide you with more complete code (this one is triple distilled).

Thank you very much for your interest.

D𝜋 · 2y · 60

I am going to answer this comment because it is the first to address the analysis section. Thank you.

I close the paragraph by saying that there are no functions anywhere and that it will aggrieve some. The shift I am trying to suggest is for those who want to analyse the system using mathematics and could be dismayed by the absence of functions to work with.

Distributions can be a place to start. The quantilisers are a place to restart mathematical analysis. I gave some links to an existing field of mathematical research that is working along those lines.

Check this out: they are looking for a multi-dimensional extension to the concept. Here it is, I suggest.

D𝜋 · 2y · -10

This introduces a new paradigm. Read T. Kuhn: you cannot compare different paradigms.

Everything that matters is in the post. Read it; really.

What is needed next is engineering, ingenuity and suitable ICs, not maths. The IT revolution came from IT (coders) and ICs, not CS.

As for your recommendation, I have tried so many things over the past four years… I posted here first to get to the source of one of the pieces of evidence; to no avail.

Goodbye, everyone.

I am available through private messages.

D𝜋 · 2y · 30

BP is Back-Propagation.

We are completely missing the plot here. 

I had to use a dataset for my explorations, and MNIST was simple; I used PI-MNIST to show an 'impressive' result so that people would have to look at it. I expected the 'PI' to be understood, and it is not. Note that I could readily answer the 'F-MNIST challenge'.

If I had just expressed an opinion on how to go about AI, the way I did in the roadmap, it would have been, rightly, ignored. The point was to show that it is not 'ridiculous' and that the system fits with that roadmap.

I see that your last post is about complexity science. This is an example of it. The domain of application is nature. Nature is complex, and maths has difficulties with complexity. The field of chaos theory petered out in the 80s for that reason. If you want to know more about it, start with Turing's morphogenesis (read the conclusion), then Prigogine. In NNs, there is Kohonen.

Some things are theoretically correct but practically useless. You know how to win the lotto, but nobody does it. Better something simple that works and can be reasoned about, even without a mathematical theory. AI is not quantum physics.

Maybe it could be said that intelligence is cutting through all the details and then reasoning with what is left, but the devil is in those details.

D𝜋 · 2y · 30

Also,

No regularisation. I wrote about that in the analysis.

Without max-norm (or maxout, ladder, VAT: all forms of regularisation), BP/SGD only achieves 98.75% (from the 2014 dropout paper).
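For reference, max-norm is just a clamp applied to each unit's incoming weight vector after every SGD step. A minimal sketch, with illustrative names and not tied to any particular framework:

```python
import numpy as np

def max_norm(W, c=3.0):
    """Rescale each column of W (one unit's incoming weights) so its L2 norm <= c."""
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    scale = np.minimum(1.0, c / np.maximum(norms, 1e-12))
    return W * scale

# Inside a plain SGD loop (sketch):
#   W -= lr * grad
#   W = max_norm(W, c=3.0)
```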

Regularisation must come from outside the system (SO can be seen that way) or through local interactions (neighbours). Many papers clearly suggest that this should improve the result.

That is yet to be done.

D𝜋 · 2y · 20

... and it is in this description:

"The spiking network can adjust the weights of the active connections"

D𝜋 · 2y · 30

It is not a toolbox you will be using tomorrow.

I applied it to F-MNIST, in a couple of hours after being challenged, to show that it is not only about MNIST. I will not do it again; that is not the point.

It is a completely different approach to AGI, one that sounds so ridiculous that I had to demonstrate that it is not, by getting near SOTA on one widely used dataset (hence PI-MNIST) and by finding relevant mathematical evidence.

D𝜋 · 2y · 30

I am going after pure BP/SGD: so neural networks (no SVMs), no convolutions, ...

No pre-processing either: that would be changing the dataset.

It is just a POC, to make a point: you do not need mathematics for AGI. Our brain does not.

I will publish a follow-up post soon.

D𝜋 · 2y · 30

I doubt that this would be the best an MLP can achieve on F-MNIST.

I will put it this way: SONNs and MLPs do the same thing, in a different way. Therefore they should achieve the same accuracy. If this SONN can get near 90%, so should MLPs. 

It is likely that nobody has bothered to try 'without convolutions' because it is so old-fashioned.

Convolutions are for repeated locally aggregated correlations.
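To make that phrase concrete: a convolution layer correlates the same small kernel with every local patch of the image, so the same local aggregation is repeated everywhere. A bare-bones illustration (single channel, 'valid' padding, the cross-correlation form used in NNs):

```python
import numpy as np

def conv2d(image, kernel):
    """Correlate one small kernel with every local patch of a 2D image."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw]     # local neighbourhood
            out[i, j] = np.sum(patch * kernel)    # same aggregation, repeated everywhere
    return out
```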

D𝜋 · 2y · 40

Spot on.

I hope your explanation will be better understood than mine. Thank you.

It 'so happens' that MNIST (but not PI) can also be used for basic geometry. That is why I selected it for my exploration (easy switch between the two modes).
