# Neural network XOR example from scratch (no libs)

jacek

## Introduction

The XOR example is a toy problem in the machine learning community, a "hello world" for introducing neural networks. The task is to build and train a neural network so that, given two binary inputs, it outputs (at least approximately) what the XOR function would output. This isn't a math-heavy explanatory tutorial; there are plenty of those out there. I assume you have some vague knowledge of neural networks and want to try writing a simple one. This article is just a bunch of simple Python scripts that implement neural networks. No numpy or other libraries are used, so they should be easy to translate to other languages.

All the scripts use stochastic gradient descent to train the neural network, one data row at a time, so there is no need for matrix transpositions. The loss function is mean squared error.
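To make the two ingredients above concrete, here is a minimal sketch (my own helper names, not from the scripts): the mean squared error over the four XOR rows, and the derivative of a single row's squared error, which is what drives one stochastic gradient descent step per row.

```python
def mean_squared_error(targets, outputs):
    # average of (target - output)^2 over all rows
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)

def row_error_derivative(target, output):
    # d/d(output) of (target - output)^2 — the per-row SGD signal
    return -2.0 * (target - output)
```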

## First script

This is the simplest script, an implementation of the XOR network described above.

Here the neural network is just a bunch of loosely written variables.

Example output:

```
epoch 1000 mean squared error: 0.2499876271419115
epoch 2000 mean squared error: 0.2499688242837126
epoch 3000 mean squared error: 0.24988612392100873
epoch 4000 mean squared error: 0.24903213808270375
epoch 5000 mean squared error: 0.20392068756493792
epoch 6000 mean squared error: 0.06346297881590131
epoch 7000 mean squared error: 0.01137474589491641
epoch 8000 mean squared error: 0.005176747319816359
epoch 9000 mean squared error: 0.0031937304736529845
epoch 10000 mean squared error: 0.0022656890991194886
0 0 0.027649625886219092
1 0 0.95846511144229
0 1 0.9433905288343537
1 1 0.05803856813942385
```

Your mileage may vary. Sometimes this simple net will fail to converge and output roughly 0.666 for all inputs, or it will need more iterations to train. That's normal: it is more sensitive to the starting random weights than more complex models are. NN libraries suffer from this too, but they can mitigate it with smarter weight initialization. You can play around with the learning rate (alpha) or the random bounds (VARIANCE_W, VARIANCE_B).
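As a sketch of what "smarter weight initialization" means in practice, libraries commonly use Xavier/Glorot-style bounds that scale with the number of connections into and out of a layer, instead of a fixed bound like VARIANCE_W. The helper name below is my own:

```python
import math
import random

def glorot_uniform(fan_in, fan_out):
    # Xavier/Glorot uniform: sample from [-limit, limit] where the
    # limit shrinks as the layer gets more connections
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return random.uniform(-limit, limit)
```

For the tiny 2-2-1 net here it makes little practical difference, but for deeper networks this kind of scaling keeps early activations and gradients in a usable range.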