If you’ve read even a little about artificial intelligence (AI), you’ve probably seen the term parameters in headlines about massive models with billions of them.
You’ve probably read or heard phrases like “a neural network can contain up to millions of parameters,” or “calculating parameters requires significant computational power from a GPU.”
Parameters are the values a model learns during training. In neural networks, weights are a specific type of parameter: they determine the strength of the connections between neurons. The essence of AI lies in its ability to make predictions or decisions based on patterns it has learned from data, even when it encounters something it hasn’t seen before.
We’d like to share an example of how these neural network weights work and why they’re fundamental to AI’s ability to generalize and predict.
AI neural network prediction vs. simply looking up information
Let’s say you have a database with information on all houses in a city.
You have their size in square feet, their age and their market price. If you want the price of a specific house that’s already in that database, you simply look it up.
Here, no AI is needed.
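A plain lookup like this needs no model at all. In Python, it is just a dictionary access (the addresses and prices below are invented for illustration):

```python
# A tiny "database" of known houses: address -> (square feet, age, price).
# All values here are made up for illustration.
houses = {
    "12 Oak St": (1500, 20, 210000),
    "34 Elm Ave": (900, 35, 140000),
}

def lookup_price(address):
    """Return the stored price for a house we already know about."""
    sqft, age, price = houses[address]
    return price

print(lookup_price("12 Oak St"))  # 210000
```

A lookup can only ever answer questions about houses already in the database; it has no way to handle "34 Pine Rd" if that address was never stored.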
But what if you need to know the price of a house that’s not in the database?
Now the system must predict the price based on some of the characteristics it knows. This is where AI can be helpful.
How neural network weights enable AI predictions
If you already have price data for some homes in a city, you can plot square footage against price and try to fit a straight line. For example:
y = 50x + 200000
(In this equation, x is square footage and y is the predicted price.)
This equation predicts that a 1000 sq. ft home would cost $250,000. This is a prediction based on a formula, a linear model.
However, this initial model may be far off. If real homes that are 1000 sq. ft. actually cost around $150,000, our prediction is off by $100,000.
You can then try a new line:
y = 130x + 28066
Now the prediction for a 1000 sq. ft. home is $158,066, a much closer estimate. This process of adjusting the weight (130) and the bias (28066) is what improves the model.
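The two candidate lines above can be compared directly in code. This is a minimal sketch using the exact numbers from the example:

```python
def predict(sqft, weight, bias):
    """Linear model: predicted price = weight * sqft + bias."""
    return weight * sqft + bias

# First guess: y = 50x + 200000
print(predict(1000, 50, 200000))   # 250000

# Adjusted line: y = 130x + 28066
print(predict(1000, 130, 28066))   # 158066
```

The function is identical in both cases; only the weight and bias change. That is the key idea: learning means finding better values for these two numbers, not changing the formula itself.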
Note: This house price example is relatively simple. There are only two variables and there is a linear relationship. Real-world data, however, isn’t often this simple. For complex patterns, like recognizing faces or understanding speech, you need models that can handle non-linear relationships. That’s what neural networks are for.
Training a neural network
In machine learning, this process of improving a model is called training.
During training, a neural network tries many combinations of weights and biases to find the best fit for the data. Progress is measured with a loss function, which quantifies how far the model’s predictions are from the actual values.
A lower loss means a better model, and the training process seeks to minimize that loss.
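Here is a minimal sketch of such a training loop for the one-weight house price model, using mean squared error as the loss and plain gradient descent. The training data is invented for illustration:

```python
# Invented training data for illustration: (square feet, actual price).
data = [(800, 124000), (1000, 150000), (1500, 223000)]

def loss(w, b):
    """Mean squared error: average of squared prediction errors."""
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)

weight, bias = 0.0, 0.0
learning_rate = 1e-7  # tiny, because square footage values are large

initial_loss = loss(weight, bias)
for step in range(5000):
    # Gradients of the mean squared error with respect to weight and bias.
    grad_w = sum(2 * ((weight * x + bias) - y) * x for x, y in data) / len(data)
    grad_b = sum(2 * ((weight * x + bias) - y) for x, y in data) / len(data)
    # Nudge each parameter in the direction that lowers the loss.
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b

# The loss drops sharply as the weight settles near its best value.
print(initial_loss, loss(weight, bias))
```

Each pass nudges the weight and bias slightly in the direction that reduces the loss, which is exactly the "try a new line" step from the earlier example, automated. A real neural network does the same thing, just with millions of parameters instead of two.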
Learn more by checking out this blog: What is AI inference vs. AI training?
Why powerful GPUs are needed for AI
Even though our house price example is simple, it’s meant to convey how neural networks and AI use weights, loss functions and training to learn from data and make better predictions.
Training neural networks involves adjusting potentially millions of weights simultaneously, which is a computationally intensive task. That’s why GPUs, which can handle many calculations in parallel, are essential for modern AI!
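As a rough illustration of why parallelism matters, the per-example error and gradient computations can be expressed as single vector operations rather than a loop. This sketch uses NumPy on the CPU; a GPU applies the same idea across millions of weights at once:

```python
import numpy as np

# Invented data: square footage and prices as arrays.
x = np.array([800.0, 1000.0, 1500.0])
y = np.array([124000.0, 150000.0, 223000.0])

weight, bias = 130.0, 28066.0

# Predictions, errors, and gradients for all examples are
# computed together, with no Python-level loop over the data.
predictions = weight * x + bias
errors = predictions - y
grad_w = 2 * np.mean(errors * x)
grad_b = 2 * np.mean(errors)
```

With three data points and one weight this gains nothing, but the same pattern scales: when there are millions of examples and millions of weights, hardware that performs these array operations in parallel makes training feasible at all.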
Massed Compute provides NVIDIA cloud GPUs for training your own AI models without the need to invest in your own hardware. Sign up and explore our marketplace today!
Use the coupon code MassedComputeResearch for 15% off any GPU rental