Happy new year, everyone! Following Andrew Ng's course on Coursera, here's my implementation of the gradient descent algorithm (univariate linear regression) in Python, using pandas and matplotlib, with an optional visualization that creates a GIF. Any optimizations/suggestions are welcome.
For those who don't know what the gradient descent algorithm is: gradient descent is a first-order iterative optimization algorithm for finding a local minimum of a function. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or approximate gradient) of the function at the current point. If one instead takes steps proportional to the positive of the gradient, one approaches a local maximum of that function. You can check the definition on Wikipedia.
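Concretely, for the univariate case implemented below (where b plays the role of θ0 and m the role of θ1), the MSE cost and the per-iteration updates work out to, in LaTeX notation:

J(\theta_0, \theta_1) = \frac{1}{n} \sum_{i=1}^{n} \bigl(y_i - (\theta_1 x_i + \theta_0)\bigr)^2

\theta_0 \leftarrow \theta_0 + \alpha \, \frac{2}{n} \sum_{i=1}^{n} \bigl(y_i - (\theta_1 x_i + \theta_0)\bigr)

\theta_1 \leftarrow \theta_1 + \alpha \, \frac{2}{n} \sum_{i=1}^{n} x_i \bigl(y_i - (\theta_1 x_i + \theta_0)\bigr)

(The factor of 2/n comes from differentiating the plain MSE; Ng's course divides the cost by 2n instead, which cancels the 2 but leads to the same minimizer.)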
Ex1 GIF:
Ex2 GIF:
Links for the datasets used:
ex1:
https://drive.google.com/open?id=1tztcXVillZTrbPeeCd28djRooM5nkiBZ
ex2:
https://drive.google.com/open?id=17ZQ4TLA7ThtU-3J-G108a1fzCH72nSFp
CODE
import os
import random

import imageio
import matplotlib.pyplot as plt
import pandas as pd

def compute_cost(b, m, data):
    """
    Compute the cost for univariate linear regression using the mean squared error (MSE).

    Args:
        b: Intercept (y = mx + b).
        m: Slope (y = mx + b).
        data: A pandas DataFrame with x and y data.

    Returns:
        Cost function value.
    """
    # Mean of the squared residuals over all data points. Note that
    # .sum().mean() would return the sum, since .mean() on a scalar is a no-op.
    squared_errors = (data['Y'] - (m * data['X'] + b)) ** 2
    return squared_errors.mean()

def adjust_gradient(b_current, m_current, data, learning_rate):
    """
    Adjust the theta parameters for the univariate linear equation y^ = hθ(x) = θ0 + θ1x.

    Args:
        b_current: Current intercept (y = mx + b), i.e. θ0.
        m_current: Current slope (y = mx + b), i.e. θ1.
        data: A pandas DataFrame with x and y data.
        learning_rate: Alpha value.

    Returns:
        Adjusted theta parameters.
    """
    # Both partial derivatives share the same residuals, so compute them once
    # instead of storing intermediate columns on the DataFrame.
    error = data['Y'] - (m_current * data['X'] + b_current)
    b_gradient = -(2 / len(data)) * error.sum()
    m_gradient = -(2 / len(data)) * (data['X'] * error).sum()
    new_b = b_current - learning_rate * b_gradient
    new_m = m_current - learning_rate * m_gradient
    return new_b, new_m

def gradient_descent(data, b, m, learning_rate, max_iter, visual=False):
    """
    Optimize the theta values for the univariate linear regression equation y^ = hθ(x) = θ0 + θ1x.

    Args:
        data: A pandas DataFrame with x and y data.
        b: Starting b (θ0) value.
        m: Starting m (θ1) value.
        learning_rate: Alpha value.
        max_iter: Maximum number of iterations.
        visual: If True, a GIF of the progression will be generated.

    Returns:
        Optimized values for θ0 and θ1.
    """
    folder_name = None
    if visual:
        folder_name = str(random.randint(10 ** 6, 10 ** 8))
        os.mkdir(folder_name)
    for i in range(max_iter):
        b, m = adjust_gradient(b, m, data, learning_rate)
        if visual:
            data.plot(kind='scatter', x='X', y='Y', figsize=(8, 8), marker='x', color='r')
            # Draw the fitted line over the actual range of the x data; plotting
            # a column on its own would use the row index as the x axis, which
            # only matches the data's x values by coincidence.
            x_ends = [data['X'].min(), data['X'].max()]
            plt.plot(x_ends, [m * x + b for x in x_ends], color='b')
            plt.grid()
            plt.title(f'y = {m:.4f}x + {b:.4f}\nCurrent cost: {compute_cost(b, m, data):.4f}\n'
                      f'Iteration: {i}\nAlpha = {learning_rate}')
            # Save into the frames folder instead of os.chdir(), so the
            # working directory is left untouched.
            plt.savefig(os.path.join(folder_name, f'{i}.png'))
            plt.close()
    if visual:
        # Sort frames numerically by iteration (a plain sort would put 10.png before 2.png).
        frames = sorted(os.listdir(folder_name), key=lambda name: int(name.split('.')[0]))
        images = [imageio.imread(os.path.join(folder_name, frame)) for frame in frames]
        imageio.mimsave(folder_name + '.gif', images)
    return b, m

if __name__ == '__main__':
    data = pd.read_csv('data.csv')
    data.columns = ['X', 'Y']
    learning_rate = 0.00001
    initial_b, initial_m = 0, 0
    max_it = 350
    b, m = gradient_descent(data, initial_b, initial_m, learning_rate, max_it, visual=True)
    print(f'y = {m}x + {b}')
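As a quick sanity check (not part of the course exercise), the result can be compared against the closed-form least-squares fit, which solves the same minimization directly. Here's a minimal sketch, assuming data.csv has the same two-column layout and the functions above are in scope; np.polyfit with deg=1 returns the slope and intercept of the best-fit line:

import numpy as np
import pandas as pd

data = pd.read_csv('data.csv')
data.columns = ['X', 'Y']

# Closed-form least-squares fit: coefficients come back highest degree first,
# so deg=1 yields (slope, intercept).
m_exact, b_exact = np.polyfit(data['X'], data['Y'], deg=1)
print(f'Closed form:      y = {m_exact}x + {b_exact}')

# With a small alpha and only 350 iterations the descent lands near, but not
# exactly at, the analytic optimum; more iterations close the gap.
b_gd, m_gd = gradient_descent(data, 0, 0, 0.00001, 350)
print(f'Gradient descent: y = {m_gd}x + {b_gd}')

If the two lines disagree wildly, that usually points at a diverging alpha rather than a bug in the gradient.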