Machine Learning with TensorFlow JS

The main points of contention upon discussing the initial proposal were:

differentiation from CS 4770

differentiation from Node, another course taught at Forge

desire to use newer, more exciting AI tools such as OpenAI

simplicity for newer programmers

What you will learn

This course seeks to give students a broad understanding of machine learning topics. Depth will be developed through interactive projects.

The main goals of this course are:

Using machine-learning APIs (e.g., OpenAI) in your projects
The principles of building machine-learning systems
Principles of good metric design
Understanding machine learning issues

Why JavaScript?

JavaScript may seem like a strange choice for machine learning. It was designed for single-threaded use on the web and is not as performant as C, C++ or Rust.

However, JavaScript is portable, flexible and great for describing the structure of models. Thanks to WebGL, calculations can take place on the GPU. These considerations make TensorFlow JS, the JavaScript implementation of TensorFlow, even slightly faster for smaller models (only when models are far larger does performance drop by roughly 10 to 15 times).

Given this course’s goal to develop a foundational understanding of machine learning and TensorFlow as well as to create communicable projects, JavaScript is the clear choice.

Atwood’s law

Communicability of JavaScript

When building projects for a resume or otherwise, you should always be biased towards things that someone reviewing your programming portfolio can visualize. Without trying to toot my own horn too much, here is a project I built to analyze International Science and Engineering Fair (ISEF) projects, which featured an interactive, web-based visualization.

Graph visualization of ISEF analysis

Being able to send people both an article and an interactive demonstration was incredibly valuable for project communication.

Course Contents

From a general perspective, models will become simpler as you move throughout the semester, but you will have more fine-grained control over them.

The first guided lab / project will be to create a command-line wrapper for gpt-3.5-turbo (the model behind ChatGPT) using JavaScript and OpenAI’s library.
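As a rough idea of the shape this lab could take, here is a minimal sketch of the conversation bookkeeping such a wrapper needs. The names `createChatSession`, `send` and `fakeSend` are hypothetical and not part of OpenAI's library; `fakeSend` is a stand-in so the sketch runs offline without an API key.

```javascript
// Keeps the running conversation and delegates the network call to `send`,
// a hypothetical async function: (messages) => reply text.
function createChatSession(send, systemPrompt = 'You are a helpful assistant.') {
  const messages = [{ role: 'system', content: systemPrompt }];
  return {
    messages,
    async ask(userText) {
      messages.push({ role: 'user', content: userText });
      const reply = await send(messages);
      // Storing the reply lets the model see earlier turns on the next call
      messages.push({ role: 'assistant', content: reply });
      return reply;
    },
  };
}

// Offline stand-in for the API call: echoes the most recent message
const fakeSend = async (messages) => `echo: ${messages[messages.length - 1].content}`;

const session = createChatSession(fakeSend);
session.ask('Hello').then((reply) => console.log(reply));
```

In the actual lab, `send` would presumably wrap a chat completion request for gpt-3.5-turbo made through OpenAI's library, and a loop reading lines from the terminal would pass each one to `ask`.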

After this, we will venture into more foundational artificial intelligence including neural networks, natural language processing, time-series estimation, image processing and more. The goal of this course is to learn what there is to learn.

Schedule

Week	Lecture	Workshop	Readings	Take Home
1	What is Machine Learning?	JavaScript Demo	Introduction to JavaScript	None
2	What does Machine Learning Require?	`node` demo + functional programming	Functional Programming in JavaScript	ChatGPT in Terminal - Using OpenAI API
3	Neural Networks	TensorFlow JS Workshop	The Curse of Dimensionality	MNIST Classifier
4	Good data’s importance	Cleaning a real data set	Data collection practices and trends	None
5	Time-Series Processing	TensorFlow Visualization	Types of Attention	Stock Market Predictor
6	Natural Language Processing	Shakespeare Predictor	None	None
7	Unsupervised learning	Clustering Research Papers	Unsupervised PreTraining	None
8	Final Project Advising	None	None	None
9	Final Project Showcase	None	None	None

Sections

Each section here could be thought of as a slide or page that will be covered.¹

Introduction to Machine Learning

Breaking down recent advancements in AI
- ChatGPT
- AlphaGo
- AlphaZero
- AlphaFold
- DALL-E 2
Hype-management
Machine-Learning as Advanced Regression
The “Machine Learning Formula”
Machine learning model types (supervised, unsupervised and reinforcement learning)
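To illustrate the "machine learning as advanced regression" idea from the list above, here is a small sketch of ordinary least squares for a single feature in plain JavaScript. The function name `fitLine` is only for illustration, and the data is the same toy set used in the TensorFlow comparison later in this document.

```javascript
// Ordinary least squares for one feature: y ≈ slope * x + intercept
function fitLine(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;

  let covXY = 0; // sum of (x - meanX) * (y - meanY)
  let varX = 0;  // sum of (x - meanX)^2
  for (let i = 0; i < n; i++) {
    covXY += (xs[i] - meanX) * (ys[i] - meanY);
    varX += (xs[i] - meanX) ** 2;
  }

  const slope = covXY / varX;
  const intercept = meanY - slope * meanX;
  return { slope, intercept };
}

const { slope, intercept } = fitLine([1, 2, 3, 4], [2, 4, 6, 8]);
console.log(slope, intercept); // prints 2 0
```

A neural network trained on the same data is searching for the same kind of best-fit function, only with far more parameters and no closed-form solution.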

JavaScript Intro / Refresher

Functional programming
Interpreted languages
V8’s performance advantages
WebGL and WebGPU
Machine-Learning tasks being highly scalable
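As a taste of the functional programming style mentioned above, here is a short sketch that cleans and centers a list of measurements using `filter`, `reduce` and `map`. The `readings` data is made up for illustration.

```javascript
// Hypothetical sensor readings with missing entries marked as null
const readings = [12.5, null, 14.0, 13.5, null, 16.0];

// filter: keep only the entries that are present
const present = readings.filter((r) => r !== null);

// reduce: fold the list into a single number (here, the mean)
const mean = present.reduce((sum, r) => sum + r, 0) / present.length;

// map: transform every entry independently (here, subtract the mean)
const centered = present.map((r) => r - mean);

console.log(mean);     // prints 14
console.log(centered); // prints [ -1.5, 0, -0.5, 2 ]
```

Because each step returns a new array instead of mutating the old one, the steps can be chained or reordered without hidden side effects.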

Exploratory Data Analysis

Data preprocessing
Filling gaps
Visualization Techniques
The Curse of Dimensionality
scikit-learn’s Machine Learning Types
- Decision Trees
- Support Vector Machines
- Ensemble Learning
Precision versus recall
Loss functions
Hyper-Parameters and their optimization
Paper: The Unreasonable Effectiveness of Data

Neural Networks

History
Underlying Linear Algebra
Activation functions (Sigmoid, ReLU and Tanh)
Gradient descent from the chain rule
N-Adam, Adam and Adabost optimizers
Normalization’s importance
Encoding methods
Keras and TensorFlow’s Sequential Model
Confusion Matrices
Project 1: MNIST Handwritten Digit Classifier in TensorFlow JS
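The "gradient descent from the chain rule" topic above can be sketched without any library. This example assumes the simplest possible model, pred = w * x, with a mean squared error loss. The function name `trainSlope` and its default settings are illustrative choices, not course material.

```javascript
// Fits pred = w * x by gradient descent on the mean squared error.
function trainSlope(xs, ys, { learningRate = 0.01, epochs = 200 } = {}) {
  let w = 0; // start from an uninformed guess
  for (let epoch = 0; epoch < epochs; epoch++) {
    let grad = 0;
    for (let i = 0; i < xs.length; i++) {
      const pred = w * xs[i];
      // Chain rule: dLoss/dw = dLoss/dPred * dPred/dw
      const dLossDPred = 2 * (pred - ys[i]); // derivative of (pred - y)^2
      const dPredDW = xs[i];                 // derivative of w * x with respect to w
      grad += dLossDPred * dPredDW;
    }
    // Step against the average gradient
    w -= learningRate * (grad / xs.length);
  }
  return w;
}

const w = trainSlope([1, 2, 3, 4], [2, 4, 6, 8]);
console.log(w); // very close to 2
```

A deep network applies the same rule layer by layer, which is what TensorFlow automates when `model.fit` is called.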

Graph Computation

TensorBoard
Just-In-Time (JIT) Compilation
Optimizers
Image Processing
Convolutional Neural Networks (CNNs)
Keras Data Augmentation
Transfer Learning
Fine-Tuning Models
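To make the convolution behind CNNs concrete, here is a minimal sketch of a "valid" 2D convolution over nested arrays. Like most deep-learning libraries, it does not flip the kernel, so strictly speaking it is a cross-correlation. The `conv2d` helper, the toy image and the edge kernel are all illustrative.

```javascript
// Slides `kernel` over `image` and sums the element-wise products at each position.
function conv2d(image, kernel) {
  const kh = kernel.length;
  const kw = kernel[0].length;
  const outH = image.length - kh + 1;
  const outW = image[0].length - kw + 1;
  const out = [];
  for (let r = 0; r < outH; r++) {
    const row = [];
    for (let c = 0; c < outW; c++) {
      let sum = 0;
      for (let i = 0; i < kh; i++) {
        for (let j = 0; j < kw; j++) {
          sum += image[r + i][c + j] * kernel[i][j];
        }
      }
      row.push(sum);
    }
    out.push(row);
  }
  return out;
}

// A tiny image that is dark on the left and bright on the right
const image = [
  [0, 0, 1, 1],
  [0, 0, 1, 1],
  [0, 0, 1, 1],
];
// Responds where brightness increases from left to right
const verticalEdge = [
  [-1, 1],
  [-1, 1],
];

console.log(conv2d(image, verticalEdge)); // prints [ [ 0, 2, 0 ], [ 0, 2, 0 ] ]
```

The output is largest exactly where the edge sits. A CNN learns the kernel values instead of having them written by hand.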

Time-Series Processing

Structure of time-series networks
Recurrent Neural Networks (RNNs)
Limitations of RNNs
Training RNNs through creating windowed training data
TensorFlow Datasets
Spectrograms and Fast Fourier Transforms (FFTs)
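The windowed training data mentioned above can be sketched in a few lines. This assumes the simplest setup, where each window of `windowSize` consecutive values is used to predict the single value that follows it. `makeWindows` is a hypothetical helper name.

```javascript
// Turns one long series into (input window, next value) training pairs.
function makeWindows(series, windowSize) {
  const inputs = [];
  const targets = [];
  for (let i = 0; i + windowSize < series.length; i++) {
    inputs.push(series.slice(i, i + windowSize)); // values i .. i + windowSize - 1
    targets.push(series[i + windowSize]);         // the value right after the window
  }
  return { inputs, targets };
}

const { inputs, targets } = makeWindows([10, 11, 12, 13, 14], 3);
console.log(inputs);  // prints [ [ 10, 11, 12 ], [ 11, 12, 13 ] ]
console.log(targets); // prints [ 13, 14 ]
```

Each pair then becomes one training example, so a single series of length n yields n - windowSize examples.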

Natural Language Processing

Tokenizers
Memory gap
Memory Modules
Long Short-Term Memory (LSTM)
Gated Recurrent Unit (GRU)
Deep Learning
Paper: ImageNet Classification with Deep Convolutional Neural Networks and its influence
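As a rough illustration of the tokenizers topic above, here is a minimal word-level tokenizer. Real tokenizers usually work on sub-word pieces, so this is a simplification. The names `buildVocabulary` and `encode` and the `<unk>` placeholder for unknown words are choices made for this sketch.

```javascript
// Splits text into lowercase words separated by whitespace
const toWords = (text) => text.toLowerCase().split(/\s+/).filter(Boolean);

// Assigns each distinct word an integer id; id 0 is reserved for unknown words.
function buildVocabulary(texts) {
  const vocab = new Map([['<unk>', 0]]);
  for (const text of texts) {
    for (const word of toWords(text)) {
      if (!vocab.has(word)) vocab.set(word, vocab.size);
    }
  }
  return vocab;
}

// Replaces each word with its id, falling back to 0 for words not in the vocabulary.
function encode(text, vocab) {
  return toWords(text).map((word) => vocab.get(word) ?? 0);
}

const vocab = buildVocabulary(['To be or not to be']);
console.log(encode('to be or not to sleep', vocab)); // prints [ 1, 2, 3, 4, 1, 0 ]
```

The resulting integer ids are what a model actually consumes, typically after being mapped to learned embedding vectors.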

Garbage in, Garbage out

Reinforcement learning from human feedback
The Transformer Revolution
Transformer Architecture
Peak model size illustrated with a GPT comparison
Production issues with models
Attention saliency
Paper: Attention Is All You Need
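The core operation of the transformer architecture listed above is scaled dot-product attention. Here is a minimal sketch for a single query vector in plain JavaScript, leaving out the learned projections and multiple heads. The helper names `softmax`, `dot` and `attend` and the toy vectors are illustrative.

```javascript
// Turns raw scores into positive weights that sum to 1
function softmax(scores) {
  const max = Math.max(...scores); // subtracting the max avoids overflow
  const exps = scores.map((s) => Math.exp(s - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

const dot = (a, b) => a.reduce((sum, ai, i) => sum + ai * b[i], 0);

// Scores the query against every key, then mixes the values by those weights.
function attend(query, keys, values) {
  const scale = Math.sqrt(query.length);
  const weights = softmax(keys.map((key) => dot(query, key) / scale));
  const output = values[0].map((_, d) =>
    weights.reduce((sum, w, i) => sum + w * values[i][d], 0)
  );
  return { weights, output };
}

// The query is most similar to the first key, so the first value dominates
const result = attend([1, 0], [[1, 0], [0, 1]], [[10, 0], [0, 10]]);
console.log(result.weights, result.output);
```

The weights show how much each position contributes to the output, which is the raw material for inspecting what a model attends to.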

Deployment

TensorFlow Modules
- Serving
- JavaScript
- Lite
Parameter Reduction
Entropy-biased training
Online Algorithms
Data Rot

ChatGPT Policy

Use of ChatGPT will be heavily encouraged during this course. Because this course is designed to be an introduction to the field, many topics will be covered to give a sense of what is out there. Because of this, however, topics may not get the full attention they deserve. For this reason, tools like ChatGPT will be encouraged for topics where you know what to do, just maybe not how (e.g., how do I create an array in JavaScript).

Personally, I find one of the best uses of ChatGPT is to make programmers multilingual and multi-paradigm.

JavaScript and Python Similarities

For those concerned about the transferability of experience, here is the virtually identical code for creating a basic linear regression in both Python and JavaScript using TensorFlow and TensorFlowJS, respectively.

JavaScript

// Importing the required libraries
const tf = require('@tensorflow/tfjs')

// Defining the structure of the model: a single dense unit, i.e. a linear model
const model = tf.sequential()
model.add(tf.layers.dense({
  units: 1,
  inputShape: [1]
}))

// Compiling the model with a loss function and an optimizer
model.compile({
  loss: 'meanSquaredError',
  optimizer: 'sgd' // stochastic gradient descent
})

// Preparing the training data
const x = tf.tensor2d([[1], [2], [3], [4]], [4, 1]) // [4, 1] is the shape of the inputs
const y = tf.tensor2d([[2], [4], [6], [8]], [4, 1]) // [4, 1] is the shape of the targets

// Training the model
model.fit(x, y, {
  epochs: 100
}).then(() => {
  console.log('Model trained successfully')
})

Python

# Importing the required libraries
import tensorflow as tf
from tensorflow import keras

# Define the model architecture
model = keras.Sequential()
model.add(keras.layers.Dense(units=1, input_shape=[1]))

# Compile the model
model.compile(optimizer='sgd', loss='mean_squared_error')

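# Prepare the data as tensors, mirroring tf.tensor2d in the JavaScript version
x_train = tf.constant([[1.0], [2.0], [3.0], [4.0]])
y_train = tf.constant([[2.0], [4.0], [6.0], [8.0]])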

# Train the model
model.fit(x_train, y_train, epochs=100)

The code for both models is structurally identical, so time spent learning TensorFlow JS will transfer over.

TensorFlow can also be tricky to install, particularly on Windows¹. It is also large, taking up roughly 1.1 GB. TensorFlow JS is far more lightweight at only 9 MB, over 100x smaller.

// Saving models is really easy
await model.save('downloads://model_name')

Most Windows users typically solve this problem by using an online Jupyter notebook service such as Kaggle or Google Colaboratory. If you would like to try this, I much prefer Kaggle and have resolved many issues in the past by simply running the same code in a Kaggle notebook.

Most topics will only be covered briefly. A core design principle of this course is to give you broad knowledge of what is out there while going hands-on with a much smaller subset of topics.