Digit and English Letter Classification Convolutional Neural Network (Source Code Included)
Note: I built an app, entirely on my own, that classifies a handwritten digit
or uppercase letter using a trained CNN model.
Here's the app classifying a digit:
Here's the app classifying an English letter:
Model & System Overview
This project implements a custom Convolutional Neural Network (CNN) for
handwritten digit (0–9) and uppercase letter (A–Z) recognition. The model
accepts a 28×28 grayscale image and outputs a predicted class along with a
confidence score.
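To make that output contract concrete, here is a minimal sketch of how a softmax vector could be turned into a label and a confidence score. The function name `decode_prediction` and the three-entry label list are illustrative assumptions, not code from the app; the real model uses its own 47-entry label mapping.

```python
import numpy as np

def decode_prediction(probs, labels):
    """Return (label, confidence) for a single softmax output vector."""
    probs = np.asarray(probs, dtype=np.float64)
    if probs.ndim != 1 or probs.shape[0] != len(labels):
        raise ValueError("probs must be 1-D with one entry per label")
    idx = int(np.argmax(probs))          # index of the most likely class
    return labels[idx], float(probs[idx])

# Hypothetical three-class example; the real model emits 47 probabilities.
label, confidence = decode_prediction([0.1, 0.7, 0.2], ["0", "A", "B"])
```

The confidence here is simply the softmax probability of the winning class, so it is only as well calibrated as the model itself.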
CNN Architecture
The network consists of three convolutional blocks with increasing channel
depth (32 → 64 → 128), each followed by batch normalization and max pooling.
This design allows the model to progressively learn low-level strokes,
mid-level shapes, and higher-level character structures.
After feature extraction, the model flattens the activations and passes them
through a fully connected layer with dropout regularization to mitigate
overfitting. A final softmax layer outputs probabilities across 47 classes.
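The size of the flattened feature vector follows from the pooling steps. The sketch below works through the arithmetic, assuming 'same'-padded convolutions and the Keras default 2×2 max pooling used in the source code at the end of this post; the helper name `pooled_sizes` is my own.

```python
def pooled_sizes(input_size, num_blocks, pool=2):
    """Spatial size after each block.

    A 'same'-padded convolution keeps the size; max pooling with the
    default 'valid' padding floors size / pool.
    """
    sizes = []
    size = input_size
    for _ in range(num_blocks):
        size = size // pool
        sizes.append(size)
    return sizes

sizes = pooled_sizes(28, 3)                    # 28 -> 14 -> 7 -> 3
flat_features = sizes[-1] * sizes[-1] * 128    # 3 * 3 * 128 channels
dense_params = flat_features * 256 + 256       # weights + biases of Dense(256)
```

Under these assumptions the fully connected layer holds most of the model's parameters, which is one reason dropout is applied there.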
Training & Data Preprocessing
The model was trained on Kaggle using MNIST-style datasets. During
experimentation, I observed that many MNIST samples are not oriented in a
way that directly matches natural human drawing input. To address this, I
applied orientation-aware preprocessing during training so that inference
on user-drawn inputs requires only normalization, without any additional
rotation or alignment steps.
This approach simplifies the prediction pipeline and reduces latency during
real-time inference in the web application.
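As a rough illustration of this split, the sketch below corrects orientation only at training time and leaves inference with normalization alone. It assumes the dataset images are stored transposed (rows and columns swapped); the post does not state the exact transform, so treat the transpose and both function names as assumptions.

```python
import numpy as np

def prepare_inference_batch(images):
    """Inference-time step: scale to [0, 1] and add the channel axis only."""
    scaled = np.asarray(images, dtype=np.float32) / 255.0
    return scaled[..., np.newaxis]               # shape (N, 28, 28, 1)

def prepare_training_batch(images):
    """Training-time step: fix orientation first, then normalize.

    Assumes `images` has shape (N, 28, 28) with values in 0..255 and that
    each sample is stored transposed (an assumption for illustration).
    """
    upright = np.transpose(images, (0, 2, 1))    # swap rows and columns per image
    return prepare_inference_batch(upright)
```

Because the orientation fix is applied once to the training data, the model learns characters as a user would draw them, and the request path stays a single normalization step.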
Deployment & Compatibility Challenges
After exporting the trained model as an .h5 file, I implemented a
Python inference service to load the model and perform predictions. However,
the model was trained using a newer Keras version (3.10.0, TensorFlow ≥ 2.14),
while my CentOS 7 VPS environment was constrained to an older TensorFlow
release (2.6.2), making a direct upgrade infeasible.
To resolve this, I containerized the inference service using Docker, allowing
the application to run with a compatible Python and TensorFlow runtime
independent of the host system. This ensured reproducibility and eliminated
version conflicts across environments.
Frontend Integration
A React-based frontend was built to allow users to draw digits or uppercase
letters directly in the browser. The drawing is normalized and sent to the
Python backend via a REST API, which returns both the predicted label and a
confidence score.
During integration, the browser blocked requests from the frontend because of
cross-origin (CORS) restrictions. I resolved this by explicitly configuring
CORS middleware in the FastAPI backend so that the React frontend is allowed
to call the Dockerized inference service.
The implementation is straightforward. Below is the source code for the CNN I built in Python:
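# Imports assume the tf.keras API; with standalone Keras 3, import from `keras` instead.
from tensorflow.keras.layers import (
    BatchNormalization, Conv2D, Dense, Dropout, Flatten, MaxPooling2D,
)
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam

def build_same_model():
    model = Sequential([
        # Block 1: 28x28x1 input, 32 filters
        Conv2D(32, 3, padding='same', activation='relu',
               input_shape=(28, 28, 1)),
        BatchNormalization(),
        MaxPooling2D(),
        # Block 2: 64 filters
        Conv2D(64, 3, padding='same', activation='relu'),
        BatchNormalization(),
        MaxPooling2D(),
        # Block 3: 128 filters
        Conv2D(128, 3, padding='same', activation='relu'),
        BatchNormalization(),
        MaxPooling2D(),
        # Classifier head: dense layer with dropout, then 47-way softmax
        Flatten(),
        Dense(256, activation='relu'),
        Dropout(0.5),
        Dense(47, activation='softmax')
    ])
    # categorical_crossentropy expects one-hot encoded labels
    model.compile(
        optimizer=Adam(learning_rate=1e-3),
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )
    return model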
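Because the model is compiled with categorical cross-entropy, its targets need to be one-hot vectors with 47 columns. The helper below is a small sketch of that encoding; the name `one_hot` is my own and is not part of the original training code.

```python
import numpy as np

NUM_CLASSES = 47  # matches the final Dense(47) layer

def one_hot(labels, num_classes=NUM_CLASSES):
    """Convert integer class ids of shape (N,) to one-hot rows of shape (N, num_classes)."""
    labels = np.asarray(labels, dtype=np.int64)
    if labels.min() < 0 or labels.max() >= num_classes:
        raise ValueError("label id out of range")
    return np.eye(num_classes, dtype=np.float32)[labels]

y = one_hot([0, 10, 46])   # one row per label, a single 1.0 in each row
```

An alternative would be to keep integer labels and compile with sparse categorical cross-entropy instead, which avoids the encoding step.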