Modern neural networks are powerful, but they are often over-parameterized. This leads to unnecessary computational cost, increased memory usage, and inefficiencies during deployment, especially in resource-constrained environments such as mobile devices or real-time systems.
This project presents a practical approach to addressing this issue by enabling a neural network to identify and remove less important connections automatically, resulting in a smaller, more efficient model without significantly compromising performance.
Traditional neural networks learn dense representations in which every weight contributes to the final output. In practice, however, a large portion of these weights is redundant. This redundancy:
- Increases inference time
- Consumes unnecessary memory
- Makes deployment harder on edge devices
- Adds cost in production environments
The core challenge is to retain only the most important connections while maintaining model accuracy.
This project implements a self-pruning neural network using a structured and controlled pruning strategy.
Instead of relying on unstable or heuristic-based pruning methods, the approach separates learning and pruning into two distinct phases: standard training followed by deterministic top-K pruning.
The model is first trained normally on the CIFAR-10 dataset. During this phase:
- All weights are active
- The network learns meaningful representations
- No artificial constraints are imposed
This ensures that the model reaches a stable and well-optimized state before pruning is applied.
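Phase 1 is ordinary supervised training. The project's actual model is a CNN on CIFAR-10; as an illustrative stand-in, the sketch below trains a tiny logistic-regression model on synthetic data (both the model and the data here are hypothetical) to show the defining property of this phase: every weight participates and no sparsity constraint is imposed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data standing in for CIFAR-10.
X = rng.normal(size=(200, 20))
true_w = rng.normal(size=20)
y = (X @ true_w > 0).astype(float)

# Phase 1: plain gradient-descent training -- all weights are
# active and no pruning-related constraints are imposed.
w = np.zeros(20)
lr = 0.1
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))   # sigmoid predictions
    w -= lr * X.T @ (p - y) / len(y)     # logistic-loss gradient step

# Training accuracy of the fully dense model.
acc = ((1.0 / (1.0 + np.exp(-(X @ w))) > 0.5) == (y > 0.5)).mean()
```

Only after this unconstrained training converges to a well-optimized dense model does pruning enter the picture.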
After training, pruning is applied using a deterministic method:
- Each weight is associated with a learned importance score (gate)
- A keep ratio is defined (e.g., 0.5 keeps 50% of weights)
- Only the top-K most important weights are retained
- Remaining weights are set to zero
This guarantees:
- Precise control over sparsity
- Stable and reproducible results
- Avoidance of model collapse
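The selection step above amounts to a top-K mask over importance scores. A minimal NumPy sketch (the function name and array shapes are illustrative, not the project's code):

```python
import numpy as np

def prune_by_importance(weights, scores, keep_ratio):
    """Zero all but the top-K weights, ranked by importance score.

    weights, scores : arrays of identical shape
    keep_ratio      : fraction of weights to retain (0.5 keeps 50%)
    """
    k = max(1, int(round(keep_ratio * scores.size)))
    # The score of the K-th most important weight becomes the cutoff.
    threshold = np.partition(scores.ravel(), -k)[-k]
    mask = (scores >= threshold).astype(weights.dtype)
    return weights * mask, mask
```

Because the cutoff is a hard rank threshold rather than a regularization penalty, the resulting sparsity is exact and reproducible (ties at the threshold may keep slightly more than K weights).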
To ensure reliability:
- Multiple pruning levels are tested (from 10% to 90% sparsity)
- Each configuration is run multiple times
- Results are averaged to reduce randomness
This produces a consistent and trustworthy understanding of how pruning affects performance.
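The evaluation protocol above (a sparsity sweep, repeated runs, averaged results) can be scaffolded as follows. Here `eval_fn` is a placeholder for re-evaluating the pruned model on the test set, and the random importance scores stand in for learned gates; both are assumptions for illustration.

```python
import numpy as np

SPARSITIES = [i / 10 for i in range(1, 10)]  # 10% .. 90% sparsity
N_RUNS = 3                                   # repetitions per level

def run_experiment(sparsity, seed, eval_fn):
    """One trial: seed the RNG, build the top-K mask, evaluate."""
    rng = np.random.default_rng(seed)
    scores = rng.random(1000)                # stand-in importance scores
    k = int(round((1 - sparsity) * scores.size))
    threshold = np.sort(scores)[-k]          # K-th largest score
    mask = scores >= threshold
    return eval_fn(mask)

def sweep(eval_fn):
    """Average the metric over N_RUNS seeds at each sparsity level."""
    return {
        s: np.mean([run_experiment(s, seed, eval_fn)
                    for seed in range(N_RUNS)])
        for s in SPARSITIES
    }
```

Plugging a real test-set evaluation into `eval_fn` yields one averaged accuracy per sparsity level, ready to plot against sparsity.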
The sparsity–accuracy plot can be summarized as follows:
- The model maintains near-constant accuracy up to approximately 70% sparsity
- This indicates a high degree of redundancy in the network
- Beyond 80% sparsity, accuracy begins to degrade noticeably
- At extreme pruning levels (around 90%), performance drops sharply
There exists a clear operating region where significant compression is possible without major performance loss. This region represents the optimal balance between efficiency and accuracy.
Compared with many pruning techniques, this approach offers several advantages:
- Stability: pruning uses an explicit selection mechanism rather than indirect regularization effects
- Controllability: the sparsity level is set directly through the keep ratio, making it easy to tailor the model to different deployment needs
- Accuracy retention: the model stays strong even after a large percentage of weights is removed
- Simplicity: the method is straightforward to implement and requires no complex modifications to the training process
This approach has direct implications for real-world systems:
- Faster inference due to reduced computation
- Lower memory footprint, enabling deployment on edge devices
- Reduced operational costs in large-scale systems
- Improved scalability for production environments
In production systems, efficiency translates directly into cost savings.
- In cloud environments, fewer computations mean reduced infrastructure usage
- In mobile applications, smaller models improve battery life and responsiveness
- In large-scale AI services, pruning can significantly lower serving costs
This makes pruning not just a technical optimization, but a business-critical capability.
This project demonstrates that neural networks can be significantly compressed without sacrificing much performance. By combining standard training with controlled pruning, it is possible to build models that are both efficient and reliable.
The results highlight an important insight: a large portion of the learned parameters is not essential for maintaining performance.
This opens the door to building smarter, leaner, and more deployable AI systems.
