Self-Pruning Neural Network

Overview

Modern neural networks are powerful, but they are often over-parameterized. This leads to unnecessary computational cost, increased memory usage, and inefficiencies during deployment, especially in resource-constrained environments such as mobile devices or real-time systems.

This project presents a practical approach to addressing this issue by enabling a neural network to identify and remove less important connections automatically, resulting in a smaller, more efficient model without significantly compromising performance.


Problem Statement

Traditional neural networks learn dense representations in which every weight contributes to the final output. In practice, however, a large portion of these weights is redundant. This redundancy:

  • Increases inference time
  • Consumes unnecessary memory
  • Makes deployment harder on edge devices
  • Adds cost in production environments

The core challenge is to retain only the most important connections while maintaining model accuracy.


Proposed Solution

This project implements a self-pruning neural network using a systematic, controlled pruning strategy.

Instead of relying on unstable or heuristic-based pruning methods, the approach separates learning and pruning into two clear phases:

1. Standard Training

The model is first trained normally on the CIFAR-10 dataset. During this phase:

  • All weights are active
  • The network learns meaningful representations
  • No artificial constraints are imposed

This ensures that the model reaches a stable and well-optimized state before pruning is applied.
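
For concreteness, a minimal PyTorch sketch of this phase is shown below. The SimpleCNN architecture, optimizer, and hyperparameters are illustrative assumptions, not necessarily this repository's exact setup.

```python
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

# Hypothetical small CNN for CIFAR-10; the project's actual architecture may differ.
class SimpleCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

def train_dense(epochs: int = 10, lr: float = 1e-3) -> nn.Module:
    train_set = torchvision.datasets.CIFAR10(
        root="./data", train=True, download=True,
        transform=transforms.ToTensor())
    loader = DataLoader(train_set, batch_size=128, shuffle=True)

    model = SimpleCNN()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()

    # Phase 1: all weights remain active; no pruning constraints are imposed.
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model
```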


2. Controlled Pruning via Top-K Selection

After training, pruning is applied using a deterministic method:

  • Each weight is associated with a learned importance score (gate)
  • A keep ratio is defined (e.g., 0.5 keeps 50% of weights)
  • Only the top-K most important weights are retained
  • Remaining weights are set to zero

This guarantees:

  • Precise control over sparsity
  • Stable and reproducible results
  • Avoidance of model collapse
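
A minimal PyTorch sketch of the top-K selection step is given below. It assumes the importance scores are available as a tensor with the same shape as the layer's weights; the gate values the project learns would play this role, and weight magnitude is a common stand-in when explicit gates are unavailable.

```python
import torch

def topk_prune(weight: torch.Tensor, importance: torch.Tensor,
               keep_ratio: float) -> torch.Tensor:
    """Zero out all but the top-K most important weights.

    `importance` is assumed to have the same shape as `weight`
    (e.g., a learned gate score per weight).
    """
    k = max(1, int(keep_ratio * weight.numel()))
    # Indices of the K largest importance scores (flattened view).
    flat_importance = importance.abs().flatten()
    topk_idx = torch.topk(flat_importance, k).indices
    # Binary mask: 1 = keep, 0 = prune.
    mask = torch.zeros_like(flat_importance)
    mask[topk_idx] = 1.0
    return weight * mask.view_as(weight)

# Example: keep_ratio = 0.5 retains exactly 50% of the weights,
# here using weight magnitude itself as the importance score.
w = torch.randn(64, 32, 3, 3)
pruned = topk_prune(w, importance=w, keep_ratio=0.5)
assert (pruned != 0).sum() == int(0.5 * w.numel())
```

Because top-K selection applies a hard threshold to a fixed ranking of scores, the resulting sparsity level is exact and repeatable for a given set of scores, which is what makes the method deterministic.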

3. Systematic Evaluation

To ensure reliability:

  • Multiple pruning levels are tested (from 10% to 90% sparsity)
  • Each configuration is run multiple times
  • Results are averaged across runs to reduce the effect of random variation

This produces a consistent and trustworthy understanding of how pruning affects performance.
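
The evaluation protocol can be sketched as a simple sweep, as below. The `train_and_evaluate` callback is a hypothetical stand-in for the project's full train, prune, and test pipeline; it is assumed to return test accuracy for a given keep ratio and random seed.

```python
import statistics

def sweep(train_and_evaluate, seeds=(0, 1, 2)) -> dict:
    """Measure mean accuracy at each pruning level, averaged over seeds."""
    results = {}
    for sparsity in [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]:
        keep_ratio = 1.0 - sparsity
        accs = [train_and_evaluate(keep_ratio, seed) for seed in seeds]
        # Average over runs to smooth out initialization randomness.
        results[sparsity] = statistics.mean(accs)
    return results
```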


Results and Analysis

The following plot summarizes the relationship between sparsity and accuracy:

[Figure: accuracy vs. sparsity trade-off]

Key Observations

  • The model maintains near-constant accuracy up to approximately 70% sparsity
  • This indicates a high degree of redundancy in the network
  • Beyond 80% sparsity, accuracy begins to degrade noticeably
  • At extreme pruning levels (around 90%), performance drops sharply

Interpretation

There exists a clear operating region where significant compression is possible without major performance loss. This region represents the optimal balance between efficiency and accuracy.


Advantages of This Approach

1. Deterministic and Stable

Unlike many pruning techniques, this method avoids instability by using a clear selection mechanism rather than relying on indirect regularization effects.

2. Controllable Sparsity

The pruning level is explicitly defined through the keep ratio, making it easy to tailor the model for different deployment needs.

3. Preserves Performance

The model retains strong accuracy even after removing a large percentage of weights.

4. Simple and Practical

The approach is straightforward to implement and does not require complex modifications to the training process.


Practical Impact

This approach has direct implications for real-world systems:

  • Faster inference due to reduced computation
  • Lower memory footprint, enabling deployment on edge devices
  • Reduced operational costs in large-scale systems
  • Improved scalability for production environments

Business Perspective

In production systems, efficiency translates directly into cost savings.

  • In cloud environments, fewer computations mean reduced infrastructure usage
  • In mobile applications, smaller models improve battery life and responsiveness
  • In large-scale AI services, pruning can significantly lower serving costs

This makes pruning not just a technical optimization, but a business-critical capability.


Conclusion

This project demonstrates that neural networks can be significantly compressed without sacrificing much performance. By combining standard training with controlled pruning, it is possible to build models that are both efficient and reliable.

The results highlight an important insight:
a large portion of the learned parameters is not essential for maintaining performance.

This opens the door to building smarter, leaner, and more deployable AI systems.
