Ridurre - Convolutional Filter Pruning

- 4 mins

Pruning is the process when we try to shrink a network by removing the redundant and “not so significant” filters.

The Ridurre package is a mini-framework which you can easily use on your existing Keras models, and also you can define your own pruning methods without any struggle.

The use of the package can be visualized in the following way:

Therefore the steps in a pruning process:

Now we can see how the method is done with these simple steps, let’s talk about the metric which was mentioned before.

Pruning Metrics

You can think about almost anything which can be a metric for pruning. The question is, which one is better than the others. One of the best solution would be to measure the correlation of the filters and the loss/accuracy: which filter can I drop in order to have the least change in accuracy. This is called the Oracle Pruning and yes, this is a brute force solution which would take a lot of time to execute.

So we need other solutions which can reduce this time and this means, we need to make calculations on the filters which can rank those in a layer from least to most important. I would like to summarize my favourite two of them, which you can try out in the Ridurre package.

There are a lot others which can bring great results, but keep in mind that the baseline should be an already (mmanually) small network or the random pruning which is the simplest among them.

Case Study - Cifar10 with a ResNet20

Why Cifar10 and a shallow network?

(Unfortunately) with my hardware, this is what I could run, but in the future I would like to produce better results with not just classification, but object detection, semantic segmentation and keypoint detection.

With that being said I will let the plots do the talking for me. This case study is available as an example for the package.

As we can see the train and validation accuracy dropped at the end of the pruning, but by just a little in contrast with the parameter loss. Initially we had $220.000$ parameters which was reduced below $80.000$ which means almost $64%$ reduction. Also it is worth saying that this example is a pretty aggressive pruning as we remove filters from every layer with $\text{clustering factor} = 0.9$. So at the end of the pruning our first few layers will have $10$ filters instead of the original $16$.

A more sufficient process would be to freeze the first few layers and have a soft pruning factor. That way we could have shorter fine-tunings and the model could stay more accurate, but of course it would take more time to prune.

I hope you enjoyed this project and you will use or at least try out the Ridurre package.

References

[1] - Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks

comments powered by Disqus
rss facebook twitter github gitlab youtube mail spotify lastfm instagram linkedin google google-plus pinterest medium vimeo stackoverflow reddit quora quora