This is the documentation for video_features, a library (😅) that lets you extract features from raw videos using pre-trained networks. So far, it supports several extractors that capture visual appearance, compute optical flow, and even extract audio features. The source code lives at v-iashin/video_features.

The source code was originally written to support the feature extraction pipeline for two of my papers (BMT and MDVC). This library (😅) somehow emerged out of that code and now implements more models.

Supported models

If you would like to see more models supported, please create an Issue.

Action Recognition

Sound Recognition

Optical Flow

Image Recognition