2017-08-07T06:38:45Z

Flask Video Streaming Revisited

Flask Video Streaming Server

Almost three years ago I wrote an article on this blog titled Video Streaming with Flask, in which I presented a very modest streaming server that used a Flask generator view function to stream a Motion-JPEG stream to web browsers. My intention with that article was to show a simple, yet practical use of streaming responses, a feature of Flask that is not very well known.

That article is extremely popular, not because it teaches how to implement streaming responses, but because a lot of people want to implement streaming video servers. Unfortunately, my focus when I wrote the article was not on creating a robust video server, so I frequently get questions and requests for advice from those who want to use the video server for a real application and quickly find its limitations. So today I'm going to revisit my streaming video server and describe a few improvements I've made to it.

Recap: Using Flask's Streaming for Video

I recommend you read the original article to familiarize yourself with my project. In short, this is a Flask server that uses a streaming response to provide a stream of video frames captured from a camera in Motion JPEG format. This format is very simple and not the most efficient, but it has the advantage that all browsers support it natively, without any client-side scripting required. It is a fairly common format used by security cameras for that reason. To demonstrate the server, I implemented a camera driver for a Raspberry Pi with its camera module. For those who didn't have a Pi with a camera at hand, I also wrote an emulated camera driver that streams a sequence of jpeg images stored on disk.
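
As a refresher, the heart of that server is a route that returns a streaming response, with the frames supplied by a generator function. Here it is, slightly condensed from the original article:

from flask import Flask, Response
from camera import Camera

app = Flask(__name__)

def gen(camera):
    """Video streaming generator function."""
    while True:
        frame = camera.get_frame()
        yield (b'--frame\r\n'
               b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n')

@app.route('/video_feed')
def video_feed():
    """Video streaming route. Put this in the src attribute of an img tag."""
    return Response(gen(Camera()),
                    mimetype='multipart/x-mixed-replace; boundary=frame')

On the client side, all that is needed is an img tag that has this route as its source.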

Running the Camera Only When There Are Viewers

One aspect of the original streaming server that people did not like is that the background thread that captures video frames from the Raspberry Pi camera starts when the first client connects to the stream, but then it never stops. A more efficient way to handle this background thread is to only have it running while there are viewers, so that the camera can be turned off when nobody is connected.

I implemented this improvement a while ago. The idea is that every time a client accesses a frame, the time of that access is recorded. The camera thread checks this timestamp, and if it is more than ten seconds old, it exits. With this change, when the server runs for ten seconds without any clients it shuts its camera off and stops all background activity. As soon as a client connects again the thread is restarted.

Here is a condensed view of the changes:

class Camera(object):
    # ...
    last_access = 0  # time of last client access to the camera

    # ...

    def get_frame(self):
        Camera.last_access = time.time()
        # ...

    @classmethod
    def _thread(cls):
        with picamera.PiCamera() as camera:
            # ...
            for foo in camera.capture_continuous(stream, 'jpeg', use_video_port=True):
                # ...
                # if there hasn't been any clients asking for frames in
                # the last 10 seconds stop the thread
                if time.time() - cls.last_access > 10:
                    break
        cls.thread = None
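
The other half of the logic, restarting the thread when a client connects, lives in get_frame(). Roughly, it looks like this (a sketch; the exact code is in the GitHub repository):

import threading
import time

class Camera(object):
    thread = None  # background thread that reads frames from camera
    frame = None  # current frame is stored here by background thread
    last_access = 0  # time of last client access to the camera

    def get_frame(self):
        Camera.last_access = time.time()
        if Camera.thread is None:
            # start the background thread on the first request, or after
            # an inactivity shutdown; _thread() is the classmethod shown above
            Camera.thread = threading.Thread(target=self._thread)
            Camera.thread.start()

            # wait until the first frame is available
            while Camera.frame is None:
                time.sleep(0)
        return Camera.frame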

Simplifying the Camera Class

A common problem that a lot of people mentioned to me is that it is hard to add support for other cameras. The Camera class that I implemented for the Raspberry Pi is fairly complex because it uses a background capture thread to talk to the camera hardware.

To make this easier, I decided to move the generic functionality that does all the background processing of frames to a base class, leaving only the task of getting the frames from the camera to implement in subclasses. The new BaseCamera class in module base_camera.py implements this base class. Here is what this generic thread looks like:

class BaseCamera(object):
    thread = None  # background thread that reads frames from camera
    frame = None  # current frame is stored here by background thread
    last_access = 0  # time of last client access to the camera
    # ...

    @staticmethod
    def frames():
        """Generator that returns frames from the camera."""
        raise RuntimeError('Must be implemented by subclasses.')

    @classmethod
    def _thread(cls):
        """Camera background thread."""
        print('Starting camera thread.')
        frames_iterator = cls.frames()
        for frame in frames_iterator:
            BaseCamera.frame = frame

            # if there hasn't been any clients asking for frames in
            # the last 10 seconds then stop the thread
            if time.time() - BaseCamera.last_access > 10:
                frames_iterator.close()
                print('Stopping camera thread due to inactivity.')
                break
        BaseCamera.thread = None

This new version of the Raspberry Pi's camera thread has been made generic with the use of yet another generator. The thread expects the frames() method (which is a static method) to be a generator implemented in subclasses that are specific to different cameras. Each item returned by the iterator must be a video frame, in jpeg format.

Here is how the emulated camera that returns static images can be adapted to work with this base class:

import time

from base_camera import BaseCamera

class Camera(BaseCamera):
    """An emulated camera implementation that streams a repeated sequence of
    files 1.jpg, 2.jpg and 3.jpg at a rate of one frame per second."""
    imgs = [open(f + '.jpg', 'rb').read() for f in ['1', '2', '3']]

    @staticmethod
    def frames():
        while True:
            time.sleep(1)
            yield Camera.imgs[int(time.time()) % 3]

Note how in this version the frames() generator forces a frame rate of one frame per second by simply sleeping one second between frames.

The camera subclass for the Raspberry Pi camera also becomes much simpler with this redesign:

import io
import time
import picamera
from base_camera import BaseCamera

class Camera(BaseCamera):
    @staticmethod
    def frames():
        with picamera.PiCamera() as camera:
            # let camera warm up
            time.sleep(2)

            stream = io.BytesIO()
            for foo in camera.capture_continuous(stream, 'jpeg', use_video_port=True):
                # return current frame
                stream.seek(0)
                yield stream.read()

                # reset stream for next frame
                stream.seek(0)
                stream.truncate()

OpenCV Camera Driver

A fair number of users complained that they did not have access to a Raspberry Pi equipped with a camera module, so they could not try this server with anything other than the emulated camera. Now that adding camera drivers is much easier, I wanted to also have a camera based on OpenCV, which supports most USB webcams and laptop cameras. Here is a simple camera driver for it:

import cv2
from base_camera import BaseCamera

class Camera(BaseCamera):
    @staticmethod
    def frames():
        camera = cv2.VideoCapture(0)
        if not camera.isOpened():
            raise RuntimeError('Could not start camera.')

        while True:
            # read current frame
            ok, img = camera.read()
            if not ok:
                raise RuntimeError('Failed to read frame from camera.')

            # encode as a jpeg image and return it
            yield cv2.imencode('.jpg', img)[1].tobytes()

With this class, the first video camera reported by your system will be used. If you are using a laptop, this is likely your internal camera. If you are going to use this driver, you need to install the OpenCV bindings for Python:

$ pip install opencv-python
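
If the first camera is not the one you want, the device index passed to cv2.VideoCapture() can come from the environment instead of being hardcoded. Here is a possible variation (the OPENCV_CAMERA_SOURCE variable is my own invention, not something this project defines):

import os

import cv2
from base_camera import BaseCamera

class Camera(BaseCamera):
    @staticmethod
    def frames():
        # hypothetical tweak: select the camera with an environment
        # variable, defaulting to the first device
        source = int(os.environ.get('OPENCV_CAMERA_SOURCE', 0))
        camera = cv2.VideoCapture(source)
        if not camera.isOpened():
            raise RuntimeError('Could not start camera.')

        while True:
            ok, img = camera.read()
            if not ok:
                raise RuntimeError('Failed to read frame from camera.')
            yield cv2.imencode('.jpg', img)[1].tobytes()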

Camera Selection

The project now supports three different camera drivers: emulated, Raspberry Pi and OpenCV. To make it easier to select which driver to use without having to edit the code, the Flask server looks for a CAMERA environment variable to know which class to import. This variable can be set to pi or opencv, and if it isn't set, then the emulated camera is used by default.

The way this is implemented is fairly generic. Whatever the value of the CAMERA environment variable is, the server will expect the driver to be in a module named camera_$CAMERA.py. The server will import this module and then look for a Camera class in it. The logic is actually quite simple:

from importlib import import_module
import os

# import camera driver
if os.environ.get('CAMERA'):
    Camera = import_module('camera_' + os.environ['CAMERA']).Camera
else:
    from camera import Camera

For example, to start an OpenCV session from bash, you can do this:

$ CAMERA=opencv python app.py

From a Windows command prompt you can do the same as follows:

> set CAMERA=opencv
> python app.py

Performance Improvements

Another observation that was made a few times is that the server consumes a lot of CPU. The reason for this is that there is no synchronization between the background thread capturing frames and the generator feeding those frames to the client. Both run as fast as they can, without regard for the speed of the other.

In general it makes sense for the background thread to run as fast as possible, because you want the frame rate to be as high as possible for each client. But you definitely do not want the generator that delivers frames to a client to ever run at a faster rate than the camera is producing frames, because that would mean duplicate frames will be sent to the client. While these duplicates do not cause any problems, they increase CPU and network usage without any benefit.

So there needs to be a mechanism by which the generator only delivers original frames to the client. If the delivery loop inside the generator is faster than the frame rate of the camera thread, then the generator should wait until a new frame is available, so that it paces itself to match the camera rate. On the other hand, if the delivery loop runs slower than the camera thread, then it should never fall behind, and instead skip frames so that it always delivers the most current frame. Sounds complicated, right?

What I wanted as a solution here is to have the camera thread signal the running generators when a new frame is available. The generators can then block while they wait for the signal before they deliver the next frame. Looking through the synchronization primitives, I found that threading.Event is the one that matches this behavior. So basically, each generator should have an event object, and the camera thread should signal all the active event objects to inform the running generators when a new frame is available. The generators deliver the frame and reset their event objects, and then go back to wait on them for the next frame.

To avoid having to add event handling logic in the generator, I decided to implement a customized event class that uses the thread id of the caller to automatically create and manage a separate event for each client thread. This is somewhat complex, to be honest, but the idea came from how Flask's context local variables are implemented. The new event class is called CameraEvent, and has wait(), set(), and clear() methods. With the support of this class, the rate control mechanism can be added to the BaseCamera class:

class CameraEvent(object):
    # ...

class BaseCamera(object):
    # ...
    event = CameraEvent()

    # ...

    def get_frame(self):
        """Return the current camera frame."""
        BaseCamera.last_access = time.time()

        # wait for a signal from the camera thread
        BaseCamera.event.wait()
        BaseCamera.event.clear()

        return BaseCamera.frame

    @classmethod
    def _thread(cls):
        # ...
        for frame in frames_iterator:
            BaseCamera.frame = frame
            BaseCamera.event.set()  # send signal to clients

            # ...

The magic in the CameraEvent class enables multiple clients to wait individually for a new frame. The wait() method uses the current thread id to allocate an individual event object for each client and waits on it. The clear() method resets the event associated with the caller's thread id, so that each generator thread can run at its own speed. The set() method, called by the camera thread, sends a signal to the event objects allocated for all clients, and also removes any events that aren't being serviced by their owners, because that means the clients associated with those events have closed the connection and are gone. You can see the implementation of the CameraEvent class in the GitHub repository.
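
For reference, here is a sketch of what such a class can look like. The authoritative version is the one in the repository; this approximation just conveys the idea:

import threading
import time

class CameraEvent(object):
    """An Event-like class that signals all active clients when a new
    frame is available."""
    def __init__(self):
        self.events = {}

    def wait(self):
        """Invoked from each client's thread to wait for the next frame."""
        ident = threading.get_ident()
        if ident not in self.events:
            # new client: give it its own event, plus the time of the
            # last set() call so stale clients can be detected
            self.events[ident] = [threading.Event(), time.time()]
        return self.events[ident][0].wait()

    def set(self):
        """Invoked by the camera thread when a new frame is available."""
        now = time.time()
        remove = None
        for ident, event in self.events.items():
            if not event[0].is_set():
                # this client is waiting: wake it up and record the time
                event[0].set()
                event[1] = now
            elif now - event[1] > 5:
                # this client has not processed a frame in over 5 seconds,
                # so assume it is gone and schedule its event for removal
                remove = ident
        if remove:
            del self.events[remove]

    def clear(self):
        """Invoked from each client's thread after a frame was processed."""
        self.events[threading.get_ident()][0].clear()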

To give you an idea of the magnitude of the performance improvement, consider that the emulated camera driver consumed about 96% CPU before this change because it was constantly sending duplicate frames at a rate much higher than the one frame per second being produced. After these changes, the same stream consumes about 3% CPU. In both cases there was a single client viewing the stream. The OpenCV driver went from about 45% CPU down to 12% for a single client, with each new client adding about 3%.

Production Web Server

Lastly, I think if you plan to use this server for real, you should use a more robust web server than the one that comes with Flask. A very good choice is to use Gunicorn:

$ pip install gunicorn

With Gunicorn, you can run the server as follows (remember to set the CAMERA environment variable to the selected camera driver first):

$ gunicorn --threads 5 --workers 1 --bind 0.0.0.0:5000 app:app

The --threads 5 option tells Gunicorn to handle at most five concurrent requests, which means that up to five clients can watch the stream simultaneously. The --workers 1 option limits the server to a single process. This is required because only one process can connect to a camera to capture frames.

You can increase the number of threads somewhat, but if you find that you need a large number, it will probably be more efficient to use an asynchronous framework instead of threads. Gunicorn can be configured to work with the two frameworks that are compatible with Flask: gevent and eventlet. To make the video streaming server work with these frameworks, there is one small addition to the camera background thread:

class BaseCamera(object):
    # ...
    @classmethod
    def _thread(cls):
        # ...
        for frame in frames_iterator:
            BaseCamera.frame = frame
            BaseCamera.event.set()  # send signal to clients
            time.sleep(0)
            # ...

The only change here is the addition of a sleep(0) in the camera capture loop. This is required for both eventlet and gevent, because they use cooperative multitasking. The way these frameworks achieve concurrency is by having each task release the CPU either by calling a function that does network I/O or explicitly. Since there is no I/O here, the sleep call is what achieves the CPU release.

Now you can run Gunicorn with the gevent or eventlet workers as follows:

$ CAMERA=opencv gunicorn --worker-class gevent --workers 1 --bind 0.0.0.0:5000 app:app

Here the --worker-class gevent option configures Gunicorn to use the gevent framework (you must install it with pip install gevent). If you prefer, --worker-class eventlet is also available. The --workers 1 limits to a single process as above. The eventlet and gevent workers in Gunicorn allocate a thousand concurrent clients by default, so that should be much more than what a server of this kind is able to support anyway.

Conclusion

All the changes described above are incorporated in the GitHub repository. I hope you get a better experience with these improvements.

Before concluding, I want to provide quick answers to other questions I have received about this server:

  • How to force the server to run at a fixed frame rate? Configure your camera to deliver frames at that rate, then sleep during each iteration of the camera capture loop for whatever time remains of the frame period, so that the loop also runs at that rate (see the sketch after this list).
  • How to increase the frame rate? The server as described here delivers frames as fast as possible. If you need better frame rates, you can try configuring your camera for a smaller frame size.
  • How to add sound? That's really difficult. The Motion JPEG format does not support audio. You are going to need to stream the audio separately, and then add an audio player to the HTML page. Even if you manage to do all this, synchronization between audio and video is not going to be very accurate.
  • How to save the stream to disk on the server? Just save the sequence of JPEG files in the camera thread. For this you may want to remove the automatic mechanism that ends the background thread when there are no viewers.
  • How to add playback controls to the video player? Motion JPEG was not made for interactive operation by the user, but if you are set on doing this, with a little bit of trickery it may be possible to implement playback controls. If the server saves all jpeg images, then a pause can be implemented by having the server deliver the same frame over and over. When the user resumes playback, the server will have to deliver "old" images that are loaded from disk, since now the user would be in DVR mode instead of watching the stream live. This could be a very interesting project!
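
To illustrate the fixed frame rate idea from the first question above, here is a sketch of an OpenCV frames() generator that paces itself at a target rate. The FPS value and the pacing logic are my own addition, not part of the project:

import time

import cv2
from base_camera import BaseCamera

class Camera(BaseCamera):
    FPS = 10  # hypothetical target frame rate

    @staticmethod
    def frames():
        camera = cv2.VideoCapture(0)
        if not camera.isOpened():
            raise RuntimeError('Could not start camera.')

        period = 1.0 / Camera.FPS
        while True:
            start = time.time()
            ok, img = camera.read()
            if not ok:
                raise RuntimeError('Failed to read frame from camera.')
            yield cv2.imencode('.jpg', img)[1].tobytes()

            # sleep for whatever is left of this frame's time slot
            delay = period - (time.time() - start)
            if delay > 0:
                time.sleep(delay)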

That is all for now. If you have other questions please let me know!

207 comments

  • #126 Eli said 2019-05-02T01:40:59Z

    Hi, I read your previous tutorial and I implemented streaming frames to the browser with OpenCV. My application involves streaming frames on the home page, and another page to take pictures of the user with the webcam. The problem here is, when I switch/move from the other page to the home page (where I stream the frames) I get an error like this:

    VIDEOIO ERROR: V4L2: Pixel format of incoming image is unsupported by OpenCV Unable to stop the stream: Device or resource busy video stream started OpenCV(3.4.1) Error: Assertion failed (scn == 3 || scn == 4) in cvtColor, file /home/eli/cv/opencv-3.4.1/modules/imgproc/src/color.cpp, line 11115

    When I switch from the home page to the other page (where I need to capture pictures with the camera), the camera feed doesn't show at all. I'm a newbie in this whole thing, but I believe OpenCV isn't releasing the camera (correct me if I'm wrong). Based on one of your answers above, how can I stop/release the webcam when I switch between pages (especially from the home page to the other page)? PS: I'll be grateful for any code reference. Thanks in advance

  • #127 Miguel Grinberg said 2019-05-02T02:07:08Z

    @Eli: stopping the stream when you leave the page is not a good solution. Consider what happens if you have two connected clients. When one of them leaves to the second page, I assume you do not want to interrupt the stream of the other client, right?

    Since you are already taking constant pictures to get the video stream, why don't you just take the latest image from the stream as the captured image without stopping it?
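
    For example, a rough sketch of that idea, reusing the app and the Camera class from this article (untested):

        from flask import Response

        @app.route('/snapshot')
        def snapshot():
            # grab the most recent frame from the already-running stream
            # and return it as a single still image
            return Response(Camera().get_frame(), mimetype='image/jpeg')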

  • #128 Fbeat said 2019-05-08T08:57:02Z

    Hi Miguel, thanks for the write up.

    I'm having trouble understanding why separate event objects are created for each client.

    In my mind, each client will be running get_frame() infinitely and waiting at line 78 for the background thread to call set().

    Because .set() is called simultaneously on all events, what's the difference between calling .wait() on a client-specific event object vs. one class-level event object used by every client at line 78 of BaseCamera?

    I have to be missing something easy because it's really confusing me.

    Thanks!

  • #129 Miguel Grinberg said 2019-05-08T17:33:53Z

    @Fbeat: I think it should be possible to implement it with a single event. In this code I'm also keeping track of clients that stop requesting frames, and that requires a separate event object per client, but if that is not a desired feature, a single event should work too, I think.
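
    Something along these lines, roughly (untested, and without the stale-client cleanup that the per-client events make possible):

        import threading
        import time

        class BaseCamera(object):
            thread = None
            frame = None
            last_access = 0
            event = threading.Event()  # one shared event for every client

            def get_frame(self):
                """Return the current camera frame."""
                BaseCamera.last_access = time.time()
                # block until the camera thread publishes a new frame
                BaseCamera.event.wait()
                # caveat: with several clients, one client clearing the event
                # can make another client skip a frame; usually harmless
                BaseCamera.event.clear()
                return BaseCamera.frame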

  • #130 Mayuresh Kadam said 2019-06-13T05:29:20Z

    Excellent post on Flask video streaming, Miguel. However, how do we add a start and a stop button on the server itself? I am using your code to run my YOLO model and I am saving the detected objects in a CSV file. This CSV file is being displayed in the form of a table, next to where the webcam is running. I want to add a start button as well as a stop button so that the contents of the CSV file can be displayed with ease (as every time I have to stop the script from running altogether in cmd and start it again). Your help would be much appreciated, thanks!

  • #131 Miguel Grinberg said 2019-06-14T16:55:49Z

    @Mayuresh: the video stops on its own when there are no clients watching. Look at the code that stops the server in that case, it can be adapted to stop on demand if you like.

  • #132 Rudimar said 2019-06-24T18:44:43Z

    Thanks for the amazing post, Mr. Grinberg! I just realized a problem that may happen when someone opens it in multiple tabs: the events array size can change on the fly while it is being accessed inside the for loop, for example. In order to cope with it, I created a threading.Lock() object for the CameraEvent class and used the acquire() and release() methods when accessing the array.

    Best regards.

  • #133 Georg Manz said 2019-07-23T11:44:47Z

    Hi Miguel, thank you for your nice piece of code. It works really well. But I want to stream the pictures of 2 or 3 webcams. Can you help me with that? I have issues with the static methods. I thought that I only had to make instances of the camera class, but with the static variables for the camera source it won't work. I'm new to the Python language.

    Do you have an idea for my issue? I tried to change the Camera class to a normal instance class and the frames method to a normal method. Nothing helped.

  • #134 Miguel Grinberg said 2019-07-23T13:23:53Z

    @Georg: the easiest is to stream each camera from a different application. But if you want to host all of them together, you can move the class variables to regular instance variables and initialize them when you create each camera instance.
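
    A rough sketch of that second option (it assumes the class attributes of BaseCamera, such as thread, frame, event and last_access, have been converted to instance attributes set in __init__, which is not how the published code works):

        import cv2
        from base_camera import BaseCamera

        class Camera(BaseCamera):
            """Hypothetical per-instance OpenCV camera, one per device."""
            def __init__(self, video_source=0):
                self.video_source = video_source
                super(Camera, self).__init__()

            def frames(self):
                camera = cv2.VideoCapture(self.video_source)
                if not camera.isOpened():
                    raise RuntimeError('Could not start camera %d.' %
                                       self.video_source)
                while True:
                    ok, img = camera.read()
                    if not ok:
                        raise RuntimeError('Failed to read frame.')
                    yield cv2.imencode('.jpg', img)[1].tobytes()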

  • #135 Wilson said 2019-08-31T07:50:18Z

    How about encoding and streaming local video files uploaded by users?

  • #136 Miguel Grinberg said 2019-08-31T09:14:00Z

    @Wilson: that has nothing to do with the project featured in this article. I suggest you look at ffmpeg for all tasks related to encoding and decoding video and audio.

  • #137 Paolo Ferrentino said 2019-09-21T12:44:08Z

    Hi Miguel, congratulations on your explanations. Your app is very, very cool. I'm trying to connect my pre-existing Python script to your app. My script acquires, stores and shows the stream locally. I wrote it before knowing about your app. Mine doesn't perform LAN streaming through a web server, so I wish to bind your app to it. What initialization is necessary to launch your app from my script? I refer to things such as: environment vars, passing the VideoCapture object (I open it before your app), host specifications and so on.

    I wish to follow you. Thank you a lot. Paolo from Italy

  • #138 Miguel Grinberg said 2019-09-21T21:11:15Z

    @Paolo: this application works in the other way, it runs a web server as the main thing, and then a Camera class provides video frames to stream. So you would need to adapt your script into another Camera subclass, like the ones shown here and on GitHub.

  • #139 myl said 2019-11-07T13:11:29Z

    Hello, looks nice. I'll have a closer look when I have time to see if I can derive something that fits my needs: I would like to get the video feeds from IP cams (a jpg capture URL is available on most of them, so best for compatibility) that are not (for obvious reasons) reachable from the Internet, and merge them (or allow selection) on a self-hosted web server that is built into my home automation stuff (Domoticz based). The aim would be to serve custom pages from the Domoticz server to make an indirect view of the cam feeds available from a server I trust. I could not find an easy solution for this, but I'm a perfect noob on the web development side: on the LAN, that's just an iframe job to make the URL accessible. But from outside, I did not find an easy way to "proxy" video streams from another (LAN only, from IP cams) server through my Domoticz one reachable from outside, while avoiding useless LAN traffic when the page is not loaded. With your work, this should mostly be a camera source rework... if the Domoticz server can feed the pages! It's a bit dedicated and simplified (no PHP support, for instance). Regards.

  • #140 Awais said 2019-12-18T13:36:17Z

    Hi Miguel,

    thanks for your wonderful article. I want to convert the image to base64 and then stream it to the web browser. Later I would also like to return a JSON response containing the camera id (in case of multiple cameras), the time of each frame, and the base64 string of the frame. I have tried to convert the frame to base64 and then pass it to the yield (b'--frame\r\n' b'Content-Type: image/jpg\r\n\r\n'+stringData+b'\r\n')

    could you please help me in this regard.

  • #141 Miguel Grinberg said 2019-12-20T10:28:15Z

    @Awais: the motion-jpeg format is very specific about how the frames should be streamed. You cannot use base64 encoding, the frames need to be raw jpeg. If you switch to base64, then you will need to create your own player on the client side, and if you do that there is no point in following the motion-jpeg standard anymore, you can just stream the frames in the way that is easier for you.

  • #142 Spencer said 2020-01-11T06:47:39Z

    Hi, thanks for a great write-up. I am quite new to this subject; can you please elaborate on what the frames() method does as a staticmethod? All it does is raise a RuntimeError. How does it return frames from the camera if all it does is raise an error?

  • #143 Miguel Grinberg said 2020-01-13T19:24:55Z

    @Spencer: You are looking at the implementation in the BaseCamera class. This is an abstract class from where actual camera classes can inherit. Did you read the article all the way to the end? Later there are a couple of implementations for different cameras.

  • #144 Nathan Sinclair said 2020-02-16T02:00:37Z

    Hi Miguel,

    Just want to thank you for your tutorial. It's been invaluable so far.

    I'm contacting you with a rather focused question which is part of a bigger problem I'm having with video streaming from a Pi using Flask. First, what I'm trying to do:

    I have a thermal imaging camera which I have hooked up to the Raspberry Pi, running your code from the blog. The simple task is to view the stream, as well as take snapshots while viewing.

    I ran into a problem while trying to get both of these functions to work.

    My attempt to save photos is getting a jpg still using the last frame that was encoded by the cv2.imencode statement in the frames function. Once I try to take a snapshot I receive a 502 Gateway Timeout message and cannot get access back to the stream. So, question: I am using uwsgi as a server. In your opinion, what is happening with the gateway timeout? Is it a problem unique to uwsgi? My configuration is using 5 processes and 2 threads. To be honest, even saying it sounds strange. Restarting the web server also doesn't change anything. I end up needing to restart the whole Pi in order to get access to the stream again.

    Is uwsgi my best option for doing video streaming from Flask? In your experience, what server would work best for it? So that's two questions. Thanks for any advice in advance

  • #145 Miguel Grinberg said 2020-02-16T17:02:49Z

    @Nathan: you are not providing enough details. What happens when you take a snapshot?

  • #146 Nathan Sinclair said 2020-02-19T23:24:17Z

    I thought I was clear enough. Trying to take a snapshot gives a 502 Gateway Timeout error: it'll wait roughly 30 seconds to get the snapshot and then give the 502 error.

    Just to update on situation, a friend suggested that I run the snapshot on a separate thread. Would that fix this issue?

  • #147 Miguel Grinberg said 2020-02-20T18:28:16Z

    @Nathan: when you take a snapshot, something that you do in your application causes the 502 error. I'm asking what actions does your application take when it is asked to take a snapshot. Something that you are doing must be causing a block.

  • #148 Nathan Sinclair said 2020-02-20T22:01:57Z

    Okay, thanks. I'll go from the Flask route back to the camera, starting with the Flask route:

        Response(generate_jpeg(Camera()), mimetype="image/jpeg")

        def generate_jpeg(camera):
            camera.save_jpeg()
            frame = camera.get_jpeg()
            return (b'--frame\r\n'
                    b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n')

        def save_jpeg():
            if Camera.thread_save is None:
                Camera.thread_save = threading.Thread(target=self._threadsave)
                Camera.thread_save.start()

            while self.get_jpeg() is None:
                time.sleep(0)

        def threadsave(cls):
            print(cls.TAG + ": starting saving thread")
            frames_iterator = cls.jpegframe()
            for frame in frames_iterator:
                Camera.framesave = frame
                Camera.saveevent.set()  # send signal to clients
                time.sleep(0)
                print(cls.TAG + ": closing saving thread")
                frames_iterator.close()
                break
            print(cls.TAG + ": emptying thread object")
            Camera.thread_save = None

        def jpegframe():
            camera = cv2.VideoCapture(Camera.video_source)
            if not camera.isOpened():
                raise RuntimeError('Could not start camera.')
            _, img = camera.read()

            # encode as a jpeg image and return it
            res = cv2.resize(img, None, fx=7, fy=7, interpolation=cv2.INTER_CUBIC)
            yield cv2.imencode('.jpg', res)[1].tobytes()

    From there, the functionality is the same as with the video feed. (I've created a CameraEvent object, called saveevent, in the camera_opencv Camera class.)

  • #149 Miguel Grinberg said 2020-02-20T23:33:25Z

    @Nathan: sorry but I'm completely confused about this. You said you were taking snapshots. The code that you are showing is a stream, not a single picture. You already have a video stream that is taking snapshots several times a second. When you want to take a snapshot all you need to do is take the current frame and save it to a file. I don't understand what you are trying to do by duplicating the video streaming code.

  • #150 shuai ma said 2020-02-23T14:07:07Z

    Hi Miguel, nice post! I have a question: I want to do two processing tasks on the same frame obtained from OpenCV. Both tasks take some time, so I want to run them in parallel. But I searched Google and it says that even multi-threaded tasks in a process run sequentially because of the GIL, so can I run these two tasks in different processes? If yes, how do I do that? Thanks in advance for any comments!
