Masking Sensitive Text in Videos with Python: A Step-by-Step Guide

Protecting sensitive information in videos matters whenever you share tutorials, presentations, or vlogs: a single frame can expose an email address, a phone number, or confidential text. This blog post walks you through a Python script that automatically detects and masks such text by combining video processing (MoviePy and OpenCV), text detection (Tesseract OCR), and redaction. By the end, you’ll have a practical tool to enhance privacy in your video content, suited to content creators, businesses, and developers learning video processing.

What This Project Does

This Python script processes a video to:

Detect sensitive text, including:
- Emails (e.g., user@domain.com).
- Phone numbers (e.g., +12345678901).
- Specific unwanted words (e.g., ‘KILL’, ‘MURDER’, ‘IDIOT’).
Mask detected text by drawing black rectangles over it in every frame.
Preserve the original audio and save the processed video as output_video.mp4.

Use Cases:

Content creators safeguarding personal information.
Businesses ensuring compliance with data privacy regulations.
Developers exploring video processing and OCR techniques.

Prerequisites

Before running the script, ensure you have:

Hardware

Windows 11 PC with at least 4GB RAM (8GB recommended for video processing).
A valid MP4 video file (e.g., fortesting.mp4) containing text to mask.

Software

Python 3.13.4 (or 3.11.8 for better compatibility with MoviePy).
Python libraries: opencv-python, pytesseract, numpy, moviepy==1.0.3.
Tesseract OCR: For text detection.
FFmpeg: For video encoding/decoding with MoviePy.

Installation Steps

Follow these steps to set up the environment on Windows 11:

1. Install Python

Download Python 3.13.4 (or 3.11.8) from python.org.
Run the installer, checking Add Python to PATH and selecting Install Now.
- Optionally, customize the installation path (e.g., C:\Program Files\Python313).
Verify installation in PowerShell:
python --version
(Expected output: Python 3.13.4 (or 3.11.8).)

2. Install Tesseract OCR

Download the installer from Tesseract at UB Mannheim.
Install to the default path (C:\Program Files\Tesseract-OCR).
Add Tesseract to PATH:
- Press Win + S, search for “environment variables,” and select Edit the system environment variables.
- In System Variables, find Path, click Edit, and add C:\Program Files\Tesseract-OCR.
- Click OK to save.
Verify:
tesseract --version
(Expected output: Tesseract version details.)

3. Install FFmpeg

Download FFmpeg from gyan.dev (e.g., ffmpeg-release-full.7z).
Extract to C:\Program Files\ffmpeg using 7-Zip.
Add FFmpeg to PATH:
- In System Variables > Path, add C:\Program Files\ffmpeg\bin.
- Click OK to save.
Verify:
ffmpeg -version
(Expected output: FFmpeg version details.)

4. Install Python Libraries

Open PowerShell and install required libraries:
& "C:\Program Files\Python313\python.exe" -m pip install opencv-python pytesseract numpy moviepy==1.0.3
Verify:
& "C:\Program Files\Python313\python.exe" -m pip list
Look for:
- moviepy 1.0.3
- opencv-python
- pytesseract
- numpy
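
If you prefer to check the whole setup in one go, a small script along these lines can help. It is only a sketch: the function name check_environment and the two lists are made up for illustration, and it tests whether each library can be located and whether each tool is on PATH, not whether it actually works.

```python
import importlib.util
import shutil

# Import names of the libraries installed above
# (the opencv-python package is imported as cv2).
REQUIRED_MODULES = ["cv2", "pytesseract", "numpy", "moviepy"]
# Command-line tools that should be reachable through PATH.
REQUIRED_TOOLS = ["tesseract", "ffmpeg"]


def check_environment(modules=REQUIRED_MODULES, tools=REQUIRED_TOOLS):
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    status = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        status[name] = importlib.util.find_spec(name) is not None
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        status[tool] = shutil.which(tool) is not None
    return status


if __name__ == "__main__":
    for requirement, found in check_environment().items():
        print(f"{requirement}: {'OK' if found else 'MISSING'}")
```

Anything reported as MISSING points back to the matching installation step above.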

The Script

Create a file named mask_video.py in a directory (e.g., D:\PythonVideos) with the following code:

import cv2
import pytesseract
import re
import numpy as np
from moviepy.editor import VideoFileClip, ImageSequenceClip, AudioFileClip

# Set Tesseract path
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Define patterns for sensitive data
email_pattern = r'[\w\.-]+@[\w\.-]+'  # Matches emails
phone_pattern = r'\+?\d{10,12}'       # Matches phone numbers
unwanted_text = ['KILL', 'MURDER', 'IDIOT']  # Words to mask

# Input and output video paths
input_video_path = 'fortesting.mp4'
output_video_path = 'output_video.mp4'

# Load video
try:
    video = VideoFileClip(input_video_path)
except Exception as e:
    print(f"Error loading video: {e}")
    exit()
audio = video.audio
fps = video.fps
print(f"Video FPS: {fps}")

# Function to process each frame
def process_frame(frame):
    img = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    _, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    text_data = pytesseract.image_to_data(thresh, output_type=pytesseract.Output.DICT, config='--psm 6')
    sensitive_texts = []
    for i, text in enumerate(text_data['text']):
        text = text.strip()
        if not text:
            continue
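        # re.search finds a match anywhere in the token, so OCR output such as
        # "Email:user@domain.com" or "(+12345678901)" is still caught
        is_sensitive = (
            re.search(email_pattern, text) or
            re.search(phone_pattern, text) or
            any(word.lower() in text.lower() for word in unwanted_text)
        )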
        if is_sensitive:
            sensitive_texts.append(text)
            x, y, w, h = (text_data['left'][i], text_data['top'][i], 
                         text_data['width'][i], text_data['height'][i])
            padding = 5
            x, y, w, h = x - padding, y - padding, w + 2 * padding, h + 2 * padding
            x, y = max(0, x), max(0, y)
            w, h = min(w, img.shape[1] - x), min(h, img.shape[0] - y)
            cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 0), -1)
    if sensitive_texts:
        print(f"Detected sensitive texts: {sensitive_texts}")
    return cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Process all frames
print("Processing video frames... This may take a while.")
try:
    processed_frames = [process_frame(frame) for frame in video.iter_frames()]
except Exception as e:
    print(f"Error processing frames: {e}")
    video.close()
    exit()

# Create output video
print("Saving output video...")
processed_video = None  # Defined up front so cleanup works even if creation fails
try:
    processed_video = ImageSequenceClip(processed_frames, fps=fps)
    if audio is not None:
        processed_video = processed_video.set_audio(audio)
    processed_video.write_videofile(output_video_path, codec='libx264', audio_codec='aac', fps=fps)
except Exception as e:
    print(f"Error saving video: {e}")
    raise SystemExit(1)
finally:
    # Clean up: release file handles whether or not saving succeeded
    if processed_video is not None:
        processed_video.close()
    video.close()

print("Done! Check output_video.mp4")

How It Works

Video Loading:
- VideoFileClip loads the input video (fortesting.mp4).
- Extracts the audio and frame rate (fps).
Frame Processing:
- Converts each frame to grayscale and applies Otsu’s thresholding to improve text detection.
- Uses Tesseract OCR with --psm 6 (single uniform text block) to detect text.
- Matches text against:
  - Email regex: [\w\.-]+@[\w\.-]+
  - Phone number regex: \+?\d{10,12}
  - Unwanted words: ['KILL', 'MURDER', 'IDIOT'] (case-insensitive substring match).
- Draws black rectangles over detected text with a 5-pixel padding for complete coverage.
Output Creation:
- Creates a new video using ImageSequenceClip from processed frames.
- Reattaches the original audio and saves the result as output_video.mp4 with H.264 video and AAC audio codecs.
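
To see the frame-processing logic without a video or an OCR engine, the sketch below runs the matching and padding steps on a hand-written dictionary shaped like the output of pytesseract.image_to_data (one entry per detected word). The helper names is_sensitive, padded_box, and find_sensitive_boxes are illustrative and do not appear in the script, the sample coordinates are invented, and the box helper clips by corners, which is a slight variation on the script's arithmetic.

```python
import re

EMAIL_PATTERN = r'[\w\.-]+@[\w\.-]+'
PHONE_PATTERN = r'\+?\d{10,12}'
UNWANTED_TEXT = ['KILL', 'MURDER', 'IDIOT']


def is_sensitive(text):
    """Check one OCR token against the email, phone, and word rules."""
    return bool(
        re.search(EMAIL_PATTERN, text)
        or re.search(PHONE_PATTERN, text)
        or any(word.lower() in text.lower() for word in UNWANTED_TEXT)
    )


def padded_box(x, y, w, h, frame_w, frame_h, padding=5):
    """Grow a box by `padding` on every side, then clip it to the frame."""
    left = max(0, x - padding)
    top = max(0, y - padding)
    right = min(frame_w, x + w + padding)
    bottom = min(frame_h, y + h + padding)
    return left, top, right - left, bottom - top


def find_sensitive_boxes(text_data, frame_w, frame_h, padding=5):
    """Return (x, y, w, h) rectangles to black out for one frame."""
    boxes = []
    for i, raw in enumerate(text_data['text']):
        text = raw.strip()
        if text and is_sensitive(text):
            boxes.append(padded_box(
                text_data['left'][i], text_data['top'][i],
                text_data['width'][i], text_data['height'][i],
                frame_w, frame_h, padding,
            ))
    return boxes


# Invented sample standing in for one frame's OCR result (640x360 frame).
sample = {
    'text':   ['Contact:', 'user@domain.com', '', '+12345678901', 'hello'],
    'left':   [2, 100, 0, 300, 630],
    'top':    [2, 10, 0, 10, 10],
    'width':  [80, 150, 0, 120, 40],
    'height': [20, 20, 0, 20, 20],
}
print(find_sensitive_boxes(sample, 640, 360))
# prints [(95, 5, 160, 30), (295, 5, 130, 30)]
```

Each returned tuple is what the script would pass to cv2.rectangle as the top-left corner plus width and height.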

Running the Script

Place your video (fortesting.mp4) in D:\PythonVideos.
Update input_video_path in the script if your video has a different name.
Open PowerShell:
cd D:\PythonVideos
& "C:\Program Files\Python313\python.exe" mask_video.py
Monitor the console for detected texts (e.g., Detected sensitive texts: ['KILL', 'user@domain.com']).
Check output_video.mp4 for masked text and preserved audio.
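
If editing the script for every new file gets tedious, the paths could instead be read from the command line. The following is a minimal sketch using the standard argparse module; the function name parse_args and the -o/--output option are choices made for this example, not something the script above already supports.

```python
import argparse


def parse_args(argv=None):
    """Read input/output paths from the command line, with the script's defaults."""
    parser = argparse.ArgumentParser(description="Mask sensitive text in a video.")
    parser.add_argument("input", nargs="?", default="fortesting.mp4",
                        help="path to the input MP4")
    parser.add_argument("-o", "--output", default="output_video.mp4",
                        help="path for the masked video")
    return parser.parse_args(argv)


# In mask_video.py, the two hard-coded paths could then become:
#     args = parse_args()
#     input_video_path = args.input
#     output_video_path = args.output
```

With that change in place, a run might look like this:
& "C:\Program Files\Python313\python.exe" mask_video.py clip.mp4 -o masked.mp4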

Customizing the Script

Change Unwanted Text: Modify the unwanted_text list:
unwanted_text = ['KILL', 'MURDER', 'IDIOT']
Use Blur Instead of Rectangles: Replace the rectangle with a blur effect:
img[y:y+h, x:x+w] = cv2.blur(img[y:y+h, x:x+w], (50, 50))
Add this line to process_frame in place of the cv2.rectangle call.
Enhance OCR: Add Gaussian blur for noisy videos:
gray = cv2.GaussianBlur(gray, (5, 5), 0)
Insert this before the cv2.threshold call in process_frame so that thresholding runs on the smoothed image.
Adjust Padding: Increase padding = 5 to 10 for larger rectangles if text isn’t fully masked.
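
One more customization worth considering: the unwanted-word check is a substring test, so a harmless word like "Skills" is masked because it contains "kill". A possible alternative, sketched below, is a whole-word regex. The names unwanted_pattern and contains_unwanted_word are made up for this example.

```python
import re

unwanted_text = ['KILL', 'MURDER', 'IDIOT']

# \b anchors each alternative to word boundaries; re.escape protects
# against regex metacharacters that may appear in the word list.
unwanted_pattern = re.compile(
    r'\b(?:' + '|'.join(re.escape(word) for word in unwanted_text) + r')\b',
    re.IGNORECASE,
)


def contains_unwanted_word(text):
    """Return True only when a listed word appears as a whole word."""
    return unwanted_pattern.search(text) is not None


for token in ['kill', 'KILL,', 'Skills', 'murderer']:
    print(token, contains_unwanted_word(token))
```

The trade-off is that inflected forms such as "murderer" are no longer caught, so list each variant you care about explicitly.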

Example Output

Input Video: fortesting.mp4 with text like “Contact: user@domain.com, +12345678901, KILL, MURDER, IDIOT”.
Output Video: output_video.mp4 with black rectangles covering “user@domain.com”, “+12345678901”, “KILL”, “MURDER”, and “IDIOT” in all frames, with the original audio preserved.

Limitations

Text Variations: The regex patterns may miss complex email/phone formats. Customize patterns for your use case.
OCR Accuracy: Tesseract may struggle with low-contrast, rotated, or blurry text. Preprocessing mitigates this but isn’t foolproof.
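Performance: Processing long or high-resolution videos is slow, and because the script collects every processed frame in a list before saving, memory use grows with video length. Test with short clips or reduce resolution.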
Audio Sync: Audio is preserved, but rare sync issues may require FFmpeg post-processing.
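
As a starting point for the pattern customization mentioned under Text Variations, here is one way the checks might be broadened. This is a sketch under assumptions: the names EMAIL_RE, SEPARATORS_RE, PHONE_DIGITS_RE, looks_like_email, and looks_like_phone are invented for this example, the email pattern requires a dot in the domain, and the phone check strips common separators before counting digits. Tesseract typically reports one entry per word, so a number written with spaces arrives as several tokens, which this sketch does not attempt to rejoin.

```python
import re

# Local part may include '+', and the domain must contain at least one dot.
EMAIL_RE = re.compile(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+')
# Characters commonly used to group digits in phone numbers.
SEPARATORS_RE = re.compile(r'[\s().-]')
# Same digit-count rule as the script's phone pattern.
PHONE_DIGITS_RE = re.compile(r'\+?\d{10,12}')


def looks_like_email(token):
    """Return True when the token contains an address with a dotted domain."""
    return EMAIL_RE.search(token) is not None


def looks_like_phone(token):
    """Return True when the token is 10 to 12 digits once separators are removed."""
    digits = SEPARATORS_RE.sub('', token)
    return PHONE_DIGITS_RE.fullmatch(digits) is not None


for token in ['234-567-8901', '(234)567-8901', '2024-01-15', 'user@localhost']:
    print(token, looks_like_phone(token), looks_like_email(token))
```

To try it, these helpers would replace the email and phone checks inside process_frame. Test against your own footage first, since looser patterns also raise the chance of masking harmless text.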

Conclusion

This Python script is a powerful tool for protecting sensitive information in videos using OpenCV, Tesseract, and MoviePy. It’s ideal for content creators, businesses, and developers interested in video processing and data privacy. Experiment with the script, customize it to your needs, and share your results!
