How to Run Whisper on Mac Locally: Complete 2025 Guide

OpenAI's Whisper is one of the most accurate speech-to-text models available today. The best part? You can run it entirely on your Mac without sending any audio to the cloud. This guide covers two approaches: the easy way using a dedicated app, and the developer way using Python scripts.

🎙️ Get Whisper Dictation for Mac

AI-powered dictation with 100% local processing. One-time purchase.

Download Now

Why Run Whisper Locally on Mac?

Running Whisper locally on your Mac offers several compelling advantages over cloud-based transcription services:

  • Complete Privacy — Your audio never leaves your computer. This is crucial for sensitive content like medical dictation, legal documents, or confidential business meetings.
  • No Internet Required — Once set up, Whisper works entirely offline. Transcribe on airplanes, in remote locations, or anywhere without WiFi.
  • No Usage Limits — Unlike API-based services that charge per minute, local Whisper has no limits. Transcribe as much as you want.
  • Lower Latency — No network round-trip means faster results, especially for real-time dictation use cases.
  • Cost Effective — After the initial setup, there are no ongoing costs. The OpenAI Whisper API charges $0.006/minute, which adds up quickly for heavy users.

Apple Silicon Macs (M1, M2, M3, M4) are particularly well-suited for running Whisper locally thanks to their unified memory architecture and Neural Engine capabilities.

Method 1: The Easy Way — Whisper Dictation App

If you want to use Whisper on your Mac without any technical setup, coding, or command-line knowledge, Whisper Dictation is the simplest solution. It's a native macOS app that bundles everything you need.

Why Choose the App Approach?

  • Zero setup — Download, install, and start dictating in minutes
  • Real-time dictation — Speak and see text appear instantly in any app
  • System-wide integration — Works in any text field on your Mac
  • Automatic model management — No manual model downloads
  • Optimized for macOS — Native Apple Silicon support with GPU acceleration
  • Keyboard shortcuts — Trigger dictation with customizable hotkeys

How It Works

  1. Download Whisper Dictation from the official website
  2. Install the app by dragging it to your Applications folder
  3. Grant permissions for microphone and accessibility access
  4. Choose your Whisper model (the app downloads it automatically)
  5. Press your hotkey (default: Fn key) and start speaking
  6. Your words appear as text in whatever app you're using

Get Whisper Dictation

The fastest way to run Whisper on your Mac. One-time purchase, no subscription, 7-day money-back guarantee.

Download Whisper Dictation

The app is ideal for writers, journalists, students, professionals, and anyone who wants powerful speech-to-text without the technical overhead. It handles all the complexity of running Whisper—model loading, audio processing, memory management—behind a simple interface.

Method 2: The Developer Way — Python Script

For developers, researchers, or power users who want full control, you can run Whisper directly via Python. This method is perfect for batch processing audio files, integrating into automated workflows, or customizing the transcription process.

Prerequisites

Before installing Whisper, you'll need:

  • macOS 11 Big Sur or later
  • Python 3.8-3.11 (3.10 recommended)
  • Homebrew package manager
  • ffmpeg for audio processing
  • 8GB+ RAM (16GB recommended for larger models)
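If you're not sure whether your current Python falls in the supported range, you can check from the interpreter itself. A minimal sketch; the version bounds mirror the prerequisites list above:

```python
import sys

# Supported range from the prerequisites above: Python 3.8 through 3.11
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 11)

def python_supported(version_info=sys.version_info):
    """Return True if the running Python falls in the supported range."""
    major_minor = (version_info[0], version_info[1])
    return MIN_VERSION <= major_minor <= MAX_VERSION

if __name__ == "__main__":
    status = "supported" if python_supported() else "unsupported"
    print(f"Python {sys.version.split()[0]}: {status} for Whisper")
```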

Step 1: Install Homebrew (if needed)

Homebrew is the easiest way to install dependencies on macOS. Open Terminal and run:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Step 2: Install Python and ffmpeg

# Install Python 3.10
brew install python@3.10

# Install ffmpeg (required for audio processing)
brew install ffmpeg

# Verify installations
python3 --version
ffmpeg -version

Step 3: Create a Virtual Environment (Recommended)

Using a virtual environment keeps your Whisper installation isolated:

# Create a new virtual environment
python3 -m venv whisper-env

# Activate it
source whisper-env/bin/activate

# Your prompt should now show (whisper-env)

Step 4: Install OpenAI Whisper

# Install Whisper via pip
pip install -U openai-whisper

# This will also install PyTorch and other dependencies

If you encounter errors with tiktoken, you may need the Rust compiler:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env
pip install -U openai-whisper

Step 5: Basic Command-Line Usage

Once installed, you can transcribe audio files directly from Terminal:

# Basic transcription (uses the default model; 'turbo' in recent versions)
whisper audio.mp3

# Specify a model
whisper audio.mp3 --model turbo

# Specify language (improves accuracy)
whisper audio.mp3 --model medium --language English

# Output as SRT subtitles
whisper audio.mp3 --model turbo --output_format srt

# Save to specific directory
whisper audio.mp3 --model turbo --output_dir ./transcripts

Step 6: Python Script for More Control

For programmatic use, create a Python script. Here's a complete example:

#!/usr/bin/env python3
"""
whisper_transcribe.py
Simple script to transcribe audio files using OpenAI Whisper locally.
"""

import whisper
import sys
import os

def transcribe_audio(audio_path, model_name="turbo", language=None):
    """
    Transcribe an audio file using Whisper.
    
    Args:
        audio_path: Path to the audio file
        model_name: Whisper model to use (tiny, base, small, medium, large, turbo)
        language: Optional language code (e.g., "en", "fr", "es")
    
    Returns:
        Transcription result dictionary
    """
    # Check if file exists
    if not os.path.exists(audio_path):
        raise FileNotFoundError(f"Audio file not found: {audio_path}")
    
    print(f"Loading Whisper model '{model_name}'...")
    model = whisper.load_model(model_name)
    
    print(f"Transcribing: {audio_path}")
    
    # Transcription options
    options = {
        "fp16": False,  # Use FP32 on CPU (required for most Macs)
        "verbose": True  # Show progress
    }
    
    if language:
        options["language"] = language
    
    result = model.transcribe(audio_path, **options)
    
    return result

def main():
    if len(sys.argv) < 2:
        print("Usage: python whisper_transcribe.py <audio_file> [model] [language]")
        print("Models: tiny, base, small, medium, large, turbo")
        print("Example: python whisper_transcribe.py interview.mp3 turbo en")
        sys.exit(1)
    
    audio_file = sys.argv[1]
    model = sys.argv[2] if len(sys.argv) > 2 else "turbo"
    language = sys.argv[3] if len(sys.argv) > 3 else None
    
    try:
        result = transcribe_audio(audio_file, model, language)
        
        # Print full transcription
        print("\n" + "="*50)
        print("TRANSCRIPTION:")
        print("="*50)
        print(result["text"])
        
        # Print segments with timestamps
        print("\n" + "="*50)
        print("SEGMENTS WITH TIMESTAMPS:")
        print("="*50)
        for segment in result["segments"]:
            start = segment["start"]
            end = segment["end"]
            text = segment["text"]
            print(f"[{start:.2f}s - {end:.2f}s] {text}")
        
        # Save to file
        output_file = os.path.splitext(audio_file)[0] + "_transcript.txt"
        with open(output_file, "w") as f:
            f.write(result["text"])
        print(f"\nTranscription saved to: {output_file}")
        
    except Exception as e:
        print(f"Error: {e}")
        sys.exit(1)

if __name__ == "__main__":
    main()

Save this as whisper_transcribe.py and run it:

# Make it executable
chmod +x whisper_transcribe.py

# Run with default settings
python whisper_transcribe.py recording.mp3

# Run with specific model and language
python whisper_transcribe.py meeting.m4a medium en
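The same pattern extends naturally to batch processing, one of the main reasons to choose the Python route. A minimal sketch, assuming a flat directory of recordings (the extension set and directory name are illustrative; adjust them to your files):

```python
import os

# Extensions to pick up; extend as needed (Whisper handles most formats via ffmpeg)
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}

def find_audio_files(directory):
    """Return a sorted list of audio file paths in `directory` (non-recursive)."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if os.path.splitext(name)[1].lower() in AUDIO_EXTENSIONS
    )

def transcribe_directory(directory, model_name="turbo"):
    """Load the model once, then transcribe every audio file in `directory`."""
    import whisper  # imported here so find_audio_files works without it installed

    model = whisper.load_model(model_name)
    for path in find_audio_files(directory):
        result = model.transcribe(path, fp16=False)
        out_path = os.path.splitext(path)[0] + "_transcript.txt"
        with open(out_path, "w") as f:
            f.write(result["text"])
        print(f"Transcribed {path} -> {out_path}")

if __name__ == "__main__":
    transcribe_directory("./recordings")
```

Loading the model once and reusing it across files is the key design choice here; model loading often takes longer than transcribing a short clip.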

Whisper Model Comparison

Whisper offers several model sizes. Larger models are more accurate but require more RAM and processing time. Here's a comparison to help you choose:

| Model | Parameters | RAM Required | Disk Space | Relative Speed | Best For |
|---|---|---|---|---|---|
| tiny | 39M | ~1 GB | 75 MB | ~10x faster | Quick tests, limited RAM |
| base | 74M | ~1 GB | 142 MB | ~7x faster | Basic transcription |
| small | 244M | ~2 GB | 244 MB | ~4x faster | Good balance |
| medium | 769M | ~5 GB | 769 MB | ~2x faster | High accuracy |
| large-v3 | 1550M | ~10 GB | 2.87 GB | 1x (baseline) | Maximum accuracy |
| turbo | 809M | ~6 GB | 1.5 GB | ~8x faster | Best overall choice |

Our recommendation: Start with the turbo model. It offers near-large-v3 accuracy at 8x the speed, making it the best choice for most users. If you have limited RAM (8GB), try small first.
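That rule of thumb can be codified into a simple picker. A sketch, with thresholds taken straight from the RAM column of the table above:

```python
def pick_model(free_ram_gb, want_max_accuracy=False):
    """Pick a Whisper model that fits in the given free RAM (in GB)."""
    if want_max_accuracy and free_ram_gb >= 10:
        return "large-v3"   # maximum accuracy, ~10 GB RAM
    if free_ram_gb >= 6:
        return "turbo"      # best overall choice, ~6 GB RAM
    if free_ram_gb >= 2:
        return "small"      # good balance, ~2 GB RAM
    return "base"           # basic transcription, ~1 GB RAM
```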

Apple Silicon Performance

Apple Silicon Macs excel at running Whisper locally. Here are real-world benchmarks for transcribing a 10-minute audio file:

| Mac Model | Turbo Model | Medium Model | Large-v3 Model |
|---|---|---|---|
| MacBook Air M1 (8GB) | ~45 seconds | ~3 minutes | ~8 minutes |
| MacBook Pro M1 Pro (16GB) | ~30 seconds | ~2 minutes | ~5 minutes |
| MacBook Pro M3 Pro (18GB) | ~20 seconds | ~1.5 minutes | ~3.5 minutes |
| MacBook Pro M4 Pro (24GB) | ~15 seconds | ~50 seconds | ~2.5 minutes |

Note on GPU acceleration: While PyTorch has experimental MPS (Metal Performance Shaders) support for Apple Silicon, it's not yet fully optimized for Whisper. The Python version primarily uses CPU. For native Metal acceleration, consider using whisper.cpp, which can be 2x faster on Apple Silicon.

Intel Macs: Whisper runs on Intel Macs too, but expect 3-5x slower performance compared to Apple Silicon equivalents.

Troubleshooting Common Issues

ffmpeg not found

If you get "ffmpeg not found" errors:

brew reinstall ffmpeg

# If you have librist conflicts:
brew uninstall librist --ignore-dependencies
brew reinstall ffmpeg

FP16 Warning Messages

You may see: "FP16 is not supported on CPU; using FP32 instead." This is normal on Mac and doesn't affect functionality. To suppress it, explicitly set fp16=False in your transcribe call.

Out of Memory Errors

If Whisper crashes with memory errors:

  • Use a smaller model (try small instead of medium)
  • Close other applications to free RAM
  • For very long audio files, split them into chunks first
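For the chunking step, ffmpeg's segment muxer does the heavy lifting. A minimal sketch that shells out to ffmpeg from Python; the chunk length and output filename pattern are illustrative:

```python
import subprocess

def split_command(audio_path, chunk_seconds=600, output_pattern="chunk_%03d.mp3"):
    """Build the ffmpeg argv that splits `audio_path` into fixed-length chunks."""
    return [
        "ffmpeg", "-i", audio_path,
        "-f", "segment",                    # use the segment muxer
        "-segment_time", str(chunk_seconds),
        "-c", "copy",                       # no re-encoding, so splitting is fast
        output_pattern,
    ]

def split_audio(audio_path, chunk_seconds=600):
    """Run ffmpeg to split the file into chunks of `chunk_seconds` each."""
    subprocess.run(split_command(audio_path, chunk_seconds), check=True)
```

You can then feed the resulting `chunk_*.mp3` files to Whisper one at a time, which keeps peak memory low.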

Slow First Run

The first time you run Whisper with a new model, it downloads the model weights (~75MB to 2.9GB depending on model size). Subsequent runs use the cached model from ~/.cache/whisper/.
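You can inspect that cache to see which models are already downloaded. A small sketch; the path is the default cache location used by openai-whisper:

```python
import os

def cached_models(cache_dir=os.path.expanduser("~/.cache/whisper")):
    """Return the model weight files already downloaded, or [] if none."""
    if not os.path.isdir(cache_dir):
        return []
    return sorted(f for f in os.listdir(cache_dir) if f.endswith(".pt"))

if __name__ == "__main__":
    models = cached_models()
    print("Cached models:", ", ".join(models) if models else "(none yet)")
```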

Python/pip Not Found

# Use python3 and pip3 explicitly
pip3 install -U openai-whisper
python3 -c "import whisper; print(whisper.available_models())"

# Or specify the full path
/opt/homebrew/bin/python3 -m pip install openai-whisper

Frequently Asked Questions

Is running Whisper locally really private?

Yes, 100%. When you run Whisper locally, all audio processing happens on your Mac. No data is sent to OpenAI, cloud servers, or any third party. This is fundamentally different from using the OpenAI Whisper API, which sends your audio to OpenAI's servers.

Which Whisper model should I use?

For most users, we recommend the turbo model. It provides excellent accuracy (close to large-v3) while being 8x faster. If you have only 8GB of RAM, start with small. For maximum accuracy on important transcriptions, use large-v3.

Can I transcribe languages other than English?

Yes! Whisper supports 100+ languages. For best results with non-English audio, specify the language: whisper audio.mp3 --language French. You can also use --task translate to translate any language to English.

Does Whisper work offline?

Yes, after the initial model download. Once you've run Whisper once with a model (e.g., turbo), that model is cached locally. All future transcriptions work completely offline.

What audio formats does Whisper support?

Whisper (via ffmpeg) supports virtually all audio formats: MP3, WAV, M4A, FLAC, OGG, WEBM, and more. It also handles video files, extracting the audio track automatically.

App vs Python: which should I choose?

Choose Whisper Dictation app if you want real-time dictation, system-wide integration, and zero setup. Choose the Python approach if you need to batch-process files, integrate with scripts, or customize the transcription pipeline.

Conclusion

Running Whisper locally on your Mac gives you access to one of the world's best speech-to-text models while maintaining complete privacy. Whether you choose the easy path with Whisper Dictation or the technical path with Python, you'll get accurate transcriptions without relying on cloud services.

For most users, we recommend starting with Whisper Dictation—it handles all the complexity and gives you real-time dictation in any app. Developers and power users will appreciate the flexibility of the Python approach for batch processing and automation.

Ready to Try Whisper on Your Mac?

Get started in minutes with Whisper Dictation. No command line, no Python, no hassle.

Get Whisper Dictation