Space: jujutechnology/https-huggingface-co-spaces-jujutechnology-ebook2audiobook

jujutechnology committed on Jun 16

Commit ad6d77a (verified) · 1 parent: b86cad2

Update README.md

README.md +373 -368


@@ -1,368 +1,373 @@
-# 🎧 Chatterbox Audiobook Generator
-**This is a work in progress. You can consider this a pre-launch repo at the moment, but if you find bugs, please put them in the issues area. Thank you.**
-**Transform your text into high-quality audiobooks with advanced TTS models, voice cloning, and professional volume normalization.**
-## 🚀 Quick Start
-### 1. Install Dependencies
-```bash
-./install-audiobook.bat
-```
-### 2. Launch the Application
-```bash
-./launch_audiobook.bat
-```
-### 3. CUDA Issue Fix (If Needed)
-If you encounter CUDA assertion errors during generation, install the patched version:
-```bash
-# Activate your virtual environment first
-venv\Scripts\activate.bat
-# Install the CUDA-fixed version
-pip install --force-reinstall --no-cache-dir "chatterbox-tts @ git+https://github.com/fakerybakery/better-chatterbox@fix-cuda-issue"
-```
-The web interface will open automatically in your browser at `http://localhost:7860`
----
-## ✨ Features
-### 📚 **Audiobook Creation**
-- **Single Voice**: Generate entire audiobooks with one consistent voice
-- **Multi-Voice**: Create dynamic audiobooks with multiple characters
-- **Custom Voices**: Clone voices from audio samples for personalized narration
-- **Professional Volume Normalization**: Ensure consistent audio levels across all voices
-- **📋 Text Queuing System** ⭐ *NEW*: Upload books in any size chapters and generate continuously
-- **🔄 Chunk-Based Processing** ⭐ *NEW*: Improved reliability for longer text generations
-### 🎵 **Audio Processing**
-- **Smart Cleanup**: Remove unwanted silence and audio artifacts
-- **Volume Normalization**: Professional-grade volume balancing for all voices
-- **Real-time Audio Analysis**: Live volume level monitoring and feedback
-- **Preview System**: Test settings before applying to entire projects
-- **Batch Processing**: Process multiple projects efficiently
-- **Quality Control**: Advanced audio optimization tools
-- **🎯 Enhanced Audio Quality** ⭐ *NEW*: Improved P-top and minimum P parameters for better voice generation
-### 🎭 **Voice Management**
-- **Voice Library**: Organize and manage your voice collection
-- **Voice Cloning**: Create custom voices from audio samples
-- **Volume Settings**: Configure target volume levels for each voice
-- **Professional Presets**: Industry-standard volume levels (audiobook, podcast, broadcast)
-- **Character Assignment**: Map specific voices to story characters
-### 📊 **Volume Normalization System** ⭐ *NEW*
-- **Professional Standards**: Audiobook (-18 dB), Podcast (-16 dB), Broadcast (-23 dB) presets
-- **Consistent Character Voices**: All characters maintain the same volume level
-- **Real-time Analysis**: Color-coded volume status with RMS and peak level display
-- **Retroactive Normalization**: Apply volume settings to existing voice projects
-- **Multi-Voice Support**: Batch normalize all voices in multi-character audiobooks
-- **Soft Limiting**: Intelligent audio limiting to prevent distortion
-### 📖 **Text Processing**
-- **Chapter Support**: Automatic chapter detection and organization
-- **Multi-Voice Parsing**: Parse character dialogue automatically
-- **Text Validation**: Ensure proper formatting before generation
-- **📋 Queue Management** ⭐ *NEW*: Batch process multiple text files sequentially
-- **🔇 Return Pause System** ⭐ *NEW*: Automatic pause insertion based on line breaks for natural speech flow
----
-## 🎭 Custom Audiobook Processing Pipeline ⭐ *NEW*
-Our advanced text processing pipeline transforms your written content into natural-sounding audiobooks with intelligent pause placement and character flow management.
-### 🔇 **Return Pause System**
-**Automatic pause insertion based on your text formatting** - Every line break (`\n`) in your text automatically adds a 0.1-second pause to the generated audio, creating natural speech rhythms without manual intervention.
-#### **How It Works**
-- **Line Break Detection**: System automatically counts all line breaks in your text
-- **Pause Calculation**: Each return adds exactly 0.1 seconds of silence
-- **Accumulative Pauses**: Multiple consecutive line breaks create longer pauses
-- **Universal Support**: Works with single-voice, multi-voice, and batch processing
-#### **Example Text Formatting**
-```
-[Narrator] The sun was setting over the hills.
-[Character1] "We need to find shelter soon."
-[Character2] "I see a cave up ahead.
-Let's hurry before it gets dark."
-[Narrator] They rushed toward the cave, hearts pounding.
-```
-**Result**: Natural pauses between dialogue, emphasis pauses for dramatic effect, and smooth character transitions.
-### 📝 **Text Formatting Best Practices**
-#### **🎭 Multi-Voice Dialogue Structure**
-```
-[Character Name] Dialogue content here.
-[Another Character] Response content here.
-Multiple lines can be used for the same character.
-[Narrator] Descriptive text and scene setting.
-```
-#### **🎪 Natural Flow Techniques**
-- **Paragraph Breaks**: Use double line breaks for scene transitions
-- **Emphasis Pauses**: Add extra returns before important revelations
-- **Character Separation**: Single returns between different speakers
-- **Breathing Room**: Natural pauses for complex concepts or emotional moments
-#### **📖 Single Voice Formatting**
-```
-Chapter content flows naturally here.
-New paragraphs create natural pauses.
-Extended pauses can emphasize dramatic moments.
-Regular text continues with normal pacing.
-```
-### 🔄 **Processing Pipeline Features**
-#### **🧠 Intelligent Text Analysis**
-- **Line Break Preservation**: Maintains your formatting intentions throughout processing
-- **Character Assignment**: Automatically maps voice tags to selected voice profiles
-- **Chunk Optimization**: Breaks long texts into optimal segments while preserving pause timing
-- **Error Recovery**: Validates text and provides helpful formatting suggestions
-#### **⚡ Real-Time Processing**
-- **Live Feedback**: Console output shows exactly how many pauses are being added
-- **Debug Information**: Detailed logging of pause detection and application
-- **Progress Tracking**: Monitor pause processing alongside audio generation
-- **Quality Assurance**: Automatic validation of pause placement
-#### **🎚️ Professional Output**
-- **Seamless Integration**: Pauses blend naturally with generated speech
-- **Volume Consistency**: Silence segments match the audio output specifications
-- **Format Compatibility**: Works with all supported audio formats and quality settings
-- **Project Preservation**: Pause information saved in project metadata for regeneration
-### 💡 **Pro Tips for Better Audiobooks**
-#### **🎯 Dialogue Formatting**
-- **Character Consistency**: Always use the same character name format `[Name]`
-- **Natural Breaks**: Place returns where a human reader would naturally pause
-- **Scene Transitions**: Use multiple returns (2-3) for major scene changes
-- **Emotional Beats**: Add single returns before/after emotional dialogue
-#### **📚 Chapter Structure**
-```
-Chapter 1: The Beginning
-Opening paragraph with scene setting.
-"Character dialogue with natural flow."
-Descriptive narrative continues.
-Major scene transition with extended pause.
-New section begins here.
-```
-#### **🎪 Advanced Techniques**
-- **Cliffhangers**: Use extended pauses before revealing crucial information
-- **Action Sequences**: Shorter, punchy sentences with minimal pauses for intensity
-- **Contemplative Moments**: Longer pauses for reflection and character development
-- **Comedic Timing**: Strategic pauses before punchlines or comedic reveals
-### 🔍 **Debug Output Examples**
-When generating your audiobook, watch for these helpful console messages:
-```
-🔇 Detected 15 line breaks → 1.5s total pause time
-🔇 Line breaks detected in [Character1]: +0.3s pause (from 3 returns)
-🔇 Chunk 2 (Narrator): Added 0.2s pause after speech
-```
-This real-time feedback helps you understand exactly how your formatting translates to audio timing.
----
-## 🆕 Recent Improvements
-### 🎯 **Audio Quality Enhancements**
-We've significantly improved audio generation quality by optimizing the underlying TTS parameters:
-- **Enhanced P-top and Minimum P Settings**: Fine-tuned probability parameters for more natural speech patterns
-- **Reduced Audio Artifacts**: Better handling of pronunciation and intonation
-- **Improved Voice Consistency**: More stable voice characteristics across long generations
-- **Better Pronunciation**: Enhanced handling of complex words and names
-**📝 Note for Existing Users**:
-- Older voice profiles will continue to work as before
-- To take advantage of the new audio quality improvements, consider re-creating voice profiles
-- Existing projects remain fully compatible
-### 📋 **Text Queuing System**
-Perfect for processing large books or multiple chapters:
-- **Batch Upload**: Upload multiple text files of any size
-- **Sequential Processing**: Automatically processes files one after another
-- **Progress Tracking**: Monitor generation progress across all queued items
-- **Flexible Chapter Sizes**: No restrictions on individual file length
-- **Unattended Generation**: Set up large projects and let them run automatically
-### 🔄 **Chunk-Based TTS System**
-Enhanced the core text-to-speech engine for better reliability:
-- **Background Chunking**: Automatically splits long texts into optimal chunks
-- **Memory Management**: Better handling of large text inputs
-- **Error Recovery**: Improved resilience during long generation sessions
-- **Consistent Quality**: Maintains voice quality across chunk boundaries
-- **Progress Feedback**: Real-time updates on generation progress
----
-## 🎚️ Volume Normalization Guide
-### **Individual Voice Setup**
-1. Go to **Voice Library** tab
-2. Upload your voice sample and configure settings
-3. Set target volume level (default: -18 dB for audiobooks)
-4. Choose from professional presets or use custom levels
-5. Save voice profile with volume settings
-### **Multi-Voice Projects**
-1. Navigate to **Multi-Voice Audiobook Creation** tab
-2. Enable volume normalization for all voices
-3. Set target level for consistent character voices
-4. All characters will be automatically normalized during generation
-### **Text Queuing Workflow** ⭐ *NEW*
-1. Go to **Production Studio** tab
-2. Select "Batch Processing" mode
-3. Upload multiple text files (chapters, sections, etc.)
-4. Choose your voice and settings
-5. Start batch processing - files will generate sequentially
-6. Monitor progress and download completed audiobooks
-### **Professional Standards**
-- **📖 Audiobook Standard**: -18 dB RMS (recommended for most audiobooks)
-- **🎙️ Podcast Standard**: -16 dB RMS (for podcast-style content)
-- **🔇 Quiet/Comfortable**: -20 dB RMS (for quiet listening environments)
-- **🔊 Loud/Energetic**: -14 dB RMS (for dynamic, energetic content)
-- **📺 Broadcast Standard**: -23 dB RMS (for broadcast television standards)
----
-## 📁 Project Structure
-```
-📦 Your Audiobook Projects
-├── 🎤 speakers/           # Voice library and samples
-├── 📚 audiobook_projects/ # Generated audiobooks
-├── 🔧 src/audiobook/      # Core processing modules
-└── 📄 Generated files...  # Audio chunks and final outputs
-```
----
-## 🎯 Workflow
-1. **📝 Prepare Text**: Format your story with proper chapter breaks and strategic line breaks for natural pauses
-2. **🎤 Select Voices**: Choose or clone voices for your characters
-3. **🎚️ Configure Volume**: Set professional volume levels and normalization
-4. **⚙️ Configure Settings**: Adjust quality, speed, and processing options
-5. **🎧 Generate Audio**: Create your audiobook with advanced TTS and automatic pause insertion
-6. **🧹 Clean & Optimize**: Use smart cleanup tools for perfect audio
-7. **📦 Export**: Get your finished audiobook ready for distribution
-### 🎭 **Enhanced Multi-Voice Workflow**
-1. **📝 Format Dialogue**: Use `[Character]` tags and strategic line breaks for natural flow
-2. **🔇 Add Return Pauses**: Place line breaks where you want natural speech pauses (0.1s each)
-3. **🎤 Assign Voices**: Map each character to their voice profile
-4. **⚡ Process with Intelligence**: Watch console output for pause detection feedback
-5. **🎧 Review & Adjust**: Listen to generated audio and refine formatting if needed
-### 📋 **Batch Processing Workflow** ⭐ *NEW*
-1. **📚 Organize Chapters**: Split your book into individual text files
-2. **📋 Queue Setup**: Upload all files to the batch processing system
-3. **🎤 Voice Selection**: Choose voice and configure settings once
-4. **🔄 Automated Generation**: Let the system process all files sequentially
-5. **📊 Monitor Progress**: Track completion status in real-time
-6. **📦 Collect Results**: Download all generated audiobook chapters
----
-## 🛠️ Technical Requirements
-- **Python 3.8+**
-- **CUDA GPU** (recommended for faster processing)
-- **8GB+ RAM** (16GB recommended for large projects)
-- **Modern web browser** for the interface
-### 🔧 **CUDA Support**
-- CUDA compatibility issues have been resolved with updated dependencies
-- GPU acceleration is now stable for extended generation sessions
-- Fallback to CPU processing available if CUDA issues occur
-- **If you encounter CUDA assertion errors**: Use the patched version from the installation instructions above
-- The fix addresses PyTorch indexing issues that could cause crashes during audio generation
----
-## ⚠️ Known Issues & Compatibility
-### **Multi-Voice Generation**
-- Short sentences or sections may occasionally cause issues during multi-voice generation
-- This is a limitation of the underlying TTS models rather than the implementation
-- **Workaround**: Use longer, more detailed sentences for better stability
-- Single-voice generation is not affected by this issue
-### **Voice Profile Compatibility**
-- **Existing Voices**: All older voice profiles remain fully functional
-- **New Features**: To benefit from improved audio quality, consider re-creating voice profiles
-- **Project Compatibility**: Existing audiobook projects work without modification
-- **Regeneration**: Individual chunks can be regenerated with improved quality settings
-### **Batch Processing Considerations**
-- Large batch jobs may take significant time depending on text length and hardware
-- Monitor system resources during extended batch processing sessions
-- Consider processing very large books in smaller batches for better control
----
-## 📋 Supported Formats
-### Input
-- **Text**: `.txt`, `.md`, formatted stories and scripts
-- **Audio Samples**: `.wav`, `.mp3`, `.flac` for voice cloning
-- **Batch Files**: Multiple text files for queue processing
-### Output
-- **Audio**: High-quality `.wav` files with professional volume levels
-- **Projects**: Organized folder structure with chapters
-- **Exports**: Ready-to-use audiobook files
-- **Batch Results**: Multiple completed audiobooks from queue processing
----
-## 🆘 Support
-- **Features Guide**: See `AUDIOBOOK_FEATURES.md` for detailed capabilities
-- **Development Notes**: Check `development/` folder for technical details
-- **Issues**: Report problems via GitHub issues
----
-## 📄 License
-This project is licensed under the terms specified in `LICENSE`.
----
-**🎉 Ready to create amazing audiobooks with professional volume levels and enhanced audio quality? Run `./launch_audiobook.bat` and start generating!**

+---
+license: apache-2.0
+title: ebookChatterBox
+sdk: gradio
+---
+# 🎧 Chatterbox Audiobook Generator
+**This is a work in progress. Treat this as a pre-launch repo for now; if you find bugs, please report them in the issues area. Thank you.**
+**Transform your text into high-quality audiobooks with advanced TTS models, voice cloning, and professional volume normalization.**
+## 🚀 Quick Start
+### 1. Install Dependencies
+```bash
+./install-audiobook.bat
+```
+### 2. Launch the Application
+```bash
+./launch_audiobook.bat
+```
+### 3. CUDA Issue Fix (If Needed)
+If you encounter CUDA assertion errors during generation, install the patched version:
+```bash
+# Activate your virtual environment first
+venv\Scripts\activate.bat
+# Install the CUDA-fixed version
+pip install --force-reinstall --no-cache-dir "chatterbox-tts @ git+https://github.com/fakerybakery/better-chatterbox@fix-cuda-issue"
+```
+After launch, the web interface opens automatically in your browser at `http://localhost:7860`.
+---
+## ✨ Features
+### 📚 **Audiobook Creation**
+- **Single Voice**: Generate entire audiobooks with one consistent voice
+- **Multi-Voice**: Create dynamic audiobooks with multiple characters
+- **Custom Voices**: Clone voices from audio samples for personalized narration
+- **Professional Volume Normalization**: Ensure consistent audio levels across all voices
+- **📋 Text Queuing System** ⭐ *NEW*: Upload a book as chapters of any size and generate them one after another without stopping
+- **🔄 Chunk-Based Processing** ⭐ *NEW*: Improved reliability for longer text generations
+### 🎵 **Audio Processing**
+- **Smart Cleanup**: Remove unwanted silence and audio artifacts
+- **Volume Normalization**: Professional-grade volume balancing for all voices
+- **Real-time Audio Analysis**: Live volume level monitoring and feedback
+- **Preview System**: Test settings before applying to entire projects
+- **Batch Processing**: Process multiple projects efficiently
+- **Quality Control**: Advanced audio optimization tools
+- **🎯 Enhanced Audio Quality** ⭐ *NEW*: Improved top-p and min-p sampling parameters for better voice generation
+### 🎭 **Voice Management**
+- **Voice Library**: Organize and manage your voice collection
+- **Voice Cloning**: Create custom voices from audio samples
+- **Volume Settings**: Configure target volume levels for each voice
+- **Professional Presets**: Industry-standard volume levels (audiobook, podcast, broadcast)
+- **Character Assignment**: Map specific voices to story characters
+### 📊 **Volume Normalization System** ⭐ *NEW*
+- **Professional Standards**: Audiobook (-18 dB), Podcast (-16 dB), Broadcast (-23 dB) presets
+- **Consistent Character Voices**: All characters maintain the same volume level
+- **Real-time Analysis**: Color-coded volume status with RMS and peak level display
+- **Retroactive Normalization**: Apply volume settings to existing voice projects
+- **Multi-Voice Support**: Batch normalize all voices in multi-character audiobooks
+- **Soft Limiting**: Intelligent audio limiting to prevent distortion
+### 📖 **Text Processing**
+- **Chapter Support**: Automatic chapter detection and organization
+- **Multi-Voice Parsing**: Parse character dialogue automatically
+- **Text Validation**: Ensure proper formatting before generation
+- **📋 Queue Management** ⭐ *NEW*: Batch process multiple text files sequentially
+- **🔇 Return Pause System** ⭐ *NEW*: Automatic pause insertion based on line breaks for natural speech flow
+---
+## 🎭 Custom Audiobook Processing Pipeline ⭐ *NEW*
+Our advanced text processing pipeline transforms your written content into natural-sounding audiobooks with intelligent pause placement and character flow management.
+### 🔇 **Return Pause System**
+**Automatic pause insertion based on your text formatting** - Every line break (`\n`) in your text automatically adds a 0.1-second pause to the generated audio, creating natural speech rhythms without manual intervention.
+#### **How It Works**
+- **Line Break Detection**: System automatically counts all line breaks in your text
+- **Pause Calculation**: Each return adds exactly 0.1 seconds of silence
+- **Accumulative Pauses**: Multiple consecutive line breaks create longer pauses
+- **Universal Support**: Works with single-voice, multi-voice, and batch processing
+#### **Example Text Formatting**
+```
+[Narrator] The sun was setting over the hills.
+[Character1] "We need to find shelter soon."
+[Character2] "I see a cave up ahead.
+Let's hurry before it gets dark."
+[Narrator] They rushed toward the cave, hearts pounding.
+```
+**Result**: Natural pauses between dialogue, emphasis pauses for dramatic effect, and smooth character transitions.
+### 📝 **Text Formatting Best Practices**
+#### **🎭 Multi-Voice Dialogue Structure**
+```
+[Character Name] Dialogue content here.
+[Another Character] Response content here.
+Multiple lines can be used for the same character.
+[Narrator] Descriptive text and scene setting.
+```
+#### **🎪 Natural Flow Techniques**
+- **Paragraph Breaks**: Use double line breaks for scene transitions
+- **Emphasis Pauses**: Add extra returns before important revelations
+- **Character Separation**: Single returns between different speakers
+- **Breathing Room**: Natural pauses for complex concepts or emotional moments
+#### **📖 Single Voice Formatting**
+```
+Chapter content flows naturally here.
+New paragraphs create natural pauses.
+Extended pauses can emphasize dramatic moments.
+Regular text continues with normal pacing.
+```
+### 🔄 **Processing Pipeline Features**
+#### **🧠 Intelligent Text Analysis**
+- **Line Break Preservation**: Maintains your formatting intentions throughout processing
+- **Character Assignment**: Automatically maps voice tags to selected voice profiles
+- **Chunk Optimization**: Breaks long texts into optimal segments while preserving pause timing
+- **Error Recovery**: Validates text and provides helpful formatting suggestions
+#### **⚡ Real-Time Processing**
+- **Live Feedback**: Console output shows exactly how many pauses are being added
+- **Debug Information**: Detailed logging of pause detection and application
+- **Progress Tracking**: Monitor pause processing alongside audio generation
+- **Quality Assurance**: Automatic validation of pause placement
+#### **🎚️ Professional Output**
+- **Seamless Integration**: Pauses blend naturally with generated speech
+- **Volume Consistency**: Silence segments match the audio output specifications
+- **Format Compatibility**: Works with all supported audio formats and quality settings
+- **Project Preservation**: Pause information saved in project metadata for regeneration
+### 💡 **Pro Tips for Better Audiobooks**
+#### **🎯 Dialogue Formatting**
+- **Character Consistency**: Always use the same character name format `[Name]`
+- **Natural Breaks**: Place returns where a human reader would naturally pause
+- **Scene Transitions**: Use multiple returns (2-3) for major scene changes
+- **Emotional Beats**: Add single returns before/after emotional dialogue
+#### **📚 Chapter Structure**
+```
+Chapter 1: The Beginning
+Opening paragraph with scene setting.
+"Character dialogue with natural flow."
+Descriptive narrative continues.
+Major scene transition with extended pause.
+New section begins here.
+```
+#### **🎪 Advanced Techniques**
+- **Cliffhangers**: Use extended pauses before revealing crucial information
+- **Action Sequences**: Shorter, punchy sentences with minimal pauses for intensity
+- **Contemplative Moments**: Longer pauses for reflection and character development
+- **Comedic Timing**: Strategic pauses before punchlines or comedic reveals
+### 🔍 **Debug Output Examples**
+When generating your audiobook, watch for these helpful console messages:
+```
+🔇 Detected 15 line breaks → 1.5s total pause time
+🔇 Line breaks detected in [Character1]: +0.3s pause (from 3 returns)
+🔇 Chunk 2 (Narrator): Added 0.2s pause after speech
+```
+This real-time feedback helps you understand exactly how your formatting translates to audio timing.
+---
+## 🆕 Recent Improvements
+### 🎯 **Audio Quality Enhancements**
+We've significantly improved audio generation quality by optimizing the underlying TTS parameters:
+- **Enhanced Top-p and Min-p Settings**: Fine-tuned sampling probability thresholds for more natural speech patterns
+- **Reduced Audio Artifacts**: Better handling of pronunciation and intonation
+- **Improved Voice Consistency**: More stable voice characteristics across long generations
+- **Better Pronunciation**: Enhanced handling of complex words and names
+**📝 Note for Existing Users**:
+- Older voice profiles will continue to work as before
+- To take advantage of the new audio quality improvements, consider re-creating voice profiles
+- Existing projects remain fully compatible
+### 📋 **Text Queuing System**
+Perfect for processing large books or multiple chapters:
+- **Batch Upload**: Upload multiple text files of any size
+- **Sequential Processing**: Automatically processes files one after another
+- **Progress Tracking**: Monitor generation progress across all queued items
+- **Flexible Chapter Sizes**: No restrictions on individual file length
+- **Unattended Generation**: Set up large projects and let them run automatically
+### 🔄 **Chunk-Based TTS System**
+The core text-to-speech engine now handles long inputs more reliably:
+- **Background Chunking**: Automatically splits long texts into optimal chunks
+- **Memory Management**: Better handling of large text inputs
+- **Error Recovery**: Improved resilience during long generation sessions
+- **Consistent Quality**: Maintains voice quality across chunk boundaries
+- **Progress Feedback**: Real-time updates on generation progress
+---
+## 🎚️ Volume Normalization Guide
+### **Individual Voice Setup**
+1. Go to **Voice Library** tab
+2. Upload your voice sample and configure settings
+3. Set target volume level (default: -18 dB for audiobooks)
+4. Choose from professional presets or use custom levels
+5. Save voice profile with volume settings
+### **Multi-Voice Projects**
+1. Navigate to **Multi-Voice Audiobook Creation** tab
+2. Enable volume normalization for all voices
+3. Set target level for consistent character voices
+4. All characters will be automatically normalized during generation
+### **Text Queuing Workflow** ⭐ *NEW*
+1. Go to **Production Studio** tab
+2. Select "Batch Processing" mode
+3. Upload multiple text files (chapters, sections, etc.)
+4. Choose your voice and settings
+5. Start batch processing - files will generate sequentially
+6. Monitor progress and download completed audiobooks
+### **Professional Standards**
+- **📖 Audiobook Standard**: -18 dB RMS (recommended for most audiobooks)
+- **🎙️ Podcast Standard**: -16 dB RMS (for podcast-style content)
+- **🔇 Quiet/Comfortable**: -20 dB RMS (for quiet listening environments)
+- **🔊 Loud/Energetic**: -14 dB RMS (for dynamic, energetic content)
+- **📺 Broadcast Standard**: -23 dB RMS (for broadcast television standards)
+---
+## 📁 Project Structure
+```
+📦 Your Audiobook Projects
+├── 🎤 speakers/           # Voice library and samples
+├── 📚 audiobook_projects/ # Generated audiobooks
+├── 🔧 src/audiobook/      # Core processing modules
+└── 📄 Generated files...  # Audio chunks and final outputs
+```
+---
+## 🎯 Workflow
+1. **📝 Prepare Text**: Format your story with proper chapter breaks and strategic line breaks for natural pauses
+2. **🎤 Select Voices**: Choose or clone voices for your characters
+3. **🎚️ Configure Volume**: Set professional volume levels and normalization
+4. **⚙️ Configure Settings**: Adjust quality, speed, and processing options
+5. **🎧 Generate Audio**: Create your audiobook with advanced TTS and automatic pause insertion
+6. **🧹 Clean & Optimize**: Use smart cleanup tools for perfect audio
+7. **📦 Export**: Get your finished audiobook ready for distribution
+### 🎭 **Enhanced Multi-Voice Workflow**
+1. **📝 Format Dialogue**: Use `[Character]` tags and strategic line breaks for natural flow
+2. **🔇 Add Return Pauses**: Place line breaks where you want natural speech pauses (0.1s each)
+3. **🎤 Assign Voices**: Map each character to their voice profile
+4. **⚡ Process with Intelligence**: Watch console output for pause detection feedback
+5. **🎧 Review & Adjust**: Listen to generated audio and refine formatting if needed
+### 📋 **Batch Processing Workflow** ⭐ *NEW*
+1. **📚 Organize Chapters**: Split your book into individual text files
+2. **📋 Queue Setup**: Upload all files to the batch processing system
+3. **🎤 Voice Selection**: Choose voice and configure settings once
+4. **🔄 Automated Generation**: Let the system process all files sequentially
+5. **📊 Monitor Progress**: Track completion status in real-time
+6. **📦 Collect Results**: Download all generated audiobook chapters
+---
+## 🛠️ Technical Requirements
+- **Python 3.8+**
+- **CUDA GPU** (recommended for faster processing)
+- **8GB+ RAM** (16GB recommended for large projects)
+- **Modern web browser** for the interface
+### 🔧 **CUDA Support**
+- CUDA compatibility issues have been resolved with updated dependencies
+- GPU acceleration is now stable for extended generation sessions
+- Fallback to CPU processing available if CUDA issues occur
+- **If you encounter CUDA assertion errors**: Use the patched version from the installation instructions above
+- The fix addresses PyTorch indexing issues that could cause crashes during audio generation
+---
+## ⚠️ Known Issues & Compatibility
+### **Multi-Voice Generation**
+- Short sentences or sections may occasionally cause issues during multi-voice generation
+- This is a limitation of the underlying TTS models rather than the implementation
+- **Workaround**: Use longer, more detailed sentences for better stability
+- Single-voice generation is not affected by this issue
+### **Voice Profile Compatibility**
+- **Existing Voices**: All older voice profiles remain fully functional
+- **New Features**: To benefit from improved audio quality, consider re-creating voice profiles
+- **Project Compatibility**: Existing audiobook projects work without modification
+- **Regeneration**: Individual chunks can be regenerated with improved quality settings
+### **Batch Processing Considerations**
+- Large batch jobs may take significant time depending on text length and hardware
+- Monitor system resources during extended batch processing sessions
+- Consider processing very large books in smaller batches for better control
+---
+## 📋 Supported Formats
+### Input
+- **Text**: `.txt`, `.md`, formatted stories and scripts
+- **Audio Samples**: `.wav`, `.mp3`, `.flac` for voice cloning
+- **Batch Files**: Multiple text files for queue processing
+### Output
+- **Audio**: High-quality `.wav` files with professional volume levels
+- **Projects**: Organized folder structure with chapters
+- **Exports**: Ready-to-use audiobook files
+- **Batch Results**: Multiple completed audiobooks from queue processing
+---
+## 🆘 Support
+- **Features Guide**: See `AUDIOBOOK_FEATURES.md` for detailed capabilities
+- **Development Notes**: Check `development/` folder for technical details
+- **Issues**: Report problems via GitHub issues
+---
+## 📄 License
+This project is licensed under the terms specified in `LICENSE`.
+---
+**🎉 Ready to create amazing audiobooks with professional volume levels and enhanced audio quality? Run `./launch_audiobook.bat` and start generating!**
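
Two numeric rules stated in the README above can be sketched in a few lines of Python: the Return Pause System (0.1 s of silence per line break) and normalization toward a target RMS level such as -18 dB. This is a minimal illustration, not the project's actual code. The function names `pause_seconds`, `rms_dbfs`, and `gain_to_target` are hypothetical, and the sketch assumes float samples in the range [-1, 1] and that "dB RMS" means RMS level relative to full scale. The README's soft limiting is not modelled.

```python
import math

# From the README: each line break adds 0.1 s of silence.
PAUSE_PER_BREAK_S = 0.1


def pause_seconds(text: str) -> float:
    """Total pause implied by the line breaks in `text` (hypothetical helper)."""
    # Consecutive breaks accumulate, so a plain count is enough.
    return round(text.count("\n") * PAUSE_PER_BREAK_S, 3)


def rms_dbfs(samples) -> float:
    """RMS level in dB relative to full scale, for float samples in [-1, 1]."""
    if not samples:
        raise ValueError("samples must not be empty")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence has no finite level
    return 10 * math.log10(mean_square)


def gain_to_target(samples, target_db: float = -18.0) -> float:
    """Linear gain that moves the RMS level of `samples` to `target_db`."""
    current_db = rms_dbfs(samples)
    if current_db == float("-inf"):
        return 1.0  # nothing to scale in pure silence
    return 10 ** ((target_db - current_db) / 20)


if __name__ == "__main__":
    script = "[Narrator] The sun was setting.\n\n[Character1] \"Hurry.\"\n"
    print(f"pause: {pause_seconds(script)} s")

    tone = [0.5, -0.5] * 100
    gain = gain_to_target(tone, target_db=-18.0)
    normalized = [s * gain for s in tone]
    print(f"level after gain: {rms_dbfs(normalized):.2f} dB")
```

Counting line breaks directly matches the README's description that multiple consecutive returns create longer pauses. The gain is computed once per clip, so every voice scaled this way lands on the same RMS target regardless of how loud its source sample was.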