Overview
Voicebot by UnifyApps adds voice interaction to your automation workflows. With it, you can create, manage, and process voice-based communications through AI-powered voice agents. It provides real-time voice processing, session management, and response handling, which makes it well suited to customer service automation, voice-activated workflows, and interactive voice response (IVR) systems.


Use Cases
Automated Customer Support:
A customer service team implements Voicebot by UnifyApps to handle initial customer inquiries. When customers call, the voicebot starts a session, listens to their requests, processes the audio content, and provides intelligent responses. Complex queries are seamlessly transferred to human agents with full conversation context, reducing wait times and improving customer satisfaction.
Voice-Activated Workflow Triggers:
A logistics company uses voice commands to trigger warehouse operations. Workers speak into devices to initiate inventory checks, shipment processing, or status updates. The voicebot processes these voice commands, converts them to actionable data, and triggers appropriate workflow sequences, streamlining operations and reducing manual data entry.
Interactive Voice Surveys:
A market research firm automates survey collection through voice interactions. The voicebot initiates sessions with respondents, asks survey questions, processes spoken responses, and records answers in structured formats. This approach increases response rates and provides richer qualitative data compared to traditional text-based surveys.


Send Response
The Send Response action in Voicebot by UnifyApps enables your automation to provide voice responses back to users during active voice sessions. This action controls how the voicebot communicates with users, managing response delivery and call flow control.
Input Fields
Interrupt
: Configure whether the response should interrupt any currently playing audio or wait for it to complete.
Type: Boolean
Default: true
Options:
true: Immediately interrupt current audio playback
false: Wait for current audio to finish before responding
End Call
: Determine whether the voice session should be terminated after sending the response.
Type: Boolean
Default: false
Options:
true: End the voice session after response delivery
false: Keep the session active for continued interaction
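As a rough illustration of the two inputs above, the following Python sketch models them as a small data class with the documented defaults. The class name `SendResponseInput`, the `to_payload` helper, and the `interrupt`/`endCall` key names are assumptions made for illustration; this page does not specify how the platform serializes these fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SendResponseInput:
    """Hypothetical model of the Send Response input fields."""

    interrupt: bool = True   # documented default: interrupt current playback
    end_call: bool = False   # documented default: keep the session active

    def to_payload(self) -> dict:
        # Key names are illustrative, not the platform's actual schema.
        return {"interrupt": self.interrupt, "endCall": self.end_call}


# A closing message: let current audio finish, then end the session.
goodbye = SendResponseInput(interrupt=False, end_call=True)
print(goodbye.to_payload())
```

Setting Interrupt to false and End Call to true together is one plausible way to close a call politely, since the user hears any pending audio before the session ends.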
Advanced Configuration
Fallback Mode
: STOP - Determines action behavior when errors occur
Resource Version
: 86 - Current version of the Send Response functionality
Group Integration
: Links with other voicebot actions in the same workflow group
Output
The Send Response action provides:
Response Status
: Confirmation of successful response delivery
Session State
: Current status of the voice session
Timing Information
: Response delivery timestamps
Error Details
: Any issues encountered during response transmission
This action is essential for creating interactive voice experiences where your automation needs to provide intelligent responses based on user input or workflow logic.


Respond to Voicebot Session
The Respond to Voicebot Session action processes incoming voice interactions and prepares responses within an active voice session. This action serves as the core processing engine for handling user voice input and generating appropriate responses.
Input Fields
Event Id
: Unique identifier for the specific voice event being processed.
Type: String
Required: Yes
Purpose: Links the response to the specific voice interaction event
Audio Content
: The processed audio data from the user's voice input.
Type: Audio data object
Format: Typically processed speech-to-text content
Usage: Contains the actual voice input that needs to be processed
Room Name
: Identifier for the voice session room or channel.
Type: String
Purpose: Organizes voice sessions and enables multi-user voice environments
Session Id
: Unique identifier for the current voice interaction session.
Type: String
Required: Yes
Purpose: Maintains session continuity and context
Received Time
: Timestamp indicating when the voice input was received.
Type: DateTime
Format: ISO 8601 timestamp
Usage: Enables timing analysis and session sequencing
Sent Time
: Timestamp for when the original voice input was transmitted.
Type: DateTime
Format: ISO 8601 timestamp
Usage: Calculates processing delays and response times
Start Event
: Configuration for session initiation parameters.
Type: Dropdown selection
Options: Various session start triggers and conditions
Purpose: Defines how the voice session begins
Agent State
: Current operational state of the voice agent.
Type: String
Values: Active, Listening, Processing, Responding, Idle
Purpose: Manages agent behavior and availability
Format
: Audio and response format specifications.
Type: String
Common Values: PCM16, MP3, WAV
Purpose: Ensures compatibility between voice input and output formats
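The Sent Time and Received Time fields can be combined to estimate how long a voice event spent in transit. The sketch below parses two ISO 8601 timestamps and returns the difference in milliseconds. The function name `transit_delay_ms` and the sample values are hypothetical, and the sketch assumes both timestamps carry an explicit UTC offset, which this page does not confirm.

```python
from datetime import datetime


def transit_delay_ms(sent_time: str, received_time: str) -> float:
    """Return the milliseconds elapsed between Sent Time and Received Time.

    Both arguments are ISO 8601 strings with an explicit UTC offset
    (an assumption; the page only says "ISO 8601 timestamp").
    """
    sent = datetime.fromisoformat(sent_time)
    received = datetime.fromisoformat(received_time)
    return (received - sent).total_seconds() * 1000.0


# Sample values invented for illustration.
delay = transit_delay_ms(
    "2024-01-01T12:00:00.000+00:00",
    "2024-01-01T12:00:00.250+00:00",
)
print(f"{delay:.0f} ms")
```

A negative result would indicate clock skew between sender and receiver, so a workflow that relies on this value may want to treat negative delays as unknown rather than as real measurements.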
Advanced options
Caching
: Disabled by default for real-time voice processing
Retry
: Disabled to prevent voice interaction loops
Error Handling
: Stop automation to prevent cascading voice session errors
Output
For each voice interaction processed, the action outputs:
Processed Response
: The generated response ready for voice synthesis
Session Context
: Updated session state and conversation history
Processing Metrics
: Response time, confidence scores, and processing statistics
Next Action Indicators
: Guidance for subsequent workflow steps
This action is crucial for creating intelligent voice interactions that understand user intent and provide contextually appropriate responses.


Start Voicebot Session
The Start Voicebot Session action initializes new voice interaction sessions, establishing the connection between users and your voice-enabled automation workflows. This action sets up the technical foundation for voice communication and configures session parameters.
Input Fields
Session Id
: Unique identifier for the voice session being created.
Type: String
Generation: Auto-generated or manually specified
Purpose: Tracks the session throughout its lifecycle
User Id
: Identifier for the user participating in the voice session.
Type: String
Required: No (recommended for personalized interactions)
Usage: Links voice sessions to specific users or accounts
Workflow Id
: Reference to the automation workflow that will process voice interactions.
Type: String
Required: Yes
Purpose: Determines which automation logic handles the voice session
Input Audio Format
: Technical specification for incoming audio processing.
Type: String
Default: pcm16
Options:
pcm16: 16-bit PCM audio (recommended for quality)
pcm8: 8-bit PCM audio (lower bandwidth)
mp3: Compressed audio format
wav: Uncompressed audio format
Output Audio Format
: Technical specification for outgoing audio synthesis.
Type: String
Default: pcm16
Options:
pcm16: 16-bit PCM audio (recommended for quality)
pcm8: 8-bit PCM audio (lower bandwidth)
mp3: Compressed audio format
wav: Uncompressed audio format
Create Room
: Determines whether to establish a new voice session room.
Type: Boolean
Default: true
Options:
true: Create a new isolated voice session environment
false: Join an existing voice session room
Case Id
: Optional identifier linking the voice session to specific cases or tickets.
Type: String
Usage: Useful for customer service scenarios where voice sessions relate to support cases
AI Agent Id
: Identifier for the specific AI agent that will handle voice interactions.
Type: String
Purpose: Enables multiple AI agents with different capabilities or personalities
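To make the defaults and allowed values above concrete, here is a minimal Python sketch that models the Start Voicebot Session inputs and rejects audio formats outside the documented options. The class name `StartSessionInput`, the snake_case attribute names, and the sample identifiers are hypothetical; the platform's real configuration surface may look different.

```python
from dataclasses import dataclass
from typing import Optional

# Format options listed above for both input and output audio.
SUPPORTED_FORMATS = {"pcm16", "pcm8", "mp3", "wav"}


@dataclass
class StartSessionInput:
    """Hypothetical model of the Start Voicebot Session input fields."""

    workflow_id: str                      # required
    session_id: Optional[str] = None      # may be auto-generated
    user_id: Optional[str] = None         # recommended, not required
    input_audio_format: str = "pcm16"     # documented default
    output_audio_format: str = "pcm16"    # documented default
    create_room: bool = True              # documented default
    case_id: Optional[str] = None
    ai_agent_id: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.workflow_id:
            raise ValueError("workflow_id is required")
        for fmt in (self.input_audio_format, self.output_audio_format):
            if fmt not in SUPPORTED_FORMATS:
                raise ValueError(f"unsupported audio format: {fmt!r}")


# Identifiers below are made up for the example.
config = StartSessionInput(workflow_id="wf-demo", user_id="user-42")
print(config.input_audio_format, config.create_room)
```

Validating the format names before the session starts surfaces a typo such as `pcm_16` as an immediate error instead of a failed session.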
Advanced Configuration
Caching
: Disabled for real-time voice processing requirements
Retry
: Disabled to prevent duplicate session creation
Error Handling
: Stop automation to prevent incomplete session initialization
Resource Version
: 10 - Current version of session start functionality
Output
Upon successful session initiation, the action provides:
Session Details
: Complete session configuration and identifiers
Connection Status
: Confirmation of established voice connection
Agent Assignment
: Details of assigned AI agent for the session
Room Information
: Voice session room details and access parameters
Quality Metrics
: Initial connection quality and latency measurements
This action is the foundation for all voice interactions, ensuring proper session setup and configuration for optimal voice processing performance.


Start Listening to Voicebot Session
The Start Listening to Voicebot Session action activates audio input monitoring for established voice sessions. This action enables your voicebot to actively listen for user voice input and begin processing spoken interactions.
Input Fields
Session Id
: Reference to the active voice session that should begin listening.
Type: String
Required: Yes
Source: Typically from the Start Voicebot Session action output
Purpose: Links listening activation to the correct session
Workflow Id
: Reference to the workflow that will process detected voice input.
Type: String
Required: Yes
Purpose: Determines processing logic for incoming voice data
Room Name
: Identifier for the voice session room where listening should occur.
Type: String
Source: From session creation or room assignment
Purpose: Focuses listening on the correct voice channel
Track Id
: Unique identifier for the audio track being monitored.
Type: String
Usage: Enables multiple audio stream monitoring within single sessions
Purpose: Manages complex voice environments with multiple participants
Functionality
The listening activation process:
Audio Stream Connection
: Establishes connection to the voice session's audio input
Voice Activity Detection
: Monitors for speech patterns and voice activity
Noise Filtering
: Applies audio processing to improve voice recognition quality
Continuous Monitoring
: Maintains active listening state throughout the session
Event Triggering
: Generates voice input events when speech is detected
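This page does not describe how voice activity detection is implemented. To give a feel for the idea, the sketch below uses a simple RMS energy threshold over PCM16 frames, a common baseline technique and not necessarily what the platform does. The function names, the threshold of 500, and the 16 kHz sample rate are all assumptions chosen for the example.

```python
import math
import struct


def frame_rms(frame: bytes) -> float:
    """Root-mean-square amplitude of a little-endian PCM16 mono frame."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frame[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def is_speech(frame: bytes, threshold: float = 500.0) -> bool:
    """Treat a frame as voice activity when its RMS exceeds the threshold."""
    return frame_rms(frame) > threshold


# Synthetic 10 ms frames: silence, then a 440 Hz tone at 16 kHz (assumed rate).
silence = struct.pack("<160h", *([0] * 160))
tone = struct.pack(
    "<160h",
    *(int(8000 * math.sin(2 * math.pi * 440 * n / 16000)) for n in range(160)),
)
print(is_speech(silence), is_speech(tone))
```

An energy threshold cannot tell speech from other loud sounds, which is one reason production systems pair detection with noise filtering as listed above.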
Advanced Configuration
Caching
: Disabled for real-time audio processing
Retry
: Disabled to prevent audio stream conflicts
Error Handling
: Stop automation to prevent incomplete listening setup
Resource Version
: 10 - Current version of listening functionality
Output
When listening is successfully activated, the action provides:
Listening Status
: Confirmation that voice monitoring is active
Audio Stream Details
: Technical information about the audio input stream
Detection Sensitivity
: Current voice activity detection settings
Processing State
: Status of voice recognition and processing pipeline
Event Configuration
: Details of how voice input events will be generated
This action is essential for creating responsive voice interactions, as it enables your automation to detect and respond to user voice input in real-time.


Workflow Integration Patterns
Sequential Voice Processing
The typical workflow pattern follows this sequence:
Start Voicebot Session
- Initialize voice communication
Start Listening to Voicebot Session
- Activate voice input monitoring
Respond to Voicebot Session
- Process incoming voice interactions
Send Response
- Deliver voice responses back to users
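The four-step sequence above can be pictured as a chain of functions that pass a shared context along. The Python sketch below uses stub functions in place of the real actions; every function name, dictionary key, and value here is hypothetical and only mirrors the order of the steps.

```python
from typing import Callable, Dict, List

Context = Dict[str, object]
Step = Callable[[Context], Context]


def start_session(ctx: Context) -> Context:
    # Stub: a real action would create the session and return its details.
    return {**ctx, "session_id": "session-demo"}


def start_listening(ctx: Context) -> Context:
    # Stub: a real action would activate audio input monitoring.
    return {**ctx, "listening": True}


def respond_to_session(ctx: Context) -> Context:
    # Stub: echo the utterance in place of real response generation.
    return {**ctx, "reply": f"You said: {ctx['utterance']}"}


def send_response(ctx: Context) -> Context:
    # Stub: a real action would deliver synthesized audio to the user.
    return {**ctx, "delivered": True}


PIPELINE: List[Step] = [
    start_session,
    start_listening,
    respond_to_session,
    send_response,
]


def run(ctx: Context) -> Context:
    """Apply each step in order, threading the context through."""
    for step in PIPELINE:
        ctx = step(ctx)
    return ctx


result = run({"utterance": "check order status"})
print(result["reply"])
```

In a live session the last two steps would repeat for every user utterance until a response is sent with End Call set to true.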
Error Handling Strategy
All voicebot actions are configured with the "STOP" fallback mode, meaning:
Errors in any voice action will halt the workflow
This prevents partial voice sessions or confused user experiences
Proper error handling ensures voice interactions remain coherent
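The halting behavior described above can be sketched as a runner that stops at the first failing step and reports how far it got. This is an illustration of the idea only: `WorkflowHalted`, `run_with_stop`, and the stub steps are hypothetical names, not part of the platform.

```python
class WorkflowHalted(Exception):
    """Raised when a step fails under a STOP-style fallback."""


def run_with_stop(steps, ctx):
    """Run steps in order; halt the whole workflow on the first failure."""
    completed = []
    for step in steps:
        try:
            ctx = step(ctx)
        except Exception as exc:
            raise WorkflowHalted(
                f"{step.__name__} failed after {completed}: {exc}"
            ) from exc
        completed.append(step.__name__)
    return ctx


def open_session(ctx):
    return {**ctx, "session": "open"}


def failing_listener(ctx):
    # Simulated failure for the example.
    raise RuntimeError("audio stream unavailable")


def never_reached(ctx):
    return {**ctx, "reached": True}


try:
    run_with_stop([open_session, failing_listener, never_reached], {})
except WorkflowHalted as halted:
    print(halted)
```

Because the runner raises instead of continuing, `never_reached` is not called, which matches the goal of avoiding partial voice sessions.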
Session Management
Session Persistence
: Voice sessions maintain state across multiple interactions
Context Preservation
: Conversation history and user context are maintained
Resource Cleanup
: Sessions are properly closed when interactions complete
Audio Format Specifications
PCM16 (Recommended)
Quality: High-quality uncompressed audio
Compatibility: Widely supported across voice processing systems
Bandwidth: Higher data usage but optimal for voice recognition
Use Case: Professional voice applications requiring accuracy
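The bandwidth note above follows directly from how uncompressed PCM is sized: bytes per second equal the sample rate times bytes per sample times the channel count. The sketch below works through that arithmetic. The 16 kHz mono setting is an assumption for illustration, since this page does not state the sample rate or channel layout the platform uses.

```python
def pcm_bytes_per_second(
    sample_rate_hz: int, bits_per_sample: int = 16, channels: int = 1
) -> int:
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


# Assumed settings: 16 kHz, mono.
pcm16_rate = pcm_bytes_per_second(16_000, bits_per_sample=16)
pcm8_rate = pcm_bytes_per_second(16_000, bits_per_sample=8)

print(pcm16_rate)       # 32000 bytes per second
print(pcm8_rate)        # 16000 bytes per second
print(pcm16_rate * 60)  # 1920000 bytes for one minute of audio
```

Under these assumptions pcm16 carries twice the data of pcm8 at the same sample rate, which is the trade-off behind the "lower bandwidth" label on pcm8.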
Audio Processing Pipeline
Input Processing: User voice → Audio format conversion → Speech recognition
Response Generation: Text processing → Voice synthesis → Audio format conversion
Output Delivery: Formatted audio → User playback