Content Filters

Content Filters act as guardrails for AI conversations, ensuring that both user inputs and AI responses stay within appropriate boundaries. They evaluate content in both directions:

  • Checking what users send to the agent

  • Monitoring the responses the agent generates
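
The two-way check described above can be pictured with a minimal Python sketch. Everything in it is hypothetical and for illustration only: `guarded_reply`, the `Checker` type, the toy checker, and the blocked-message text are assumptions, not part of the Unify product or any published API. The source does not state what a user sees when content is blocked, so the placeholder message is an assumption as well.

```python
from typing import Callable, Optional

# Hypothetical checker type: returns the violated category name, or None if clean.
Checker = Callable[[str], Optional[str]]

# ASSUMPTION: placeholder text; the real blocked-message behaviour is not documented here.
BLOCKED_MESSAGE = "This message was blocked by content filters."


def guarded_reply(
    user_prompt: str,
    agent: Callable[[str], str],
    check_prompt: Checker,
    check_response: Checker,
) -> str:
    """Run the agent with a filter check on each side of the exchange."""
    # Direction 1: inspect what the user sends before the agent sees it.
    if check_prompt(user_prompt) is not None:
        return BLOCKED_MESSAGE

    response = agent(user_prompt)

    # Direction 2: inspect what the agent generated before the user sees it.
    if check_response(response) is not None:
        return BLOCKED_MESSAGE

    return response


# Toy stand-ins so the sketch runs end to end.
def toy_checker(text: str) -> Optional[str]:
    return "Violence" if "destroy" in text.lower() else None


def echo_agent(prompt: str) -> str:
    return f"You said: {prompt}"


print(guarded_reply("Hello there", echo_agent, toy_checker, toy_checker))
```

The point of the design is that the agent is never called when the prompt is blocked, and a generated response is never returned until it has passed its own check.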


There are two components of Content Filters:

  1. Filter Strength for Prompts: This lets you adjust how strictly the filter detects and blocks unwanted content in user prompts. You can set the filter strength separately for each category you want to monitor:

    • Hate Speech: Blocks content that discriminates against or insults individuals or groups.

    • Insults: Identifies offensive or disrespectful language aimed at individuals or groups.

    • Violence: Detects content promoting harm or aggression.

    • Sexual Content: Blocks sexually explicit or suggestive material.

    • Misconduct: Prevents content describing illegal or unethical behaviour.

    • Prompt Attacks: Filters attempts to manipulate the AI system’s safeguards.

  2. Filter Strength for Responses: This works similarly to the prompt filter but applies to the AI agent’s responses. It ensures that AI-generated responses are free from harmful or inappropriate content.

    • Hate Speech: Blocks content that discriminates against or insults individuals or groups.

    • Insults: Identifies offensive or disrespectful language aimed at individuals or groups.

    • Violence: Detects content promoting harm or aggression.

    • Sexual Content: Blocks sexually explicit or suggestive material.

    • Misconduct: Prevents content describing illegal or unethical behaviour.
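
One way to keep the two category lists straight is to model them as data. The sketch below is a hypothetical illustration in Python; the `Category` enum and the `categories_for` helper are assumed names, not part of the product. It encodes the one difference between the lists above: Prompt Attacks appears only on the prompt side.

```python
from enum import Enum


class Category(Enum):
    HATE_SPEECH = "Hate Speech"
    INSULTS = "Insults"
    VIOLENCE = "Violence"
    SEXUAL_CONTENT = "Sexual Content"
    MISCONDUCT = "Misconduct"
    PROMPT_ATTACKS = "Prompt Attacks"


# All six categories are listed under Filter Strength for Prompts.
PROMPT_CATEGORIES = frozenset(Category)

# Prompt Attacks is not listed under Filter Strength for Responses.
RESPONSE_CATEGORIES = PROMPT_CATEGORIES - {Category.PROMPT_ATTACKS}


def categories_for(direction: str) -> frozenset:
    """Return the categories that apply to 'prompt' or 'response' content."""
    if direction == "prompt":
        return PROMPT_CATEGORIES
    if direction == "response":
        return RESPONSE_CATEGORIES
    raise ValueError(f"unknown direction: {direction!r}")


print(sorted(c.value for c in categories_for("response")))
```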

How Do Content Filters Work?

The system uses different levels of filtering strength that you can adjust based on your needs:

  • None: No filtering applied

  • Low: Blocks only the most obvious inappropriate content

  • Medium: Provides balanced protection

  • High: Offers maximum safety with strict filtering
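
To make the four levels concrete, here is a small Python sketch of one plausible interpretation: a classifier assigns each message a score between 0 and 1 per category, and a stricter level blocks at a lower score. This is an assumption for illustration. The source does not describe how the levels are implemented, and the threshold numbers, the `Strength` enum, and `should_block` are all hypothetical.

```python
from enum import IntEnum


class Strength(IntEnum):
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# ASSUMPTION: illustrative thresholds only, not documented product values.
# A stricter level blocks content at a lower classifier score.
MIN_SCORE_TO_BLOCK = {
    Strength.LOW: 0.9,     # only the most obvious content
    Strength.MEDIUM: 0.6,  # balanced
    Strength.HIGH: 0.3,    # strict
}


def should_block(score: float, strength: Strength) -> bool:
    """Decide whether a category score in [0, 1] is blocked at this strength."""
    if strength is Strength.NONE:
        return False  # no filtering applied
    return score >= MIN_SCORE_TO_BLOCK[strength]


# The same borderline score passes at Low but is blocked at Medium and High.
for level in Strength:
    print(level.name, should_block(0.7, level))
```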

For example, here is how content filters detect and flag inappropriate content in user queries:

User Query: "You are very [derogatory remark], you could not even complete a single task on time."

Content Filter Analysis:

  • ⚠️ Insult Detection: Derogatory term

User Query: "I'm so angry at my neighbor, I want to destroy their property!"

Content Filter Analysis:

  • 🚨 Violence: Threat of property damage (HIGH confidence)
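
The two examples above can be reproduced with a toy analyzer. The Python sketch below uses simple keyword rules purely as a stand-in; a real content filter would rely on a trained classifier, and the `Finding` type, `TOY_RULES`, and `analyze` are hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple


class Finding(NamedTuple):
    category: str
    matched: str


# ASSUMPTION: keyword rules are a toy stand-in for a real classifier.
TOY_RULES = (
    ("Insults", "[derogatory remark]"),
    ("Violence", "destroy"),
)


def analyze(text: str) -> List[Finding]:
    """Return one finding per toy rule whose keyword appears in the text."""
    lowered = text.lower()
    return [Finding(category, keyword)
            for category, keyword in TOY_RULES
            if keyword in lowered]


queries = [
    "You are very [derogatory remark], you could not even complete a single task on time.",
    "I'm so angry at my neighbor, I want to destroy their property!",
]
for query in queries:
    print(analyze(query))
```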


How to Configure Content Filters in Your AI Agent

  1. In the Guardrails section of your AI Agent dashboard, click on “Content Filters”.

  2. Under Filter Strength for Prompts, use the sliders to control how strictly the AI filters content in user prompts. You can set the intensity from None to High for each category (Hate, Insults, Violence, Sexual, Misconduct, Prompt Attack).


  3. Similarly, under Filter Strength for Responses, use the sliders to set the filter levels for generated responses, ensuring that the agent's output complies with your ethical and content guidelines.

By adjusting these content filters, you can ensure that your AI agents operate safely and communicate appropriately and respectfully, in line with your brand’s policies and compliance standards.
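
It can help to write down the slider positions you chose so they can be reviewed alongside your content policy. The Python sketch below is a hypothetical way to record and sanity-check such a snapshot. The dictionary layout and the `validate` helper are assumptions for illustration; the product itself is configured through the dashboard sliders described above, and no export format is implied.

```python
from typing import Dict, List

LEVELS = ("None", "Low", "Medium", "High")
PROMPT_SLIDERS = ("Hate", "Insults", "Violence", "Sexual", "Misconduct", "Prompt Attack")
RESPONSE_SLIDERS = PROMPT_SLIDERS[:-1]  # no Prompt Attack slider for responses


def validate(settings: Dict[str, Dict[str, str]]) -> List[str]:
    """Return a list of problems found in a recorded settings snapshot."""
    problems = []
    for section, sliders in (("prompts", PROMPT_SLIDERS), ("responses", RESPONSE_SLIDERS)):
        chosen = settings.get(section, {})
        for name in chosen:
            if name not in sliders:
                problems.append(f"{section}: unknown slider {name!r}")
        for name in sliders:
            level = chosen.get(name, "None")  # an unset slider is treated as None
            if level not in LEVELS:
                problems.append(f"{section}: {name} has invalid level {level!r}")
    return problems


# Hypothetical snapshot of one agent's slider positions.
example_settings = {
    "prompts": {"Hate": "High", "Insults": "Medium", "Violence": "High",
                "Sexual": "High", "Misconduct": "Medium", "Prompt Attack": "High"},
    "responses": {"Hate": "High", "Insults": "Medium", "Violence": "High",
                  "Sexual": "High", "Misconduct": "Medium"},
}

print(validate(example_settings))
```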