Scanning files as soon as they appear in a directory shortens the time a malicious upload can sit undetected, which matters most in environments where files arrive continuously. Rust, combined with ClamAV and Tokio, is well suited to this task. Let's explore how to build a real-time directory monitoring tool that automatically scans new or modified files for viruses.

Why real-time monitoring and Rust?

Real-time monitoring ensures immediate detection and response to potential threats. Rust's performance, safety, and concurrency features make it ideal for security tools. Coupled with ClamAV's powerful virus detection capabilities and Tokio's asynchronous operations, we can create an efficient and responsive monitoring system.

Prerequisites

Ensure you have the following installed:

  • Rust and Cargo (Rust 1.75 or newer recommended for compatibility with the specified dependency versions)
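  • ClamAV daemon (clamd) with current virus definitions. The clamav-client crate talks to clamd over a socket, so the ClamAV development libraries are not needed to build the project.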

Setting up ClamAV daemon (clamd)

For the Rust application to connect, the clamd daemon must be running and configured. Here's a basic setup for Debian/Ubuntu systems:

  1. Install ClamAV:

    sudo apt-get update
    sudo apt-get install clamav-daemon
    
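  2. Update Virus Definitions: It's crucial to have up-to-date virus definitions. On Debian/Ubuntu the clamav-freshclam service usually keeps them current in the background; if the manual command below complains that its log file is locked, stop that service first (sudo systemctl stop clamav-freshclam) and start it again afterwards.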

    sudo freshclam
    
  3. Configure clamd: Edit the configuration file, typically located at /etc/clamav/clamd.conf. Ensure the following lines (or similar, depending on your needs) are present and uncommented. For this example, we'll use a TCP socket.

    # Ensure TCPAddr and TCPSocket are enabled for network listening
    # Example: Listen on TCP socket on localhost port 3310
    TCPSocket 3310
    TCPAddr 127.0.0.1
    
    # If you use TCP, you might want to comment out LocalSocket
    # LocalSocket /var/run/clamav/clamd.ctl
    
    # Ensure the User directive is active and the user exists
    User clamav
    

    If clamd is already running, restart it after making changes (sudo systemctl restart clamav-daemon) so the new settings take effect.

  4. Start and Enable clamd:

    sudo systemctl start clamav-daemon
    sudo systemctl enable clamav-daemon # To start on boot
    

    You can check its status with sudo systemctl status clamav-daemon.
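
Before wiring up the scanner, it can help to confirm that clamd really answers on the TCP socket configured above. The following is a minimal sketch using only the standard library; it assumes clamd's PING command in its null-terminated form (zPING) and the 127.0.0.1:3310 address from this guide. The function name clamd_is_alive is made up for illustration.

```rust
use std::io::{self, Read, Write};
use std::net::TcpStream;
use std::time::Duration;

/// Sends clamd's PING command and reports whether the daemon answered PONG.
/// `addr` is assumed to match the TCPAddr/TCPSocket pair in clamd.conf.
fn clamd_is_alive(addr: &str) -> io::Result<bool> {
    let mut stream = TcpStream::connect(addr)?;
    stream.set_read_timeout(Some(Duration::from_secs(5)))?;
    stream.set_write_timeout(Some(Duration::from_secs(5)))?;

    // The "z" prefix selects the null-terminated command format.
    stream.write_all(b"zPING\0")?;

    // clamd replies in the same format: the four letters PONG plus a NUL byte.
    let mut reply = [0u8; 5];
    stream.read_exact(&mut reply)?;
    Ok(&reply == b"PONG\0")
}
```

Calling clamd_is_alive("127.0.0.1:3310") returns Ok(true) when the daemon is reachable. An Err usually means the service is not running or is listening on a different address.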

Project setup

  1. Create a new Rust project:

    cargo new realtime_virus_scanner
    cd realtime_virus_scanner
    
  2. Add dependencies to Cargo.toml: Open your Cargo.toml file and add the following dependencies. These versions are known to work together and align with the code examples below.

    [package]
    name = "realtime_virus_scanner"
    version = "0.1.0"
    edition = "2021"
    
    [dependencies]
    clamav-client = { version = "2.0.1", features = ["tokio"] }
    tokio = { version = "1.25.0", features = ["full"] }
    notify = "6.1.1"
    clap = { version = "4.4.8", features = ["derive"] }
    

    After saving Cargo.toml, Cargo will fetch these crates when you build or run the project.

Command-line interface with clap

We'll use clap with its derive macro for parsing command-line arguments. This allows users to specify the directory to monitor and the ClamAV daemon's address.

use clap::Parser;

#[derive(Parser, Debug)]
#[command(author, version, about = "Monitors a directory and scans new/modified files for viruses using ClamAV.", long_about = None)]
struct Args {
    /// Directory to monitor
    #[arg(short, long)]
    directory: String,

    /// ClamAV daemon address (e.g., 127.0.0.1:3310)
    #[arg(long, default_value = "127.0.0.1:3310")]
    clamd_addr: String,
}

// The main function will be async due to Tokio and await operations.
// The rest of the main function body will be shown in the "Putting it all together" section.
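
Because the directory argument is a plain String, clap accepts any value, including paths that do not exist. A small check before starting the watcher can give a clearer error message. This is a sketch under our own assumptions: resolve_watch_dir is a hypothetical helper and not part of clap or notify.

```rust
use std::path::{Path, PathBuf};

/// Resolves `raw` to an absolute path and checks that it is a directory.
fn resolve_watch_dir(raw: &str) -> Result<PathBuf, String> {
    // canonicalize fails if the path does not exist or is not accessible.
    let path = Path::new(raw)
        .canonicalize()
        .map_err(|e| format!("cannot access '{}': {}", raw, e))?;
    if path.is_dir() {
        Ok(path)
    } else {
        Err(format!("'{}' is not a directory", raw))
    }
}
```

In main, this could run on args.directory right after Args::parse(), exiting early if it returns an error.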

Real-time directory monitoring with notify

The notify crate will watch for file system events. We'll use tokio::sync::mpsc (multi-producer, single-consumer) channels to send file paths from notify's synchronous callback to our asynchronous scanning logic.

use notify::{Config, RecommendedWatcher, RecursiveMode, Watcher, event::EventKind};
use std::path::Path;
use tokio::sync::mpsc;
// We'll define ClamAVTcpConfig and scan_file_with_clamav later.

async fn watch_directory(
    dir_path: &str,
    clamd_address: &str,
    clamav_config: clamav_client::tokio::Tcp<String>, // Pre-configured ClamAV TCP settings (Tcp is generic over its address type)
) -> Result<(), Box<dyn std::error::Error>> {
    let (tx, mut rx) = mpsc::channel::<String>(100); // Tokio MPSC channel

    let mut watcher = RecommendedWatcher::new(
        move |res: Result<notify::Event, notify::Error>| {
            match res {
                Ok(event) => {
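                    // We are interested in create events, or modify events that are not just metadata changes.
                    // EventKind has no is_metadata() helper, so match on the variant directly.
                    let metadata_only = matches!(
                        event.kind,
                        EventKind::Modify(notify::event::ModifyKind::Metadata(_))
                    );
                    if event.kind.is_create() || (event.kind.is_modify() && !metadata_only) {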
                        for path_buf in event.paths {
                            if path_buf.is_file() { // Ensure it's a file
                                if let Some(p_str) = path_buf.to_str() {
                                    // try_send is non-blocking, suitable for sync notify callback
                                    if let Err(e) = tx.try_send(p_str.to_string()) {
                                        eprintln!("Failed to send path for scanning: {}. Path: {}", e, p_str);
                                    }
                                }
                            }
                        }
                    }
                }
                Err(e) => eprintln!("Watch error: {:?}", e),
            }
        },
        // Optional: the poll interval only matters when notify uses its polling backend.
        Config::default().with_poll_interval(std::time::Duration::from_secs(2)),
    )?;

    watcher.watch(Path::new(dir_path), RecursiveMode::Recursive)?;
    println!("Watcher started on directory: {}", dir_path);

    // Async loop to receive file paths and spawn scanning tasks
    while let Some(file_path_to_scan) = rx.recv().await {
        println!("Change detected, queueing scan for: {}", file_path_to_scan);
        let config_clone = clamav_config.clone(); // Clone config for the spawned task
        tokio::spawn(async move {
            scan_file_with_clamav(config_clone, file_path_to_scan).await;
        });
    }
    Ok(())
}

Asynchronous virus scanning with clamav-client and tokio

When a file event is received, we'll use the clamav-client crate to scan the file. This operation is performed in a separate Tokio task to keep the monitoring responsive.

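use clamav_client::tokio::scan_file as clamav_scan_file;
// Tcp is generic over its address type, so we name the concrete type once.
type ClamAVTcpConfig = clamav_client::tokio::Tcp<String>;
// std::path::Path was imported in the previous snippet.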

async fn scan_file_with_clamav(clamd_config: ClamAVTcpConfig, file_path: String) {
    // Basic check, though notify should ideally only send file paths.
    if !Path::new(&file_path).is_file() {
        // eprintln!("Path is not a file, skipping scan: {}", file_path);
        return;
    }
    println!("Attempting to scan file: {}", file_path);

    match clamav_scan_file(&file_path, clamd_config, None).await { // None for default chunk size
        Ok(response) => {
            // scan_file returns clamd's raw reply as bytes, e.g. "stream: OK" or
            // "stream: <signature> FOUND". clamav_client::clean interprets it.
            match clamav_client::clean(&response) {
                Ok(true) => println!("File clean or no threat detected: {}", file_path),
                Ok(false) => {
                    // Not clean: either a signature matched or clamd reported an error.
                    let reply = String::from_utf8_lossy(&response);
                    println!(
                        "!!! VIRUS DETECTED or scan error in {}: {}",
                        file_path,
                        reply.trim_end_matches('\0').trim()
                    );
                    // Implement further actions here: quarantine, notify admin, etc.
                }
                Err(e) => eprintln!("Could not decode clamd reply for {}: {}", file_path, e),
            }
        }
        Err(e) => {
            eprintln!("Error scanning file {}: {}", file_path, e);
            // Handle specific errors, e.g., connection issues with clamd.
        }
    }
}
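
To act on a detection (for example, to log the signature name), it helps to know what clamd actually sends back. For a streamed scan the reply is a single null-terminated line such as "stream: OK" or "stream: <signature> FOUND". Here is a small sketch of a parser for that line. The ScanVerdict type and parse_clamd_reply function are our own illustrative names, not part of clamav-client, and the parser assumes the "stream: " prefix used for streamed scans.

```rust
/// Outcome of a single clamd scan reply.
#[derive(Debug, PartialEq)]
enum ScanVerdict {
    Clean,
    Infected(String), // signature name reported by clamd
    Error(String),    // clamd reported a problem, or the reply was not understood
}

/// Interprets a raw reply such as b"stream: OK\0".
fn parse_clamd_reply(raw: &[u8]) -> ScanVerdict {
    let text = String::from_utf8_lossy(raw);
    let line = text.trim_end_matches('\0').trim();
    // Drop the "<name>: " prefix ("stream: " for streamed scans), if present.
    let status = line.split_once(": ").map_or(line, |(_, rest)| rest);

    if status == "OK" {
        ScanVerdict::Clean
    } else if let Some(signature) = status.strip_suffix(" FOUND") {
        ScanVerdict::Infected(signature.to_string())
    } else {
        ScanVerdict::Error(status.to_string())
    }
}
```

Treating anything that is neither OK nor FOUND as an error is a deliberately cautious choice: a file that could not be scanned should not be reported as clean.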

Putting it all together: src/main.rs

Here's the complete src/main.rs file incorporating all the pieces:

use clap::Parser;
use notify::{Config, EventKind, RecommendedWatcher, RecursiveMode, Watcher};
use std::path::Path;
use tokio::sync::mpsc;
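use clamav_client::tokio::scan_file as clamav_scan_file;

// Tcp is generic over its address type, so we name the concrete type once.
type ClamAVTcpConfig = clamav_client::tokio::Tcp<String>;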

#[derive(Parser, Debug)]
#[command(author, version, about = "Monitors a directory and scans new/modified files for viruses using ClamAV.", long_about = None)]
struct Args {
    /// Directory to monitor
    #[arg(short, long)]
    directory: String,

    /// ClamAV daemon address (e.g., 127.0.0.1:3310)
    #[arg(long, default_value = "127.0.0.1:3310")]
    clamd_addr: String,
}

async fn scan_file_with_clamav(clamd_config: ClamAVTcpConfig, file_path: String) {
    if !Path::new(&file_path).is_file() {
        return;
    }
    println!("Attempting to scan file: {}", file_path);

    match clamav_scan_file(&file_path, clamd_config, None).await {
        Ok(response) => {
            // scan_file returns clamd's raw reply as bytes; clean() interprets it.
            match clamav_client::clean(&response) {
                Ok(true) => println!("File clean or no threat detected: {}", file_path),
                Ok(false) => {
                    // Not clean: either a signature matched or clamd reported an error.
                    let reply = String::from_utf8_lossy(&response);
                    println!(
                        "!!! VIRUS DETECTED or scan error in {}: {}",
                        file_path,
                        reply.trim_end_matches('\0').trim()
                    );
                }
                Err(e) => eprintln!("Could not decode clamd reply for {}: {}", file_path, e),
            }
        }
        Err(e) => {
            eprintln!("Error scanning file {}: {}", file_path, e);
        }
    }
}

async fn watch_directory(
    dir_path: &str,
    clamd_address: &str,
    clamav_config: ClamAVTcpConfig,
) -> Result<(), Box<dyn std::error::Error>> {
    let (tx, mut rx) = mpsc::channel::<String>(100);

    let mut watcher = RecommendedWatcher::new(
        move |res: Result<notify::Event, notify::Error>| {
            match res {
                Ok(event) => {
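                    // EventKind has no is_metadata() helper, so match on the variant directly.
                    let metadata_only = matches!(
                        event.kind,
                        EventKind::Modify(notify::event::ModifyKind::Metadata(_))
                    );
                    if event.kind.is_create() || (event.kind.is_modify() && !metadata_only) {
                        for path_buf in event.paths {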
                            if path_buf.is_file() {
                                if let Some(p_str) = path_buf.to_str() {
                                    if let Err(e) = tx.try_send(p_str.to_string()) {
                                        eprintln!("Failed to send path for scanning: {}. Path: {}", e, p_str);
                                    }
                                }
                            }
                        }
                    }
                }
                Err(e) => eprintln!("Watch error: {:?}", e),
            }
        },
        Config::default().with_poll_interval(std::time::Duration::from_secs(2)),
    )?;

    watcher.watch(Path::new(dir_path), RecursiveMode::Recursive)?;
    println!("Watcher started on directory: {}. Waiting for file changes...", dir_path);

    while let Some(file_path_to_scan) = rx.recv().await {
        println!("Change detected, queueing scan for: {}", file_path_to_scan);
        let config_clone = clamav_config.clone();
        tokio::spawn(async move {
            scan_file_with_clamav(config_clone, file_path_to_scan).await;
        });
    }

    // This part of the code will only be reached if the channel `tx` is dropped and `rx` becomes empty.
    // In this setup, `tx` is moved into the watcher closure and lives as long as `watcher`.
    // `watcher` itself lives until the end of this `watch_directory` function's scope.
    // For a long-running service, you'd ensure this function or the watcher is kept alive appropriately.
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let args = Args::parse();

    println!("Real-time Virus Scanner starting...");
    println!("Monitoring directory: {}", args.directory);
    println!("ClamAV daemon address: {}", args.clamd_addr);

    let clamd_tcp_config = ClamAVTcpConfig {
        host_address: args.clamd_addr.clone(),
    };

    if let Err(e) = watch_directory(&args.directory, &args.clamd_addr, clamd_tcp_config).await {
        eprintln!("Critical error in watch_directory: {}", e);
        std::process::exit(1);
    }

    Ok(())
}

Error handling and Notifications

The provided code includes basic error handling by printing messages to stdout or stderr. For a production-grade application, consider implementing more robust strategies:

  • Structured Logging: Use a crate like tracing or log for structured logging, which can be configured to output to files or centralized logging systems.
  • Alerting: When a virus is detected or a critical error (e.g., cannot connect to clamd) occurs, send notifications via email, SMS, or a messaging platform like Slack.
  • File Actions: Decide on actions for infected files: quarantine to a safe directory, attempt disinfection (if supported and safe), or delete. These actions require careful permission handling.
  • Retry Mechanisms: For transient errors like network issues when communicating with clamd, implement a retry strategy with backoff.
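
As an illustration of the retry idea, here is a minimal sketch of exponential backoff. It is synchronous and uses only the standard library to stay self-contained; in the async scanner you would await tokio::time::sleep instead of blocking the thread. The helper name retry_with_backoff and its parameters are our own assumptions, not part of any crate used above.

```rust
use std::thread;
use std::time::Duration;

/// Runs `op` until it succeeds or `max_attempts` is reached, doubling the
/// delay after every failure. `op` always runs at least once. Returns the
/// last error if every attempt fails.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: hand the last error back to the caller.
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => {
                thread::sleep(delay); // in async code: tokio::time::sleep(delay).await
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
        }
    }
}
```

A production version would likely retry only on errors that look transient (such as a refused connection) and cap the maximum delay.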

Performance considerations

  • Concurrent Scans: Tokio tasks enable concurrent scanning. However, an uncontrolled number of concurrent scans might overload the clamd daemon or system resources. To manage this, you could use tokio::sync::Semaphore to limit the number of active scanning tasks.
  • ClamAV Daemon Performance: The performance of clamd itself is a factor. Ensure it's adequately resourced and configured (e.g., MaxThreads in clamd.conf).
  • File I/O: Disk I/O can be a bottleneck if many files are created or modified rapidly. Asynchronous operations help, but system I/O capacity is finite.
  • Event Channel Buffer: The mpsc channel between the notify callback and the scanning loop holds up to 100 pending paths in this example. If file events arrive faster than they are consumed, the channel fills and tx.try_send fails. The code currently only prints an error in that case, so the affected file is never scanned. Monitor for this, enlarge the buffer, or add a fallback such as a periodic rescan.
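
One way to reduce pressure on both the channel and clamd is to coalesce bursts of events for the same file, since a single save often triggers several modify events. The following sketch shows one possible approach; Debouncer is a hypothetical helper of our own, not something notify provides in this form, and the window length is an arbitrary choice.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

/// Remembers when each path was last queued and suppresses repeats that
/// arrive within `window`.
struct Debouncer {
    window: Duration,
    last_queued: HashMap<PathBuf, Instant>,
}

impl Debouncer {
    fn new(window: Duration) -> Self {
        Self { window, last_queued: HashMap::new() }
    }

    /// Returns true if `path` should be queued for scanning at time `now`.
    fn should_queue(&mut self, path: &Path, now: Instant) -> bool {
        let recently_queued = self
            .last_queued
            .get(path)
            .map_or(false, |&last| now.duration_since(last) < self.window);
        if recently_queued {
            return false;
        }
        // Note: entries are never removed here; a long-running service
        // would prune old paths periodically.
        self.last_queued.insert(path.to_path_buf(), now);
        true
    }
}
```

In the receiving loop, a call such as should_queue(Path::new(&file_path_to_scan), Instant::now()) could decide whether to spawn a scan task. The trade-off is that a change arriving inside the window is skipped, so the window should stay short, or the last event in a burst should be rescanned after the window expires.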

Conclusion

We've built a foundational real-time directory monitoring tool in Rust that leverages notify for file system events, clap for command-line arguments, tokio for asynchronous operations, and clamav-client for interacting with a ClamAV daemon. This setup scans files shortly after they are created or modified, so potential threats can be detected and handled as they appear. Remember to thoroughly test this tool in your specific environment and expand upon the error handling and notification mechanisms for production use.

Transloadit also utilizes ClamAV in its 🤖 /file/virusscan Robot for robust file filtering.