Database and Querying
Database opens and queries databases. See First Database with Rust for a tutorial.
Opening a Database
Basic Opening
    use matchy::Database;

    // Simple - uses defaults (cache enabled, validation on)
    let db = Database::from("database.mxy").open()?;
The database is memory-mapped and loads in under 1 millisecond regardless of size.
Builder API
The recommended way to open databases uses the fluent builder API:
    use matchy::Database;

    // With custom cache size
    let db = Database::from("database.mxy")
        .cache_capacity(1000)
        .open()?;

    // Performance mode (skip validation, large cache)
    let db = Database::from("threats.mxy")
        .trusted()
        .cache_capacity(100_000)
        .open()?;

    // No cache (for unique queries)
    let db = Database::from("database.mxy")
        .no_cache()
        .open()?;
Builder Methods
Method | Description |
---|---|
.cache_capacity(size) | Set LRU cache size (default: 10,000) |
.no_cache() | Disable caching entirely |
.trusted() | Skip UTF-8 validation (~15-20% faster) |
.open() | Load the database |
Cache Size Guidelines:
- 0 (via .no_cache()): No caching - best for diverse queries
- 100-1000: Good for moderate repetition
- 10,000 (default): Optimal for typical workloads
- 100,000+: For very high repetition (80%+ hit rate)
Note: Caching only benefits pattern lookups with high repetition. IP and literal lookups are already fast and don't benefit from caching.
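To see why repetition matters when picking a cache size, the sketch below replays a query stream against a toy LRU cache and reports the hit rate. It is an illustration only: `simulated_hit_rate` is a hypothetical helper, and the eviction logic is a generic LRU, not matchy's internal cache.

```rust
use std::collections::VecDeque;

/// Replays `queries` against a toy LRU cache holding at most `capacity`
/// keys and returns the fraction of lookups served from the cache.
/// Hypothetical helper for illustration; not part of matchy.
fn simulated_hit_rate(queries: &[&str], capacity: usize) -> f64 {
    if queries.is_empty() {
        return 0.0;
    }
    // Front of the deque is the least recently used key.
    let mut lru: VecDeque<&str> = VecDeque::new();
    let mut hits = 0usize;
    for &q in queries {
        if let Some(pos) = lru.iter().position(|&k| k == q) {
            // Hit: move the key to the most-recently-used end.
            lru.remove(pos);
            lru.push_back(q);
            hits += 1;
        } else if capacity > 0 {
            // Miss: evict the least recently used key if the cache is full.
            if lru.len() == capacity {
                lru.pop_front();
            }
            lru.push_back(q);
        }
    }
    hits as f64 / queries.len() as f64
}

fn main() {
    let repetitive = ["a.com", "b.com", "a.com", "b.com", "a.com", "b.com"];
    let diverse = ["a.com", "b.com", "c.com", "d.com", "e.com", "f.com"];
    println!("repetitive: {:.2}", simulated_hit_rate(&repetitive, 2));
    println!("diverse:    {:.2}", simulated_hit_rate(&diverse, 2));
}
```

A stream of unique queries never hits the cache regardless of capacity, which is the case where .no_cache() avoids the bookkeeping cost.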
Error Handling
    match Database::open("database.mxy") {
        Ok(db) => { /* success */ }
        Err(MatchyError::FileNotFound { path }) => {
            eprintln!("Database not found: {}", path);
        }
        Err(MatchyError::InvalidFormat { reason }) => {
            eprintln!("Invalid database format: {}", reason);
        }
        Err(e) => eprintln!("Error: {}", e),
    }
Querying
Method Signature
    pub fn lookup<S: AsRef<str>>(&self, query: S) -> Result<Option<QueryResult>, MatchyError>
Basic Usage
    match db.lookup("192.0.2.1")? {
        Some(result) => println!("Found: {:?}", result),
        None => println!("Not found"),
    }
QueryResult Types
QueryResult is an enum with three variants:
IP Match
    QueryResult::Ip {
        data: Option<HashMap<String, DataValue>>,
        prefix_len: u8,
    }
Example:
    match db.lookup("192.0.2.1")? {
        Some(QueryResult::Ip { data, prefix_len }) => {
            println!("Matched IP with prefix /{}", prefix_len);
            if let Some(d) = data {
                println!("Data: {:?}", d);
            }
        }
        _ => {}
    }
Pattern Match
    QueryResult::Pattern {
        pattern_ids: Vec<u32>,
        data: Vec<Option<HashMap<String, DataValue>>>,
    }
Example:
    match db.lookup("mail.google.com")? {
        Some(QueryResult::Pattern { pattern_ids, data }) => {
            println!("Matched {} pattern(s)", pattern_ids.len());
            for (i, pattern_data) in data.iter().enumerate() {
                println!("Pattern {}: {:?}", pattern_ids[i], pattern_data);
            }
        }
        _ => {}
    }
Note: A query can match multiple patterns. All matching patterns are returned.
Exact String Match
    QueryResult::ExactString {
        data: Option<HashMap<String, DataValue>>,
    }
Example:
    match db.lookup("example.com")? {
        Some(QueryResult::ExactString { data }) => {
            println!("Exact match: {:?}", data);
        }
        _ => {}
    }
Complete Example
    use matchy::{Database, QueryResult};

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        let db = Database::open("database.mxy")?;

        // Query different types
        let queries = vec![
            "192.0.2.1",        // IP
            "10.5.5.5",         // CIDR
            "test.example.com", // Pattern
            "example.com",      // Exact string
        ];

        for query in queries {
            match db.lookup(query)? {
                Some(QueryResult::Ip { prefix_len, .. }) => {
                    println!("{}: IP match (/{prefix_len})", query);
                }
                Some(QueryResult::Pattern { pattern_ids, .. }) => {
                    println!("{}: Pattern match ({} patterns)", query, pattern_ids.len());
                }
                Some(QueryResult::ExactString { .. }) => {
                    println!("{}: Exact match", query);
                }
                None => {
                    println!("{}: No match", query);
                }
            }
        }

        Ok(())
    }
Thread Safety
Database is Send + Sync and can be safely shared across threads:
    use std::sync::Arc;
    use std::thread;

    let db = Arc::new(Database::open("database.mxy")?);

    let handles: Vec<_> = (0..4).map(|i| {
        let db = Arc::clone(&db);
        thread::spawn(move || {
            db.lookup(&format!("192.0.2.{}", i))
        })
    }).collect();

    for handle in handles {
        handle.join().unwrap()?;
    }
Performance
Query performance by entry type:
- IP addresses: ~7 million queries/second (138ns avg)
- Exact strings: ~8 million queries/second (112ns avg)
- Patterns: ~1-2 million queries/second (500ns-1μs avg)
See Performance Considerations for details.
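The throughput and latency figures above are two views of the same measurement. As a rough check, assuming a single thread issuing queries back to back, throughput is one second divided by the average latency. The helper name `queries_per_second` is hypothetical.

```rust
/// Converts an average per-query latency in nanoseconds into
/// single-threaded queries per second (1 s = 1e9 ns).
fn queries_per_second(avg_latency_ns: f64) -> f64 {
    1e9 / avg_latency_ns
}

fn main() {
    let figures = [
        ("IP", 138.0),
        ("Exact string", 112.0),
        ("Pattern (fast end)", 500.0),
        ("Pattern (slow end)", 1000.0),
    ];
    for (label, ns) in figures {
        println!("{label}: {:.1} M queries/s", queries_per_second(ns) / 1e6);
    }
}
```

By this arithmetic 138 ns corresponds to roughly 7.2 million queries/second and 112 ns to roughly 8.9 million, so the listed throughput numbers should be read as rounded approximations.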
Database Statistics
Get Statistics
Retrieve comprehensive statistics about database usage:
    use matchy::Database;

    let db = Database::from("threats.mxy").open()?;

    // Do some queries
    db.lookup("1.2.3.4")?;
    db.lookup("example.com")?;
    db.lookup("test.com")?;

    // Get stats
    let stats = db.stats();
    println!("Total queries: {}", stats.total_queries);
    println!("Queries with match: {}", stats.queries_with_match);
    println!("Cache hit rate: {:.1}%", stats.cache_hit_rate() * 100.0);
    println!("Match rate: {:.1}%", stats.match_rate() * 100.0);
    println!("IP queries: {}", stats.ip_queries);
    println!("String queries: {}", stats.string_queries);
DatabaseStats Structure
    pub struct DatabaseStats {
        pub total_queries: u64,
        pub queries_with_match: u64,
        pub queries_without_match: u64,
        pub cache_hits: u64,
        pub cache_misses: u64,
        pub ip_queries: u64,
        pub string_queries: u64,
    }

    impl DatabaseStats {
        pub fn cache_hit_rate(&self) -> f64
        pub fn match_rate(&self) -> f64
    }
Helper Methods:
- cache_hit_rate() - Returns cache hit rate as a value from 0.0 to 1.0
- match_rate() - Returns query match rate as a value from 0.0 to 1.0
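The two rates can be understood as simple ratios of the counters. The sketch below uses a local stand-in struct, `Counters`, rather than the real DatabaseStats. The formulas (hits over hits plus misses, matches over total queries) and the choice to return 0.0 when nothing has been counted are assumptions, not confirmed matchy behavior.

```rust
/// Local stand-in mirroring a subset of the counters; not the matchy type.
struct Counters {
    total_queries: u64,
    queries_with_match: u64,
    cache_hits: u64,
    cache_misses: u64,
}

impl Counters {
    /// Assumed formula: hits / (hits + misses); 0.0 when nothing was counted.
    fn cache_hit_rate(&self) -> f64 {
        let lookups = self.cache_hits + self.cache_misses;
        if lookups == 0 {
            0.0
        } else {
            self.cache_hits as f64 / lookups as f64
        }
    }

    /// Assumed formula: matched queries / total queries; 0.0 when empty.
    fn match_rate(&self) -> f64 {
        if self.total_queries == 0 {
            0.0
        } else {
            self.queries_with_match as f64 / self.total_queries as f64
        }
    }
}

fn main() {
    let c = Counters {
        total_queries: 200,
        queries_with_match: 50,
        cache_hits: 150,
        cache_misses: 50,
    };
    println!("Cache hit rate: {:.1}%", c.cache_hit_rate() * 100.0);
    println!("Match rate: {:.1}%", c.match_rate() * 100.0);
}
```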
Interpreting Statistics
Cache Performance:
- Hit rate < 50%: Consider disabling cache (.no_cache())
- Hit rate 50-80%: Cache is helping moderately
- Hit rate > 80%: Cache is very effective
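These thresholds can be turned into a small monitoring check. The sketch below maps a hit rate in the 0.0 to 1.0 range to one of the three bands; `CacheAdvice` and `advise` are hypothetical names, and treating exactly 80% as the middle band is an arbitrary choice.

```rust
/// The three bands from the guidance above (hypothetical type).
#[derive(Debug, PartialEq)]
enum CacheAdvice {
    ConsiderDisabling,
    HelpingModerately,
    VeryEffective,
}

/// Maps a hit rate in 0.0..=1.0 to a band.
fn advise(hit_rate: f64) -> CacheAdvice {
    if hit_rate < 0.5 {
        CacheAdvice::ConsiderDisabling
    } else if hit_rate <= 0.8 {
        CacheAdvice::HelpingModerately
    } else {
        CacheAdvice::VeryEffective
    }
}

fn main() {
    for rate in [0.10, 0.65, 0.93] {
        println!("{:.0}% -> {:?}", rate * 100.0, advise(rate));
    }
}
```

In practice the input would be the value returned by stats.cache_hit_rate() after a representative number of queries.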
Query Distribution:
- High ip_queries: Database is being used for IP lookups
- High string_queries: Database is being used for domain/pattern matching
Cache Management
Clear Cache
Remove all cached query results:
    use matchy::Database;

    let db = Database::from("threats.mxy").open()?;

    // Do some queries (fills cache)
    db.lookup("example.com")?;

    // Clear cache to force fresh lookups
    db.clear_cache();
Useful for benchmarking or when you need to ensure fresh lookups without reopening the database.
Helper Methods
Checking Entry Types
    if let Some(QueryResult::Ip { .. }) = result {
        // Handle IP match
    }
Or using match guards:
    match db.lookup(query)? {
        Some(QueryResult::Ip { prefix_len, .. }) if prefix_len == 32 => {
            println!("Exact IP match");
        }
        Some(QueryResult::Ip { prefix_len, .. }) => {
            println!("CIDR match /{}", prefix_len);
        }
        _ => {}
    }
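The prefix_len == 32 guard identifies a host match only for IPv4 queries. A minimal sketch of an address-family-aware check follows, assuming prefix lengths for IPv6 entries are reported on the usual 0 to 128 scale (not confirmed here). `is_host_match` is a hypothetical helper built on the standard library only.

```rust
use std::net::IpAddr;

/// Returns true when `prefix_len` covers every bit of `addr`,
/// i.e. /32 for IPv4 and /128 for IPv6. Hypothetical helper.
fn is_host_match(addr: &IpAddr, prefix_len: u8) -> bool {
    match addr {
        IpAddr::V4(_) => prefix_len == 32,
        IpAddr::V6(_) => prefix_len == 128,
    }
}

fn main() {
    let v4: IpAddr = "192.0.2.1".parse().expect("valid IPv4 literal");
    let v6: IpAddr = "2001:db8::1".parse().expect("valid IPv6 literal");
    println!("v4 /32  host match: {}", is_host_match(&v4, 32));
    println!("v4 /24  host match: {}", is_host_match(&v4, 24));
    println!("v6 /128 host match: {}", is_host_match(&v6, 128));
    println!("v6 /32  host match: {}", is_host_match(&v6, 32));
}
```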
Database Lifecycle
Databases are immutable once opened:
    let db = Database::open("database.mxy")?;

    // db.lookup(...) - OK
    // db.add_entry(...) - No such method!
To update a database:
- Build a new database with DatabaseBuilder
- Write to a temporary file
- Atomically replace the old database
    // Build new database
    let db_bytes = builder.build()?;
    std::fs::write("database.mxy.tmp", &db_bytes)?;
    std::fs::rename("database.mxy.tmp", "database.mxy")?;

    // Reopen
    let db = Database::open("database.mxy")?;
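The write-then-rename step can be wrapped in a reusable helper. The sketch below uses only the standard library; `replace_atomically` is a hypothetical name, and the extra sync_all call is an added precaution rather than something the steps above require. The rename is only atomic when the temporary file lives on the same filesystem as the target, which is why the temporary name is derived from the target path.

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::{Path, PathBuf};

/// Writes `bytes` to "<target>.tmp" and renames it over `target`.
/// Hypothetical helper; readers see either the old or the new file.
fn replace_atomically(target: &Path, bytes: &[u8]) -> std::io::Result<()> {
    // Build the temporary path next to the target (same filesystem).
    let mut tmp_name = target.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp = PathBuf::from(tmp_name);

    let mut file = File::create(&tmp)?;
    file.write_all(bytes)?;
    // Flush to disk before the rename makes the new file visible.
    file.sync_all()?;
    drop(file);

    fs::rename(&tmp, target)
}

fn main() -> std::io::Result<()> {
    // Placeholder bytes stand in for the output of builder.build().
    let target = std::env::temp_dir().join("matchy_example.mxy");
    replace_atomically(&target, b"old contents")?;
    replace_atomically(&target, b"new contents")?;
    println!("{} bytes on disk", fs::read(&target)?.len());
    fs::remove_file(&target)
}
```

After the rename, the database still has to be reopened as shown above for the process to see the new contents.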
See Also
- DatabaseBuilder - Building databases
- Data Types Reference - Data value types
- Performance Considerations - Optimization