hbutils.encoding.int_hash_val
This module provides comprehensive validation utilities for hash functions, including tests for:
Determinism: Ensures consistent output for the same input
Type consistency: Validates consistent hashing across different input types
Avalanche effect: Measures how small input changes affect output
Uniform distribution: Analyzes hash value distribution patterns
Collision resistance: Tests for hash collisions
Empty input handling: Validates behavior with empty inputs
Performance characteristics: Measures hashing speed and throughput
The module is designed to validate integer-based hash functions that accept string, bytes, or bytearray inputs and return integer hash values.
- Example::
    >>> from hbutils.encoding import int_hash_val_comprehensive
    >>>
    >>> print(int_hash_val_comprehensive('xs'))  # validate existing hash functions
    ╔══════════════════════════════════════════════════════════════════════════════════════════════╗
    ║ COMPREHENSIVE HASH FUNCTION VALIDATION REPORT ║
    ╠══════════════════════════════════════════════════════════════════════════════════════════════╣
    ║ Function Name: xs │ Overall Status: PASS ║
    ║ Properties Tested: 7 │ Properties Passed: 7 (100.0%) ║
    ╠══════════════════════════════════════════════════════════════════════════════════════════════╣
    ║ PROPERTY STATUS ║
    ╠══════════════════════════════════════════════════════════════════════════════════════════════╣
    ║ ✓ Determinism │ PASS ║
    ║ ✓ Type Consistency │ PASS ║
    ║ ✓ Avalanche Effect │ PASS | Avalanche Effect: 42.9% ║
    ║ ✓ Uniform Distribution │ PASS | Uniformity Score: 0.996 ║
    ║ ✓ Collision Resistance │ PASS | Collision Rate: 0.0000 ║
    ║ ✓ Empty Input │ PASS ║
    ║ ✓ Performance │ PASS | Avg Throughput: 4.6 MB/s ║
    ╠══════════════════════════════════════════════════════════════════════════════════════════════╣
    ║ RECOMMENDATIONS ║
    ╠══════════════════════════════════════════════════════════════════════════════════════════════╣
    ║ ✓ Hash function meets all validation criteria - suitable for production use ║
    ╚══════════════════════════════════════════════════════════════════════════════════════════════╝
    DETAILED ANALYSIS:
    • All validation tests passed successfully
    • Hash function demonstrates good cryptographic properties
    • Suitable for general-purpose hashing applications
    >>>
    >>> def basic_good_hash(data) -> int:
    ...     # Convert all input types to bytes
    ...     if isinstance(data, str):
    ...         data = data.encode('utf-8')
    ...     elif isinstance(data, bytearray):
    ...         data = bytes(data)
    ...     hash_val = 0x811c9dc5  # FNV offset basis (32-bit)
    ...     for byte in data:
    ...         # Simple polynomial hash variant
    ...         hash_val = ((hash_val * 33) ^ byte) & 0xffffffff
    ...         # Add some bit mixing
    ...         hash_val ^= hash_val >> 16
    ...         hash_val = (hash_val * 0x85ebca6b) & 0xffffffff
    ...         hash_val ^= hash_val >> 13
    ...     return hash_val & 0xffffffff
    ...
    >>> print(int_hash_val_comprehensive(basic_good_hash))
    ╔══════════════════════════════════════════════════════════════════════════════════════════════╗
    ║ COMPREHENSIVE HASH FUNCTION VALIDATION REPORT ║
    ╠══════════════════════════════════════════════════════════════════════════════════════════════╣
    ║ Function Name: basic_good_hash │ Overall Status: PASS ║
    ║ Properties Tested: 7 │ Properties Passed: 7 (100.0%) ║
    ╠══════════════════════════════════════════════════════════════════════════════════════════════╣
    ║ PROPERTY STATUS ║
    ╠══════════════════════════════════════════════════════════════════════════════════════════════╣
    ║ ✓ Determinism │ PASS ║
    ║ ✓ Type Consistency │ PASS ║
    ║ ✓ Avalanche Effect │ PASS | Avalanche Effect: 50.5% ║
    ║ ✓ Uniform Distribution │ PASS | Uniformity Score: 0.996 ║
    ║ ✓ Collision Resistance │ PASS | Collision Rate: 0.0000 ║
    ║ ✓ Empty Input │ PASS ║
    ║ ✓ Performance │ PASS | Avg Throughput: 2.5 MB/s ║
    ╠══════════════════════════════════════════════════════════════════════════════════════════════╣
    ║ RECOMMENDATIONS ║
    ╠══════════════════════════════════════════════════════════════════════════════════════════════╣
    ║ ✓ Hash function meets all validation criteria - suitable for production use ║
    ╚══════════════════════════════════════════════════════════════════════════════════════════════╝
    DETAILED ANALYSIS:
    • All validation tests passed successfully
    • Hash function demonstrates good cryptographic properties
    • Suitable for general-purpose hashing applications
int_hash_val_determinism
- hbutils.encoding.int_hash_val.int_hash_val_determinism(hash_func: str | Callable[[str | bytes | bytearray], int], test_data: List[str | bytes | bytearray]) → DeterminismValidationResult
Validate determinism: same input produces same output.
Tests whether the hash function consistently produces the same output for identical inputs across multiple invocations.
- Parameters:
hash_func (_HashFuncTyping) – The hash function to validate, given either as a callable or as the string name of an existing hash function. A callable should accept str, bytes, or bytearray and return an integer hash value.
test_data (List[Union[str, bytes, bytearray]]) – List of test inputs to validate determinism
- Returns:
Determinism validation results
- Return type:
DeterminismValidationResult
- Example::
    >>> from hbutils.encoding.int_hash_val import int_hash_val_determinism
    >>> def simple_hash(data):
    ...     return hash(data) & 0xFFFFFFFF
    >>> result = int_hash_val_determinism(simple_hash, ["test", b"data"])
    >>> result.passed
    True
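The idea behind this check can be sketched without the library. The following is a minimal illustration under stated assumptions, not the hbutils implementation: `find_nondeterministic`, `unstable_hash`, and the `repeats` parameter are hypothetical names, and `zlib.crc32` stands in for a real hash function.

```python
import itertools
import zlib


def find_nondeterministic(hash_func, test_data, repeats=3):
    """Return the repr of every input whose hash differs between repeated calls."""
    failed = []
    for item in test_data:
        # Hash the same input several times and collect the distinct outputs.
        outputs = {hash_func(item) for _ in range(repeats)}
        if len(outputs) > 1:
            failed.append(repr(item))
    return failed


_counter = itertools.count()


def unstable_hash(data):
    """Deliberately broken hash (hypothetical): the result depends on call order."""
    return zlib.crc32(data) + next(_counter)


print(find_nondeterministic(zlib.crc32, [b"test", b"data"]))  # []
print(find_nondeterministic(unstable_hash, [b"test"]))        # ["b'test'"]
```

Note that Python's built-in `hash()` for str and bytes is salted per process unless `PYTHONHASHSEED` is set, so a function built on it is repeatable within one run but not across runs.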
int_hash_val_type_consistency
- hbutils.encoding.int_hash_val.int_hash_val_type_consistency(hash_func: str | Callable[[str | bytes | bytearray], int]) → TypeConsistencyValidationResult
Validate type consistency: same content in different types should produce same hash.
Tests whether the hash function produces identical hash values for the same content when provided as string, bytes, or bytearray types.
- Parameters:
hash_func (_HashFuncTyping) – The hash function to validate, given either as a callable or as the string name of an existing hash function. A callable should accept str, bytes, or bytearray and return an integer hash value.
- Returns:
Type consistency validation results
- Return type:
TypeConsistencyValidationResult
- Example::
    >>> from hbutils.encoding.int_hash_val import int_hash_val_type_consistency
    >>> def simple_hash(data):
    ...     if isinstance(data, str):
    ...         data = data.encode('utf-8')
    ...     return hash(bytes(data)) & 0xFFFFFFFF
    >>> result = int_hash_val_type_consistency(simple_hash)
    >>> result.passed
    True
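As an illustration of what type consistency means in practice, the sketch below hashes the same content as str, bytes, and bytearray and compares the results. It is not the library's code: `normalized_hash`, `type_sensitive_hash`, and `is_type_consistent` are hypothetical helpers, UTF-8 is assumed as the string encoding, and `zlib.crc32` is only a stand-in hash.

```python
import zlib


def normalized_hash(data) -> int:
    """Stand-in hash that converts every supported input type to bytes first."""
    if isinstance(data, str):
        data = data.encode("utf-8")
    return zlib.crc32(bytes(data))


def type_sensitive_hash(data) -> int:
    """Deliberately inconsistent hash: the result depends only on the input type."""
    return len(type(data).__name__)


def is_type_consistent(hash_func, text: str) -> bool:
    """True if str, bytes, and bytearray forms of the same content hash identically."""
    raw = text.encode("utf-8")
    values = {hash_func(text), hash_func(raw), hash_func(bytearray(raw))}
    return len(values) == 1


print(is_type_consistent(normalized_hash, "hello"))      # True
print(is_type_consistent(type_sensitive_hash, "hello"))  # False
```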
int_hash_val_avalanche_effect
- hbutils.encoding.int_hash_val.int_hash_val_avalanche_effect(hash_func: str | Callable[[str | bytes | bytearray], int], sample_size: int = 100) → AvalancheEffectValidationResult
Validate avalanche effect: small input changes cause significant output changes.
Tests the avalanche effect property where a small change in input (single bit/character) should result in approximately 50% of the output bits changing. This is a key property of good hash functions.
- Parameters:
hash_func (_HashFuncTyping) – The hash function to validate, given either as a callable or as the string name of an existing hash function. A callable should accept str, bytes, or bytearray and return an integer hash value.
sample_size (int) – Number of random samples to test, defaults to 100
- Returns:
Avalanche effect validation results
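- Return type:
AvalancheEffectValidationResult
- Example::
    >>> from hbutils.encoding.int_hash_val import int_hash_val_avalanche_effect
    >>> def simple_hash(data):
    ...     return hash(data) & 0xFFFFFFFF
    >>> result = int_hash_val_avalanche_effect(simple_hash, sample_size=50)
    >>> result.change_percentage > 40.0
    True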
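The avalanche measurement described above can be sketched as follows. This is an independent illustration, not the library's algorithm: it assumes a 32-bit output, flips exactly one input bit per sample, and uses the first 32 bits of SHA-256 (`sha256_32`, a hypothetical helper) as the hash under test. The function name `avalanche_percentage` and the `seed` parameter are also assumptions.

```python
import hashlib
import random


def sha256_32(data: bytes) -> int:
    """Stand-in hash: the first 32 bits of SHA-256."""
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def avalanche_percentage(hash_func, sample_size=100, total_bits=32, seed=0):
    """Flip one input bit per sample; return the mean share of output bits that change."""
    rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
    changed = 0
    for _ in range(sample_size):
        data = bytearray(rng.getrandbits(8) for _ in range(16))
        before = hash_func(bytes(data))
        position = rng.randrange(len(data) * 8)
        data[position // 8] ^= 1 << (position % 8)  # single-bit input change
        after = hash_func(bytes(data))
        changed += bin(before ^ after).count("1")   # Hamming distance of outputs
    avg_bit_changes = changed / sample_size
    return avg_bit_changes / total_bits * 100


print(f"{avalanche_percentage(sha256_32):.1f}%")
```

A value near 50% indicates strong diffusion; a constant hash yields 0%.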
int_hash_val_uniform_distribution
- hbutils.encoding.int_hash_val.int_hash_val_uniform_distribution(hash_func: str | Callable[[str | bytes | bytearray], int], sample_size: int = 10000) → UniformDistributionValidationResult
Validate uniform distribution of hash outputs.
Tests whether the hash function produces uniformly distributed output values across the hash space. Divides the hash space into buckets and checks if hash values are evenly distributed.
- Parameters:
hash_func (_HashFuncTyping) – The hash function to validate, given either as a callable or as the string name of an existing hash function. A callable should accept str, bytes, or bytearray and return an integer hash value.
sample_size (int) – Number of random samples to generate and hash, defaults to 10000
- Returns:
Uniform distribution validation results
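- Return type:
UniformDistributionValidationResult
- Example::
    >>> from hbutils.encoding.int_hash_val import int_hash_val_uniform_distribution
    >>> def simple_hash(data):
    ...     return hash(data) & 0xFFFFFFFF
    >>> result = int_hash_val_uniform_distribution(simple_hash, sample_size=1000)
    >>> result.uniformity_score > 0.95
    True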
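The bucket-based approach described above can be illustrated with a short sketch. It assumes a 32-bit hash space split into 16 equal buckets and summarizes the spread with a chi-square statistic; how the library actually computes `uniformity_score` is not stated here, so the statistic, the bucket count, and the helper names (`sha256_32`, `bucket_counts`, `chi_square`) are all assumptions.

```python
import hashlib


def sha256_32(data: bytes) -> int:
    """Stand-in hash: the first 32 bits of SHA-256."""
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def bucket_counts(hash_func, sample_size=10000, num_buckets=16, bits=32):
    """Hash generated inputs and count how many values land in each bucket."""
    space = 1 << bits
    counts = [0] * num_buckets
    for i in range(sample_size):
        value = hash_func(f"sample-{i}".encode("utf-8")) & (space - 1)
        counts[value * num_buckets // space] += 1  # bucket index by value range
    return counts


def chi_square(counts):
    """Chi-square statistic against a perfectly even distribution (lower is better)."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


counts = bucket_counts(sha256_32)
print(counts)
print(round(chi_square(counts), 2))
```

With 16 buckets, a well-distributed hash gives a statistic in the low tens, while a constant hash puts every sample in one bucket and yields a very large value.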
int_hash_val_collision_resistance
- hbutils.encoding.int_hash_val.int_hash_val_collision_resistance(hash_func: str | Callable[[str | bytes | bytearray], int], sample_size: int = 100000) → CollisionResistanceValidationResult
Validate collision resistance.
Tests the hash function’s resistance to collisions by generating many random inputs and checking for duplicate hash values. A good hash function should have a very low collision rate.
- Parameters:
hash_func (_HashFuncTyping) – The hash function to validate, given either as a callable or as the string name of an existing hash function. A callable should accept str, bytes, or bytearray and return an integer hash value.
sample_size (int) – Number of random samples to test, defaults to 100000
- Returns:
Collision resistance validation results
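- Return type:
CollisionResistanceValidationResult
- Example::
    >>> from hbutils.encoding.int_hash_val import int_hash_val_collision_resistance
    >>> def simple_hash(data):
    ...     return hash(data) & 0xFFFFFFFF
    >>> result = int_hash_val_collision_resistance(simple_hash, sample_size=10000)
    >>> result.collision_rate < 0.001
    True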
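A collision count of this kind can be sketched as below. The sketch hashes distinct generated inputs and counts every hash value that was already seen, then divides by the sample size as in the `collision_rate` formula documented for `CollisionResistanceValidationResult`. How the library generates inputs and counts collisions is an assumption; `collision_rate` as a function name and the `input-N` inputs are hypothetical, and `zlib.crc32` is a stand-in hash.

```python
import zlib


def collision_rate(hash_func, sample_size=10000):
    """Hash distinct inputs and return collisions / sample_size."""
    seen = set()
    collisions = 0
    for i in range(sample_size):
        value = hash_func(f"input-{i}".encode("utf-8"))
        if value in seen:
            collisions += 1  # this value was produced by an earlier, different input
        else:
            seen.add(value)
    return collisions / sample_size


print(collision_rate(zlib.crc32))
print(collision_rate(lambda data: 0))  # 0.9999: every input after the first collides
```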
int_hash_val_empty_input
- hbutils.encoding.int_hash_val.int_hash_val_empty_input(hash_func: str | Callable[[str | bytes | bytearray], int]) → EmptyInputValidationResult
Validate empty input handling.
Tests whether the hash function correctly handles empty inputs of different types (empty string, empty bytes, empty bytearray) and produces consistent results.
- Parameters:
hash_func (_HashFuncTyping) – The hash function to validate, given either as a callable or as the string name of an existing hash function. A callable should accept str, bytes, or bytearray and return an integer hash value.
- Returns:
Empty input validation results
- Return type:
EmptyInputValidationResult
- Example::
    >>> from hbutils.encoding.int_hash_val import int_hash_val_empty_input
    >>> def simple_hash(data):
    ...     if isinstance(data, str):
    ...         data = data.encode('utf-8')
    ...     return hash(bytes(data)) & 0xFFFFFFFF
    >>> result = int_hash_val_empty_input(simple_hash)
    >>> result.consistent_empty_hash
    True
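To illustrate the kind of check involved, the sketch below feeds the three empty inputs to a hash function, records any exception as an (input_type, error_message) pair in the style of the `error_cases` field, and reports whether all results agree. It is a standalone approximation: `check_empty_inputs` and `normalized_hash` are hypothetical helpers and `zlib.crc32` is a stand-in hash.

```python
import zlib


def normalized_hash(data) -> int:
    """Stand-in hash that accepts str, bytes, and bytearray."""
    if isinstance(data, str):
        data = data.encode("utf-8")
    return zlib.crc32(bytes(data))


def check_empty_inputs(hash_func):
    """Return (hash_results, error_cases, consistent) for '', b'' and bytearray()."""
    results, errors = [], []
    for empty in ("", b"", bytearray()):
        try:
            results.append(hash_func(empty))
        except Exception as exc:  # record the failure instead of aborting the check
            errors.append((type(empty).__name__, str(exc)))
    consistent = not errors and len(set(results)) == 1
    return results, errors, consistent


print(check_empty_inputs(normalized_hash))  # ([0, 0, 0], [], True)

# zlib.crc32 rejects str, so the empty string is reported as an error case.
results, errors, consistent = check_empty_inputs(zlib.crc32)
print(errors[0][0], consistent)             # str False
```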
int_hash_val_performance
- hbutils.encoding.int_hash_val.int_hash_val_performance(hash_func: str | Callable[[str | bytes | bytearray], int], data_sizes: List[int] | None = None) → PerformanceValidationResult
Validate performance characteristics.
Measures the hash function’s performance across different input sizes, calculating average hashing time and throughput in MB/s.
- Parameters:
hash_func (_HashFuncTyping) – The hash function to validate, given either as a callable or as the string name of an existing hash function. A callable should accept str, bytes, or bytearray and return an integer hash value.
data_sizes (List[int], optional) – List of data sizes (in bytes) to test, defaults to [100, 1000, 10000, 100000]
- Returns:
Performance validation results
- Return type:
PerformanceValidationResult
- Example::
    >>> from hbutils.encoding.int_hash_val import int_hash_val_performance
    >>> def simple_hash(data):
    ...     return hash(data) & 0xFFFFFFFF
    >>> result = int_hash_val_performance(simple_hash, data_sizes=[100, 1000])
    >>> result.passed
    True
    >>> 100 in result.performance_data
    True
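A timing loop of the kind described above might look like the following. This is only a sketch: the metric keys `avg_time` and `throughput_mbps`, the `iterations` parameter, the zero-filled payloads, and the use of 1024 * 1024 bytes per MB are assumptions rather than the library's documented behavior, and `zlib.crc32` is a stand-in hash.

```python
import time
import zlib


def measure_throughput(hash_func, data_sizes=(100, 1000, 10000), iterations=50):
    """Time hash_func on payloads of each size and derive throughput in MB/s."""
    report = {}
    for size in data_sizes:
        payload = bytes(size)  # zero-filled payload of the requested length
        start = time.perf_counter()
        for _ in range(iterations):
            hash_func(payload)
        elapsed = time.perf_counter() - start
        avg_time = elapsed / iterations
        # Guard against a zero reading on very coarse clocks.
        throughput = size / avg_time / (1024 * 1024) if avg_time > 0 else float("inf")
        report[size] = {"avg_time": avg_time, "throughput_mbps": throughput}
    return report


for size, metrics in measure_throughput(zlib.crc32).items():
    print(size, f"{metrics['throughput_mbps']:.1f} MB/s")
```

Absolute numbers depend on the machine and interpreter, so results are best compared between hash functions measured in the same run.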
int_hash_val_comprehensive
- hbutils.encoding.int_hash_val.int_hash_val_comprehensive(hash_func: str | Callable[[str | bytes | bytearray], int]) → ComprehensiveValidationResult
Comprehensive validation of hash function properties.
Runs a complete suite of validation tests on the hash function, including: determinism, type consistency, avalanche effect, uniform distribution, collision resistance, empty input handling, and performance characteristics.
- Parameters:
hash_func (_HashFuncTyping) – The hash function to validate, given either as a callable or as the string name of an existing hash function. A callable should accept str, bytes, or bytearray and return an integer hash value.
- Returns:
Comprehensive validation results
- Return type:
ComprehensiveValidationResult
- Example::
    >>> from hbutils.encoding.int_hash_val import int_hash_val_comprehensive
    >>> def simple_hash(data):
    ...     if isinstance(data, str):
    ...         data = data.encode('utf-8')
    ...     return hash(bytes(data)) & 0xFFFFFFFF
    >>> result = int_hash_val_comprehensive(simple_hash)
    >>> result.hash_function_name
    'simple_hash'
    >>> result.total_properties_tested
    7
DeterminismValidationResult
- class hbutils.encoding.int_hash_val.DeterminismValidationResult(passed: bool, failed_cases: List[str], total_tested: int, failed_count: int)
Results from determinism validation test.
- Parameters:
passed (bool) – Whether the determinism test passed
failed_cases (List[str]) – List of test cases that failed determinism check
total_tested (int) – Total number of test cases evaluated
failed_count (int) – Number of test cases that failed
TypeConsistencyValidationResult
- class hbutils.encoding.int_hash_val.TypeConsistencyValidationResult(passed: bool, failed_cases: List[str], total_tested: int, failed_count: int, consistent_hashes: Dict[str, int])
Results from type consistency validation test.
- Parameters:
passed (bool) – Whether the type consistency test passed
failed_cases (List[str]) – List of test cases that failed type consistency check
total_tested (int) – Total number of test cases evaluated
failed_count (int) – Number of test cases that failed
consistent_hashes (Dict[str, int]) – Dictionary mapping test strings to their consistent hash values
AvalancheEffectValidationResult
- class hbutils.encoding.int_hash_val.AvalancheEffectValidationResult(passed: bool, avg_bit_changes: float, change_percentage: float, total_comparisons: int, bit_changes_list: List[int], min_changes: int, max_changes: int)
Results from avalanche effect validation test.
- Parameters:
passed (bool) – Whether the avalanche effect test passed
avg_bit_changes (float) – Average number of bits changed across all comparisons
change_percentage (float) – Percentage of bits changed (avg_bit_changes / total_bits * 100)
total_comparisons (int) – Total number of hash comparisons performed
bit_changes_list (List[int]) – List of bit changes for each comparison
min_changes (int) – Minimum number of bits changed in any comparison
max_changes (int) – Maximum number of bits changed in any comparison
UniformDistributionValidationResult
- class hbutils.encoding.int_hash_val.UniformDistributionValidationResult(passed: bool, uniformity_score: float, bucket_stats: Dict[str, Any], sample_count: int, buckets: List[int])
Results from uniform distribution validation test.
- Parameters:
passed (bool) – Whether the uniform distribution test passed
uniformity_score (float) – Score indicating distribution uniformity (0-1, higher is better)
bucket_stats (Dict[str, Any]) – Statistics about bucket distribution
sample_count (int) – Number of samples used in the test
buckets (List[int]) – List of counts for each bucket
CollisionResistanceValidationResult
- class hbutils.encoding.int_hash_val.CollisionResistanceValidationResult(passed: bool, collision_count: int, collision_rate: float, sample_size: int, unique_hashes: int, collision_pairs: List[Tuple[str, int]])
Results from collision resistance validation test.
- Parameters:
passed (bool) – Whether the collision resistance test passed
collision_count (int) – Number of collisions detected
collision_rate (float) – Rate of collisions (collision_count / sample_size)
sample_size (int) – Total number of samples tested
unique_hashes (int) – Number of unique hash values generated
collision_pairs (List[Tuple[str, int]]) – List of (input, hash) tuples that collided
EmptyInputValidationResult
- class hbutils.encoding.int_hash_val.EmptyInputValidationResult(passed: bool, hash_results: List[int], consistent_empty_hash: bool, error_cases: List[Tuple[str, str]], empty_hash_value: int | None)
Results from empty input validation test.
- Parameters:
passed (bool) – Whether the empty input test passed
hash_results (List[int]) – List of hash values for empty inputs
consistent_empty_hash (bool) – Whether all empty inputs produced the same hash
error_cases (List[Tuple[str, str]]) – List of (input_type, error_message) tuples for failed cases
empty_hash_value (Union[int, None]) – The consistent hash value for empty inputs, or None if inconsistent
PerformanceValidationResult
- class hbutils.encoding.int_hash_val.PerformanceValidationResult(passed: bool, performance_data: Dict[int, Dict[str, float]], tested_sizes: List[int], completed_sizes: List[int])
Results from performance validation test.
- Parameters:
passed (bool) – Whether the performance test passed (completed without errors)
performance_data (Dict[int, Dict[str, float]]) – Dictionary mapping data sizes to performance metrics
tested_sizes (List[int]) – List of data sizes that were tested
completed_sizes (List[int]) – List of data sizes that completed successfully
ComprehensiveValidationResult
- class hbutils.encoding.int_hash_val.ComprehensiveValidationResult(passed: bool, not_passed_properties: List[str], hash_function_name: str, total_properties_tested: int, properties_passed: int, determinism: DeterminismValidationResult, type_consistency: TypeConsistencyValidationResult, avalanche_effect: AvalancheEffectValidationResult, uniform_distribution: UniformDistributionValidationResult, collision_resistance: CollisionResistanceValidationResult, empty_input: EmptyInputValidationResult, performance: PerformanceValidationResult)
Results from comprehensive validation test.
- Parameters:
passed (bool) – Whether all validation tests passed
not_passed_properties (List[str]) – List of property names that failed validation
hash_function_name (str) – Name of the hash function being validated
total_properties_tested (int) – Total number of properties tested
properties_passed (int) – Number of properties that passed validation
determinism (DeterminismValidationResult) – Results from determinism validation
type_consistency (TypeConsistencyValidationResult) – Results from type consistency validation
avalanche_effect (AvalancheEffectValidationResult) – Results from avalanche effect validation
uniform_distribution (UniformDistributionValidationResult) – Results from uniform distribution validation
collision_resistance (CollisionResistanceValidationResult) – Results from collision resistance validation
empty_input (EmptyInputValidationResult) – Results from empty input validation
performance (PerformanceValidationResult) – Results from performance validation
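The relationship between the summary fields (passed, not_passed_properties, total_properties_tested, properties_passed) can be shown with a small stand-in. The `Summary` dataclass and `summarize` function below are hypothetical and only mirror the field names documented above; they are not the library's classes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Summary:
    """Hypothetical stand-in mirroring the summary fields of the comprehensive result."""
    passed: bool
    not_passed_properties: List[str]
    total_properties_tested: int
    properties_passed: int


def summarize(results: Dict[str, bool]) -> Summary:
    """Aggregate per-property pass/fail flags into overall summary fields."""
    failed = [name for name, ok in results.items() if not ok]
    return Summary(
        passed=not failed,  # overall pass requires every property to pass
        not_passed_properties=failed,
        total_properties_tested=len(results),
        properties_passed=len(results) - len(failed),
    )


summary = summarize({"determinism": True, "avalanche_effect": False, "empty_input": True})
print(summary.passed, summary.not_passed_properties)  # False ['avalanche_effect']
```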