hbutils.collection.sequence

Sequence collection utilities for deduplication and grouping operations.

This module provides lightweight helpers for manipulating sequences and iterables. The focus is on preserving ordering and offering flexible grouping behavior with optional post-processing of each group.

The module contains the following public functions:

  • unique() - Remove duplicate elements while preserving original order

  • group_by() - Group elements by a key function with optional post-processing

Note

The unique() function relies on hashing for membership checks. Elements must therefore be hashable, and the input sequence type must be constructible from a list for the result to be returned with the same type.
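
Both constraints can be observed in plain Python without importing hbutils. The snippet below is only an illustration; the helper raises_type_error is a hypothetical name introduced here and is not part of the library.

```python
def raises_type_error(func):
    """Return True if calling func() raises TypeError (hypothetical helper)."""
    try:
        func()
    except TypeError:
        return True
    return False


# A list is unhashable, so a set-based membership check rejects it.
hashing_fails = raises_type_error(lambda: [1, 2] in set())

# range cannot be constructed from a list, so type(s)(items) would fail.
rebuild_fails = raises_type_error(lambda: type(range(3))([0, 1, 2]))

# tuple can be constructed from a list, so the input type is preserved.
tuple_rebuilt = type((1, 2))([1, 2])

print(hashing_fails, rebuild_fails, tuple_rebuilt)
```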

Example:

>>> from hbutils.collection.sequence import unique, group_by
>>> unique([1, 2, 3, 1, 2])
[1, 2, 3]
>>> group_by(['apple', 'pear', 'peach'], key=lambda x: x[0])
{'a': ['apple'], 'p': ['pear', 'peach']}

__all__

hbutils.collection.sequence.__all__ = ['unique', 'group_by']


unique

hbutils.collection.sequence.unique(s: Sequence[_ElementType]) → Sequence[_ElementType]

Remove duplicate elements from a sequence while preserving original order.

This function iterates through the input sequence, keeping the first occurrence of each element. The returned sequence is constructed by calling type(s) on the list of unique items, preserving the original sequence type where possible.

Parameters:

s (Sequence[_ElementType]) – Original sequence to be deduplicated.

Returns:

Sequence containing the first occurrence of each element in original order, returned as the same type as the input when that type can be constructed from a list.

Return type:

Sequence[_ElementType]

Raises:

TypeError – If elements are unhashable or if type(s) cannot be constructed from a list.

Examples:
>>> from hbutils.collection import unique
>>>
>>> unique([1, 2, 3, 1])
[1, 2, 3]
>>> unique(('a', 'b', 'a', 'c', 'd', 'e', 'b'))
('a', 'b', 'c', 'd', 'e')
>>> unique([3, 1, 2, 1, 4, 3])
[3, 1, 2, 4]
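
The behavior described for unique() can be approximated with a short sketch. This is not the library's source code; unique_sketch is a hypothetical name, and the sketch assumes only what the description states: a hash-based membership check, first occurrence kept, and the result rebuilt with type(s).

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def unique_sketch(s: Sequence[T]) -> Sequence[T]:
    """Hypothetical re-implementation of the documented behavior."""
    seen = set()   # elements already emitted; requires hashable elements
    items = []     # first occurrences, in original order
    for element in s:
        if element not in seen:
            seen.add(element)
            items.append(element)
    # Rebuild with the input's own type, e.g. list or tuple.
    return type(s)(items)


print(unique_sketch([3, 1, 2, 1, 4, 3]))
print(unique_sketch(("a", "b", "a", "c")))
```

Because the result is rebuilt with type(s), a tuple input yields a tuple and a list input yields a list, which matches the examples above.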

group_by

hbutils.collection.sequence.group_by(s: Iterable[_ElementType], key: Callable[[_ElementType], _GroupType], gfunc: Callable[[List[_ElementType]], _ResultType] | None = None) → Dict[_GroupType, _ResultType]

Group iterable elements by a key function with optional post-processing.

Elements from the input iterable are collected into lists keyed by the result of key. If gfunc is provided, each group list is passed to this function and the returned value is used as the group result. When gfunc is None, the raw lists are returned.

Parameters:
  • s (Iterable[_ElementType]) – Elements to be grouped.

  • key (Callable[[_ElementType], _GroupType]) – Callable that computes the grouping key for each element.

  • gfunc (Optional[Callable[[List[_ElementType]], _ResultType]]) – Optional post-processing function for each group. If None, group values are returned as raw lists. Defaults to None.

Returns:

Dictionary mapping each group key to the value returned by gfunc for that group, or to the raw list of elements when gfunc is None.

Return type:

Dict[_GroupType, _ResultType]

Examples:
>>> from hbutils.collection import group_by
>>>
>>> foods = [
...     'apple', 'orange', 'pear',
...     'banana', 'fish', 'pork', 'milk',
... ]
>>> group_by(foods, len)  # group by length
{5: ['apple'], 6: ['orange', 'banana'], 4: ['pear', 'fish', 'pork', 'milk']}
>>> group_by(foods, len, len)  # group and get length
{5: 1, 6: 2, 4: 4}
>>> group_by(foods, lambda x: x[0])  # group by first letter
{'a': ['apple'], 'o': ['orange'], 'p': ['pear', 'pork'], 'b': ['banana'], 'f': ['fish'], 'm': ['milk']}
>>> group_by(foods, lambda x: x[0], len)  # group and get length
{'a': 1, 'o': 1, 'p': 2, 'b': 1, 'f': 1, 'm': 1}
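
The grouping and post-processing steps described for group_by() can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation; group_by_sketch is a hypothetical name, and the sketch assumes groups are collected in first-seen key order, as the outputs above suggest.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional, TypeVar

E = TypeVar("E")
G = TypeVar("G")


def group_by_sketch(
    s: Iterable[E],
    key: Callable[[E], G],
    gfunc: Optional[Callable[[List[E]], Any]] = None,
) -> Dict[G, Any]:
    """Hypothetical re-implementation of the documented behavior."""
    groups: Dict[G, List[E]] = {}
    for element in s:
        # dict preserves insertion order, so keys appear in first-seen order.
        groups.setdefault(key(element), []).append(element)
    if gfunc is None:
        return groups  # raw lists
    # Apply the post-processing function to each collected group.
    return {k: gfunc(v) for k, v in groups.items()}


foods = ["apple", "orange", "pear", "banana", "fish", "pork", "milk"]
print(group_by_sketch(foods, len))
print(group_by_sketch(foods, len, len))
```

Collecting full lists first and applying gfunc afterwards keeps the post-processing function simple: it always receives a complete list for one key.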