hbutils.system.network.url

URL parsing and splitting utilities for network operations.

This module splits URLs into their constituent components: scheme, host, path, query string, and fragment. On top of the urllib.parse functionality it wraps, it can parse query parameters into a dictionary, split the path into decoded segments, and extract the filename.

The module contains the following main components:

  • SplitURL - Dataclass representing a parsed URL with convenient properties

  • urlsplit() - Function to split a URL string into a SplitURL object

Example:

>>> from hbutils.system.network.url import urlsplit
>>> sp = urlsplit('https://example.com/path/to/file.txt?q=1&v=kdjf&q=2#frag')
>>> sp.scheme
'https'
>>> sp.host
'example.com'
>>> sp.query_dict
{'q': ['1', '2'], 'v': 'kdjf'}
>>> sp.path_segments
['', 'path', 'to', 'file.txt']
>>> sp.filename
'file.txt'

__all__

hbutils.system.network.url.__all__ = ['urlsplit', 'SplitURL']

SplitURL

class hbutils.system.network.url.SplitURL(url: str, scheme: str, host: str, path: str, query: str, fragment: str)

A dataclass representing a parsed URL with its constituent components.

This class provides convenient access to URL components and additional properties for working with query parameters and path segments.

Variables:
  • url (str) – The original URL string.

  • scheme (str) – The URL scheme (e.g., 'http', 'https', 'ftp').

  • host (str) – The host/netloc part of the URL.

  • path (str) – The path component of the URL.

  • query (str) – The query string (without the leading '?').

  • fragment (str) – The fragment identifier (without the leading '#').

__repr__() → str

Get a detailed string representation of the SplitURL object.

Only non-empty components are included in the representation. The query portion is shown as the parsed dictionary from query_dict.

Returns:

A string representation showing all non-empty components.

Return type:

str

Example:

>>> sp = urlsplit('https://example.com/path?q=1#frag')
>>> repr(sp)
"SplitURL(scheme='https', host='example.com', path='/path', query={'q': '1'}, fragment='frag')"

__str__() → str

Get the original URL string.

Returns:

The original URL.

Return type:

str

property filename: str | None

Get the filename from the URL path.

Returns the last segment of the path. If the path is empty or ends with a slash, the last segment, and therefore the filename, is an empty string.

Returns:

The filename segment from the path.

Return type:

Optional[str]

Example:

>>> sp = urlsplit('https://example.com/path/to/file.txt')
>>> sp.filename
'file.txt'
>>> sp = urlsplit('https://example.com')
>>> sp.filename
''
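
The edge cases described above can be reproduced with the standard library alone. The following is a minimal sketch, not the library's implementation: last_segment is a hypothetical helper, and it assumes the filename is simply the URL-decoded text after the final '/' in the path.

```python
from urllib.parse import unquote


def last_segment(path: str) -> str:
    """Hypothetical helper: decoded text after the final '/' in a URL path."""
    # str.split always returns at least one element, so [-1] is safe
    # even for an empty path.
    return unquote(path.split('/')[-1])


print(repr(last_segment('/path/to/file.txt')))  # 'file.txt'
print(repr(last_segment('/path/to/')))          # '' (path ends with a slash)
print(repr(last_segment('')))                   # '' (empty path)
```

Code that needs a real file name should therefore check for an empty string before using the result.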

property path_segments: List[str]

Get the path split into individual segments.

The path is split by '/' and each segment is URL-decoded.

Returns:

A list of decoded path segments.

Return type:

List[str]

Example:

>>> sp = urlsplit('https://example.com/path/to/file.txt')
>>> sp.path_segments
['', 'path', 'to', 'file.txt']
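
The split-then-decode behaviour can be sketched with urllib.parse.unquote. This is an illustration under the assumption that splitting happens before decoding, as the description suggests; decoded_segments is a hypothetical name, not part of hbutils.

```python
from urllib.parse import unquote


def decoded_segments(path: str) -> list:
    """Hypothetical helper: split a URL path on '/' and decode each piece."""
    # Splitting first means an escaped slash (%2F) stays inside its segment
    # instead of creating an extra one.
    return [unquote(segment) for segment in path.split('/')]


print(decoded_segments('/path/to/file.txt'))  # ['', 'path', 'to', 'file.txt']
print(decoded_segments('/my%20docs/a%2Fb'))   # ['', 'my docs', 'a/b']
```

The leading empty string comes from the '/' at the start of the path, which matches the output shown in the example above.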

property query_dict: Dict[str, object | None]

Parse the query string into a dictionary.

When a query parameter appears multiple times, its values are stored as a list. Single-occurrence parameters are stored as single values, and parameters without explicit values are stored as None.

Returns:

A dictionary mapping parameter names to their values (or lists of values).

Return type:

dict

Example:

>>> sp = urlsplit('https://example.com?q=1&v=kdjf&q=2')
>>> sp.query_dict
{'q': ['1', '2'], 'v': 'kdjf'}
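
The collapsing rule described above (a single value, a list for repeated names, None when no '=' is present) can be sketched as follows. This is one possible reading of that rule written against the standard library, not the code hbutils uses; collapse_query is a hypothetical name, and decoding with unquote_plus is an assumption.

```python
from urllib.parse import unquote_plus


def collapse_query(query: str) -> dict:
    """Hypothetical helper: parse a query string into a dict of values."""
    result = {}
    for pair in query.split('&'):
        if not pair:
            continue  # skip empty chunks such as the one in 'a=1&&b=2'
        name, sep, value = pair.partition('=')
        name = unquote_plus(name)
        # No '=' at all means the parameter has no explicit value.
        value = unquote_plus(value) if sep else None
        if name not in result:
            result[name] = value
        elif isinstance(result[name], list):
            result[name].append(value)
        else:
            # Second occurrence: promote the single value to a list.
            result[name] = [result[name], value]
    return result


print(collapse_query('q=1&v=kdjf&q=2'))  # {'q': ['1', '2'], 'v': 'kdjf'}
print(collapse_query('flag&x='))         # {'flag': None, 'x': ''}
```

Because a value can be a string, a list, or None, callers that expect repeated parameters should normalise the result before iterating over it.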

urlsplit

hbutils.system.network.url.urlsplit(url: str) → SplitURL

Split a URL into its constituent components.

This function parses a URL string and returns a SplitURL object containing the scheme, host, path, query parameters, and fragment. It provides enhanced functionality over urllib.parse.urlsplit() by offering convenient properties for accessing parsed query parameters and path segments.

Parameters:

url (str) – The URL string to split.

Returns:

A SplitURL object containing the parsed URL components.

Return type:

SplitURL

Examples:

>>> from hbutils.system import urlsplit
>>>
>>> sp = urlsplit('https://www.baidu.com/dslkjf/sdfhk/asdasd.png?q=1&v=kdjf&q=2#fff')
>>> sp
SplitURL(scheme='https', host='www.baidu.com', path='/dslkjf/sdfhk/asdasd.png', query={'q': ['1', '2'], 'v': 'kdjf'}, fragment='fff')
>>> repr(sp)
"SplitURL(scheme='https', host='www.baidu.com', path='/dslkjf/sdfhk/asdasd.png', query={'q': ['1', '2'], 'v': 'kdjf'}, fragment='fff')"
>>>
>>> sp.scheme
'https'
>>> sp.host
'www.baidu.com'
>>> sp.path
'/dslkjf/sdfhk/asdasd.png'
>>> sp.query
'q=1&v=kdjf&q=2'
>>> sp.fragment
'fff'
>>>
>>> sp.query_dict
{'q': ['1', '2'], 'v': 'kdjf'}
>>> sp.path_segments
['', 'dslkjf', 'sdfhk', 'asdasd.png']
>>> sp.filename
'asdasd.png'
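
For comparison, the five basic fields can be obtained from urllib.parse.urlsplit directly. The sketch below shows roughly what wrapping the standard library looks like; SimpleSplit and simple_urlsplit are hypothetical stand-ins rather than the hbutils implementation, and mapping netloc to host is an assumption based on the field description above.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit as std_urlsplit


@dataclass
class SimpleSplit:
    """Hypothetical stand-in with the same six fields as SplitURL."""
    url: str
    scheme: str
    host: str
    path: str
    query: str
    fragment: str


def simple_urlsplit(url: str) -> SimpleSplit:
    parts = std_urlsplit(url)
    # The standard library calls the host part 'netloc'.
    return SimpleSplit(url, parts.scheme, parts.netloc,
                       parts.path, parts.query, parts.fragment)


sp = simple_urlsplit('https://www.baidu.com/dslkjf/sdfhk/asdasd.png?q=1&v=kdjf&q=2#fff')
print(sp.host)   # www.baidu.com
print(sp.query)  # q=1&v=kdjf&q=2
```

The standard library stops at the raw query string and path; the query_dict, path_segments, and filename properties are the parts SplitURL adds on top.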