Structure
WebURL
public struct WebURL
A Uniform Resource Locator (URL) is a universal identifier, which often describes the location of a resource.
URL parsing and serialization is compatible with the WHATWG URL Standard.
The WebURL API is designed to meet the needs and expectations of Swift developers, expanding on the JavaScript API described in the standard to add path-component manipulation, host objects, and more. Some of the component values have been tweaked to not include their leading or trailing delimiters, and component setters are a little stricter and more predictable, but in all other respects they should have the same behaviour.
For more information on the differences between this API and the JavaScript URL class, see the WebURL.JSModel type.
Relationships
Nested Types
WebURL.FormEncodedQueryParameters
A view of the application/x-www-form-urlencoded key-value pairs in a URL's query.
WebURL.Host
A host is a domain, an IPv4 address, an IPv6 address, an opaque host, or an empty host. Typically a host serves as a network address, but it is sometimes used as an opaque identifier in URLs where a network address is not necessary.
WebURL.JSModel
An interface to this URL whose properties have the same behaviour as the JavaScript URL class described in the URL Standard.
WebURL.Origin
Origins are the fundamental currency of the Web's security model.
WebURL.PathComponents
A view of the components in a URL's path.
WebURL.UTF8View
A view of the UTF-8 code-units in a serialized URL.
Conforms To
Codable
Comparable
CustomStringConvertible
Equatable
Hashable
LosslessStringConvertible
Initializers
init(filePath:format:)
public init<S>(
filePath: S, format: FilePathFormat = .native
) throws where S: StringProtocol, S.UTF8View: BidirectionalCollection
Creates a file: URL representation of a file path.
The given path must be absolute according to the given FilePathFormat, and not contain any components which traverse to their parent directories (.. components). Some minimal normalization is applied, according to the path format.
Note that even though a String may be a valid source of a file path, some paths may be corrupted when storing or manipulating them as a String. Therefore, this library does not offer a corresponding API to reconstruct a file path as a String. Instead, developers are encouraged to use the FilePath type from the swift-system package. The FilePath(url: WebURL) initializer available in the WebURLSystemExtras module is the best way to reconstruct a file path from its URL representation.
Parameters
Name | Type | Description |
---|---|---|
filePath | S | The file path from which to create the file URL. |
format | FilePathFormat | The way in which |
Throws
URLFromFilePathError
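As a rough sketch of the rules above, the following shows an absolute POSIX path being converted, and a relative path and a path with a .. component being rejected. It assumes the swift-url package is available to the target and that FilePathFormat offers a .posix value (the text refers to posix and windows formats); the example paths are made up for illustration.

```swift
import WebURL  // Assumes the swift-url package is a dependency of this target.

// An absolute path with no ".." components is accepted.
// `.posix` is assumed to be one of the available FilePathFormat values.
let fileURL = try WebURL(filePath: "/usr/bin/swift", format: .posix)
print(fileURL.serialized())  // "file:///usr/bin/swift"

// Relative paths are rejected, because URLs cannot express them.
let relativeRejected = (try? WebURL(filePath: "docs/readme.md", format: .posix)) == nil
print(relativeRejected)  // true

// Paths containing ".." components are rejected as well.
let traversalRejected = (try? WebURL(filePath: "/usr/../etc/passwd", format: .posix)) == nil
print(traversalRejected)  // true
```

Using try? here discards the specific URLFromFilePathError; real code would normally catch and inspect it.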
init?(_:)
@inlinable @inline(__always)
public init?<S>(_ string: S) where S: StringProtocol, S.UTF8View: BidirectionalCollection
Constructs a URL by parsing the given string.
This parser is compatible with the WHATWG URL Standard; this means that whitespace characters may be removed from the given string, other characters may be percent-encoded based on which component they belong to, IP addresses rewritten in canonical notation, and paths lexically simplified, among other transformations defined by the standard.
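A short illustration of some of these transformations, assuming the swift-url package is available to the target. The example URLs are arbitrary, and the results follow from the WHATWG parsing rules described above.

```swift
import WebURL  // Assumes the swift-url package is a dependency of this target.

// The scheme and domain are lowercased, and the path is lexically simplified.
let url = WebURL("HTTP://EXAMPLE.com/a/./b/../c")!
print(url.serialized())  // "http://example.com/a/c"

// IPv4 addresses are rewritten in canonical dotted-decimal notation.
let ipURL = WebURL("http://0x7F.1/")!
print(ipURL.serialized())  // "http://127.0.0.1/"

// Leading and trailing whitespace is removed.
let trimmed = WebURL("  https://example.com/  ")!
print(trimmed.serialized())  // "https://example.com/"

// A string without a scheme cannot be parsed as an absolute URL.
let invalid = WebURL("not a url")
print(invalid == nil)  // true
```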
init?(utf8:)
@inlinable @inline(__always)
public init?<UTF8Bytes>(utf8: UTF8Bytes) where UTF8Bytes: BidirectionalCollection, UTF8Bytes.Element == UInt8
Constructs a URL by parsing the given string, which is provided as a collection of UTF-8 code-units.
This parser is compatible with the WHATWG URL Standard; this means that whitespace characters may be removed from the given string, other characters may be percent-encoded based on which component they belong to, IP addresses rewritten in canonical notation, and paths lexically simplified, among other transformations defined by the standard.
init(from:)
public init(from decoder: Decoder) throws
Properties
formParams
public var formParams: FormEncodedQueryParameters
A mutable view of the application/x-www-form-urlencoded key-value pairs in this URL's query.
host
public var host: Host?
The host of this URL, if present.
A host is a domain, an IPv4 address, an IPv6 address, an opaque host, or an empty host. Typically a host serves as a network address, but it is sometimes used as an opaque identifier in URLs where a network address is not necessary.
jsModel
public var jsModel: JSModel
An interface to this URL whose properties have the same behaviour as the JavaScript URL class described in the URL Standard.
See the documentation for the WebURL.JSModel type for more information.
origin
public var origin: Origin
The origin of this URL.
Origins are the fundamental currency of the Web's security model. Two actors in the Web platform that share an origin are assumed to trust each other and to have the same authority. Actors with differing origins are considered potentially hostile versus each other, and are isolated from each other to varying degrees.
pathComponents
public var pathComponents: PathComponents
A mutable view of this URL's path components.
Accessing this property is invalid and will trigger a runtime error if the URL has an opaque path (see WebURL.hasOpaquePath).
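A minimal sketch of reading and modifying the view, assuming the swift-url package is available. The members of PathComponents are not listed in this section, so treating the view as a collection of String and calling append(_:) are assumptions based on the type's own documentation.

```swift
import WebURL  // Assumes the swift-url package is a dependency of this target.

var url = WebURL("https://example.com/a/b")!

// Check for an opaque path first; accessing the view would otherwise trap.
if !url.hasOpaquePath {
  // Assumption: the view is a collection whose elements are Strings.
  print(Array(url.pathComponents))  // ["a", "b"]

  // Assumption: append(_:) adds a component to the end of the path.
  url.pathComponents.append("c")
}
print(url.path)  // "/a/b/c"
```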
utf8
@inlinable
public var utf8: UTF8View
A mutable view of the UTF-8 code-units of this URL's serialization.
description
public var description: String
scheme
public var scheme: String
The scheme of this URL, for example https or file.
A URL’s scheme is a non-empty ASCII string that identifies the type of URL and can be used to dispatch a URL for further processing.
For example, a URL with the "http" scheme should be processed by software that understands the HTTP protocol. Every URL must have a scheme.
Some schemes (http, https, ws, wss, ftp, and file) are referred to as being "special"; the components of URLs with these schemes may have unique encoding requirements, or the components may carry additional meaning. Scheme usage is co-ordinated by the Internet Assigned Numbers Authority.
Setting this property may fail if the new scheme is invalid, or if the URL cannot be adjusted to the new scheme.
username
public var username: String?
The username of this URL, if present, as a non-empty, percent-encoded ASCII string.
When setting this property, any code-points which are not valid for use in the URL's user-info section will be percent-encoded. Setting this property may fail if the URL does not allow credentials.
password
public var password: String?
The password of this URL, if present, as a non-empty, percent-encoded string.
When setting this property, any code-points which are not valid for use in the URL's user-info section will be percent-encoded. Setting this property may fail if the URL does not allow credentials.
hostname
public var hostname: String?
The string representation of this URL's host, if present.
A URL's host can be a domain, an IPv4 address, an IPv6 address, an opaque host, or an empty host. Typically a host serves as a network address, but is sometimes used as an opaque identifier in URLs where a network address is not necessary.
When setting this property, the new contents will be parsed and normalized (e.g. domains will be percent-decoded and lowercased, and IP addresses will be rewritten in their canonical form). Unlike setting other components, not all code-points which are invalid for use in hostnames will be percent-encoded. If the new content contains a forbidden host code-point, the operation will fail.
port
public var port: Int?
The port of this URL, if present. Valid port numbers are in the range 0 ..< 65536.
Setting this property may fail if the new value is out of range, or if the URL does not support port numbers. If the URL has a "special" scheme, setting the port to its known default value will remove the port.
portOrKnownDefault
public var portOrKnownDefault: Int?
The port of this URL, if present, or the default port of its scheme, if it has one.
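The relationship between port and portOrKnownDefault can be sketched as follows, assuming the swift-url package is available. The example uses the throwing setPort(_:) method documented further down this page, and an arbitrary example URL.

```swift
import WebURL  // Assumes the swift-url package is a dependency of this target.

var url = WebURL("https://example.com:8443/")!
print(url.port as Any)  // Optional(8443)

// Setting the port to the scheme's known default removes it from the URL.
try url.setPort(443)
print(url.port == nil)                // true
print(url.portOrKnownDefault as Any)  // Optional(443)
print(url.serialized())               // "https://example.com/"

// Values outside 0 ..< 65536 are rejected.
let outOfRangeRejected = (try? url.setPort(65536)) == nil
print(outOfRangeRejected)  // true
```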
path
public var path: String
The string representation of this URL's path.
A URL’s path is either an opaque ASCII string or a list of zero or more ASCII strings, usually identifying a location.
Paths which are lists begin with a forward-slash, and their components are delimited by forward-slashes ("/"). Empty paths are lists if the URL has a hostname.
When setting this property, the given path will be lexically simplified, and any code-points in the path's components that are not valid for use will be percent-encoded. Setting this property will fail if the URL's path is opaque (see WebURL.hasOpaquePath).
query
public var query: String?
The string representation of this URL's query, if present.
A URL's query contains non-hierarchical data that, along with the path, serves to identify a resource. The precise structure of the query string is not standardized, but is often used to store a list of key-value pairs ("parameters").
This string representation does not include the leading ? delimiter.
When setting this property, any code-points which are not valid for use in the URL's query will be percent-encoded. Note that the set of code-points which are valid depends on the URL's scheme.
fragment
public var fragment: String?
The fragment of this URL, if present, as a percent-encoded string.
A URL's fragment is an optional string which may be used for further processing on the resource identified by the other components.
This string representation does not include the leading # delimiter.
When setting this property, any code-points which are not valid for use in the URL's fragment will be percent-encoded.
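To illustrate that the query and fragment values omit their leading delimiters, here is a small sketch. It assumes the swift-url package is available and uses an arbitrary example URL.

```swift
import WebURL  // Assumes the swift-url package is a dependency of this target.

let url = WebURL("https://example.com/search?q=swift#results")!

// Component values do not include the leading "?" or "#".
print(url.path)             // "/search"
print(url.query as Any)     // Optional("q=swift")
print(url.fragment as Any)  // Optional("results")

// A URL without these components reports nil for them.
let plain = WebURL("https://example.com/search")!
print(plain.query == nil, plain.fragment == nil)  // true true

// The fragment can be left out when serializing.
print(url.serialized(excludingFragment: true))  // "https://example.com/search?q=swift"
```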
hasOpaquePath
public var hasOpaquePath: Bool
Whether this URL has an opaque path.
URLs with opaque paths are non-hierarchical: they do not have a hostname, and their paths are opaque strings which cannot be split into components. They can be recognized by the lack of slashes immediately following the scheme delimiter, for example:
- mailto:bob@example.com
- javascript:alert("hello");
- data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==
It is invalid to set any authority components, such as username, password, hostname or port, on these URLs. Modifying the path or accessing the URL's pathComponents is also invalid, and they only support limited forms of relative references.
URLs with special schemes (such as http/s and file) never have opaque paths.
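A brief sketch of how this property might be used to guard operations that are invalid for opaque paths. It assumes the swift-url package is available, and it uses the throwing setPath(_:) method documented later on this page.

```swift
import WebURL  // Assumes the swift-url package is a dependency of this target.

var mail = WebURL("mailto:bob@example.com")!
print(mail.hasOpaquePath)    // true
print(mail.hostname == nil)  // true
print(mail.path)             // "bob@example.com"

// Modifying the path of a URL with an opaque path fails.
let setPathRejected = (try? mail.setPath("/inbox")) == nil
print(setPathRejected)  // true

// URLs with special schemes never have opaque paths.
let web = WebURL("https://example.com/")!
print(web.hasOpaquePath)  // false
```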
Methods
fromFilePathBytes(_:format:)
@available(*, deprecated, renamed: "fromBinaryFilePath")
public static func fromFilePathBytes<Bytes>(
_ path: Bytes, format: FilePathFormat = .native
) throws -> WebURL where Bytes: BidirectionalCollection, Bytes.Element == UInt8
filePathBytes(from:format:nullTerminated:)
@available(*, deprecated, renamed: "binaryFilePath")
public static func filePathBytes(
from url: WebURL, format: FilePathFormat = .native, nullTerminated: Bool
) throws -> ContiguousArray<UInt8>
fromBinaryFilePath(_:format:)
@inlinable
public static func fromBinaryFilePath<Bytes>(
_ path: Bytes, format: FilePathFormat = .native
) throws -> WebURL where Bytes: BidirectionalCollection, Bytes.Element == UInt8
Creates a file: URL representation of the given file path.
This is a low-level API, which accepts a file path as a collection of 8-bit code-units, so that paths which are not valid UTF-8 text can also be represented. Generally, users should prefer swift-system's FilePath type to store file paths rather than storing them as collections of code-units, and the WebURL(filePath: FilePath) initializer in the WebURLSystemExtras module is the preferred interface to this function.
Supported paths
The given path must be absolute (fully-qualified) according to the given FilePathFormat, and not contain any components which traverse to their parent directories (.. components). It also may not include NULL bytes, so the terminator of a null-terminated string should not be included in the given collection of code-units.
These restrictions are considered a feature. URLs have no way of expressing relative paths, so such paths would first have to be resolved against a base path or working directory. Components which traverse upwards in the filesystem and unexpected NULL bytes are common sources of security vulnerabilities. Best practice is to first resolve and normalize such paths before turning them into URLs, so you have an opportunity to check where the path actually points to.
Windows long paths
This function supports Win32 file namespace paths (a.k.a. "long" paths) which start with a drive letter or point to a UNC location.
These paths are identified by the prefix \\?\. Note that this prefix is not preserved in the resulting URL or any paths created from that URL.
In a deliberate departure from Microsoft's documentation, empty and . components will be resolved in these paths. Other path components will not be trimmed or otherwise altered. This library considers this minimal structural normalization to be safe, given that support is restricted to paths with drive letters or UNC locations. Empty components have no meaning on any filesystem, and all commonly-used filesystems on Windows (including FAT, exFAT, NTFS, and ReFS) forbid files and directories named ".". SMB/CIFS and NFS (the most common protocols used with UNC) likewise define that . components refer to the current directory.
Additionally, .. components and forward slashes are forbidden, as they represent components literally named .. or with forward slashes in their name. However, Windows APIs are not consistent, and will sometimes interpret these as traversal instructions or path separators regardless. This ambiguity means that accepting such components could lead to hidden directory traversal and other security vulnerabilities. Again, no commonly-used filesystem on Windows supports these file or directory names anyway, and the network protocols listed above also forbid them.
Encoding
While we may usually think of file and directory names as strings of text, some operating systems do not treat them that way. Behaviours vary greatly between operating systems, from fully-defined code-units which are normalized by the system (Apple OSes), to fully-defined code-units which are not normalized (Windows), to opaque, numeric code-units with no defined interpretation (Linux, general POSIX).
In order to process a path, we need to at least understand its structure. The provided FilePathFormat determines how the code-units will be interpreted - e.g. which values are reserved as path separators, which normalization rules apply, and whether or not the code-units can be interpreted as textual code-points.
POSIX paths generally cannot be transcoded and should be provided to the function in their native 8-bit encoding. Windows paths should be transcoded from their native UTF-16 to UTF-8. Windows paths may also be provided in the system-specific ANSI encoding, but this is discouraged as the resulting URL will contain percent-encoded ANSI code-points which are not portable. This should only be done when legacy considerations demand it.
Normalization
Some basic lexical normalization is applied to the path, according to the given FilePathFormat. For example, empty path components may be collapsed and some file or directory names trimmed. Refer to the documentation for FilePathFormat for information about the kinds of normalization which each format defines.
Parameters
Name | Type | Description |
---|---|---|
path | Bytes | The file path, as a |
format | FilePathFormat | The way in which |
Throws
URLFromFilePathError
Returns
A URL with the file scheme which can be used to reconstruct the given path.
binaryFilePath(from:format:nullTerminated:)
@inlinable
public static func binaryFilePath(
from url: WebURL, format: FilePathFormat = .native, nullTerminated: Bool
) throws -> [UInt8]
Reconstructs a file path from its URL representation.
This is a low-level function, which returns a file path as an array of 8-bit code-units. Correct usage requires detailed knowledge of character sets and text encodings, and users should generally prefer higher-level APIs, such as FilePath(url: WebURL) from the WebURLSystemExtras module.
Accepted URLs
This function only accepts URLs with a file scheme, whose paths do not contain percent-encoded path separators or NULL bytes.
For the windows format, the reconstructed path must be fully-qualified - meaning that it is either a UNC path, or begins with a drive-letter.
Note that not all URLs are compatible with all path formats. For example, there is no obvious way to construct a posix path to a remote host.
Encoding
This function returns an array of 8-bit code-units, formed by percent-decoding the URL's contents.
- To use the returned path on a POSIX system, request null-terminated code-units and map/reinterpret them to the platform's CChar type. POSIX requires that the CChar type be 8 bits, so this is safe. As paths on POSIX systems are opaque octets, the path should be used as-is.
- To use the returned path on a Windows system, some transcoding may be necessary as the system natively uses 16-bit code-units. This can be difficult, as in general there is no way to know how the file path was encoded, or how to interpret the bytes as textual code-points. A good strategy is to first interpret the bytes as UTF-8 and transcode to UTF-16. Should that fail, assume that code-units are from the system's active code-page and use the MultiByteToWideChar function to produce a UTF-16 file path.
Normalization
Some minimal normalization is applied to the path, such as removing empty path components.
Unlike the WebURL.fromBinaryFilePath function, Windows path components are never trimmed.
Parameters
Name | Type | Description |
---|---|---|
url | WebURL | The file URL. |
format | FilePathFormat | The path format which should be constructed; either |
nullTerminated | Bool | Whether the reconstructed file path bytes should include a null-terminator. |
Throws
FilePathFromURLError
Returns
The reconstructed file path bytes from url.
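A hedged sketch of a round trip through fromBinaryFilePath and binaryFilePath, assuming the swift-url package is available and that FilePathFormat offers a .posix value. The file path is invented for illustration; most code should prefer the FilePath-based APIs mentioned above.

```swift
import WebURL  // Assumes the swift-url package is a dependency of this target.

// A POSIX path as 8-bit code-units, without a null terminator.
let originalPath = Array("/tmp/hello world.txt".utf8)

// `.posix` is assumed to be one of the available FilePathFormat values.
let fileURL = try WebURL.fromBinaryFilePath(originalPath, format: .posix)
print(fileURL.serialized())  // "file:///tmp/hello%20world.txt"

// Reconstructing the path percent-decodes the URL's contents.
let roundTripped = try WebURL.binaryFilePath(from: fileURL, format: .posix, nullTerminated: false)
print(roundTripped == originalPath)  // true

// Request a null terminator when the bytes will be handed to C APIs.
let cPath = try WebURL.binaryFilePath(from: fileURL, format: .posix, nullTerminated: true)
print(cPath.last == 0)  // true
```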
resolve(_:)
@inlinable @inline(__always)
public func resolve<S>(_ string: S) -> WebURL? where S: StringProtocol, S.UTF8View: BidirectionalCollection
Parses the given string with this URL as its base.
This function supports a wide range of relative URL strings, producing the same result as an HTML <a> tag on the page given by this URL.
let base = WebURL("http://example.com/karl/index.html")!
base.resolve("photos/img.jpg?size=200x200")! // "http://example.com/karl/photos/img.jpg?size=200x200"
base.resolve("/mary/lambs/1/fleece.txt")! // "http://example.com/mary/lambs/1/fleece.txt"
It should be noted that this method accepts protocol-relative URLs, which are able to direct to a different hostname, as well as absolute URL strings, which do not copy any information from their base URLs.
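The cases mentioned in that note can be sketched as follows, assuming the swift-url package is available. The hostnames are placeholders; the point is that callers who only expect same-host results may want to compare the hostname or origin of the result with that of the base.

```swift
import WebURL  // Assumes the swift-url package is a dependency of this target.

let base = WebURL("http://example.com/karl/index.html")!

// A protocol-relative reference keeps the scheme but changes the host.
let protocolRelative = base.resolve("//cdn.example.net/lib.js")!
print(protocolRelative.serialized())  // "http://cdn.example.net/lib.js"

// An absolute URL string copies nothing from the base.
let absolute = base.resolve("https://other.example/")!
print(absolute.serialized())  // "https://other.example/"

// Ordinary relative references stay on the base URL's host.
let sibling = base.resolve("../mary?x=1")!
print(sibling.serialized())  // "http://example.com/mary?x=1"

// One possible check that a resolved URL did not change host.
print(protocolRelative.hostname == base.hostname)  // false
print(sibling.hostname == base.hostname)           // true
```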
hash(into:)
public func hash(into hasher: inout Hasher)
encode(to:)
public func encode(to encoder: Encoder) throws
serialized(excludingFragment:)
public func serialized(excludingFragment: Bool = false) -> String
Returns the string representation of this URL.
Parameters
Name | Type | Description |
---|---|---|
excludingFragment | Bool | Whether the fragment should be omitted from the result. The default is |
setScheme(_:)
@inlinable
public mutating func setScheme<S>(_ newScheme: S) throws where S: StringProtocol
Replaces this URL's scheme with the given string.
Setting this component may fail if the new scheme is invalid, or if the URL cannot be adjusted to the new scheme.
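Both failure modes can be seen in a small sketch, assuming the swift-url package is available. The scheme names "1nvalid" and "foo" are arbitrary examples of an invalid scheme and a non-special scheme respectively.

```swift
import WebURL  // Assumes the swift-url package is a dependency of this target.

var url = WebURL("http://example.com/")!

// Switching between two special schemes succeeds.
try url.setScheme("https")
print(url.serialized())  // "https://example.com/"

// A scheme must begin with an ASCII letter, so this is invalid.
let invalidRejected = (try? url.setScheme("1nvalid")) == nil
print(invalidRejected)  // true

// A URL with a special scheme cannot be adjusted to a non-special scheme.
let nonSpecialRejected = (try? url.setScheme("foo")) == nil
print(nonSpecialRejected)  // true
```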
setUsername(_:)
@inlinable
public mutating func setUsername<S>(_ newUsername: S?) throws where S: StringProtocol
Replaces this URL's username with the given string.
Any code-points which are not valid for use in the URL's user-info section will be percent-encoded. Setting this component may fail if the URL does not allow credentials.
setPassword(_:)
@inlinable
public mutating func setPassword<S>(_ newPassword: S?) throws where S: StringProtocol
Replaces this URL's password with the given string.
Any code-points which are not valid for use in the URL's user-info section will be percent-encoded. Setting this component may fail if the URL does not allow credentials.
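A minimal sketch of the credential setters, assuming the swift-url package is available. The credentials are placeholders, and the file URL illustrates a URL which does not allow credentials.

```swift
import WebURL  // Assumes the swift-url package is a dependency of this target.

var url = WebURL("https://example.com/")!

// Code-points which are not valid in the user-info section are percent-encoded.
try url.setUsername("jane doe")
try url.setPassword("p@ss")
print(url.username as Any)  // Optional("jane%20doe")
print(url.password as Any)  // Optional("p%40ss")
print(url.serialized())     // "https://jane%20doe:p%40ss@example.com/"

// File URLs do not allow credentials, so the setter fails.
var fileURL = WebURL("file:///tmp/report.txt")!
let credentialsRejected = (try? fileURL.setUsername("jane")) == nil
print(credentialsRejected)  // true
```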
setHostname(_:)
@inlinable
public mutating func setHostname<S>(_ newHostname: S?) throws
where S: StringProtocol, S.UTF8View: BidirectionalCollection
Replaces this URL's hostname with the given string.
When setting this component, the new contents will be parsed and normalized (e.g. domains will be percent-decoded and lowercased, and IP addresses will be rewritten in their canonical form). Unlike setting other components, not all code-points which are invalid for use in hostnames will be percent-encoded. If the new content contains a forbidden host code-point, the operation will fail.
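The normalization and failure behaviour described above might look like this in practice, assuming the swift-url package is available. The hostnames are arbitrary examples.

```swift
import WebURL  // Assumes the swift-url package is a dependency of this target.

var url = WebURL("https://example.com/")!

// Domains are lowercased.
try url.setHostname("EXAMPLE.org")
print(url.hostname as Any)  // Optional("example.org")

// IPv4 addresses are rewritten in canonical form.
try url.setHostname("0x7F.1")
print(url.hostname as Any)  // Optional("127.0.0.1")

// A space is a forbidden host code-point, so it is rejected rather than percent-encoded.
let forbiddenRejected = (try? url.setHostname("exa mple.com")) == nil
print(forbiddenRejected)  // true
```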
setPort(_:)
public mutating func setPort(_ newPort: Int?) throws
Replaces this URL's port.
Setting this component may fail if the new value is out of range, or if the URL does not support port numbers. If the URL has a "special" scheme, setting the port to its known default value will remove the port.
setPath(_:)
@inlinable
public mutating func setPath<S>(_ newPath: S) throws where S: StringProtocol, S.UTF8View: BidirectionalCollection
Replaces this URL's path with the given string.
When setting this component, the given path string will be lexically simplified, and any code-points in the path's components that are not valid for use will be percent-encoded. Setting this component will fail if the URL's path is opaque (see WebURL.hasOpaquePath).
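As a quick sketch of the simplification and percent-encoding applied by this setter, assuming the swift-url package is available and using an arbitrary example path:

```swift
import WebURL  // Assumes the swift-url package is a dependency of this target.

var url = WebURL("https://example.com/old")!

// The ".." component is resolved and the space is percent-encoded.
try url.setPath("/a/b/../c d")
print(url.path)          // "/a/c%20d"
print(url.serialized())  // "https://example.com/a/c%20d"
```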
setQuery(_:)
@inlinable
public mutating func setQuery<S>(_ newQuery: S?) throws where S: StringProtocol
Replaces this URL's query with the given string.
When setting this property, any code-points which are not valid for use in the URL's query will be percent-encoded. Note that the set of code-points which are valid depends on the URL's scheme.
setFragment(_:)
@inlinable
public mutating func setFragment<S>(_ newFragment: S?) throws where S: StringProtocol
Replaces this URL's fragment with the given string.
When setting this property, any code-points which are not valid for use in the URL's fragment will be percent-encoded.
Operators
==
public static func == (lhs: Self, rhs: Self) -> Bool
<
public static func < (lhs: Self, rhs: Self) -> Bool