@@ -179,11 +179,11 @@ This may save a few CPU cycles for every candidate domain lookup.
179
179
180
180
Example candidate domain: ` foo.bar.baz `
181
181
182
- | Step | Domain | Search in PSL? |
183
- | :----:| :------:| :------:|
184
- | 1 | ` foo.bar.baz ` | yes |
185
- | 2 | ` bar.baz ` | yes |
186
- | 3 | ` baz ` | no |
182
+ | Step | Domain | Search in PSL? |
183
+ | :----:| :-------------:| :--------------:|
184
+ | 1 | ` foo.bar.baz ` | yes |
185
+ | 2 | ` bar.baz ` | yes |
186
+ | 3 | ` baz ` | no |
187
187
188
188
It is unclear how much of a performance benefit such an optimization would give
189
189
in practice.
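The table above can be sketched in code. This is a minimal illustration only, assuming the optimization amounts to skipping the PSL search for the final single-label candidate, as the table suggests. The names `Candidate` and `listCandidates` are hypothetical and not part of the proposed API.

```typescript
// Hypothetical helper, not part of the proposal's API.
// Lists a hostname's candidate domains from longest to shortest and flags
// which of them would be searched for in the PSL.
interface Candidate {
  domain: string;
  searchInPsl: boolean;
}

function listCandidates(hostname: string): Candidate[] {
  const labels = hostname.split(".");
  const candidates: Candidate[] = [];
  for (let i = 0; i < labels.length; i++) {
    const remainingLabels = labels.length - i;
    candidates.push({
      domain: labels.slice(i).join("."),
      // Assumption: the last, single-label candidate is never searched.
      searchInPsl: remainingLabels > 1,
    });
  }
  return candidates;
}
```

For `foo.bar.baz` this yields three candidates, of which only the first two are marked for a PSL search.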
@@ -378,21 +378,21 @@ namespace publicSuffix {
378
378
// METHODS
379
379
380
380
// Determines if the given hostname is itself a known eTLD (i.e. in the PSL).
381
- export function isKnownPublicSuffix (
381
+ export function isKnownSuffix (
382
382
hostname : string ,
383
383
)
384
384
: boolean;
385
385
386
386
// Gets the known eTLD, if any, of a given hostname.
387
- export function getKnownPublicSuffix (
387
+ export function getKnownSuffix (
388
388
hostname : string ,
389
389
)
390
390
: string | null;
391
391
392
392
// Gets the registrable domain of a given hostname.
393
- export function getRegistrableDomain (
393
+ export function getDomain (
394
394
hostname : string ,
395
- options ? : RegistrableDomainOptions ,
395
+ options ? : DomainOptions ,
396
396
)
397
397
: string | null;
398
398
@@ -403,17 +403,17 @@ namespace publicSuffix {
403
403
// INTERFACES
404
404
405
405
// Options that may be passed to the API method to control its behaviour.
406
- interface RegistrableDomainOptions {
407
- // If true, the resulting registrable domain should be encoded as Unicode.
406
+ interface DomainOptions {
407
+ // If true, the returned domain should be encoded as Unicode.
408
408
// Default = false (Punycode)
409
409
unicode? : boolean ,
410
- // If true, an IP address is a registrable domain .
410
+ // If true, the returned domain may be an IP address .
411
411
// Default = false
412
412
allowIP? : boolean ,
413
- // If true, a known eTLD is a registrable domain .
413
+ // If true, the returned domain may be a known eTLD .
414
414
// Default = false
415
415
allowPlainSuffix? : boolean ,
416
- // If true, a hostname that lacks a known eTLD is a registrable domain .
416
+ // If true, the returned domain may lack a known eTLD.
417
417
// Default = false
418
418
allowUnknownSuffix? : boolean ,
419
419
}
@@ -489,13 +489,13 @@ whose effects are demonstrated using the following examples.
489
489
490
490
#### 2. API Methods
491
491
492
- ##### 2.1 Public Suffix
492
+ ##### 2.1 Known Suffix
493
493
494
- Method ` getKnownPublicSuffix ()` returns the input hostname's known eTLD (i.e. in the PSL)
494
+ Method ` getKnownSuffix ()` returns the input hostname's known eTLD (i.e. in the PSL)
495
495
if it has one, otherwise ` null ` .
496
496
497
- Method ` isKnownPublicSuffix ()` returns ` true ` if and only if the input hostname is itself
498
- a known eTLD. In other words, this method returns ` true ` if calling ` getKnownPublicSuffix ()`
497
+ Method ` isKnownSuffix ()` returns ` true ` if and only if the input hostname is itself
498
+ a known eTLD. In other words, this method returns ` true ` if calling ` getKnownSuffix ()`
499
499
with the input hostname returns the input hostname itself.
500
500
501
501
These methods are included in the API because the PSL algorithm returns the longest eTLD,
@@ -506,17 +506,17 @@ whose public suffix is 'io'.
506
506
507
507
###### Examples
508
508
509
- | Input hostname | Public Suffix |
509
+ | Input hostname | Known Suffix |
510
510
| ----------------| --------------:|
511
511
| github.io | github.io |
512
512
| foo.github.io | github.io |
513
513
| facebook.co.uk | co.uk |
514
514
| 192.168.2.1 | null |
515
515
| green.banana | null |
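The relationship between the two methods can be illustrated with a toy sketch. It assumes a tiny hard-coded rule set (`TOY_PSL`) in place of the real PSL and ignores wildcard and exception rules, so it is not the full PSL algorithm. The function names carry a `Sketch` suffix to make clear they are hypothetical stand-ins for the proposed methods.

```typescript
// Toy stand-in for the PSL: a handful of plain rules only.
// Assumption: no wildcard or exception rules, unlike the real list.
const TOY_PSL: ReadonlySet<string> = new Set([
  "io", "github.io", "uk", "co.uk", "com", "net",
]);

// Returns the longest suffix of the hostname found in the toy rule set.
function getKnownSuffixSketch(hostname: string): string | null {
  const labels = hostname.toLowerCase().split(".");
  // Candidates are tried longest first, so the longest matching eTLD wins.
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join(".");
    if (TOY_PSL.has(candidate)) {
      return candidate;
    }
  }
  return null;
}

// True if and only if the hostname is itself a known eTLD.
function isKnownSuffixSketch(hostname: string): boolean {
  return getKnownSuffixSketch(hostname) === hostname.toLowerCase();
}
```

With this rule set, the sketch reproduces the rows of the table above. The IP address yields `null` here only because none of its labels appear in the toy rules; a real implementation would presumably detect IP addresses explicitly.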
516
516
517
- ##### 2.2 Registrable Domain
517
+ ##### 2.2 Domain
518
518
519
- Method ` getRegistrableDomain ()` returns the input hostname's registrable domain,
519
+ Method ` getDomain ()` returns the input hostname's registrable domain,
520
520
as determined by running the PSL algorithm, otherwise ` null ` .
521
521
522
522
By default, this method returns ` null ` if the input hostname:
@@ -525,35 +525,42 @@ By default, this method returns `null` if the input hostname:
525
525
* is itself a known eTLD
526
526
* is an IP address - IPv4 or IPv6
527
527
528
- ##### 2.2.1 Options: Registrable Domain
528
+ ##### 2.2.1 Options: Domain
529
529
530
530
In order to support different use cases including those that need to determine
531
531
a hostname's "site", additional options are provided, allowing a more
532
- general-purpose interpretation of what constitutes a registrable domain
533
- that includes IP addresses and unknown eTLDs.
532
+ general-purpose interpretation of a domain to include not only registrable domains
533
+ but also IP addresses and domains with unknown (non-registrable) eTLDs.
534
534
535
535
Options ` allowIP ` , ` allowPlainSuffix ` and ` allowUnknownSuffix ` each target
536
536
a specific kind of input hostname lacking a registrable domain
537
537
in the strictest sense (i.e. having a known eTLD as stipulated by
538
538
the PSL algorithm), as follows:
539
539
540
- | Option | Kind of Input Hostname Targetted |
540
+ | Option | Kind of Input Hostname Targeted |
541
541
| --------------------| ---------------------------------:|
542
542
| allowIP | IP Address (IPv4 or IPv6) |
543
543
| allowPlainSuffix | is itself a known eTLD |
544
544
| allowUnknownSuffix | lacks a known eTLD |
545
545
546
546
The effect of each option when applied to an input hostname of the
547
- kind targetted by the option is to change the registrable domain
548
- from being ` null ` to being instead * the full input hostname itself* .
547
+ kind targeted by the option is to change the returned domain
548
+ from being ` null ` to being the following:
549
+
550
+ | Option | Returned Domain | Returned Domain Kind |
551
+ | --------------------| :------------------------------------------------:| :--------------------:|
552
+ | allowIP | input hostname | IP address |
553
+ | allowPlainSuffix | input hostname | eTLD |
554
+ | allowUnknownSuffix | last 2 labels, or input hostname if single label | eTLD+1 or eTLD |
549
555
550
556
###### Examples
551
557
552
- | Input hostname | Option = true | Registrable domain |
558
+ | Input hostname | Option = true | Returned domain |
553
559
| -------------------| --------------------| -------------------:|
554
560
| 192.168.2.1 | allowIP | 192.168.2.1 |
555
561
| github.io | allowPlainSuffix | github.io |
556
- | apple.pear.banana | allowUnknownSuffix | apple.pear.banana |
562
+ | apple.pear.banana | allowUnknownSuffix | pear.banana |
563
+ | banana | allowUnknownSuffix | banana |
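The `allowUnknownSuffix` rule ("last 2 labels, or input hostname if single label") can be sketched as follows. The helper name `unknownSuffixFallback` is hypothetical, and the sketch assumes the caller has already established that the hostname is not an IP address and has no known eTLD.

```typescript
// Hypothetical helper, not part of the proposal's API.
// Assumption: the hostname is already known to lack a known eTLD
// and is not an IP address.
function unknownSuffixFallback(hostname: string): string {
  const labels = hostname.split(".");
  // One or two labels: the whole hostname is returned unchanged.
  if (labels.length <= 2) {
    return hostname;
  }
  // Otherwise keep only the last two labels (the "eTLD+1" equivalent).
  return labels.slice(-2).join(".");
}
```

Applied to the last two rows of the table above, `apple.pear.banana` gives `pear.banana` and `banana` is returned as is.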
557
564
558
565
##### 2.2.2 Options: Justification
559
566
@@ -562,7 +569,7 @@ not only domains on the internet having known eTLDs, but also
562
569
intranet hostnames having non-public (i.e. unknown) suffixes, or no suffix.
563
570
564
571
Reviewers of this proposal note that if it were the case that non-domains
565
- were included by default, ` getRegistrableDomain ()` would effectively
572
+ were included by default, ` getDomain ()` would effectively
566
573
return a string for almost every input.
567
574
568
575
As a result of the inclusion of unknown suffixes, the API implementation must
@@ -577,7 +584,7 @@ which may be an IP address or a domain name.
577
584
An example of such a use case is Firefox's [ Search vs Navigate] ( #4-search-vs-navigate ) ,
578
585
which involves determining if an entry in the URL bar is a navigable site,
579
586
or a search term. If this functionality was based purely on the return value
580
- of ` getRegistrableDomain ()` , i.e. navigate if nonnull or search if null,
587
+ of ` getDomain ()` , i.e. navigate if non-null or search if null,
581
588
then IP addresses would incorrectly cause a search. By using the ` allowIP ` option,
582
589
the return value for an input IP address would be the IP address itself instead of null,
583
590
thereby causing the desired result of navigating instead of searching.
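The following sketch illustrates the navigate-or-search decision described above. It is an assumption-laden illustration, not Firefox's actual logic: `decideUrlBarAction`, `GetDomainFn` and `DomainOptionsSketch` are hypothetical names, and the `getDomain` implementation is passed in as a parameter so that the example stays self-contained.

```typescript
// Mirrors three of the options described in this proposal (hypothetical type name).
interface DomainOptionsSketch {
  allowIP?: boolean;
  allowPlainSuffix?: boolean;
  allowUnknownSuffix?: boolean;
}

// Shape of a getDomain()-like function, injected for illustration.
type GetDomainFn = (
  hostname: string,
  options?: DomainOptionsSketch,
) => string | null;

// Hypothetical helper: navigate if a domain is returned, otherwise search.
function decideUrlBarAction(
  entry: string,
  getDomain: GetDomainFn,
): "navigate" | "search" {
  try {
    // allowIP makes an IP address come back as itself rather than null.
    const domain = getDomain(entry, { allowIP: true });
    return domain !== null ? "navigate" : "search";
  } catch {
    // Assumption: invalid hostnames (e.g. containing whitespace) throw,
    // and such entries are treated as search terms.
    return "search";
  }
}
```

Without `allowIP`, an entry such as `192.168.2.1` would yield `null` and be sent to search, which is the incorrect behaviour the option is meant to avoid.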
@@ -586,31 +593,6 @@ Option `allowPlainSuffix` only exists because there are domains that do not have
586
593
a registrable domain, due to themselves being PSL eTLDs, but can still be
587
594
navigated to, such as github.io and blogspot.com.
588
595
589
- ##### 2.2.3 Options: Discussion
590
-
591
- The effect of the options is that ` getRegistrableDomain() ` may return values
592
- that are not registrable domains in the strictest sense, e.g. they may
593
- be IP addresses.
594
-
595
- The author of this proposal is of the view that:
596
-
597
- 1 . Any method named ` getXYZ() ` should return a value of type ` XYZ ` . Therefore
598
- ` getRegistrableDomain() ` may not be the most suitable name, since it does
599
- not always return true registrable domains. Reviewers of this proposal
600
- feel this is not a significant enough issue to warrant alternative naming.
601
-
602
- 2 . This API should provide a way not just to get a hostname's
603
- registrable-domain-like value, but also to know what kind of value that is,
604
- be it an IP address, a domain name, or an intranet hostname lacking a known eTLD.
605
- Reviewers of this proposal are of the view that no compelling use case has been
606
- identified to support the need for such additional functionality. However,
607
- reviewers have conceded that IP addresses have to be special-cased, because for
608
- most domain inputs, one could split at dots to try and get a different domain level,
609
- but that logic does not make sense for IP addresses. By not providing a way of
610
- knowing whether the return value of ` getRegistrableDomain() ` is an IP address
611
- or a domain name, it is more difficult for users of this API to implement
612
- the special-casing that the reviewers have identified.
613
-
614
596
#### 3. IDN
615
597
616
598
All API methods should accept hostnames passed as input parameters using either
@@ -625,15 +607,15 @@ using Unicode encoding.
625
607
626
608
` domain ` = foo.bar.example.مليسيا
627
609
628
- | Option | Registrable Domain |
610
+ | Option | Returned Domain |
629
611
| ----------------------------| -----------------------:|
630
612
| unicode == false (default) | example.xn--mgbx4cd0ab |
631
613
| unicode == true | example.مليسيا |
632
614
633
615
#### 4. Invalid hostname
634
616
635
- The promises returned by this API's methods should reject with an error if a hostname
636
- passed as an input parameter meets any of the following criteria:
617
+ This API's methods should throw an error if a hostname passed as an input parameter
618
+ meets any of the following criteria:
637
619
638
620
* Contains a character that is invalid in an Internationalized Domain Name (IDN) - e.g. symbols, whitespace
639
621
* Is an empty string
@@ -642,10 +624,10 @@ passed as an input parameter meets any of the following criteria:
642
624
643
625
#### 5. Summary of behaviours
644
626
645
- The following table sets out the eventual settled state of the promise returned by
646
- ` getRegistrableDomain() ` for different classes of input ` hostname ` parameter:
627
+ The following table sets out the value returned by ` getDomain() ` for different
628
+ classes of input ` hostname ` parameter:
647
629
648
- | Input hostname | Description | Registrable domain |
630
+ | Input hostname | Description | Returned domain |
649
631
| :-------------------| :-------------------------------------------------| -----------------------:|
650
632
| example.net | eTLD+1 | example.net |
651
633
| www.example.net | eTLD+2 | example.net |
@@ -656,7 +638,7 @@ The following table sets out the eventual settled state of the promise returned
656
638
| foobar | no matching eTLD in PSL, single-label | null |
657
639
| foobar | as above, with ` allowUnknownSuffix = true ` | foobar |
658
640
| my.net.foobar | no matching eTLD in PSL, multi-label | null |
659
- | my.net.foobar | as above, with ` allowUnknownSuffix = true ` | my.net.foobar |
641
+ | my.net.foobar | as above, with ` allowUnknownSuffix = true ` | net.foobar |
660
642
| foobar.net | has an eTLD in the ICANN section | foobar.net |
661
643
| foobar.github.io | has an eTLD in the Private section | foobar.github.io |
662
644
| 127.0.0.1 | IP address, IPv4 | null |
@@ -680,21 +662,21 @@ The following table sets out the eventual settled state of the promise returned
680
662
#### 6. Sync vs Async
681
663
682
664
Browser extension APIs are most commonly async, with API methods returning Promises.
683
- Earlier versions of this proposal set out an async API, with ` getRegistrableDomain ()`
684
- returning a ` Promise<String > ` . However, some use cases require getting lists of
665
+ Earlier versions of this proposal set out an async API, with ` getDomain ()`
666
+ returning a ` Promise<string > ` . However, some use cases require getting lists of
685
667
registrable domains all in one go. In theory, this could be achieved by simply calling
686
- ` getRegistrableDomain ()` multiple times.
668
+ ` getDomain ()` multiple times.
687
669
688
670
The problem with this approach is that there is overhead associated with an extension
689
671
calling an async function on the parent browser. For example, obtaining the registrable domains
690
- of a list of 50 domains would involve making 50 async calls to the parent browser.
672
+ of a list of 50 hostnames would involve making 50 async calls to the parent browser.
691
673
A batching method would allow the same result to be obtained with a single async call.
692
674
693
- For this reason, batching method ` getRegistrableDomains ()` was added to this API.
675
+ For this reason, batching method ` getDomains ()` was added to this API.
694
676
The method accepted an array of hostnames as input and returned a promise resolving to
695
677
an array of registrable domains. A quick mockup of the two approaches was built using
696
678
a simplified implementation of this proposal's API in a modified Firefox, and the
697
- batching approach was about 2-3 times faster for 50 domains .
679
+ batching approach was about 2-3 times faster for 50 hostnames .
698
680
699
681
Unfortunately, while this offered a solution to the performance problem,
700
682
it added additional complexity to the API. To resolve this issue, the API
@@ -763,14 +745,17 @@ done by the host browser.
763
745
764
746
### Open Web API
765
747
766
- The purpose of this API is to eliminate the potential for inconsistency between
767
- the host browser and its hosted extensions. The simplest way of achieving this
768
- is for extensions to access this functionality via the host browser itself rather
769
- than via some external source, such as an Open Web API.
748
+ Implementing this proposal as an open web API is not realistic at this time because:
770
749
771
- It is then a determination for the host browser itself as to whether
772
- the functionality (used by both the host browser and its extensions)
773
- should ultimately be obtained by means of an Open Web API.
750
+ * Compared to web extension APIs, there is a higher bar for introducing web APIs,
751
+ and in the past there has not been sufficient interest in moving forward a proposal
752
+ like this one. Therefore the preferred approach is to start with extensions,
753
+ and it will always be possible to propose a web API later if this work proves
754
+ useful and there is appetite.
755
+
756
+ * The PSL is not appropriate for use in all circumstances. Extensions have a
757
+ very compelling set of use cases that match browser use cases, but there
758
+ is no universal agreement that this is the case more generally.
774
759
775
760
## Implementation Notes
776
761
@@ -818,3 +803,22 @@ is released, however this may not always be the case.
818
803
It may be useful to implement a notification mechanism so that extensions can take
819
804
appropriate action when the host browser's PSL dataset changes, to avoid having to
820
805
poll the ` getVersion() ` function provided by this API.
806
+
807
+ ### 3. Get Domain and Kind
808
+
809
+ While API method ` getDomain() ` by default returns registrable domains,
810
+ with additional options this method may return other types of domain:
811
+ IP addresses, intranet hostnames lacking known suffixes, and public suffixes themselves.
812
+ There is currently no straightfoward way for the method caller to determine
813
+ which of these kinds of value was returned from an invocation such as:
814
+ ` getDomain(hostname, { allowIP, allowUnknownSuffix, allowPlainSuffix }) ` .
815
+
816
+ It may be beneficial to provide an additional API method that would
817
+ return not only the domain value as returned by ` getDomain() ` ,
818
+ but also a designation of the kind of value returned:
819
+ ` RegistrableDomain ` , ` UnknownDomain ` , ` KnownSuffix ` , ` IPAddress ` .
820
+
821
+ An example use case would be if extension developers wanted to prepend
822
+ additional labels to the domain returned by ` getDomain() ` . This would
823
+ not make sense for returned IP addresses, so developers would need a
824
+ way of separating returned IP addresses from returned domain names.
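A possible shape for such a result is sketched below. The four kind names are taken from the text above; everything else (`DomainKind`, `DomainInfo`, `prependLabel`) is a hypothetical illustration rather than a proposed signature.

```typescript
// The four kinds named in this section.
type DomainKind =
  | "RegistrableDomain"
  | "UnknownDomain"
  | "KnownSuffix"
  | "IPAddress";

// Hypothetical return shape: the domain value plus a designation of its kind.
interface DomainInfo {
  domain: string;
  kind: DomainKind;
}

// Hypothetical helper for the use case above: prepend a label to a returned
// domain, but refuse to do so for IP addresses, where it makes no sense.
function prependLabel(info: DomainInfo, label: string): string | null {
  if (info.kind === "IPAddress") {
    return null;
  }
  return `${label}.${info.domain}`;
}
```

With the kind available, the caller can separate IP addresses from domain names without having to parse the returned string.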