Talk:LMNL syntax

Auto-generating

I'm undecided on the idea of auto-generated namespaces for prefixes that you can just use in the document. Users are inherently lazy, so if they can get away without declaring a namespace then they probably will; plus, they're likely to use one- or two-letter prefixes 'cos they're easier to type. Then they'll find, when they start merging or whatever with other people's documents, that their names weren't actually unique after all. In addition, it means that users get no warning when they accidentally misspell a prefix. Why do we need them? Jeni 03:38, 10 September 2006 (EDT)

Heh. I think the second problem is worse than the first, which is easily fixed by providing the namespace declaration you should have had all along. Yup, users are lazy: I think we should aim for the solution that strikes the best balance between consistency and intelligibility (something XML namespaces lack? :-) and convenience for the users. Assuming namespaces always cause problems for lazy users (which I think they will), having a rule that says "if you don't declare a namespace the system declares it for you (and this is how)" is actually pretty consistent and intelligible ... it even provides a better explanation for why name clashes happen when they do. (Lazy users are also frequently obtuse, and don't see why they can't "own" unnamespaced names and use them freely even while other users want to do the same thing.)

I still don't get it. What are we trying to achieve by auto-declaring namespaces? Why is it important for users to be able to use arbitrary prefixes on their range names but not have to declare a namespace for them? Under what circumstances would a user use prefixes but not declare a namespace? I can see how it would work, and I can see how it can be defined consistently, I'm just not sure at the moment what problem this is supposed to solve. — Jeni 16:26, 10 September 2006 (EDT)

Heh (Wendell). Well, John probably has something to offer here. From my POV what we're trying to alleviate is the whole conceptual mishmash of having things in a namespace sometimes but not other times, but that "sometimes" having nothing to do with whether a name is prefixed or not. We're all so used to namespaces that we've adapted to this funkiness long ago, but as a teacher of XML and XSLT newbies I'm used to blank stares on this topic, plus frustration (when they start to get it) with what is inevitably perceived as bad design, sometimes for the wrong reasons. One of the virtues of the current proposal is that it does away altogether with the mystifying "no-namespace namespace", that is, the namespace that has no identifier and is, or is not, the same namespace as the default namespace even though in some way it's not really there.

I think this will make for clarity of explanation and better consistency both for users and for implementors. In practice, the only difference is that you can use prefixes with impunity, not always having declared them. As for what's under the hood, it's just what it would be if XML said that instead of the weird "no-namespace namespace", you'd get a W3C namespace. XSD would be much clearer, for example, if this was what was up. This might lead people to declare their namespaces more often, I think, and not just fake it so much, since the risks of going without declaring a namespace would be more obvious up front -- which might be a good thing. --Wendell 11:46, 11 September 2006 (EDT)

OK, I can see the rationale for having a default default namespace into which unprefixed names get put even if no default namespace has been declared. I still don't understand why namespaces should be auto-generated for prefixes that are used but not declared. — Jeni 09:06, 12 September 2006 (EDT)

As for providing misspelling warnings -- isn't that something an editing application should do, not a parser? Not a bad point though. I can see parsers wanting to help us by providing warnings about orphaned names. --Wendell 14:56, 10 September 2006 (EDT) (Also, only misspelled empty ranges or ranges that are misspelled on both ends will get by the parser -- same as XML localparts.)
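The kind of warning discussed here, whether it lives in an editor or in a parser, could be as simple as listing prefixes that are used but never declared. The following is only a sketch in Python: the function name `undeclared_prefixes` and the idea of passing in a flat list of names are assumptions made for illustration, not anything defined for LMNL.

```python
from typing import Iterable, List, Set


def undeclared_prefixes(names: Iterable[str], declared: Set[str]) -> List[str]:
    """Return prefixes that are used in `names` but absent from `declared`.

    Hypothetical helper: names are plain strings such as "tei:p" or "p".
    Prefixes are reported once each, in order of first use.
    """
    found: List[str] = []
    for qname in names:
        prefix, sep, _local = qname.partition(":")
        # Only prefixed names can have an undeclared prefix.
        if sep and prefix not in declared and prefix not in found:
            found.append(prefix)
    return found
```

A tool could report the returned prefixes as warnings rather than errors, which leaves auto-generated namespaces usable while still catching a typo such as `tie:` for `tei:`.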

If we were to have auto-generated namespaces, I'd rather a mechanism where [!ns "IRI"] defined the prefix that was used. So doing:

[!ns "http://www.example.com/"]

means that [foo] is in the namespace http://www.example.com/ and f:foo is in the namespace http://www.example.com/f. Jeni 03:38, 10 September 2006 (EDT)

Hm: that's an interesting idea. It's not incompatible with ranges that have no prefix and no namespace declaration falling back to a reserved namespace (so there's still no such thing as a "range in no namespace"). But please clarify how "[!ns "IRI"] define[s] the prefix that was used"?

Sorry, I used "prefix" in a non-technical way in a context where it's used in a technical way! I meant that

[!ns "IRI"]

would provide (a) the namespace used for non-prefixed names and (b) the initial part of the namespaces used for prefixed names when the prefix isn't otherwise declared. — Jeni 16:26, 10 September 2006 (EDT) (Ah, okay --Wendell.)
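For concreteness, rules (a) and (b) can be sketched in Python as below. This is an illustration under stated assumptions, not a proposal for the actual algorithm: the function name `resolve_namespace`, the `declared` mapping, and the `RESERVED_NS` IRI (standing in for the reserved fallback namespace mentioned above) are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Assumption: a placeholder IRI for the reserved namespace used when the
# document contains no [!ns "IRI"] declaration at all.
RESERVED_NS = "http://example.org/lmnl-reserved/"


def resolve_namespace(
    qname: str,
    default_ns: Optional[str] = None,
    declared: Optional[Dict[str, str]] = None,
) -> Tuple[str, str]:
    """Map a possibly prefixed name to (namespace, local name).

    default_ns is the IRI given by [!ns "IRI"], if any.
    declared maps explicitly declared prefixes to their namespaces.
    """
    declared = declared or {}
    base = default_ns if default_ns is not None else RESERVED_NS
    prefix, sep, local = qname.partition(":")
    if not sep:
        # (a) Unprefixed names take the default namespace.
        return base, qname
    if prefix in declared:
        # An explicit declaration always wins.
        return declared[prefix], local
    # (b) Undeclared prefix: the default namespace is the initial part.
    return base + prefix, local
```

With `default_ns="http://www.example.com/"`, the name `foo` resolves to `http://www.example.com/` and `f:foo` to `http://www.example.com/f`, matching the example given above.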

Declaring

Just an idea: I wonder whether we should follow the same pattern for namespaces as we have for entities, namely have two kinds of declarations: a [!namespace] declaration for declaring a single namespace at a time, and a [!namespaces] declaration for pointing to a file that contains a bunch of namespace declarations. I kinda like the parallel with [!entity] and [!entities], plus it would make it easy for people to use standard prefixes without having to declare them in their documents each time.

On the other hand, external declarations are always a bit dodgy: what happens when the imported file is inaccessible? With entities, I think you get an error if you use an entity that hasn't been declared, so things don't fail if the file isn't there, just if you use an entity from the file that isn't there. With namespaces, if we have this idea of auto-generated namespaces then you wouldn't get an error at all, just have your ranges end up with different namespaces from the ones you'd intended. This is nasty. — Jeni 09:17, 11 September 2006 (EDT)

Heh (again). Well, see another proposal wherein I offer that even entity declarations should have fallbacks (to atoms with the same names). --Wendell 11:46, 11 September 2006 (EDT)

I think we need to consider this carefully. There will always be times when an imported file is inaccessible; the question is what happens then? Often, I find it a real pain to see well-formedness errors in such cases, even while I recognize it's best for processors to play it safe in the general case. (In my particular case I'll often be validating etc. later, and right now I'd just like to use XSLT for some minor fixup, thank you.) I'd rather there be a warning and some rational fallback behavior in such cases. If XML had this (and I understand why it doesn't), the "standalone" declaration in the XML declaration might actually be useful, or maybe not even needed.

Maybe we need to distinguish not between two kinds of errors but two kinds of processors: "Draconian" and "Permissive". (Naturally a single parser could provide either mode with a switch.) Draconian processors would refuse to parse files where syntactic constructs implied external declarations that were missing (and perhaps offer messages saying whether they were called in but missing, or not even called in). Permissive processors would have clean, rational fallback behavior specified (bind namespaces and render entities as atoms). This would provide for processing of documents even when declarations were unavailable, so for example, the entities and namespaces used could be named by a transformation, or other useful work could be done.

Developers, system designers and users could use Permissive processors at their own risk, and Draconian processors when they needed the extra level of "lexical validation". I do like the idea of using a consistent pattern for both namespaces and entities. --Wendell
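The two modes could be sketched in Python roughly as follows, using entity lookup as the example. Everything named here (`Mode`, `MissingDeclarationError`, `resolve_entity`, and the `("text", ...)` / `("atom", ...)` return shape) is a hypothetical illustration of the idea, assuming the fallback for an undeclared entity is an atom with the same name.

```python
import warnings
from enum import Enum
from typing import Dict, Tuple


class Mode(Enum):
    DRACONIAN = "draconian"
    PERMISSIVE = "permissive"


class MissingDeclarationError(Exception):
    """Raised in Draconian mode when a needed declaration is unavailable."""


def resolve_entity(name: str, declared: Dict[str, str], mode: Mode) -> Tuple[str, str]:
    """Resolve an entity reference against the declarations that were loaded.

    Returns ("text", replacement) for a declared entity. For an undeclared
    one, Draconian mode refuses to continue, while Permissive mode warns
    and falls back to an atom with the same name.
    """
    if name in declared:
        return ("text", declared[name])
    if mode is Mode.DRACONIAN:
        raise MissingDeclarationError(f"entity {name!r} is not declared")
    warnings.warn(f"entity {name!r} is not declared; rendering it as an atom")
    return ("atom", name)
```

A single parser could expose `mode` as a switch, so the same document can be processed permissively during fixup and checked in Draconian mode later.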

IDs

Should IDs be qualified names rather than NCNames? All other names are QNames, so I think the answer is yes. — Jeni 08:04, 10 September 2006 (EDT)

Should [foo=bar] be a shorthand for [foo [lmnl:id}bar{]]? — Jeni 08:04, 10 September 2006 (EDT)

Either of these would be okay with me. --Wendell 14:56, 10 September 2006 (EDT)

Actually, that should have been [foo [lmnl:id [namespace}http://lmnl.org/namespace{]}bar{]] or something similar. — Jeni 16:26, 10 September 2006 (EDT)
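If the shorthand were adopted, the expansion would be a purely textual rewrite. The sketch below, in Python, covers only the simple form as first written, [foo=bar] becoming [foo [lmnl:id}bar{]], and leaves out the namespace annotation from the corrected version. The function name `expand_id_shorthand` and the regular expression are assumptions for illustration; a real parser would work on tokens rather than raw text.

```python
import re

# Matches a tag of the form [name=id] where neither part contains
# whitespace, brackets, braces, or "=".
_SHORTHAND = re.compile(r"\[([^\s\[\]{}=]+)=([^\s\[\]{}=]+)\]")


def expand_id_shorthand(text: str) -> str:
    """Rewrite every [name=id] in text as [name [lmnl:id}id{]]."""
    return _SHORTHAND.sub(r"[\1 [lmnl:id}\2{]]", text)
```

Tags without an `=` are left untouched, so the rewrite is safe to run over text that mixes both forms.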

I'm not happy with either of these ideas. IDs aren't part of the data model, and aren't even guaranteed to be unique in the document (though I think we should probably change the latter), so does it really make sense to talk about globally unique IDs? I'd say if you want global element labels (and I agree that those are a Good Thing), use lmnl:id, but don't mix them up with the syntactic IDs we use to keep straight which tag goes with what.

(I agree IDs should be unique in a document.) I just think users will find it confusing to have two different ways of setting IDs in the syntax, one of which is ignored in the data model. — Jeni 09:12, 12 September 2006 (EDT)