XML: opportunities and prospects
The future of the Web, and above all the assessment of the prospects of XML as the basis of new Web technologies, concerns a broad audience of people involved in developing and evolving information systems. In this issue we publish the first part of an article on the conceptual capabilities of XML.
Key points
• The XML language and the XML platform
• The "physics" and "logic" of XML documents
• XML metadata
• XML and databases
• The semantics of XML resources
• The prospects of XML
These questions were discussed in the second half of 2000 in the pages of "Director". However, some important points that deserve attention were not mentioned there.
Part 1. The XML platform and the standards that make it up
In recent years we have been witnessing a "velvet" revolution unfolding on the World Wide Web, brought about by the arrival of XML, the new standard hypertext markup language.
Michael Ruvimovich Kogalovsky is head of the Database Systems Laboratory at the Institute of Market Problems of the Russian Academy of Sciences and scientific secretary of the Moscow ACM SIGMOD chapter. He can be reached at: kogalov@cemi.rssi.ru
In our view, this really is a revolution, because substantially new Web technologies are taking shape at a rapid pace. The revolution can fairly be called "velvet" because it is being carried out with care, with a clearly stated and practically implemented intention to preserve the vast information resources and the useful Web-oriented applications that have been created so intensively in the Web environment over its short history.
Along with these internal changes, the Web is also strongly influenced by external processes, which it has itself largely stimulated. They are tied to a trend that has been gaining ground in information systems development in recent years: the integration, on the basis of various approaches, of database technologies, text (document) systems, Java technologies, Web technologies, and the technology of heterogeneous distributed CORBA object environments. The trend is driven not only by the desire to enrich the functionality of large systems, but also by the need to integrate, including at the semantic level, heterogeneous information resources created and maintained with different technologies.
The changes taking place on the Web affect a rather broad class of information systems. It is therefore no accident that the future of the Web, and above all the assessment of the prospects of XML [1-3] as the basis of new Web technologies, concerns a large number of specialists involved in developing and evolving information systems. The discussion of these questions in the pages of this publication [4] therefore seems quite natural. It appears, however, that some important points that deserve readers' attention were either not mentioned or not emphasized clearly enough.
Discussing the capabilities of XML and the prospects for its use is clearly important, since the language, together with the set of W3C standards that form its infrastructure, has already become a de facto standard. Its sphere of application is steadily widening. It covers not only Web technologies proper, but also a number of adjacent areas of information technology that treat the Web as a medium for remote access to resources of an entirely different kind and for information exchange between systems based on other technologies.
The XML language and the XML platform
In assessing the changes taking place on the Web, it would be a mistake to consider only the capabilities of the XML language itself. Alongside the XML standard, the W3C consortium, which shapes the technical policy for Web development and produces standardized specifications for this environment, is in effect simultaneously building a new platform based on XML. The functionality of this platform is defined by a whole set of interrelated standards, some of which the W3C has already adopted, while others are still under development.
With rare exceptions, developers regard the other standards of the XML platform as XML applications, mainly because their specifications use XML syntax. It also matters, however, that the new capabilities these standards define are introduced by giving specific meaning and functionality to certain syntactic components of XML. In other words, the specifications of XML and of some other platform standards contain a number of "open points" that provide a natural way of extending XML functionality in various directions. The specifications of some of these extensions are themselves defined by standards of the XML platform (Fig. 1).
These standards make it possible, in particular, to define the set of markup tags and attributes that may be used in an XML document and to associate default semantics with them (the XML namespaces standard, Namespaces in XML [5]); to enrich the capabilities that DTDs give the language for describing the structure of XML documents (the schema specification standard, XML Schema [6-8]); to define hyperlinks between documents and/or their fragments (the pointer language and linking language standards, XPointer [9] and XLink [10]); to describe the semantics of XML documents with varying degrees of formality (the Resource Description Framework standard, RDF [11-12]); to control the presentation of XML documents on the client side (the Cascading Style Sheets standard, CSS [13], and the Extensible Stylesheet Language, XSL [14]); and to describe transformations of XML documents (the XSLT language standard [15], a distinct part of the XSL standard).
In addition, the Document Object Model (DOM) standard [16] has been created for XML and HTML documents. It defines the functions of the application programming interface for processing them.
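As a rough illustration of what such a programming interface looks like in practice, here is a minimal sketch using Python's standard xml.dom.minidom module, which implements a subset of the DOM. The sample document, its element names, and the helper book_titles are invented for this example and do not come from the standard.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document; the element and attribute names are made up.
SAMPLE = '<library><book id="b1"><title>XML Basics</title></book></library>'


def book_titles(xml_text):
    """Return a mapping {book id: title text} using DOM calls only."""
    document = parseString(xml_text)
    titles = {}
    for book in document.getElementsByTagName("book"):
        title = book.getElementsByTagName("title")[0]
        # In the DOM, the text of an element is held in a child text node.
        titles[book.getAttribute("id")] = title.firstChild.data
    return titles


print(book_titles(SAMPLE))
```

The application navigates the document only through DOM interfaces (documents, elements, attributes, text nodes) and never touches the markup syntax itself.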
A standard query language for XML resources (XML-QL) is also being developed: requirements for the underlying data model and for the language itself have been formulated [17-18], and a number of candidates are available [19]. A digital signature standard for XML documents (XML-Signature [20]) is under study as well.
A special place in this set of standards belongs to XHTML 1.0 [21], recently adopted by the W3C. It offers one possible way to ensure continuity in the development of the Web environment, allowing the information resources accumulated under HTML technologies to be used on the XML platform. The standard supports, by XML means, the functionality of the current version of HTML (HTML 4.01) at three different levels of completeness.
The set of XML platform standards also includes many auxiliary standards. Here are some examples. The XML Information Set (Infoset) standard [22] gives an abstract description of the data that make up an XML document. The XML specification [1] relies on it. The XPath standard [23] defines the notion of a fragment of an XML document, which is used in XPointer and XSLT. The XML Inclusions (XInclude) standard [24] presents a model and syntax for describing the merging of XML documents. The XML Fragment Interchange standard [25] makes it possible to describe the context of fragments of an XML document and thereby to view and edit them outside the full text of the document. We should also mention the Canonical XML standard [26], which proposes a method for establishing the equivalence of two XML documents that differ in their syntactic representation. This capability is essential, in particular, for the use of digital signatures [20].
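The idea of canonical form can be sketched with Python's standard library (version 3.8 or later). This is only an illustration under stated assumptions: the function xml.etree.ElementTree.canonicalize implements the later C14N 2.0 revision, which is not necessarily the version cited in [26], and the sample documents and the helper same_canonical_form are invented for the example.

```python
import xml.etree.ElementTree as ET

# Two serializations of the same hypothetical document: attribute order,
# quote style, and empty-element syntax differ.
DOC_A = '<order id="7" status="new"><item/></order>'
DOC_B = "<order status='new' id='7'><item></item></order>"
# A document whose content really is different (another id value).
DOC_C = '<order id="8" status="new"><item/></order>'


def same_canonical_form(first, second):
    """Compare two XML texts by their canonical serializations."""
    return ET.canonicalize(first) == ET.canonicalize(second)


print(same_canonical_form(DOC_A, DOC_B))
print(same_canonical_form(DOC_A, DOC_C))
```

A signature computed over the canonical form is not invalidated by harmless differences in how the document happens to be written down.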
Even this incomplete list of the XML platform standards and their purposes shows that, in assessing the prospects for XML, it is wrong to consider only the functionality of the language itself. The whole set of standards making up the emerging XML platform must be taken into account.
How XML is related to the other standards of the platform
As noted above, the XML specification provides a number of "open points" that link XML to the other standards of the platform built on it and, where necessary, to standards outside the platform.
In general, the following approach is used.
The main "open point" of the language is that XML is a metalanguage, like SGML, the language from which it derives. Unlike the HTML specification, the XML specification does not fix the functional specialization of the elements of XML documents, of their attributes, or of the semantics of attribute values. The functionality of XML can be extended by giving concrete definitions to the functionality and syntax of the elements of XML documents.
The second "open point" in XML is the ability to use so-called namespaces: predefined named sets of names that are used as the names of element types and attributes in XML documents. A namespace definition also makes it possible to associate, explicitly or implicitly, sets of permitted values with attribute names.
It is assumed that each name belonging to a given namespace, as well as each attribute value, has some semantics, defined either by default or explicitly. The Namespaces in XML standard does not prescribe in any way how these semantics are defined. The definitions may rest on various other standards or on methods required by a particular application.
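The namespace mechanism described above can be sketched with Python's standard xml.etree.ElementTree module. The namespace URIs and element names below are hypothetical and were chosen only to show that two elements with the same local name stay distinct when they belong to different namespaces.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs and element names, chosen only for illustration.
DOC = (
    '<catalog xmlns:pub="http://example.org/publishing" '
    'xmlns:geo="http://example.org/geography">'
    "<pub:title>Atlas</pub:title>"
    "<geo:title>Moscow</geo:title>"
    "</catalog>"
)

root = ET.fromstring(DOC)
# ElementTree expands each prefixed name to "{namespace URI}local name".
expanded_names = [child.tag for child in root]
print(expanded_names)
```

What each of the two title elements means is decided by whoever defines the corresponding namespace, not by the parser.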
The XML platform standards that extend the functionality of the language are built on this principle. A namespace with a reserved name is introduced that defines the names of special element types and their attributes. The semantics of these elements and attributes, along with the syntactic conventions, are defined in the specifications of the additional standards. The names belonging to such a namespace are considered standard.
An example of this XML extension mechanism is the XLink standard [10], which makes it possible to use in XML documents special linking elements that provide various kinds of hyperlinks between XML documents. XML itself does not support the notion of a hyperlink.
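A minimal sketch of how an application might recognize such linking elements follows. The namespace name http://www.w3.org/1999/xlink and the type and href attributes come from XLink; the sample document, its element names, and the helper simple_link_targets are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"  # namespace name reserved by XLink

# Hypothetical document: only the xlink:* attributes come from the standard.
DOC = (
    f'<report xmlns:xlink="{XLINK_NS}">'
    '<source xlink:type="simple" xlink:href="data/figures.xml">figures</source>'
    "<note>No link here</note>"
    "</report>"
)


def simple_link_targets(xml_text):
    """Collect the href of every element marked as an XLink simple link."""
    root = ET.fromstring(xml_text)
    targets = []
    for element in root.iter():
        if element.get(f"{{{XLINK_NS}}}type") == "simple":
            targets.append(element.get(f"{{{XLINK_NS}}}href"))
    return targets


print(simple_link_targets(DOC))
```

The XML parser treats the xlink attributes like any others. Only an application that knows the XLink namespace interprets them as hyperlinks.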
What data XML can represent
The claim, made in many publications, that XML can describe data of the most varied nature often needs qualification. It should be understood as follows.
XML makes it possible to mark up text files, turning linear text into hypertext. Files of other kinds, often called binary files, are not objects of hypertext markup by XML means. However, the XML specification allows such information resources to be integrated into hypertext through references to the files that contain them, thereby producing the hypermedia information resources that make up the content of Web pages. Along with references to resources held in binary files, an XML document can directly contain a textual description of them or refer to other XML documents that contain it. These documents, in turn, can include binary files integrated into them, and so on. XML provides no other means of describing and representing data such as images, audio, or video.
XML and HTML: which language is more complex
The discussion in [4] raised the question of comparing the complexity of the HTML and XML standards. It was claimed that XML is more primitive than HTML. This point deserves clarification.
If we compare the size of the descriptions of these languages (the documentation of the XML and HTML standards), the XML specification takes up several times less space. Understanding and remembering it takes less time and effort. In this sense XML is simpler than HTML.
However, XML is by no means more primitive than HTML. In comparing the functionality of these languages one must first of all take into account that, although they share common roots in SGML [27], the well-known international standard for a generalized document markup language, they belong to different levels of abstraction.
XML is a metalanguage and, as is well known, a subset of the SGML standard. Like SGML, it is intended for generating various concrete markup languages by defining concrete sets of tags (in XML, element types of the document). The languages defined with XML are thus its concrete instantiations.
HTML, by contrast, is a concrete (non-extensible) language. The functionality of its markup tags is fixed, unlike in XML. HTML was created as the simplest instantiation of SGML, which is a powerful metalanguage. HTML can also be defined by XML means (recall the XHTML standard), and so it is also one of the instantiations of XML.
Because of its abstractness, XML is open to extension (as its name reflects), and for that reason it is considerably more conservative than HTML, where adding functionality requires the adoption of a new version of the standard. New versions of XML browsers will appear much less often than new versions of browsers for HTML, which is still evolving.
Multilevel representation of XML documents
XML provides the multilevel data representation that is an "innate" feature of database systems. Recall the notions, axiomatic for database specialists, of "physical" and "logical" data representation, or of the external, conceptual, and internal schemas in the three-schema ANSI/X3/SPARC architecture.
More specifically, XML supports first of all a "physical" level of representation of an XML document: a description of its storage structure. Its building blocks are the so-called entities of XML, that is, files and fragments of files of various kinds (files with XML specifications, for example a DTD file for some document type, binary files with graphics, audio, or video data, strings repeated within the XML document, and so on). The storage structure of an XML document is a hierarchy of such entities. It is important to note that XML does not provide for a separate description of the physical representation of an XML document. This representation is self-describing: it is embedded in the document.
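The simplest kind of entity, a string repeated within the document, can be sketched as follows with Python's standard xml.etree.ElementTree module. The document, the entity name org, and its replacement text are invented for the example. The entity declaration sits in the document's own internal DTD subset, which is what makes the physical structure self-describing.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset declares one internal entity.
DOC = """<!DOCTYPE memo [
  <!ENTITY org "Example Laboratory">
]>
<memo><from>&org;</from><to>&org;, archive group</to></memo>"""

root = ET.fromstring(DOC)
# The parser substitutes the entity's replacement text at each reference.
sender = root.find("from").text
recipient = root.find("to").text
print(sender)
print(recipient)
```

After parsing, the application sees only the logical content. The division of the stored text into entities is no longer visible.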
Further, alongside the "entity" (physical) representation, a "logical" representation of XML documents is supported. The logical structure of an XML document is a hierarchy of the structural elements that make up its content, delimited by markup tags. While the physical representation of XML documents is, as already noted, self-describing, their logical representation can be given a separate explicit description. The document type definition (DTD) serves this purpose.
Fig. 2. An example of an XML document. The highlighted fragments of XML syntax belong to the description of the physical representation of the document
Thus, although XML does support a two-level representation of documents, in contrast to database technologies in their modern form, the description of the physical (stored) representation in XML is not "detached" from the document but embedded in it (Fig. 2). This significantly limits the possibilities for supporting data independence in the XML environment.
The top level of the information architecture for representing XML documents is the description of their semantics. The capabilities provided for this will be considered in the second part of the article.
