4D v13.4

Overview of XML DOM Commands

4D includes a set of commands used for parsing objects containing XML (eXtensible Markup Language) data.

The XML language is a data exchange standard. It is based on the use of tags and enables precise description of the data exchanged as well as their structure. XML files are Text format files; their content is parsed by the applications importing the data. Many applications now support this format.

For more information about XML, refer, for instance, to the http://xml.org and http://www.w3.org sites.

For XML support, 4D uses a library named Xerces.dll, developed by the Apache Software Foundation. 4D supports XML version 1.0.

Note: 4D allows direct importing and exporting of data in XML format using the import/export editor.

The commands of this theme are prefixed DOM. 4D in fact offers two separate sets of XML commands, prefixed DOM and SAX. DOM (Document Object Model) and SAX (Simple API for XML) are two different parsing modes for XML documents.

  • The DOM mode parses an XML source and builds its structure (its “tree”) in memory. Because of this, access to each element of the source is extremely fast. However, since the entire tree structure is stored in memory, processing large XML documents may exceed the available memory and cause errors.
  • The SAX mode does not build a tree structure in memory. In this mode, “events” (such as the start and end of an element) are generated when parsing the source. This mode lets you parse XML documents of any size, regardless of the amount of memory available. The SAX commands are grouped together in the "XML SAX" theme. For more information, please refer to the Overview of XML SAX Commands section.
For more information on XML standards, consult the following sites: http://www.saxproject.org/?selected=event and http://www.w3schools.com/xml/.
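The difference between the two parsing modes can be sketched outside 4D. The following is a minimal illustration in Python, using only its standard library (xml.dom.minidom and xml.sax); it is not 4D code, and the function and class names are hypothetical. The DOM version builds the whole tree before reading it, while the SAX version only reacts to "start of element" events as the source is read.

```python
import xml.dom.minidom
import xml.sax

# Sample source (an assumption for this sketch).
XML_SOURCE = b"<RootElement><Elem1><Elem2>aaa</Elem2><Elem2>bbb</Elem2></Elem1></RootElement>"


def dom_element_names(source):
    """DOM mode: parse the entire source into an in-memory tree, then walk it."""
    document = xml.dom.minidom.parseString(source)
    try:
        # "*" matches every element, returned in document order.
        return [node.tagName for node in document.getElementsByTagName("*")]
    finally:
        document.unlink()  # release the tree once it is no longer needed


class StartElementCollector(xml.sax.ContentHandler):
    """SAX mode: no tree is built; the parser calls back on each event."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called once per opening tag while the source is being read.
        self.names.append(name)


def sax_element_names(source):
    handler = StartElementCollector()
    xml.sax.parseString(source, handler)
    return handler.names


print(dom_element_names(XML_SOURCE))
print(sax_element_names(XML_SOURCE))
```

Both functions list the same element names. Only the DOM version holds the complete structure in memory at once, which is why it gives direct access to any element but scales poorly with very large documents.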

Objects created, modified or parsed by the 4D DOM commands can be text, URLs, documents or BLOBs. The DOM commands used for opening XML objects in 4D are DOM Parse XML source and DOM Parse XML variable.

Many commands then let you read, parse and write the elements and attributes. Errors are retrieved using the XML GET ERROR command (common to both XML standards).

The DOM CLOSE XML command lets you close the source when you have finished.

Note about use of XML BLOB parameters: XML structures are based on Text type data, so Text type fields or variables are recommended for working with them. For historical reasons, the XML commands of 4D (for example DOM Parse XML variable) accept BLOB type parameters. In previous versions of 4D, the size of Text type variables was limited to 32 KB. Starting with version 11 of 4D, Text fields and variables can contain up to 2 GB of data. Since the previous limitation was removed, storing text in BLOBs is now strongly discouraged. The use of BLOBs is reserved for processing binary data. In conformity with XML specifications, beginning with 4D v12, binary data are automatically encoded in Base64, even when the BLOB contains text.
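To see why storing text in a BLOB is discouraged, it helps to look at what Base64 encoding does to readable text. The sketch below is a Python illustration using the standard base64 module, not 4D code; the function name is hypothetical and the sample value "aaa" is an assumption.

```python
import base64


def encode_binary_for_xml(data):
    """Return the Base64 text form of binary data, as it would appear in XML."""
    return base64.b64encode(data).decode("ascii")


# Text handled as binary data is no longer readable once encoded.
text_as_bytes = "aaa".encode("utf-8")
encoded = encode_binary_for_xml(text_as_bytes)
print(encoded)                    # YWFh
print(base64.b64decode(encoded))  # b'aaa'
```

The original text can be recovered by decoding, but the XML itself contains only the encoded form, so text kept in Text type fields or variables stays readable where text kept in BLOBs does not.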

Three XML DOM commands (DOM Create XML element, DOM Find XML element and DOM SET XML ELEMENT VALUE) accept XPath notation for accessing XML elements.
XPath notation comes from the XPath language, which is designed for navigating within XML structures. It lets you designate elements directly within an XML structure using a "pathname" type syntax, without necessarily having to indicate the complete pathname in order to reach them. For example, given the following structure:

   <RootElement>
      <Elem1>
         <Elem2>
            <Elem3 Font="Verdana" Size="10"> </Elem3>
         </Elem2>
      </Elem1>
   </RootElement>

XPath notation allows you to access the Elem3 element using the /RootElement/Elem1/Elem2/Elem3 syntax.
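The same path-based access can be tried outside 4D. The following is a minimal sketch in Python using the standard xml.etree.ElementTree module, which supports a limited subset of XPath; it is not 4D code. One difference to note: ElementTree paths are relative to the element being searched, so the leading /RootElement step corresponds to the parsed root itself and the remaining steps are written as ./Elem1/Elem2/Elem3.

```python
import xml.etree.ElementTree as ET

# The sample structure, with attribute values quoted as XML requires.
SOURCE = """
<RootElement>
   <Elem1>
      <Elem2>
         <Elem3 Font="Verdana" Size="10"> </Elem3>
      </Elem2>
   </Elem1>
</RootElement>
"""

root = ET.fromstring(SOURCE)  # root is the RootElement element

# Reach Elem3 in one step instead of walking each level by hand.
elem3 = root.find("./Elem1/Elem2/Elem3")

print(root.tag)      # RootElement
print(elem3.attrib)  # {'Font': 'Verdana', 'Size': '10'}
```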

4D also accepts indexed XPath elements using the Element[ElementNum] syntax. For example, given the following structure:

   <RootElement>
      <Elem1>
         <Elem2>aaa</Elem2>
         <Elem2>bbb</Elem2>
         <Elem2>ccc</Elem2>
      </Elem1>
   </RootElement>

XPath notation allows you to access the “ccc” value using the /RootElement/Elem1/Elem2[3] syntax.
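Indexed access can be sketched in the same way. This Python illustration relies on the standard xml.etree.ElementTree module (not 4D code; the helper name value_at is hypothetical). As in the notation above, positions are counted from 1, and the path is written relative to the root element.

```python
import xml.etree.ElementTree as ET

INDEXED_SOURCE = """
<RootElement>
   <Elem1>
      <Elem2>aaa</Elem2>
      <Elem2>bbb</Elem2>
      <Elem2>ccc</Elem2>
   </Elem1>
</RootElement>
"""


def value_at(source, path):
    """Return the text of the first element matching path, or None if none matches."""
    element = ET.fromstring(source).find(path)
    return None if element is None else element.text


# [3] selects the third Elem2 child of Elem1.
third_value = value_at(INDEXED_SOURCE, "./Elem1/Elem2[3]")

# Without an index, every Elem2 sibling is matched.
all_values = [e.text for e in ET.fromstring(INDEXED_SOURCE).findall("./Elem1/Elem2")]

print(third_value)  # ccc
print(all_values)   # ['aaa', 'bbb', 'ccc']
```

An index that is out of range simply matches nothing in this sketch; value_at returns None for ./Elem1/Elem2[4].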

For an illustration of XPath notation, please refer to the examples in the descriptions of the DOM Create XML element and DOM Find XML element commands.

The following character sets are supported by the XML DOM and XML SAX commands of 4D:

  • ASCII
  • UTF-8
  • UTF-16 (Big/Little Endian)
  • UCS4 (Big/Little Endian)
  • EBCDIC code pages IBM037, IBM1047 and IBM1140
  • ISO-8859-1 (or Latin1)
  • Windows-1252

The XML language uses a number of specific terms and acronyms. This non-exhaustive list details the main XML concepts used by the commands and functions of 4D.

Attribute: an XML sub-tag associated with an element. An attribute always contains a name and a value (see diagram below).

Child: In an XML structure, an element in a level directly below another.

DTD: Document Type Definition. The DTD records the set of specific rules and properties that the XML document must follow. These rules define, more particularly, the name and content of each tag as well as its context. This formalization of the elements can be used to check whether an XML document is in compliance (in which case, it is declared “valid”).
The DTD may be included in the XML document (internal DTD) or in a separate document (external DTD). Note that the DTD is not mandatory.

Element: an XML tag. An element always contains a name and a value. Optionally, an element may contain attributes (see diagram).

ElementRef: XML reference used by the 4D XML commands to specify an XML structure. This reference is made up of 8 coded characters in hexadecimal form, which means that its length is either 16 or 32 characters depending on whether you use a 32- or 64-bit system. It is recommended to declare XML references using the C_TEXT directive.

Parent: In an XML structure, an element in a level directly above another.

Parsing, parser: The act of analyzing the contents of a structured object in order to extract useful information. The commands of the “XML” theme are used to parse the contents of any XML objects.

Root: An element located at the first level of an XML structure.

Sibling: In an XML structure, an element at the same level as another.

Structure XML: a structured XML object. This object can be a document, a variable, or an element.

Validation: An XML document is “validated” by the parser when it is “well-formed” and in compliance with the DTD specifications. See also Well-formed.

Well-formed: An XML document is declared “well-formed” by the parser when it complies with the generic XML specifications. See also Validation.

XML: eXtensible Markup Language. A computerized data exchange standard enabling the transfer of data as well as their structure. The XML language is based on the use of tags and a specific syntax, similar to the HTML language. However, unlike HTML, the XML language allows the definition of customized tags.

XSL: eXtensible Stylesheet Language. A language permitting the definition of style sheets used to process and display the contents of an XML document.
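The "Well-formed" entry above can be illustrated with a short check. This is a minimal Python sketch using the standard xml.dom.minidom module, not 4D code; the function name check_well_formed is hypothetical. It only tests whether a source is well-formed: the parser used here does not validate against a DTD, so it says nothing about validity in the sense of the "Validation" entry.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError


def check_well_formed(source):
    """Return (True, None) if source parses, else (False, (line, column))."""
    try:
        document = xml.dom.minidom.parseString(source)
    except ExpatError as error:
        # The parser reports where the generic XML rules were broken.
        return False, (error.lineno, error.offset)
    document.unlink()  # release the tree; only the verdict is needed
    return True, None


# Quoted attribute values are required by the XML specification.
print(check_well_formed('<Elem3 Font="Verdana" Size="10"> </Elem3>'))

# Unquoted attribute values make the source not well-formed.
print(check_well_formed('<Elem3 Font=Verdana Size=10> </Elem3>'))
```

Returning the line and column of the failure mirrors the kind of information a parser can provide when it rejects a source.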

Many functions in this theme return an XML element reference. If an error occurs during function execution (for example, if the root element reference is not valid), the OK variable is set to 0 and an error is generated.

In addition, the reference returned in this case is a sequence of 16 "0" characters ("0000000000000000").

 
PROPERTIES 

Product: 4D
Theme: XML DOM

 
SEE ALSO 

BASE64 DECODE
BASE64 ENCODE