Why does an invalid file successfully pass validation (C#)?

Good day to all! I can't get validation of an XML file against an XSD schema to work properly: a file that does not match the schema passes validation successfully. I hope someone knowledgeable can tell me where I made a mistake.

Schema:

<?xml version="1.0" encoding="utf-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="books">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="book">
          <xs:complexType>
            <xs:sequence>

              <xs:element name="name">
                <xs:simpleType>
                  <xs:restriction base="xs:string">
                    <xs:minLength value="1" />
                  </xs:restriction>
                </xs:simpleType>
              </xs:element>

              <xs:element name="author">
                <xs:simpleType>
                  <xs:restriction base="xs:string">
                    <xs:minLength value="1" />
                  </xs:restriction>
                </xs:simpleType>
              </xs:element>

              <xs:element name="price" type="xs:positiveInteger" />

              <xs:element name="id" type="xs:int" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>        
      </xs:sequence>
    </xs:complexType>
    <xs:unique name='unique_field_id'>
      <xs:selector xpath='xs:book' />
      <xs:field xpath='id' />
    </xs:unique>
  </xs:element>
</xs:schema>

XML file that passes validation successfully:

<?xml version="1.0" encoding="utf-8"?>
<books xmlns="localhost" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="/XML/ BooksListSchema.xsd">
  <book xmlns="">
    <name></name>
    <author>Author 3</author>
    <price>3200</price>
    <id>1003</id>
  </book>
  <book xmlns="">
    <name>Book 4</name>
    <author>Author 4</author>
    <price>-3000</price>
    <id>547</id>
  </book>
  <book xmlns="">
    <name>Book 1</name>
    <author>Athour 11</author>
    <price>1111</price>
    <id>547</id>
  </book>
</books>

The code that checks the file for validity:

private void ValidaionFile (HttpPostedFileBase fileData) {
    string SchemaPath = Server.MapPath("~/XML/BooksListSchema.xsd");

    var xDoc = new XmlDocument();
    xDoc.Load(fileData.InputStream);

    xDoc.Schemas.Add("", SchemaPath);
    xDoc.Validate(ValidationEventHandler);
}

private static void ValidationEventHandler(object sender, ValidationEventArgs e) {
    XmlSeverityType type = XmlSeverityType.Warning;
    if (Enum.TryParse<XmlSeverityType>("Error", out type)) {
        if (type == XmlSeverityType.Error) throw new Exception(e.Message);
    }
}

Author: Евгений, 2019-02-03

1 answer

Let me describe in detail what is happening and why.

Let's start with the fact that the logic of the .NET validator differs slightly from the usual logic of, say, a SAX parser. In particular, if the namespace of an XML element is not found among the loaded schemas during validation, the validator ignores that element and considers it valid. This makes some sense, but it can be confusing. Unfortunately, I did not find any mention of this behavior in the documentation, so this is based on practical experience.

In your case, you are loading a document whose root element, <books xmlns="localhost" ...>, has a namespace explicitly specified. However, the schema was loaded with xDoc.Schemas.Add("", SchemaPath);, that is, without a namespace, so the validator ignored the books element and its contents during the check (basic checking of the XML structure itself happens when the document is loaded).
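The "ignored, therefore valid" behavior described above can be made visible. The following is only a minimal sketch, not the code from the question: it assumes a validating XmlReader instead of XmlDocument.Validate, and the class name, the tiny inline schema and the sample XML are hypothetical. It turns on XmlSchemaValidationFlags.ReportValidationWarnings so that elements with no matching schema are reported as warnings instead of being skipped silently.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

static class MissingSchemaWarningSketch
{
    // Hypothetical minimal schema WITHOUT a targetNamespace.
    public const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='books'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='price' type='xs:positiveInteger' />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // The root element is in the 'localhost' namespace, which no loaded schema covers.
    public const string Xml = "<books xmlns='localhost'><price>-1</price></books>";

    // Returns every validation event as "Severity: message".
    public static List<string> Validate(string xml, string xsd)
    {
        var events = new List<string>();
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        settings.Schemas.Add("", XmlReader.Create(new StringReader(xsd)));
        // Without this flag, elements that have no schema information pass silently.
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler += (sender, e) =>
            events.Add(e.Severity + ": " + e.Message);

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { } // validation happens while reading
        }
        return events;
    }

    static void Main()
    {
        foreach (var line in Validate(Xml, Xsd)) Console.WriteLine(line);
    }
}
```

In this sketch the invalid price is not reported as an error, because the element is never matched against the schema; what should show up instead are warnings of the "Could not find schema information" kind, which is a useful hint that the namespaces do not line up.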

There are two ways to make validation work:

  1. Leave the schema and the way it is loaded unchanged, and remove the xmlns namespace declarations from the elements of the XML being checked.

  2. Leave the source file as it is and change the schema definition to this:

    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
       targetNamespace="localhost"
       xmlns="localhost"
       elementFormDefault="qualified">
    

    And change the schema loading in the validator:

    xDoc.Schemas.Add("localhost", SchemaPath);
    

Without going into the details of the schema definition itself: with either of these changes, the XML validation code from the question works correctly and throws the necessary exceptions.
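For reference, the second option can be sketched end to end roughly as follows. This is a sketch under assumptions, not the exact code from the question: the schema is trimmed to a single price element, the schema and the XML are inline strings rather than files, and the class and method names are hypothetical. Note that with elementFormDefault="qualified" the nested elements must also be in the localhost namespace, so book elements carrying xmlns="" would themselves be reported as unexpected.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

static class NamespaceValidationSketch
{
    // Trimmed, hypothetical schema that uses the header from option 2.
    public const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='localhost' xmlns='localhost' elementFormDefault='qualified'>
  <xs:element name='books'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='book' maxOccurs='unbounded'>
          <xs:complexType>
            <xs:sequence>
              <xs:element name='price' type='xs:positiveInteger' />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // All elements inherit the 'localhost' namespace; the price is deliberately invalid.
    public const string Xml =
        "<books xmlns='localhost'><book><price>-3000</price></book></books>";

    // Returns the messages of all validation errors (warnings are skipped).
    public static List<string> Validate(string xml, string xsd)
    {
        var errors = new List<string>();
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        using (var schemaReader = XmlReader.Create(new StringReader(xsd)))
        {
            // The namespace passed here matches the schema's targetNamespace.
            doc.Schemas.Add("localhost", schemaReader);
        }
        doc.Validate((sender, e) =>
        {
            // Use the severity reported by the validator itself.
            if (e.Severity == XmlSeverityType.Error) errors.Add(e.Message);
        });
        return errors;
    }

    static void Main()
    {
        foreach (var message in Validate(Xml, Xsd)) Console.WriteLine(message);
    }
}
```

Collecting the messages in a list instead of throwing on the first event is just a choice made for this sketch; it makes it possible to see every violation in one pass.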

Author: rdorn, 2019-02-03 16:55:50