Lies, Damn Lies & MSDN Documentation
The MSDN documentation for XmlDocument.LoadXml() states “This method does not do DTD or Schema validation. If you want validation to occur, use the Load method and pass it an XmlValidatingReader. See XmlDocument for an example of load-time validation”. As it turns out, this is only a half-truth.
If you happen to be dealing with XHTML (or any other standard that has a publicly accessible DTD), then you’ll most likely have a DOCTYPE declaration at the top of the document, like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
However, if you try to load an XHTML document containing the DOCTYPE declaration into an XmlDocument object using LoadXml(), the underlying XmlTextReader will try to resolve the URIs in the DOCTYPE, which means fetching the DTD over the network. This is usually fine in a development environment, but can cause major headaches in a restricted production environment where, for example, the web servers cannot access the Internet in general, or the W3C site in particular.
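If you want to see this for yourself without touching the network, here is a minimal sketch. RecordingResolver, DoctypeProbe and the example.invalid URL are all made up for illustration; the resolver simply notes each URI the reader asks for and hands back a stub DTD from memory. The resolver is assigned explicitly because the default resolver of XmlDocument is not the same across framework versions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical probe: records every URI the reader requests and answers
// with a stub DTD held in memory, so no network access takes place.
public class RecordingResolver : XmlUrlResolver
{
    public readonly List<Uri> Requested = new List<Uri>();

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        Requested.Add(absoluteUri);
        return new MemoryStream(Encoding.UTF8.GetBytes("<!ELEMENT html ANY>"));
    }
}

public static class DoctypeProbe
{
    // Loads the XML with LoadXml() and returns the URIs the resolver was asked for.
    public static List<Uri> RequestedUris(string xml)
    {
        RecordingResolver resolver = new RecordingResolver();
        XmlDocument doc = new XmlDocument();
        doc.XmlResolver = resolver; // must be set before LoadXml()
        doc.LoadXml(xml);
        return resolver.Requested;
    }

    public static void Main()
    {
        // Assumed sample document; the DTD URL is deliberately unreachable.
        string xml = "<!DOCTYPE html SYSTEM \"http://example.invalid/dtd/sample.dtd\"><html/>";
        foreach (Uri uri in RequestedUris(xml))
        {
            Console.WriteLine(uri);
        }
    }
}
```

If the DTD URL shows up in the list, LoadXml() did go looking for it, validation or no validation.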
So, what to do?
Well, you have two options:
1. Set the XmlResolver property of the XmlDocument to null before loading the content, i.e.
XmlDocument xdXhtmlContent = new XmlDocument();
xdXhtmlContent.XmlResolver = null;
2. Copy the DTD(s) locally and create your own XmlUrlResolver that overrides the URI for them (e.g. to a local path):
using System;
using System.Configuration;
using System.Xml;

public class FileSystemDTDUriResolver : XmlUrlResolver
{
    /// <summary>
    /// Resolves the absolute URI from the base and relative URIs, redirecting requests for a DTD to a local copy.
    /// </summary>
    /// <param name="baseUri">The base URI used to resolve the relative URI.</param>
    /// <param name="relativeUri">The URI to resolve. The URI can be absolute or relative. If absolute, this value effectively replaces the baseUri value. If relative, it combines with the baseUri to make an absolute URI.</param>
    /// <returns>A file:/// URI pointing at the local copy when a .dtd is requested; otherwise the result of XmlUrlResolver.ResolveUri.</returns>
    /// <remarks>The baseUri and relativeUri are given to match the method signature of XmlUrlResolver. However, for the purpose of an XmlDocument's XmlResolver, the Uri cannot be relative.</remarks>
    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
        if (relativeUri != null && relativeUri.EndsWith(".dtd", StringComparison.OrdinalIgnoreCase))
        {
            // Keep only the file name. LastIndexOf returns -1 when there is no '/', so "+ 1" is safe either way.
            string sRequestedDtd = relativeUri.Substring(relativeUri.LastIndexOf('/') + 1);
            // Local DTD folder from appSettings, e.g. C:\Webfiles\w3xml\DTD\
            string sDTDFilesLocation = ConfigurationManager.AppSettings["DTDFileSystemLocation"].Replace('\\', '/');
            if (!sDTDFilesLocation.EndsWith("/"))
            {
                sDTDFilesLocation += "/";
            }
            return new Uri("file:///" + sDTDFilesLocation + sRequestedDtd);
        }
        // Anything that is not a DTD is resolved the normal way.
        return base.ResolveUri(baseUri, relativeUri);
    }
}
Then set the XmlResolver as:
xdXhtmlContent.XmlResolver = new FileSystemDTDUriResolver();
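One thing to note about this approach is that the resolver has to return an absolute URI, so you can either specify a file system path (using file:///) or, if you modify the sample code slightly, an absolute local URL, e.g. http://localhost/w3xml/DTD/. Also remember that the XHTML DTDs pull in entity files (xhtml-lat1.ent, xhtml-symbol.ent and xhtml-special.ent) by relative reference, so these need to be copied alongside the local DTDs.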
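To try the local-copy idea end to end without any configuration file, here is a rough, self-contained sketch. LocalDtdResolver and LocalDtdDemo are hypothetical names, the DTD folder is passed to the constructor instead of being read from appSettings, and the DTD itself is a one-line stand-in written to a temporary folder, not the real XHTML DTD.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical variant of the resolver above: the DTD folder is passed in
// rather than read from configuration, so the sketch runs on its own.
public class LocalDtdResolver : XmlUrlResolver
{
    private readonly string _dtdDirectory;

    // The last local URI handed back, exposed only so the demo can show it.
    public Uri LastLocalUri { get; private set; }

    public LocalDtdResolver(string dtdDirectory)
    {
        _dtdDirectory = dtdDirectory;
    }

    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
        if (relativeUri != null && relativeUri.EndsWith(".dtd", StringComparison.OrdinalIgnoreCase))
        {
            string fileName = relativeUri.Substring(relativeUri.LastIndexOf('/') + 1);
            // An absolute file system path converts to an absolute file URI.
            LastLocalUri = new Uri(Path.Combine(_dtdDirectory, fileName));
            return LastLocalUri;
        }
        return base.ResolveUri(baseUri, relativeUri);
    }
}

public static class LocalDtdDemo
{
    // Writes a stand-in DTD to a temp folder, loads a document that points at a
    // remote DTD URL, and returns the local URI the resolver substituted for it.
    public static Uri LoadWithLocalDtd()
    {
        string dir = Path.Combine(Path.GetTempPath(), "dtd-demo-" + Guid.NewGuid().ToString("N"));
        Directory.CreateDirectory(dir);
        try
        {
            File.WriteAllText(Path.Combine(dir, "sample.dtd"), "<!ELEMENT html ANY>");

            LocalDtdResolver resolver = new LocalDtdResolver(dir);
            XmlDocument doc = new XmlDocument();
            doc.XmlResolver = resolver; // must be set before LoadXml()
            doc.LoadXml("<!DOCTYPE html SYSTEM \"http://example.invalid/dtd/sample.dtd\"><html/>");
            return resolver.LastLocalUri;
        }
        finally
        {
            Directory.Delete(dir, true);
        }
    }

    public static void Main()
    {
        Console.WriteLine(LocalDtdDemo.LoadWithLocalDtd());
    }
}
```

The document still names the remote URL in its DOCTYPE, but the only thing opened is the file in the temp folder. Switching to the http://localhost variant would mean building the returned Uri from a base URL instead of a directory.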
Further reading:
- http://groups.google.co.uk/group/microsoft.public.dotnet.xml/browse_frm/thread/f476f6a3fd610e78/09c99a4917a3c416?q=loading+xml+document+no+dtd&rnum=9#09c99a4917a3c416
- http://www.eggheadcafe.com/forumarchives/NETxml/Nov2005/post24343352.asp
- http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpconResolvingExternalResources.asp
About this entry
You’re currently reading “Lies, Damn Lies & MSDN Documentation,” an entry on rebus
- Published: April 4, 2009 / 2:27 am
- Tags: C#, DTD, XmlDocument, XmlResolver