progress bar update during XMLTextReader routine

Anthony Ogden

Jun 4, 2003, 4:25:46 PM
Is there an easy way I can update a progressbar to show how much of an XML
file I have parsed?

I'm thinking get the number of lines in the file first, then starting
parsing and increase a counter,
update the progress bar accordingly.

What's the easiest, quickest way to find number of lines of text first?

This is a big file... 1GB.


Derek Harmon

Jun 4, 2003, 11:23:21 PM
"Anthony Ogden" <anthon...@removethis.blueyonder.co.uk> wrote in message news:7BsDa.606$ay6...@news-binary.blueyonder.co.uk...

> Is there an easy way I can update a progressbar to show how much of an XML
> file I have parsed?

Sure, inherit from XmlTextReader and for every Read( ) check your position
within the stream. Then re-set the ProgressBar::Value property every so
often to notify it of the XmlTextReader's progress.

> I'm thinking get the number of lines in the file first, then starting
> parsing and increase a counter, update the progress bar accordingly.

Not that good: an XML file could be entirely one line. The best measure
is probably your byte offset into the stream length (e.g., file position
divided by file size). Better approximations may involve elements
parsed divided by total number of elements, but then you have the
problem of figuring out the total number of elements. This still may
have shortcomings based on how 'unbalanced' your XML is, for
example:

<doc>
<littleElem/>
<littleElem/>
<littleElem/> <!-- Yay! I'm 75% done! -->
<REALLY_BIG_ELEMENT>
<Over100MegOfStuffInside/>
</REALLY_BIG_ELEMENT> <!-- Maybe I wasn't 75% done before. -->
</doc>

whereas the byte offset in any particular serialization has its own
peculiar issues depending on the encoding and canonicalization of
your document.
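
The byte-offset measure described above can be sketched as a small helper. This is only an illustration: the ProgressMath class and PercentComplete method are names made up for this example, and it assumes the stream reports a meaningful Length (true for a FileStream, not for every Stream).

```csharp
using System;

public static class ProgressMath
{
    // Hypothetical helper: converts a byte offset into a whole percentage
    // of the total length, clamped to the range 0-100.
    public static int PercentComplete( long position, long length)
    {
        if ( length <= 0 ) return 0;          // empty or unknown length
        if ( position <= 0 ) return 0;
        if ( position >= length ) return 100;

        // Multiply before dividing so the integer division keeps precision.
        return (int)( position * 100 / length);
    }
}
```

Fed with the stream's Position and Length, this gives a value that fits a ProgressBar whose Maximum is 100, whatever the size of the file.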

> What's the easiest, quickest way to find number of lines of text first?

The easiest, quickest way is for the application that generates the file
to write the number of lines in the file at the beginning of the file.

Otherwise, you must either take a guess (by sampling a portion of the
file, counting the number of lines within, and extrapolating for the remainder
of the file) or read the entire file sequentially.
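
The sampling guess might look roughly like the sketch below. LineEstimator and EstimateLineCount are hypothetical names, and the code assumes a seekable stream positioned at its start and an encoding in which a newline is the single byte '\n' (ASCII or UTF-8, not UTF-16). It also assumes the sample is representative of the rest of the file, which may well not hold.

```csharp
using System;
using System.IO;

public static class LineEstimator
{
    // Hypothetical helper: counts '\n' bytes in the first sampleBytes of
    // the stream and scales that count up to the full stream length.
    public static long EstimateLineCount( Stream s, int sampleBytes)
    {
        if ( sampleBytes <= 0 || s.Length == 0 ) return 0;

        byte[ ] buffer = new byte[ sampleBytes];
        int total = 0;
        int n;

        // Read( ) may return fewer bytes than requested, so loop until
        // the sample buffer is full or the stream ends.
        while ( total < sampleBytes &&
                ( n = s.Read( buffer, total, sampleBytes - total) ) > 0 )
        {
            total += n;
        }

        if ( total == 0 ) return 0;

        long newlines = 0;
        for ( int i = 0; i < total; i++ )
        {
            if ( buffer[ i] == (byte)'\n' ) newlines++;
        }

        // Extrapolate: newlines per sampled byte times total bytes.
        return newlines * s.Length / total;
    }
}
```

The call advances the stream, so the caller would need to set Position back to 0 before handing the same stream to the parser.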

> This is a big file... 1GB.

Here's a demonstration (this won't win me any awards for GUI of the Month,
but...) that updates progress about every kilobyte (though I rather suspect
the FileStream's Position is based on an internally used 4 kilobyte read buffer,
so perhaps every 4 KB is more accurate).

- - - XmlProgressForm.cs
using System;
using System.Drawing;
using System.IO;
using System.Windows.Forms;
using System.Xml;

public class XmlProgressForm : Form, IDisposable
{
    private FileStream fStream;
    private ProgressBar progBar;
    private Label procStmt;

    public XmlProgressForm( string xmlFilename)
    {
        try
        {
            fStream = new FileStream( xmlFilename, FileMode.Open, FileAccess.Read);
        }
        catch ( FileNotFoundException)
        {
            Console.WriteLine( "The XML file '{0}' could not be opened.", xmlFilename);

            // Rethrow: InitComponents( ) and Run( ) both need an open
            // stream, so there is nothing useful to do without one.
            throw;
        }

        // Call construction and initialization methods.
        CreateComponents( );
        InitComponents( );
    }

    public static void Main( string[ ] args )
    {
        XmlProgressForm theForm;

        if ( args.Length != 1 )
        {
            Console.WriteLine( "Usage: XmlProgress.exe <filename.xml>");
            return;
        }

        theForm = new XmlProgressForm( args[ 0]);
        theForm.Visible = true;

        // In a real-world application, this would kick-off
        // the XML parsing as a background thread.
        //
        theForm.Message = String.Format( "Parsing '{0}' Now ...", args[ 0]);
        theForm.Run();
        theForm.Message = "Done Processing ...";

        System.Threading.Thread.Sleep( 1000); // Just to read message.
        theForm.Close( );
    }

    protected void CreateComponents( )
    {
        progBar = new ProgressBar( );
        procStmt = new Label( );
    }

    protected void InitComponents( )
    {
        procStmt.Location = new Point( 80, 64);
        procStmt.Visible = true;

        // ProgressBar measured in byte offset compared to
        // total file length, both counted in kilobytes.
        //
        progBar.Value = 0;
        progBar.Maximum = (int)(fStream.Length >> 10);

        progBar.Location = new Point( 40, 120);
        progBar.ClientSize = new Size( 240, 16);

        this.ClientSize = new Size( 320, 240);
        this.Text = "XML Progress Bar";
        this.Controls.AddRange( new Control[ ] { progBar, procStmt} );
    }

    protected override void Dispose( bool disposeComponents)
    {
        if ( disposeComponents == true )
        {
            progBar.Dispose( );
            procStmt.Dispose( );
        }

        if ( fStream != null )
        {
            fStream.Close( );
            fStream = null;
        }

        base.Dispose( disposeComponents);
    }

    public void Run( )
    {
        XmlReader reader = new XmlProgressReader( fStream) as XmlReader;
        IProgressReporter parseProgress = reader as IProgressReporter;

        // I'm not really parsing anything, just demonstrating
        // how to query the progress.
        //
        while ( reader.Read( ) )
        {
            // right arithmetic shift by 10 is equivalent
            // to dividing by 1024 (2^10), i.e., how many
            // kilobytes have been parsed?
            //
            progBar.Value = (int)(parseProgress.Progress >> 10);

            // This is just to slow it down so I can see the
            // progress bar update for a small XML document.
            //
            System.Threading.Thread.Sleep( 25);
        }
    }

    public string Message
    {
        set
        {
            procStmt.Text = value;
            procStmt.Size = new Size( procStmt.PreferredWidth, procStmt.PreferredHeight);
            this.Refresh( );
        }
    }
}

public interface IProgressReporter
{
    long Progress { get; }
}

public class XmlProgressReader : XmlTextReader, IProgressReporter
{
    private Stream myStream;

    public XmlProgressReader( Stream s) : base( s)
    {
        myStream = s;
    }

    // Byte offset of the underlying stream, which runs ahead of the
    // node just returned by however much the reader has buffered.
    public long Progress
    {
        get
        {
            return myStream.Position;
        }
    }
}
- - -

It goes without saying that the XML parsing activity contained in Run( ) in this demo
program should be executed asynchronously on a background thread, with the
progBar.Value updates performed via a callback delegate on the UI thread.
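
One way the parsing loop could be decoupled from the form is sketched below. It is not part of the demo above: ProgressParser and ParseWithProgress are hypothetical names, and the code assumes a later .NET version than the demo targets (for Action<int> and XmlReader.Create), a seekable stream, and a document without a DOCTYPE. The loop knows nothing about Windows Forms; it only calls back with a percentage, and only when that percentage changes.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ProgressParser
{
    // Hypothetical helper: reads the whole document and calls report( )
    // with a 0-100 percentage, but only when that percentage changes.
    public static void ParseWithProgress( Stream s, Action<int> report)
    {
        long length = s.Length;
        int last = -1;

        using ( XmlReader reader = XmlReader.Create( s) )
        {
            while ( reader.Read( ) )
            {
                // Real node handling would go here.
                int pct = ( length > 0 ) ? (int)( s.Position * 100 / length) : 0;
                if ( pct != last )
                {
                    last = pct;
                    report( pct);
                }
            }
        }

        // The reader has reached the end, so make sure 100 is reported.
        if ( last != 100 ) report( 100);
    }
}
```

On a form, this would typically run on a worker thread, with the report delegate wrapping the update in progBar.BeginInvoke( ) (or going through a BackgroundWorker and its ReportProgress method) so that the ProgressBar is only ever touched from the UI thread. Reporting only on change keeps the number of cross-thread calls to about a hundred, even for a 1GB file.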

The key points here are that I subclass XmlTextReader to operate on a FileStream.
From the FileStream I can get the two quantities I need, the fixed length of the XML
file and the variable position within that file (again, I think XmlTextReader reads in
4K chunks). Each step of the Progress Bar is meant to correspond to 1 KB of the
XML file having been parsed.

All that's required in my subclass is to pass the Stream to the base class constructor,
and I expose a property (implementing IProgressReporter) named Progress that
reads the Position property of the underlying file Stream. I encourage you to separate
the XmlReader functionality of XmlProgressReader from the IProgressReporter
functionality; it will make it cleaner to pass a reference of IProgressReporter to your
callback when you implement the asynchronous updating of the UI since the Progress
Bar logic doesn't need to know anything about the XmlReader.
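
The separation suggested above could be sketched as follows. StreamProgressReporter and its Total property are made-up names for illustration, and the interface is repeated only so that the sketch compiles by itself. The reporter wraps the Stream directly, so a plain XmlTextReader can be used for the parsing.

```csharp
using System;
using System.IO;

public interface IProgressReporter
{
    long Progress { get; }
}

// Hypothetical class: reports progress from a Stream without knowing
// anything about whoever is reading from that stream.
public class StreamProgressReporter : IProgressReporter
{
    private readonly Stream stream;

    public StreamProgressReporter( Stream s)
    {
        if ( s == null ) throw new ArgumentNullException( "s");
        stream = s;
    }

    // Bytes consumed from the stream so far.
    public long Progress
    {
        get { return stream.Position; }
    }

    // Total size of the stream; assumes the stream is seekable.
    public long Total
    {
        get { return stream.Length; }
    }
}
```

The same FileStream would then be handed both to the XmlTextReader and to a StreamProgressReporter, and only the reporter is passed on to the code that updates the Progress Bar.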

I've inserted some Thread.Sleep( ) statements just to make progress bar increments
apparent to the naked eye with smaller XML documents for demonstration purposes.
These should be removed from the final product.


Derek Harmon


Anthony Ogden

Jun 5, 2003, 4:25:27 AM
Excellent !

Thanks very much Derek. I did realise later that reading each line was
probably no good as a count,
as you say, the XML tags are not all necessarily on separate lines ... hehe.

Great example, many thanks.


"Derek Harmon" <lore...@msn.com> wrote in message
news:e6By1HxK...@TK2MSFTNGP09.phx.gbl...
